As a Linux programmer or system administrator, you often need to search through text for a given sequence of characters (such as a word or phrase), called a string, or for a pattern describing a set of such strings. This article contains a few hands-on examples of these tasks. We first review how grep works on Linux through a few basic string searches, then move on to more complex searches using grep with regular expressions.
Searching for a Word or Phrase with Grep Command
The primary command used for searching through text is a tool called grep. It outputs lines of its input that contain a given string or pattern.
To search for a word, give that word as the first argument. By default, grep searches standard input; give the name of a file to search as the second argument.
To output lines in the file ‘catalog’ containing the word ‘boy’, type:
$ grep boy catalog
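To try this without an existing file, the following is a minimal sketch that builds a small sample file and searches it both by file name and through standard input. The temporary directory, the file contents, and the variable names are made up for illustration.

```shell
# Create a throwaway directory holding a hypothetical 'catalog' file.
tmp=$(mktemp -d)
printf '%s\n' 'boy scout handbook' 'garden tools' 'cowboy boots' > "$tmp/catalog"

# Search the file named as the second argument.
from_file=$(grep boy "$tmp/catalog")

# With no file argument, grep reads standard input instead.
from_stdin=$(grep boy < "$tmp/catalog")

printf '%s\n' "$from_file"

# Clean up the sample data.
rm -rf "$tmp"
```

Both searches return the same lines. The line ‘cowboy boots’ is printed as well, because grep matches the string anywhere in a line, not only as a whole word.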
To search for a phrase, specify it in quotes.
To output lines in the file ‘book’ containing the phrase ‘Java Coding’, type:
$ grep 'Java Coding' book
The preceding example outputs all lines in the file ‘book’ that contain the exact string ‘Java Coding’; it will not match, however, lines containing ‘java coding’ or any other variation on the case of letters in the search pattern. Use the ‘-i’ option to specify that matches are to be made regardless of case.
To output lines in the file ‘book’ containing the string ‘java coding’ regardless of the case of its letters, type:
$ grep -i 'java coding' book
This command outputs lines in the file ‘book’ containing any variation of the pattern ‘java coding’, including ‘java coding’, ‘JAVA CODING’, and ‘jaVA coDIng’.
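The difference between the two searches can be checked with a small sketch like the one below. The sample file contents are invented for illustration, and the ‘-c’ option (which prints a count of matching lines instead of the lines themselves) is used only to keep the output short.

```shell
# Build a hypothetical 'book' file with several capitalizations of the phrase.
tmp=$(mktemp -d)
printf '%s\n' \
  'Learn Java Coding today' \
  'JAVA CODING basics' \
  'jaVA coDIng tips' \
  'a java coding primer' \
  'coffee from Java' > "$tmp/book"

# Case-sensitive: only the line containing exactly 'Java Coding' is counted.
exact=$(grep -c 'Java Coding' "$tmp/book")

# Case-insensitive: all four capitalization variants are counted.
any_case=$(grep -c -i 'java coding' "$tmp/book")

echo "case-sensitive: $exact, case-insensitive: $any_case"

rm -rf "$tmp"
```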
One thing to remember is that grep only matches patterns that appear on a single line, so in the preceding example, if one line in ‘book’ ends with the word ‘java’ and the next begins with ‘coding’, grep will not match either line.
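A short sketch of this single-line behavior, using an invented two-line file in which the phrase is split across a line break:

```shell
# Hypothetical 'book' file: 'java' ends one line, 'coding' starts the next.
tmp=$(mktemp -d)
printf '%s\n' 'I am learning java' 'coding on weekends' > "$tmp/book"

# No single line contains the whole phrase, so the count is 0.
# grep exits with status 1 when nothing matches; '|| true' keeps that
# from aborting a script that runs under 'set -e'.
phrase_hits=$(grep -c 'java coding' "$tmp/book" || true)

# Each word on its own is still found on the line that contains it.
word_hits=$(grep -c 'java' "$tmp/book")

echo "phrase: $phrase_hits, single word: $word_hits"

rm -rf "$tmp"
```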
You can specify more than one file to search. When you specify multiple files, each match that grep outputs is preceded by the name of the file it is in (you can suppress this with the ‘-h’ option). A good knowledge of the Linux filesystem helps in locating the right files and directories.
To output lines in all of the files in the current directory containing the word ‘JAVA’, type:
$ grep JAVA *
To output lines in all of the ‘.txt’ files in the ‘~/doc’ directory containing the word ‘Java’, suppressing the listing of file names in the output, type:
$ grep -h Java ~/doc/*.txt
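The effect of ‘-h’ is easiest to see side by side. The sketch below assumes two made-up files, ‘a.txt’ and ‘b.txt’, in a temporary directory:

```shell
# Two hypothetical text files that both mention 'Java'.
tmp=$(mktemp -d)
printf '%s\n' 'Java notes' > "$tmp/a.txt"
printf '%s\n' 'More Java notes' > "$tmp/b.txt"

# Several files: each match is prefixed with 'filename:'.
with_names=$(grep Java "$tmp"/*.txt)

# Same search with -h: only the matching lines are printed.
without_names=$(grep -h Java "$tmp"/*.txt)

printf '%s\n\n%s\n' "$with_names" "$without_names"

rm -rf "$tmp"
```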
Use the ‘-r’ option to search a given directory recursively, searching all subdirectories it contains.
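To output lines containing the word ‘Java’ in all of the ‘.txt’ files in the ‘~/doc’ directory and in all of its subdirectories, give grep the directory itself and limit the file names with the ‘--include’ option (available in GNU grep); a shell pattern such as ‘~/doc/*.txt’ would only cover files directly inside ‘~/doc’. Type:
$ grep -r --include='*.txt' Java ~/doc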
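One way to restrict a recursive search to ‘.txt’ files, assuming GNU grep, is the ‘--include’ option. The sketch below uses an invented directory tree to show that matches in subdirectories are found while files with other extensions are skipped.

```shell
# Hypothetical tree: one .txt file at the top, one in a subdirectory,
# plus a non-.txt file that also contains the word.
tmp=$(mktemp -d)
mkdir -p "$tmp/doc/sub"
printf '%s\n' 'Java at top level' > "$tmp/doc/top.txt"
printf '%s\n' 'Java in a subdirectory' > "$tmp/doc/sub/deep.txt"
printf '%s\n' 'Java in a non-text file' > "$tmp/doc/sub/notes.md"

# -r walks the directory tree; --include limits the search to '.txt' files.
matches=$(grep -r --include='*.txt' Java "$tmp/doc")

printf '%s\n' "$matches"

rm -rf "$tmp"
```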
Grep Command with Regular Expressions
In addition to word and phrase searches, you can use grep to search for complex text patterns called regular expressions. A regular expression, or “regexp”, is a text string, possibly containing special characters, that specifies a set of strings to match.
Technically speaking, the word or phrase patterns described in the previous section are regular expressions—just very simple ones. In a regular expression, most characters—including letters and numbers—represent themselves. For example, the regexp pattern 1 matches the string ‘1’, and the pattern boy matches the string ‘boy’.
A number of reserved characters, called metacharacters, do not represent themselves in a regular expression; instead, they have special meanings that are used to build complex patterns. These metacharacters are: ., *, [, ], ^, $, and \. They behave the same way in grep on virtually all Linux distributions.
To match one of these characters literally in a regular expression, precede it with a ‘\’.
To output lines in the file ‘book’ that contain a ‘$’ character, type:
$ grep '\$' book
To output lines in the file ‘book’ that contain the string ‘$14.99’, type:
$ grep '\$14\.99' book
To output lines in the file ‘book’ that contain a ‘\’ character, type:
$ grep '\\' book
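To see why the escaping matters, here is a minimal sketch with invented file contents. It compares an unescaped ‘.’ (which matches any single character) with an escaped one; ‘-c’ is used to print match counts rather than the lines.

```shell
# Hypothetical 'book' file with a price, a look-alike string, and a backslash.
tmp=$(mktemp -d)
printf '%s\n' 'Price: $14.99' 'Part number 14x99' 'Path C:\temp' > "$tmp/book"

# Unescaped '.': matches any character, so '14.99' and '14x99' both count.
loose=$(grep -c '14.99' "$tmp/book")

# Escaped '$' and '.': only the literal string '$14.99' counts.
strict=$(grep -c '\$14\.99' "$tmp/book")

# Escaped backslash: matches the line containing a literal '\'.
backslash=$(grep '\\' "$tmp/book")

echo "loose: $loose, strict: $strict"
printf '%s\n' "$backslash"

rm -rf "$tmp"
```

The single quotes around each pattern keep the shell from interpreting ‘$’ and ‘\’ before grep sees them.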
Summary
In this article, we reviewed how to search for a string in a text file on Linux using the grep command. We also discussed how to combine grep with regular expressions to run more complex string searches.