CS246-F20-01-UnixShell
Lecture 1.11
• More UNIX commands
– grep, egrep and regular expressions (regexps)
grep and egrep
• grep is the software engineer’s best friend
– I couldn’t live without it
– And “regular expressions” are one of the most powerful tools you can
master
– “grep was created, in an evening, by Ken Thompson.” [Wikipedia]
• Most of us just use plain old grep most of the time
– egrep adds extended pattern matching that’s really useful
sometimes, but it can be slower on big searches as it has to do more work
– fgrep is faster than grep, but less flexible: it matches fixed strings only, not patterns
– We will use egrep here
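To make the flexibility trade-off concrete, here is a minimal sketch that runs the same pattern through `grep -F` and `grep -E`, the option spellings of fgrep and egrep. The temporary file and its three lines are made up for illustration.

```shell
# Hypothetical sample data: three lines, only one of which contains a literal "a.c".
tmp=$(mktemp)
printf '%s\n' 'a.c' 'abc' 'a-c' > "$tmp"

# grep -F (fgrep) treats the pattern as a fixed string: "." is just a dot.
fixed=$(grep -F -c 'a.c' "$tmp")

# grep -E (egrep) treats "." as "any single character", so all three lines match.
extended=$(grep -E -c 'a.c' "$tmp")

echo "fixed-string matches: $fixed"
echo "regexp matches:       $extended"
rm -f "$tmp"
```

The `-c` option prints a count of matching lines instead of the lines themselves, which keeps the comparison short.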
egrep
• egrep : Extended Global Regular Expression Print
– Search & print lines matching pattern in files
• Usage:
egrep [-irnv] pattern-string file-list
• “egrep” is the same as “grep -E”
• Some options:
-i   ignore case (“fred” matches “FrED”)
-n   print line numbers for matches
-r   search recursively through all sub-directories as well
-v   print lines that do NOT match the pattern (invert search)
• Example: List lines containing “main” in files in the current
directory with suffix “.cc”
$ egrep main *.cc # why no quotes?
q1.cc:int main() {
q2.cc:int main() {
egrep examples
• List lines with line numbers containing main in files with suffix .cc
$ egrep -n main *.cc
q1.cc:33:int main() {
q2.cc:45:int main() {
• List and count lines containing fred in any case in file names.txt
$ egrep -i fred names.txt
Fred Derf
FRED HOLMES
freddy jones
"Right, said Fred"
$ egrep -i fred names.txt | wc -l
4
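The -v and -r options are not exercised in the examples above, so here is a small sketch of both. It uses `grep -E` (the same as egrep) on a throwaway directory; the directory layout and file contents are hypothetical.

```shell
# Hypothetical directory layout, created only for this demonstration.
dir=$(mktemp -d)
mkdir "$dir/sub"
printf '%s\n' 'Fred Derf' 'Wilma' 'freddy jones' > "$dir/names.txt"
printf '%s\n' 'FRED HOLMES' 'Barney' > "$dir/sub/more.txt"

# -v inverts the match: print the lines that do NOT contain "fred" in any case.
others=$(grep -E -i -v fred "$dir/names.txt")

# -c counts matching lines in one file; it is not among the options listed
# in these slides, but it gives the same number as piping into "wc -l".
count=$(grep -E -i -c fred "$dir/names.txt")

# -r also searches files in sub-directories, here sub/more.txt.
recursive=$(grep -E -i -r fred "$dir" | wc -l | tr -d ' ')

echo "non-matching lines: $others"
echo "matches in names.txt: $count"
echo "matches in whole tree: $recursive"
rm -rf "$dir"
```

When more than one file is searched, as with -r, each output line is prefixed with the name of the file it came from.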
egrep patterns
• Patterns for egrep are a bit different from globbing patterns.
– Note that “*” and “?” are quantifiers that modify the preceding item, not wildcards
.          Match any single character
(ab|xyz)   Match pattern ab or xyz
[abc]      Match any single char (a, b, or c) in the bracketed list
?          Preceding item is optional; will be matched at most once
*          Preceding item will be matched zero or more times
+          Preceding item will be matched one or more times
^          Start of line
$          End of line
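Each entry in the table can be tried out on a one-line input. The sketch below does this with `grep -E`; the `matches` helper and the test strings are hypothetical and exist only for this demonstration.

```shell
# Hypothetical helper: prints "yes" if TEXT ($2) contains a match for PATTERN ($1).
matches() {
    if printf '%s\n' "$2" | grep -E -q -- "$1"; then echo yes; else echo no; fi
}

r1=$(matches '^colou?r$' 'color')      # ? : the "u" is optional
r2=$(matches '^colou?r$' 'colouur')    # ? : but at most one "u" is allowed
r3=$(matches '^ab*c$' 'ac')            # * : zero "b"s is acceptable
r4=$(matches '^ab+c$' 'ac')            # + : at least one "b" is required
r5=$(matches '^(ab|xyz)$' 'xyz')       # | : either alternative may match
r6=$(matches '^[abc].$' 'bz')          # [abc] followed by any single character
r7=$(matches '^[abc].$' 'dz')          # "d" is not in the bracketed list

echo "$r1 $r2 $r3 $r4 $r5 $r6 $r7"
```

Anchoring every pattern with ^ and $ forces the whole line to match; without them, egrep reports a line as soon as any part of it matches.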
More patterns
• List all lines of even length in flurble.txt
$ cat flurble.txt | egrep '^(..)*$'
• List all filenames in the current directory that have exactly one 'a'
$ ls -1 . | egrep '^[^a]*a[^a]*$'
• Find all five-letter words beginning with 'e'
$ egrep '^e....$' /usr/share/dict/words
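The three patterns can be checked without flurble.txt or a dictionary file by feeding a few lines on standard input. This is a sketch with made-up input lines, using `grep -E` in place of egrep.

```shell
# Even length: "ab" (2 chars) and "abcd" (4 chars) match; "abc" (3 chars) does not.
even=$(printf '%s\n' 'ab' 'abc' 'abcd' | grep -E '^(..)*$')

# Exactly one "a": "data" has two and "ls" has none, so only "main.cc" is printed.
one_a=$(printf '%s\n' 'main.cc' 'data' 'ls' | grep -E '^[^a]*a[^a]*$')

# Five-letter words beginning with "e": an "e" followed by exactly four characters.
five=$(printf '%s\n' 'eagle' 'eel' 'earth' 'easter' 'tiger' | grep -E '^e....$')

printf '%s\n' "$even" "$one_a" "$five"
```

Note that '^(..)*$' also matches an empty line, since zero repetitions of two characters is a line of length 0.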
# Want to print all lines in all .cc files in the current
# directory that contain the string “main” or “balloon”
$ grep -E main|balloon *.cc
bash: balloon: command not found
$ grep -E (main|balloon) *.cc
bash: syntax error near unexpected token `('
$ grep -E "(main|balloon)" *.cc
balloon-copy.cc: cout << colour << " balloon" << endl;
balloon-copy.cc:int main (int argc, char* argv[]) {
balloon.cc: cout << colour << " balloon" << endl;
balloon.cc:int main (int argc, char* argv[]) {
$ grep -E 'main|balloon' *.cc # same as above
balloon-copy.cc: cout << colour << " balloon" << endl;
balloon-copy.cc:int main (int argc, char* argv[]) {
balloon.cc: cout << colour << " balloon" << endl;
balloon.cc:int main (int argc, char* argv[]) {
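The two failures above happen because the shell interprets | and ( ) itself before grep ever runs; quoting hands the pattern to grep untouched. The sketch below uses printf as a stand-in for grep to show the arguments that arrive, then compares the alternation with the equivalent form that gives one -e option per pattern. The sample lines are hypothetical.

```shell
# Quoted: the pattern reaches the command as a single, unmodified argument.
seen=$(printf '[%s]' -E 'main|balloon')
echo "$seen"

# Hypothetical sample input standing in for the .cc files.
sample=$(printf '%s\n' 'int main() {' 'red balloon' 'return 0;')

# Alternation inside one quoted pattern...
with_bar=$(printf '%s\n' "$sample" | grep -E 'main|balloon')

# ...selects the same lines as two separate patterns given with -e.
with_e=$(printf '%s\n' "$sample" | grep -e main -e balloon)

[ "$with_bar" = "$with_e" ] && echo "same lines selected"
```

The -e form needs no quoting here because neither pattern contains a character that is special to the shell.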
• Example: List lines that
– match start of line "^", then
– match "#include", then
– match 1 or more spaces or tabs "[ ]+" (the brackets hold a space and a tab), then
– match either a double quote or "<" with '["<]', then
– match 1 or more characters ".+", then
– match either a double quote or ">" with '[">]', then
– match end of line "$", and
– are in files with suffix ".h" or ".cc"
$ egrep '^#include[ ]+["<].+[">]$' *.{h,cc}
q1.cc:#include
q1.cc:#include
q1.cc:#include "q1.h"
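A literal tab is awkward to type inside a pattern, so the sketch below swaps in the POSIX character class `[[:blank:]]`, which matches a space or a tab. This is a substitution for illustration, not the pattern from the slide, and the file name and header names are invented.

```shell
# Hypothetical source file: two well-formed #include lines (one using a tab),
# one malformed #include, one commented-out #include, and one ordinary line.
dir=$(mktemp -d)
printf '#include <iostream>\n#include\t"demo.h"\n#include iostream\n// #include <string>\nint main() {}\n' > "$dir/demo.cc"

# [[:blank:]]+ stands for "one or more spaces or tabs".
pattern='^#include[[:blank:]]+["<].+[">]$'

hits=$(grep -E -c "$pattern" "$dir/demo.cc")
first=$(grep -E "$pattern" "$dir/demo.cc" | head -n 1)

echo "matching lines: $hits"
echo "first match: $first"
rm -rf "$dir"
```

The malformed line fails because no quote or "<" follows the blank, and the commented-out line fails because "^" requires #include at the very start of the line.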