3/27/2021

Topic 5
REGULAR EXPRESSIONS

Regular Expression
A pattern of special characters used to match strings in a search

Typically made up from special characters called metacharacters

Regular expressions are used throughout UNIX:
◦ Editors: ed, ex, vi

◦ Utilities: grep, egrep, sed, and awk

Metacharacters

Any non-metacharacter matches itself.

RE Metacharacter    Matches…
.                   Any one character, except newline
[a-z]               Any one of the enclosed characters (e.g. a-z)
*                   Zero or more of the preceding character
? or \?             Zero or one of the preceding character
+ or \+             One or more of the preceding character

The grep Utility
“grep” command:

searches for text in file(s)

Examples:

% grep root mail.log

% grep r..t mail.log

% grep ro*t mail.log

% grep 'ro*t' mail.log

% grep 'r[a-z]*t' mail.log
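
A minimal sketch of how these patterns behave, run against a few made-up lines rather than the real mail.log (the sample words and variable names are assumptions for illustration). The patterns are quoted so that the shell does not treat * as a filename wildcard before grep sees it.

```shell
# Hypothetical sample lines standing in for the contents of mail.log.
sample='root
rat
rt
r00t
robot'

# r..t : an r, any two characters, then a t.
dot_matches=$(printf '%s\n' "$sample" | grep 'r..t')

# ro*t : an r, zero or more o characters, then a t.
star_matches=$(printf '%s\n' "$sample" | grep 'ro*t')

# r[a-z]*t : an r, zero or more lowercase letters, then a t.
class_matches=$(printf '%s\n' "$sample" | grep 'r[a-z]*t')

printf 'r..t matches:\n%s\n\n' "$dot_matches"
printf 'ro*t matches:\n%s\n\n' "$star_matches"
printf 'r[a-z]*t matches:\n%s\n' "$class_matches"
```

Here r..t selects root and r00t, ro*t selects root and rt, and r[a-z]*t selects root, rat, rt and robot.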

more Metacharacters

RE Metacharacter    Matches…
^                   Beginning of line
$                   End of line
\char               Escapes the special meaning of the character char that follows it
[^]                 One character not in the set
\<                  Beginning-of-word anchor
\>                  End-of-word anchor
( ) or \( \)        Tags matched characters to be used later (max = 9)
| or \|             Or grouping
x\{m\}              Repetition of character x, m times (m = integer)
x\{m,\}             Repetition of character x, at least m times
x\{m,n\}            Repetition of character x, between m and n times

Regular Expression

An atom specifies what text is to be matched and where it is to be found.

An operator combines regular expression atoms.

Atoms

An atom specifies what text is to be matched and where it is to be found.

Single-Character Atom

A single character matches itself

Dot Atom

Matches any single character except the newline character (\n).

Class Atom
Matches a single character, which can be any of the characters defined in a set.

Example: [ABC] matches either A, B, or C.

Notes:

1) A range of characters is indicated by a dash, e.g. [A-Q]

2) Characters can be excluded from the set with a leading ^, e.g. [^0-9] matches any character other than a digit.
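
The class atom can be tried out with a short sketch like the following. The sample characters and variable names are assumptions; LC_ALL=C is set only so that the range [A-Q] is interpreted in plain ASCII order.

```shell
# Hypothetical one-character-per-line sample input.
sample='A
D
7
x'

# [ABC] : exactly one of A, B or C.
set_matches=$(printf '%s\n' "$sample" | LC_ALL=C grep '[ABC]')

# [A-Q] : any one uppercase letter from A through Q.
range_matches=$(printf '%s\n' "$sample" | LC_ALL=C grep '[A-Q]')

# [^0-9] : any one character that is not a digit.
negated_matches=$(printf '%s\n' "$sample" | LC_ALL=C grep '[^0-9]')

printf '[ABC]:\n%s\n\n' "$set_matches"
printf '[A-Q]:\n%s\n\n' "$range_matches"
printf '[^0-9]:\n%s\n' "$negated_matches"
```

[ABC] selects only A, [A-Q] selects A and D, and [^0-9] selects every line except 7.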

Example: Classes

short-hand classes
[:alnum:]

[:alpha:]

[:upper:]

[:lower:]

[:digit:]

[:space:]
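
In POSIX tools such as grep, these class names are written inside a bracket expression, so the digit class appears in a pattern as [[:digit:]]. A small sketch, with made-up sample lines and variable names:

```shell
# Hypothetical sample input.
sample='abc
ABC
123
a1 b2'

# Lines made up entirely of digits (one or more).
only_digits=$(printf '%s\n' "$sample" | grep '^[[:digit:]][[:digit:]]*$')

# Lines containing at least one uppercase letter.
has_upper=$(printf '%s\n' "$sample" | grep '[[:upper:]]')

# Lines containing at least one whitespace character.
has_space=$(printf '%s\n' "$sample" | grep '[[:space:]]')

printf 'digits only: %s\n' "$only_digits"
printf 'has uppercase: %s\n' "$has_upper"
printf 'has whitespace: %s\n' "$has_space"
```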

Anchors

Anchors tell where the next character in the pattern must be located in the text data.

Back References: \n
used to retrieve saved text in one of nine buffers

can refer to the text in a saved buffer by using a back reference:

ex.: \1 \2 \3 …\9

more details on this later
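
As a first taste of back references, the sketch below saves text with \( \) and then requires the same text to appear again with \1. The sample words and variable names are assumptions for illustration.

```shell
# Hypothetical sample input.
sample='hello
world
book
abcabc'

# \(.\)\1 : any character immediately followed by the same character.
doubled=$(printf '%s\n' "$sample" | grep '\(.\)\1')

# \(abc\)\1 : the string abc immediately followed by abc again.
repeated=$(printf '%s\n' "$sample" | grep '\(abc\)\1')

printf 'doubled character:\n%s\n\n' "$doubled"
printf 'repeated group:\n%s\n' "$repeated"
```

The first pattern selects hello (ll) and book (oo); the second selects only abcabc.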

Operators

Sequence Operator

In a sequence, a series of atoms is written one after another in a regular expression, with no operator between them.

Alternation Operator: | or \|

The alternation operator (| or \|) is used to define one or more alternatives.

Note: which form is accepted depends on the version of “grep”.

Repetition Operator: \{…\}

The repetition operator specifies that the atom or expression immediately before the repetition may be repeated.
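
The three \{…\} forms can be compared side by side with a sketch like this one. The sample strings and variable names are made up for illustration.

```shell
# Hypothetical sample input: an a, a varying number of b's, then a c.
sample='ac
abc
abbc
abbbc
abbbbc'

# b\{2\} : exactly two b's between a and c.
exactly_two=$(printf '%s\n' "$sample" | grep 'ab\{2\}c')

# b\{2,\} : at least two b's.
at_least_two=$(printf '%s\n' "$sample" | grep 'ab\{2,\}c')

# b\{2,3\} : between two and three b's.
two_to_three=$(printf '%s\n' "$sample" | grep 'ab\{2,3\}c')

printf 'exactly 2:\n%s\n\n' "$exactly_two"
printf 'at least 2:\n%s\n\n' "$at_least_two"
printf '2 to 3:\n%s\n' "$two_to_three"
```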

Basic Repetition Forms

Short Form Repetition Operators: * + ?

Group Operator

In the group operator, when a group of characters is enclosed in parentheses, the next operator applies to the whole group, not only to the preceding character.

Note: depends on the version of “grep”; where plain parentheses are not supported, use \( and \) instead.
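
The effect of grouping can be seen by applying * with and without \( \). This is a sketch on made-up sample strings; the patterns are anchored with ^ and $ so that the whole line has to match.

```shell
# Hypothetical sample input.
sample='ab
abab
abbb'

# ^ab*$ : without grouping, * applies only to the b.
ungrouped=$(printf '%s\n' "$sample" | grep '^ab*$')

# ^\(ab\)*$ : with grouping, * applies to the whole group ab.
grouped=$(printf '%s\n' "$sample" | grep '^\(ab\)*$')

printf 'ungrouped:\n%s\n\n' "$ungrouped"
printf 'grouped:\n%s\n' "$grouped"
```

The ungrouped pattern selects ab and abbb, while the grouped pattern selects ab and abab.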

Grep detail and examples
grep is a family of commands:

◦ grep: the common version

◦ egrep: understands extended REs (| + ? ( ) don’t need a backslash)

◦ fgrep: understands only fixed strings, so it is faster

◦ rgrep: traverses sub-directories recursively
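
The difference between the family members shows up when the same pattern is given to each. The sketch below uses grep -E and grep -F, the option forms equivalent to egrep and fgrep (recent GNU grep releases print a warning when the egrep and fgrep names are used). The sample lines and variable names are assumptions.

```shell
# Hypothetical sample input.
sample='a+b
aab
a|b'

# Basic RE: a plain + is an ordinary character.
basic_hits=$(printf '%s\n' "$sample" | grep 'a+b')

# Extended RE (egrep): + means one or more of the preceding character.
extended_hits=$(printf '%s\n' "$sample" | grep -E 'a+b')

# Fixed string (fgrep): every character is taken literally.
fixed_hits=$(printf '%s\n' "$sample" | grep -F 'a+b')

printf 'grep:    %s\n' "$basic_hits"
printf 'grep -E: %s\n' "$extended_hits"
printf 'grep -F: %s\n' "$fixed_hits"
```

Only the extended form treats + as repetition and selects aab; the basic and fixed forms both select the literal a+b.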

Commonly used “grep” options:

-c   Print only a count of matched lines.
-i   Ignore uppercase and lowercase distinctions.
-l   List only the names of files that contain the specified pattern.
-n   Print matched lines with their line numbers.
-s   Suppress error messages about nonexistent or unreadable files. (Some older versions of grep instead work silently, displaying nothing except error messages; on current versions, use -q to suppress output when only the exit status is needed.)
-v   Print lines that do not match the pattern.
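
A short sketch of several of these options on made-up input. The sample lines and variable names are assumptions, and -q (the POSIX quiet option, not in the list above) is used here for the exit-status check.

```shell
# Hypothetical sample input.
sample='Root login
root logout
guest login'

# -c : count of matching lines (case-sensitive by default).
count=$(printf '%s\n' "$sample" | grep -c 'root')

# -c with -i : count again, ignoring case.
count_nocase=$(printf '%s\n' "$sample" | grep -c -i 'root')

# -n : matching lines prefixed with their line numbers.
numbered=$(printf '%s\n' "$sample" | grep -n 'login')

# -v : lines that do NOT match.
inverted=$(printf '%s\n' "$sample" | grep -v 'login')

# Exit status: grep returns 0 when at least one line matched.
if printf '%s\n' "$sample" | grep -q 'guest'; then
  guest_found=yes
else
  guest_found=no
fi

printf 'count=%s count_nocase=%s guest_found=%s\n' "$count" "$count_nocase" "$guest_found"
printf 'numbered:\n%s\n' "$numbered"
printf 'inverted:\n%s\n' "$inverted"
```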

Example: grep with pipe

% ls -l | grep '^d'

drwxr-xr-x 2 krush csci 512 Feb 8 22:12 assignments
drwxr-xr-x 2 krush csci 512 Feb 5 07:43 feb3
drwxr-xr-x 2 krush csci 512 Feb 5 14:48 feb5
drwxr-xr-x 2 krush csci 512 Dec 18 14:29 grades
drwxr-xr-x 2 krush csci 512 Jan 18 13:41 jan13
drwxr-xr-x 2 krush csci 512 Jan 18 13:17 jan15
drwxr-xr-x 2 krush csci 512 Jan 18 13:43 jan20
drwxr-xr-x 2 krush csci 512 Jan 24 19:37 jan22
drwxr-xr-x 4 krush csci 512 Jan 30 17:00 jan27
drwxr-xr-x 2 krush csci 512 Jan 29 15:03 jan29

Pipe the output of the “ls -l” command to grep and select only directory entries.

% ls -l | grep -c '^d'

10

Display the number of lines where the pattern was found. This is not the same as the number of occurrences of the pattern.
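
The difference between counting lines and counting occurrences can be sketched as follows. The sample input and variable names are made up, and the -o option (print each match on its own line) is an extension found in GNU and BSD grep rather than part of POSIX, so it is an assumption about the grep in use.

```shell
# Hypothetical sample input: the first line contains two a's.
sample='aXa
b
a'

# -c counts matching LINES: lines 1 and 3 contain an a.
line_count=$(printf '%s\n' "$sample" | grep -c 'a')

# -o prints every match on its own line; counting those lines
# gives the number of OCCURRENCES. tr strips padding from wc output.
occurrence_count=$(printf '%s\n' "$sample" | grep -o 'a' | wc -l | tr -d ' ')

printf 'matching lines: %s\n' "$line_count"
printf 'occurrences:    %s\n' "$occurrence_count"
```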

Example: grep with \< \>

% cat grep-datafile

northwest NW Charles Main 300000.00

western WE Sharon Gray 53000.89

southwest SW Lewis Dalsass 290000.73

southern SO Suan Chin 54500.10

southeast SE Patricia Hemenway 400000.00

eastern EA TB Savage 440500.45

northeast NE AM Main Jr. 57800.10

north NO Ann Stephens 455000.50

central CT KRush 575500.70

Extra [A-Z]****[0-9]..$5.00

% grep '\<north\>' grep-datafile

north NO Ann Stephens 455000.50

Print the line if it contains the word “north”.

Example: grep with a\|b

% grep 'NW\|EA' grep-datafile

northwest NW Charles Main 300000.00

eastern EA TB Savage 440500.45

Print the lines that contain either the expression “NW” or the expression “EA”

Note: egrep works with |

Example: egrep with +

% egrep '3+' grep-datafile

northwest NW Charles Main 300000.00

western WE Sharon Gray 53000.89

southwest SW Lewis Dalsass 290000.73

Print all lines containing one or more 3’s.

Note: grep works with \+

Example: egrep with RE: ?

% egrep '2\.?[0-9]' grep-datafile

southwest SW Lewis Dalsass 290000.73

Print all lines containing a 2, followed by zero or one period, followed by a digit.

Note: grep works with \?

Example: egrep with ( )

% egrep '(no)+' grep-datafile

northwest NW Charles Main 300000.00

northeast NE AM Main Jr. 57800.10

north NO Ann Stephens 455000.50

Print all lines containing one or more consecutive occurrences of the pattern “no”.

Note: grep works with \( \) \+

Example: egrep with (a|b)

% egrep 'S(h|u)' grep-datafile

western WE Sharon Gray 53000.89

southern SO Suan Chin 54500.10

Print all lines containing the uppercase letter “S”, followed by either “h” or “u”.

Note: grep works with \( \) \|

Example: fgrep

% fgrep '[A-Z]****[0-9]..$5.00' grep-datafile

Extra [A-Z]****[0-9]..$5.00

Find all lines in the file containing the literal string “[A-Z]****[0-9]..$5.00”. All characters are treated as themselves. There are no special characters.
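
To see what "no special characters" means in practice, the sketch below searches for 5.00 once as a regular expression and once as a fixed string, using grep -F (the option form equivalent to fgrep). The second sample line and the variable names are made up for illustration.

```shell
# First line is taken from grep-datafile; the second is hypothetical.
sample='Extra [A-Z]****[0-9]..$5.00
code 5x00'

# As a regular expression, the dot matches any single character,
# so both "5.00" and "5x00" are selected.
regex_hits=$(printf '%s\n' "$sample" | grep '5.00')

# As a fixed string, the dot matches only a literal period.
fixed_hits=$(printf '%s\n' "$sample" | grep -F '5.00')

printf 'regular expression:\n%s\n\n' "$regex_hits"
printf 'fixed string:\n%s\n' "$fixed_hits"
```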

Example: Grep with ^

% grep '^n' grep-datafile

northwest NW Charles Main 300000.00

northeast NE AM Main Jr. 57800.10

north NO Ann Stephens 455000.50

Print all lines beginning with the letter n.

Example: grep with $

% grep '\.00$' grep-datafile

northwest NW Charles Main 300000.00

southeast SE Patricia Hemenway 400000.00

Extra [A-Z]****[0-9]..$5.00

Print all lines ending with a period followed by two zeros.

Example: grep with \char

% grep '5\..' grep-datafile

Extra [A-Z]****[0-9]..$5.00

Print all lines containing the number 5, followed by a literal period and any single character.

Example: grep with [ ]

% grep '^[we]' grep-datafile

western WE Sharon Gray 53000.89

eastern EA TB Savage 440500.45

Print all lines beginning with either a “w” or an “e”.

Example: grep with [^]

% grep '\.[^0][^0]$' grep-datafile

western WE Sharon Gray 53000.89

southwest SW Lewis Dalsass 290000.73

eastern EA TB Savage 440500.45

Print all lines ending with a period followed by two characters that are not 0.

Example: grep with x\{m\}

% grep '[0-9]\{6\}\.' grep-datafile

northwest NW Charles Main 300000.00

southwest SW Lewis Dalsass 290000.73

southeast SE Patricia Hemenway 400000.00

eastern EA TB Savage 440500.45

north NO Ann Stephens 455000.50

central CT KRush 575500.70

Print all lines where there are at least six consecutive digits followed by a period.

Example: grep with \<

% grep '\