程序代写代做代考 Context Free Languages algorithm lec5.key

lec5.key

CS 314 Principles of Programming Languages

Prof. Zheng Zhang
Rutgers University

Lecture 5: Syntax Analysis (Parsing)

September 19, 2018

Class Information

2

• Homework 1 is being graded now. 

The sample solution will be posted soon.

• Homework 2 will be posted tomorrow.

Review: Context Free Grammars (CFGs)

• A formalism to for describing languages

• A CFG G = < T, N, P, S >:
1. A set T of terminal symbols (tokens).
2. A set N of nonterminal symbols.
3. A set P production (rewrite) rules.
4. A special start symbol S.

• The language L(G) is the set of sentences of terminal symbols in T* that
can be derived from the start symbol S:

L(G) = {w ∈ T* | S ⇒* w}

3

4

Terminal Symbol: Symbol-in-Boldface

Non-Terminal Symbol: Symbol-in-Angle-Brackets

Production Rule: Non-Terminal ::= Sequence of Symbols
or
Non-Terminal ::= Sequence | Sequence | 


Elements of BNF Syntax

Example:

::= if then
::= id <= id 
 ::= id := num

terminal

non-terminalterminal

Review: Context Free Grammar

5

::= if then
::= id <= id 
 ::= id := num

CFGs are rewrite systems with restrictions on the form of
rewrite (production) rules that can be used. The left hand side
of a production rule can only be one non-terminal symbol.

Rule 1 $1 ⇒ 1&
Rule 2 $0 ⇒ 0$
Rule 3 &1 ⇒ 1$
Rule 4 &0 ⇒ 0&
Rule 5 $# ⇒ →A
Rule 6 &# ⇒ →B

Context free grammar Not a context free grammar

6

A Language May Have Many Grammars

Consider G’: The Original Grammar G:

:= 0

2

:= 0

2

X2 := 0

1. < letter > ::= A | B | C | … | Z
2. < digit > ::= 0 | 1 | 2 | … | 9
3. < ident > ::= < letter > |
4. < ident > < letterordigit >
5. < stmt > ::= < ident > := 0
6. < letterordigit > ::= < letter > | < digit >

1. < letter > ::= A | B | C | … | Z
2. < digit > ::= 0 | 1 | 2 | … | 9
3. < identifier > ::= < letter > |
4. < identifier > < letter > |
5. < identifier > < digit >
6. < stmt > ::= < identifier > := 0

X X

Review: Grammars and Programming Languages

• Captures the logic structure of the language
⇒ structure carries some semantic information 

(example: expression grammar)

• Use meaningful names
• Are easy to read
• Are unambiguous
• …

7

Many grammars may correspond to one programming language.

Good grammars:

Review: Ambiguous Grammars

8

• two distinct parse trees for w, or
• two distinct leftmost derivations for w, or
• two distinct rightmost derivations for w.

We want a unique semantics of our programs, which
typically requires a unique syntactic structure.

“Time flies like an arrow; fruit flies like a banana.”

A grammar G is ambiguous iff there exists a w ∈ L(G) such that
there are:

Review: Arithmetic Expression Grammar

9

< start > ::= < expr > 

< expr > :: = < expr > – < expr > |
< expr > * < expr > |
< d > | < l >
< d > :: = 0 | 1 | 2 | 3 | … | 9
< l > :: = a | b | c | … | z

Parse “8 – 3 * 2”:

< expr >

< expr > – < expr >

< expr > * < expr >

< d > < d >

< d >

3 2

8

< expr >

< expr > * < expr >

< expr > – < expr >

< d > < d >

< d >

3

2

8

Two parse trees!

Review: Arithmetic Expression Grammar

10

Parse “8 – 3 * 2”:

< expr >

< expr > – < expr >

< expr > * < expr >

< d > < d >

< d >

3 28

< expr >

< expr > * < expr >

< expr > – < expr >
< d > < d > < d >

3 28

leftmost derivation
⇒L
⇒L
⇒L
* ⇒L
* ⇒L
*

leftmost derivation
⇒L
* ⇒L
* ⇒L
* ⇒L
* ⇒L
*

Two Parse Trees —> Two leftmost derivations!

Review: Ambiguity

11

How to deal with ambiguity?

• Change the language 

Example: Adding new terminal symbols as delimiters. 

Fix the dangling else, expression grammars.

• Change the grammar 

Example: Impose associativity and precedence in an arithmetic

expression grammar.

Changing the Grammar to Impose Precedence

12

< start > ::= < expr >
< expr > ::= < expr > + < expr > |
< expr > – < expr > |
< term >
< term > ::= < term > * < term > |
< term > / < term > |
< d > | < l >
< d > ::= 0 | 1 | 2 | 3 | … | 9
< l > ::= a | b | c | … | z

< start > ::= < expr > 

< expr > ::= < expr > + < expr > |
< expr > – < expr > |
< expr > * < expr > |
< expr > / < expr > |
< d > | < l >
< d > ::= 0 | 1 | 2 | 3 | … | 9
< l > ::= a | b | c | … | z

Modified Grammar G’Original Grammar G

13

Grouping in Parse Tree Now Reflects Precedence

Parse “8 – 3 * 2”: < start > ::= < expr >
< expr > :: = < expr > + < expr > |
< expr > – < expr > |
< term >
< term > ::= < term > * < term > |
< term > / < term > |
< d > | < l >
< d > :: = 0 | 1 | 2 | 3 | … | 9
< l > :: = a | b | c | … | z

< expr >

< expr > – < expr >

< term > * < term >

< d > < d >

< term >

3 28

< d >

Modified Grammar G’
Only One Possible Parse Tree

Precedence

14

• Low Precedence:
Addition + and Subtraction –

• Medium Precedence:
Multiplication * and Division /

• Highest Precedence:
Exponentiation ^

< start > ::= < expr >
< expr > :: = < expr > + < expr > |
< expr > – < expr > |
< term >
< term > ::= < term > * < term > |
< term > / < term > |
< d > | < l >
< d > :: = 0 | 1 | 2 | 3 | … | 9
< l > :: = a | b | c | … | z

Still Have Ambiguity…

15

< start >

< expr > – < expr >

< expr > – < expr >

< start >

< expr >–< expr >

< expr > – < expr >

3 2

1 3

2 1

Eval: (3 – 2) – 1 = 0 Eval: 3 – (2 – 1) = 2

< start > ::= < expr >
< expr > :: = < expr > + < expr > | < expr > – < expr > | < term >
< term > ::= < term > * < term > | < term > / < term > | < d > | < l >
< d > :: = 0 | 1 | 2 | 3 | … | 9
< l > :: = a | b | c | … | z

3 – 2 – 13 – 2 – 1 OR?

How about 3 – 2 – 1 ?

Still Have Ambiguity…

16

How about 3 – 2 – 1 ?

< start > ::= < expr >
< expr > :: = < expr > + < expr > | < expr > – < expr > | < term >
< term > ::= < term > * < term > | < term > / < term > | < d > | < l >
< d > :: = 0 | 1 | 2 | 3 | … | 9
< l > :: = a | b | c | … | z

< start >

< expr > – < expr >

< expr > – < expr >

< start >

< expr >–< expr >

< expr > – < expr >
3 2

1 3

2 1

3 – 2 – 13 – 2 – 1 OR?

Still Have Ambiguity…

17

• Grouping of operators of same precedence not disambiguated.
• Non-commutative operators: only one parse tree correct.

< start > ::= < expr >
< expr > :: = < expr > + < expr > | < expr > – < expr > | < term >
< term > ::= < term > * < term > | < term > / < term > | < d > | < l >
< d > :: = 0 | 1 | 2 | 3 | … | 9
< l > :: = a | b | c | … | z

Imposing Associativity

18

< expr > :: = < d > – < expr > |
< d >
< d > :: = 0 | 1 | 2 | 3 | … | 9

Same grammar with left / right recursion for – :

Our choices:

< expr > :: = < expr > – < d > |
< d >
< d > :: = 0 | 1 | 2 | 3 | … | 9

Or:

Which one do we want for – in the calculator language?


< d > –
< d > – < d > –
< d > – < d > – < d > –


– < d > ⇒
– < d > – < d > ⇒
– < d > – < d > –

Associativity

19

• Deals with operators of same precedence
• Implicit grouping or parenthesizing
• Left to right: *, /, +, –
• Right to left: ^

Complete, Unambiguous Arithmetic Expression Grammar

20

< start > ::= < expr >
< expr > ::= < expr > + < term > |
< expr > − < term > |
< term >
< term > ::= < term > * < factor > |
< term > / < factor > |
< factor >
< factor >::= < g > ^ < factor > |
< g >
< g > ::= ( < expr > ) | < d > | < l >
< d > ::= 0 | 1 | 2 | … | 9
< l > ::= a | b | c | … | z

< start > ::= < expr > 

< expr > :: = < expr > + < expr > |
< expr > – < expr > |
< expr > * < expr > |
< expr > / < expr > |
< expr > ^ < expr > |
< d > | < l >
< d > :: = 0 | 1 | 2 | 3 | … | 9
< l > :: = a | b | c | … | z

Original Ambiguous Grammar G

Unambiguous Grammar G

Abstract versus Concrete Syntax

21

Concrete Syntax:

Representation of a construct in a particular language,
including placement of keywords and delimiters.

Abstract Syntax:

Structure of meaningful components of each language construct .

Example:

22

< S > ::= < E >
< E > ::= < E > − < T > | < T >
< T > ::= < T > * id | id

Consider A * B − C:

Concrete Syntax Tree

Abstract Syntax Tree
(AST)

< S >

< E >

< E > < T > −

< T >

< T > * id

id

id

* id

id id

Abstract versus Concrete Syntax

23

Same abstract syntax, different concrete syntax:

Pascal while x <> A[i], do
i := i + 1

end

C while ( x != A[i])
i = i + 1;

Regular vs. Context Free

24

• All Regular languages are context free languages
• Not all context free languages are regular languages

Example:

::= |
::= a | b
::= c | c

is equivalent to: a b* | c+

Is {an bn |n ≥ 0} a context free language?

Question:

::= a b | ϵ

Regular vs. Context Free

Regular vs. Context Free

26

• All Regular languages are context free languages
• Not all context free languages are regular languages

Example:

::= |
::= a | b
::= c | c

is equivalent to: a b* | c+

Is {an bn |n ≥ 0} a context free language?

Is {an bn |n ≥ 0} a regular language?

Question:

Regular Grammars

27

CFGs with restrictions on the shape of production rules.

::= a | b
::= a b

Left-linear:

::= a b | a b
::= b | b

Right-linear:

Complexity of Parsing

28

Classification of languages that can be recognized by
specific grammars.

Regular grammars DFAs O(n)

LR grammars Kunth’s algorithm O(n)

Arbitrary CFGs Earley’s algorithm O(n3)

Arbitrary CSGs LBAs P-SPACE
COMPLETE

Complexity:

Reading:

Scott Chapter 2.3.4 (for LR parser) and Chapter 2.4.3 for
language class.
Earley, Jay (1970), “An efficient context-free parsing
algorithm”, Communications of the ACM.

CSG

CFG

RG
LR

Top – Down Parsing – LL(1)

29

• The parse tree is constructed from the root, expanding non-terminal
nodes on the tree’s frontier following a leftmost derivation.

• The input program is read from left to right, and input tokens are read
(consumed) as the program is parsed.

• The next non-terminal symbol is replaced using one of its rules. The
particular choice has to be unique and uses parts of the input (partially
parsed program), for instance the first token of the remaining input.

Basic Idea:

x

y

α
β

Top – Down Parsing – LL(1) (cont.)

30

Example:

S ::= a S b | ε

How can we parse (automatically construct a leftmost derivation)
the input string a a a b b b using a PDA (push-down automaton)
and only the first symbol of the remaining input?

INPUT: a a a b b b eof

LL(1) Parsing Example

31

S ::= a S b | ε

S Remaining Input: a a a b b b

Sentential Form:
S

Applied Production:

S

LL(1) Parsing Example

32

S ::= a S b | ε

S Remaining Input: a a a b b b

Sentential Form:
a S b

Applied Production:
S ::= a S b

Sa b

a

S

b

LL(1) Parsing Example

33

S ::= a S b | ε

S Remaining Input: a a a b b b

Sentential Form:
a S b

Applied Production:

Sa b

Match!

a

S

b

LL(1) Parsing Example

34

S ::= a S b | ε

S Remaining Input: a a b b b

Sentential Form:
a S b

Applied Production:

Sa b

S

b

LL(1) Parsing Example

35

S ::= a S b | ε

S Remaining Input: a a b b b

Sentential Form:
a a S b b

Applied Production:
S ::= a S b

Sa b

Sa b

a

S

b

b

LL(1) Parsing Example

36

S ::= a S b | ε

S Remaining Input: a a b b b

Sentential Form:
a a S b b

Applied Production:

Sa b

Sa b

Match!
a

S

b

b

LL(1) Parsing Example

37

S ::= a S b | ε

S Remaining Input: a b b b

Sentential Form:
a a S b b

Applied Production:

Sa b

Sa b

S

b

b

LL(1) Parsing Example

38

S ::= a S b | ε

S Remaining Input: a b b b

Sentential Form:
a a a S b b b

Applied Production:
S ::= a S b

Sa b

Sa b

S ba

a

S

b

b

b

LL(1) Parsing Example

39

S ::= a S b | ε

S Remaining Input: a b b b

Sentential Form:
a a a S b b b

Applied Production:

Sa b

Sa b

S ba
Match!

a

S

b

b

b

LL(1) Parsing Example

40

S ::= a S b | ε

S Remaining Input: b b b

Sentential Form:
a a a S b b b

Applied Production:

Sa b

Sa b

S baS

b

b

b

LL(1) Parsing Example

41

S ::= a S b | ε

S Remaining Input: b b b

Sentential Form:
a a a b b b

Applied Production:
S ::= ε

Sa b

Sa b

S ba

ε
b

b

b

LL(1) Parsing Example

42

S ::= a S b | ε

S Remaining Input: b b b

Sentential Form:
a a a b b b

Applied Production:

Sa b

Sa b

S ba

ε
Match!b

b

b

LL(1) Parsing Example

43

S ::= a S b | ε

S Remaining Input: b b

Sentential Form:
a a a b b b

Applied Production:

Sa b

Sa b

S ba

εb

b

LL(1) Parsing Example

44

S ::= a S b | ε

S Remaining Input: b b

Sentential Form:
a a a b b b

Applied Production:

Sa b

Sa b

S ba

ε

Match!

b

b

LL(1) Parsing Example

45

S ::= a S b | ε

S Remaining Input: b

Sentential Form:
a a a b b b

Applied Production:

Sa b

Sa b

S ba

ε

b

LL(1) Parsing Example

46

S ::= a S b | ε

S Remaining Input: b

Sentential Form:
a a a b b b

Applied Production:

Sa b

Sa b

S ba

ε

Match!

b

LL(1) Parsing Example

47

S ::= a S b | ε

S Remaining Input:

Sentential Form:
a a a b b b

Applied Production:

Sa b

Sa b

S ba

ε

Next Lecture

48

• Read Scott, Chapter 2.3.1 – 2.3.2 (and the materials on companion site)

Next Time: