程序代写代做代考 go chain C interpreter ER flex algorithm compiler AI Parser: Syntactic Analysis Readings: EAC2 Chapter 3

Parser: Syntactic Analysis Readings: EAC2 Chapter 3
EECS4302 M: Compilers and Interpreters Winter 2020
CHEN-WEI WANG
Context-Free Languages: Introduction
● We have seen regular languages :
○ Can be described using finite automata or regular expressions.
○ Satisfy the pumping lemma.
● Languages with a recursive structure are provably non-regular.
e.g.,{0n1n ￿n≥0}
Context-free grammars (CFG’s) are used to describe strings
● ●
3 of 96
that can be generated in a recursive fashion. Context-free languages (CFL’s) are:
○ Languages that can be described using CFG’s.
○ A proper superset of the set of regular languages.
Parser in Context
○ Recall:
Lexical Analysis Source Program
(seq. of characters)
Scanner
Syntactic Analysis seq. of tokens
Parser
Semantic Analysis
AST1 … ASTn
pretty printed
Target Program
○ Treats the input programas as a a sequence of classified tokens/words
○ Applies rules parsing token sequences as
abstract syntax trees (ASTs) [ syntactic analysis ]
○ Upon termination:
● ReportstokensequencesnotderivableasASTs ● ProducesanAST
○ No longer considers every character in input program.
○ token sequences constitute a
Derivable
context-free language (CFL)
.
2 of 96
CFG: Example (1.1)
● The language that we previously proved as non-regular {0n#1n ￿n≥0}
can be described using the following grammar :
A → 0A1 A→B B→#
● A grammar contains a collection of substitution or production rules, where:
4 of 96
○ A terminal is a word w ∈ ⌃∗ (e.g., 0, 1, etc.).
○ A variable or non-terminal is a word w ∈￿ ⌃∗ (e.g., A, B, etc.).
○ A start variable occurs on the LHS of the topmost rule (e.g., A).

CFG: Example (1.2)
● Given a grammar, generate a string by:
1. Write down the start variable.
2. Choose a production rule where the start variable appears on the
LHS of the arrow, and substitute it by the RHS. 3. There are two cases of the re-written string:
3.1 It contains no variables, then you are done.
3.2 It contains some variables, then substitute each variable using the
relevant production rules. 4. Repeat Step 3.
● e.g., We can generate an infinite number of strings from
○ A⇒B⇒#
○ A⇒0A1⇒0B1⇒0#1
○ A⇒0A1⇒00A11⇒00B11⇒00#11
○ … 5 of 96
A → 0A1 A→B B→#
CFG: Example (2)
Design a CFG for the following language: {w￿w∈{0,1}∗∧w is a palidrome}
e.g., 00, 11, 0110, 1001, etc.
P→✏ P→0 P→1
P → 0P0 P → 1P1
7 of 96
CFG: Example (1.2)
Given a CFG, the derivation of a string can be shown as a parse tree .
e.g., The derivation of 000#111 has the parse tree
6 of 96
CFG: Example (3)
Design a CFG for the following language: {wwR ￿w∈{0,1}∗}
e.g., 00, 11, 0110, etc.
P→✏
P → 0P0 P → 1P1
8 of 96

CFG: Example (4)
Design a CFG for the set of binary strings, where each block of 0’s followed by at least as many 1’s.
e.g., 000111, 0001111, etc.
● We use S to represent one such string, and A to represent
each such block in S.
S → ✏ {BC of S}
S → AS {RC of S}
A → ✏ {BC of A}
A → 01 {BC of A}
A → 0A1 {RC of A: equal 0’s and 1’s} A → A1 {RCofA:more1’s}
9 of 96
CFG: Example (5.2) Version 1
Expression
→ IntegerConstant
￿ -IntegerConstant ￿ BooleanConstant ￿ BinaryOp
￿ UnaryOp
￿ ( Expression )
IntegerConstant → Digit
￿ Digit IntegerConstant
Digit → 0￿1￿2￿3￿4￿5￿6￿7￿8￿9 BooleanConstant → TRUE
￿ FALSE 11 of 96
CFG: Example (5.1) Version 1
Design the grammar for the following small expression language, which supports:
● Arithmetic operations: +, -, *, /
● Relational operations: >, <, >=, <=, ==, /= ● Logical operations: true, false, !, &&, ||, =>
Start with the variable Expression. ● There are two possible versions:
1. All operations are mixed together. [e.g., (1 + true)￿false] 2. Relevant operations are grouped together.
10 of 96
Try both!
CFG: Example (5.3) Version 1
BinaryOp → ￿
UnaryOp →
Expression + Expression Expression – Expression Expression * Expression Expression / Expression Expression && Expression Expression || Expression Expression => Expression Expression == Expression Expression /= Expression Expression > Expression Expression < Expression ! Expression ￿ ￿ ￿ ￿ ￿ ￿ ￿ ￿ ￿ 12 of 96 CFG: Example (5.4) Version 1 However, Version 1 of CFG: ○ Parses string that requires further semantic analysis (e.g., type checking): e.g.,3 => 6
○ Is ambiguous , meaning that a string may have more than one
ways to interpret it.
e.g., Draw the parse tree(s) for 3 * 5 + 4
13 of 96
CFG: Example (5.6) Version 2
15 of 96
ArithmeticOp → ￿
RelationalOp → ￿
ArithmeticOp + ArithmeticOp ArithmeticOp – ArithmeticOp ArithmeticOp * ArithmeticOp ArithmeticOp / ArithmeticOp (ArithmeticOp) IntegerConstant -IntegerConstant ArithmeticOp == ArithmeticOp ArithmeticOp /= ArithmeticOp ArithmeticOp > ArithmeticOp ArithmeticOp < ArithmeticOp LogicalOp && LogicalOp LogicalOp || LogicalOp LogicalOp => LogicalOp
! LogicalOp (LogicalOp) RelationalOp BooleanConstant
LogicalOp
￿ → ￿ ￿ ￿ ￿ ￿ ￿
￿ ￿ ￿ ￿ ￿
￿
CFG: Example (5.5) Version 2
Expression → ArithmeticOp ￿ RelationalOp
￿ LogicalOp
￿ ( Expression )
IntegerConstant → Digit
￿ Digit IntegerConstant
Digit → 0￿1￿2￿3￿4￿5￿6￿7￿8￿9 BooleanConstant → TRUE
￿ FALSE 14 of 96
CFG: Example (5.7) Version 2
However, Version 2 of CFG:
○ Eliminates some cases for further semantic analysis:
e.g.,(1 + 2) => (5 / 4) [noparsetree]
○ Still Parses string that might require further semantic analysis :
e.g.,(1 + 2) / (5 – (2 + 3))
○ Is ambiguous , meaning that a string may have more than one
ways to interpret it.
e.g., Draw the parse tree(s) for 3 * 5 + 4
16 of 96

CFG: Formal Definition (1)
● A isa4-tuple(V, ⌃, R, S):
○ V is a finite set of variables. ○ ⌃isafinitesetofterminals. ○ R is a finite set of rules s.t.
[V∩⌃=￿]
R ⊆ {v → s ￿ v ∈ V ∧ s ∈ (V ∪ ⌃)∗}
○ S ∈ V is is the start variable.
● Givenstringsu,v,w∈(V∪⌃)∗,variableA∈V,andarule
A → w:
○ uAv ⇒ uwv menas that uAv yields uwv .
○ u ⇒∗ v means that u derives v, if: ● u=v;or
● u⇒u1 ⇒u2 ⇒⋅⋅⋅⇒uk ⇒v [ayieldsequence] ● GivenaCFGG=(V, ⌃, R, S),thelanguageofG
L ( G ) = { w ∈ ⌃ ∗ ￿ S ⇒∗ w } 17 of 96
context-free grammar (CFG)
CFG: Formal Definition (3): Example
● ConsiderthegrammarG=(V,⌃,R,S):
○ R is
Expr → Expr + Term
￿ Term
Term → Term * Factor
○ V={Expr,Term,Factor} ○ ⌃ = {a,+,*,(,)}
○ S=Expr
￿a
￿ Factor Factor → (Expr )
● Precedence of operators + and * is embedded in the grammar. ○ “Plus” is specified at a higher level (Expr) than is “times” (Term). ○ Both operands of a multiplication (Factor) may be parenthesized.
19 of 96
CFG: Formal Definition (2): Example
● Design the CFG for strings of properly-nested parentheses. e.g., (), ()(), ((()()))(), etc.
Present your answer in a formal manner.
● G=({S},{(,)},R,S),whereRis S→( S )￿SS￿✏
● Draw parse trees for the above three strings that G generates. 18 of 96
Regular Expressions to CFG’s
● Recall the semantics of regular expressions (assuming that we do not consider ￿):
L(✏) = {✏}
L(a) = {a} L(E+F) = L(E)∪L(F) L(EF) = L(E )L(F ) L(E∗) = (L(E))∗
L( (E) ) =
● e.g.,Grammarfor(00+1)∗+(11+0)∗
D → 11￿0 20 of 96
S→A￿B
A → ✏￿AC
C → 00￿1
B → ✏￿BD
L(E )

DFA to CFG’s
● Given a DFA M = (Q, ⌃, , q0, F):
○ Make a variable Ri for each state qi ∈ Q.
○ Make R0 the start variable, where q0 is the start state of M. ○ Add a rule Ri → aRj to the grammar if (qi,a) = qj.
○ AddaruleRi →✏ifqi ∈F.
● e.g., Grammar for
R1 → 0R0￿1R1￿✏ 21 of 96
11 0
s0: even 0’s
s1: odd 0’s
R0 → 1R0￿0R1
0
CFG: Rightmost Derivations (1)
Expr → Expr + Term ￿ Term Term → Term * Factor ￿ Factor Factor → (Expr)￿a
○ Unique rightmost derivation for the string a + a * a:
Expr ⇒ Expr + Term
⇒ Expr + Term * Factor ⇒ Expr+Term*a
⇒ Expr + Factor * a
⇒ Expr+a*a
⇒ Term+a*a
⇒ Factor+a*a ⇒a+a*a
○ This rightmost derivation suggests that a * a is the right operand of +.
23 of 96
CFG: Leftmost Derivations (1)
Expr → Expr + Term ￿ Term Term → Term * Factor ￿ Factor Factor → (Expr)￿a
○ Unique leftmost derivation for the string a + a * a:
Expr ⇒ Expr + Term ⇒ Term + Term
⇒ Factor + Term
⇒ a + Term
⇒ a + Term * Factor ⇒ a + Factor * Factor ⇒ a+a*Factor ⇒a+a*a
○ This leftmost derivation suggests that a * a is the right operand of +.
22 of 96
CFG: Leftmost Derivations (2)
Expr → Expr + Term ￿ Term Term → Term * Factor ￿ Factor Factor → (Expr)￿a
○ Unique leftmost derivation for the string (a + a) * a: Expr ⇒ Term
⇒ Term * Factor
⇒ Factor * Factor
⇒ ( Expr ) * Factor
⇒ (Expr+Term)*Factor ⇒ (Term+Term)*Factor ⇒ ( Factor + Term ) * Factor ⇒ (a+Term)*Factor
⇒ (a+Factor)*Factor
⇒ (a+a)*Factor ⇒(a+a)*a
This leftmost derivation suggests that (a + a) is the left 24 of 96operand of *.

CFG: Rightmost Derivations (2)
Expr → Expr + Term ￿ Term Term → Term * Factor ￿ Factor Factor → (Expr)￿a
○ Unique rightmost derivation for the string (a + a) * a: Expr ⇒ Term
⇒ Term * Factor
⇒ Term * a
⇒ Factor * a
⇒ (Expr)*a
⇒ (Expr+Term)*a ⇒ (Expr+Factor)*a ⇒ (Expr+a)*a
⇒ (Term+a)*a ⇒ (Factor+a)*a ⇒(a+a)*a
This rightmost derivation suggests that (a + a) is the left 25 of 96operand of *.
CFG: Parse Trees vs. Derivations (2)
● A string w ∈ ⌃∗ may have more than one derivations.
Q: distinct derivations for w ∈ ⌃∗ ⇒ distinct parse trees for w?
A: Not in general ∵ Derivations with distinct orders of variable substitutions may still result in the same parse tree.
● For example:
Expr → Expr + Term ￿ Term Term → Term * Factor ￿ Factor Factor → (Expr)￿a
For string a + a * a, the leftmost and rightmost derivations have distinct orders of variable substitutions, but their corresponding parse trees are the same.
27 of 96
CFG: Parse Trees vs. Derivations (1)
○ Parse trees for (leftmost & rightmost) derivations of expressions: a + a * a (a + a) * a
26 of 96
○ Orders in which derivations are performed are not reflected on
parse trees.
CFG: Ambiguity: Definition
Given a grammar G = (V,⌃,R,S):
○ A string w ∈ ⌃∗ is derived ambiguously in G if there exist two or more distinct parse trees or, equally,
two or more distinct leftmost derivations or, equally, two or more distinct rightmost derivations.
Here we require that all such derivations have been completed by following a particular order (leftmost or rightmost) to avoid false alarm.
○ G is ambiguous if it generates some string ambiguously. 28 of 96

fragment
|
if Expr then Statement else Statement | Assignment
code.
| …other statements …
This fragment shows that the else is optional. Unfortunately, the code if Expr1 then if Expr2 then Assignment1 else Assignment2
fragment
CFG: Ambiguity: Exercise (1)
● Is the following grammar ?
Expr →Expr + Expr ￿Expr * Expr ￿( Expr )￿a
● Yes ∵ it generates the string a + a * a ambiguously :
● Distinct ASTs (for the same input) mean distinct semantic interpretations: e.g.,
when a post-order traversal is used to implement evaluation
29 of 96
● Exercise: Show leftmost derivation
o
w
The
cla
mmar for a pro-
s
f
r
t
s
sic
ex
h
am
p
e
t
o
p
le of
a
na
mbig
uo
a
us
co
gramming language is the if-then-else construct of many Algol-like languages. The straightforward grammar for if-then-else might be
r
s
ns
e
t
r
truc
t
in
th
e
e
eg
s
ra
.
ambiguous
has two distinct rightmost derivations. The difference between them is if Expr1 then if Expr2 then Assignment1 else Assignment2
simple. The first derivation has Assignment2 controlled by the inner if, so Assignment2 executes when Expr1 is true and Expr2 is false:
has two distinct rightmost derivations. The difference between them is
CFG:Amsimpble.igThueifitrsyt:derEivaxtioenrhcasiAsseignm(2ent.2c)ontrolledbytheinner 2
if, so Assignment2 executes when Expr1 is true and Expr2 is false: (Meaning 1) Assignment2 may be associated with the inner if:
Statement
if Expr1 then Statement
if Expr2 then Statement else Statement
The second derivation associates the else clause with the first if, so that Assignment2 executes when Expr1 is falsAes,singdnempentd1ent of tAhsesivganlmueenotf2 Expr2 :
(MeaninThge 2sec)oAndsdseriigvantimoneasnsotciamtesatyhebeleseacslasuosecwiaithtethde fiwrsithif,thsoethoatuter if: 2
Assignment2 executes when Expr1 is false, independent of the value of
Expr2 :
if Expr1
Statement
2 Expressing Syntax 91
31 of 96
then Statement else
if Expr2 then Statement
Statement
Assignment2
Assignment1
Clearly, these two derivations produce different behaviors in the compiled
1 2 3 4
Statement !
| if Expr then Statement
3.
CFG: Ambiguity: Exercise (2.1)
has tw
The classic example of an ambiguous construct in the grammar for a pro-
gramming language is the if-then-else construct of many Algol-like
languages. The straightforward grammar for if-then-else ● Is the following grammar
2 d 212
1 Statement ! if Expr then Statement else St
2 | if Expr then Statement
3 Statem|enAtssignme→nt if Expr then
4 | …other statements …
This fragment shows that the else is optional. Unfortunately, the code
This fragment shows that the else is optional. Unfortunately, the code 3.2 Expressing Syntax 91
fragment
if Expr1 then if Expr2 then Assignment1 else A
by the Assignment executes when Expr is true and Expr is false:
Statement
￿ if Expr then Statement else Statement
￿ Assignment
fragment
if Expr1 then if Expr2 then Assignment1 else Assignment2
● Yes ∵ it generates the following string :

hastwodistinctrightmostderivations.ThedifferencebetweenthemisThesececlausewiththefirstif,sothat simple. The first derivation has Assignment2 controlled by the inner Assignment2 executes when Expr1 is false, independent of the value of
if Expr then if Expr then As
i
nment else Assignment 12
if, so Assignment2 e1xecutes when Expr1 is true and E2xpr2 is false: Statement
if Expr1 then Statement
if Expr2 then Statement else Statement
Expr2 :
● Exercise: Show leftmost derivations for the two parse trees. 30 of 96
s
if
g
E
:
xp
r
2
Statement
ssignment2 between them is
?
inner
dangling else
e, independent of the value of
might be simple.
ambiguo
u
i
f,
so
s
ambiguously
ond derivation associates the els
Expr1
then
Statement else Statement
if Expr2 then Statement Assignment2 Assignment1
Assignment1 Assignment2 The second derivation associates the else clause with the first if, so that
Assignment2 executes when Expr1 is fals ● This is called the
Clearly, these two derivations produce different behaviors in the compiled code.
problem.
Clearly, these two derivations produce different behaviors in the compiled
code.
CFG: Ambiguity: Exercise (2.3)
● We may remove the ambiguity by specifying that the dangling else is associated with the nearest if:
Statement → if Expr then Statement
￿ if Expr then WithElse else Statement
￿ Assignment
WithElse → if Expr then WithElse else WithElse
￿ Assignment
○ The true branch will be produced via WithElse.
There is no circularity between the two non-terminals. 32 of 96
● When applying
○ The false branch will be produced via Statement.
:
if . . . then WithElse else Statement
o distinct rightmost derivations. The difference The first derivation has Assignment controlle
atement

Discovering Derivations
● GivenaCFGG=(V, ⌃, R, S)andaninputprogramp∈⌃∗: ○ So far we manually come up a valid derivation S ⇒∗ p.
○ A parser is supposed to automate this derivation process.
Given an input sequence of (t,c) pairs, where token t (e.g., r241) belongs to some syntactic category c (e.g., register):
Either output a valid derivation (as an AST ), or signal an error .
● In the process of building an AST for the input program: ○ Root of AST: start symbol S of G
○ Internal nodes: A subset of variables V of G
○ Leaves of AST: token sequence input by the scanner
⇒ Discovering the grammatical connections (according to R)
between the root, internal nodes, and leaves is the hard part!
● ApproachestoParsing: [w∈(V∪⌃)∗,A∈V, A→w ∈R] ○ Top-down parsing
For a node representing A, extend it with a subtree representing w. ○ Bottom-up parsing
33 of 96
For a substring matching w, build a node representing A accordingly.
TDP: Exercise (1)
● Given the following CFG G:
Expr → Expr + Term ￿ Term
Term → Term * Factor ￿ Factor
Factor → (Expr ) ￿a
Trace TDParse on how to build an AST for input a + a * a. ● Running TDParse with G results an infinite loop !!!
○ TDParse focuses on the leftmost non-terminal. ○ The grammar G contains left-recursions.
● We must first convert left-recursions in G to right-recursions. 35 of 96
TDP: Discovering Leftmost Derivation
ALGORITHM: TDParse
INPUT: CFG G=(V, ⌃, R, S)
OUTPUT: Root of a Parse Tree or Syntax Error
PROCEDURE:
root := a new node for the start symbol S focus := root
initialize an empty stack trace trace.push(null)
word := NextWord()
while (true):
if focus ∈ V then
if ∃ unvisited rule focus → 12 …n ∈ R then
create 1 , 2 . . . n as children of focus trace.push(n n−1 . . . 2 )
focus := 1
else
if focus = S then report syntax error else backtrack
return root
elseif word matches focus then word := NextWord()
focus := trace.pop()
elseif word = EOF ∧ focus = null then else backtrack
backtrack ￿ pop focus.siblings; focus := focus.parent; focus.resetChildren 34 of 96
TDP: Exercise (2)
● Given the following CFG G:
Expr → Term Expr′ Expr′ → + Term Expr′
￿✏
Term → Factor Term′
Term′ → * Factor Term′ ￿✏
Factor → (Expr ) ￿a
Exercise. Trace TDParse on building AST for a + a * a. Exercise. Trace TDParse on building AST for (a + a) * a. Q: How to handle ✏-productions (e.g., Expr → ✏)?
A: Execute focus := trace.pop() to advance to next node.
● Running TDParse will terminate ∵ G is right-recursive.
● We will learn about a systematic approach to converting
left-recursions in a given grammar to right-recursions. 36 of 96

Left-Recursions (LR): Direct vs. Indirect
GivenCFGG=(V, ⌃, R, S),↵,,∈(V∪⌃)∗,Gcontains:
○ A cycle if ∃A ∈ V ● A ⇒∗ A
○ AdirectLRifA→A↵∈Rfornon-terminalA∈V e.g., e.g.,
Expr → Expr + Term
￿ Term
Term → Term * Factor
￿ Factor Factor → (Expr )
￿￿∗
○ AnindirectLRifA→B∈Rfornon-terminalsA,B∈V,B⇒A
a
A → Br A → Ba ￿ b B → Cd B → Cd ￿ e C → At C → Df ￿ g
∗ D → f∗ ￿ Aa ￿ Cg 37of96 A→Br,B⇒Atd A→Ba,B⇒Aafd
Expr → ￿
Expr + Term Expr – Term Term
Term * Factor Term / Factor Factor
￿ Term →
￿
CFG: Eliminating ✏-Productions (1)
● Motivations:
○ TDParse requires CFG with no ✏-productions.
○ RemoveLR produces CFG which may contain ✏-productions.
● ✏∈￿L⇒∃CFGG=(V, ⌃, R, S)s.t.Ghasno✏-productions. An ✏-production has the form A → ✏.
● A variable A is nullable if A ⇒∗ ✏.
○ Each terminal symbol is not nullable. ○ Variable A is nullable if either:
● A→✏∈R;or
● A→B1B2…Bk ∈R,whereeachvariableBi (1≤i≤k)isanullable.
● Given a production B → CAD, if only variable A is nullable, then there are 2 versions of B: B → CAD ￿ CD
● In general, given a production A → X1 X2 . . . Xk with k symbols, if m of the k symbols are nullable:
○ m < k: There are 2m versions of A. ○ m=k: Thereare2m −1versionsofA. [excludingA→✏] 39 of 96 TDP: (Preventively) Eliminating LRs 1 ALGORITHM: RemoveLR 2 INPUT: CFG G = (V, ⌃, R, S) 3 ASSUME: G acyclic ∧ with no ✏-productions 4 OUTPUT: G′ s.t. G′ ≡ G, G′ has no 5 indirect & direct left-recursions 6 PROCEDURE: 7 impose an order on V: ￿￿A1,A2,...,An￿￿ 8 fori:1..n: 9 for j: 1 .. i−1: 10 if ∃ Ai →Aj∈R ∧ Aj →1 ￿2 ￿...￿m ∈R then 11 replace Ai →Aj with Ai →1￿2￿...￿m 12 end 13 for Ai →Ai↵￿∈R: 14 replace it with: Ai →A′, A′ →↵A′ ￿✏ L9 to L11: Remove indirect left-recursions from A1 to Ai−1. L12 to L13: Remove direct left-recursions from A1 to Ai−1. Loop Invariant (outer for-loop)? At the start of ith iteration: ○ No direct or indirect left-recursions for A1,A2,...,Ai−1. ○ Moreprecisely:∀k∶k[Goal ! • List, eof] [List ! • List Pair, eof] [List ! • List Pair, (] >
cc0 =
<3s6s75= [List ! • Pair, eof] [List ! • Pair, (] [Pair ! •( Pair ), eof] >4r2r2 >
[Pair!•(Pair),(] [Pair!•(),eof] [Pair!•(),(] 5s8
:6s6s109 ; we can derive the state of the parser after it recognizes an initial ( by com-
7r5r5
puting goto(cc0,( ). The inner loop finds four items that have • before (. 8r4r4
Goto creates a new item for each, with the • advanced beyond (. Closure 9 s 11
adds two more items, generated from the items with • before Pair. These Calculate goto(cc , (). 10 [“nexr 5t state” from cc taking (]
0items introduce the lookahead symbol ). Thus, goto(cc0 ,( ) re0turns (11r4 )
[Pair!(•Pair),eof] [Pair!(•Pair),(] [Pair!(•),eof] . [Pair!(•),(] [Pair!•(Pair),)] [Pair!•(),)]
(a) Parentheses Grammar (b) Action and Goto Tables
To find the set of states that derive directly from some state such as cc0, the
algorithm can compute goto(cc0,x) for each x that occurs after a • in an nFIGURE3.16 TheParenthesesGrammar.
item in cc0. This produces all the sets that are one symbol away from cc0. 76 of 96 To compute the complete canonical collection, we simply iterate this process
does not contain a handle, so the parser shifts ) onto the stack to build more
to a fixed point.
rsers
;
for each item i 2 s
if the form of i is [↵! •x, a] then moved moved [ x , a]
does not contain a handle, so the parser shifts ) onto the stack to build more context. It moves to state 7.
context. It moves to state 7. The Algorithm
To construct the canonical collection of sets of lr(1) items, the algorithm
rs
In the third iteration, the situation has changed. The stack contains a han-
computes the initial set, cc0, and then systematically finds all of the sets of
a
In an LR parser, the handle is always positioned at
dle, hPair ! ( ) i,t, where t is the stack top. The Action table directs the

the language.
In practice, an lr(1) parser generator must produce other tables needed by
3.4 B
Constructing CC: The Algorithm (1)
ALGORITHM: BuildCC ′ INPUT: a grammar G = (V , ⌃, R, S), goal production S → S
(1) a set CC = {cc0,cc1,…,ccn} where cci ⊆ G’s LR(1) items
(2) a transition function PROCEDURE: ′
1
2
3 OUTPUT:
4
5
6
7 cc0 := closure({[S → ●S, eof]})
8
CC := {cc0} processed := {cc0 } lastCC := ￿ while(lastCC ≠ CC):
9 10 11 12 13 14 15 16 17 18
lastCC := CC
for cci s.t. cci ∈ CC ∧ cci ∈￿ processed:
processed ∪ {cci }
processed :=
for x s.t. [⋅⋅⋅→⋅⋅⋅●x…]∈cci
temp := goto(cci , x) if temp ∈￿ CC then
CC := CC∪{temp} 19 end
20
:= ∪(cci, x, temp)
77 of 96
the skeleton parser. For example, when the skeleton parser in Figure 3.15 on
page119reducesby A!,itpops“2⇥||”symbolsfromthes Constructing CC: The Algorithm (2.2)
pushes A onto the stack. The table generator must produce data st
that map a production from the reduce entry in the Action table, say A ! ,
Resulting transition table:
intoboth| andA.Othertables,suchasamapfromtheintegerrepresenting agrammarggingandfor diagnostic m
Handle Fi
lr(1)parsergmechanism embeddedition,CC,rep- resentsahawsthedfafor our example
Howcanthhenweknow
thatthelangelr(1)parser
relies on a s set of handles
79 of 96
tack and ructures
|
sydu
n
sn nec no
,
ew uh imple observation: the set of handles is finite. The
mItebroatlioint
oItietms t
exGtouaall
nLaismt
eP, aire
n(ee
ed) fo
redoefb
essages.
0 cc0 ; cc1 cc2 cc3 ; ; 1 cc1 ; ; cc4 cc3 ; ;
ding, Revccisited; ; ; ; ; ; 2
cc3 ; ; cc5 cc6 cc7 ; derive their efficiency from a fast handle-findi
2 cc4 ; ; ; ; ; ; the Action and Goto tables. The canonical coll
cc5 ; ; ; ; cc8 ; cc ; ; cc cc cc ;
dle-finding df6a for the gramm9ar. Fi6gure 130.25 sh cc7 ; ; ; ; ; ;
the parentheses grammar.
3 cc8 ; ; ; ; ; ;
cc ; ; ; ; cc ; 9 11
lr(1) parser use a dfa to find the handles, cc10 ; ; ; ; ; ;
age of parentheses is not a regular language? T
4 cc11 ; ; ; ; ; ;
is prenciFsIGeUlRyE3t.h23e TsreacteoffthecLoR(m1)Cpolnesttruectilonro(n1th)eiPtaeremnthse—sesGthraomsmear.with the placeholder •
120 CHAPTER 3 Parsers
at the right end of the item’s production. Any language with a finite set of sentences can be recognized by a dfa. Since the number of productions and
The LR implic drasti handl
s: shifts
Constructing CC: The Algorithm (2.1)
State eof (
1 2 3 4 5
● CalculateCC={cc0,cc1,…,cc11}
● Calculate the transition function ∶ CC × ⌃ → CC
(a) Parentheses Grammar
Acl
0s3
1 acc s3 2r3r3 3s6s 4r2r2 5s 6s6s 7r5r5 8r4r4 9s
10 r 11 r
(b) Action and G
Goal ! List
List ! List Pair
| Pair
Pair !(Pair)
|()
78 of 96
nFIGURE3.16 TheParenthesesGrammar.
8[Goal ! • List, eof] [List ! • List Pair, eof] [List ! • List Pair, (] 9 the number of lookahead symbols are both finite, the number of complete
0 =:[List!•Pair,eof] [List!•Pair,(] [Pair!•(Pa items is finite, a[nPdairth!e •la(nPgauira)g,(e]of h[aPnadirle!s •is(a),reogfu]lar lang[uPagire!. •
ir ), eof] ( ),(]
Constructing CC: The Algorithm (2.3)
When the lr(1) parser executes, it interleaves two kinds of action
to TableSince each item has the • at the start of its right-hand side, cc0 contains only
t anPadirreduces. The shift actions simulate steps in the handle-finding dfa. The
bles
RespouslstinbiglitDiesF.AThfoisristhapepproaprsiaeter,:since it is the parser’s initial state. The first 2 iteration of the while loop produces three sets, cc1 , cc2 , and cc3 . All of the 4 other combinations in the first iteration produce empty sets, as indicated in
Figure3.23◆,whic⇣htrac◆esthe⇣constr◆uction⇣ofCC◆.⇣ ✏ ✏ ) ✏
goto(cc , List) is cc .
0 1
Pair
✓⌘✓⌘✓⌘✓⌘
5
cc1 – cc4 cc5 – cc8
✓8 @( ✓
List Pair (
@
9 ◆⇣ @R◆⇣◆?⇣ ◆⇣◆
cc = [Pair!•(Pair),eof] [Pair!•(Pair),(] [Pair!✏•(),e 1( ( Pair )
[Goal ! List •, eof] [List ! List • Pair, eof]
[List ! List • Pa
< cc - cc - cc - cc - cc 0 3 6 [Pair!•(),(] 9 11 ✓⌘: ✓⌘✓⌘ ✓⌘✓ @@@ cc represents the parser configurations that result from recognizing a List. Pair @ )@ )@ ◆⇣ ◆⇣◆⇣ All of th@Re i✏temsare possibilitiesR@th✏at lead toR@an✏otherpair of parentheses, cc cccc 1 except for the 2item [Goal ! List •, eo7f]. It repres1e0nts the parser’s accept state—a reduction by Goal ! List, with a lookahead of eof. ✓⌘ ✓⌘✓⌘ n FIGUgRoEt3o.(2c5c H, Panadirle)-Fisincdcing. DFA for the Parentheses Grammar. 02 ir, ( ⇣⌘of] 80 of 96 cc does not contain a handle, so the parser shifts ) onto the stack to build more context. It moves to state 7. [List ! Pair •, (]o tionTabe Go ) Lis 1 7 8 10 11 5 4 otoTa < =; cc 2 = n[List ! Pair •, eof] cc2 represents the parser configurations after it has recognized an initial Pair. ]9=; c e s ! n(=n!o•o) [Pair!•(Pair),)] [Pair!(•Pair),eof] [Pair!(•Pair),(] other combinations in the first iteration produce empty sets, as indicated in c ;;;;;; h a h r n ce 6a s c o c c 1 fi d u gs generator can fill in the Action and Goto tables by iterating through ! it arri v e Bothite 2 Li ve new ermin it mu to re c6 ,)) s d The second iteration of the while loop tries to derive new sets from cc1, s i 23 9 c10 ; ; ; ; ; ; n s t Figure 3.23, which traces the construction of CC. ce of rule 4, e ( Pair ). Both items specify the corresponding reduction. c c ,a ndc c .F iv omfo vteh dec o;m bin a tio nsp ro du c en o nem if the form of i is [↵! •x, a] then cc1 ! prior history of the parse—to restart the handle-finding dfa in a state th roheeflpeacts the nonterminal that the parser just recognized. For example, in t =!•!• et contains one item, which specifies a reduction to Pair. ! (a) Pare Action Goto cc 567( ) or a matching ). state 6 to represent the n , In cc9, the parser nee[dPsaitro!fin•d (a P)atior )co, )m]ple[Pteairrul!e 4(. • Pair ), )] 3 c c ,a nd cc .T hr ee of the co i s d e h mb na tion pro uc ne pt a cc3= 3 cc ; ; ; ; ; ; tion uses left context—the state revealed by the reduction summarizes t goto(s,x) Pair,itenter[Psatihri!ss•ta(te).,B)]othite[mPasire!pr(e•se)n,teoafr]eductio[Pnaibry!L(is•t!),(]ListPair. ys s r ets w set nal it . T1his pa sonc he sca e c o s, w ile o g n ne ize e e d f specifyareductionbyPair!(). 10 c ,)) 9 baycGhotco 1caltslook d n alid states o 3p.4ar.s3 8 a In cc , the parser needs to find a ) to complete rule 4. in are new. for each item i 2 s goto(ccc13,(r)eipsrecscen3t,s4wtheicpch goto(cc0, List) is cc1. c,Profwhich 6cccognize r finds a a t e 8 , t h e p a r s e r h n 1a1rrsepr;’rsesc;eonfits;guthrae;tifount;uafrte;rneiterdectogfininzdesaamnaintcithiailng(.). cc9 ; ; ; ; cc11 ; [Pa [Pa ), )] ir () ir ( il sta n ,f When the parser enters state 3, it must recognize a matching ) at some point ou ro fw h ich c moved moved [ {[↵!x•, a]} goto(cc1, Pair) is crcet4u.rn closure(moved) goto(cc , Pair) is cc . in3the future. 5 = [Pair ! • ( Pair ), eof] [Pair ! • ( Pair ), (] [Pair ! • ( ), eof] nFIGURE3.21 ThegotoFunction. 8 goto(cc ,() is ccn . 9 < [Goal!nList•,eof] [List!List•Pair,eof] [oList!List•Pair,(]= air)Cisocncst.ructing CC: TheCA []Pair[Pa!ir ! ] :goto(cc0,() is cc3. [Pasirta!te•( 0 : ; o] repcrce The(leftcoangrtaemxmtafrosyrmtbhoilsxsaendtriestucrncs1a,nwewhsiecthofrlerp(1r)eistemnst.sItaitesrattaetseovwerherethepar)ser he itemRse naewrseit sot i The goto function, shown in Figure 3.21, takes a set of lr(1) items s and l g [Pairth!e ite•m(s iPna9 ( rithm (2.4.1) empl estohitsan seua redl ethc tra d p:arentheses, n intro d isn du nsain gbewilis etti, olc ce the loo aetask totc iteration of the while loop produces three sets, cc1 , cc2 , and cc3 . All of the . cc3 represents the parser’s configuratigonoatfteor (itcreco6g,n)iz)esiasn cinictia1l0( goto(cc1,() is cc c ctl com toe eep stiarte o i c e rs kahead sym bo l). Th us ,go to( ) retu 6 120 CHAPTER 3 Parsers ),t 2 ,saendnctc3 rns (cc ,Pai goto(cc3,()is[Pacirc!6.(•Pair),eof] [Pair!(•Pair),(] [Pair!(•),eof] inthefuture.n o ) oin ms are handles for a reduction by List ! Pair. h iteration of the while loop tries to deri pr t e s e n T e f ] h •( u o r 66 8 6 a en nFIGURE3.23 TraceoftheLR(1)ConstructionontheParenthesesGrammar. ew. s.irW)h,en)]it fin[dPsainrit!em i(n •whPicahirth)e ,•eimomf]ediat[ePlcyacpir5ec!ecdeos(n•siPsatisr )o,f(t]wo partially complete items. The re Since each item has the • at the start of its right-hand side, cc0 conta ha=s recognized one or more occurrences of List. When it then recognizes a 3 x, it creates a new item by moving the • rightward past x. This new item resents[tPhaeirp!ars•e(r)c,o)n]figura[tPioainrs!th(a•tr)e,seuolft]fromr[Peaciorg!ni(zi•n)gpo,s(asi]bLilitiesst..Thisisappropriate,sinceitistheparser’sinitialstate.Thefirst g c. e p2a oto n th en ters st ate 3 In cc , another ( wil t <[Goal ! • List, cc4=[List!ListPair•,eof][List!ListPair•,(T]heseconditerationofthewhilelooptriestods cc5 =cc =[Pa[Liirst!•P(airP,eaofi]r•)[L,iset!of•P]air,( r o r) is c eof] [List ! • List Pair, eof] [List [Pair ! • ( Pair ),(] [Pair ! • ( ), eof] [Pai t.hFiveeonfethecdomfboinratiaonsmproadtucehnionou Pair,itenrteprsesethntissthsetpaatrese.rB’socotnhfigiutneramtiosnrafeteprrrecsoegnitziangrex.dGuotcotipolancebsytheLseisgto!toL(cisct1P,Paairi.or)iscc4. oaplnl When the parser enters state 3, it must recognize a matching ) at some point Given the initial set for the parentheses grammar, goto(cc , List) is cc . u c n i l operan ,oanfC willreduceotbheyrcroumblienat4io,nsPinatihrefi!rstit(eraPtioaniprod)u.ceemptysets,asindicatedin followed by a Pair; it now must find a matching ). If the parse goto(cc ,)) is cc . cc = [List ! List Pair •, eof] [List ! List Pair •, (] which=represe[nPts atheirfutu!re nee(d toPfiandira m•atcFh) igiun, for the item [Goal !9 List •, eof]. It represe5nts the parser8’s accept t a, PT cso5ncn =dc>:2
raPirvca
o[e
:1[r,Pla!eir=!4(.•[P(PaPaiirr)
,
(]ifr•!], )(]•[Paairi)r, );] oa
rah
eire
s
e
)w,(,] m
s)ceir
)p, e[to a
(s]e p=
ciu r
returns that new state.
3,
rge 3
e
d
s
t
o
fi
n
d
t
)rPfia]eisr
t
mPcac

(e[
o
c
o
l
goto(cc3,() is c
1
reduction by8 Goal ! List, with a looka1h3e2adCHoAfPTeERo3f.ParsersT9he left context for this set is cc , which represents a state where
goitnot(hcec3fu,tPuareir.) is cc5. n
no
8
><[Goal ! • List, eof] [List ! • List Pair, eof] [List ! • List Pair, (] >= (< [Goal ! List •, eof] [List ! List • Pair, eof] [List ! List • Pair, (]= n •n !eirc hoas recognized one or more occurrences of List. When it then re e ] , which traces the construction of CC. ).42.3 c6. 0 1 9 0 ir al,ire [ • ( ) , cc0 = [List!•Pair,eof] [List!•Pair,(] [Pair!•(Pair),eof] cc1 = !,Pa)i it[.P[eaP t!o t>;s
fr
)
riara!t
iiro•!n
P)th,(ae]ir
!olfo• ](o
d([eP
•(e8
( Poa(f
w•h[iP)
r]!
Paicrc, it e=nters1t0his state. Both items represent a reduction by List
[Pair!•(Pair),eof] [Pair!•(Pair),(] [Pair!•(),eof]
, it
mus
reco
P
6
cc2, and cc3. Five of the combinations produce nonempty sets, four of whicgohto(cc ,() is cc .
air !
we can derinve the state of the parser after it recognizes an initial ( by com- o cc1 represe(nts the parser configurations that result from recognizing a List. )
[P
•(),)] [Pair !(•),)]
cc5 consists of two partially complete items. The parser has recognized a (
are new. goto(cc ,() is cc , which represents the future need to find a mat
0,i(s)t.T!heiPnnaerirloo•p,fiendosfo]urite[mLsitshtat!haveP•abierfo•re,((.] Allof1theite[Pmasira!r3ep•o(ssPiabirli)ti,e)s]tha[tPaleiard!to(a•nPotahirer),peaiorfo]fp[aPraeinrt!hes(es•,Pair),(]
pcutcing g=oto(c[cL
2 cc3 =
followed by a Pair; it now must fiWnd ahmeantchiintg )
a
Goto creates a new item for each, with the • advanced beyond (T. Chleosuprearser arrives in cc6 when it encounters a ( and it alread
goto(cc1, Pair) is cc4.
adds two more items, generated from the ite
tains

T
, which specifies a redu
h
i
s
s
a.rIfrtihvepeasrserinfinexdc
e
e
t
c
o
n
sep
tat
t,t
amir 8
fo)r
hiet [itPe
[!Go•
•,e
rep(
i
ze
0
3
t
e
gotos(tactec—a, Preaduicrt)ionisbycGcoa.l ! List, with a lookahead of eof.
, one item
t
h
!),
o[fP
),(
e
p
a
r
s
e
r
h
a
gniz
s
r
c
o
ea
m
r
at
ch
ing
at s
r
e
s
g
p
n
al(
)L]ist
aircc!
]a.i
er
ept(
Irt !
,teso
[’Psa
re•se)n
tfh]e
pars
o
om
n
]
.
resents the p6arser configur6ationsPaafteirrit!has r(ecoPganiizred)an.inBitioaltPhacc3 represents the parser’s configuration after it recognizes an initial (.
items is cc
I
c
parser to stack anoth
n
cc4(= [List!ListPair•,eof] [List!ListPair)•,(] cc5 = [Pair!(Pair•),eof] [Pair!(Pair•),(]
c
,
a
n
o
t
h
e
r
w
(
0 Wiirh
re
Pai
r.
se
ill ca
u
s
3
n one(onothestack.Tnheitemsshowthateithera(ora)leadoto
will reduce by rule 4, Pair ! ( Pair ).
ms
wit
h
The
5
• be cc0 ,(
fo
e
t
t.e
rser s
t t
t i
h
e
e
m
s
p
e
c
i
f
y
h
e
c
o
ep
t
t
h
e
n
e
e
d
f
o
r
a
m
a
t
c
h
i
n
g
.
goto(cc3,)) is cc7.
T
ir
!
)]
he
l
eft
co
nte
x
tf
or th
is
se
ti
s cc
w
1,
te
w
5
oenpsaisrtsTsehroefsetcwonodiptearartiioanlloyftcheowmhiplletloeopitterimesst.oTdehrievepnaerwsesertshfarosmrcec1o,
hic
h re
pres
en
ts
a
sta
hcerce)tch
. cc = [List ! Pair •, eof] [List ! Pair •, (]
( ) 3.4 Bottom-Up Parsing 133
[Pair!(•),(] [Pair!•(Pair
),)
[Pa
),
2
Tofindthes[ePtaofirsta!test•hat(dePriavierdi)re,c)tly]from[Psaoimre!state(su•chPaasicrc0),,th)e] cc,anndcc.Fiveofthecombinationsproducenonemptysets,fourofwhoich
cc2 represen2ts the pars3er configurations after it has recognized an initial Pair.
has recocgcniz=ed one or more occurrences of List. When it thenforellcowgendizebsya Pair; it now must find a matching). If the parser fi
c=om[Pabirin!a
Pair, it eitnemteirnsctch0.isThsitsaptreo.duBceostahllitthemsestsrtehaptraereseonetsaymreb6odluacwtaiyofnrobmyccL0i.swt !ill LreisdtuPc9aeibr.y rule 4, Pair !( Pair ).
[Pair ! • ( ), )] an[Pdairc!c
n•,e
(•.
),O goto(cc ,1P0air)iscc .
algo6rithm can computegoto(cc0,x)for each x that occurs after a • in an Bothitemasraerneehwan.dlesforareductionbyList!Pair.
)]nlyoncce
o•n, eocfr]ea[tPeasir !a n
m]
ti
o
(
The parser arrives in cc6 when it encounters a ( andIift,ailnreasdtayteha3s,athleapstarser nfinds a ), it takes the transition toocc
7
goto(cc1, Pair) is cc4.
(
)
)
(
i
s
c
To compute the complete canonical collection, we simply iterate this process
= n[List ! List Pair •, eof] [List ! List Pair •, (]o
e
on
the sta
[
P
r
P
,
e pa)rser ]
(
r
c
.
goto(ccto1a,(fi)xeidspcocin3t., which represents the future need to find a matching ). cc goto(cc5,)) is cc8. goto(cc3,() is cc64.
F
one ( on the 1st1ack. Thne items show that either a ( or a )olead to v
ck. The items show that either a ( or a ) lead to valid states.
10 specify a reduction by Pair ! ( ).
on The Algnorithm o cc The lef
=(t context
goto(cc3, Pair) is cc5. goto(cc9,)) is cc11.9 [Pair ! • ( Pair ), )] [Pair ! ( • Pair ), )]
Tcocc8on=struc[tPthaeirca!noni(caPl acoilrle)cti•on,eofosfet]s of[lPra(1i)rit!ems(, thPeaailrgo)rit•h,m(] has recognized one or more occurrences of List. When it then recognizes a
ion S 10 0
goto(cc3, Pair) is cc5.
yces a trancscition t=o an e[xPisatiinrg !state.( Pair ) •, )]
!S, the gotwo(icllcr6e,dPuacier)biysrcucle9.4, Pair!(Pair).
or a
If,instateini3tia,litzhinegCpCartosecornfitaindcsc0,aas)d,esictritbaekdeasrlitehr.eNterxat,nitssiytsitoemnatiocalclyc7.Bothitems cc5= [Pair!(Pair•),eof] [Pair!(Pair•),(]
grammar
with the go
al
pr
odu
ct
algo
rithm
7
s. Tthorefienonfdtheac)omtboinactionms proledutce rnuewloese 7
fo
ere)th
r th
a
is s
e
i
a
i
t is c
pre
sen
!

c , whi(ch re
tate )wh
1
sas
t
o cc6 = computestheinitialset,cc0,andthensystematicallyfindsallofTthhesetstohfirditeratioPanir,iotefntetrhs[Ptehaisiwrstah!tei.Bl•oet(hi)tleo,m)os]rpeprterseinetsa[rPetdaouicrtid!onebry(ivL•iest)!,n)Le]iswtPaisr.e
goto(cc3,)) is cnc7.
cc5 = [Pair!(Pair•),eof] [Pair!(Pair•),(]
Whenitarrlirv(1e)siteimnstshtaattanerer8e,actn
e 3.2[2P,
r, anedeccd
sets in cCCc;7go=to, in[Ptuarnir, u!ses c(lo)su
c
a
goto(cc1,() is cc3, which represents the future need to find a matching).
hhabelepfraormseccr0.hItarsepreeatcedolyganpipzliedsgoatnotionthsetanneowcoe of rule 4,
I•r
sahio trw
pc)goc•r
.cce
n,e.eFoi
h !s t e
r,(m]s
c
6
Pair ! ( Pair ). Both items specify the corres9ponding reduction.The parser arrives in cc6 when it encounters a ( and it already h
c
gfu]r
he(al
it5,h
c
,regoin
followed by a Pair; it now must find a matching ). If the parser finds a ), it
goto(cc ,)) is cc .
3 cc5 consi7sts of two partially complete items. The parser has recognized a (
=
r
!
•pb
[
P
]
a
i
cc5 consists of two partially complete items. The parser has recognized a (
(
)
)dsub
specify a reduction by Pair ! ( ).
81 of 96 n o
nds e
a t
t e
goto(cc ,() is cc . In goto(cc3,()isccc6c.9= [Pair!(Pair•),)] 6 6
c
c
,
a
cc7 [Pair () ,eof] [Pair () ,(] State 11 calls for a reduction by Pair ( Pair ).
air
)
rfi
), i
n
o
t
h
e
r
w
6
The third iteration of the while loop tries to derive new sets from cwci4ll,reduce by rule 4, Pair ! ( Pair ).
(
s
h
tchin
g
. If
the
pa
rse
i
l
l
c
l cause
•ListPair,(] = erive new set
), e•of)] , ( er !m•p(ty),(s]ets, f
a
u
Pairir
(• (
g). ;
parser has ins only
9
fol
low
ed by
aP
; it
no
w
mu
st
find
a ma
3.4 Bottom-Up Parsing 133
rser to stack another
,
of “(())()”, the parser stacked an instance of sta d a ( Constructing Action and Goto Tables (2)
t it encounters. These stacked states allow the algorithm to m
), it Resulting Action and Goto tables: pening and closing parentheses. ro
nStdatien
Action Table
Goto Table
gedof
a(h
s)tr
nListit
ioPanirs
parser traverses the nont
0s312
1 acc s3 4
hese transitions, shown in
2r3r3 3s6s75
n the Goto table. The comb 4r2r2
s is5 to invoke sth8e dfa rec l.6 s6 s10 9
7r5r5 8r4r4
9 s 11
10 r5 11 r4
air)•,(]
.
ethatthehandle-fiaaonbothterminala
nalsymbols.Theerminaledgesonl
ilenasttance of rule 4,
to Pair.
ael!acLtisotn.Eachoft grayinFigure3.25 uction.
ates.
st ! List Pair
sets from cc , cc ,
stoavalide8ntry9iinedeffectofthet t. | Pair
iorn!te(rmPairn)alactionursivelyeachtime tems
|()
nize a nontermina
cc4 , t one
.
er tEorstraockrsanointhetrhe Table Construction ntheses Grammar (b) and Tables
te 3 f
83 of 96
cc62repngred
n
from cc1 parseoreve
cc1rep(thaatchu All of t
exceptheo
state—a
goto(cc , the p
valid st c,()ree1rduGco,corr
!
r
the parse
)cognizes ! ( Pa
nd no
N
g
Li
c
p
niz
ed
a
(
3
o
st
Pa
i
o
n
nds a), i pty set a4ndPna
. Both i
5
recog
ts from
4.
as at leas
ts, while
t
i
ching ). ermiyon
ydhasnat ction
As a second example of the lr(1) table construction, consider the ambig nFIGURE3.16 TheParenthesesGrammar.
9from cc gotoIt(cficnd,)s)oisnclycem.ptysets,sotheconstructionhaltswith12sets,cc
The final iteration of the w
got
6
cc6 = If, in state 3, the parse(r finds a ), it takes the transition) to cc7. Both items
produces a transition to an existing state.
[Pair!•(),)] [Pair!(•),)] [Pair!•(Pair),)] [Pair!(•Pair),)]
o(c
c
,()
is cc
.
p urthi(cc,)iscc.Incc,anotherwillcausetheparsertostackanothercc6=ou,sg11gawa
h
,)] e
loo
e
s to derive new sets from cc .
t
ri
teration of the while loop tries to derive new! • !

goto 6 ( 6 6 (
sets from cc , cc
g
.c .
S
1
13
ot go
pr
xisting state.
The parser arrives in cc6 when it encounters a ( and it already has at least
Th
state 6 to represent the need for a matching ). The third iteration of the w6hile loop tries to derive n8ew sets
one(onthestack.Theitemsshowthateithera(ora)leadtovalidstateosn.e(onthestack.Theitemsshowthateithera(ora). 6 10
o(ctco
) i, )s
through cc
.
goto(cc3,)) is cc7.
set
co
ntains
o derive new sets from cc4, oneit2emyspositionedat
1c
r
a
r
all
)cics 37
and cc10. Only one combi cc5, cc6, and cc7. Thr
c
s fo
The final iteration of the
uring const
It finds only empty sets
through cc11. 18 20
ly empty19
Filling in the Tables
c cc1
r
ed
1c0
11
it
ion
to
an
e
Constru
1
cting A
n
cc10 =
o
)
10 If, in state 3, the parser finds a ), it takes the transition to cc7 . Both itemsI
[Pair ! ( ) •, )] If, in state 3, the parse
1. = 11
is
cc111=1 j
e ar
d
17
1 cases gener
8
od
uce
s
a tra
10 cc 2 CC. E
tate l12yelements
s
f
o
u
c
ii uct
14
15 t
16
eration of th fromocc
se2sets,c
r)•,)] ○ L12, 13: Next valid step in discovering A is to match terminal symbol x.
t ○ L14, 15: Having recognized , if current word matches lookahead a, reduce to A. s
21
1.
Given the canonical collection of sets of lr(1) items for a grammar, the
parser generator can fill in the Action and Goto tables by iterating through
State 11 calls for a reduction by Pair ! ( Pair ).
CC and examining the items in each ccj 2 CC. Each ccj becomes a parser
○ L16, 17: Accept if input exhausted and what’s recognized reducible to start var. S.
state. Its items generate the nonempty elements of one row of Action; the
○ L20, 21: Record consequence of a reduction to non-terminal v from state i
corresponding transitions recorded Tduhrineg cfionnstraucltiointeofrCaCtisopencifyothfe the while loop tries to deri 82 of 96
e
pars
ns
cc7 = [Pair!()•,eof] [Pair!( ction and Goto Tables (1)
n
cc7 =n[Pair!()•,eof] [Pair!()•,(]o cc =n[Pair!()•,)]
er ar
rive
s in
cc w
he
e
nt
n it
ncou
ers
a(
and it
alre
sa
st
ady ha
t lea
l
y
o
mbination creates a n
n
e
c
o
o
6(c,)c no
t i
cc5,cc6,andcc7.Threeofthecombinationspros,
n
e
m
p
t
y
s
e
t
.
specify a reduction by Pair ! ( ). r finds a ), it takes the transition to cc7. Both items
ALGORITHM: BuildActionGotoTables
The third iteration of the while loop tries t In an LR parser, the handle is alwa cc5, cc6, and cc7. Three of the combinations
INPUT:
, which specifies a reduction to Pair.
Pair!().
n(1) a grammar G=(V, ⌃, R, S) o
F
ptor
edr
e
m
c
i
l
l
i
n
g
i
n
t
t→s fSro
h
e
T
Given the canonical collection of se anOeUxTisPtUinTg:stAactet.ion Table & Goto Table
a
bles
′ stparocdkutcoespaatrnandsitiohnetocahnaeixnistoinfghstanted.les
while(2lo)opgotrailes
This set contains one item, which speci
od
ui
teino
cv
the while loop tries to derive new sets from cc4,
enw sS
(3) a canonical collection CC = {cc ,cc ,…,ccn}
nation creates a nonempty set. r0ever1se rightmost derivation.
c8, c
c9
,
of the combinations produce new sets, while o
ee (4) a transition function ∶ CC × ⌃ → CC
ne
[
P
!(Pair)•,)]
The fourth iteration of the while loop tri
PRO
r
C
ED
a
i
UR
E
for cci ∈CC
:
arser generator can fill in the Actio
:
p
for item ∈ cci:o
aPitan
on creates
eimrd
[,A)
CC and examining the items in each
)=•c
. c ] → 1 0 ●
O
n
l
y
o
n
e
c
o
m
b
i
na
t
i
[Pair !if(
(
x) =
Action
]
a
s
p
o
n
]
e∧
d
9 11
i
[i, n
cc
ep
cc
c
x
:= shift j i j
e
n
c
th
t
o
Action[i , x]
i
n
b
,
elseif item = [A → ●, a] then
,
y
P
ir!(Pair). ).
a
a
\pa
us
o
state. Its items generate the nonempt
n
b
yP
air
(Pa
!
ir
Action[i, a] := reduce A →
elseif item = [S → S′●, eof] then while logopotrtieos t(ocdcerive,)ne)wisetscfcrom c.c11.
c
ng transitions recorded d
or
r
e
eo
t
f
:
=
, so the construction halts with 12 sets, cc0
e while loop tries to derive new sets
for v∈V:
end nonempty elements of Goto. Three
ts,if (cci, v)=ccj then
s
t
G
v
]
construction halts with 1
o
to[i , o
=j e
h
table: cc11 = [Pair ! ( Pai
end
i
s
c
c
.
is
Th specify a1re1duction by
The fourth iteration of t4he The third iteration of
3
produce new sets, while one
5
fies a reducti
6 producesatransitio7ntoestoderive
ts of lr(1) goto(cc9,))9nnandGotot
produces a
a nonempty
the Tables
nonempty elements of Goto. Three cases generate entries in the Action
table: It finds only empty sets, so the construction h
lead to valid states
duce new set
•, (]o
,rammarfortheclassicif-then-elseconstruct.Abstractin
0
e
t contain a handle, so the parser shifts ) onto the stack to build more
etails of the controlling expression and all other statem
t!P c[0,bLe
j
ahead row
codes n of
rse re tries
BUP: Discovering Ambiguity (1)
. It moves to state 7.
hem as terminal symbols) produces the following f
hird iteration, the situation has changed. The stack contains a han-
mar:
ir ! ( ) i,t, where t is the stack top. The Action table directs the
o reduce ( ) to Pair. Using the state beneath Pair on the stack, 0, and Pair.
e pars
er moves to state 2 (specified by Goto[0,Pair]). In state 2, a grammar, the
pckandeofasitslookahead,theparserfindsthehan- frcG,ocacl,!Stmt
the sta om1 c ratin r,ti an
2 sotm]).eF
the par of3Ac
his situ
89
tegthrough aidreduces,whichleavestheparserinstate1(specified
Stmt ! if expr then Stmt icsinaalplya,rinsesrtate1,withListatopthestackandeofas
,serdiscoversthehandlehGoal!List,ti.TheAction
tion; the | if expr then Stmt else Stmt
tationasanacceptaction,sotheparsehalts.
Cecifythe
| assign
C sp 4
qwoshiftsandthreereduces.lr(1)parserstaketime e Action
uired t in th
for
ir ato sets
by i
ional to the length of the input (one shift per word returned from nner) and the length of the derivation (one reduce per step in the
two● nCoanlctuelramteiCnCal=s{yccm,bcocls,.,.G.,}oal and Stmt, and six terminal s on). In general, we cannot expe0ct to1discover the derivation for a
e in an●y fewer steps.
Calculate the transition function ∶ CC × ⌃ → CC
xpr, then, else, assign, and the implicit eof. 3.17 shows the parser’s behavior on the input string, “(())().”
w sets from cc . 84 of 96 11
sceropnesrftorumcstsioxnshiftbs,efigveinresducbesy,andinoniteiaclciezpitnognthicscinput.to the item [
0 3.18 shows the state o0f the partially-built parse tree at the start of
ith 12 sets, cc
ents ( ur-pr
while on
.Ohedbytrea context
does no
items
with Pa new
ables dle hLis
set.
of one
able en ructio
ate en proport
0 hccIthasymbol
g in
1
derivati entenc
through cc
.
• Stmt, eof ] and taking its closure to produce the first set.
11
the canonical collection of sets of lr(1) items for a grammar, the
if, e
4
ngtooductio
n the t gram
dle, hPa parser t
on to
Pair, th
Figure
ve ne
Thheepar Goal
alts w Figure
each iteration of the parser’s while loop. The top of each drawing shows an
iteration number and a gray bar that contains the partial parse tree’s upper

> ;;;;;;;=:;!• ;{}
cc ; ; ; ; ; ; > ;
;
3!•{}
cc =>< !• >=
> > > > ( > > > >< ) > > > >
> > > >=
!
• Stmt, for if, and for assign.
B
e
4 5 [Stmt ! if • expr then Stmt,{eof, else}],
2 cc4 ; ;
3 cc ; cc cc ; ; ; cc [Stmt!if•exprthenStmtelseStmt,{eof,else}] cc6 =
heca;noni;calcollectionofsetsoflr(1)items. 14
0 [Stmt ! • assign, eof ] [Stmt ! • if expr then Stmt else Stmt, eof ] cc10 = nFIGURE3.26 TraceoftheLR(1)ConstructionontheIf-Then-ElseGrammar.
8 cc14 ; cc15 cc7 ; ; ; cc8 ;
ItemGoalStmtifexprthenelseassigneof Thefinaliterationloo[kSstamtctc1
cc0 nFhmarsym
From this set, the construction begins deriving the remaining members of
9 cc15 ;0 cc ;; cc; Fcrcom; t;his se;;t, the; co;nsctcructio;n; begins ;deriving the remaining members of in cc15, it can only generate empty sets. At this point, no additional sets of
0123
Figure 3.26 shows the progress of the construction. The first iteration exam-
3.4
( >[Stmt ! • assign, {eof, else})], cc8 = {[Stmt ! assign •>, {eof, else}]}
The fourth iteratio>n examines transitions out of cc5. It creates new sets for ( > )
2cc;;;;cc; ;>; >
S;tmt, for cicf7, a=nd for;
; ; cc5 :[Stmt ! • if expr then Stmt else Stmt, {eof, else}];
assign. ; [Stmt!ifexprthenStmt •,eof],
The fifth itera[tSitomnt!exaimfineexsprcct6h,ecnc7S,tmatnd•eclcs8e.SWtmht,ileofm]ost of the com- 3cccccc;ccccwsets.The
(;) 5 6 7 8
()
BUP: Discovering Ambig8uity (2.1)
Item Goal Stmt if 0 cc0 ; cc1 cc2
arsing 137
;cc7 ;; ;
cc2 ; ; ; Item Goal Stmt if cc3 ; ; ;
cc7 ; ; ; 2cc4 ; ; ;
8 53 cc95 4 ccc160
; cc161 cc72
; ; ;
cc7 ; ; ; (
cc7 5 cc9 ; cc11 cc2
struction on the If- [Stmt !
6cc11
cc8 ; ; ;
; ; ;
] [Stmt!•if !• eof] [Stmt
cc12 ; cc13
7cc13 ; ; ;
cc10 ; ; ; ; cc12 ; ; ; 8 cc14 ; cc15 cc7 ; ; ; 0 cc8 ;
c
6cc11 ; ; ; ; ; ; ; ;
nFIGURE3.26 Traceoft
c
LR(1) Co=nstruction on the If-Then-Else Grammar.
he
4cc5 ; 6 7 ;
6;;;; 4 cc6 ; ; ;
cc7 ; ; ; cc1
cc ; ; ; c cc7 ; ; ; ;
cc8 ; ; ; ;
5 cc9 ; cc11 cc2 ;3.4 Bottom-U;p Par
85 of 96 From this set, the construction begins derivin
9cc15 ; ; ; ; ; ; ;[Stm;t
gn, eof ] [Stmt ! • [Goal Stmt, eof ] [Stmt if expr then Stmt, eof ]
;
;
; c10 ;
0
Resulting transition table:
1cc1;;;c7;;;;
5 cc ; c;c ;cc ;; cc 190 112 12
3.4 Bottom-Up Parsing 137
cc ;;c
;;
P
c
tr n
f
6 cc10
cc ccc
Item
Goal
Stmt
cc4 ; ; ; ; e3xprcc5then; elscec6 ascsc7ign ;eof ; ;
expr
cc8 ; ; ; ; 9;cc; ; ;
8 cc15 ; cc 4cc6;;;;14;cc9 ;15;
cc;3 ;; ; ; ; c;c ;;
; c;c5 ;
cc7 ; ; ; cc10 ; ; ; ;
cc
5; cc9 ; ; cc9c11 cc2; ; ; ; ;
cc8 ; ;9;8
cc4 ; ; ; ;
1
1then
;
else
; ; ;; ;;;
15
cc3 ;
; ; ;nFIGU;RE3.2;6 TraceoftheLR(1)Cons
cc10cc10;; ;; ;; ;;cc12 ; ; ;
; cc;5 ; ;( ;;
6cc11 ; ; ; ; ; ; ; ;
; cc12 ; ; ;cc13 ccc7c8 ; ;; ; ; 3
7;cc13cc;12;cc;;;;;;;;cc14 ; ;
cc8 ; nFIGURE3.26 TraceoftheLR(1)Co
9
[Goal ! • Stmt, eof
cc =
cc10 ; ; ; ; 8;cc14;; ;cc150cc7; ;;; ;
;;;;;
; ; ; cc8 ; 9 cc15 ;
if
cc8 ;
; ; [;Stmt;! •;assi;gn, e;o
assign
;
6 cc12 ; 1;3
0 cc0 ; cc1 cc2 ;11; ;; cc3 ;
3.4Bottom-Up 1 cc1 ; ; ; ;123; ; ; 13;
7 cc ; c;c cc2 ; ; ; cc4 ; ; ; ;
expr then else assign eof
; ; ; cc3 ;
cc3 ; ; ; ; ; ; ; ;
cc; ; cc;
; ; ; 78;
2cc4 ; ; ; ;134cc5 ; ;15;
eof
; ; ; cc3 ; ; ; cc14 ; ;
Stmt,
[Goal
]
;7; ; cc ;
;; cc3;;;uctionontheIf-Th
20 cc40 ; c;c1 c;c2 31 cc51 ; c;c6 c;c7
cc2 ; ; ; 4 cc6 ; ; ;
;;; c7; ;; ;
cc12 ; cc13 cc7 ; ; ; cc8 ;
7
3.4

!
UpParsisngs
Bottom-
13
i
; cc8 ={[Stmt!assign•,{eo;
[Stmt ! if expr then Stmt •, eof ], binations produce the empty set, two combinations lead to ne
cc6 = 8
f, else}]}
The fourth it[eSrtmatt!ioinf exparmthieneStsmtt•realnsesiStmiot,enosf]out of cc . It creates new sets for transit5ionone[Sltsmet!fromifc•c6exlepardsthtoenccS9tm,ta,{nedofth,eltsraen}s],itionon
9;; cc7= Thefifthiter(ationexaminescc6,cc7,andcc8.Whilemostofthe)com-[Stmt!if•exprthenStmtelsese
Stcmct, for if, an;d for assig;n. 9
cc7 creates cc10.
binations p;rodu[Scetmtth!e eimfp;t•yesxetp,rtwtohecnomStbmint,a{teionfs,elesade}t]o, new sets. The
; cc7 =
; ; cc8 ={[Stmt!assign•,{eof,else}]}
cc7 ; ( ; >[Stmt )! if expr then Stmt els ],
B
:
crea
tes c c10
.
; 9 The fifth ictecr9at=ion examines cc6, cc7, and cc8. While most of th
U
P
Discovering Ambiguity (
transitiononel[Sstemftr!omicfc•6 lexapdsrtothcecn9,SatmndtetlhseetrSatnmsitt,i{oenofo,nelesxep}r]from 8
> cc8={[Stmt![Satsmsitg!n•,{eiof,elesxe}p]}rthenStmt •,eof], ><[Stmt!•ifexprthenStmt,eof], ; cc =8 ; cc6 ; >[Stmt ! • if expr then Stmt else Stmt, eof ],
The fifth iter[aStmiot!niefxeaxpmr•intehesnSctmctel,secSctmt,{eaofn,edlsec}c] . Wcchil=e{[mStmots!tioifetxhper•cthtoehmennS-StmtmtteellsseeSStmtmtt,•{,eeooff,e]}ls
; cc11={[S
;
c
c
; ; 6 7 8 11
Iteration seven creates8cc13 from cc12 on Stmt. It recreates cc7 and cc8. 89
tm
nStmtelseStmt •,eof]}
[Stm
.
if expr then Stmt else • Stmt, {eof, else}],>Iteration eight finds one new set, cc14 from cc13 on the transition for else
t ! if expr the
8
binations produce the empty set, two combinations lead to new sets. The
(>[Stmt ! if expr then • Stmt, {eof, else}], )> >[Stmt ! if expr then • Stmt, {eof, else}],
When the sixth>iteration examines the sets produced in the fifth iteratio>n, it >
; > W>hen the sixth i>teration examines the sets produced in the fifth iter
><[[Sttm ; ; t r 2.2.2) 7 Resulting canonical collection CC: ; cc;= [Stmt ! • if expr then Stmt,eof], [Stmt ! • assign,eof] 3 >[Stmt ! if expr then Stmt else • Stmt, eof ],> >
> [Stmt!ifexprthenStm>t •elseStmt,>eof]
> > binations produce the empty set, two combinations lead to new se
cc ; ;
binations pro9duce the 3empty set, two combinations lead to new settsr.aTnshietion on else from cc6 leads to cc9, and the transition on exp
The fifth iteratio[Stmt ! • if expr then Stmt else Stmt, eof ],>
>( > )
;;;
(
transition on els:e from cc6 leads to cc9, and the transition on ;exprccfrocmreates cc . [Stmt ! • assign, eof ] 7 10
[Stmt ! if expr • then Stmt, {eof, else}], cc7 creates;cc10.( [Stm;t!if•exprthenStmt,){eof,cecl10s=e}],
[Stmt ! if expr • then Stmt else Stmt, {eof, cc= 8 9
7 [Stmt ! if expr • then Stmt, {eof, else}],
cc10=8 [Stmt!if•exprthenS9tmtelseStm>t,[{Setmotf!,eiflsexep}r]thenStmtelse•Stmt,eof],> cc; > >
; 8[S
;
;
then Stmt else Stmt, {eof, else}] > > ; cc9 [Stmt ! • if expr then Stmt, eof ], When tchce9 s=ixth iteration examines the sets produced in the fifth
;=
cc ; ; >[Stmt ! • if expr then Stmt else Stmt, eof ],>
tmt ! if expr •
><[Stmt ! if expr then Stmt else • Stmt, eof ],>= <[Stmt ! • if expr then Stmt, eof ], = When the sixth iterati8on examines the sets produced in the fifth iteration, it > >
14
cc14 also creates duplicate sets for cc2 and cc3 from cc9.
>[Stmt ! • if expr then Stmt else Stmt, eof ],> creates two ne>w sets, cc from cc on Stmt and cc from cc> > > : 11 9 12 ;10
cc => {[Stmt!assign•,{eof,els>e}]}
8 : ; 3[.S4tmBott!tom-•UpasPasrisignng,e1of39] creates two new[sSettms,t c!c11•fraosmsicgcn9,eoonfS]tmt and cc12 from cc10 on then. It
c
c
s d8u(plica;te sets f;or cc and;cc from cc . ) (
also crea
cc10 = [Stmt ! if expr • then Stmt, {eof, else}],
te
239
cc10 = [Stmt ! if expr • then Stmt, {eof, else}],
nS•tmSttm•t,e{lesoef,Setlmst,e{}e],of,else}],>=Iterationseven>cr[eSatemstc!c13iffromexcpc1r2 otnhSetnmt•.ItSrtemctreealteseccS7tmant,d{ceco8f.,e
910
tt!if expr the
on=enw setls,scce frroommcc concStmlteaandcsc tofrocmcc , oanntdhent.hIte transic[Stmt.! • if expr then Stmt else Stmt, {eof, else}a],l>so creates dupl>i[cSatmtet !setisffoerxcpcr tahnedn cStcmt f•ro,m{eocfc, e.lse}],
also creates dup>licate sets for cc2 and cc3 from cc9. >
7 >: 10 >; cc13 = >[Stmt! •if2exprth3enStmte9lseStmt,{eof,e Iterationeightfin[Sdtsmotn!en•eawsseitg,nc,c{1e4ofr,oemlscec}1]3 onthetransitionforelse. >:[Stmt!ifexprthenStmt •elseStmt,{eof,else}]
cc11 = {[Stmt ! if expr then Stmt else Stmt •, eof ]}
[Stmt ! •assign,{eof,else}]
Else Gram
8 8 ) ) 9 cc11 ={[Stmt!ife9xprthenStmtelseStmt •,eof]} 89
m
a
>
>[Stmt ! if expr then • Stmt, {eof, else}], > 8
>< >[Stmt ! if expr then Stmt e=>lse • Stmt, eof ],> > [Stmt ! • if expr then Stmt, {eof, else}], >
!
exprthe>n>Stmt,eof] > >8[Stmt!if>exprthen•Stmt,{eof,else}], 9 cc14 =<[Stmt ! if expr then • Stmt else Stmt, {eof, else}],= >
<>=
>[Stmt ! • if expr then Stmt else Stmt, {eof, else}],> >>[Stmt ! if expr then Stmt else • Stmt, {eof, else}],> > >>> >
cc12=>[Stmt![S•timftex!prt•heniSftmt,{exofp,erlset}],henStm>t,eof],>[Stmt!ifexprthen•StmtelseStmt,{eof,el>s : ;<< = >
then cSct9>m=t else Stmt, eof ] > [Stmt ! • if expr then Stmt, {eof, else}],
> [Stmt ! • assign,{eof,else}]
>[Stmt ! • if expr then Stmt else Stmt, {eof, else}], > ccc1c21=4 = [Stmt ! • if expr then Stmt, {eof, else}],
>>>>
: [Stmt! •ifexprthenStm;telseS>t>m[St,met!of•]i,fexprthenStmtelseStmt,{eof,else}],>
[Stm>t ! •assign,{eof,else}] >> > > if expr th>en Stmt, eof ] >:[Stmt ! • if>expr then Stmt else Stmt, {eof, els;e
Iteration nine genera:tes cc15 from cc14 on the transition for Stmt, along with > [Stmt ! • as;sign,{eof,else}]
[Stmt ! • assign,eof] duplicates of cc7 and cc8.
:[Stmt ! •assign,{eof,else}]
Iteration nine generates cc15 from cc14 on the transition for Stmt, along wit
emain87ionfg96 members of
xpr then Stmt else Stmt, eof ]
cc15 = {[Stmt ! if expr then Stmt else Stmt •, {eof, else}]} duplicates of cc and cc . 78
[Stmt if expr then Stmt, eof,else ],
Stmt, {eof, el e • Stmt,eof
expr from ;; }]
; ; 9>=
;ecom- sing 1;3 >
;; c12 ; ;; ; cc
t>;s. The r from
on then. It ;;)
; ;lse}],>
;
e}] ation, i>t
then. It en-Ellse}],>;
Then- . 9 • if > expre}],>=>
! • }], >
>; 7cc13;;;cc=;(;!•cct!•)ife(!•{})h
g the r
s.
cc fo uct1ion
m of d
!
5. Sinicfe the•xlpiers at•thetehndeonf eSvetrmy itteemlcsc1e5=S{[tSmtmt,!{eifofex,perltshen}S]tmtelseStmt •,{eof,else}]}
else}]) iteration, it
9> =
=
the canonical collection of sets of lr(1) item
( [Goal!1•cSc1tmt,;eof;] ; ;[Stm;t!•;ife;xprt;henStmt,eof] ) the canonical collection of sets of lr(1) items.
cc7 BUP
biguity (2.2.1)
cc ; ; ; cc ; ; ; ;
IGURE3.26 TraceoftheLR(1)ConstructFioingounreth3e.I2f6-sThhoewns-tEhlei spn
[Stmt assign, eof ] [Stmt if expr then Stmt else Stmt, eof ] cc
5 m t(his set, th3e cco
new sets for the canonical collection fr
24
!• !• 3;;;;;;;;
0
2 cc4 ; ; ; ; cc5 ; ; ;
6 ibncece7 nstr;ucticocn
:
Di
canonical collection of sets of lr(1) items.
;
;
=
;
;
;
cc
;
;
;
cc for assign. These sets are:
sc
o
v
e
10
;
[Stmt!•acscs8ign;,eof;]cc;[3Stfmort;a!ss•i;ginf.Te;hxepsrest;ehtsenareS;:tmtelseStmt,eof]
ure3.26show5stchc9
he c;onstr;uctio;n.i
fis3rst
rin
Fig
g
Am
eroGergasrmemstsarho.fethetcroansntrsucittioino.nThsefiorsutiteraotifoncexcam-foreac
4 cc6 ; ; ; ; ; cc9
[Goal!•Stmt,eof] newsets[Sftomrth!ec•ainofniecxaplrcoltlehcetnionStfmrotm,ecocf0]:cc1forStmt,cc2forif,and
nTnhc ec3
;ihteeratiotnraexn
for each gra
ur
e3.26showstheprogressofthr
gram
om cc : e cons0t
gsinths;edterarni;vsintigon;tsheourtecomc8facinc0i;nfgormeeamchbgerasmomfarsym)bol.Itproducesthree
e pr;ogreccs
ocfct
oams-itions out of cc
s
0
1
1
2
smthtehitsrasnesti,titohnescountstorfuctcio0nfobregeiancshdgeraivminmgarthseymrebmoali.nIitnpgromdeumcebsertshroefe n( o
cc10 ; ; ; ; cc12 ; ; ;
cc1 = [Goal ! Stmt •,eof]
cc12 ; cc13 cc7 ; ; ; csaentosnfiocRratlheceoscllaeuncotlitnoincaoglfcsoecltlsae1co3ntfi8olonr(nfC1rH)oicAmitPeaTmcElcRs.c:3coPcalrlsefeorrcsSttimot,nccCfCor:if,and
6cc11 ; ; ; ; ; ; ; ;
l collection from c cc = [Goal ! Stmt •,eof]
n 012
e
7 cc13 ; ; ; ; ; cc14 [St;mt !;if • expr then Stm1t,eof],
cc2 =
for assign. These sets are:
8 cc14 ; cc15 cc7 ; ; ; [Sctcm8t!;if•exprthenStmtelseStmt,eof]
(
cc
;
)
8 w
ure 3.26 shows the progress of the construction. The first iteration exam-
n o cc for assign. These sets are:
3 9cc15 ; ; ; ; ; ; ;
;
s
ets for the canonica
s the transitions out of cc0 for each grammar symbol. It produces three
cc1 = [Goal ! Stmt •,eof] cc3 = n[Stmt ! assign •,eof]o n( o
R(l1e)CcontsitroucntionfornothmeIfc-Tch0e:n-cEcls1eGfroamrmSar.tmt,cc2 forTifhe,anedxt iteration com[pSutems tra!nsitionisffrom•cec4 w sets for thneFIGcUaREn3o.26niTcracaelofcthoeL
x;pitrcreateshcec5
=th.=ese th[reGe noewalse!ts. Stmt•,eof] cc2=[Goal!•Stmt,eo1f3]8CH[AStPmTt!ER•i3fPeaxprsretrshenStmt,eof] 8 [Stmt!if•exprthe9n
l
t 4u,tc
n
3 foTrhaessniegxnt. Ti(theersaet[iSostenmtstc!aormei:pfTuht•esesxetprcaornsdtithiioetenrsaSttifmornot,meoxfacm]c,in;esittrcarnesgai)otietoson(sccoc
()
caho2sefn)
451
cc0= n o
goto(cc4,th[eStnm[t)S!.tm•ats!signi,efOofn•]lye[Sxotmnpter!c•toihmfebenixnpSrattmtiohteneplStrsmoetdeulSsctemesStt,maet,onefoefw]] set, looking>[aSttmcct2!wiifthetxhpersytmhebnol• Stmt,eof], >
cc1 = [Goal ! Stmt •,eof] > >
8n expr. o 9 ><[Stmt! >
From this set, the construction begins deriving the remaining members of cthce3ca=non>(ic[aSl ctomlletc!tioniaoffssetesioxgfpnlr•(1,)teithoemefsn.] • Stm(t,eof], )
if expr then • StmtelseStmt,eof], >= no
of], [S[tSmtmt !t !asisfig•n •e,xeporf]t> cFicgu2re=3.<26showstheprogressoftheconstruction.Thefirstiterationexam- The nex=t it3e>ra[tSitomnt!com•paustessigtnra,{nesoitfio,neslsfer}o]m, cc4; it creates cc>5 cc4= > >
cc> = [Stmt ! • if ex)pr then Stmt,{eof,else}],
c>n
>[Stmt!if•exprthenStmt,eof], > > >
>[Stmt!ifexprthen• Stmt[eSltmset!Stmitf,eeoxfp]r, •the
=
e second iteration[Setxmatm!ineisft•raenxspitriontsheonuSttmoftetlhseeseSttmhtr,eeofn]ew sets.cc
cince5s t=he tran[sSitimonts!out o•f cci0fforeexacphrgratmhmearnsySmtmbotl.,[I{Stetpmoroftdu,!ceelsisthfere}e]e,xprgo•tot(hcecn,Sthm2etn[eS)lt.mste!Stm•t,iefofe]xpr thenStmtelseStmt,{eof,else}]
>>
new sets>fnor the canonical collection from cc0:occ1 for Stmt, cc2 for if, and > n [Stmt ! if • expr t
4
ly one combinati>on produces a new set, looking at cc2 with the symbol> 8 9
>[Stmt ! • assign,{eof,else}], >
cc3 for a>ssign. These sets are: The four>th iter>at[iSotnmtex!amiifneesxtprranstihtieons•oSuttmot,fecocf ]., It creates new se>ts f
r. [Stmt !n • if expr othen StmtT
eStmst,e{e
f,n
se}]iter
cc3 =:[Stmt!assign•,eof]
>
>[Stmt ! if expr then • Stmt else Stmt, eof ], >
ro=duces a new set, looking at cc2 with the symbol cc ly one comcbci4n=ation cpc2 [Stmt ! if • expr then Stmt else Stmt, eof ]
pr.
[Stmt ! • if expr then Stmt else Stmt, {eof, else}] cc3 = n[Stmt ! assign •, eof ]o [Stmt ! if expr then Stmt • else Stmt, eof ]
The fourth iterat[iSotnmte!x[aSmtmitif!neiesfxt•praerxnps•rittithohenesSntmoStu,tetmofot]
[
S
i],gn •,eof> of
cc6 = The secon
d
i
(
([Stmt ! if expr • then Stmt, eof], )
cc4 = cc86={[Stmt !assign•,{eof,else}]}
el
h
c
o
d
; > 5 > ation examines transitions o
O,f,ecocnf5].l,yIt croeantes ncewosmetsbfoirn
3> > Stmt,forif,and[Sftomrta!ssigfn.expr•thenStmtelseStmt,eof] cc6=: ;
expr.
f
sx
() ()()
[Stmt ! if expr • then Stmt,eof], [Stmt(! if • expr then Stmt,{eof,else}],
The second[Sitetmrattio!n exiamfinesxtprarnsittiohnsenoutStomf the•se,ethorefe]n,ew sets.
t, for
cc7 =
,a
[Stmt ! if expr then Stmt • else Stmt, eof ] [Stmt ! if • expr then Stmt else Stmt, {eof, else}]
cOcnl4y o=ne combination produces a new set, looking at cc2 with the symbol
Stm
i
r
.
expr. cc7 =
[Stmt ! if expr • then Stmt else Stmt, eof] ( [Stmt ! if expr • )then
[Stmt ! if expr • then Stmt else Stmt, eof] [Stmt ![
[Stmt ! if • expr then Stmt,{eof, else}],
i

expr. cc8 = {[Stmt ! assign •, {eof, else}]}
xpr
5
cStm>t, e
> cc1 = [Goal ! Stmt •, eof ] Stmt, for if, an(
m
t
eonf,Setlma
)
n
ation produces a new se
t, lo
o
>[S =
!
s
>
a
df
tmt ! i•
ase
ipgrn
,t{he
t
o
i
o
n
e
sign
)cc4 =[Stmt ! if expr then Stmt •, eof ],
Only one combination produces a new set, l
t
The fourth iteration examines transitions out of cc5. It creates new sets f
t
e
r
ste•}s],,e
xamines transitions
t! ()
binationsccp7ro=duce the empty set, two combinations lead to new sets. T [Stmt ! if • expr then Stmt else Stmt, {eof, else}]
S
t
m
thenSitmft •eelsxeSptmrt,eo
f] then
86 of 96
The fifth iteration examines cc6, cc7, and cc8. While most of the co [Stmt ! if • expr then Stmt,{eof, else}],
[Stmt ! if • expr then Stmt else Stmt, {eof, else}]
f
e
transition on else from cc6 leads to cc9, and the transition on expr fro
sym
o
e] h
n th
mt,
]
,
items can be added to the canonical collection, so the algorithm has reachTehde final iteration looks at cc15. Since the • lies at the end of every ite a fixed point. It halts. in cc15, it can only generate empty sets. At this point, no additional sets
Wbhoeln. tIhtepsrixotdhuitceerastitohnrexeamines the sets produced in the fifth iteration, it itemscanbeaddedtothecanonicalcollection,sothealghe
The ambiguity in the grammar becomes apparent during the table-filling
a fixed point. It halts.
creates two new sets, cc from cc on Stmt and cc from cc on then. It
r Stmt, cc for if, and
. The firs2t iteration exam-
10
egrammarbecomesapparentdurilin
algorithm. The items in states cc01t1hrough cc12 gen9erate no conflicts. State 12
y
BUP: Discovering Ambig
cc13 contains four items:
)
also creates duplicate sets for cc2 and cc3 from cc9.
Th
grammar is ambiguous. Items 2 and 4 generate a similar shift-reduce conflict
cc =Q.W[Sthmatt!do•eisfitemxeparntihfethneStcmut,r{rgenoenrfatte,wsealosshrideft}etn]ot,rycfoor tnhessuamme loecatiisoneinlthse teab?le. Clearly, the tab
else Stm12t,eof]
Iteration eight finds one new set, cc from cc on the transition for else.
with a lookahead>of eof. When the table-filling algorithm encounte1rs4such 13 >
u
e am
it
(
bi
> entry cAantynpiocatlehrorolrdmbesosatgheafrcomtioanpasr.seTrgheinsersaht>oirft-reduce conflict indicates that t
A>. W[Stemcta!n e•iithfeerxsphrifth(tehneSntmetxeplescetinStgmt,o{emoaf,teclhsaen}]o,t>her Stmt) or a conflict, the con>struction has failed. The table generator should report the >
>)
red[uSctmett!o a•Satsmsti.gn,{eof,else}]with a lookahead of eof. When the table-filling algorithm encounters suc
grammianrcluisdeasmthebLiRg(1u)oituems.sIthteatmgesn2eratnedthe4 g>enerate a similar shift-reduce confli problem—a funda:mental ambigu8ity between the productions in the specific conflict; another reason to study the table ; 9
lre(1o) itfems
tothecompilerwrit>er.[Stmt!ifexprthenStmcontstreucltions.e• Stmt,{eof,else}],>

> a conflict, the construction has failed. The table gene>rator should report t A single A>=s.
In this case, the conflict arises beca[uSsetmprtod!uctio•n 2infthegxrapmrmartihs aenStmt, eof,else}],
mt else Stmct,ceo=f]
prefix of produTctihonis3. Tish1e4tkabnleogewnenratoar csoutldhbe d .compiler writer.
these three new se>ts.
conflict in favor of shifting; that forces the parser to recognize the longer
t cc with the symbol 2
In this case, the conflict arises because production 2 in the grammar is
guity
i
3
1. [
algorithm. The items in states cc0 through cc12 generate no conflicts. Sta Iterationsevencreatescc fromcc cocncSotnmtaint.sfIoturrietecmrse:atescc andcc.
b
o
l
.
I
t
p
r
d
u
c
Stm
,el
t
Stm
!
n
s
i
f
ex
pr
t
t

he
● Consider cc
3. [Stmt ! if expr then Stmt1•3else Stmt, else]
2. [Stmt !if expr then Stmt • ,eof]
13 1213 7 8
e
s
t
ree
forcSctm=t,{c[Sctmtf!oriiffe,xapnrdthenStmtels1e. [Stmt !t •if,exopfr ]t}hen Stmt • ,else]
1 11 2 ( )
4. [Stmt ! if expr then Stmt • else Stmt, eof ]
8 cc = [Stmt ! if expr then Stmt • , {eof, else}], 9
3. [Stmt ! if expr then Stmt • else Stmt, else] Item 1 generates>a r[eSdutcm)1e3etn!try foirfcc13eaxndprthe ltoohkeahnead•elsSet.mItetm,{3eof,else}], >
> [Stmt ! if expr then4. S[Sttmtt!•ifeelxspretShetmn Stt,m{te•ofel>,selSstmet,}e]of ] generates a shift e>ntry for the same location in the table. Clearly, the table >
eof ], >[Stmt ! if expr then • Stmt else Stmt, {eof, else}],> entry cannot hold[Stmt!•seStmt,{eof,else}],>
● Consider another scenario, say:
production and binds theelseto the[iSnntmermto!stif.• assign,{eof,else}]
> >>
:;
prefix of production 3. The table generator could be designed to resolve th [A → ●, a] conflict in favor of shifting; that forces the parser to recognize the long
[B → ●, a]
Q. Wh)at does it mean if the current word to consume is a?
production and binds the else to the innermost if. Iteration nine generates cc15 from cc14 on the transition for Stmt, along with
of thesdeuptlhicraetees onf cecw7 ansdectsc.8.
eof], A. We can either reduce to A or reduce to B.
g at cc2 with the symbol
A single Action table entry cannot hold these two alternatives.
else Stmt, ecocf]= {[Stmt ! if expr then Stmt else Stmt •, {eof, else}]} 15
{
orithm has reac ng the table-fil
This is known as the reduce-reduce conflict .
The final iteration looks at cc15. Since the • lies at the end of every item
88 of 96
Fro the
cc0g
Figmmar iFnreo nthewec0:cc cc3
te
3 le he ct
Fig
ine
ne aSstmt,
cc Stmt
The fifth iteration examines cc6, cc7, and cc8. While most of the com-
items can be added to the canonical collection, so the algorithm has reached ,thetcohme-nStmtelseStmt,eof]
binations produce the empty set, two combinations lead to new sets. The
t
t
7e,nanStdmictce
cc7 creates cc10.
Stmt ! i}]}f expr • then Stmt,eof]in, cc15, it can only generate empty sets. At this point, no additional sets of
cc8 = {[Stmt ! as
•,{
f, else
The fifth iteratio[nStmexta!miin
!
f
rsotfo]f •
[m
[
S
cp6r, m
s
ef
ig
se
n
eo
cc =
48 9
cx
ctch
transition on else from cc6 leads to cc9, and the transition on expr from
binations produc[eStmhet emptyifseet,xtpwrotchoemnbSitnmat,ioenosf]l,ead to new sets. The a fixed point. It halts.
8l.seWe•hilxSetmpmt,oe
henSthAtypicaler
Thasheincludesth
OnhenStficconflict;an
exputoofconstructio or
a
Thkingais
On ]
er
ex
orout Stmt,
ookin
m- (he )
Stmt

Index (1)
Parser in Context
Context-Free Languages: Introduction
CFG: Example (1.1)
CFG: Example (1.2)
CFG: Example (1.2)
CFG: Example (2)
CFG: Example (3)
CFG: Example (4)
CFG: Example (5.1) Version 1
CFG: Example (5.2) Version 1
CFG: Example (5.3) Version 1
89 of 96
Index (3)
CFG: Leftmost Derivations (2)
CFG: Rightmost Derivations (2)
CFG: Parse Trees vs. Derivations (1)
CFG: Parse Trees vs. Derivations (2)
CFG: Ambiguity: Definition
CFG: Ambiguity: Exercise (1)
CFG: Ambiguity: Exercise (2.1)
CFG: Ambiguity: Exercise (2.2)
CFG: Ambiguity: Exercise (2.3)
Discovering Derivations
TDP: Discovering Leftmost Derivation
91 of 96
Index (2)
CFG: Example (5.4) Version 1
CFG: Example (5.5) Version 2
CFG: Example (5.6) Version 2
CFG: Example (5.7) Version 2
CFG: Formal Definition (1)
CFG: Formal Definition (2): Example
CFG: Formal Definition (3): Example
Regular Expressions to CFG’s
DFA to CFG’s
CFG: Leftmost Derivations (1)
CFG: Rightmost Derivations (1)
90 of 96
Index (4)
TDP: Exercise (1)
TDP: Exercise (2)
Left-Recursions (LF): Direct vs. Indirect
TDP: (Preventively) Eliminating LRs
CFG: Eliminating ✏-Productions (1)
CFG: Eliminating ✏-Productions (2)
Backtrack-Free Parsing (1)
The first Set: Definition
The first Set: Examples
Computing the first Set
Computing the first Set: Extension
92 of 96

Index (5)
Extended first Set: Examples
Is the first Set Sufficient?
The follow Set: Examples
Computing the follow Set
Backtrack-Free Grammar
TDP: Lookahead with One Symbol
Backtrack-Free Grammar: Exercise
Backtrack-Free Grammar: Left-Factoring
Left-Factoring: Exercise
TDP: Terminating and Backtrack-Free
Backtrack-Free Parsing (2.1)
93 of 96
Index (7)
Canonical Collection (CC) vs. LR(1) items Constructing CC: The closure Procedure (1) Constructing CC: The closure Procedure (2.1) Constructing CC: The goto Procedure (1) Constructing CC: The goto Procedure (2) Constructing CC: The Algorithm (1)
Constructing CC: The Algorithm (2.1) 95 of 96
LR(1) Items: Example (1.1)
LR(1) Items: Example (1.2)
LR(1) Items: Example (1.3)
LR(1) Items: Example (2)
Index (6)
Backtrack-Free Parsing (2.2)
LL(1) Parser: Exercise
BUP: Discovering Rightmost Derivation
BUP: Discovering Rightmost Derivation (1)
BUP: Discovering Rightmost Derivation (2)
BUP: Example Tracing (1)
BUP: Example Tracing (2.1)
BUP: Example Tracing (2.2)
BUP: Example Tracing (2.3)
LR(1) Items: Definition
LR(1) Items: Scenarios
94 of 96
Index (8)
Constructing CC: The Algorithm (2.2) Constructing CC: The Algorithm (2.3) Constructing CC: The Algorithm (2.4) Constructing Action and Goto Tables (1)
Constructing Action and Goto Tables (2)
BUP: Discovering Ambiguity (1)
BUP: Discovering Ambiguity (2.1)
BUP: Discovering Ambiguity (2.2.1)
BUP: Discovering Ambiguity (2.2.2)
BUP: Discovering Ambiguity (3)
96 of 96