COMP90045 Programming Language Implementation
Syntax-Directed Definitions
Harald Søndergaard Lecture 14
Semester 1, 2019
PLI (Sem 1, 2019) Syntax-Directed Definitions ⃝c University of Melbourne 1 / 18
Syntax-Directed Definitions
An assumption behind attribute grammars is that the functions used to evaluate attributes are free from side-effects.
A syntax-directed definition (SDD) is a relaxed version which allows side-effects.
C-based tools such as yacc and bison implement SDDs.
PLI (Sem 1, 2019) Syntax-Directed Definitions ⃝c University of Melbourne 2 / 18
S-Attributed Definitions and LR Parsers
An LR parser can evaluate S-attributed definitions without constructing a parse tree.
It can do this by storing the synthesised attributes of each grammar symbol with the state following that symbol on the parser stack.
Whenweareabouttoreduce A→X Y Z thetopofthestackis s1 X s2 Y s3 Z s4
Given that all these nonterminals have one synthesised attribute each (A.a, X.x, Y.y, and Z.z), the stack may be organised as shown on the right.
s4
Z.z
Z
s3
Y.y
Y
s2
X.x
X
s1
PLI (Sem 1, 2019) Syntax-Directed Definitions ⃝c University of Melbourne 3 / 18
Synthesized Attributes on LR Parser Stacks
s4
Z.z
s3
Y.y
s2
X.x
s1
Actually, the stack doesn’t need to store the grammar symbols themselves; they are implicit in the states covering them. (For example, s2 always covers X). This is because all states except the first were constructed as the target of a goto on a given symbol; each state covers the symbol that caused its construction.
top →
We can implement the stack as two separate arrays, state and val, with top pointing to the current top element in each array. (Or we can implement the stack as an array of two-element structures.)
PLI (Sem 1, 2019) Syntax-Directed Definitions ⃝c University of Melbourne 4 / 18
Attribute Values and Reductions
s1
s2
X.x
s3
Y.y
s4
Z.z
s1
s
A.a
↑↑
top top
action(s4,c)= reduceA→XYZ, semantic rule: A.a:=X .x + Y.y + Z .z
where s = goto(s1, A)
before := top-3;
newtop := before+1;
state[newtop] := goto(state[before],A);
val[newtop] := val[before+1]+val[before+2]+val[before+3]; top := newtop;
PLI (Sem 1, 2019) Syntax-Directed Definitions ⃝c University of Melbourne 5 / 18
Example: Implementing Actions for a Conditional
stmt : IF expr THEN stmt { $$ = mknode(If,$2,$4) } The val array stores values (in this case, ASTs).
Within each action, $n refers to val[before + n].
In this case, before = top-4, and $2 refers to val[before+2],
and $4 refers to val[before+4].
The resulting AST is stored in val[newtop].
PLI (Sem 1, 2019) Syntax-Directed Definitions ⃝c University of Melbourne 6 / 18
S-Attributed Definitions with RD Parsers
For each non-terminal A, build a function that returns A’s single synthesized attribute. If A has more than one synthesized attribute, group them into a single structure.
The code of this function is an extended version of recursive descent. The code for each production handles each part of the RHS:
terminal X: match token, advance input and return X’s synthesized attribute.
nonterminal B: generate B.s := f (A1.s, . . . , Ak .s) where B.s is B’s synthesized attribute.
action code: copy to function. The code has access to the synthesized attributes of the symbols on the RHS, and computes the value to be returned: the synthesized attribute of the LHS.
PLI (Sem 1, 2019) Syntax-Directed Definitions ⃝c University of Melbourne 7 / 18
Identifier Lists: An Example
list : ID rest { $1 : $2 }
rest:COMMAIDrest {$2:$3} | { [] }
pIdList = do
ident <- pId
idents <- pRest
return (ident : idents)
pRest
= ( do
pComma
ident <- pIdent
idents <- pRest
return (ident : idents)
) <|>
return []
PLI (Sem 1, 2019) Syntax-Directed Definitions ⃝c University of Melbourne 8 / 18
Left Recursion Elimination
Even if you can write an S-attributed definition for the original grammar, you may not be able to write one for the grammar you get after left recursion elimination.
E1∗T E.val=E1.val∗T.val
E →
E → E1/T E.val = E1.val/T.val E → T E.val=T.val
T →
num T.val=num.value
E→TR?
R → ∗TR1 ?
R → /TR1 ?
R→ǫ ?
T → num T.val=num.value
PLI (Sem 1, 2019) Syntax-Directed Definitions ⃝c University of Melbourne 9 / 18
Left Recursion Elimination
If we allow inherited attributes, and we allow actions to be written between grammar symbols on RHSs, then we can capture the desired calculations with semantic rules like these (note they use both inherited and synthesized attributes):
E → T R R→∗ T
R1 R→/
T
R1 R → ǫ
{R.in=T.val} {E.val=R.val}
{R1.in=R.in∗T.val} { R.val = R1.val }
{R1.in=R.in/T.val} { R.val = R1.val } {R.val=R.in}
Note: here all “sideways” inheritance is left-to-right in the parse tree.
PLI (Sem 1, 2019) Syntax-Directed Definitions ⃝c University of Melbourne 10 / 18
Left Recursion Elimination
This graph shows attributes of the nodes of this parse tree for “9 / 5 * 2”, and how they are computed.
The value of R.in in the base case is then propagated up the tree as the value of R.val and then E.val.
E.val = 2
T.val = 9
num.val = 9 /
R.in = 9,R.val = 2
T.val = 5 R.in = 1,R.val = 2
PLI (Sem 1, 2019)
num.val = 5 ∗ T.val = 2
num.val = 2 Syntax-Directed Definitions
R.in = 2,R.val = 2
ǫ
⃝c University of Melbourne
11 / 18
The General Case
A → A1 Y {A.a=g(A1.a,Y.y)} A → X {A.a=f(X.x)}
After left recursion elimination, this turns into
A → X {R.i=f(X.x)} R {A.a=R.s}
R → Y {R1.i=g(R.i,Y.y)} R1 {R.s=R1.s}
R→ǫ {R.s=R.i}
PLI (Sem 1, 2019)
Syntax-Directed Definitions ⃝c University of Melbourne 12 / 18
L-Attributed Definitions
In an L-attributed definition, a node may have synthesised attributes, and it may inherit attributes from its parent and from its siblings to its left, but not from its siblings to the right.
This allows all attribute values to be computed using a single depth-first, left-to-right traversal of the parse tree.
• ••
••• ••
PLI (Sem 1, 2019) Syntax-Directed Definitions ⃝c University of Melbourne 13 / 18
L-Attributed Definitions with RD Parsers
Building a recursive descent parser for an L-attributed definition is only slightly harder than building one for an S-attributed definition.
The main difference is that the functions we build now have arguments, one argument for each inherited attribute of the nonterminal they recognize.
The actions that compute the values of the inherited attributes of a nonterminal on the right hand side can be anywhere before the call to the function that recognizes that nonterminal.
PLI (Sem 1, 2019) Syntax-Directed Definitions ⃝c University of Melbourne 14 / 18
Inherited Attributes with Yacc
Yacc doesn’t have any support for inherited attributes, and neither do most other LR parser generators.
The problem is that bottom up parsers can’t use the technique used by top-down parsers. They can’t store inherited attributes in function arguments, since they don’t parse different nonterminals using different functions.
One way to cope with this problem, which always works, is to delay the evaluation of all inherited attributes, as well as any synthesised attributes that depend on them, until after the parse tree is complete.
However, sometimes we can do better.
PLI (Sem 1, 2019) Syntax-Directed Definitions ⃝c University of Melbourne 15 / 18
Example: Infix to Postfix
Yacc (but not happy) allows (side-effecting) actions to occur anywhere on the right hand side of a production, not just at the end.
For example, this yacc specification converts expressions involving additions and subtractions from infix to postfix form:
expr : term rest
;
rest : ’+’ term { printf(“+ “); } rest | ’-’ term { printf(“- “); } rest | /* empty */
;
term : NUM { printf(“%d “, $1); } ;
PLI (Sem 1, 2019) Syntax-Directed Definitions ⃝c University of Melbourne 16 / 18
Infix to Postfix: Parse Tree
Given the input “9 – 5 + 2”, this generates output “9 5 – 2 +”. This parse tree shows the order in which actions are executed:
9
E TR
{printf(”9”)} − T {printf(” − ”)} R
5 {printf(”5”)} + T {printf(” + ”)}
2 {printf(”2”)}
R
ǫ
PLI (Sem 1, 2019) Syntax-Directed Definitions ⃝c University of Melbourne
17 / 18
Actions Inside Productions
Bottom up parsers can’t insert actions into the middle of functions; they execute actions only when they perform reductions.
They therefore replace actions in the middle of productions by a new marker nonterminal, create a production with an empty right hand side for that nonterminal, and attach the action at the end:
rest : ’+’ term print_plus rest | ’-’ term print_minus rest
| /* empty */ ;
print_plus :
;
print_minus:
;
{ printf(“+ “); }
{ printf(“- “); }
PLI (Sem 1, 2019)
Syntax-Directed Definitions
⃝c University of Melbourne
18 / 18