COMP90045 Programming Language Implementation
Syntax-Directed Translation
Harald Søndergaard Lecture 16
Semester 1, 2019
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 1 / 21
Syntax-Directed Translation
The idea of syntax-directed translation is that the rules for translating the abstract syntax tree to IR should be based on the structure of the abstract syntax tree itself: the rules for constructing the IR can be written as semantic rules.
In other words, attribute grammars are a useful tool for code generation.
Typically, each type of node in the abstract syntax tree (such as expression, statement) has several attributes used in generating IR.
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 2 / 21
Syntax-Directed Translation Attributes
One attribute is the IR code generated for the node itself.
For expressions, other attributes include the type, and the place
where the expression’s value is stored.
The place attribute can be synthesized or inherited.
If inherited, the “user” of the expression (the context in which it appears) dictates where its value should be put.
If synthesized, the user must be able to cope with all possible places where it could be put.
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 3 / 21
Translation of Primitive Expressions
We will assume that place is an inherited attribute, and that it is the number of the target register. gen() generates an IR instruction.
E → intconst
E.code = gen(E.place ′ :=′ intconst.value)
E → id
E.code = gen(′load′ E.place ′from′ id.stackslot)
In non-optimizing compilers, every variable is stored in its own stack slot, allocated before code generation starts.
Optimizing compilers can arrange that some variables may spend their entire lives in registers, and can enable variables with non-overlapping lifetimes may be stored in the same register.
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 4 / 21
Translation of Operators
The standard algorithm for translating expressions with binary operators uses a stack, which can be simulated with registers. Unary operators are even simpler to handle.
E → E1 + E2
E1.place = E.place
E2.place = E.place + 1 E.code = E1.code ∥ E2.code ∥
gen(E.place ′ :=′ E1.place E.op E2.place) E1.place = E.place
E → −E1
E.code = E1.code ∥ gen(E.place ′ :=′ E.op E1.place)
Here, ∥ concatenates code sequences.
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 5 / 21
Translation of Function Calls
E → id(exprlist)
Save all the live registers (the ones containing values that may be needed later) in stack slots. If you follow the algorithm on the previous slide, these will be registers 0 through E .place − 1.
Put the values of the expressions from expr list where the called function expects them.
Call the function, arranging for control to return to the instruction after the call.
Move the return value to where it is expected, E.place. Restore all the saved registers from their stack slots.
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 6 / 21
Stack Slots for Temporaries
When processingx=1+2*f(3,4+5*g(y));the stack frame will need temporary slots holding 1 and 2 at f ’s call site, and other temporary slots holding 3, 4 and 5 at g’s call site.
The code generator should keep two counters. One counts the number of temporary slots in use right now; the other records the maximum value the first counter has ever reached.
The size of the stack frame should be the sum of the number of slots needed for variables, the maximum number of slots needed for temporaries, and the number of slots needed for other purposes (for example, the saved return address).
The compiler generates the prologue (which allocates the stack frame) after it generates the code of the body of the function, because only then does it know the second number.
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 7 / 21
Translating Assignments
S → id:=E E.place = 0
S.code = E.code ∥ gen(′store′ E.place ′to′ id.stackslot) After this, the value being assigned is available both in register zero
and in the variable’s stack slot.
After assignments, as after other statements, the counter for the number of temporary slots in use right now can be reset to zero.
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 8 / 21
Translating Boolean Expressions
Translating boolean expressions is relatively easy if the comparison instructions of the target instruction set put boolean results into general purpose registers.
E →
E1 < E2
E1.place = E.place
E2.place = E.place + 1
E.code = E1.code ∥ E2.code ∥ gen(E.place ′ :=′
E1 and E2
E1.place = E.place
E2.place = E.place + 1
E.code = E1.code ∥ E2.code ∥ gen(E.place ′ :=′
E →
E1.place ′and′ E2.place) Such instruction sets also make it relatively simple to translate
compound statements involving changes in the flow of control.
E1.place ′ <′
E2.place)
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 9 / 21
Translating Do-While Loops
S → doS1whileE E.place = 0
S.begin = newlabel()
S.code = gen(S.begin ′ :′) ∥ S1.code ∥ E.code ∥
gen(′if ′ E.place ′=′ ′true′ ′goto′ S.begin)
S.begin :
S1.code
E.code
if E .place = true goto S .begin
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 10 / 21
Translating While Loops
S →whileES1
Here is the obvious way to compile the while loop:
S.begin :
E.code
if E .place = false goto S .after
S1.code
goto S.begin
S.after :
However, optimizing compilers usually compile while E S1 as if it were written if E then do S1 while E , to reduce the number of branches per iteration from two to one.
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 11 / 21
Translating If-Then-Elses
S → ifEthenS1elseS2
E.code
if E .place = false goto S .else
S1.code
goto S.after
S.else :
S2.code
S.after :
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 12 / 21
Boolean Expressions without Boolean Comparisons
Few instruction sets have comparison instructions that put a boolean result into a general purpose register. Most have comparison instructions that put the comparison result into a special register called the condition code register or CCR, where only conditional branch instructions can look at it.
Compilers for such machines give each boolean expression E two inherited attributes, E.true and E.false, specifying the labels at which execution should resume if E is true and false respectively.
The statements where boolean expressions appear, for example, loops, if-then-elses and also assignments of booleans, supply the labels.
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 13 / 21
Comparisons with CCRs
Boolean expressions still need a place attribute to tell them which registers they can use for computing subexpressions, even though it doesn’t control where the final boolean result is put (since it isn’t put anywhere as a boolean).
E → E1 < E2
E1.place = E.place
E2.place = E.place + 1 E.code = E1.code ∥ E2.code ∥ gen(′if ′ E1.place ′ <′
gen(′goto′ E.false)
E2.place ′goto′ E.true) ∥
The conditional branch IR instruction requires two separate instructions (a comparison and a branch) in many real world instruction sets, but that is OK.
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 14 / 21
Short Circuit Evaluation
E →
E1 and E2
E1.place = E2.place = E.place
E1.false = E.false
E1.true = newlabel()
E2.false = E.false
E2.true = E.true
E.code = E1.code ∥ gen(E1.true ′ :′) ∥ E2.code
Short-circuiting r1 < r2 and r3 < r4:
if r1 < r2 goto L
goto E.false
L: if r3 < r4 goto E.true
goto E.false
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 15 / 21
Short Circuit Evaluation
E →
E1 or E2
E1.place = E2.place = E.place
E1.false = newlabel()
E1.true = E.true
E2.false = E.false
E2.true = E.true
E.code = E1.code ∥ gen(E1.false ′ :′) ∥ E2.code
notE1
E1.place = E.place E1.false = E.true E1.true = E.false E.code = E1.code
E →
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 16 / 21
Translating If-Then-Elses with CCRs
S → ifEthenS1elseS2 E.place = 0
E.false = newlabel() E.true = newlabel() S.after = newlabel() S.code = E.code ∥
to E.true E.code
to E.false
S1.code
goto S.after
S2.code
gen(E.true ′ :′) ∥ S1.code ∥
gen(′goto′ S.after) ∥ gen(E.false ′ :′) ∥ S2.code ∥ gen(S.after ′ :′)
E.false: S .after :
E .true :
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 17 / 21
Translating Do-While Loops with CCRs
S → doS1whileE S.begin = newlabel()
S.after = newlabel() E.place = 0
E.true = S.begin E.false = S.after S.code = ...
S.begin :
S1.code
E.code
S.after :
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 18 / 21
Translating While Loops with CCRs
S →whileES1
S.begin = newlabel()
S.body = newlabel() S.after = newlabel() E.place = 0
E.true = S.body E.false = S.after S.code = ...
S.begin :
E.code
S.body :
S1.code
goto S.begin
S.after :
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 19 / 21
Translating Boolean Assignments with CCRs
If id and E are booleans, then the assignment id := E
should be translated as if it were written
if E then id := true else id := false
That is also how boolean arguments of function calls should be computed.
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 20 / 21
Coming to a Theatre Near You
Runtime environments, and handling function calls and returns. Then: Symbol tables, and memory management.
PLI (Sem 1, 2019) Syntax-Directed Translation ⃝c University of Melbourne 21 / 21