Compiler Theory
Assignment 7, February 12, 2019
Shift-Reduce Parsing
Reading: Text, chapter 5
To parse arithmetic expressions, we will use a method of bottom-up parsing
known as “shift-reduce parsing”. This proceeds at each step by either shifting
the next input token (the “lookahead”) onto a stack, or reducing one or several
symbols on top of the stack (the “handle”) into a nonterminal, which is pushed
back on the stack. The shift/reduce decision is made by comparing the top
terminal on the stack to the lookahead. The comparison symbols are <-, ->, and =,
so that
T <- L means shift
T -> L means reduce
T == L means shift (and T,L will wind up in the same handle)
When reducing, one goes down the stack until one finds <- between two terminals,
so the handle is bracketed in <- ->. The nonterminals are not included in the
comparisons, and whether they are part of the handles depends on the reduction
rules formed from the grammar.
In doing shift-reduce by hand, the stack plus remaining input is written on one
line, with the stack on the left and the remaining input on the right. The
stack start symbol is $ and the end-of-input symbol is $.
Assignment:
1. The expression grammar
E ==> E + E
E ==> E * E
E ==> ( E )
E ==> NUM Lookahead
could have the shift-reduce table at right: $ NUM + * ( )
Here, “Acc” means “Accept” for a success, ————————-
and “Err” means an error, since this $ | Acc <- <- <- <- Err
combination cannot legally occur |
NUM | -> Err -> -> Err ->
|
Top + | -> <- -> <- <- ->
stack |
terminal * | -> <- -> -> <- ->
|
( | Err <- <- <- <- ==
|
) | -> Err -> -> Err ->
So for example the expression 3 * ( 4 + 5 ) would have a parse
Stack Remaining input Comment
$ 3 * ( 4 + 5 ) $ $ <- NUM so shift
$ 3 * ( 4 + 5 ) $ NUM <- * so reduce
$ E * ( 4 + 5 ) $ $ <- * so shift
$ E * ( 4 + 5 ) $ * <- ( so shift
$ E * ( 4 + 5 ) $ ( <- NUM so shift
$ E * ( 4 + 5 ) $ NUM -> so reduce
$ E * ( E + 5 ) $ ( <- + so shift
$ E * ( E + 5 ) $ + <- NUM so shift
$ E * ( E + 5 ) $ NUM -> ) so reduce
$ E * ( E + E ) $ + ) so reduce
$ E * ( E ) $ ( == ) so shift
$ E * ( E ) $ ) -> $ so reduce
$ E * E $ * -> $ so reduce
$ E $ $ Acc $ so accept, done.
OVER
a. Do a shift-reduce parse of 2 * 3 + 1.
b. Do a shift-reduce parse of 2 * 3 * 1. Is multiplication left or right
associative?
c. Do a shift-reduce parse of 2 * 3 * ( 4 + 5 * ( 6 + 7 ) ).
2. Construct a shift-reduce table for the expressions in Atto-C. The terminals
involved are
$
* / high precedence binary operators
+ –
< > <= >=
== !=
&&
||
=
, low precedence binary operator
IDENT
NUM
! – (unary minus)
)
(
You can treat the symbols on one row as a group, so your table will only need
14 rows and columns. The binary operators are listed in precedence order, so
* / should be done before + – which are before < etc. The unary operators
have higher precedence than the binary operators. Also, remember that the
only legal thing on the left of an assignment symbol '=' is an identifier, not
an expression, so you want the combination IDENT = E to wind up being a handle.
The only nonterminal you will need is E, for expression.
You will use this table in the next stage of writing your compiler, so you
want to get it exactly right.
3. Use your table to do a shift-reduce parse of x = (5 >= 6) && !( y == 2 )
4. Operator precedence parsing: One way to compact a shift-reduce table into
convenient form is to assign numerical precedence levels to each symbol,
an “f” precedence level to use for the stack terminal, and a “g” level
to use for the lookahead symbol, so that
T <- L is equivalent to f(T) < g(L)
T -> L is equivalent to f(T) > g(L)
T == L is equivalent to f(T) = g(L)
Construct f and g values for your shift-reduce table (just put the values
next to the symbols in your table). Don’t worry about the Acc and Err entries.
Example: Lookahead
g: 50 30 10 20 50 2
$ NUM + * ( )
f ————————-
1 $ | Acc <- <- <- <- Err
|
31 NUM | -> Err -> -> Err ->
|
Top 11 + | -> <- -> <- <- ->
stack |
terminal 21 * | -> <- -> -> <- ->
|
2 ( | Err <- <- <- <- ==
|
49 ) | -> Err -> -> Err ->