Tutorial – A Toy Language and Its Interpreter/Compiler
Based on Lex and Yacc Tutorial by T. Niemann 14.3.2019
• calc1, calc2, calc3
– Thefirsttwoarecalculators
– calc3a–theinterpreter
Niemann’s Files LexAndYaccCode.zip
– calc3isatoylanguage,withwhile,if-then-else,print,etc.
– Usecalc3asthetemplateforyourcompiler(ifyouwilluseC/C++)
• Three versions of calc3:
– calc3b–thecompiler;theoutputisassemblycodeforasimplestack machine which Niemann didn’t implement, which I will provide
– calc3g–todisplaytheparsetreesgraphically • I expanded the language a little bit:
– c4
• The following slides are based on calc3
calc3b
They all share the same scanner and parser files:
But each has its own backend (calc3a.c, calc3b.c, calc3g.c)
id
con
calc3a
A statement (stmt)
3 types of tree nodes: con, id, opr (of type nodeType)
con and id are leaf nodes; opr is non-leaf node
calc3g
opr
The following slides describe the common files: calc3.h (header file)
calc3.l (scanner)
calc3.y (parser)
= INTEGERs
= VARIABLEs
Note: not the value contained in the variable, but one of a, b, c, …, z
= nodeType which hasn’t been declared
An array of pointers to subtree nodes; only one is defined here; it will be expanded to as many subtree nodes as needed later on (via C’s variadic function)
Only used by the interpreter; the compiler generates a, b, c, …, z which are mapped to the stack machine’s registers with the same names
A tree node (of type nodeType)
type
con or id or opr
If opr
Two types of tokens (terminals) that have a value (yylval): VARIABLE (or id) and INTEGER (or con)
Non-zero numbers starting with 0 not allowed
Accessed like C’s union: yylval.iValue, yylval.sIndex, …; nPtr is a pointer to an inner tree node, an address
Solving dangling else (previous lectures refer)
These are non-terminals
low
high
Null statement
In the outermost scope, every stmt is a tree
To “walk” the tree rooted at $2 (stmt); walk means execute (interpreter) or generate-code-for (compiler)
ELSE has higher precedence than “IF … then”
# arguments following (= # branches
(nodeType *)
Defining the 3 node construction functions:
p
A tree node (of type nodeType)
type
con or id or opr
If opr
One can use an array or a linked list instead of a variadic function
Caller
callee
The following slides describe the backends: calc3a.c (interpreter)
calc3b.c (compiler)
calc3g.c (parse tree visualizer) – skipped
Essentially, they implement the tree walking function, ex(); for the interpreter, it executes every stmt on the spot; for the compiler, it generates assembly code
calc3b.c
static variable initialized to 0
x = 0;
while (x < 3) {
print x;
x = x + 1; }
Feel free to change “sas.y”
top: the stack pointer, pointing at the topmost element of the stack
Implemented by
sas.l + sas.y (sas = simple assembler)
A very simple stack machine and its assembly instructors
Note: these instructions are “destructive” – the two operands are replaced by the result; similarly, for the add, sub, ...
“and” “or” are implemented in the stack machine in the same fashion as the arithmetic operators, which makes complex logical expressions very easy to compile into assembly code; otherwise, it would be quite difficult (we’ll see in the near future)
This one is destructive too
• Features I added: – Comments(//...)
– Inputoperator(readx)
– Logicaloperators(&&forAND,and||forOR) – FOR-loop
c4
• Common files: calc3.h (the same one), c4.l (scanner), and c4.y (parser)
• Interpreter: c4i.c
• Compiler: c4c.c
• Assembler and simulator (which is an interpreter): sas.l, sas.y
You might want to call it a “virtual machine”
$ c4c fact.sc >fact.sas $ sas fact.sas
The readme File for c4.zip
A simple interactive calculator: ================================
cal – integers only [to make: make cal] calx – real numbers, power (^) [make calx]
A simple calculator language (sc) – interpreter and compiler ============================================================
c4i – interpreter [make c4i]
c4c – compiler -> [make c4c]
sas – the assembler for a simulated stack machine [make sas]
Example programs:
for.sc – a double for loop
fact.sc – factorial
To run:
c4i fact.sc
c4c fact.sc >fact.sas
sas fact.sas
c4i for.sc
c4c for.sc >for.sas
sas for.sas