Assessed Exercise, Task 3- Code Generation
Assessed Exercise, Task 3: Code Generation
Summary
In this task, you are going to implement a code generator
targeting RISC-V machine code, for ASTs for the same
simple programming language.
The input for this task is again in the form of an S-expression
representing an AST, according to the specification
described for Task 1, read from System.in. For this task,
we assume that the input program has passed the both the
syntax analysis phase (Task 1) and the semantic analysis
phase (Task 2).
The generated code, which is printed to System.out,
should be valid and executable RISC-V assembly code, with
the result of the computation saved in register a0 (also
known as x10). For your convenience we provide the simple
assembler interface (based on RARS’s Java API) that we
use to verify your generated code.
You can make use of the grammar in the skeleton project
provided for Task 2 to turn the input AST (represented by an
S-expression) into a data structure that you can work on
directly (you will find symbol tables handy as you generate
code) — but again, you are free to write the code generator
in any way you want, as long as it adheres to the
requirements specified below.
Please also carefully read the submission guidelines.
Input/Output specification
Input format
Your code generator will read an AST, in the form of an S-
expression, from System.in as input. (The format will be
the same as the output of the Task 1, but may also contain
whitespace.) Note that the input is assumed to satisfy the
following:
1. The first function is main, which takes no arguments, and
which has return type int;
2. All functions have distinct names, and all parameters of
each function have distinct names;
3. All identifiers used in a function, are either the names of
parameters of that function, or the names of other
functions which are defined somewhere;
4. All the function bodies satisfy the type deduction rules of
Task 2.
It will generate valid and executable RISC-V assembly code
that implements the input program — in particular, it is
required that the generated code can be simulated by RARS
through the provided assembler interface, which we use to
verify your generated code.
Output format
After the generated code finishes execution, register a0
(also known as x10), which is the second line of the output
of RARSInterface.java, should be the result of the
computation (the return value of main). For example, if we
have rars_46ab74d.jar and RARSInterface.java in
the current folder, we can pipe RISC-V assembly code into
RARS as follows (depending on your OS):
On *nix-based systems
We can pipe RISC-V assembly code into RARS with the
shell command
printf “li t0, 100\n\nmv a0, t0” | java -classpath ./rars_46ab74d.jar ./RARSInterface.java
On Windows systems
We can pipe RISC-V assembly code into RARS on the
command line using
cmd /c “(echo li t0, 100 & echo mv a0, t0) | java -classpath .\rars_46ab74d.jar .\RARSInterface.java”
Additional technical remarks
The generated code should not lead to an error (such as
memory misalignment) or loop indefinitely while being
simulated: the first line of the output of
RARSInterface.java should be either
CLIFF_TERMINATION or NORMAL_TERMINATION.
While the main function in the input program receives no
argument, we require that the generated code for all other
functions to work correctly for (sensible) arbitrary
arguments. This means that a suitable procedure call /
return mechanism based on activation records should
be implemented in the generated code.
Integers should be mapped to RISC-V 32 bit signed
integers, despite the fact that we only allow non-negative
integer literals in the input language / S-expression. (You
do not have to treat the possibility of integer overflow /
underflow.)
Some suggestions
It might be easier to first compile down to an intermediate
representation, such as stack machine code or register
machine code, and then embed the IR code into RISC-V
assembly. We provide an example assembly file that
makes use of RARS’s macro functionality to realise this
idea. Note that it might be handy to define some
additional custom instructions!
Each new assembly command should be on a new line —
i.e., don’t put more than one assembly command on a line.
Hint: Consider generating a list of RISC-V assembly
command strings, and then concatenate them with added
newlines after / between each.
You’ll need a function that generates fresh labels. Don’t
hardcode labels, ensure that the function generates a fresh
label every time it is called. Don’t use a random generator
for this, use a global / static variable for this purpose. This
is one of the few legitimate uses of global/static variables.
Check that your submission is not miscompiling
conditionals like if / then / else or loops in the sense
that if the condition evaluates to true then the else-
branch is executed and vice versa.
You are advised against using global variables for function
parameters in your code generation methods; the reason
is that functions can be recursive, thus more than one
activations of a single function may be active at the same
time.
We strongly recommend testing your code before
submission. Note that pasting code into RARS and simply
press ‘Assemble’ is insufficient as test, because that
doesn’t test the termination of the code generated by your
code generator.
Resources
The simple assembler interface that we use to verify your
generated code.
An example assembly file based on macros for stack
machine instructions.
2021/11/27 10:28
⻚码:1/1