java-compiler代写: Assessed Exercise 2

 

Introduction. Here is the second (of two) assessed exercises. It consists of three tasks.

  • Task 1 (worth 40%) is here.
  • Task 2 (worth 45%) is here.
  • Task 3 (worth 15%) is here.

When and how to submit. The first assessed exercise is due Friday 15 December 2017, at 18:00. To assure anonymity, please submit via the course’s Study Direct page, where I will set up a submission link. For submission guidelines see here.


Your task is to implement, in Java, code generators. As source language we take subsets of (a minor variant of) the simple language that we generated code for in the lectures (see the lecture notes, Page 4). As usual, the language is represented by ASTs. The target language is MIPS machine code.

Background. It might be interesting to see the language you will have to compile in it’s conventional (non-AST) form. Here it is as a context free grammar.

PROG → DEC | DEC; PROG 
DEC → def ID (VARDEC) = E
VARDEC →  ε | VARDECNE 
VARDECNE → ID | VARDECNE, ID 
ID → ... (identifiers)
INT → ... (Integers)
E →  INT 
  | ID 
  | if E COMP E then E else E endif
  | (E BINOP E)
  | (E)
  | skip
  | (E; E)
  | while E COMP E do E endwhile
  | repeat E until E COMP E endrepeat
  | ID := E
  | ID(ARGS)
  | break
  | continue
ARGS → ε | ARGSNE
ARGSNE → E | ARGSNE, E
COMP → == | < | > | <= | >=
BINOP → + | - | * | / 

The ASTs for this language can be found hereDo note change these definitions at all! (except by adding new methods). If you change them in a way that my test suite will not compile, you will get 0 points.

An important difference from the pseudo-code used in the lectures is that in declarations we use numbers as variables (instead of strings). Variable 1 refers to the first (leftmost) variable in the declaration, variable 2 to the second from the left and so on. For example the procedure

def f(x,y,z) = { if x == y then z else 0 }

becomes

Declaration ( "f", 
              3, 
              new If ( new Variable ( 1 ), 
                       new Equals (),
                       new Variable( 2 ),
                       new Variable( 3 ),
                       new IntLiteral ( 0 ) ) );

The meaning of break and continue will be explained in Task 3. Note that you don’t have to write a lexer or parser for this language, only a code generator. All your code generators will use the following given classes.

Notes on the implementations. It is your task to implement the code generators in Java. You will have to use the following interface and exception to implement the code generators.

class CodegenException extends Exception {
    public String msg;
    public CodegenException ( String _msg ) { msg = _msg; } }

interface Codegen {
    public String codegen ( Program p ) throws CodegenException; }

The MIPS code to be produced is returned by all these methods as a string. Don’t print out the generated code. Just return it as a string.

You can download these hereDo note change these definitions at all! If you change them, my test suite will not compile and you will get 0 points. I will instantiate your code generators with the following classes, one for each task.

  class Task1 {
     public static Codegen create () throws CodegenException { ... } }
  
  class Task2 {
     public static Codegen create () throws CodegenException { ... } }
  
  class Task3 {
     public static Codegen create () throws CodegenException { ... } }

(You find these three code fragments herehere and here.) To implement e.g. Task 1, replace the ... with appropriate code that instantiates the Codegen interface. If you don’t want to do a task, just throw a CodegenException exceptions in create instead.

The codegen method is intended to generate code for a whole p of type Program. The code generated for a whole program p includes preamble, data layout directives etc. The generated code should be valid and executable MIPS code. You need to add alignment instructions and ensure that upon start of your program control is handed to the initial program, and upon termination control is handed back.

The code generators for Task 1 and Task 2 work only on subsets of the language. The test suite I will use for Task 1 will only use ASTs that are valid programs in the subset of Task 1. The test suite I will use for Task 2 will only use ASTs that are valid programs in the subset of Task 2. So it doesn’t matter what your code generators for these tasks do when they encounter ASTs that are not valid in the relevant subsets. For example if you do Task 1 and 2, it makes sense to write a code generator for Task 2, and then let Task1.create as well as Task2.create return the same code generator.

The generated code must do the following.

  • Be syntactically correct, meaning is must compiler.
  • Generate syntactically correct code, meaning generated code must run in the MARS emulator.
  • Implement the accumulator machine described in the lectures. You should use $a0 as accumulator. The final result of the program should also be in $a0. If you don’t want to implement an accumulator machine, then please contact me beforehand and discuss this matter with me.
  • Implement the meaning of the program correctly, i.e. 4+4 must evaluate to 8 and so on.
  • Should not lead to an error when being executed, e.g. due to memory misalignment.
  • Integers should be mapped to MIPS 32 bit integers, as described in the lectures. Note that you don’t have to care about integer overflow. All integers occurring in test-data will be sufficiently small to avoid overflow.

A small test suite that you can use can be found here. Since I will evaluate your submissions using a testing framework like the above, it is vital that you keep the relevant class signatures intact.

Use the provided CodegenException to signal errors or problems in your code generator. Do not use other means of signalling errors. In particular, please don’t abort the program if you encounter a problem.

You can assume that each program that I use to test your submission definition defines at least one procedure. You can also assume that all declarations in a program have distinct names. All programs are valid, e.g. contain no type errors. Moreover, the first declaration in the declaration list of a program takes 0 arguments. This first declaration is to be executed first, i.e. it is the procedure that should be executed when control is handed over to the code your code generator produces. Note that the first procedure does nothave to be called main.

Note that the test suite will call the codegen multiple times on the same object of type Codegen.

I strongly recommend testing your code before submission. Note that cutting/pasting code into MARS is insufficient as test, because that doesn’t deal with the termination.

 

 

 

Implement the part of the language in black. No need to implement code generation for the red parts. No test for this task will feature programs containing language constructs in red. Your code generator can throw CodegenException for the ASTs that correspond to language constructs in red, or you can return MIPS code, it does not matter.

PROG → DEC | DEC; PROG 
DEC → def ID (VARDEC) = E
VARDEC →  ε | VARDECNE 
VARDECNE → ID | VARDECNE, ID 
ID → ... (identifiers)
INT → ... (Integers)
E →  INT 
  | ID 
  | if E COMP E then E else E endif
  | (E BINOP E)
  | (E)
  | ID(ARGS)
  | skip
  | (E; E)
  | while E COMP E do E endwhile
  | repeat E until E COMP E endrepeat
  | ID := E
  | break
  | continue
ARGS → ε | ARGSNE
ARGSNE → E | ARGSNE, E 
COMP → == | < | > | <= | >= 
BINOP → + | - | * | / 

Recall that the relevant definitions are herehere and here. If you don’t want to implement a feature, simply throw CodegenException when the code generator encounters this feature.

Note that the translation of procedure invocation (given by the production E → ID(ARGS) in the grammar above) is part of Task 1. Since every program will have at least one procedure invocation, I strongly suggest that you focus on getting this right.

 

Implement codegenTask2 for the part of the language in black. No need to implement code generation for the red parts. No test for this task will feature programs containing language constructs in red. Your code generator can throw CodegenException for the ASTs that correspond to language constructs in red, or you can return MIPS code, it does not matter.

PROG → DEC | DEC; PROG 
DEC → def ID (VARDEC) = E
VARDEC →  ε | VARDECNE 
VARDECNE → ID | VARDECNE, ID 
ID → ... (identifiers)
INT → ... (Integers)
E →  INT 
  | ID 
  | if E COMP E then E else E endif
  | (E BINOP E)
  | (E)
  | skip
  | (E; E)
  | while E COMP E do E endwhile
  | repeat E until E COMP E endrepeat
  | ID := E
  | ID(ARGS)
  | break
  | continue
ARGS → ε | ARGSNE
ARGSNE → E | ARGSNE, E
COMP → == | < | > | <= | >= 
BINOP → + | - | *  | / 

Recall that the relevant definitions are herehere and here. If you don’t want to implement a feature, simply throw CodegenException when the code generator encounters this feature.

Assignment, as well as commands whilerepeat and skip can return whatever they want, it’s not specified. No test will depend on the return value of such commands. Sequential composition (E; E') returns whatever its second command returns.

Note that ifwhile and repeat constructs all carry out comparisons. I suggest to factor the translation of comparisons into a separate method to avoid code duplication.

 

 

Implement codegenTask3 for the whole language.

PROG → DEC | DEC; PROG 
DEC → def ID (VARDEC) = E
VARDEC →  ε | VARDECNE 
VARDECNE → ID | VARDECNE, ID 
ID → ... (identifiers)
INT → ... (Integers)
E →  INT 
  | ID 
  | if E COMP E then E else E endif
  | (E BINOP E)
  | (E)
  | skip
  | (E; E)
  | while E COMP E do E endwhile
  | repeat E until E COMP E endrepeat
  | ID := E
  | ID(ARGS)
  | break
  | continue
ARGS → ε | ARGSNE
ARGSNE → E | ARGSNE, E
COMP → == | < | > | <= | >=
BINOP → + | - | * | / 

Recall that the relevant definitions are herehere and here. If you don’t want to implement a feature, simply throw CodegenException when the code generator encounters this feature.

This task is more difficult than the previous two in the sense that we have not discussed how to implement division, break and continue in the lectures. The actual changes to the code generator are small and simple. Note that / is integer division, i.e. remainders are ignored (e.g. 17/5 = 3).

Meaning of break and continue. The commands break and continue work just as the eponymous statements do in Java, see e.g. here in their unlabelled versions). They always occur within a while or repeat loop. This is checked by semantic analysis. You can assume that the test suite will only contain correct uses of break and continue.

When inside a loop, break will immediately break out of the innermost containing loop and execute the command following that innermost loop. For example consider the following declaration.

def f ( x ) = (
   while x > 0 do
      if x > 500 then
         break
      else
         x := (x-1)
      endif
   endwhile; 
   x )

With this definition, f ( 1000 ) will return 1000, while f ( 432 ) will return 0. Relatedly, the outer loop in

while x > 0 do (
   repeat (
      break; x := (x+1) )
   until x > 0 endrepeat;
   x := (x-1) )
endwhile

will be executed exactly x times (assuming x ≥ 0).

When inside a loop, continue will immediately abandon the current round of the innermost containing loop and go to checking the condition. So the following loop never terminates whenever x ≥ 0.

repeat 
   ( continue; x = ( x - 1 ) )
until x < 0 endrepeat

Both, break and continue, can leave anything in the accumulator.

Important side-condition. The grammar allows us to have expressions like 1 + break. You do not have to cater for this. All uses of break and continue will be ‘normal’, subject to the restrictions break and continue must meet in Java.