窗体顶端
窗体底端
Modify the recursive descent compiler for the language PL0 (provided on the course web page with this assignment) to add a skip statement, a multiple assignment statement, and a case (switch) statement.
Please ensure you follow the course Piazza bulletin board for any updates and further information on the assignment.
You should only modify the files that must be submitted (see below). You should not modify any other files because we will be testing your implementation using the existing other files with your submitted files.
Please do not reformat the files because we would like to just print the differences between the originals and the versions you hand in. Please keep the length of lines in your files below 120 characters, so that we can print them sensibly. Either avoid using tabs or set your tabs stops to 4 spaces (this is the default for Eclipse) so that your files will print sensibly.
Your implementation should be in Java 1.8. Set the Eclipse preferences for the Java compiler to 1.8 (or if not in Eclipse use the “-source 1.8” option to the Java compiler). The compiler will compile under Java 1.6 and that level is sufficient for doing your assignment.
Read the fine print in detail before starting! When you have finished implementing the assignment, come back and re-read the fine print again.
Overview
The addition of a skip statement allows one to write
if x < 0 then x := -x else skip
to assign x its absolute value.
The multiple assignment statement allows set of assignments to take place simultaneously, conceptually. The trick is that all the expressions are evaluated before any of the assignments takes place, for example, the following multiple assignment swaps the values of x and y:
x := y | y := x
The following assignment rotates the values of x, y and z:
x := y | y := z | z := x
The following case statement assigns to y exactly once.
case i of
when 2: y := 1
when 3: y := x
when 5: y := x*x
default y := 42
end
Syntax Changes
The following lexical tokens have been added to the scanner module of PL0 already.
BAR
→
"|"
KW_CASE
→
"case"
KW_OF
→
"of"
KW_WHEN
→
"when"
KW_DEFAULT
→
"default"
KW_SKIP
→
"skip"
Keywords are case sensitive.
The non-terminal Statement in the grammar for PL0 is modified to the following Extended Backus-Naur Form (EBNF) grammar rules.
Statement
→
WhileStatement | IfStatement | CallStatement |
Assignment | ReadStatement | WriteStatement | CompoundStatement |
SkipStatement | CaseStatement
where
SkipStatement
→
KW_SKIP
Assignment
→
SingleAssign { BAR SingleAssign }
SingleAssign
→
LValue ASSIGN Condition
CaseStatement
→
KW_CASE Condition KW_OF { CaseBranch } [ KW_DEFAULT StatementList ] KW_END
CaseBranch
→
KW_WHEN Constant COLON StatementList
Note that the branches of the case statement contain statement lists rather than just a single statement.
Static Semantic Restrictions
Skip statement
There aren't any static semantic restrictions on the skip statement, that is, it is well formed for any symbol table context.
syms ⊢ WFStatement(skip)
Multiple assignment
All of the variables on the left side of a multiple assignment must be distinct. The type of each condition (expression) on the right side of an assignment must be assignment compatible with (coercible to) the base type of the corresponding variable to which it is assigned.
∀ i, j ∈ [1..n] • i ≠ j ⇒ vi ≠ vj
∀ i ∈ [1..n] • (syms ⊢ vi : ref(Ti)) && (syms ⊢ ei : Ti)
syms ⊢ WFStatement(v1 := e1 | v2 := e2 | ... | vn := en)
Case statement
The type of the expression (condition) for the case statement must be a scalar type (it could be int or boolean or a subrange thereof) and all labels of case branches must be compatible with that type and must be elements of the type (see containsElement in Type.java). The labels must be constant expressions, and the value of each label must be unique within a single case statement. The bodies of all the case branches must be well formed.
The rule for well formedness of a case statement is given below; within it T must be a scalar type and for a scalar type values(T) is the set of values in the type T, e.g. values(boolean) = {false,true} and values([-2..2]) = {-2,-1,0,1,2}.
syms ⊢ e : T
syms ⊢ WFStatement(ss)
∀ j ∈ [1..n] • syms ⊢ WFStatement(ssj) && syms ⊢ cj : T && (cj →e vj ⇒ vj ∈ values(T))
∀ i,j ∈ [1..n] • i ≠ j && ci →e vi && cj →e vj ⇒ vi ≠ vj
syms ⊢ WFStatement(case e of when c1: ss1 when c2: ss2 ... when cn: ssn default ss end)
Dynamic Semantics
Skip statement
The skip statement does nothing. The less it does the better.
Multiple assignment
For a multiple assignment, all the expressions in all the assignments are evaluated before any of the variables are assigned, and then all of the assignments take place. Note that it doesn't matter in what order the simple assignments are done because the left sides are all distinct and the expressions have all been evaluated using the values of the variables before any assignments take place (and the language does not have any side effects).
Case statement
The case statement evaluates its (scalar) expression and then selects the branch for which the value of the expression is equal to the label on that branch and executes the corresponding statement. If the value of the expression is not equal to any of the labels of any of the branches then the default branch is executed. If there is no default branch, the programs stops using instruction STOP with an exit code of StackMachine.CASE_LABEL_MISSING.
On completion of the execution of the appropriate branch within a case statement, the next statement executed is the one following the whole case statement. That is, it is NOT like a switch statement in which execution of one branch flows into the next branch unless a break statement is placed at the end of the first branch -- a language feature that is the source of many a bug.
Object Code
The PL0 compiler generates code for the Stack Machine. A description of the stack machine is available in Inst.pdf. See also the class StackMachine for details of the instruction set.
The code generation is done in class CodeGenerator by making calls to methods in class Code that stores instructions in a code sequence. There are also utility methods within Codethat may be of use for implementing the code generation.
The code generation for the skip statement is trivial.
The code generated for the multiple assignment does not need to make use of additional temporary variables, but it can (should) use the stack. It can be done quite simply (once you work out how) with the same number of instructions as the equivalent single assignments.
The code generation for the case statement can be implemented via a branch table, i.e., it uses a branch that branches to one of a sequence of branches all stored one after the other (or a table of addresses). Which branch the first branch branches to is determined dynamically by the value of the case expression. The selected branch then transfers control (branches) to the code for the corresponding statement list for that case. The code generation needs to handle values of the expression that are outside the range of values of the labels in the case statement as well as gaps in the case statement. The code generation for the case statement is quite intricate. You should draw up a plan of attack before diving into code. Most of the code for the branches can be generated and saved and then the code for the case statement can be assembled using the knowledge of the sizes of all the branches, etc.
There are methods within Code that may be of use for implementing the code generation. In particular, genJumpIfFalse and genJumpAlways generate branches to a given offset. These two methods also return the index into the Code array of the LOAD_CON instruction that loads the offset ready for the branch. This allows one to later patch the offset usingupdateLoadCon in Code. But beware that the index of an instruction in the code array does not correspond to its offset in the generated machine code because the LOAD_CONinstruction takes two words.
There are two basic approaches to generating the code for the case statement. One generates all the fragments of the code, except for jumps, and then assembles it and adds the jumps once the size of all the fragments is known. Working out the exact offsets for the jumps requires some careful planning. The other approach generates all the code, including jumps, but if the destination of a jump is unknown at the time it is generated, a dummy value is used which is later patched (using updateLoadCon) to the correct offset as soon as that is known, i.e. patch all the references to an address in the code when the code generation has reached that point.
Files
All sources for the assignment 1 PL0 compiler are available on the web page as a1-compiler.zip. Please be sure to use the version for assignment 1 and not the one used for the tutorials. There are differences (like the lexical tokens you need for this assignment are only defined in the assignment version). There is a file called a1-README.html that contains a brief description of the files.
See compiler.html for details on running the compiler and its options.
Submission
You need to submit the following list of individual files, not a .zip file. Note that file names are case-sensitive. You must only modify the files
· Parser.java
· StaticChecker.java
· CodeGenerator.java
· StatementNode.java
· StatementVisitor.java
· StatementTransform.java
for evaluation and marking. You can submit your assignment multiple times, but only the last copy submitted will be retained for marking.
Please keep the length of lines in your files below 120 characters, so that we can print them sensibly. You should avoid using tabs or set your tabs stops to 4 spaces so that when we print them (with tab stops set to 4 spaces) they will print sensibly. Don't forget to remove any code generating debugging output before submission.