程序代写代做代考 interpreter compiler scheme Java Excel Haskell # Macquarie University, Department of Computing

# Macquarie University, Department of Computing

## COMP332 Programming Languages 2019

## Assignment 3: Lintilla translation and execution

Due: 11.55pm Sunday 10th November (week 13)
Worth: 15% of unit assessment

Marks breakdown:

* Code: 50% (of which tests are worth 10%)
* Report: 50% (of which test description is worth 10%)

Submit a notice of disruption via [Ask@MQ](https://ask.mq.edu.au) if you are unable to submit on time for medical or other legitimate reasons.

Late penalty without proper justification: 20% of the full marks for the assessment per day or part thereof late.

### Overview

This assignment asks you to extend the translator for the Lintilla programming language. The new translations you will add will allow the Lintilla compiler to generate code for logical operators (and, or and not), array operations and `for` loops.

The Lintilla translator targets a simple abstract machine which is a version of Landin’s highly influential [SECD machine](https://en.wikipedia.org/wiki/SECD_machine). An implementation of this machine is provided, so once your translator is working you will be able to run Lintilla programs.

The framework repository provided for this assignment supports parsing and semantic analysis for the entire Lintilla language as described in the `README.md` file. It also contains an implementation of a Lintilla to SECD code translator which is able to translate all constructs of the Lintilla language *except* for those relating to logical operators, arrays and `for` loops. Before you start this assignment we strongly recommend that you should:

* Carefully read the latest version of the `README.md` file in the framework repository. This contains a full description of the syntactic (lexing and parsing) and semantic (name and type analysis) rules of the Lintilla language.
* Thoroughly peruse the `SyntacticAnalysis.scala`, `SemanticAnalysis.scala` and `Translator.scala` modules in the `src/main/scala` directory of the framework repository. Compare these to the corresponding modules that you wrote for the expression language in the workshops, and make sure you have understood how they work and what it do.

Your translator is only expected to translate programs that have passed through the earlier phases of the compiler without producing any errors. In particular, you can assume that any source tree your translator will be presented with satisfies all of the semantic rules given in the `README.md` file.

Extending this translator will give you insight into the way that source language features correspond to target language features. It will show you how abstract machines can be used to provide a run-time system for languages. You will also gain specific experience with how Lintilla is compiled and executed, which is similar in many respects to the implementation of full general-purpose functional programming languages. Once you have completed this assignment you will have a working implementation of Lintilla.

**Please note:** If this is your second attempt at COMP332, and you completed the Lintilla assignments in 2018, then please be aware that there are some important respects in which the compiler and runtime in this assignment are quite dissimilar to those that you worked on last year. So even if you are confident that you remember what was going on in last year’s assignment, you should take great care to read this documentation and the source code of the current Lintilla compiler before you commence this exercise. There are some very important respects in which the backend of this compiler has changed this year, and if you are not careful these changes are likely to trip you up.

### The SECD machine

The “SECD” in the name SECD machine stands for Stack, Environment, Code and Dump. The interpreter that implements our SECD machine can be found in the file `SECDMachine.scala` (in the `src/main/scala` directory of the framework bundle). The instruction set of the SECD machine in defined in the `SECDTree.scala` module. The action of each SECD instruction is briefly described in the comments above the case class definition that defines that instruction type.

In a practical programming language compiler the SECD code generated by the Lintilla compiler would not be executed by an interpreter. Instead a further compiler phase would translate it into much more efficient native machine code (possibly on the fly). A production compiler would also provide phases devoted to optimising the generated SECD code prior to its conversion into machine code.

We can describe the activities of any computing machine, whether a abstract one like the SECD machine or an actual microprocessor, in terms of a *machine state* and a clear specification of how the *instructions* understood by that machine update that machine state when they are executed.

The state of the SECD machine has four components, which are encapsulated in the following case class definition

“`scala
case class State(
opnds: Stack,
env: Environ,
code: Frame,
dump: Option[State]
) extends Status
“`

which you can find in the `SECDMachine.scala` module. These components perform the following roles:

* `opnds: Stack` this is called the *operand stack*. It plays the role of the *addressable data registers* in a standard machine architecture. The instructions of the SECD machine pop their operands of this stack and push any results they compute back onto it. In that sense the SECD machine is closely related to the Java Virtual Machine (JVM), they are both *stack machines* rather than *register machines* (like the microprocessor in your laptop).

* `env: Environ` this is called the *environment*. This is a map which associates named variables with the values they are currently bound to. At any point in the execution of a program this environment contains *bindings* for all of the variables that are currently in scope. When a function is called the values passed to the function will be bound to the names of the function arguments in the environment. When the body of the function refers to a name we will look it up in the current environment to get its value.

* `code: Frame` this is called the *code dump*. This is the list of *machine instructions* to be executed. The next instruction to be executed is the first one in this list, and once its has been executed it is discarded from the front of the code dump list. You can think of the code dump as being analogous to the *program counter* of a microprocessor, since it always points to the next instruction to be executed (the one at the head of the code dump list). Indeed, in a practical and efficient implementation of the SECD machine we would simply represent the code dump as a program counter containing the address of the next instruction to execute.

* `dump: Option[State]` this is called the *control dump* (or more simply just the *dump*). You should think of the control dump as being a stack of saved machine states, which is used to store the information required to return from a function call. In that way, the control dump carries out exactly the role played by the stack of execution frames used to manage local variables and return addresses in a conventional programming language.

The operand stack of the SECD machine contains *machine values* of which there are 5 kinds. These are defined in the `SECDTree.scala` module and are represented as objects of the following classes which are children of the abstract `MValue` class:

* **Booleans**, values `true` and `false` represented by objects of type `MBool`,
* **Integers**, 32-bit signed numbers represented by objects of type `MInt`,
* **Arrays**, mutable and extendable 1-d arrays of values of type `MValue` represented by objects of type `MArray`,
* **Closures**, a function definition packaged up with a *referencing environment*, which captures the environment that was current when the closure was constructed and is used to provide values for the free variables in the body of the function. These are represented by objects of type `MClosure`, and
* **Continuations**, records that contain a saved state of the SECD machine. These are represented by objects of type `MContinuation`.

The first four of these kinds of machine values should be familiar to you from your programming / COMP332 experience. Continuations, however, are new and they provide a powerful mechanism for implementing a wide range of control structures, including loops, iterators, `try..catch` style exception handlers, and threads. In the SECD machine continuations are created by taking a “snapshot” of the current state of the machine, and they may be used later to reset, or *resume* the machine state back to that saved state. You can think of the process of creating a continuation as that of setting a label that refers to some point in the execution of a program. Later you can resume the continuation to jump back to that labelled point. Consequently, we will be using continuations in our translation of `for` loops.

When the SECD machine is started the operand stack and the environment will be empty and the code dump will be initialised to contain the sequence of SECD instructions that comprise the program to execute. At each step of the execution the machine interpreter will look at the first instruction in the code sequence and decide what to do based on what that instruction is. In each case the instruction will update the machine state in some way. Execution will then continue with the next instruction in the sequence.

#### The SECD calling sequence

In the SECD machine it doesn’t really make sense to say that we call a function because, in order to execute the commands in its body we will also need to know what values to use for its free variables. In practice, those values are supplied by giving an environment in which to lookup free variables. This is, of course, the purpose of the environment that is stored, along with a function definition, inside a closure. So the SECD machine provides instructions to make closure values and to execute such closure values.

The `IClosure` machine instruction creates a closure value and pushes it onto the operand stack. That closure encapsulates a list of formal parameter names and a list instructions (the *body* of the closure), which are specified in the `IClosure` instruction itself. It also *captures* the machine environment that is current at the point where the `IClosure` instruction is executed.

When the `ICall` machine instruction is executed it pops a closure value off the top of the operand stack, then:

1. pops sufficient values from the operand stack to act as actual parameters to be bound to the formal parameters of the closure,
2. makes a copy of the resulting state of the of the machine and pushes that onto the control dump,
3. empties the operand stack,
4. replaces the environment with one constructed by adding the bindings of formal to actual parameters to the environment stored in the closure, and
5. replaces the code dump with the list of instructions stored in the closure.

After that it the commences execution of the new contents of the code dump, that is to say it proceeds to execute the body of closure. When the last instruction in the code dump is executed, leaving it empty, the machine takew the following actions:

1. pop the first saved state from the control dump,
2. set the environment, code dump and control dump to those stored in that saved state, and
3. add the values in the operand stack to the operand stack in the saved state and make that extended stack the new operand stack.

These steps leave the machine in the state it was in after the instruction pointer was advanced but just before the `ICall` instruction was executed, with the actual parameters popped from the stack and the return results of closure execution pushed onto it.

#### Continuations

A *continuation* value is, quite simply, nothing more than a machine value that contains a saved state of the SECD machine. These are just like the saved machine states stored in control dump, except that they can be pushed onto the operand stack and can be manipulated directly by certain machine instructions.

Continuation values can be created using the `ICallCC` instruction, which stands for *call with current continuation*. The idea is that `ICallCC` acts very much like `ICall`, except that it handles the binding of formal parameters to actual parameters in a slightly different way:

1. If the closure being called takes `n` parameters then only `n-1` values are popped from the operand stack and these are bound to the first `n-1` formal parameters of the closure.
2. When the saved machine state is pushed onto the control dump it is also made into a continuation machine value and bound to the last parameter of the closure.

Consequently, if we call a closure using `ICallCC` we can access the machine state that will eventually be used to return from the function directly; by accessing the last parameter of the closure.

This becomes particularly useful in combination with the `IResume` instruction, which is executed as follows:

1. pop a continuation value from the operand stack,
2. set the environment, code dump and control dump to those stored in the state saved in the continuation, and
3. add the values in the operand stack to the operand stack in the saved state and make that extended stack the new operand stack.

These actions should be compared with the process of returning after executing the body of a closure, as described in the last section. The only difference is that the return process takes its return information from the control dump whereas `IResume` takes it from a continuation on the top of the control stack.

So why is this important? Well for one thing we can capture the return continuation of a closure call and use it, and `IResume`, at any point in the closure body to return early. This allows us to add a `return` construct to Lintilla, which could be used return a value from any point within a function body.

More profoundly, however, continuations are just like any other SECD machine value and, in particular, they can be returned from the the closures that capture them and used outside of the body of that closure. Resuming a continuation that has been liberated from its capturing closure has the effect of “jumping back” to the point just after the original closure call that created the continuation. This provides us with a very general mechanism for marking points in the execution of a program and jumping back to those points from anywhere else.

This mechanism is used in the following SECD machine code:

“`scala
IClosure(
None, // The name of the closure created.
List(“_continuation”), // Formal parameter list
List( // Start of the closure body.
IVar(“_continuation”) // Push the captured closure on the operand stack.
) // Return contents of operand stack to the caller.
),
ICallCC(), // Capture the continuation of the current instruction,
// bind it to the parameter “_continuation” of the closure
// on the operand stack and execute the closure body.

// At this point the continuation captured by the last call is
// on the top of the operand stack.

IClosure(
None,
List(“_label”), // This parameter will be bound to the continuation
// left on the operand stack by the ICallCC on line 8.
List( // Start of closure body
IInt(0), // Push 0 on the operand stack.
IPrint(), // Print the value (0) on the top of the operand stack.
IVar(“_label”), // Push the value bound to _label onto the operand stack…
IVar(“_label”), // … twice
IResume() // Resume the continuation on the top of the operand stack
// returning the first copy of that same continuation that
// was also pushed onto the stack.
)
),
ICall()
“`

The effect of this code is to capture a continuation which “labels” the instruction just after the `ICallCC` instruction, bind that continuation to variable `_label` and then resume it in order to jump back to the point just after the `ICallCC`. In other words, this code loops forever printing the number 0.

We can be a little more sneaky and use a slightly modified form of the last example to increment and print a counter on each iteration:

“`scala
IClosure(
None,
List(“_continuation”),
List(
IInt(0),
IVar(“_continuation”)
)
),
ICallCC(), // this call returns two values on the stack,
// a count and the continuation to jump back to.
IClosure(
None,
List(“_count”, “_label”), // Parameters to bind the count and continuation to.
List(
IVar(“_count”),
IPrint(), // Print the current count.
IVar(“_count”),
IInt(1),
IAdd(), // Leave current count + 1 on operand stack.
IVar(“_label”), // Push the continuation bound to _label onto the stack…
IVar(“_label”), // …twice.
IResume() // And resume it to jump back to the point just after
// the ICallCC instruction, leaving the new count
// and the _label continuation on the operand stack
// for the next iteration.
)
),
ICall()
“`

This structure will form the basis of our translation of `for` loops. It might seem a little mind bending, but please don’t worry. You won’t need to understand much about continuations to implement the `for` translation we will describe below.

#### SECD machine instructions

The SECD machine instructions are defined in the file `SECTree.scala`. There are the following instruction types:

* Push value instructions (`IBool`, `IInt` or `IArray`): Push a single Boolean, integer or (empty) array value onto the operand stack.

* Push variable instruction (`IVar`): Push the value of a given variable onto the operand stack. This variable must be bound in the current environment, otherwise the machine halts with an *unknown variable* error.

* Arithmetic instructions: (`IAdd`, `IDiv`, `IMul` and `ISub`): Pop two integers off the operand stack, perform the given operation on them and push the result onto the operand stack. The first value popped will be the right operand of the operation and the second value popped will be the left operand. The machine will halt if the top of the operand stack does not contain two integer values. If the instruction is `IDiv` the machine will halt with an error if the right operand is zero.

* Equality instruction: (`IEqual`): Pop two values off the operand stack, compare them for equality and push the result of the comparison as a Boolean onto the operand stack. The machine will halt if the top of the operand stack does not contain two values that are either both integers or both Booleans.

* Less than instruction: (`ILess`): Pop two values off the operand stack, compare them for equality and push the result of the comparison as a Boolean onto the operand stack. The first value popped will be the right operand of the operation and the second value popped will be the left operand. The machine will halt if the top of the operand stack does not contain two integer values.

* Print instruction: (`IPrint`): Pop a value off the operand stack and print it. The machine will halt if there is no value on the operand stack.

* Branch instruction: (`IBranch`): The instruction contains two code sequences (also called *frames* in the code). Pop a Boolean value from the operand stack. If the popped value is true continue execution with the first code sequence from the instruction; otherwise continue with the second code sequence. The machine will halt if the top of the operand stack does not contain a Boolean value.

* Closure instruction: (`IClosure`): The instruction contains a (possibly empty) list of argument names, and a sequence of instructions that constitute the body of a function. When this instruction is executed it pushes a closure value onto the operand stack that represents this function. The closure will contain the list of parameter names, the body code sequence, and the current environment (top of the environment stack). The environment value is therefore *captured* so that it can be used later to look up the values of free variables in the function body when the function is called. The machine will halt if the environment stack is empty, since in that case there will be no environment to capture.

* Call instruction: (`ICall`): Pop a value from the operand stack; this value must be an closure value, since it will provide the function that is called. Then pop one value from the operand stack for each forma; parameter of that closure, and continue execution of the closure body with the closure’s parameters bound to these values. Arguments to this call should be pushed into the stack in the order that they occur in the function call. So the first argument is the first to be pushed into the stack. The machine will halt if the first value popped isn’t a closure _or_ if the operand stack doesn’t contain as many values as that function call requires as arguments. This behaviour of this instruction is described in more detail above.

* Array instructions (`IArray`, `IDeref`, `ILength`, `IUpdate` and `IAppend`): These create and manipulate 1-dimensional mutable arrays; they have the following effects when executed:

* `IArray`: make a new empty array value and push it onto the operand stack.
* `IDeref`: pop an integer index and an array from the top of the operand stack (in that order) and push the value found in that array at the given index back onto the stack.
* `ILength`: pop an array value from the top of the operand stack and push its length, an integer, onto the stack in its place.
* `IUpdate`: pop a value (of any type), an integer index and an array from the top of the operand stack and update the cell of that array at the given index to contain the specified value.
* `IAppend`: pop a value (of any type) and an array from the top of the operand stack and extend the array by adding the given value to its end.

* Operand stack manipulation instructions (`IDropAll`): These act on the operand stack to move entries around on the top of the stack or to remove and discard entries. At the moment the only such operation provided is `IDropAll` which empties the operand stack entirely. This is largely used in situations where we want to clear the operand stack prior to pushing return values to return from a function call.

* Continuation operations (`ICallCC` and `IResume`): These provide facilities for capturing the current machine state as a continuation value in the operand stack and then returning the machine to that save state at a later time. These are described in much greater detail above.

**Note that** many of these instructions can cause fatal errors which will abruptly halt the execution of the SECD machine. For example, if an `IAdd` instruction does not find two integer values on the top of the operand stack the SECD machine will immediately stop and print `”FatalError: “int and int expected in IAdd”`. In normal operation of the SECD machine, however, the only fatal error that should ever be raised is an array index out of bounds. All other error conditions should never be produced by “correct” SECD code generated by the Lintilla compiler.

For example, if the Lintilla compiler generates an `IAdd` instruction then it must have previously generated code to push its two integer operands onto the operand stack. So when the `IAdd` instruction is executed we can guarantee that the top of the operand stack does indeed contain two integers. In principle, if we have correctly designed and implemented the Lintilla compiler we should be able to write a _mathematical proof_ that shows that any Lintilla program that passes the semantic rules of the language will be translated into SECD code which cannot generate any fatal errors except for array indexing ones.

### Translating Lintilla to the SECD machine

This section describes how translator given in the framework repository translates Lintilla code (except for logical operators, arrays and `for` loops) to SECD machine instructions. The more routine parts of this translation are as follows:

1. Boolean expressions, integer expressions and applied occurrences of identifiers translate into instances of `IInt`, `IBool` or `IVar` instructions, which push the relevant values onto the operand stack.

2. Arithmetic operations, equality and less-then comparisons translate into SECD code to push each of their operands onto the operand stack, followed by the relevant machine operation. Left hand operands are evaluated before right hand ones, so when the code generated from the expression

“`rust
{ print 10; 10 } + { print 5; 5 }
“`

is executed it prints the number `10` before it prints the number `5`.

3. An `if` expression translates into SECD code to evaluate its condition followed by an `IBranch` instruction containing the SECD machine code generated from its *then* and *else* clauses. So the expression `if (cond) { then_code } else { else_code }` translates to:

“`scala
,
IBranch(
,

)
“`

4. A `print` expression should translate into SECD code to evaluate its operand expression followed by an `IPrint` instruction. So the expression `print exp` translates to:

“`scala
,
IPrint()
“`

5. An application expression should translate into SECD code to evaluate each one of its arguments, in order, followed by SECD code that evaluates its function component, and followed by an `ICall` instruction. In other words, a function call `f(e_1,…,e_n)` should translate to code of the following form:

“`scala
,
,

,
,
ICall()
“`

6. A function declaration

“`scala
fn f(v_1 : type_1,…, v_n : type_n) -> type {
body
}
“`

should translate to the following instruction

“`scala
IClosure(
Some(“f”),
List(“v_1″,…,”v_n”),

)
“`

followed by code to bind the identifier `f` (in the remainder of the current block) to the closure object that is pushed onto the stack by this instruction. We shall discuss the construction of that binding code a little later.

7. A `let` declaration should translate to code that evaluates its initialisation expression and then binds the resulting value on the stack to the specified identifier. We examine this binding translation next.

The tricky part of this translation involves the translation of sequences of expressions (at the top level or within a block) which contain `let` and `fn` declarations. Suppose that we are given a sequence of expressions that comprise some section of the body of a block (or of the whole program) from some expression onwards to the end of the block:

“`rust
expression_1;
expression_2;

expression_n
“`

Then, generally speaking, we would simply translate this sequence by translating each individual expression in turn and concatenating the results:

“`scala




“`

Were we to represent this sequence of expressions as a list `exp :: rest`, comprising the first expression `exp` and the list of remaining expressions `rest`, then we might express this translation as a recursive template:

“`scala
=>


“`

Another way of looking at this is that we could write a recursive translation function to do this task. This might look something like

“`scala
def translateSeq(list : List[Expression]) {
list match {
case (exp :: rest) =>
translateExp(exp)
translateSeq(rest)
case _ => ()
}
}
“`

where `translateExp` is a function to translate a single expression and add the generated code to an instruction buffer (as we did in the translation code for the expression language).

We must, however, adjust this translation in the case where the first expression `exp` in a list of expressions is a `let` or a `fn`. The partial translations of `let` and `fun` given above leave a value on the top of the stack, and that value should be popped and bound to a variable whose scope extends over the translations of the remaining expressions in `rest`. A quick perusal of the SECD machine code, however, reveals that the only way we have to bind a variable to a value on the stack is to call a closure that has that variable as its parameter.

This leads us to conclude that if the first expression `exp` is a `let` or `fn` declaration then we should modify our translation of `exp :: rest` as follows:

“`scala
=>
<(partial) SECD code translation of exp discussed above>
IClosure(
None,
List(), // = name of the variable to bind.

),
ICall()
“`

In other words, this code will push the value to be bound onto the stack, then it will push a closure containing the SECD code generated from the subsequent expressions, and finally it executes a call to execute that closure. This translation can be implemented by adding two specialised cases (for `LetDecl` and `FnDecl`) nodes to the `match` expression in the function `translateSeq` above.

#### Translation Example

As a simple example of translation, consider this Lintilla program:

print 3 * (12 / 4)

This program would be translated into the followed SECD machine code:

List(IInt(3), IInt(12), IInt(4), IDiv(), IMul(), IPrint())

As a more complex translation example, consider the following Lintilla program

let x = 100;
print x;
fn inc (a : int) -> int { a + 1 };
print inc(x)

which should print numbers `100` and `101` when executed.

Here is the SECD machine code that is the translation of this program. The comments in the code are there to assist your understanding; they are not part of the code itself.

List(
IInt(100), // Translation of the initialisation expression on line 1
IClosure( // Create closure to bind the variable ‘x’.
None,
List(“x”),
List( // Frame containing translation of lines 2-4.
IVar(“x”),
IPrint(),
IClosure( // Create closure implementing the function ‘inc’
Some(“inc”),
List(“a”),
List(
IVar(“a”), // List of instructions translated from body of ‘inc’
IInt(1),
IAdd())),
IClosure( // Create closure to bind the variable ‘inc’
None,
List(“inc”),
List( // Frame containing translation of line 4
IVar(“x”),
IVar(“inc”),
ICall(),
IPrint()))
ICall())), // Call to bind ‘inc’
ICall()) // Call to bind ‘x’

The `src/test/resources` contains the sources of a handful of Lintilla programs along with the output generated when we compiled and executed them using our reference compiler. These are provided as resources for further investigating the relationship between pieces of Lintilla code and the SECD code they translate to.

### Translations of logical operators, array operations and `for` loops.

Your task is to extend the Lintilla translator, in the `Translator.scala` module, to translate:

* Logical operators, and (`&&`), or (`||`) and not (`~`),
* Array operations, for creating new arrays (`array`), dereferencing values in arrays (`!`), appending values to the end of arrays (`+=`) and assigning new values to array entries (`:=`),
* For loops, comprising the `for` loops itself and the associated `loop` operation, which terminates the current iteration of the smallest enclosing `for` loop and starts the next one, and `break` operation, which exits the smallest enclosing `for` loop.

Your translator should translate these constructs as follows:

### Logical operators

The logical operators `&&`, `||` and `~` should be implemented using the *short circuit evaluation* approach discussed in the lectures, so:

* An expression `b_1 && b_2` should be executed by evaluating `b_1` and if that gives the value:
* `true` then evaluate `b_2` a boolean value and return that as the result of the overall expression, or
* `false` then return `false` as the result of the overall expression, without evaluating `b_2`.
* An expression `b_1 !! b_2` should be executed by evaluating `b_1` and if that gives the value:
* `true` then return `true` as the result of the overall expression, without evaluating `b_2`, or
* `false` then evaluate `b_2` to a boolean value and return that as the result of the overall expression.
* An expression `~b` should be executed by evaluating `b` and if that gives the value:
* `true` then return `false` as the result of the overall expression, or
* `false` then return `true` as the result of the overall expression.

In particular, this means that if we evaluate an expression like

“`rust
{ print true; true } || { print false; false }
“`

then the result returned would be `true` and the single line

“`rust
true
“`

would be printed. On the other hand, when we evaluate the expression

“`rust
{ print false; false } || { print true; true }
“`

then the result returned would again be `true` but this time the two lines

“`rust
false
true
“`

would be printed. In each case this semantics can be implemented using the `IBranch` instruction. For example, we could translate an expression `b_1 && b_2` into the code sequence:

“`scala

IBranch(
,
List(IBool(false))
)
“`

We will leave it up to you to work out translation schemes for `||` and `~` along similar lines. You should describe the translation schemes you design for these logical operators in your report, using the kind of notation we have used for our translation schemes above.

### Array operations

The array operations that you should provide translations for are:

* `array t` which creates and returns a new empty array which can contain elements of type `t`.
* `v!i` which evaluates to the value stored in the `i`th entry of the array `v`.
* `v!i := e` which evaluates the expression `e` and puts the resulting value into the `i`th entry of the array `v`.
* `v += e` which evaluates the expression `e` and appends the resulting value to the end of the array `v`, extending its length by 1.
* `length(v)` which returns the number of entries in the array `v`.

These operations roughly correspond to the SECD machine instructions `IArray`, `IDeref`, `IUpdate`, `IAppend` and `ILength`. The translation of one these constructs is a simple matter of generating code to evaluate each of it operands in turn and then adding the corresponding SECD machine instruction to the end of the resulting code sequence. In each case the operands of these constructs should be evaluated in order from left to right.

One thing to note is that the semantic analyser ensures that the only expression that can legally occur on the left hand side of an assignment (`:=`) is a dereference (or pling `!`) expression. So you may treat the expression `v!i := e` as a single ternary operator and translated it using a case matching clause of the form:

“`scala
case AssignExp(DerefExp(v, i), e) => …
“`

We leave the details of these translation schemes to you. Again, you should describe the translation schemes you design for these array operators in your report, using the kind of notation we have used for our translation schemes above.

### `for` loops, `loop` and `break` operations

Lintilla’s `for` loops are translated using a variant of the continuation based loop examples we discussed above. In order to describe this translation we start by reminding you of the syntax of the `for` loop, which is given by the production:

“`haskell
forexp : “for” idndef “=” logexp “to” logexp (“step” logexp)? “do” block
“`

Loops satisfying this syntax are represented by nodes in the abstract syntax tree whose structure is defined by the following case class declaration:

“`scala
case class ForExp(
idn: IdnDef,
from: Expression,
to: Expression,
step: Option[Expression],
body: Block
) extends Expression
“`

One thing to note here is that the semantic rules of Lintilla say the following about the optional step expression:

* when it is present it should be a constant (integer) expression, and
* when it is absent the step is taken to default to 1.

Constant integer expressions are those containing only arithmetic operators and constant integers. They can, and the step expression should, be evaluated at translation time to a single integer, using the simple recursive function `evalIntConst` provided in the `Translator.scala` module.

Now a `for` expression represented by a Lintilla tree node `ForExp(IdnDef(control_var), from, to, step, body)` should translate to the following SECD code:

“`scala
1:
2:
3: IClosure(
4: None,
5: List(“_from”, “_to”, “_break_cont”),
6: List(
7: IClosure(
8: None,
9: List(“_loop_cont”),
10: List(
11: IVar(“_from”),
12: Ivar(“_loop_count”)
13: )
14: ),
15: ICallCC(),
16: IClosure(
17: None,
18: List(control_var, “_loop_cont”),
19: List(
20: ,
21: IBranch(
22: List(
23: IVar(“_break_cont”),
24: IResume()
25: ),
26: List()
27: ),
28: ,
29: IVar(control_var),
30: IInt(step_value),
31: IAdd(),
32: IVar(“_loop_cont”),
33: IVar(“_loop_cont”),
34: IResume()
35: )
36: ),
37: ICall()
38: )
39: ),
40: ICallCC()
41: …
“`

The following notes should help you to explain how this translation works:

1. The code sequences on lines 1 and 2 compute the from and to expressions before we enter the loop.
2. The outer closure, created by the `IClosure` instruction at line 3, binds the resulting values to variables `_from` and `_to`.
3. That closure is called by the `ICallCC` at line 40, so its `_break_cont` parameter is bound to a continuation which when resumed will jump to line 41 and exit the loop (call this the *break continuation*).
4. The closure at line 7 is called by the `ICallCC` at line 15, which binds the parameter `_loop_cont` to a continuation which when resumed will jump to line 16 (call this the *loop continuation*).
5. The body of that closure pushes the starting value of the loop, which is bound to `_from`, and the loop continuation bound to `_loop_cont` onto the operand stack so that those two values are returned to the calling context at line 16.
6. The closure at line 16 is called by the `ICall` at line 37, that binds the control variable of the loop and the variable `_loop_cont` to the top two elements on the operand stack. On the first time it is called this binds the control variable to the starting value of the loop and it binds `_loop_cont` to the loop continuation.
7. The term `step_value` referenced on line 30 is the constant integer obtained by evaluating the constant step expression of the loop, if such is present, otherwise it is equal to 1. The semantic rules of Lintilla ensure that `step_value` is not equal to 0.
8. There are two possibilities for the translated code at line 20:
* `step_value > 0` then the following translation is used:

“`scala
IVar(“_to”),
IVar(control_var),
ILess()
“`

* `step_value < 0` then the following translation is used: ```scala IVar(control_var), IVar("_to"), ILess() ``` 9. After the code generated from the loop body (line 28), lines 29-31 add the step value to the current value of the control variable leaving that new value for the control variable on the operand stack. 10. We then push two copies of the loop continuation, which is bound to `_loop_cont`, and resume the second one. This causes execution to jump back to line 16 to execute the next iteration of the loop. Notice, however, that the values that are now left on the operand stack at line 16 are the new value for the control variable and the loop continuation. So when we call the closure pushed at line 16, by executing the `ICall` at line 37, we bind that new value to the control variable and we rebind the loop continuation to `_loop_cont`. The translations for the `loop` and `break` constructs are much simpler: * `break` should translate to the simple SECD code sequence ```scala IDropAll(), // Flush the operand stack, to avoid spurious return values. IVar("_break_cont"), // Push the break continuation onto the operand stack and... IResume() // ...resume it, thereby jumping to the exit point of the loop. ``` * `loop` should translate to the slightly more complex SECD code sequence ```scala IDropAll(), // Flush the operand stack, to avoid spurious return values. IVar(control_var), IInt(step_value), // Compute the new value for the control variable... IAdd(), // ... and leave it on the operand stack. IVar("_loop_cont"), // Push the loop continuation onto the operand stack... IVar("_loop_cont"), // ... twice. IResume() // And resume it, jumping back to the head of the loop. ``` which is largely identical to the code that triggers the next iteration of the loop on lines 29-34 of the `for` translation scheme above. These translations seem quite straightforward, but there is a fly in the ointment when it comes to actually implementing the translation for the `loop` construct. It refers to the values `control_var` and `step_value`, but those were computed during the translation of the enclosing `for` loop and so they are not immediately available at the point where we construct the translation of a `loop` construct. To solve this problem we can adopt an old `C` programmer's trick. First we introduce two mutable variables that are fields of the `Translator` object, an so therefore visible from anywhere within the scope of that object: ```scala var control_var: String = "" var step_value: Int = 0 ``` The idea is that wherever we are within the translator execution these should always be setup to contain the name of the control variable and the step size of the smallest enclosing `for` loop. We do that by surrounding the code that translates `for` loops by code that saves the original values of the variables `control_var` and `step_value` in local variables before they are updated for the current `for` loop and then restores them from those local variables on exit: ```scala case ForExp(IdnDef(id), from, to, step) =>
// Save original values of control_var and step_value in local variables
val old_control_var = control_var
val old_step_value = step_value
// Update control_var and step_value for the loop we are currently translating
control_var = id
step_value = …

// …. Code to translate this for expression

// Restore the values that control_var and step_value had on entry
control_var = old_control_var
step_value = old_step_value
“`

### What you have to do

You have to write, document and test extensions to the translator for the Lintilla language, according to the description above. You are strongly advised to complete portions of the assignment before moving onto the next one, rather than trying to code the whole solution in one go.

The translations of logical and array operators are worth approximately half the marks, and are really quite straightforward. So we **very strongly** recommend that you start by concentrating on writing, testing and documenting the translator for those much simpler constructs. Only once you have gotten all of that completed (code and report) should you attempt to move on to implementing the translator for the more challenging `for` loop. This will then provide you with an excellent insurance policy against running out of time later in the semester.

A skeleton sbt project for the assignment has been provided on BitBucket as the [dominicverity/comp332-lintilla](https://bitbucket.org/dominicverity/comp332-lintilla) repository. The modules are very similar to those used in the practical exercises. The skeleton contains the modules you will need, particularly for Weeks 10 and 11. For this assignment you should **not** modify any parts of the implementation except the translator (`Translator.scala`) and its related tests (`ExecTests.scala`).

A partially complete translation function is given to get you started; all you need do is add clauses to translate the logical, array and loop constructs discussed above (look for `FIXME` in the code for some places where new code has to go).

### What you must hand in and how

A zip file containing all of the code for your project and a type-written report.

Submit every source and build file that is needed to build your program from source, including files in the skeleton that you have not changed. Do not add any new files or include multiple versions of your files. Do not include any libraries or generated files (run the sbt `clean` command before you zip your project). We will compile all of the files that you submit using sbt, so you should avoid any other build mechanisms.

Your submission should include all of the tests that you have used to make sure that your program is working correctly. Note that just testing one or two simple cases is not enough for many marks. You should test as comprehensively as you can.

Your report should describe how you have achieved the goals of the assignment. Do not neglect the report since it is worth 50% of the marks for the assignment.

Your report should contain the following sections:

* A title page or heading that gives the assignment details, your name and student number.
* A brief introduction that summarises the aim of the assignment and the structure of the rest of the report.
* A description of the design and implementation work that you have done to achieve the goals of the assignment. Listing some code fragments may be useful to illustrate your description, but don’t give a long listing. Leaving out obvious stuff is OK, as long as what you have done is clear. A good rule of thumb is to include enough detail to allow a fellow student to understand it if they are at the stage you were at when you started work on the assignment.
* A description of the testing that you carried out. You should demonstrate that you have used a properly representative set of test cases to be confident that you have covered all the bases. Include details of the tests that you used and the rationale behind why they were chosen. Do not just reporduce the tests in your report without explanation.

Submit your code and report electronically in a single zip file called `ass3.zip` using the appropriate submission link on the COMP332 iLearn website by the due date and time. Your report must be in PDF format.

DO NOT SUBMIT YOUR ASSIGNMENT OR DOCUMENTATION IN ANY OTHER FORMAT THAN ZIP and PDF, RESPECTIVELY. Use of any other format slows down the marking and may result in a mark deduction.

### Marking ###

The assignment will be assessed according to the assessment standards for the unit learning outcomes.

Marks will be allocated equally to the code and to the report. Your code will be assessed for correctness and quality with respect to the assignment description. Marking of the report will assess the clarity and accuracy of your description and the adequacy of your testing. 20% of the marks for the assignment will be allocated to testing.


[Dominic Verity](http://orcid.org/0000-0002-4137-6982)
Last modified: 22nd October 2019
[Copyright (c) 2019 by Dominic Verity. All rights reserved.](http://mozilla.org/MPL/2.0/)