CS计算机代考程序代写 chain Haskell algorithm interpreter # Assignment 4

# Assignment 4

The assignment is to complete the implementation of an interpreter for a Haskell subset we’ll call WHaskell, for “Wee Haskell”. The interpreter has a typechecker and an evaluator. Interpreters often use imperative features like side-effects (such as modifying a global variable). The WHaskell interpreter will illustrate the use of monads as an alternative to these features.

**Submitting.** Complete the Haskell file and submit that. All your code must be at the end, where the stubs are. Do not change anything before that. Test your code using `run` on the examples `gcdTest` and `pairTest`. There are only four functions for you to write, but please don’t underestimate the work: you’ll need to thoroughly understand the provided code in order to complete them.

**What you can use.** There are no restrictions on what parts of the language you can use or what libraries, as long as you only use libraries that are part of the standard Haskell distribution. However, you shouldn’t need much. Here’s everything I used beyond the core language and functions defined in the file: error, ++, any, map, Just, Nothing, do, return, concat, Map.singleton, Map.insert.

**What to read/understand.**
**1.** You will need to read and understand all the provided code _except_ for the parser. You can treat that as a black box. You just need to know there’s a function
“`
parse :: String -> Exp
“`
**Warning** the parser is unforgiving and gives useless error messages, e.g. complaining about taking the head of an empty list. There are a bunch of examples of well-formed WHaskell programs you can use as a guide, near the end of the file.
**2.** The details below on WHaskell and the type inference algorithm.

## Definition of WHaskell

WHaskell is a familiar subset of Haskell, but with a different syntax in order to make parsing easier.

**Expressions:**
– constants succ, pred, 0, 1, …
– variables x, y, …
– defined identifiers F, G, …
– lambda-abstraction `%x -> e`
– application `e e’`
– conditional `ifz(e1, e2, e3)`

We use `%` since backslash has a special meaning in Haskell and Markdown strings. We make a distinction between variables and the names of defined functions, using a capital letter at the beginning to distinguish the two.

**Values**
The values of WHaskell are numbers, `succ`, `pred` and all closed expressions of the form `%x -> e`.

**Evaluation** Evaluation is leftmost-outermost reduction to a value, never reducing redexes under a `%`. We include as a redex any conditional expression `ifz(v,e2,e3)` where v is a value. It reduces (in 1 step) to e2 if v is 0, otherwise it reduces to e3. So, we are using `0` in place of a boolean `True`, and all other values in place of `False`.

_Note:_ we only evaluate closed terms. Since we start with a closed term, and never reduce inside an expression `%x -> e`, reductions will always involve only closed terms, so we don’t need to worry about variable capture when reducing a β-redex `(%x -> e) e’`.

**Types:**
– the type of integers `Int`
– type variables a, b, c …
– function types `S -> T`

## WHaskell type inference

A detailed example should be enough to make clear how type inference works in principle. The assignment will fill in the remaining implementation details. Here, and in the assignment, we will assume the term we are typechecking has no lambda-abstractions. This means there are no bound variables in it. Since it must be closed as well, it has no variables in it at all.

We will run the type inference algorithm on the expression
“`
id id succ 17
“`
with the typing context consisting of the single definition
“`
id :: a -> a
id = %x -> x
“`
To see the subexpression structure more clearly we rewrite the term using App as notation for an application, à la abstract syntax (see the Exp type in the Haskell file).
“`
(App
(App
(App
id
id)
succ)
17)
“`
Note that, in the above, every subexpression starts on its own line.

### Step 1: attach a preliminary type to every subterm

The preliminary type of a subexpression is the best type we can come up with without looking at the context or at any of its subexpressions. Specifically,
– for a defined identifier: a _fresh_ (see below) instance of its declared type
– succ and pred: `Int -> Int`
– any other kind of subexpression: a fresh type variable.

A _fresh_ type variable is one that hasn’t been used before. A fresh instance of a type is obtained by replacing each variable occurring in the type by a fresh one.

We can notate the assignment of preliminary types by writing the type on the line where the subexpression starts. So, the type in the first line is the type for the whole expression.
“`
v5 (App
v4 (App
v3 (App
v1 -> v1 id
v2 -> v2 id)
Int -> Int succ)
Int 17)
“`

### Step 2: compute constraints/equations

By looking through the term we get a bunch of type constraints. In every application (App) expression, the function part has to have a function type and the argument’s type has to match the expected argument type. In general, in an application subexpression
“`
t (App
u e
u’ e’)
“`
the argument has type `u’` and the result of the application has type `t` so the function part of the application has to have a type that’s an instance of `u’ -> t`. So, we get the equation/constraint `u = u’ -> t`.

Processing the “App” subexpressions from top to bottom, we get three equations.
“`
(Int -> v5) = v4
((Int -> Int) -> v4) = v3
((v2 -> v2) -> v3) = (v1 -> v1)
“`

### Step 3: solve the equations

Recall that the solution to a set of type equations is a substitution of types for type variables that makes them all true. In other words, for each equation `e = e’`, if we apply the substitution to both sides, the resulting terms are identical.

We proceed by successively simplifying the set of equations using two operations.

1. _Reduce._ Find an equation of the form `a = t` or `t = a` where a is a variable. Any substitution that solves all the equations has to make `a` the same as `t` so we can safely get a new set of equations by removing `a = t` (or `t = a`) and substituting `t` for `a` everywhere (?).

2. _Split._ Find an equation of the form
“`
(s1 -> t1) = (s2 -> t2)
“`
and replace it by two equations `s1 = s2` and `t1 = t2`.

**But** we overlooked critical point above, where the “?” is. What if the variable `a` occurs in the type expression `t`? For example, suppose the equation is `a = a -> a`. No matter what we substitute for `a`, the right and left hand sides of this equation will not be the same. The right side will always be bigger. This means that there is no solution and we should immediately signal an error. However, if ‘a’ does _not_ occur in `t`, then when we replace `a` by `t` in the other equations, we have _eliminated_ the variable `a` from the set of equations and have a simpler problem to solve.

**So,** Before any reduce step we do an **”occurs check”**, checking if the variable occurs in the non-variable side of the equation. If it does, we halt and report an error, otherwise we continue.

Here is the example’s set of equations again.
“`
(Int -> v5) = v4
((Int -> Int) -> v4) = v3
((v2 -> v2) -> v3) = (v1 -> v1)
“`
We have a choice of whether to split or reduce. It doesn’t matter what order we apply the two solving methods in. Let’s reduce using the first equation. It passes the occurs check so we can substitute for `v4` in the remaining equations, getting
“`
((Int -> Int) -> (Int -> v5)) = v3
((v2 -> v2) -> v3) = (v1 -> v1)
“`
Let’s do a split step next. This replaces the second equation with two equations, and we get:
“`
((Int -> Int) -> (Int -> v5)) = v3
(v2 -> v2) = v1
v3 = v1
“`
Now we’ll just keep cranking through split and reduce steps.
Reduce using the third equation gives
“`
((Int -> Int) -> (Int -> v5)) = v1
(v2 -> v2) = v1
“`
Reducing using the second equation gives
“`
((Int -> Int) -> (Int -> v5)) = (v2 -> v2)
“`
Splitting, we get
“`
(Int -> Int) = v2
(Int -> v5) = v2
“`
Reducing using the first equation:
“`
(Int -> Int) = (Int -> v5)
“`
Splitting:
“`
Int = Int
Int = v5
“`
Getting rid of the trivial equation (left and right sides identical) and reducing using the second equation:
“`
Int = Int
“`
Removing the trivial equation, we have no equations left!

So, now what?

### Collecting the solution

At each stage, we transformed the set of equations so that if the new set was solvable, so was the old one. The empty set of equations is vacuously solvable (all the equations are true since they’re aren’t any equations!). This means that the original set was solvable, so all the type constraints in the original expression can be satisfied. This means that the program typechecks and we can go ahead and run it without encountering a run-time type error.

But we will also, in general, want to report the inferred type of the original expression. This is just a matter of collecting up all the individual substitutions we did a long the way in the reduce steps. Here are all the reduced steps, with the variable being eliminate on the left.
“`
v4 = (Int -> v5)
v3 = v1
v1 = (v2 -> v2)
v2 = (Int -> Int)
v5 = Int
“`
To get the final substitution that solves all the original equations, we need to put these together. E.g. in the first step we substituted `Int->v5` for `v4`, and in a later step we eliminated `v5`, replacing it by `Int`. Chaining together the reduction substitutions, we get
“`
v4 = (Int -> Int)
v3 = ((Int -> Int) -> (Int -> Int))
v1 = (v2 -> v2) = ((Int -> Int) -> (Int -> Int))
v2 = (Int -> Int)
v5 = Int
“`
You can check for yourself that when you apply these substitutions to the original equations
“`
(Int -> v5) = v4
((Int -> Int) -> v4) = v3
((v2 -> v2) -> v3) = (v1 -> v1)
“`
both sides of each equation are identical.