F# 编译器代写

Expand the accompanying archive Project.zip and put the extracted folder Project on the Windows (or MacOs) desktop. The folder contains a few files that you will need. Write your solutions as instructed below. Then compress Project to a zip archive called Project.zip and submit the archive. Make sure you submit the new zip file with your solution, not the original one!

Note: Your files must compile with no syntax/type errors. You may receive serious penalties for code that does not compile correctly.

1 The PLC Language

In this project you will develop in F# an interpreter and related code for an extension of the PLC language introduced in Hw6. The language incorporates several of the programming concepts studied during the course. It is purely functional, strict, statically-typed, lexically-scoped, and higher-order. With respect to the version from Hw6, this version of PLC has more features, including a list type and related primitive functions, anonymous functions, and a print command. Also, the concrete syntax for PLC types is slightly different. However, the abstract syntax is essentially unchanged and very similar to that of micro-ML.

Main limitations with respect to real world functional languages, introduced for simplicity, are that there are no commands to read input from the console or files; functions are monomorphic and can be recursive but not mutually recursive; formal parameters of functions must be explicitly typed; and recursive functions must declare their return type. Finally, important basic types such as characters and strings or structured types such as algebraic datatypes and records are missing.

Figure 1 contains an example of a PLC program. The program defines a non-recursive first-order function inc of type Int -> Int; a non-recursive first-order function add of type Tuple[Int,Int]

1

fun inc (x : Int) = x + 1;
fun add (x : Int, y : Int) = x + y;
fun highAdd (x : Int) = fn (y : Int) => x + y end;
var y = add(3, inc(4));
var x = highAdd(3)(7-y);
var z = x * 3;
fun rec fact (n : Int) : Int =
  if n = 0 then 1 else n * fact(n - 1);
print x; print y;
x :: y :: z :: fact(z) :: ([] : List[Int])

Figure 1: A PLC program.

-> Int; a non-recursive higher-order function highAdd; variables x, y and z; and a recursive first- order function fact. The scope of each of these functions and variables includes the declarations and expressions that follow. In the example, the expressions after the declaration of fact are also separated by semicolons. When used with expressions, semicolon is, as in F#, a right-associative binary operator such that e1 ; e2 evaluates to the value of e2 for all expressions e1 and e2. In the example program, the first expression prints to the console the value of x and y and then returns the list consisting of the values of x, y, z , fact(z) and y. Function highAdd takes an integer x and returns the anonymous function fn (y:Int) => x + y end, which in turn takes an integer y and returns the value of x + y. Function fact is the usual factorial function whose input and output values are explicitly declared to be of type Int.

In this version of PLC, function declarations must include the output type only if the function is recursive. Declarations of such functions need the rec qualifier after the fun keyword. Since PLC has anonymous functions, declarations of non-recursive functions are in fact syntactic sugar. That is, a program of the form

is treated as the program

fun f(x : t) = e ; e1
varf =fn(x:t)=>eend;e1

where f becomes a variable of higher-order type t -> te (with te being the type of e) whose value is the anonymous function fn (x : t) => e end. The only true function declarations are those of recursive functions then.

One restriction on the use of ; is that (variable or function) declarations cannot follow expres- sions unless they are included in a brace-delimited block. For example;

is not allowed whereas

 1 - 3; var x = 4; 2 * x

1 – 3; {var x = 4; 2 * x}

is. In any case, the last argument of ; must be an expression, it cannot be a declaration. Sequences of declarations and expressions enclosed in braces are treated as atomic expressions, which means that they can go anywhere an expression can go. This allows one for instance to

declare local variables and functions within another function, as in the program in Figure 3. 2

var E = ([] : List[Int]);
fun reverse (l : List[Int]) = {
  fun rec rev (l1 : List[Int], l2 : List[Int]): List[Int] =
    if ise(l1) then

l2 else {

      var h = hd(l1);
      var t = tl(l1);
      rev(t, h::l2)

}; rev(l, E)

};
reverse (1::2::3::E)

Figure 2: A PLC program with locally defined functions.

fun twice (f : Int -> Int) = fn (x : Int) => f(f(x)) end ;
fun rec map (f : Int -> Int) : (List[Int] -> List[Int]) =
  fn (l: List[Int]) =>
    if ise(l) then l else f(hd(l)) :: map(f)(tl(l))
  end ;
fun square (x : Int) = x * x ;
fun inc (x : Int) = x + 1 ;
var E = ([] : List[Int]) ;
var l1 = map (fn (x:Int) => 2*x end) (10::20::30::E) ;
var l2 = map (twice(inc)) (l1) ;
(l1, l2)

Figure 3: A PLC program with higher-order combinators.

2 Types, type annotations and static typing

Lists in PLC are essentially the same as in F#, with [] denoting empty list constant and :: denoting the list constructor. Note, however, that the empty list must be explicitly typed anywhere it occurs, as shown in the programs of Figures 1– 3. The reason is that this makes type checking considerably easier to implement. It is the same reason formal parameters of functions must be explicitly typed and recursive functions must declare their return type. The latter simplifies the type checking of the function’s body, which includes occurrences of the function name (in recursive calls).

Since the language is higher-order, we can define and use in it the usual combinators, with the only restriction that they cannot be polymorphic.1 An example of such functions is provided in Figure 2. The function map is the usual one except that it is restricted to integers lists as input and as output, and has a more verbose declaration than in F#.

1 This is truly limiting, because now one needs for instance to define a map function for each possible concrete instance, such as (Int -> Int) -> List[Int] -> List[Int], of the parametric type (’a -> ’b) -> List[’a] -> List[’a] that map could have if polymorphism was allowed as in F#. This restriction too is to simplify type checking.

3

2.1 Types and operators

The language has the following types and operations on them. Your interpreter should support all of them.

Unit type The type Unit, similar to unit in F#, contains a single value. Predefined operators dealing with Unit values are: () : Unit, the only value of this type, and print : t -> Unit, for any type t. The latter function always returns () but has the side effect of printing to the console (standard output) a textual representation of its input value.

Boolean type The type Bool is the usual Boolean type. In addition to the constants true and false, it has the predefined operators && : Tuple[Bool, Bool] -> Bool for Boolean con- junction and ! : Bool -> Bool for Boolean negation. Two more operators are = and !=, both of type Tuple[t, t] -> Bool for any equality type t (see below), respectively for equal- ity and disequality comparison.

Integer type The type Int is the usual integer type whose constants are all the numerals. It has the usual infix binary operators +, -, *, /, <, and <= with the expected meaning. The first four have type Tuple[Int,Int] -> Int. The last two have type Tuple[Int, Int] -> Bool. The – operator is also unary, with type Int -> Int.

Tuple types For any PLC types t1, …, tn with n > 1 it is possible to construct tuples of type Tuple[t1, …, tn]. The tuple constructor is the multi-arity mixfix operator ( , …, ) similar to the pair constructor ( , ) of Hw6. For all n > 0, i ∈ {1,…,n} and types t1, …, tn, there is also a postfix element selector #i : Tuple[t1, …, tn] -> ti that returns the i’s element of its input tuple.

Function types Functions that take an input of type t1 and produce an output of type t2 have type t1 -> t2. The arrow operator -> is right-associative.

List types For any PLC type t it is possible to construct lists of type List[t]. Note that this means that it is possible to construct lists of lists, lists of tuples, and so on. The predefined, and polymorphic, operators dealing with list values are listed below.

  • [] : List[t], for any type t. The empty list of elements of type t.
  • :: : Tuple[t, List[t]] -> List[t], for any type t. The usual infix, right-associative list construction operator.
  • ise : List[t] -> Bool, for any type t. Returns true if the input list is empty and false otherwise.
  • hd : List[t] -> t, for any type t. Returns the head of the input list if the input is not empty, and raises an exception otherwise.
  • tl : List[t] -> List[t], for any type t. Returns the tail of the input list if the input is not empty, and raises an exception otherwise. Equality types These are the types with no occurrences of -> in them. They are defined induc- tively as follows: (i) Bool, Int, and Unit are equality types; (ii) if t is an equality type, so is List[t]; (iii) if t1, …, tn with n > 1 are equality types, so is Tuple[t1, …, tn]; (iv) 4

nothing else is an equality type. Recall that = and != apply only to values of an equality type.

An additional predefined infix operator is ; which has type Tuple[t1, t2] -> t2 for any types t1 and t2. It works exactly as in F# by returning the value of its second input. It is most useful when its first argument contains applications of the print function.

3 Concrete Syntax

The concrete syntax of PLC is described by the grammar rules below, where non-terminal symbols are written in angular brackets and the top symbol is <prog>.

3.1 Production rules

<prog> ::= <expr> | <decl> ; <prog>
<decl> ::=
    var <name> = <expr>
  | fun <name> <args> = <expr>
  | fun rec <name> <args> : <type> = <expr>
<expr> ::=
    <atomic expr>
  | <app expr>
  | if <expr> then <expr> else <expr>
  | ! <expr>
  | - <expr>
  | hd <expr>
  | tl <expr>
  | print <expr>
  | ise <expr>
  | print <expr>
  | <expr> + <expr>
  | <expr> - <expr>
  | <expr> * <expr>
  | <expr> / <expr>
  | <expr> = <expr>
  | <expr> != <expr>
  | <expr> < <expr>
  | <expr> <= <expr>
  | <expr> :: <expr>
  | <expr> ; <expr>
  | <expr> # <nat>
<atomic expr> ::=
    <const>
  | <name>
  | { <prog> }
  | ( <expr> )
  | ( <comps> )
atomic expression
function application
conditional expression
unary operator application
binary operator application

5

constant literal
function, variable or parameter name
local scope block
parenthesized expression
tuple
  | fn <args> => <expr> end
<app expr> ::=
    <atomic expr> <atomic expr>
  | <app expr> <atomic expr>
<const> ::=
    true | false

| <nat>

  • |  ()
  • |  ([ ]:<type>) <comps> ::= <expr> , <expr> | <expr> , <comps> <args> ::= () | ( <params> ) <params> ::= <typed var> | <typed var> , <params> <typed var> ::= <name> : <type> <type> ::= <atomic type> | Tuple [ <types> ] | List [ <type> ] | <type> -> <type> <atomic type> ::= Unit | Bool | Int | ( <type> ) <types> ::= <type> , <type> | <type> , <types>

3.2 Lexical rules

anonymous function
function application
numerals
unit value
type-annotated empty list
tuple components
function arguments
typed variable
tuple type
list type
function type
unit type
Boolean type
integer type

The non-terminal <name> is a token defined by the regular expression [’a’-’z’ ’A’-’Z’ ’ ’][’a’-’z’ ’A’-’Z’ ’ ’ ’0’-’9’]*

excluding the following names, which are keywords:

                     Bool else end false fn fun hd if Int ise
                    List print rec then tl true Tuple Unit var

6

The non-terminal <nat> is a token defined by the regular expression [0-9]+. 3.3 Operator precedence

The various operators and keywords have the following precedence, from lower to higher, with operators on the same line having the same precedence.

; ->
if
else
= !=
< <=
::
+-
* / &&
not hd tl ise print f #

where f is any user-defined function name. 4 Abstract Syntax

(right-associative) (non-associative) (left-associative) (left-associative) (left-associative) (right-associative) (left-associative) (left-associative) (non-associative) (non-associative)

For uniformity, and to make your task easier, we fix an abstract syntax for PLC types, expressions and values as the F# algebraic data types in Figure 4. You must use this abstract syntax in your implementation.

4.1 Types

F# terms of type plcType are used to encode PLC types. Here are examples of PLC code and their corresponding abstract syntax:

Concrete syntax

Int
Unit
Int -> Int
Int -> Int -> Bool
(Int -> Int) -> Bool
Tuple[Int, Int, Bool]
Tuple[Int, Int] -> Bool
List[Int]
List[Tuple[Bool,Int]]

Abstract syntax

IntT
TupT []
FunT (IntT, IntT)
FunT (IntT, FunT (IntT, BooT))
FunT (FunT (IntT, IntT), BooT)
TupT [IntT; IntT; BooT]
FunT (TupT [IntT; IntT], BooT)
List IntT
List (Tuple [BooT; IntT])

Note that the plcType constructor TupT is used to represent both the Unit type, with TupT [], and tuples types, with TupT [t1; …; tn] for n > 1.

7

type plcType =
  | IntT
  | BooT
  | FunT of plcType * plcType
  | TupT of plcType list
  | LisT of plcType
type expr =
  | ConI of int
  | ConB of bool
  | EList of plcType
  | Var of string
  | Let of string * expr * expr
  | Letrec of string * string * plcType
              * expr * plcType * expr
  | Prim1 of string * expr
  | Prim2 of string * expr * expr
  | If of expr * expr * expr
  | Call of expr * expr
  | Tuple of expr list
  | Sel of expr * int
  | Anon of string * plcType * expr
type plcVal =
  | BooV of bool
  | IntV of int
  | LisV of plcVal list
  | TupV of plcVal list
  | Clos of string * string * expr *
// Int
// Bool
// type -> type
// Unit and Tuple[type, ..., type]
// List[type]
// integer constants
// Boolean constants
// typed empty list constant
// variables
// expressions with variable declaration
// expressions with recursive function decl.
// unary operators
// binary operators
// if construct
// function application
// Unit Constant / tuple construction
// Tuple selector application
// anonymous function
// Booleans
// integers
// lists
// tuples
// closures

plcVal env

4.2 Expressions

Figure 4: Abstract syntax for PLC programs.

F# terms of type exp are used to encode PLC programs and expressions. Here are examples of PLC code and their corresponding abstract syntax:

8

Concrete syntax

15
true
()
(6, false)
(6, false)#1
([]:List[Bool])
print x; true
3::7::t
fn (x:Int) => -x end
var x = 9; x + 1
fun f(x:Int) = x; f(1)
fun rec f(n:Int) =
  if n <= 0 then 0
  else n + f(n-1) ;

f(5)

Abstract syntax

ConI 15
ConB true
Tuple []
Tuple [ConI 6; ConB false]
Prim2 ("#", Tuple [ConI 6; ConB false], ConI 1)
EList BooT
Prim2 (";", Prim1 ("print", Var "x"), ConB true)
Prim2 ("::", ConI 3, Prim2 ("::", ConI 7, Var "t"))
Anon ("x", IntT, Prim1("-", Var "x"))
Let ("x", ConI 9, Prim2 ("+", Var "x", ConI 1))
Let ("f", Anon ("x", IntT, Var "x"), Call ("f", ConI 1))
Letrec ("f", "n", IntT,
  If (Prim2 ("<=", Var "n", ConI 0), ConI 0,
    Prim2 ("+", Var "n", Call (Var "f", ...))), IntT,
    Call (Var "f", ConI 5))

The Tuple constructor, which takes a list of expressions as arguments is used to represent tuple expressions. It is also used to represent the Unit expression (), as Tuple []. Note that the empty list constant EList carries the list type with it, which is needed for type checking. Also note that #, represented by the Sel constructor, is treated as binary operator for convenience; however, its second argument must be a numeral.

Anonymous functions of the form fn (x:t) => e end are represented as Anon (x, t′, e′) where t′ is the abstract syntax representation of type t and e′ is the abstract representation of the function’s body e. Otherwise, the conversion to abstract syntax should be generally done as in Hw6. In particular, multi-argument functions should also be converted as in Hw6, using nested Let expressions.

4.3 Values

F# terms of type plcValue are used to encode PLC values. The PLC interpreter is essentially a converter from expr terms to plcValue terms. Here are examples of such conversions.

Expression

  1. ConI 15
  2. ConB true
  3. Tuple []
  4. Tuple [ConI 6; ConB false]
  5. Prim2 (“#”, Tuple [ConI 6; ConB false], ConI 1)
  6. EList BooT
  7. Prim2 (“;”, Prim1 (“print”, ConI 27), ConB true)
  8. Prim1 (“print”, ConI 27)
  9. Prim2 (“::”, ConI 3, Prim2 (“::”, ConI 4, Prim2 (“::”, ConI 5, EList IntT)))
  10. Anon (“x”, IntT, Prim1(“-“, Var “x”))
  11. Let (“x”, ConI 9, Prim2 (“+”, Var “x”, ConI 1))
  12. Let (“f”, Anon (“x”, IntT, Var “x”), Call (“f”, ConI 1)) 9

Value

  1. IntV 15
  2. BooV true
  3. TupV []
  4. TupV [IntV 6; BooV false]
  5. IntV 6
  6. LisV []
  7. BooV true
  8. TupV []
  9. LisV [3; 4; 5]
  10. Clos (“”, “x”, Prim1(“-“, Var “x”)), []) (in case of an empty environment)
  11. IntV 10
  12. IntV 1

Anonymous function expressions of the form Anon (x, t, e) should evaluate to the value Clos (“”, x, e, env) where env is the current environment.

With expressions of the form Prim1(“print”, e), the interpreter should first evaluate e to some value v, convert v to a string representation in concrete syntax, and then print that string to the standard output followed by a new line character. For the string conversion, you can use the helper function val2string : plcVal -> string already provided in module Absyn.

What other well-typed PLC expressions should evaluate to should be clear from Hw6. If you are not clear about specific cases, please ask the instructors.

5 Implementation

Your implementation of PLC should be divided in the following F# modules. Each module should be in its own file, with the same name and with extension .fs. You are required to follow this modularization both for your own sake, and to ease our evaluation of your code.

• Environ
This module defines a generic environment type and associated lookup function. It is already provided in file Project/Environ.fs in Project.zip. You will need instances of that type and will use lookup in the type checker and in the interpreter.

• Absyn
This module defines the abstract syntax. It is already provided in the file Project/Absyn.fs in Project.zip. It contains the helper function val2string that can be use to implement print.

• PlcParserAux
This module defines a few helper functions for the parser, as in Hw6. It is already provided in the file Project/PlcParserAux.fs. Use its functions in the parser specification.

• PlcParser
This module contains the parser for the PLC language. You should generate it with FSYacc in a file called PlcParser.fs from the provided file PlcParser.fsy which contains a partial FSYacc specification of the language. You are to complete the specification in PlcParser.fsy by adding production rules. Do not change any of the already defined tokens and their prece- dence.

10

  • Lexer
    This module contains the lexer for the PLC language. You should generate it with FSLex from a file name Lexer.fsl that you have to write. That file should use the tokens defined in PlcParser.fsy and recognize the operators and keywords of PLC. Your lexer may support comments, which in PLC have the form (* … *), but it does not need to.
  • Parse
    This module defines a function fromString, to parse a PLC program from a string, and fromFile, to parse a PLC programs from a text file. You can use these functions to test your parser. The module is already provided for you in file Parse.fs.
  • PlcChecker
    This module contains the type checker. It should provide a function teval : expr -> plcType env -> plcType that, given an abstract syntax expression e and a type environ- ment for the free variables in e, if any, returns the type of e if e is well-type and fails (with failwith) otherwise. You are to implement teval in file PlcChecker.fs following the typing rules specified in Appendix A.
  • PlcInterp
    This module contains the interpreter. It should provide a function eval : expr -> plcValue env -> plcValue that, given a well-typed expression e and a value environment for the free variables of e, if any, returns the value of e in that environment. You are to implement eval in file PlcInterp.fs. Note that it is expected that eval will diverge (never returning a value) if e denotes a non- terminating computation; for instance, if e comes from a program like
    fun rec f(x:Int):Int = f(x – 1); f 0.
  • Plc
    This module defines a function run : expr -> string that takes an abstract syntax expres- sion e, type checks the expression with teval, evaluates it with eval, and then returns a string containing the value and type of e in concrete syntax. It is already provided for you in file Plc.fs. You can use it together with parse.fromString or parse.fromFile to test your implementation of the type checker and the interpreter. For your convenience, the archive Project.zip contains also a bin folder with the FSLex and FSYacc executables. 6 Extra credit, optional
    6.1 Extending list syntax (low difficulty) Extend the lexer and the parser as needed to support the F#-style syntax [e1; e2; …; en] for lists. Treat [e1; e2; …; en] as syntactic sugar for e1::e2::· · ·::en::([]:t) where t is the type of the list. 11

6.2 Curried Functions (moderate difficulty)

Extend your implementation to support an F#-style curried multi-argument syntax for functions, allowing for instance the declarations in the following program.

var highAdd = fn (x:Int) (y:Int) => x + y end ;
fun leq (x:Int) (y:Int) = x <= y ;
fun rec move (l1 : List[Int]) (l2 : List[Int]) : List[Int] =
  if ise(l1) then l2 else move (tl(l1))(hd(l1) :: l2) ;

Observe that these functions are higher-order (highAdd : Int -> Int -> Int, leq : Int -> Int -> Bool, move : List[Int] -> List[Int] -> List[Int]) hence the can be evaluated par- tially, as in {var add3 = highAdd(3); …}. Also, you should still allow the individual arguments to be of tuple type as in fun pairAdd (x1:Int, x2:Int) (y1:Int, y2:Int) => (x1 + y1, x2 + y2).

A Typing rules for PLC

In the following, x denotes variable/function names; n denotes numerals; e, e1, e2 denote PLC expressions; s, t, ti denote PLC types; ρ denotes a type environment, that is, a partial mapping from variable/function names to types; ρ[x 􏰉→ t] denotes the environment that maps x to t and is otherwise identical to ρ; type(e, ρ) = t abbreviates the statement: “the type of expression e in environment ρ is t.”

The rules below define the type system of PLC. An expression e is well typed and has type t in a typing environment ρ if and only if you can conclude type(e, ρ) = t according to these rules.

  1. type(x, ρ) = ρ(x)
  2. type(n, ρ) = Int
  3. type(true, ρ) = Bool
  4. type(false, ρ) = Bool
  5. type((), ρ) = Unit
  6. type((e1, …, en), ρ) = Tuple(t1, …, tn) if n > 1 and type(ei, ρ) = ti for all i = 1,…,n
  7. type(([] : t), ρ)=t if tisalisttype.
  8. type(var x = e1 ; e2, ρ)=t2 if type(e1, ρ) = t1 and type(e2, ρ[x 􏰉→ t1]) = t2 for some type t1

9.type(funrecf (x:t):t1 =e1 ;e2,ρ)=t2

if type(e1, ρ[f 􏰉→ t -> t1][x 􏰉→ t]) = t1 and type(e2, ρ[f 􏰉→ t -> t1]) = t2 10. type(fn (x : s) => e end, ρ)=s -> t if type(e, ρ[x􏰉→s])=t

12

  1. type(e2(e1), ρ) = t2 if type(e2, ρ) = t1 -> t2 and type(e1, ρ) = t1 for some type t1
  2. type(if e then e1 else e2, ρ) = t if type(e, ρ) = Bool and type(e1, ρ) = type(e2, ρ) = t
  3. type(!e, ρ) = Bool if type(e, ρ) = Bool
  4. type(-e, ρ) = Int if type(e, ρ) = Int
  5. type(hd(e), ρ) = t if type(e, ρ) = List[t]
  6. type(tl(e), ρ) = List[t] if type(e, ρ) = List[t]
  7. type(ise(e), ρ) = Bool if type(e, ρ) = List[t] for some type t
  8. type(print(e), ρ) = Unit
  9. type(e1 && e2, ρ) = Bool if type(e1, ρ) = type(e2, ρ) = Bool
  10. type(e1 :: e2, ρ) = List[t] if type(e1, ρ) = t and type(e2, ρ) = List[t]
  11. type(e1 op e2, ρ) = Int if op ∈ {+,-,*,/} and type(e1, ρ) = type(e2, ρ) = Int
  12. type(e1 op e2, ρ) = Bool if op ∈ {<, <=} and type(e1, ρ) = type(e2, ρ) = Int
  13. type(e1 op e2, ρ) = Bool if op ∈ {=, !=} and type(e1, ρ) = type(e2, ρ) = t for some equality type t
  14. type(e # i, ρ) = ti if type(e, ρ) = Tuple(t1, …, tn) for some n > 1 and types t1, …, tn, and i ∈ {1,…,n}
  15. type(e1 ; e2, ρ)=t if type(e2, ρ)=t

13