2 The H2035 language
You are going to write a Prolog program which will handle expressions of the future H2035 language. It is a statically typed functional language, with parametric overload and polymorphism similar to Haskell. Your job will be to implement the development phase which does type inference, syntactic sugar removal, and conversion to DeBuijn, as well as the evaluator.
The syntax of the language is as follows:
where forall is the type given to polymorphic expressions c represents any predefined constant, this includes for example integers and list operations; lambda and app define and call functions, and let introduces a set of local declarations, where those placed between […] can be (mutually) recursive (hence the r in rdecl).
As we can see, the syntax includes almost no type annotation, because the types will be completely inferred, just like in Haskell. The only type annotation is of the form (e:τ), which helps to help inference when necessary or to verify that the inference finds the correct type. In addition to this, the syntax also includes the term? Which is the inverse of type inference, that is, the code will be inferred based on the expected type.
The language syntax is based on the syntax of Prolog terms, so for example, x = e can also be written = (x, e), and + (e1, e2) can be written e1 + e2, and your code n You don’t have to worry about it because these differences only affect the parser, which is already provided by Prolog itself.
The language includes among its primitives operations on lists whose names derive from those used in Lisp: cons, car, cdr, nil. At runtime, these lists are represented by Prolog lists.
3 Elaboration
After the syntactic analysis (which we do not see in the code since it is made by Prolog), the code goes through a development phase which will implement certain features of the languages: the elimination of syntactic sugar, the inference of types (and code), and “disambiguation” of identifiers.
3.1 Syntactic sugar
H2035 uses a little syntactic sugar that the development phase will have to eliminate. More specifically, the following equivalence rules should be used:
3.2 Overload
H2035, unlike Psil, Slip, and Haskell, offers what is known as overloading. This feature allows you to have several active definitions for the same identifier. Usually, when an identifier is defined more than once, the closest definition hides the others (in English, we use the term shadowing), but in H2035 this is not the case. Thus, each use of an identifier introduces an ambiguity if there are several definitions which correspond to it. The elaboration will resolve these ambiguities using the type information. Example :
These three definitions of x each have a different type, so the elaboration will be able to use this typing information to know that in the final expression, each of the three x refers to another definition.
The expression? Generalizes this idea in case the name of the variable is not specified and the elaboration can use any variable whose type matches. Of course, one would expect the programmer to use this term only if there is only one variable of the correct type.
The code returned by the elaborate predicate (and expected by eval) no longer has such ambiguities, because it uses what are called de Bruijn indices rather than variables with names. Specifically, it must use the following syntax:
where var (i) is a reference to the i-th environment variable (counting the most recently added variable as 0). So the code above will become:
where Np1, Nc, and Np2 are the integers corresponding to the respective position (or depth, depending on how you look at it) ofons, and + in the environment. Note that Np1 and Np2 refer to + but will not be equal; more specifically, Np2 = 1 + Np1 since the environment for the second use of + contains an additional variable (the ) compared to the environment for the first use of +.
3.3 Type inference
H2035 uses static typing similar to Haskell’s, where types are inferred. At first, I recommend that you do a simple inference, without polymorphism (nor mutual recursion), then in a second step you will add
this is called let-polymorphism, using the Hindley-Milner type inference algorithm.
The typing rules without polymorphism are given in Fig. 1. In these rules, the judgment
Γ├e: τ means that the expression e has type τ in a context Γ which gives the type of each variable whose scope covers e.
As you can see, the let rule does not allow let to simultaneously declare more than one identifier, and therefore does not allow mutual recursion. You may also notice that these rules assume that the syntactic sugar has already been removed, which is why there is no rule for example for the form x (e1, .., en).
3.3.1 Mutual recursion and inference of polymorphism
The second step introduces Hindley-Milner let-polymorphism which will therefore automatically infer when a definition is polymorphe, as well as automatically specialize a polymorphic definition each time it is used. This step also adds recursive definitions.
The new typing rules for the second step are given in Fig. 2. These rules replace the variable typing rule, and the let typing rule. In these new rules, σ is used to indicate a type that can be polymorphe (i.e. with a forall), while τ represents a monomorphe type (i.e. without forall).
In the variable typing rule, the σ⊂τ indicates that τ is an instance of σ, for example, σ could be forall (α, α → α)), and τ could be int → int which is the corresponding instance at the choice of int for α. It is therefore in this rule that forall are eliminated.
The let typing rule uses an auxiliary function gen which will generalize a definition, by discovering its polymorphism. It simply works by looking in the type τ of the definition for uninstantiated (or free: fv means “free variables”) variables: these can be generalized (unless they also appear in the context Γ), then giving a polymorphic type. It is therefore this rule that introduces the forall.
This new rule of law not only allows the introduction of multiple simultaneous definitions
(x1 … xn), but also allows mutual recursion. This can be seen in the fact that ei is typed in an environment which is not limited to Γ but also includes the identifiers x1 … xn, which therefore allows ei to make recursive references not only to xi but also to n ‘Any of the other identifiers defined by this let.
Important detail for the let rule: if we have for example the declarations y = x, x = 1 and we infer the type of y before inferring the type of x, the type of y will be incomplete that the type of x will not be inferred. For this reason, it is important to infer the type of all ei before passing them to gen, otherwise gen is likely to think that y is polymorphic.
4 Your work
Your job is to complete the code for the elaborate and eval predicates. The code provided to you already contains a skeleton of these predicates as well as a few other predicates which will be useful mainly for the second phase:
—Instantiate: implements the σ⊂tau rule.
—Freelvars: implements the fv function.
—Generalize: does the job of gen.
5 Discount
You must submit two files, h2035.pl and rapport.tex. These files will be delivered in electronic form, through Moodle / StudiUM.
6 Details
The score is based on the one hand on automatic tests, on the other hand on the reading of the code, as well as on the report. The most important criterion, and that your code must behave correctly. In general, a simple solution is more often correct than a complex solution. Then comes the quality of the code: the simpler, the better. If there are a lot of comments, it is usually a symptom that the code is not clear; but of course, without comments the code (even simple) and often incomprehensible. The efficiency of your code is irrelevant, unless it uses a really particularly ridiculously inefficient algorithm. Points will be distributed as follows: 25% on the report, 25% on the automatic tests, 25% on the code of the ‘elaboration, and 25% on the evaluator’s code. – The code must in no case exceed 80 columns. – Check the course web page, for any errata, and other additional indications. – All use of material (code or text) borrowed from someone else (found on the web, …) must be duly mentioned, otherwise it will be considered plagiarism.