CSCI 3366 Programming Languages
Problem Set 3: Tokens, Trees & Substitutions
Due: Tuesday February 22, 2022 6PM (10 Points)
IMPORTANT: In this assignment you will be editing 3 dif- ferent files. Make sure you commit all 3.
Copyright By PowCoder代写 加微信 powcoder
This is a relatively short problem set with just 3 problems. The first is easy, while the second and third are both more challenging. You should read through the problems right away so you can start thinking about them.
If you take a look in the src/bin directory, you’ll see that we’ve broken these problems out into their own modules.
> ls src/bin/
dune subst.ml token.ml typ.ml unify.ml
main.ml subst.mli token.mli typ.mli unify.mli
Recall from the lecture notes on modules that the companion files subst.ml and subst.mli make an OCaml module Subst. The subst.ml gives the imple- mentation, and subst.mli gives the signature of the operations exported by the subst module. Similarly for the other pairs of files giving rise to modules Token, Typ and Unify.
• Problem 1. relates to tokenizing and should be written in the token.ml file.
• Problem 2. relates substitutions and should be written in the subst.ml file.
• Problem 3. relates to a process called unification and should be written in the unify.ml file.
The file main.ml contains a test harness that will call the functions that you write in the files subst.ml, token.ml and unify.ml. You can test your solutions for this problem set in the same way as in the previous problem set:
> dune exec bin/main.exe test
1. For the purposes of this problem set, we’ve renamed a few things. Not least among them, what we called trees in problem set 1, are here called types, we use the symbol typ to avoid a conflict with the OCaml reserved word type.
2. There’s a bit of Latex below, you’ll probably want to read the companion .pdf file instead of this .md file.
1. (2. Points) A tokenizer is responsible for aggregating sequences of characters in the input text into sequences of tokens. For example, in OCaml, a tokenizer might have a token type:
type t = LPar | RPar | Plus
| Int of int
where Int i is the token derived from a sequence of digits representing a number. The challenge in handling the Int token is that the digit sequences representing numbers can be of different lengths. We saw how the righty example parser in class handled tokenizing integers and floating points with a special helper function.
In this exercise, your task is to implement a tokenizer for the above type of tokens. You can assume that integers are unsigned. However, in addition to supporting integers expressed in base-10, your tokenizer must also support integers expressed in base-2 (binary) and base 16 (hexadecimal). To distinguish hexadecimal and binary numbers from decimal numbers, we will prefix the first two with special strings. More specifically, integers can be expressed as:
• Base 10: any sequence of contiguous digits from 0-9 with no spaces between. (e.g. “360” should be converted to Int 360)
• Base 2: The string “0b” followed by a sequence of 0s or 1s representing the number in binary. (e.g. “0b101” should be converted to Int 5)
• Base 16: The string “0x” followed by a sequence of digits from 0-9 or letters from A-F representing the number in hexadecimal.
(e.g. “0xFACE” should be converted to Int 64206)
Write the function Token.tokenizer : string -> Token.t list such that given a call (Token.tokenizer string), the tokenizer function returns a list of the tokens in string. For example, the call
(tokenizer “)0b10 + 23 *( 24”)
would produce a list of tokens
[RPar; Int 2; Plus; Int 23; Times; LPar; Int 24]
You don’t have to worry about the integers overflowing or being too large to store in the OCaml integer type. You can assume that the hexadecimal numbers do not use lowercase letters. Feel free to use recycle some of your
code from problem set 1 and feel free to use the Lib.explode function to break the string into a list of characters.
2. (4 Points) Lets say we have tree-shaped types as we did in problem set 1. In the typ.ml file we have:
type t = C
| Var of int
| Arrow of { from : t ;too :t
let t1 = Arrow { from = Arrow { from = C ; too = C
} ; too = Var 1
let t2 = Arrow { from = Var 2; too = C }
let t3 = Arrow { from = Arrow { from = C ; too = C
; too = Arrow { from = C
Consider the types
representing (resp.) the trees:
; too = C }
t1 =-> t2= -> /\/\/\
-> v1 v2 C -> ->
/\ /\/\ CC CCCC
Write a function Unify.disagreement : Typ.t list -> Typ.t list such that a given call (disagreement typs) returns a list of the topmost and leftmost subtrees where not all of typs agree. For example, the call (disagreement [t1; t2; t3]) would return the list
[ ->; v2; -> ] /\ /\
The typs t1, t2 and t3 agree at their roots but not all of their from fields match. The call (disagreement [t1; t1; t1; t1]) would return the empty list [] – there is no disagreement, and the calll (disagreement [t1; t3]) would return the list of trees:
[v1; ->] /\
3. (4 Points) This problem relates to substitutions. There’s a bit of La- tex formatting code here so make sure that you’re reading the README.pdf version of this problem set writeup.
Let t and u be types (i.e., trees as defined in the Typ module) and let θ = {t1/x1,…,tk/xk} and λ = {u1/y1,…,uj/yj} be substitutions. This problem is concerned with computing the composition of two substitutions, θ ◦ λ. With θ and λ defined as above, the composition θ ◦ λ is obtained from the set of tree/variable pairs
{t1λ/x1,…,tkλ/xk,u1/y1,…uj/yj} by removing two types of pairs:
1. remove any tree/variable pair tλ/x for which tλ = x, and
2. remove any tree/variable pair u/y for which y ∈ {x1 , . . . , xk }.
The notation tλ/x, though it might look complicated, is simply the tree/variable pair resulting from the application of the substitution λ to the tree part of the tree/variable pair t/x. You wrote the function for this in problem set 2 and it is provided for you in this problem set in the Subst module.
As an example, with θ = {(C → y)/x, z/y} and λ = {C/x, C/y, y/z}, the composition θ ◦ λ is obtained from the set
{(C → y)λ/x, zλ/y, C/x, C/y, y/z} = {(C → C)/x, y/y, C/x, C/y, y/z} by deleting the tree/variable pair y/y and by deleting both C/x and C/y.
This leaves over the substitution {(C → C)/x, y/z}.
Write the function Subst.compose : Subst.t -> Subst.t -> Subst.t such that for substitutions s1 and s2, the call (Subst.compose s1 s2) yields the composition of s1 and s2.
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com