9/26/21, 10:21 AM COMP3000 Assignment Two description
https://ilearn.mq.edu.au/pluginfile.php/7309123/mod_resource/content/12/COMP3000_Ass2_2021.html 1/11
Macquarie University
Department of Computing
COMP3000 Programming Languages 2021
Assignment Two
Syntax Analysis
Due: see submission page
Worth: 15% of unit assessment
Marks breakdown:
Code: 90%
Your unit tests: 10%
Submit a notice of disruption via ask.mq.edu.au if you are unable to submit on time for medical or other legitimate reasons.
Late penalty without proper justification: 20% of the full marks for the assessment per day or part thereof late.
Overview
This assignment asks you to develop a lexical analyser, parser and tree builder for a simple functional programming language called
HasLang. We will build on these components in assignment three to complete a full implementation of a subset of this language.
Building this implementation will give you insight into the way that programming language implementations work in general, as well as
specific experience with how functional language programs are written, how they are compiled, and how they are executed.
This kind of task often arises in programming situations other than language implementation. For example, many applications have
configuration files that are written in simple languages. The application must be able to read these files reliably and understand their
structure, just as a compiler must read program files and understand them.
https://ask.mq.edu.au/
9/26/21, 10:21 AM COMP3000 Assignment Two description
https://ilearn.mq.edu.au/pluginfile.php/7309123/mod_resource/content/12/COMP3000_Ass2_2021.html 2/11
HasLang
HasLang is a language that contains elements from lazy functional languages such as Haskell, Clean and Miranda; it uses a Haskell-like
syntax.
The description here is a brief overview of the HasLang language. Aspects such as checking the validity of names or types and
translating the program into an executable form are beyond the scope of syntax analysis and hence are not considered in this
assignment. We will address some of this in Assignment Three.
The basic unit of a HasLang program is the expression; there are no statements. In fact, a program is just a single expression. For
example, here is a simple program that returns the result of a simple arithmetic operation:
2 + 3 * 4
When this program is run it will print the value of the expression: the number 14.
Let expressions are used to build programs out of smaller expressions. A let expression is the let keyword followed by one or more
definitions, followed by the in keyword and a single expression. The idea is that the definitions can give names to values and functions.
The value of a let expression is given by its final expression, which can use the defined names. For example, here is a program
consisting of a let expression that uses two values:
let
a :: Int = 5;
b :: Int = a + 1
in
a * b
The definitions in a let expression are separated by semicolons. This means the last definition and the body expression are not
terminated by semicolons.
This program will print the result of multiplying a by b, so 30 will be printed. �The name a can be used in the definition of b since b is
defined later, but that is a name analysis issue, so we don’t need to worry about it here.) There are no assignment statements, so the
value bound to a particular occurrence of a name cannot be changed.
All variable declarations (which are also called “bindings”) must include their type. In the example above they are both integers (using ::
to specify type). Some of the types are Int, Bool, and the type of a function which takes a value and returns a value (for example, the
type of a function that takes an integer and returns a boolean is Int -> Bool).
Definitions can also define functions. For example, here is a program that defines a value and a function, and calls the function passing
the value as a parameter:
9/26/21, 10:21 AM COMP3000 Assignment Two description
https://ilearn.mq.edu.au/pluginfile.php/7309123/mod_resource/content/12/COMP3000_Ass2_2021.html 3/11
let
x :: Int = 100;
inc :: Int -> Int = \a :: Int -> a + 1
in
inc x
}
This program will print 101.
Parameter passing is not done with parenthesis as in C/C��/Java/Scala, it is done simply by putting the parameter to the right of the
function name as above.
Functions in HasLang are defined in the very same way as variables, in fact it is best to think of them as variables which contain
functions rather than primitive values. In the example above, inc’s value is an anonymous function; the notation for this is a lambda (\)
followed by the single argument parameter (with type), then an arrow and the body of the function. So inc in the above example is
equivalent to the java method:
int inc (int a) {
return a + 1
}
A lambda function is an expression that can be used by itself without becoming the value of a variable (as with inc). For example, the
following expression evaluates to 4:
(\a :: Int -> a + 1) 3
Functions can also be defined in a multi-line form; for example, here is a definition of the factorial function fac (which would need to be
inside a let):
fac :: Int -> Int
fac 0 = 1.
fac n = n * fac (n – 1)
In this form, the first line provides the name and function signature. The second and third lines define how the function works by pattern
matching on the left of =. �Note that the third line is separated by “.” at the end of the second line. This is something that is not in Haskell
but is needed here as we are not using an end-of-line token.) The second line indicates that if fac has a parameter value of 0 then the
result will be 1. The third line, only used if the pattern match on the second line fails, instantiates a new variable n with the value of fac’s
parameter. There are no constraints on the pattern matching, so the third line will always succeed (providing the second line has failed).
With the new variable n, the expression on the right of = is evaluated. Note that the brackets are required otherwise n * (fac n) – 1 would
be evaluated.
9/26/21, 10:21 AM COMP3000 Assignment Two description
https://ilearn.mq.edu.au/pluginfile.php/7309123/mod_resource/content/12/COMP3000_Ass2_2021.html 4/11
All of these programs have just one expression at the top level. In fact, that’s the definition of a program in HasLang: a single expression.
Expression forms are interchangeable as long as they have the correct type. E.g., anywhere we can put a number, can also take a block
or some other kind of expression that evaluates to a number. For example, here is an artificial program that uses lets nested inside an
arithmetic operation.
(let
a :: Int = 3
in
a + 1
)
*
(let
b :: Int = 10
in
b – 1
)
This program will print 36 since it is multiplying 4 times 9.
We’ve seen a few different forms of expression: numbers, addition expressions, multiplication expressions and function call expressions.
There are also other arithmetic operations, Boolean values, Boolean literals, relational operators and conditional expressions.
There are also lists and tuples like in Scala and the list cons operator : (whereas in Scala it is ::). The complete syntax of HasLang is
given below.
Finally, HasLang comments are as in Haskell: either beginning with two dashes and continuing to the end of the line, or surrounded by {-
and -}.
— The answer to life the universe and everything
42 {- I’m also a comment -}
Types in HasLang
We will have three basic types:
Int – integers
Bool – Booleans true, false
() – this is the Unit type, meaning no type, e.g. for a function that doesn’t return anything
9/26/21, 10:21 AM COMP3000 Assignment Two description
https://ilearn.mq.edu.au/pluginfile.php/7309123/mod_resource/content/12/COMP3000_Ass2_2021.html 5/11
The composite types we will use:
function types of the form: type → type
list types – written with square brackets, e.g. [Int] for a list of integers
tuple types – written with round brackets and commas, e.g. (Int,Bool)
Patterns in HasLang
As described in the fac example, pattern matching is done on the values of function parameters. The basic allowed patterns (for us) are:
a literal value – for us that is just integers and true, false; will match if the parameter value is the same as the literal value
an identifier – will always match, instantiating the variable
wildcard – will always match; written as an underscore (like in Scala)
The composite patterns are recursively defined where pat is another pattern:
a list – written as: [pat, … , pat] which matches the elements of a list value; can also have just []
a tuple – written as: (pat, … , pat) which matches the values of a tuple
a cons – written as: pat : pat which matches head and tail of a list value
Here is another example of a function that calculates the length of a list of integers using list and cons patterns:
length :: [Int] -> Int
length [] = 0.
length h:t = 1 + length t
length is a function that takes a list of integers �[Int]) as its input and outputs an integer (→ Int). If length’s argument is the empty list ([])
then the result is 0; otherwise decompose a non-empty list into is head and tail (h:t where : means cons) and then the result is one more
than the length of the tail.
The syntax of HasLang
To guide your implementation, here is a context-free grammar for the HasLang language on which you can base your parser.
9/26/21, 10:21 AM COMP3000 Assignment Two description
https://ilearn.mq.edu.au/pluginfile.php/7309123/mod_resource/content/12/COMP3000_Ass2_2021.html 6/11
First, the syntax of programs and expressions:
program : exp.
exp : exp exp
| “let” definitions “in” exp
| “if” “(” exp “)” “then” exp “else” exp
| exp “==” exp
| exp “<" exp
| exp "+" exp
| exp "-" exp
| exp "*" exp
| exp "/" exp
| exp ":" exp
| "false"
| "true"
| ident
| integer
| "(" exp ")"
| "[" "]" | "[" exp ( "," exp )* "]"
| "(" exp ( "," exp )+ ")"
| "\" ident "::" tipe "->” exp.
The syntax of definitions that can occur in lets:
definitions : defn (“;” defn)*.
There are two forms of definitions:
single line like the x and inc examples
multi-line like the fac example; it uses funline for each line after the first line (and they must be separated by “.”)
defn : ident “::” tipe “=” exp
| ident “::” tipe funline ( “.” funline )*.
funline : ident pat* “=” exp.
Functions that are defined in the same let are grouped together and they can call each other.
The syntax of types:
tipe : “Bool”
| “Int”
9/26/21, 10:21 AM COMP3000 Assignment Two description
https://ilearn.mq.edu.au/pluginfile.php/7309123/mod_resource/content/12/COMP3000_Ass2_2021.html 7/11
| “(” “)”
| tipe “->” tipe
| “(” tipe “)”
| “[” tipe “]”
| “(” tipe ( “,” tipe )+ “)”.
We use the word “tipe” instead of “type” since the latter is a Scala keyword which prevents us from using it as the name of a parser in
our code. A function type is specified using the arrow -> and describes the type of a function that takes a value of the type on the left of
the arrow and returns a value of the type on the right of the arrow. �Type analysis issues are also out of scope for this assignment.)
The syntax of patterns:
pat : “_”
| integer
| “true”
| “false”
| ident
| pat “:” pat
| “(” pat “)”
| “[” pat ( “,” pat )* “]”
| “(” pat ( “,” pat )+ “)”.
The grammar above is not immediately suitable for encoding as a parser. The exp, tipe and pat non-terminals are ambiguous since they
make no allowance for precedence and associativity of the operators. You should rewrite the grammar productions to implement the
following precedence and associativity rules:
The following expression constructs have precedence as shown from lowest to highest with constructs on the same line having the
same precedence:
1. conditional expression (if/then/else)
2. equal and less than
3. list cons
4. addition and subtraction
5. multiplication and division
6. all other kinds of expression: let, lambda function, function application
All binary expression operators are left associative, except for:
subtraction and division which are right associative (note: this is NON�STANDARD and it is just for the purposes of this
assignment so as to better test right assocativity)
the relational operators (equality and less than) which are not associative
the list cons operator is right associative.
the function type (->) is right associative.
9/26/21, 10:21 AM COMP3000 Assignment Two description
https://ilearn.mq.edu.au/pluginfile.php/7309123/mod_resource/content/12/COMP3000_Ass2_2021.html 8/11
Note that function application needs to be treated like the other binary operators in terms of handling its precedence and
associativity.
The parser skeleton you will be given already handles the lexical issues such as parsing integers, identifiers and comments.
What you have to do
You have to write and test a Scala syntax analyser including tree builder for HasLang.
You are strongly advised not to try to solve the whole assignment in one go. It is best to write code to handle the parsing and tree
construction for some simple constructs first and then build up to the full language.
Your code must use the Scala parsing library as discussed in lectures and practicals. You should use the expression language syntax
analyser and tree builder from the weekly classes as a guide for your implementation.
The supplied code bundle has modules that are very similar to those used in the practical exercises for weeks 5, 6. The skeleton
contains the modules you will need. Some of the parsing and tree construction is given to you as an illustration; you must provide the
rest (look for FIXME in the code).
As well as lexing and parsing the input, your program should construct a suitable source program tree to represent the parsed result.
See HasLangTree.scala in the skeleton for the full definition and description of the tree structures that you must use. You do not need to
modify the tree classes, just create instances in your parser code.
As an example of the desired tree structure, here is the tree that should be produced from the first program above:
Program(PlusExp (IntExp (2), StarExp (IntExp (3), IntExp (4))))
Notice that the higher precedence of multiplication over addition has been taken into account in this tree.
As a more complex example, here is the tree for the inc function program above:
Program(
LetExp(
Vector(Defn(
IdnDef(“x”, IntType()),
Vector(FunLine(“”, Vector(), IntExp(100)))),
Defn(
IdnDef(“inc”, FunType(IntType(), IntType())),
Vector(FunLine(“”, Vector(), LamExp(
IdnDef(“a”, IntType()),
PlusExp(IdnUse(“a”), IntExp(1)))))
9/26/21, 10:21 AM COMP3000 Assignment Two description
https://ilearn.mq.edu.au/pluginfile.php/7309123/mod_resource/content/12/COMP3000_Ass2_2021.html 9/11
)),
AppExp(IdnUse(“inc”), IdnUse (“x”))))
The program contains a single let expression with two definitions: one for “x” and one for “inc”. The function definition contains two
children: one for the name, and one for the function itself. The lambda expression contains two children: the argument name and type,
and one for the body expression. Finally, the let has the function call as its value expression.
Here is the length function in a let:
let
x :: [Int] = [3, 1, 4];
length :: [Int] -> Int
length [] = 0.
length h:t = 1 + length t
in
length x
And the tree it produces:
Program(
LetExp(
Vector(Defn(
IdnDef(“x”, ListType(IntType())),
Vector(FunLine(“”, Vector(),
ListExp(Vector(IntExp(3),
IntExp(1), IntExp(4)))))),
Defn(
IdnDef(“length”, FunType(ListType(IntType()),
IntType())),
Vector(FunLine(“length”, Vector(ListPat(Vector())),
IntExp(0)),
FunLine(“length”,
Vector(ConsPat(IdentPat(“h”),
IdentPat(“t”))),
PlusExp(IntExp(1),
AppExp(IdnUse(“length”),
IdnUse(“t”)))))
)),
AppExp (IdnUse (“length”), IdnUse (“x”))))
Running the syntax analyser and testing it
9/26/21, 10:21 AM COMP3000 Assignment Two description
https://ilearn.mq.edu.au/pluginfile.php/7309123/mod_resource/content/12/COMP3000_Ass2_2021.html 10/11
The skeleton for this assignment is designed to be run from within sbt. For example, to compile your project and run it on the file
test/simple.hal you use the command
run test/simple.hal
Assuming your code compiles and runs, the run will print the tree that has been constructed (for correct input), or will print a syntax
error message (for incorrect input).
The project is also set up to do automatic testing. See the file SyntaxAnalysisTests.scala which provides the necessary definitions to test
the syntax analyser on some sample inputs. Note that the tests we provide are not sufficient to test all of your code. You must augment
them with other tests which you will put at the BOTTOM of the tests file.
You can run the tests using the test command in sbt. This command will build the project and then run each test in turn, comparing the
output produced by your program with the expected output. Any deviations will be reported as test failures.
What you must hand in and how
A zip file containing only:
SyntaxAnalysis.scala
SyntaxAnalysisTests.scala
Your submission should include all of the tests that you have used to make sure that your program is working correctly. Note that just
testing one or two simple cases is not enough for many marks. You should test as comprehensively as you can.
Submit your code electronically as a single zip file called ass2.zip using the appropriate submission link on the COMP3000 iLearn
website by the due date and time.
DO NOT SUBMIT YOUR ASSIGNMENT IN ANY OTHER FORMAT THAN ZIP containing .scala files. Use of any other format slows down the
marking and may result in a mark deduction.
Marking
The assignment will be assessed according to the assessment standards for the unit learning outcomes.
Your code will be assessed for correctness and quality with respect to the assignment description. Marking also will assess the
adequacy of your tests – this will be 10% of the marks for the assignment.
9/26/21, 10:21 AM COMP3000 Assignment Two description
https://ilearn.mq.edu.au/pluginfile.php/7309123/mod_resource/content/12/COMP3000_Ass2_2021.html 11/11
Tony Sloane, Matt Roberts, Kym Haines
Last modified: 28 Aug 2021
Copyright (c) 2021 by Macquarie University. All rights reserved.