Assessed Exercise, Task 2- Semantic analysis
Generic expressions Generic expressions
Function
invocations
Γ ⊢⊢ subr : (α1 , … , αn) → β Γ ⊢⊢ E1 : α1 … Γ ⊢⊢ En : αn
Γ ⊢⊢ subr (E1, … , En) : β
Block
expressions
Γ ⊢⊢ E1 : α1 … Γ ⊢⊢ En : αn
Γ ⊢⊢ E1 ; … ; En : αn
If statements
Γ ⊢⊢ C : bool Γ ⊢⊢ B1 : α Γ ⊢⊢ B2 : α
Γ ⊢⊢ if C then {B1} else {B2} : α
Numerical expressions Numerical expressions
Constants
Γ ⊢⊢ ⟨IntLit⟩ : int
Addition
Γ ⊢⊢ P : int Γ ⊢⊢ Q : int
Γ ⊢⊢ (P + Q) : int
Subtraction
Γ ⊢⊢ P : int Γ ⊢⊢ Q : int
Γ ⊢⊢ (P – Q) : int
Multiplication
Γ ⊢⊢ P : int Γ ⊢⊢ Q : int
Γ ⊢⊢ (P * Q) : int
Division
Γ ⊢⊢ P : int Γ ⊢⊢ Q : int
Γ ⊢⊢ (P / Q) : int
Comparisons Comparison expressions
Equality
Γ ⊢⊢ P : int Γ ⊢⊢ Q : int
Γ ⊢⊢ (P == Q) : bool
Less than
Γ ⊢⊢ P : int Γ ⊢⊢ Q : int
Γ ⊢⊢ (P < Q) : bool Greater than Γ ⊢⊢ P : int Γ ⊢⊢ Q : int Γ ⊢⊢ (P > Q) : bool
Less than
or equal
Γ ⊢⊢ P : int Γ ⊢⊢ Q : int
Γ ⊢⊢ (P <= Q) : bool
Greater than
or equal
Γ ⊢⊢ P : int Γ ⊢⊢ Q : int
Γ ⊢⊢ (P >= Q) : bool
Boolean operations Boolean operators
Logical AND
Γ ⊢⊢ P : bool Γ ⊢⊢ Q : bool
Γ ⊢⊢ (P && Q) : bool
Logical OR
Γ ⊢⊢ P : bool Γ ⊢⊢ Q : bool
Γ ⊢⊢ (P || Q) : bool
Logical XOR
Γ ⊢⊢ P : bool Γ ⊢⊢ Q : bool
Γ ⊢⊢ (P ^^ Q) : bool
Statements Statements
Skip statements
Γ ⊢⊢ skip : unit
Assignments
Γ ⊢⊢ x : α Γ ⊢⊢ P : α
Γ ⊢⊢ x := P : unit
While loops
Γ ⊢⊢ C : bool Γ ⊢⊢ B : unit
Γ ⊢⊢ while C do {B} : unit
Repeat loops
Γ ⊢⊢ B : unit Γ ⊢⊢ C : bool
Γ ⊢⊢ repeat {B} until C : unit
Assessed Exercise, Task 2: Semantic analysis
Summary
In this task you are going to implement, in Java, a type
checker for ASTs for the same simple programming language.
The input for this task is in the form of an S-expression
representing an AST, according to the specification described
for Task 1 of the coursework, read from System.in. (We
assume that the input program has passed the syntax analysis
phase, so you may assume that the S-expression is properly
formatted.) The type checker will do the checks (as detailed
below) and print the appropriate output to System.out.
To help you get up to speed, we provide a skeleton project
which you can use as a basis. However, you can write the type
checker in any way you like, as long as it adheres to the
requirements specified below. Please note that we require
type errors to be reported through Java’s exception
handling mechanism, and according to a specific format
(to facilitate automated testing and feedback.) To help in
this, we provide a custom RuntimeException class (feel free to
modify interfaces, arguments, etc., but please keep the
generated output identical).
Once again, you will need a preprocessor (e.g., a lexer and a
parser) to turn the input AST (represented by an S-expression)
into a data structure that you can work on directly. To save your
time, we also provide this as a part of the skeleton project —
however, as the input is now guaranteed to be correctly
formatted, and is in the form of an S-expression, it may now be
manageable to write one manually if you wish. In any case,
please think early and carefully about how you will implement
pattern matching in your code and make sure it works.
Please also carefully read the submission guidelines.
Function declarations and identifier scope
Functions may be declared in the top-level scope, in any order,
and can invoke functions which are defined earlier or later in
the source file. (That is: the order in which functions are
declared, does not matter.) The functions must have distinct
names. We suggest that you use a symbol table to maintain a
record of the functions which have been declared, and their
types.
Any parameters taken by a function must have type either int
or bool. In type-checking a function declaration, it may be
useful to define a separate symbol table for the parameters of
that function, and add the names of those parameters together
with their declared types. This type information can then be
used to type-check the function body. The body of the function
itself is a block, whose type must agree with the return type of
the function.
We may summarise the rules about scope for identifiers as
follows.
The definition of one function fnA, may involve the
invocation of another function fnB, which is declared either
before or after the function fnA.
No two functions may have the same name, and no two
parameters of the same function fnA may have the same
name.
The parameters of a function fnA, may have the same name
as some function fnB.
A function parameter is only in-scope in the body of the
declaration of that function — there are no global variables or
side-effects.
A function parameter must have type either int or bool.
Type deduction rules
The following are the rules which specify how expressions of
various kinds are to be typed.
These are expressions which
take values of type int, in
particular arising from
arithmetic operations on two
values of type int. For any
expression involving an
arithmetic operator as below,
where one of the sub-
expressions involved is not of
type int, should result in an
error.
Comparison expressions are
ones which relate two
expressions of type int, to
produce an expression of type
bool. For any expression
involving a comparison
operator as below, where one
of the sub-expressions
involved is not of type int,
should result in an error.
These are expressions
which combine two values of
type bool using a logical
operation, to produce
another value of type bool.
For any expression involving
a logical operator as below,
where one of the sub-expressions involved is not of type
bool, should result in an error.
These are ‘expressions’
which represent
statements — not having
an interesting value
(represented by assigning
them the type unit). Any
statement such as those
below, which involve a
mismatch of types where
type-agreement is required, or which involve an
expression other than type bool where a boolean
expression is expected, should result in an error.
These are
expressions,
whose types
depend entirely
on the types of
other identifiers
or expressions.
In any case where there is a mismatch of types which
are required to agree, or where an expression other than
type bool is found a boolean expression is expected, an
error should be returned.
Input/Output specification
Your type checker will read an AST, in the form of an S-
expression, from System.in as input. (The format will be the
same as the output of the previous task, but may also contain
whitespace.) It will conduct the following checks (not
necessarily in this order):
1. check that the program has a main function, which takes no
arguments, and which has return type int;
2. check that all functions have distinct names, and all of the
parameters of each function have distinct names;
3. check that all identifiers used in a function, are either the
names of parameters of that function, or the names of other
functions which are defined somewhere;
4. Based on the type deduction rules above, type-check the
function bodies contained in the AST.
If any one of these conditions fail, your program will throw an
instance of an Exception class that we provide for this purpose,
which will present diagnostic information in a standardised
format. If none of these errors are present, the input program
will be printed on System.out (again as an S-expression), but
with the main function becoming the first declared function.
We provide a skeleton project, for you to add code to, to help
you to realise this Task. It performs the work of reading the S-
expression at the input, storing it in an abstract syntax tree; it
also contains the code necessary to verify the existence of a
main procedure and to produce output with this being the first
declaration.
The specific requirements for the output of your code are as
follows:
In the case where an exception is thrown, it should be caught
by the main method in your Java code, which should then
invoke the report method of the exception.
In the case where the program is semantically valid, you do
not have to use whitespace in any specific way in the printed
S-expression, but you may like to use the Java class which
we provide to pretty print these expressions (this is also
included in the skeleton project), for your own
benefit/convenience.
Example #1
Consider the following input AST (as an S-expression).
[
[FunDecl,
Idfr(“fun”), IntType,
[[Idfr(“x”), IntType], [Idfr(“y”), IntType], [Idfr(“z”), IntType]],
[
[IfStmt,
[BinOpExpr, Eq, Idfr(“x”), Idfr(“y”)],
[
[BinOpExpr, Plus, Idfr(“z”),
[BinOpExpr, Eq, IntLit(1), IntLit(1)]
]
],
[IntLit(0)]
]
]
],
[FunDecl,
Idfr(“main”), IntType, [],
[
[FunInvoc,
Idfr(“fun”), [IntLit(1), IntLit(2), IntLit(3)]
]
]
],
[FunDecl,
Idfr(“fun2”), UnitType, [],
[ Skip ]
]
]
This is lexically and syntactically valid, but [BinOpExpr, Eq,
IntLit(1), IntLit(1)] is of type BoolType, and so is
invalid as an operand of Plus. The type checker gives the
following output:
Invalid operands in arithmetic expression:
[BinOpExpr,
Plus,
Idfr(“z”),
[BinOpExpr,
Eq,
IntLit(1),
IntLit(1)]
]
and
[BinOpExpr,
Plus,
Idfr(“z”),
[BinOpExpr,
Eq,
IntLit(1),
IntLit(1)]]
—
Idfr(“z”)
has type INT
,
[BinOpExpr,
Eq,
IntLit(1),
IntLit(1)]
has type BOOL
Example #2
Consider the following input AST (as an S-expression).
[
[FunDecl,
Idfr(“fun”), IntType,
[[Idfr(“x”), IntType], [Idfr(“y”), IntType], [Idfr(“z”), IntType]],
[
[WhileLoop,
[BinOpExpr, Eq, Idfr(“x”), Idfr(“y”)],
[
IntLit(1),
[BinOpExpr, Plus, IntLit(1), IntLit(3)],
Skip,
Idfr(“z”)
]
]
]
],
[FunDecl,
Idfr(“main”), IntType, [],
[
[FunInvoc,
Idfr(“fun”), [IntLit(1), IntLit(2), IntLit(3)]
]
]
],
[FunDecl,
Idfr(“fun2”), UnitType, [],
[Skip]
]
]
This is again lexically and syntactically valid, but the last
expression Idfr(“z”) in the body of the while loop is of
type IntType, rather than UnitType as required by our
typing rules. The type checker gives the following output for the
corresponding AST (S-expression):
Invalid final expression in loop body:
[WhileLoop,
[BinOpExpr,
Eq,
Idfr(“x”),
Idfr(“y”)],
[…,
…,
…,
Idfr(“z”)]
]
—
Idfr(“z”)
has type INT
Do not worry about the formatting of / ANSI colour codes
(which are used in the skeleton project to decorate printed
code) in the output – these are something which we can take
care of in the automated testing.
Resources
The skeleton project (extracted from a sample solution based
on ANTLR); the main file that needs to be modified is
SExpressionsAnalyser.java. Please note that the
skeleton project is currently an incomplete implementation, and
as such only one error will be detected and reported, which is
inadequate. To give you some hints, methods that require
modification are marked with TODO in comments. Some
additional input/output samples can also be found in the
archive; you can run the input samples with the script run-
tests.sh.
The custom RuntimeException class (this is also included in
the skeleton project).
2021/11/27 10:28
⻚码:1/1