Assessed Exercise 1, Task 1
Introduction. In this task you will write a lexer for the simple programming language given here. You are free to write the lexer in any form you want, whether ‘by hand’ or with lexer generator, as long as it adheres to the requirements specified below. Writing the lexer by hand will give you a better understanding of lexical analysis, while using a tools achieves the goal with many fewer lines of code (my example solution, written using the JFlex lexer generator is around 60 lines of code). Another implementation technique (probably easier) is using Java’s regular expression library.
Task. Your lexer takes strings as input, and returns either a suitable token list when the input is lexically valid, or a suitable error exception (when the input is lexically invalid). Note that you are not asked to check if the input is syntactically correct w.r.t. to the context free grammar here. You should only check for lexical correctness.
The lexer has to have the following signature.
class LexicalException extends Exception {
public String msg;
public LexicalException ( String _msg ) { msg = _msg; } }
class Task1Exception extends Exception {
public String msg;
public Task1Exception ( String _msg ) { msg = _msg; } }
interface Lexer {
public List<Token> lex ( String input ) throws LexicalException,
Task1Exception; }
Here List<…> is the list class imported from java.util.List. The lex method takes a string that is to be lexed. The interface Token and all its implementations are given here. You need to use those tokens. The class LexicalException should be thrown whenever input is encountered that does not adhere to the specification. Do not use this exception to indicate any other form of error — use Task1Exception for any other error. Errors should be reported by exception only.
You will also have to implement a method create that, when called, returns an instance of the interface Lexer. For this please use the following fragment, replacing … with the appropriate code.
class Task1 {
public static Lexer create () { … } }
For your convenience, here are the remaining definitions in one file.
Examples. Here is a lexically valid program in the language.
def f(x,y,z) = { if x == y then { z } else { 0 } }
It gives rise to the following token stream.
T_Def
T_Identifier ( “f” )
T_LeftBracket
T_Identifier ( “x” )
T_Comma
T_Identifier ( “y” )
T_Comma
T_Identifier ( “z” )
T_RightBracket
T_EqualDefines
T_LeftCurlyBracket
T_If
T_Identifier ( “x” )
T_Equal
T_Identifier ( “y” )
T_Then
T_LeftCurlyBracket
T_Identifier ( “z” )
T_RightCurlyBracket
T_Else
T_LeftCurlyBracket
T_Integer ( 0 )
T_RightCurlyBracket
T_RightCurlyBracket
Note that the following is also a valid (at the lexical level) input.
;;{{{}}{{{ {{}}}} }}}}}}}}10 10 if if then then then else
Introduction. This task is about parsing. You will write a parser for a fragment of the simple programming language given here.
Task. Write a parser for the language for the language specified by the CFG below.
INT → …
BLOCK → { ENE }
ENE → E | E; ENE
E → INT | BLOCK | skip
The starting symbol is BLOCK. Your parser will take as input a token list, using the definition of tokens from Task 1 (given here). Your parser should return a suitable AST, provided the input adheres to the syntax given above. Otherwise it should throw a SyntaxException given below. As the parser takes a token list as input, it might be helpful to see the grammar in a form where terminal symbols are token names. Since the grammar above does not parse the full language, not all valid tokens are used for valid programs, e.g. T_Then .
BLOCK → T_LeftCurlyBracket ENE T_RightCurlyBracket
ENE → E | E T_Semicolon ENE
E → T_Integer | BLOCK | T_Skip
Note that you are not asked to write a parser for the full language given here, only for the fragment above. You are free to write the parser in any form you want, whether ‘by hand’ or with a parser generator like CUPS.
The parser is to have the following signature.
class SyntaxException extends Exception {
public String msg;
public SyntaxException ( String _msg ) { msg = _msg; } }
class Task2Exception extends Exception {
public String msg;
public Task2Exception ( String _msg ) { msg = _msg; } }
interface Parser {
public Block parse ( List < Token > input ) throws SyntaxException,
Task2Exception; }
Here List<…> is the list class imported from java.util.List. The AST class Block is given here. You need to use those classes. The class SyntaxException should be thrown whenever input is encountered that does not adhere to the syntactic specification. Do not use this exception to indicate any other form of error — use Task2Exception for any other error. Errors should be reported by exception only.
You will also have to implement a method create that, when called, returns an instance of the interface Parser. For this please use the following fragment, replacing … with the appropriate code.
class Task2 {
public static Parser create () { … } }
For your convenience, here are the remaining definitions in one file.