CS 415: Compilers
Project 2: A Pascal Subset Compiler Front-End
Due date: Thursday, April 17
Announcement
- Due date changed from April 16 to April 17.
- In “for statements”, lower and upper loop bounds changed from “constant” to “integer_constant”.
- Project 2 Available (03/31/2014)
Pascal Subset
Using flex and bison, you are to write a parser and code generator for a subset of Pascal. Programs consist of single blocks. There are no procedure declarations. Base types are limited to integer and boolean. Composit types are limited to single dimensional arrays of base types, indexed by integers. Only the following statements are included: while-do, for (single assignment in body), if-then, if-then-else, assignment, write, and compound. Operators are restricted to arithmetic, logical, and relational. The grammar that we are using for this language is as follows:
start | ::= | program ID ; block . |
block | ::= | variables cmpdstmt |
variables | ::= | var vardcls | empty string |
vardcls | ::= | vardcls vardcl ; | vardcl ; |
vardcl | ::= | IDlist : type |
IDlist | ::= | IDlist , ID | ID |
type | ::= | array [ integer_constant .. integer_constant ] of stype |
| | stype | |
stype | ::= | integer | boolean |
stmtlist | ::= | stmtlist ; stmt | stmt |
stmt | ::= | ifstmt | astmt | wstmt | fstmt | cmpdstmt | writestmt |
wstmt | ::= | while condexp do stmt |
fstmt | ::= | for ID := integer_constant , integer_constant do stmt |
ifstmt | ::= | ifhead then stmt else stmt | ifhead then stmt |
ifhead | ::= | if condexp |
cmpdstmt | ::= | begin stmtlist end |
writestmt | ::= | writeln ( exp ) |
astmt | ::= | lvalue := exp |
exp | ::= | rvalue | exp + exp | exp – exp | exp * exp |
| | exp and exp | exp or exp | exp exor exp | not exp | |
| | ( exp ) | constant | |
condexp | ::= | exp != exp | exp == exp | exp < exp | exp <= exp | ID | boolean_constant |
lvalue | ::= | ID | ID [ exp ] |
rvalue | ::= | ID | ID [ exp ] |
constant | ::= | integer_constant | boolean_constant |
integer_constant | ::= | ICONST |
boolean_constant | ::= | true | false |
Front-End: Project Description
As part of this project, you will
- Augment the scanner to recognize the key words (tokens) while , do , for , and boolean ,
- Write the parser and detect and report at least the following syntax errors by inserting appropriate error productions :
- “\n***Error: illegal variable declaration\n”
- “\n***Error: illegal statement\n”
- “\n***Error: illegal expression\n”
- “\n***Error: illegal conditional expression\n”
- for statement
“\n***Error: lower bound not integer constant\n”
“\n***Error: upper bound not integer constant\n”
Note: Reporting the same error occurrence multiple times is okay. The grammar given above is ambiguous. However, we used features ofbison to deal with operator precedence and associativity to resolve the ambiguity. - Augment your parser to perform semantic analysis, in particular type checking. This will include the implementation a single-level symbol table. Your type checker has to detect and report at least the following semantic errors:
- Declarations
“\n***Error: duplicate declaration of %s\n” - Array declaration
\n***Error: lower bound exceeds upper bound\n” - If stmt
“\n***Error: exp in if stmt must be boolean\n” - While stmt:
“\n***Error: exp in while stmt must be boolean\n” - Writeln stmt (Note: accepts only integer values)
“\n***Error: illegal type for write\n” - Assignment stmt
“\n***Error: assignment types do not match\n”
“\n***Error: assignment to whole array\n” - Array subscript operation id[exp] in exp
“\n***Error: subscript exp not type integer\n”
“\n***Error: id %s is not an array\n” - for statement
“\n***Error: lower bound exceeds upper bound\n”
“\n***Error: induction variable not scalar integer variable\n” - Identifier
“\n***Error: undeclared identifier %s\n” - Arithmetic, logical, and relational operators in exp
“\n***Error: types of operands for operation %s do not match\n”;
- Declarations
Specific Details
Names are case sensitive .
Efficiency is not the main concern of this project. You may a simple unordered list to implement your symbol table. If you design your symbol table interface well, it should be simple to replace the underlying table implementation with a hashing scheme later.
Logical operators (and, or, exor, not) require boolean arguments. The arithmetic and relational operators (e.g., +) require integer arguments. Integer and boolean arguments cannot be mixed. The equality/inequality operators (==, !=) apply to any type, as long as the two argument types match.
Getting Started
The following code is provided as a starting point for your project. The files listed below live on the ilab cluster in subdirectory~zz124/cs415_2014/projects/proj2 . Please copy the files into your subdirectory of choice using the command
cp -r ~zz124/cs415_2014/projects/proj2 myProjectSubdirectory . You may modify any of these files for your project.
- declaration of types and attribute operations: attr.h and attr.c
There is nothing useful yet in these files. - declaration of symbol table and symbol table operations: symtab.h and symtab.c
There is nothing useful yet in these files. - Scanner: scan.l (flex)
- Parser: parse.y (bison)
- Makefile
- demo-simple, a program that can be successfully parsed with the partial parser that can be generated from the provided code.
- ReadMe files contains a few reminders
In order to test your compiler, we provide ten test cases in the “testcases” subfolder of proj2 . We will use more hidden test cases for grading purpose.
You can generate an executable called parser by typing make . The parser expects the input on stdin, i.e., you can call the parser on an input program as follows: ./parser < demo-simple .
Due Date
The project is due at 11:59 pm on April 17, 2014–that is, one minute before midnight.
If you have specific, overriding, personal reasons why these dates are unreasonable, you should discuss them directly with the instructor beforethe deadline.
Submission Procedure
Please submit a single tar-file with all your source files, including scan.l, attr.h, attr.c, symtab.h, symtab.c, parse.y, ReadMe, to Sakai. The submission window for project II is open. Your ReadMe file may contain comments that you want the grader to know about. Do a “make clean” before tar-ing your files. Do not submit your parser as an executable. We will use the following late policy : You will receive a penalty of 20% of your overall grade for each day late. A day is a working day (Monday through Friday). It is your reponsibility to verify that your files have been submitted correctly .
Grading Criteria
The project will be mainly graded on functionality. You will receive no credit for the entire project if your code does not compile and run on theilab cluster . We may use automatic testing tools, so make sure that your error messages are exactly as described above.