CS/CE 4337 Project Fall 2021
Project (Part 1)
This project will explore the topics learned in class, and give you some simple hands-
on experience. After completing all three parts, you should have a working interpreted
language that is unique to you. You are encouraged to have fun and go above and beyond
if you have the time.
Part 1 gives you a simple imperative language. Study the syntax and semantics pre-
sented in this document. Be sure to understand the language defined. You will then need
to make a small, but substantial change, and create the lexer for the modified language.
You will be making other changes to your language as the semester goes forward. So, take
that into consideration when writing your code.
All code should be written in either C,C++,Java, or Python.
1 What to do
• Study the syntax and semantics
• Modify the language:
The modification should be significant, requiring a change in both syntax and seman-
tics. You can add a new feature (like a for loop) or modify the way a feature works
(Like making user input more advanced).
Examples:
– Add a for loop
– Change get to accept a list of IDs and get multiple integers
– Add a simple printf
– add simple array
• Construct a state diagram for the lexer. The tokens and lexemes may be affected by
your changes above.
• Implement the lexer in your chosen language. You should include a test driver pro-
gram that runs the lexer and prints all tokens to the screen – like the example from
class.
2 What to turn in
Upload your submission as a zip archive containing the following:
• Source code (c, c++, python, or java files)
– Source code should not require a particular IDE to compile and run.
– Should work on the cs1 and cs2 machines
• Readme (Plain text document)
Organization of Programming Languages Page 1
CS/CE 4337 Project Fall 2021
– List the files included in the archive and their purpose
– Explain how to compile and run your project
– Include any other notes that the TA may need
• Write-up (Microsoft Word or pdf format)
– The state transition diagram
– In addition, if you did not complete some feature of the project, why not?
∗ What unsolvable problems did you encounter?
∗ How did you try to solve the problems?
∗ Where do you think the solution might lay?
· What would you do to try and solve the problem if you had more time?
3 Grading
The grade for this project will be out of 100, and broken down as follows:
The change to the language 25
The state transition diagram 15
The lexer code 50
Correct Output 10
If you were not able to complete some part of the program discussing the problem and
potential solutions in the write-up will reduce the points deducted for it. For example,
suppose there is a bug in your code that sometimes allows two customers to approach the
same worker, and could not figure out the problem before the due date. You can write 2-3
paragraphs in the write-up to discuss this issue. Identify the error and discuss what you
have done to try to fix it/find the problem point, and discuss how you would proceed if
you had more time. Overall, inform me and the TA that you know the problem exists and
you seriously spend time trying to fix the problem. Normally you may lose 5 points (since
it is a rare error) but with the write-up you only lose 2. These points can make a large
difference if the problem is affecting a larger portion of the program.
4 The Language
4.1 Syntax
|
| (5)
|
|
Organization of Programming Languages Page 2
CS/CE 4337 Project Fall 2021
|
|
Ñ “get” ID(12)
| “and”
| “or”
| “+”
| “-”
| “*”
| “/”
| “%”
| “>”
| “>=”
| “<”
| “<=”
| “==”
| “!=”
| “not”
| “-”
| ID(40)
| INT(41)
4.1.1 Tokens
This subsection describes the token used in the above grammar. Provided for each token is
a regex and a description. The regex is for those that know regular expressions and prefer
it as a description. The description says the same thing in English. Preprocessing describes
Organization of Programming Languages Page 3
CS/CE 4337 Project Fall 2021
how the lexeme is transformed before passing it to the parser.
STRING
As a regex: “([ˆ”]|\”)*”
Description: A quotation mark followed by zero or more characters, where quo-
tation marks must be preceded by a backslash, followed by another quotation
mark.
Preprocessing: The first and last quotation marks are removed. Scanning from left
to right, “\\” is replaced with “\”, “\t” is replaced with a tab, “\n” is replaced
with a newline, “\”” is replaced with “””, and any “\” that is followed by anything
else is removed.
ID
As regex: [_a-zA-Z][_a-zA-Z0-9]*
Description: A letter or underscore followed by a combination of zero or more let-
ters, underscores or digits.
INT
As Regex: (+|-)?[0-9]+
Description: an optional “+” or “-” followed by one or more digits.
4.2 Static Semantics
1
3
4
5
6
7
8
9
11
12 .id “ ID .id
13
Organization of Programming Languages Page 4
CS/CE 4337 Project Fall 2021
16
18
19
20
22
23
24
26
27
28
29
31
32
33
34
35
36
37
38
39
40 Predicate: ID .id P
Organization of Programming Languages Page 5
CS/CE 4337 Project Fall 2021
4.3 Dynamic Semantics
This section gives the dynamic semantics of the language using denotational semantics.
Consider the demsem function the denotational semantics for this language. We will use
a mapping from variable name to value to represent the symbol table of the program
during execution, and in code can be represented as a HashMap or similar datatype in
your language of choice. We will use a sequence of characters to represent the output of a
program, with � representing the empty sequence. I will also assume that all strings will be
represented as sequences of characters. Assume there is a function append that, when given
two sequences, appends the second sequence to the first. Also assume, there is a function
seq that takes an integer and gives a sequence of characters representing that integer as
text. Assume there are the functions head, which maps a sequence to its first element,
tail, which maps a sequence to a new one created by removing the first element, clean,
which maps a sequence of input characters to a new sequence by removing any non-digits
from the front of the sequence, and int that maps a sequence of digits to the corresponding
integer. If the sequence is empty, int will give zero. A state, as well as the meaning of
a program, will be a 3-tuple consisting of a variable name mapping function, a sequence
of input characters and an output sequence. The initial state for any program is ptu, i, �q,
where i is some sequence of characters the user will input. If a token (represented by all
caps and bold font) appears as a value on the right hand side of a function definition, then
replace it with its lexeme. So if a ID was generated by the lexer from an x, then replace
ID with x.
densemp�, pθ, i, pqq “ pθ, i, pq
densemp
densemp “print” STRING , pθ, i, pqq “ pθ, i, appendpp, STRING qq
densemp “print”
where out “ exprsemp
densemp “get” ID , pθ, i, pqq “ pθ1, i1, pq
where
px, i1q “ getIntpcleanpiqq
θ1pnq “ if n “ ID then x else θpnq
densemp ID “=”
where
θ1pnq “ if n “ ID then exprsemp
densemp
then densemp
else densemp
densemp
then pθ, i, pq
else densemp
densemp
Organization of Programming Languages Page 6
CS/CE 4337 Project Fall 2021
exprsemp
then exprsemp
else bexprsemp
exprsemp
exprsemp
then exprsemp
else texprsemp
exprsemp
exprsemp
then exprsemp
else fexprsemp
exprsemp
exprsemp
then exprsemp
else vexprsemp
exprsemp
exprsemp “(”
exprsemp “not”
exprsemp “-”
exprsemp ID , θq “ θp ID q
exprsemp INT , θq “ INT
bexprsemp “and”
bexprsemp “or”
texprsemp “+”
texprsemp “-”
fexprsemp “*”
fexprsemp “/”
v
exprsemp
fexprsemp “%”
vexprsemp “>”
vexprsemp “>=”
vexprsemp “<”
vexprsemp “<=”
vexprsemp “==”
vexprsemp “!=”
getIntpiq “ pintpxq, i1q
where px, i1q “ getIntSeqp�, iq
getIntSeqpi1, i2q “ if digitpheadpi2qq
then getIntSeqpappendpi1, headpi2qq, tailpi2qq
Organization of Programming Languages Page 7
CS/CE 4337 Project Fall 2021
else pi1, i2q
4.4 Some Example Programs
This section contains a few example programs with output. You can use these to check
your understanding of the syntax and semantics. In the output, italics are used to indicate
the input given by the user. The “ê” symbol is used to show where a newline was printed
Code Output
print “Hello World!”; Hello World
Code Output
get x;
get y;
get z;
if (x > y and y) – z
then
print “\t It is true!\n”;
else
print “\t It is false!!\n”;
end;
3
2
1
It is false!!ê
Code Output
print “Welcome to my program.\n”;
print “Enter a number: “;
get x;
if x > 0
then
x=x*2;
else
x=-x;
end;
print x;
Welcome to my program.ê
Enter a number: 3
6
Organization of Programming Languages Page 8
CS/CE 4337 Project Fall 2021
Code Output
print “How many numbers will you input? “;
get num;
i=0;
sum=0;
while i < num
do
get _x;
sum = sum + _x;
end;
q = sum / num;
r = sum % num;
print "The average is ";
print q;
print " ";
print r;
print "/";
print num;
print "\n"
How many numbers will you input? 3
2
4
5
The average is 3 2/3ê
Organization of Programming Languages Page 9