程序代写 CSE340 Fall 2022 Project 2

CSE340 Fall 2022 Project 2
Due: October 28, 2022 by 11:59pm MST on GradeScope
Weeks of programmings can save you hours of planning – Unknown1.
A shortcut is the longest distance between two points. – (Economist)

Copyright By PowCoder代写 加微信 powcoder

I had a running compiler and nobody would touch it. They told me computers could only do arithmetic. – Hopper (Inventor of the first compiler)
1 General Advice
You should read the description carefully. Multiple readings are recommended. You will not be able to understand everything on a first reading. Give yourself time by starting early and taking breaks between readings. You will digest the requirements better.
• The answers to many of your questions can be found in this document.
• Do not start coding until you have a complete understanding of the requirements and a clear plan for the implementation.
• Ask for help early. I said it before and I say it again: I and the TAs can save you a lot of time if you ask for help early. You can get help with how to approach the project to make the solution easier and have an easier time implementing it. When you ask for help, you should be prepared and you should have done your part.
• Have fun!
2 Overview
The goal of this project is to introduce you to code generation for a simple straight line programs (no branching). You will write a C++ program that reads an input which is a program (sequence of assignments) written in a small programming language defined for this project. You program will read the sequence of assignments and your generate an intermediate code. The specific intermediate representation that your program will generate and what it does with that intermediate representation depends on a command-line argument that is passed to your program. We provide you with code to read the command line argument into an integer variable. Depending on the value of the variable, your program will invoke the appropriate functionality. The following are the three options (total 180 points and there are 50 points bonus , so if you get 130, that counts as 100% on the project
(which is 13% of the final course grade); anything above 130 is bonus):
1. Task 1 (100 points): Parse the input and print an abstract syntax tree.
2. Task 2 (50 points): Parse the input and do type checking
3. Task 3 (30 ): Generate an executable representation of the program. The representation will be break down large expressions into smaller expressions that are linked together in a linked list that is passed to the “execute” function that we provided (and which you must not modify). This part is much more involved than the first two parts to do completely. Last semester,
no one was able to get full credit on it even though two of your UGTAs got very close.
1I was made aware of this quote by a similar statement (UGTA) made on discord: “Remember, days of debugging can save you hours of planning”

Unlike the first project, this project requires you to write a parser using operator precedence parsing (precedence table provided). The rest of the document is organized as follows:
• Section 3 describes the input format
• Section 4 describes what the input represents and introduces the type system rules for the small language • Section 5 describes what the output of your program should be for each of the three tasks.
• Section 6 Gives examples of outputs for tasks 1–3.
• Section 7 discusses command line arguments and how you should run and test your program.
• Section 8 describes the grading scheme.
• Section 9 addresses some submission concerns.
Input Format
The following context-free grammar specifies the input format:
program decl-section scalar-decl-section array-decl-section id-list
assign-stmt output-stmt variable-access variable-access variable-access expr
→ decl-section block
→ scalar-decl-section array-decl-section
→ SCALAR id-list
→ ARRAY id-list
→ ID id-list
→ LBRACE stmt-list RBRACE
→ stmt stmt-list
→ assign-stmt
→ output-stmt
→ variable-access EQUAL expr SEMICOLON → OUTPUT variable-access SEMICOLON
→ ID LBRAC expr RBRAC
→ ID LBRAC DOT RBRAC
→ expr MINUS expr
→ expr PLUS expr
→ expr MULT expr
→ expr DIV expr
→ LPAREN expr RPAREN
→ expr LBRAC expr RBRAC
→ expr LBRAC DOT RBRAC
A program consists of a declaration section followed by a “block” which contains the statements to be executed. The declaration section consists of a scalar declaration section for declaring scalar variables followed by an array declaration section for declaring array variables. To keep things simple, there is no size specified for arrays (this is discussed further in the Semantics section). The “block” consists of a sequence of statements. We only have two kinds of statements: assignment statements and output statements. An assignment statement has a lefthand side which is a variable-access (representing an assignable variable) followed by an EQUAL followed by an expression followed by a semicolon. In this simple programming language, operations on whole arrays as well as assignments to whole arrays operations are supported, so a variable access can be a simple identifier or an element of an array or a whole array. The meaning of the various parts of the input is explained in Section 4 – Semantics.

The tokens used in the above grammar description are defined by the following regular expressions (dot operator omitted in the definitions):
= (S)(C)(A)(L)(A)(R)
= (A)(R)(R)(A)(Y)
= (O)(U)(T)(P)(U)(T)
= letter (letter | digit)* =0| pdigit digit*
SEMICOLON = ‘;’
Where digit is the digits from 0 through 9 , pdigit is the digits from 1 through 9 and letter is the upper and lower case letters a through z and A through Z. Tokens are case-sensitive. Tokens are space separated and there is at least one whitespace character between any two successive tokens. We provide a lexer with a getToken() function to recognize these tokens. You should use the provided lexer in you solution.
4 Semantics
A program consists of a declaration section, which introduces the variables of the program, followed by a block which contains a sequence of statements to be executed. I describe each in what follows.
4.1 Declaration Section
The declaration section lists all the variables used in the program. There are two kinds of variables: scalar variables and array variables.
Scalar Variable Variable names that appear in the scalar-decl-section subtree of the program parse tree are scalar variables. For our purposes, you can assume that a scalar variable can be represented as a long integer in the implementation.
Array Variables Variable names that appear in the array-decl-section subtree of the program parse tree are array variables. Array variables represent arrays of integers. To keep things simple, we do not specify the size of these arrays. You can assume that each array has size 10. Same as for scalar variables, we represent each entry in the array as a long integer in the implementation.
4.2 Locations and Values
Variables have memory locations associated with them. The memory location associated with a variable contains the value of the variable. The value of scalar variables is a scalar integer value. The value of array variables is a an array of 10 scalar integer values.
4.3 Type System
The type system specifies the rules under which assignments are valid (type compatibility rules) and the type of expressions given the types of their constituent parts (type inference rules).

The type inference rules for expressions are the following:
1. If x is a lexeme that appears in the id-list of the scalar-decl-section, then the expression x has type scalar.
2. The type of an expression consisting of a single NUM is scalar.
3. If x appears in the id-list of the array-decl-section, then the expression x[.] has type array.
4. If x appears in the id-list of the array-decl-section, and expr has type scalar, then the expression x[expr] has type scalar.
5. If expr1 and expr2 have type scalar, then expr1 OP expr2 (where OP is PLUS, MINUS, MULT or DIV ) has type scalar.
6. If expr1 and expr2 have type array, then expr1 OP expr2 (where OP is PLUS or MINUS) has type array.
7. If expr1 and expr2 have type array, then expr1 MULT expr2 has type scalar.
8. If expr1 has type array and expr2 has type scalar, then expr1[expr2] has type scalar.
9. If expr1 has type scalar, then expr1[.] has type array.
10. If expr2 has type other than scalar, then expr1[expr2] has type error.
11. If expr1 has type scalar or error, then expr1[expr2] has type error.
12. If expr1 and expr2 have different types, then expr1 OP expr2 (where OP is PLUS ,MINUS MULT or DIV ) has type error.
13. If expr1 and expr2 have type array, then expr1 DIV expr2 has type error.
14. If x is a lexeme that does not appear in the id-list of the scalar-decl-section or the id-list of the array-decl-section, then the expression x has type error.
15. If none of the above conditions hold, the expression has type error.
The type inference rules for variable-access are the following:
1. If x is a lexeme that appears in the id-list of the scalar-decl-section, then the variable access x has type scalar.
2. If x is a lexeme that appears in the id-list of the array-decl-section and expr has type scalar, then x[expr] has type scalar.
3. If x is a lexeme that appears in the id-list of the array-decl-section, then x[.] has type array.
4. If x is a lexeme that does not appear in the id-list of the scalar-decl-section then the variable access x has type error.
5. If x is a lexeme that does not appear in the id-list of the of the array-decl-section, then the variable access x[.] has type error.
6. If x is a lexeme that does not appear in the id-list of the array-decl-section, then the variable access x[expr] has type error.
The following is the only type compatibility rule for assignments:
1. An assignment of the form variable-access = expr is valid if the type of the variable-access is array or the type of expr is scalar.
Note that this compatibility rule allows assigning an array to an array, a scalar to a scalar as well as a scalar to an array. The meaning of assigning a scalar to an array is given in the next section.
4.4 Semantics of Expressions and Assignments
Note: You can omit this subsection on a first reading as it relates to Task 3 which you can start on after finishing with Tasks 1 and 2.
In this section, we define how expressions are evaluated and how assignments are executed. The description is independent of the specific code that your program will generate to evaluate expressions and execute assignments. The description in this section does not refer to the location of variables.

4.4.1 Variables, expressions, and values
We say that a variable x is a scalar variable if x appears in the id-list of the scalar-decl-section of the program. We say that a variable x is an array variable if x appears in the id-list of the array-decl-section of the program. Each variable has a value associated with it. We denote by S the set of scalar variables in the program and by A the set of array variables in the program. The sets S and A are disjoint and you can assume so in your implementation.
As the program executes, the values of variables can be changed by the statements being executed. The state of a program consists of: (1) the values of the variables of the program, (2) the program code, which, for our language, is a list statements to be executed, and (3) the next statement to be executed (in assembly, this is specified by the program counter). In this section, we are describing the execution at the statement level. In the implementation guide, we describe how a complex statement can be broken down into a sequence of statements. The description that follows will be a mix of formal and informal specification of the semantics of program execution.
The value of a scalar variable is an integer (long in the implementation). The value of an array variable is a vector of 10 integer values (array of long in the implementation). The content of the memory is specified by specifying the values of all the variables. More formally, we define the memory of the system to be a mapping that maps variables to values. M : S ∪ A 􏰀→ integer ∪ integer10, where integer10 is the cartesian product integer × integer × . . . × integer (10 times). If x ∈ S, M(x) ∈ integer. If x ∈ A, M(x) ∈ integer10. The mapping M changes as assignment statements are executed but is not affected by output statements. In what follows, we refer to the effect of the execution of assignment statements by specifying how the values of variables change or stay the same after the execution of the assignment, but we do not explicitly give a formal definition of the new mapping or of a transition function.
The initial values of all variables are zero. Initially, M(x) = 0 for every x ∈ S and M(x) = 010 for every x ∈ A (010 is a vector of 10 values, all of which are equal to 0).
4.4.2 Evaluating expressions
The value of an expression is defined recursively as follows: • base case
1. The value of the expression x, where x is a scalar or array variable, is equal to the value of x
2. The value of the expression n, where n is NUM, is equal to the integer value corresponding to n (i.e. corresponding to
the lexeme of n).
• induction
1. Let x and y be two expressions that have type scalar and values vx and vy, respectively,
– The value of the expression x + y is vx + vy , where the addition is integer addition (long in the implementations). – The value of the expression x − y is vx − vy , where the subtraction is integer subtraction (long in the implementa-
– The value of the expression x ∗ y is vx ∗ vy, where the multiplication is integer multiplication (long in the
implementations).
– The value of the expression x/y is vx/vy, where the division is integer division (long in the implementations).
2. Let a be an expression that has type array and value va and x be an expression that has scalar type and value vx. – Thevalueofa[x]isequaltova[vx]if0≤vx ≤9.
– Thevalueofa[x]isundefinedifvx <0or9 > < < < > < err > – > > < < < > < err > ∗ > > > > < > < err > / > > > > < > < err > ( <<<<<=>>>err>>err> [ < < < < < < < = = . errerrerrerrerrerrerrerr= ] >>>>err>>err>
num>>>>err>>err> id >>>>err>>err>
$ < < < < < err < err err < Table 1: Operator Precedence Parser Table err err > < < err err err err > err > err >
SNYATX EORRR !!!
Figure 1: Abstract syntax tree for a = (b+c)*d[i];
For parsing, you need to combine predictive recursive descent parsing with operator precedence parsing. The operator precedence parser is invoked for parsing expr. The operator precedence parsing table is given in Table 1.
Next, I define what an abstract syntax tree is and give examples input programs with corresponding outputs.
5.1.1 Abstract Syntax Trees
An abstract syntax tree is a tree that captures the essential syntax of the input. For example, a program block has curly braces around a stmt-list. The curly braces are part of the block’s concrete syntax but are not really needed to understand the structure of a block when that structure is represented as a tree.
The abstract syntax tree for the assignment statement a = (b+c)*d[i]; is shown in Figure 1.
Notice how the parentheses a

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com