编译器代写: CSC 348: Introduction to Compiler Design

CSC 348: Introduction to Compiler Design: 2018 Winter
Final Project

Last modified 2018 March 12

Purpose:

To go over everything, esp. the inner-workings of a compiler.

Assignment:

Please implement a subset of Pascal (tinyPascal) in a simplified virtual machine known as spvm.
I am giving you the:

  • the tokenizer
  • the grammar
  • the symbol table

You have to fill-in two C++ methods (compile() and writeAssembly()) of class Node and its derived classes.

Our tinyPascal only has types:

  • boolean
  • integer
  • real (what C calls float)
  • string

It does not have:

  • procedures or functions
  • pointers or programmer-defined datatypes

spvm Assembly Language

  • Each spvm machine word can hold an integer, boolean, string or float. (It can hold other things too, let us not worry about all that.)
  • Unlike with a CPU, constants are not in instructions. Instead, there is a constant pool that stores all the constants refered to by the source code.
  • Unlike with a CPU, are no registers. All values are on the stack as offsets from the stack pointer. (Just call them variables.)
  • Variables will be assigned their stack offsets by the assembler and will known by their offsets. For example, the first four are @thisSubj@thisJust@retSubj and @retJust. All variables will automatically be given offsets on the stack, therefore you do not have to worry about. The constant and variable declaration section at the beginning of every program will tell the stack offsets of all variables. In the assembly language variables are prefaced by VAR_PREFIX_CHAR (= @). You do not have to worry about adding it, the output operator does that for you already.
  • Like most assembly languages, there are labels to define where gotos actually go. Labels are postfixed by the %. When they label code they must be preceded with colons (LABEL_POSTFIX_CHAR). However, do not include the colon in the gotoPoin instructions.
      gotoPoin jumpOver% 	# No ":" in goto
      noOpPoin		# This is not done
    jumpOver% :		# A ":" in used label
      noOpPoin		# This *is* done
    
  • Programs have layout:
    $beginPreAsm
      %beginVarDecl
        ; Variables and named constants go here
      %endVarDecl
    
      %beginCode
        ; Assembly language code goes here
      %endCode
    $endPreAsm
    

    where all you have to do is print the Assembly Language$beginPreAsm$endPreAsm%beginCode%endCode, and the whole variable declaration section will be printed automatically.For example, this loop:

    PROGRAM eight;
    
    VAR i,sum : Integer;
    
    BEGIN
      sum := 0;
      FOR  i := 1 TO 10 DO
        sum := sum + i;
    
      WRITELN(sum)
    END.
    

    Could be translated as:

    $beginPreAsm
     %beginVarDecl
      %var	@compilerVar1,	Integer,	0
      %var	@compilerVar2,	Integer,	0
      %var	@compilerVar3,	Integer,	0
      %var	@compilerVar4,	Integer,	0
      %var	@i,	Integer,	0
      %var	@sum,	Integer,	0
     %endVarDecl
     %beginCode
      loadPoin @compilerVar1,0
      copyPoin @sum,@compilerVar1
      loadPoin @compilerVar2,1
      copyPoin @i,@compilerVar2
    label0%:
      loadPoin @compilerVar3,10
      greaterPoin @compilerVar4,@i,@compilerVar3
      ifTrueGotoPoin @compilerVar4,label1%
      addPoin @sum,@i
      copyPoin @sum,@sum
      addPoin @i, 1
      gotoPoin label0%
    label1%:
      stdOutPrintLnPoin @sum
     %endCode
    $endPreAsm
    
  • You will have the following machine instructions:
    noOp Does nothing
    copyPoin @destVar, @srcVar Does @destVar = @srcVar
    loadPoin @destVar, i (i is an integer constant) Does @destVar = i
    loadPoin @destVar, r (r is a real constant) Does @destVar = r
    loadPoin @destVar, "s" ("s" is a string constant) Does @destVar = "s"
    addPoin @destVar, @srcVar Computes @destVar += @srcVar
    (Important: @srcVar and @destVar must either both hold integers or both hold reals.)
    subPoin @destVar, @srcVar Computes @destVar -= @srcVar
    (Important: @srcVar and @destVar must either both hold integers or both hold reals.)
    mulPoin @destVar, @srcVar Computes @destVar *= @srcVar
    (Important: @srcVar and @destVar must either both hold integers or both hold reals.)
    divPoin @destVar, @srcVar Computes @destVar /= @srcVar
    (Important: @srcVar and @destVar must either both hold integers or both hold reals.)
    intToRealPoin @destVar, @srcVar Puts the floating point representation of integer @srcVar in @destVar
    lesserPoin @destVar, @lhs, @rhs Computes @destVar = (@lhs < @rhs)
    (Important: @srcVar and @destVar must hold same type.)
    lesserEqualPoin @destVar, @lhs, @rhs Computes @destVar = (@lhs <= @rhs)
    (Important: @srcVar and @destVar must hold same type.)
    greaterPoin @destVar, @lhs, @rhs Computes @destVar = (@lhs > @rhs)
    (Important: @srcVar and @destVar must hold same type.)
    greaterEqualPoin @destVar, @lhs, @rhs Computes @destVar = (@lhs >= @rhs)
    (Important: @srcVar and @destVar must hold same type.)
    intEqualPoin @destVar, @lhs, @rhs Computes @destVar = (@lhs == @rhs)
    (Important: @srcVar and @destVar must both be integers.)
    intNotEqualPoin @destVar, @lhs, @rhs Computes @destVar = (@lhs != @rhs)
    (Important: @srcVar and @destVar must both be integers.)
    gotoPoin label% Jumps to label%
    ifTrueGotoPoin @var,label% Jumps to label% only when @var == true
    ifFalseGotoPoin @var,label% Jumps to label% only when @var == false
    stdOutPrintPoin @var Prints the value @var with no ending newline
    stdOutPrintLnPoin @var Prints the value @var with an ending newline

tinyPascal

tinyPascal only has 4 types: booleanintegerreal and string. It supports the following Pascal operators and functionality:

Declaration of integers, booleans and reals
Binary integer math ops (+ - * DIV)
Binary real math ops (+ - * /)
Comparison (< > <= >= = < >)
Assignment (:=)
if then and if then else
Grouping statements with begin and end

Real Pascal is case-insensitive. tinyPascal recognizes 3 “cases”:

  • lowercase
  • UPPERCASE
  • Capitalized

What you must do:

  1. Download 20178-2Win_CSC348_finalProj.zip
  2. Write the methods compile() and writeAssembly() for all classes except Node (where it is abstract), BooleanConstantNode (I gave to you so you have a basic example),IntegerConstantNode (I gave to you so you have a basic example), and StatementListNode (which uses C++ iterators, which have a funky syntax, so I just gave it to you).All of the work you have to do is in Node.hexcept VariableNode::compile() which is in tinyPascal.y.

    All classes all the member vars they need, and their constructors already initialize them.

  3. To make the program, type:
    make
    

A Word About C++ in general (and this program in particular)

  • Both compile() and writeAssembly() are recursive, tree-walking methods.
  • C++ has both pointers and references, so methods can “return” more than one thing. For example, check out the compile() method of class BooleanConstantNode (aka BooleanConstantNode::compile() in C++ parlance):
    VariableMention*
        compile
    	(simpleType_ty&	simpleType
    	)
    	{
    	  varPtr_	= new VariableMention(BOOLEAN_ST);
    	  simpleType	= BOOLEAN_ST;
    	  return(varPtr_);
    	}
    

    Obviously it returns the address of a new instance of VariableMention (the value in member variable varPtr_).Additionally, it sets the argument simpleType to value BOOLEAN_ST. So, if you call it like:

    simpleType_ty		whatsMyReturnType;
    VariableMention*	varPtr;
    
    varPtr	= nodePtr->compile(whatsMyReturnType);
    

    then not only will varPtr be set to the address of the VariableMention instance that will receives the value, but also whatsMyReturnType will be set to BOOLEAN_ST. Sometimes it will be necessary to check this return type, to ensure it agrees with what you expect (e.g. if-statements require boolean conditions)(For the curious, just like the type VariableMention* signifies “a pointer to a VariableMention instance“, so the type simpleType_ty& signifies “a reference to a simpleType_tyinstance“. The argument that you send to it must be a simpleType_ty variable, and that variable can have its value changed.)

Examples:

PROGRAM	one;

VAR	i : integer;

BEGIN
 i := 1
END.
$beginPreAsm
 %beginVarDecl
  %var	@compilerVar1,	Integer,	0
  %var	@i,	Integer,	0
 %endVarDecl
 %beginCode
  loadPoin @compilerVar1,1
  copyPoin @i,@compilerVar1
 %endCode
$endPreAsm
PROGRAM	two;

VAR	i : integer;

BEGIN
  i := 2 * 3;
  WRITE(i);
END.
$beginPreAsm
 %beginVarDecl
  %var	@compilerVar1,	Integer,	0
  %var	@compilerVar2,	Integer,	0
  %var	@i,	Integer,	0
 %endVarDecl
 %beginCode
  loadPoin @compilerVar1,2
  loadPoin @compilerVar2,3
  mulPoin @compilerVar1,@compilerVar2
  copyPoin @i,@compilerVar1
  stdOutPrint @i
 %endCode
$endPreAsm
PROGRAM	three;

VAR	b : BOOLEAN;

BEGIN
 b := true;
END.
$beginPreAsm
 %beginVarDecl
  %var	@b,	Boolean,	false
  %var	@compilerVar1,	Boolean,	false
 %endVarDecl
 %beginCode
  loadPoin @compilerVar1,true
  copyPoin @b,@compilerVar1
 %endCode
$endPreAsm
PROGRAM	five;

VAR	i : Integer;
	b : Boolean;

BEGIN
  i := 2;
  b := (i = i);
  WRITELN(b)
END.
$beginPreAsm
 %beginVarDecl
  %var	@b,	Boolean,	false
  %var	@compilerVar1,	Integer,	0
  %var	@compilerVar2,	Boolean,	false
  %var	@i,	Integer,	0
 %endVarDecl
 %beginCode
  loadPoin @compilerVar1,2
  copyPoin @i,@compilerVar1
  intEqualPoin @compilerVar2,@i,@i
  copyPoin @b,@compilerVar2
  stdOutPrintLn @b
 %endCode
$endPreAsm
PROGRAM six;

VAR	i,j: Integer;
	result : Boolean;

BEGIN
  i := 1;
  j := 2;
  IF  i = j THEN
    result := true
END.
$beginPreAsm
 %beginVarDecl
  %var	@compilerVar1,	Integer,	0
  %var	@compilerVar2,	Integer,	0
  %var	@compilerVar3,	Boolean,	false
  %var	@compilerVar4,	Boolean,	false
  %var	@i,	Integer,	0
  %var	@j,	Integer,	0
  %var	@result,	Boolean,	false
 %endVarDecl
 %beginCode
  loadPoin @compilerVar1,1
  copyPoin @i,@compilerVar1
  loadPoin @compilerVar2,2
  copyPoin @j,@compilerVar2
  intEqualPoin @compilerVar3,@i,@j
  ifFalseGotoPoin @compilerVar3,label0%
  loadPoin @compilerVar4,true
  copyPoin @result,@compilerVar4
  gotoPoin label0%
label0%:
 %endCode
$endPreAsm
PROGRAM seven;

VAR	i,j: Integer;
	result : Boolean;

BEGIN
  i := 1;
  j := 2;
  IF  i = j THEN
    result := true
  ELSE
    result := false
END.
$beginPreAsm
 %beginVarDecl
  %var	@compilerVar1,	Integer,	0
  %var	@compilerVar2,	Integer,	0
  %var	@compilerVar3,	Boolean,	false
  %var	@compilerVar4,	Boolean,	false
  %var	@compilerVar5,	Boolean,	false
  %var	@i,	Integer,	0
  %var	@j,	Integer,	0
  %var	@result,	Boolean,	false
 %endVarDecl
 %beginCode
  loadPoin @compilerVar1,1
  copyPoin @i,@compilerVar1
  loadPoin @compilerVar2,2
  copyPoin @j,@compilerVar2
  intEqualPoin @compilerVar3,@i,@j
  ifFalseGotoPoin @compilerVar3,label0%
  loadPoin @compilerVar4,true
  copyPoin @result,@compilerVar4
  gotoPoin label1%
label0%:
  loadPoin @compilerVar5,false
  copyPoin @result,@compilerVar5
label1%:
 %endCode
$endPreAsm
PROGRAM eight;

VAR i,sum : Integer;

BEGIN
  sum := 0;
  FOR  i := 1 TO 10 DO
    sum := sum + i;

  WRITELN(sum)
END.
$beginPreAsm
 %beginVarDecl
  %var	@compilerVar1,	Integer,	0
  %var	@compilerVar2,	Integer,	0
  %var	@compilerVar3,	Integer,	0
  %var	@compilerVar4,	Integer,	0
  %var	@i,	Integer,	0
  %var	@sum,	Integer,	0
 %endVarDecl
 %beginCode
  loadPoin @compilerVar1,0
  copyPoin @sum,@compilerVar1
  loadPoin @compilerVar2,1
  copyPoin @i,@compilerVar2
label0%:
  loadPoin @compilerVar3,10
  greaterPoin @compilerVar4,@i,@compilerVar3
  ifTrueGotoPoin @compilerVar4,label1%
  addPoin @sum,@i
  copyPoin @sum,@sum
  add @i, 1
  gotoPoin label0%
label1%:
  stdOutPrintLn @sum
 %endCode
$endPreAsm
PROGRAM nine;

VAR i,sum : Integer;

BEGIN
  sum := 0;
  FOR  i := 10 DOWNTO 1 DO
    sum := sum + i;

  WRITELN(sum)
END.
$beginPreAsm
 %beginVarDecl
  %var	@compilerVar1,	Integer,	0
  %var	@compilerVar2,	Integer,	0
  %var	@compilerVar3,	Integer,	0
  %var	@compilerVar4,	Integer,	0
  %var	@i,	Integer,	0
  %var	@sum,	Integer,	0
 %endVarDecl
 %beginCode
  loadPoin @compilerVar1,0
  copyPoin @sum,@compilerVar1
  loadPoin @compilerVar2,10
  copyPoin @i,@compilerVar2
label0%:
  loadPoin @compilerVar3,1
  lesserPoin @compilerVar4,@i,@compilerVar3
  ifTrueGotoPoin @compilerVar4,label1%
  addPoin @sum,@i
  copyPoin @sum,@sum
  add @i, -1
  gotoPoin label0%
label1%:
  stdOutPrintLn @sum
 %endCode
$endPreAsm
PROGRAM hello;

BEGIN
  writeln('Hello world')
END.
$beginPreAsm
 %beginVarDecl
  %var	@compilerVar1,	String,	""
 %endVarDecl
 %beginCode
  loadPoin @compilerVar1,"Hello world"
  stdOutPrintLn @compilerVar1
 %endCode
$endPreAsm
PROGRAM	eleven;

VAR i : INTEGER;

BEGIN

FOR  i := 1 TO 10 DO
  WRITE(i);

END.
$beginPreAsm
 %beginVarDecl
  %var	@compilerVar1,	Integer,	0
  %var	@compilerVar2,	Integer,	0
  %var	@compilerVar3,	Integer,	0
  %var	@i,	Integer,	0
 %endVarDecl
 %beginCode
  loadPoin @compilerVar1,1
  copyPoin @i,@compilerVar1
label0%:
  loadPoin @compilerVar2,10
  greaterPoin @compilerVar3,@i,@compilerVar2
  ifTrueGotoPoin @compilerVar3,label1%
  stdOutPrint @i
  add @i, 1
  gotoPoin label0%
label1%:
 %endCode
$endPreAsm
PROGRAM twelve;

VAR	i : INTEGER;
	r : REAL;

BEGIN
  i	:= 2;
  r	:= 2.5;
  WRITELN(i * r)
END.
$beginPreAsm
 %beginVarDecl
  %var	@compilerVar1,	Integer,	0
  %var	@compilerVar2,	Real,	0.0
  %var	@compilerVar3,	Real,	0.0
  %var	@i,	Integer,	0
  %var	@r,	Real,	0.0
 %endVarDecl
 %beginCode
  loadPoin @compilerVar1,2
  copyPoin @i,@compilerVar1
  loadPoin @compilerVar2,2.5
  copyPoin @r,@compilerVar2
  intToReal @compilerVar3,@i
  mulPoin @compilerVar3,@r
  stdOutPrintLn @compilerVar3
 %endCode
$endPreAsm
PROGRAM thirteen;

VAR	i : INTEGER;
	r : REAL;

BEGIN
  i	:= 2;
  r	:= 2.5;
  WRITELN(r * i)
END.
$beginPreAsm
 %beginVarDecl
  %var	@compilerVar1,	Integer,	0
  %var	@compilerVar2,	Real,	0.0
  %var	@compilerVar3,	Real,	0.0
  %var	@i,	Integer,	0
  %var	@r,	Real,	0.0
 %endVarDecl
 %beginCode
  loadPoin @compilerVar1,2
  copyPoin @i,@compilerVar1
  loadPoin @compilerVar2,2.5
  copyPoin @r,@compilerVar2
  intToReal @compilerVar3,@i
  mulPoin @r,@compilerVar3
  stdOutPrintLn @r
 %endCode
$endPreAsm
PROGRAM fourteen;

VAR	r0 : REAL;
	r1 : REAL;

BEGIN
  r0	:= 2.5;
  r1	:= 2.5;
  WRITELN(r0 * r1)
END.
$beginPreAsm
 %beginVarDecl
  %var	@compilerVar1,	Real,	0.0
  %var	@compilerVar2,	Real,	0.0
  %var	@r0,	Real,	0.0
  %var	@r1,	Real,	0.0
 %endVarDecl
 %beginCode
  loadPoin @compilerVar1,2.5
  copyPoin @r0,@compilerVar1
  loadPoin @compilerVar2,2.5
  copyPoin @r1,@compilerVar2
  mulPoin @r0,@r1
  stdOutPrintLn @r0
 %endCode
$endPreAsm
PROGRAM	 bad1;

VAR	i : INTEGER;

BEGIN
  i := true
END.
Error: Variable expression type mismatch
PROGRAM	 bad2;

VAR	b : BOOLEAN;

BEGIN
  b := 0
END.
Error: Variable expression type mismatch
PROGRAM	bad3;

VAR	i : Integer;

BEGIN
  i := (1 + true);
END.
Error: Attempt to do mathematical operation on non mathematical operand
PROGRAM	bad4;

VAR	i : Integer;

BEGIN
  i := 'two' * 2
END.
Error: Attempt to do mathematical operation on non mathematical operand
PROGRAM	bad5;

VAR b : Boolean;

BEGIN

  IF 6 THEN
    b := true
  ELSE
    b := false

END.
Error: Expected boolean expression if condition
PROGRAM bad6;

BEGIN
  sum := 0;
END.
Error: Undeclared variable