IT代写 9/29/21

9/29/21
Compilation
COMP712 Programming Languages
AUT
1
Program words symbols
Every statement turns into a tree, Sn
tokens
Syntax Analyser
A list of trees
A program is transformed into a list of trees
2
Lexical Analyser, so far
Program
…… S1 S2….
12
Program words tokens symbols
Syntax Analyser
A list of trees
An Example
Every statement turns into a tree, Sn
X := Y + 5; Print x;
:=
x+ y5
print While x < 100 do x Begin : end while block S1 S2.... < x5 3 : : What do you do if it is a name? Analysing a sentence at a time Program words symbols x:=y+5; x Print x; Syntax A list of tokens avariablename trees Analyser 4 34 Analysing a sentence at a time x := y + 5; x Print x; : You hash it into the s.name Hash Table Syntax Analyser What else is passed on to the syntax analyser? symbol table. What is returned here? NameTable “x” P1 P2 flags P3 Symbol Table 5 Analysing a sentence at a time x := y + 5; Print x; : : x s.name Hash Table Syntax Analyser NameTable “x” P1 P2 flags P3 But why point to here? s.name ? S.Name and a pointer to the symbol table Symbol Table 6 56 1 9/29/21 x := y + 5; Print x; : : : s.colon? No, you really want to return s.becomes Analysing a sentence at a time Read next symbol Just like reading in a comment /*...*/, when you read in a “/”, you have to check if it is followed by a *. So, reading in a “:”, you have to check if it is followed by a “=“. 7 Analysing a sentence at a time x := y + 5; := Print x; Syntax Analyser NameTable “x” P1 P2 flags P3 Build the tree for this statement. What does it look like? : : Do you need to hash this to the symbol table? No, but what do you do next? s.becomes Hash Table Symbol Table 8 78 Analysing a sentence at a time := x x := y + 5; Print x; : : := Syntax s.becomes Analyser But, what is x? 9 Hash Table NameTable “x” P1 P2 flags P3 Symbol Table Analysing a sentence at a time := s.name x := y + 5; := Print x; : : Syntax Analyser NameTable “x” P1 P2 flags P3 s.becomes Hash Table Symbol Table 10 9 10 Hash Table NameTable “x” The symbol table thus holds information about the variable, known as run-time properties. It is organized as a push- down stack. It has 3 pointers, P1, P2, & P3. Analysing a sentence at a time P1 P2 flags P3 Symbol Table Q: why use a pushdown stack and how is it implemented? Because you could have different types of x’s. 11 Push-down stack: refresh Information coming in: 1, 2, 3, 4. Information going out? Rationale A pushdown stack thus prevents access to earlier information until the current information is being processed. 4, 3, 2, 1 i.e. it is a FILO queue Implementation 1 12 4 3 2 1 A stack list A linked 2 11 12 2 9/29/21 Using a linked list: Push and Pop Top of the stack Push 3 onto the stack 2 Pop 3 off 1 the stack 3 A linked 2 list 1 You use a linked list when the information you push/pop is more complex. 13 Hash Table NameTable “x” The symbol table thus holds information about the variable, known as run-time properties. It is organized as a push- down stack. It has 3 pointers, P1, P2, & P3. 14 Lexical Analyser (cont.) P1 P2 flags P3 Symbol Table So, where is the PDS? Which pointer is used to implement the linked list? P1 13 14 Hash Table NameTable “x” P1 P2 flags P3 P2? pointer to the variable identity i.e. its actual name P3? pointer to the code of the procedure, if any. Again, why do we need a pushdown stack? 15 Lexical Analyser (cont.) So, which pointer is used to implement the linked list? P1 Symbol Table To handle multiple occurrences Begin integer x; : : Begin real x; : : Hash Table NameTable “x” P1 P2 flags P3 Having read the first “x”, your symbol table is as shown on the right. What happens when the lexical analyser reads the second “x”? 16 15 16 To handle multiple occurrences You compute the runtime properties of the second x and put it into the symbol table. How? Begin integer x; : : Begin real x; : : Hash Table NameTable “x” P1 P2 flags P3 real x P1 P2 flags P3 integerx 17 To handle multiple occurrences Begin integer x; : : Begin real x; : : Hash Table NameTable “x” P1 P2 flags P3 real x P1 P2 flags P3 integerx You put the real x at the top of the stack, why? So that all references to x now refer to it without having to search for it. And you don’t need to refer to integer x until you exit this block. 18 17 18 3 9/29/21 Hash Table NameTable “x” P1 P2 flags P3 realx P1 P2 flags P3 intx NameTable “x” P1 P2 flags P3 realx P1 P2 flags P3 intx To handle multiple occurrences Now, what happens when you exit the block? You pop out the description for real x so that from now on, any reference to x points to the descriptor for integer x. 19 To handle multiple occurrences Begin integer x; : : Begin real x; Integer x : : End; : x := 5; But, where did the descriptor for real x go? Attached to all the trees that refer to real x! Hash Table NameTable “x” P1 P2 flags P3 Descriptor for real x removed 20 19 20 Hash Table NameTable “x” The symbol table holds the run-time properties of the variables used in the program but what are these properties? Run-time properties of variables P1 P2 flags P3 Symbol Table Note that these properties are what we need to know about the variables when the program is executed. 21 These properties are: Its string name P2 Declaration status: yes/no boolean Variable type: e.g. integer, real,.. Is this a reserved word: yes/no If this is a function, its code P3 Block level number number Offset value number const boolean But, what are these two? 22 21 22 To understand what the block level number and offset value are for, something is amiss here: why is there no storage allocated for holding the runtime value of a variable? Isn’t its value a runtime property? “x” flags In other words, why don’t we have: P1 P2 / “..” value P3 23 Then, we could do: X := 5 This is known as static allocation of storage as found in early languages (ex. Fortran 77). “x” / flags “..” P1 P2 5 P3 Disadvantages? No recursion! Why? Because if you allocate value there, what happens when you call your function again? You don’t have storage for it or if you use the same store, you over-ride the previous value. We need to introduce the concept of a data frame. 24 23 24 4 9/29/21 Data Frame Any part of the program that allows the declaration of variables – procedures, blocks, etc. – needs a data frame A data frame is thus a collection of memory locations (i.e. storage for the variable) that are created during runtime. To create a data frame, we need to know how many variables are used in the program. 25 Import java.util.Scanner; Public class PrimeTest { public static void isPrime (int n) { boolean prime = true; If (n < 2) prime = false; for (int i=2; i < n; i++) { if (n%i==0) { prime = false; break; } } if (prime) Where are the data frames created in this program? System.out.printf(“%d %s\n”, n, “is a prime number”) else System.out.printf(“%d %s\n”, n, “is not a prime number”) } Public static void main (String[]args) { Scanner sc = new Scanner (System.in); Sytem.out.println( “enter a number to check if its prime or not”) int num = sc.nextInt(); isPrime (num); } 26 25 26 Import java.util.Scanner; Public class PrimeTest { public static void isPrime (int n) { boolean prime = true; If (n < 2) prime = false; for (int i=2; i < n; i++) { if (n%i==0) i: { prime = false; break; } } if (prime) System.out.printf(“%d %s\n”, n, “is a prime number”) else System.out.printf(“%d %s\n”, n, “is not a prime number”) } Public static void main (String[]args) { Scanner sc = new Scanner (System.in); Sytem.out.println( “enter a number to check if its prime or not”) int num = sc.nextInt(); isPrime (num); } prime: n: num: 27 Import java.util.Scanner; So, whenever you use this code, you need to create2 locations to hold “n” and “prime”. But you also need one when you enter the for- loop for “i”. And you need one location for “num”. Public class PrimeTest { public static void isPrime (int n) { boolean prime = true; If (n < 2) prime = false; for (int i=2; i < n; i++) { if (n%i==0) { prime = false; break; } } if (prime) System.out.printf(....) else System.out.printf(....) } Public static void main (String[]args) { Scanner sc = new Scanner (System.in); Sytem.out.println(....) int num = sc.nextInt(); isPrime (num); } 28 27 28 Data frames are created from a stack and are allocated only when that piece of code is executed. Imagine if you call: mainàisPrimeàisPrime. main 4loop isPrime 4loop isPrime 29 To add and remove data frames, you need two pointers – the base (blue) and the top (red) pointers main isPrime 4loop isPrime 4loop 30 29 30 5 9/29/21 To access the location assigned for each variable, you assigned each variable an offset number. Then do: base+offset Main: num(0); isPrime: n(0), prime(1) 4loop: i(0). main num = isPrime + 0 Offset numbers Pointer’s address prime = + 1 n= +0 31 But after you exit the for-loop and return to isPrime, what happens? You need to do the opposite (i.e. shift the pointers down) but how do you know where your pointers go? 4loop isPrime 4loop isPrime main 32 31 32 To restore the stack, you need to store the return pointer: 4loop 4loop 4loop 4loop In the 3rd version, store 2 pointers: where to return and where is the top 33 Possible flags: “x” / “..” Declaration status: declare or not Declaration type e.g. integer, real,.. P1 P2 flags P3 Variable type: reserved or user-defined Block level number What about Offset value block level 34 33 34 Begin integer x; : So, here is block 0 : Begin integer y; : y := y + x; : here is End; block 1 😡 : = 5 ; h e r e i s block 0 How do we get x? So, you need another pointer to point to the correct block 4loop 35 Such a method for allocating storage is known as dynamic allocation. Advantages? Allow recursion in PLs! How? First, observe no actual locations are reserved in the symbol table (as opposed to static allocation). What then is recorded in the symbol table? 2 numbers – block level and offset value 36 35 36 6 9/29/21 S Y M B O L T A B L E Create an entry for each variable and compute their runtime properties Pass runtime properties of variables to the tree Source Program read character Lexical analysis Lexical terms Syntax analysis Parser tree Simple translation Object code Loader Target program The process of compilation We are almost here! Optimise code (optional) 37 The task of lexical analysis is to divide up the input into lexical items and compute the runtime properties of variables used. The data structure needed is a symbol table. The task of syntax analysis is to combine the lexical items into phrases which eventually lead to legal sentences. The data structure used is a tree. Syntax analysis, thus, consists of two independent parts: how to recognise a phrase and how to produce a tree node. 38 37 38 But before we move into Syntax Analysis, let’s look at implementing a Lexical Analyser. What basic functions do we need? A function to read each character from a program and convert it to tokens. A function to lookup the name in the symbol table and if necessary, compute the runtime properties of the symbol. 39 A function to read each character from the program and convert it to token: Let Nextsym() $( switchon read-char() into $( case ‘+’: ? case ‘-’: ? case ‘0’...’9’: ? case ‘a’...’z’: ? : : return sym 40 39 40 A function to read each character from the program and convert it to token Let Nextsym() $( switchon read-char() into $( case ‘+’: sym := s.plus; endcase case ‘-’: sym := s.minus; endcase case ‘0’...’9’: ? case ‘a’...’z’: ? : : return sym 41 A function to read each character from the Java program and convert it to token: Let Nextsym() $( switchon read-char() into $( case ‘+’: sym := s.plus; endcase case ‘-’: sym := s.minus; endcase case ‘0’...’9’: a routine to read in a number case ‘a’...’z’: a routine to read in a name : : return sym 42 41 42 7 9/29/21 A function to lookup the name in the symbol table. Let lookupword() $( use the hash function to find an entry for the word read check name-table to find the correct entry. {how do you do this?} If there is, retrieve and return its type. If there isn’t, insert a new entry (what?) and return s.name (why?) $) 43 The Parsing Process COMP712 Programming Languages AUT 44 43 44 A typical view of the compilation process: parsing Your Program Lexical Analyser Syntax Analyser What is generated? Symbol Table Parse Tree Tokens = terminal words 45 A typical view of the compilation process: parsing What happens next? Code generation Interpretation i.e walk the tree 46 Your Program Lexical Analyser Syntax Analyser 45 46 A typical view of the compilation process: parsing But the above does not say whether one performs a complete lexical analysis prior to performing syntax analysis or doing both simultaneously. Your Program Lexical Analyser Syntax Analyser 47 Thus, we could have a two-pass parser: 1 pass Or, a single pass parser: 1 pass 2 pass In the previous lecture, we look at parsing from the LA standpoint. Now we look at it from the SA standpoint 48 Your Program Lexical Analyser Syntax Analyser Syntax Analyser Lexical Analyser Your Program 47 48 8 9/29/21 Consider (single) parsing the simple Algol-like program: Begin integer x, y; x := 5; y := x; : Begin real y; : end; : End; First, what does parsing means? It means syntax analyser calls lexical analyser to get tokens and use them to build a parse tree immediately 49 Consider parsing the simple Algol-like program: Begin integer x, y; x := 5; y := x; : Begin real y; : end; : End; What happens when you read “begin”? LA finds the entry “begin” in the hash table It returns s.begin to the SA What happens next? 50 49 50 Consider parsing the simple Algol-like program: Begin integer x, y; x := 5; y := x; : Begin real y; : end; : End; SA knows what s.begin means. So, what action is taken? SA ignores s.begin and starts the tree building process. SA initialises tree: Tree 51 Consider parsing the simple Algol-like program: Begin integer x, y; x := 5; y := x; : Begin real y; : end; : End; What happens when you read “integer”? LA finds the entry “integer” in the hash table. Why? What does it return to SA? s.integer 52 51 52 Consider parsing the simple Algol-like program: Begin integer x, y; x := 5; y := x; : Begin real y; : end; : End; SA knows what s.integer means. So, what actions are taken? SA sets declaration = true SA sets decl_type = s.integer Global variables 53 Consider parsing the simple Algol-like program: Begin integer x, y; x := 5; y := x; : Begin real y; : end; : End; What happens when you read “x”? LA looks for the entry “x” in the hash table But it is not there! 54 53 54 9 9/29/21 LA creates an entry “x” in the symbol table: Begin integer x, y; x:=5; y := x; : : Begin real y; : end; : End; / “x” Isthatall? No P1P2 flags / Flags: Declared: yes Type: s.integer Reserved or user-defined Block level: 0 Offset: 0 55 Continue parsing the simple Algol-like program: Begin integer x, y; x := 5; y := x; : Begin real y; : end; : End; What happens when you read “,”? It returns s.comma to the SA SA ignores s.comma and knows more declaration coming. 56 55 56 Continue parsing the simple Algol-like program: Begin integer x, y; x := 5; y := x; : Begin real y; : end; : End; What happens when you read “y”? LA looks for the entry “y” in the hash table Again, it is not there! 57 Begin integer x, y; x := 5; y := x; : : Begin real y; : end; : End; / ......... “x” LA creates another entry in the symbol table. / “y” P1P2 flags / Flags: Declared: yes Type: integer Reserved or user-defined Block level: 0 Offset: 1 58 57 58 Block level tells you that these variables are at the same level i.e using the same data frame. Enter a new block level, you reset offset value to zero and increase block level by 1. Exit a new block level, you reset offset value to the previous value and decrease block level by 1. Offset value allows you to index its location on the stack. 59 Recall what is a data frame? The memory required by your program In this program, we need 2 locations at the outer block To get x, you do: Base address + offset of x Begin integer x, y; x := 5; y := x; : : Begin real y; : y := 0.5*x; end; : End; y: x: 60 59 60 10 9/29/21 Continue parsing the simple Algol-like program: Begin integer x, y; x := 5; y := x; : Begin real y; : end; : End; What happens when you read “;”? It returns s.semicolon to the SA SA ignores it and sets decl_type = nil. 61 Continue parsing the simple Algol-like program: Begin integer x, y; x := 5; y := x; : Begin real y; : end; What happens when you read “x”? LA finds the entry “x” in the hash table : It returns s.name and End; a pointer to “x” in the symbol table to SA 62 61 62 SA gets s.name and a pointer to the symbol table: Flags: Declared: yes Type: s.integer user-defined Block level: 0 Offset: 0 / “x” P1 P2 flags / Since SA did not encounter s.integer or s.real or s.procedure, it knows declaration has finished. So, declaration := false. It now expects an operator next. 63 Continue parsing the simple Algol-like program: Begin integer x, y; x := 5; y := x; : Begin real y; : end; : End; After reading x, what is read? : not := and reading ”:”, what happens next? It has to check if the next symbol is “=“ and if so, returns? s.becomes to SA 64 63 64 SA gets s.becomes and starts to build the first sentence which is an assignment expression. / “x” := ......... / “y” s.name Note where this arrow is pointing to P1P2 flags / 65 Continue parsing the simple Algol-like program: Begin integer x, y; x := 5; y := x; : Begin real y; : end; : End; What happens when you read “5”? LA returns the value 5 and its type s.integer to SA SA knows the right hand side is an arithmetic expression. Call a function to build such a tree. 66 65 66 11 67 68 9/29/21 Continue parsing the simple Algol-like program: Begin integer x, y; x := 5; y := x; : Begin real y; : end; : End; What happens when you read “;”? LA returns s.semicolon to SA SA knows end of statement reached. 67 SA now completes the assignment expression tree. It checks that variable x is of the right type. / “x” ......... / “y” P1P2 flags / s.name := s.value 5 68 SA builds the first statement tree in the program Tree / “x” := ......... / “y” x5 P1 P2 flags / 69 69 12