Overview Language Processors Structure of a Compiler Structure of the Course
1. Introduction
NYU Courant Institute
Compiler Construction (CSCI-GA.2130-001)
Copyright By PowCoder代写 加微信 powcoder
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 1 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Acknowledgments
Adapted from CSCI-GA.2130-001 slides by and .
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 2 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Language Processors Structure of a Compiler Structure of the Course
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 3 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
source program → compiler → target program
semantically equivalent programs
source programs typically high level
target programs typically assembly or machine code
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 4 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
source program → compiler → target program
semantically equivalent programs
source programs typically high level
target programs typically assembly or machine code
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 4 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Roles of a Compiler
allow programming at an understandable abstraction level but execution based on low-level code
allow programs to be written in machine-independent languages but execution based on machine-specific code
help in verifying software
early discovery of programming errors
provide automatic code optimization
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 5 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Some Historical Highlights
coins the concept and writes the first compiler in 1952.
. Backus presents the first formally based compiler (FORTRAN) in 1957.
. Allen and introduce most of the abstract concepts in compilers and compiler optimization we use today; early 1960s.
. Aho and . Ullman (and others) formalize parsing in 1960s and 70s (finite automata, context-free grammars).
publishes attribute grammars in 1968, which defines modern compiler construction methodology.
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 6 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Language Processors
Structure of a Compiler Structure of the Course
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 7 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Language Processors
Compiler: a program that translates a program into a semantically equivalent program.
Interpreter: a program for executing another program.
Usually compilers faster than interpreters.
Interpreters usually better at error diagnostics. Just-in-time compilers blur the boundary.
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 8 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Language Processors
Compiler: a program that translates a program into a semantically equivalent program.
Interpreter: a program for executing another program.
Usually compilers faster than interpreters.
Interpreters usually better at error diagnostics. Just-in-time compilers blur the boundary.
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 8 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Language Processors
Compiler: a program that translates a program into a semantically equivalent program.
Interpreter: a program for executing another program.
Usually compilers faster than interpreters.
Interpreters usually better at error diagnostics. Just-in-time compilers blur the boundary.
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 8 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Language Processors
Compiler: a program that translates a program into a semantically equivalent program.
Interpreter: a program for executing another program.
Usually compilers faster than interpreters.
Interpreters usually better at error diagnostics. Just-in-time compilers blur the boundary.
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 8 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Interpreters
Interpreter diagrams, I-diagrams:
S – Source language
I – Implementation language
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 9 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Compiler diagrams, T-diagrams:
S – compiled Source language T – generated Target language I – Implementation language
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 10 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Hybrid: The Java Compiler
S – compiled Source language
B – intermediate Bytecode language M – actual Machine language
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 11 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Some Common Examples
Compiled languages: C, C++, Haskell, ML, (Java) … Interpreted languages: Python, JavaScript, (Java) …
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 12 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Language-Processing System
Preprocessor:
Compiler: Assembler: Linker: Loader:
expands macros and combines source program modules.
translates source language to assembly code. translates assembly code to machine code. resolves links to libraries and other files. combines executable object files in memory.
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 13 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Language Processors Structure of a Compiler Structure of the Course
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 14 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Transformation Phases
source program
Characters Tokens Tree Tree IR IR Machine code
— — — —
Lexical Analysis
Syntax Analysis
Semantic Analysis
Intermediate Representation Generator Optimizer
Code Generator Machine-Dependent Code Optimizer
Symbol Table
target program
Machine code
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 15 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Source program arrives as stream of characters: position = initial + rate * 60
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 16 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Lexeme is the smallest meaningful entity of a language. The lexemes here: position, =, initial, +, rate, *, and 60.
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 17 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Lexical Analysis
position = initial + rate * 60
scanned into list of tokens, one for each lexeme: ⟨id,1⟩ ⟨=⟩ ⟨id,2⟩ ⟨+⟩ ⟨id,3⟩ ⟨∗⟩ ⟨num,60⟩
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 18 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Syntax Analysis
⟨id,1⟩ ⟨=⟩ ⟨id,2⟩ ⟨+⟩ ⟨id,3⟩ ⟨∗⟩ ⟨num,60⟩ parsed into syntax tree (using precedence):
⟨id,1⟩ww = ”+ ⟨id,2⟩ww ”∗
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 19 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Semantic Analysis
enriched with semantic information (explicit type conversion):
⟨id, 3⟩ uu= ))
⟨id,1⟩ uu + ⟨id,2⟩
intToFloat ⟨num, 60⟩
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 20 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Intermediate Representation Generation
uu= )) ⟨id,1⟩ uu +
intToFloat ⟨num, 60⟩
typically translated to intermediate code:
t1 = intToFloat(60)
t2 = id3 * t1
t3 = id2 + t2
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 21 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Optimization
t1 = intToFloat(60)
t2 = id3 * t1
t3 = id2 + t2
optimized to:
t1 = id3 * 60.0
id1 = id2 + t1
Good target code: faster, shorter, using less power.
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 22 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Code Generation
t1 = id3 * 60.0
id1 = id2 + t1
generates (performing same task):
lw t0, $id3
li t1, 60
fmul t0, t0, t1
lw t1, $id2
fadd t0, t0, t1
sw t0, $id1
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 23 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Symbol Table
Records variable attributes, such as: storage information
number and types of arguments for procedures how arguments are passed
return type
Needs to find, store, retrieve records quickly.
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 24 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Compiler-Construction Tools
Lexer generators: token description → lexical analyser
Parser generators: grammar → syntax analyser
Syntax-directed translation engines: syntax tree → IR
Code-generator generators: translation rules → code generator
Data-flow engines: data-flow information analyzers
Compiler generators: integrated set of the above
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 25 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Language Processors Structure of a Compiler Structure of the Course
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 26 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Course Description
Standard compilers course following the Dragon Book.
11 lectures and homework assignments. 1 special topic.
2 exams.
Semester-long programming project.
Grade: 15% hw, 15% midterm, 25% final, 45% project.
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 27 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Implement fully functional compiler:
Source language: ChocoPy.
Target language: RISC-V assembly.
Implementation language: Java.
Logistics: Team effort; three parts; code and write-up.
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 28 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Choco dialect of Python designed at UC Berkeley for teaching compilers: https://chocopy.org/.
Familiar: Can be executed by Python.
Statically typed: Enforces Python type annotations.
Expressive: Supports lists, classes, and nested functions.
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 29 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
class Vector(object): items: [int] = None capacity: int = 0
def size(self: “Vector”) -> int: return len(self.items)
def contains(self: “Vector”, x: int) -> bool: i: int = 0
while i < len(self.items): if self.items[i] == x:
return True
i=i+1 return False
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 30 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Reduced instruction set computers (RISC) use a small set of general instructions.
RISC-V is an open-source architecture based on RISC. Has offline and online simulators.
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 31 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
RISC-V Example
.globl $Vector.contains
$Vector.contains:
addi sp, sp, # Reserve space for stack frame
sw ra, @Vector.contains.size-4(sp)
sw fp, @Vector.contains.size-8(sp)
addi fp, sp, @Vector.contains.size
sw a0, -12(fp)
j label_15
lw a0, 4(fp)
bnez a0, label_17
j error.None
lw a0, 12(a0)
# return address
# control link
# New fp is at old SP
# Load integer literal 0
# local variable i
# Jump to loop test
# Top of while loop
# Load var: Vector.contains.self
# Ensure not None
# Go to error handler
# Not None
# Get attribute: Vector.items
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 32 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Implementation Language
Java: 4-5 KLOC given; at least as much to write.
Will use lexer and parser generators (JFlex and CUP). Only use another JVM language if you seek challenge.
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 33 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Working in three-person teams.
Three milestones: parser; type checker; code generator. Submit code and write-up.
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 34 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Project Review
Implement fully functional compiler:
Source language: ChocoPy.
Target language: RISC-V assembly.
Implementation language: Java.
Logistics: Team effort; three parts; code and write-up.
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 35 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Project Challenges
Volume and complexity of work.
Need for independent investigation. Software-engineering challenges.
Team and project management.
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 36 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Team Formation
Start forming your three-person team:
Speak with classmates directly.
Post on Brightspace.
If you are teamless by Jan 31, we will urge you to fix that. If you are teamless by Feb 7, we will assign you a team.
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 37 / 38
Overview Language Processors Structure of a Compiler Structure of the Course
Thank you!
Compiler Construction (CSCI-GA.2130-001) 1. Introduction 38 / 38
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com