MSCP 52011
Introduction to Computer Systems
Assembler
Assemblers
Assemblers are the first rung up the software hierarchy ladder
• translator of a simple language
• good first step toward writing compilers
Low-level C programming may involve some assembly programming for optimization
Software Hierarchy
• Operating System: Unix, iOS
• Compiler: C, Java
• Virtual Machine Language Translator: JVM, .NET
• Assembler: one for every CPU arch
Program Translation
Program Translation Challenge
• Parse the source program, using the syntax rules of the source language
• Re-express the program’s semantics using the syntax rules of the target language
Hack Assembly Language
An assembly program is a stream of text lines, each being one of the following things:
• A-instruction
• C-instruction
• Symbol declaration: (symbol)
• Comment or white space: // comment
Handling A-Instructions
Translation to binary:
• If value is a number: simple
• If value is a symbol: later
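The "simple" numeric case of an A-instruction can be sketched as follows. This is a minimal sketch, assuming the standard Hack format in which an A-instruction is written @value and encoded as a leading 0 followed by a 15-bit address; the function name encode_a_instruction is hypothetical and not part of the slides.

```python
def encode_a_instruction(line: str) -> str:
    """Translate a numeric A-instruction such as '@21' into 16 binary digits."""
    value = int(line[1:])             # drop the leading '@'
    if not 0 <= value < 2 ** 15:      # assumption: the address field is 15 bits wide
        raise ValueError(f"address out of range: {value}")
    return format(value, "016b")      # the leading 0 bit marks an A-instruction


print(encode_a_instruction("@21"))    # 0000000000010101
```

Symbolic values (for example @LOOP) are not handled here; they need the symbol table introduced later.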
Handling C-Instructions
Each field of the C-instruction is translated to binary separately:
C-instruction: dest=comp;jump   // comp is mandatory; dest and jump are optional
• comp is one of (these are the ALU operations):
  0, 1, -1, D, A, !D, !A, -D, -A, D+1, A+1, D-1, A-1, D+A, D-A, A-D, D&A, D|A,
  M, !M, -M, M+1, M-1, D+M, D-M, M-D, D&M, D|M
• dest is one of: null, M, D, MD, A, AM, AD, AMD
• jump is one of: null, JGT, JEQ, JGE, JLT, JNE, JLE, JMP
[some examples]
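The field lookups can be sketched as three tables plus one concatenation. The bit patterns below are not listed in these slides; they are taken from the standard Hack machine language specification (111 prefix, then 7 comp bits, 3 dest bits, 3 jump bits), so check them against your course material. The names COMP, DEST, JUMP and encode_c_instruction are hypothetical.

```python
# comp field: 1 "a" bit (0 = A register, 1 = M) followed by 6 ALU control bits.
COMP = {
    "0": "0101010", "1": "0111111", "-1": "0111010",
    "D": "0001100", "A": "0110000", "M": "1110000",
    "!D": "0001101", "!A": "0110001", "!M": "1110001",
    "-D": "0001111", "-A": "0110011", "-M": "1110011",
    "D+1": "0011111", "A+1": "0110111", "M+1": "1110111",
    "D-1": "0001110", "A-1": "0110010", "M-1": "1110010",
    "D+A": "0000010", "D+M": "1000010",
    "D-A": "0010011", "D-M": "1010011",
    "A-D": "0000111", "M-D": "1000111",
    "D&A": "0000000", "D&M": "1000000",
    "D|A": "0010101", "D|M": "1010101",
}
# None stands for the "null" (omitted) dest or jump field.
DEST = {None: "000", "M": "001", "D": "010", "MD": "011",
        "A": "100", "AM": "101", "AD": "110", "AMD": "111"}
JUMP = {None: "000", "JGT": "001", "JEQ": "010", "JGE": "011",
        "JLT": "100", "JNE": "101", "JLE": "110", "JMP": "111"}


def encode_c_instruction(dest, comp, jump):
    """Assemble the 16-bit code for already-separated dest, comp, jump fields."""
    return "111" + COMP[comp] + DEST[dest] + JUMP[jump]


print(encode_c_instruction("D", "D+A", None))   # 1110000010010000
print(encode_c_instruction(None, "0", "JMP"))   # 1110101010000111
```

Keeping the three tables as plain dictionaries means an unknown mnemonic fails with a KeyError, which is adequate when perfect input is assumed.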
Implementation
Stage 1: build a basic assembler for programs that have no symbols
Test using MaxL.asm, RectL.asm, PongL.asm
Stage 2: extend the basic assembler with symbol-handling capabilities
Test using Max.asm, Rect.asm, Pong.asm
Overall logic of the final assembler
For each (real) command:
• parse the command (break it up into its constituent fields)
• replace each symbolic reference (if any) with the corresponding memory address (a binary number)
• for each field, generate the corresponding binary code
• assemble the binary codes into complete machine instructions
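The parsing step for a C-instruction (breaking dest=comp;jump into its fields) might look like the sketch below. It assumes white space and comments have already been removed from the line; the function name parse_c_instruction and the use of None for an omitted field are illustrative choices, not something the slides prescribe.

```python
def parse_c_instruction(line):
    """Split 'dest=comp;jump' into (dest, comp, jump); missing fields are None."""
    dest, comp, jump = None, line, None
    if "=" in comp:                       # optional dest comes before '='
        dest, comp = comp.split("=", 1)
    if ";" in comp:                       # optional jump comes after ';'
        comp, jump = comp.split(";", 1)
    return dest, comp, jump


print(parse_c_instruction("D=M"))         # ('D', 'M', None)
print(parse_c_instruction("0;JMP"))       # (None, '0', 'JMP')
```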
Predefined Symbols
Label      RAM address
SP         0
LCL        1
ARG        2
THIS       3
THAT       4
R0-R15     0-15
SCREEN     16384
KBD        24576
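One way to hold these predefined symbols is a dictionary that the assembler fills in before reading any source line. This is a minimal sketch; the function name initial_symbol_table is hypothetical.

```python
def initial_symbol_table() -> dict[str, int]:
    """Return a fresh symbol table containing only the predefined symbols."""
    table = {
        "SP": 0, "LCL": 1, "ARG": 2, "THIS": 3, "THAT": 4,
        "SCREEN": 16384, "KBD": 24576,
    }
    for i in range(16):                   # R0-R15 map to RAM addresses 0-15
        table[f"R{i}"] = i
    return table


symbols = initial_symbol_table()
print(symbols["R13"], symbols["SCREEN"])  # 13 16384
```

Returning a new dictionary on every call keeps one assembly run from leaking labels or variables into the next.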
Suggested implementation
• Parser: unpacks each command into its underlying fields
• Code: translates each field into its corresponding binary value
• SymbolTable: manages the symbol table (leave this until the end)
• Main: initializes I/O, drives the program
• You can assume perfect input for an A; input checking for higher A
A procedural approach: 3 passes
First pass:
• Clear out white space
• Remove comments
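The first pass can be sketched as a function over a list of source lines. It assumes that comments always start with // and that white space never matters inside a Hack instruction, so all of it can be dropped; the name clean is hypothetical.

```python
def clean(lines):
    """Strip comments and white space; drop lines that end up empty."""
    cleaned = []
    for line in lines:
        code = line.split("//", 1)[0]     # keep only the text before a comment
        code = "".join(code.split())      # remove spaces, tabs and newlines
        if code:
            cleaned.append(code)
    return cleaned


source = ["// Adds 2 and 3", "", "   @2   // load 2", "D = A"]
print(clean(source))                      # ['@2', 'D=A']
```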
Second pass:
• Find labels
• Parse for “(” label name “)”
• Put the label in the symbol table with the address of its line number (think about this… it’s for jumping the PC)
• Remove the line; this will affect the line number of subsequent symbols
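The label pass can be sketched as below. Instead of deleting label lines and renumbering afterwards, this sketch counts only the real instructions kept so far, so each label receives the address of the instruction that follows it. It assumes the input is already free of comments and white space; the name resolve_labels is hypothetical.

```python
def resolve_labels(lines, symbols):
    """Record (LABEL) declarations in symbols and return the remaining lines."""
    instructions = []
    for line in lines:
        if line.startswith("(") and line.endswith(")"):
            # The label points at the next real instruction, whose address
            # equals the number of instructions collected so far.
            symbols[line[1:-1]] = len(instructions)
        else:
            instructions.append(line)
    return instructions


table = {}
program = ["@i", "(LOOP)", "D=M", "@LOOP", "0;JMP", "(END)"]
print(resolve_labels(program, table))     # ['@i', 'D=M', '@LOOP', '0;JMP']
print(table)                              # {'LOOP': 1, 'END': 4}
```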
Third pass: go through program again and translate each line:
• if the line is a C-instruction, do some lookups (dest, comp, jump)
• if the line is @xxx, where xxx is a number, pretty simple
• if the line is @xxx, where xxx is a symbol, look it up in the symbol table
  • if the symbol is found, replace it with its numeric meaning and complete the command’s translation
  • if the symbol is not found, then it must represent a new variable: add the pair (xxx, n) to the symbol table, where n is the next available RAM address, and complete the command’s translation
• Write code to machine language file
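The @xxx handling in this pass can be sketched as follows. The slides do not state where variables start; the sketch assumes the standard Hack convention that new variables are allocated from RAM address 16 upward. C-instructions are passed through unchanged to keep the example short, and the name translate_a_instructions is hypothetical.

```python
def translate_a_instructions(instructions, symbols):
    """Replace every @xxx line with its 16-bit code, allocating new variables."""
    next_free = 16                        # assumption: first variable address in Hack
    output = []
    for line in instructions:
        if not line.startswith("@"):
            output.append(line)           # C-instruction: left as-is in this sketch
            continue
        name = line[1:]
        if name.isdigit():                # @xxx where xxx is a number
            value = int(name)
        else:
            if name not in symbols:       # unknown symbol: a new variable
                symbols[name] = next_free
                next_free += 1
            value = symbols[name]
        output.append(format(value, "016b"))
    return output


table = {"LOOP": 1}
print(translate_a_instructions(["@i", "@LOOP", "@5", "D=M"], table))
# ['0000000000010000', '0000000000000001', '0000000000000101', 'D=M']
```

Because labels were all entered in the previous pass, any symbol still missing at this point can safely be treated as a variable.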
Testing your assembler
1. Write assembler without symbol-handling capabilities
2. Test on: MaxL.asm, RectL.asm, PongL.asm
3. Add symbol-handling to your assembler
4. Test on: Add.asm, Max.asm, Rect.asm, Pong.asm
Note: Run your assembled programs!