程序代写代做代考 js compiler mips x86 Java interpreter algorithm assembler assembly chain data structure cache computer architecture 211: Computer Architecture
 Spring 2021

211: Computer Architecture
 Spring 2021
Instructor: Prof. David Menendez
Topics:
■ Hardware-Software Interface
■ AssemblyProgramming
● Reading: Chapter 3

Programming Meets Hardware
High-Level Language Program
Compiler
Assembly Language Program
Assembler
Machine Language Program
#include int main() {
int x, y, temp;
x=1; y=2;
temp =x; x=y; y=temp; printf(“%d %d %d\n”,x,y,temp);
}
How do you get performance?
7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 02 00 03 00 01
00 00 00 f0 82 04 08 34 00 00 00 c4 0c 00 00 00 00 00 00 34 00
movl $1, -8(%ebp)
movl $2, -12(%ebp)
movl -8(%ebp), %eax
movl %eax, -16(%ebp)
movl -12(%ebp), %eax
movl %eax, -8(%ebp)
movl -16(%ebp), %eax
movl %eax, -12(%ebp)
movl -16(%ebp), %eax ISA movl %eax, 12(%esp)
movl -12(%ebp), %eax movl %eax, 8(%esp) movl -8(%ebp), %eax movl %eax, 4(%esp)

Performance with Programs
(1) Program: Data structures + algorithms (2) Compiler translates code
(3) Instruction set architecture
(4) Hardware Implementation
Rutgers University David Menendez 3

Instruction Set Architecture
(1) Set of instructions that the CPU can execute
(1) What instructions are available?
(2) How the instructions are encoded? Eventually everything is binary.
(2) State of the system (Registers + memory state + program counter)
(1) What instruction is going to execute next
(2) How many registers? Width of each register?
(3) How do we specify memory addresses?
● Addressing modes
(3) Effect of instruction on the state of the system
Rutgers University David Menendez 4

IA32 (X86 ISA)
There are many different assembly languages because they are processor-specific

IA32 (x86)
● x86-64 for new 64-bit processors
● IA-64 radically different for Itanium processors
● Backward compatibility: instructions added with time
■ ■
We will focus on IA32/x86-64 because you can generate and run on iLab machines (as well as your own PC/laptop)

PowerPC MIPS
IA32 is also dominant in the market although smart phone, eBook readers, etc. are changing this
Rutgers University David Menendez 5

X86 Evolution
8086 – 1978 – 29K transistors – 5-10MHz
I386 – 1985 – 275K transistors – 16-33 MHz Pentium4 – 2005 – 230M transistors – 2800-3800 MHz Haswell – 2013 – > 2B transistors – 3200-3900 MHz
Added features
• Large caches
• Multiple cores
• Support for data parallelism (SIMD) eg AVX extensions
Rutgers University David Menendez 6

CISC vs RISC
CISC: complex instructions : eg X86
• Instructions such as strcpy/AES and others • Reduces code size
• Hardware implementation complex?
RISC: simple instructions: eg Alpha • Instructions are simple add/ld/st
• Increases code size
• Hardware implementation simple?
Rutgers University David Menendez 7

Aside About Implementation of x86
About 30 years ago, the instruction set actually reflected the processor hardware

As hardware advanced, industry faced with choice
■ ■
E.g., the set of registers in the instruction set is actually what was present in the processor
Change the instruction set: bad for backward compatibility
Keep the instruction set: harder to exploit hardware advances
● Example: many more registers but only small set introduced circa 1980
Starting with the P6 (PentiumPro), IA32 actually got implemented by Intel using an “interpreter” that translates IA32 instructions into a simpler “micro” instruction set
Rutgers University David Menendez 8

P6 Decoder/Interpreter

Assembly Programming
Brief tour through assembly language programming Why?
■ ■
Why not binary language?
■ ■
Machine interface: where software meets hardware
To understand how the hardware works, we have to understand the interface that it exports
Much easier for humans to read and reason about
Major differences:
● Human readable language instead of binary sequences ● Relative instead of absolute addresses
Rutgers University David Menendez 10

Assembly Programmer’s View
CPU
Memory
ALU
Control Logic
Condition Codes
PC
Registers
(OS code & data) Object Code Program Data
Addresses
Data
Instructions

CPU
Memory
Memory
Address
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
00000000
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
Command: R/W
Data
Addresses

Memory Access: Read
CPU
Memory
03
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
00000000
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
R
01010101
Addresses

Memory Access: Write
CPU
Memory
04
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
00000000
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
W
01010101
Addresses

Memory Access: Write
CPU
Memory
04
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
01010101
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
W
01010101
Addresses

C
S
AB
Processor: ALU & Registers
C = FS(A, B)
F includes
Arithmetic: +, -, *, /, ~, etc. Logical: <, >, =, etc.
ALU
Registers
Name Storage
R0 R1 R2 R3
00101100
10001000
11111111
01010101
Multiple Ports
Name Command: R/W Data

Putting It All Together
CPU
Memory
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
01010101
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
Condition Codes
Program Counter (PC)
Control Logic
ALU
Registers
Addresses

Putting It All Together
CPU
Memory
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
01010101
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
Condition Codes
Program Counter (PC)
1
ALU
Control Logic
Registers
Addresses

Putting It All Together
CPU
Memory
Condition Codes
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
01010101
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
Program Counter (PC)
1
ALU
Registers
Control Logic
10001000
1
Addresses

Putting It All Together
CPU
Memory
Condition Codes
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
01010101
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
Program Counter (PC)
+
ALU
Control Logic
1
Registers
R0: x R1: y
R0 R1
10001000
1
Addresses

Putting It All Together
CPU
Memory
Condition Codes
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
01010101
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
Program Counter (PC)
+
ALU
x
1
y
Registers
R0: x R1: y
R0 R1
Control Logic
10001000
1
Addresses

Putting It All Together
CPU
Memory
Condition Codes
Program Counter (PC)
z
+
ALU
x
Storage
00
01
02
03
04
05
06
07
08
09
0A
0B
0C
0D
0E
0F
00101100
10001000
11111111
01010101
01010101
11000001
00000000
11111001
11111000
00110000
00000000
00000000
00000000
11000011
00011001
00000000
2
y
Registers
R0: x R1: y
R2: z
R0 R1
Control Logic
R2
10001000
1
Addresses

Sample C Code
int accum;
int sum(int x, int y)
{
gcc -O1 -m32 –S code.c
Generated Assembly
sum:
push %ebp
movl %esp, %ebp
movl 12(%ebp), %eax
addl 8(%ebp), %eax
addl %eax, accum
popl %ebp
ret
}
int t = x + y;
accum += t;
return t;
Rutgers University
David Menendez 23
C & Assembly Code

Sample C Code
int accum;
int sum(int x, int y){
int t = x + y;
accum += t;
return t;
objdump –d code.o
C & Machine Code
} 3:
6: 03 45 08
gcc -O1 -m32 –c code.c 9:
gdb code.o
(gdb) x/100xb sum
f: 5d 10: c3
pop %ebp ret
: 0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45
0x8 : 0x08 0x01 0x05 0x00 0x00 0x00 0x00 0x5d
0x10 : 0xc3 Cannot access memory at address 0x11
0000000 :
0: 55
1: 89 e5
push %ebp
mov %esp,%ebp
mov 0xc(%ebp),%eax add 0x8(%ebp),%eax
8b 45 0c
01 05 00 00 00 00 add %eax, accum
Rutgers University David Menendez 24

Assembly Characteristics
Sequence of simple instructions
Minimal Data Types

■ ■
No type checking
■ ■
“Integer” data of 1, 2, or 4 bytes
● Data values
● Addresses (untyped pointers)
Floating point data of 4, 8, or 10 bytes
No aggregate types such as arrays or structures
● Just contiguously allocated bytes in memory
Interpretation of data format depends on instruction No protection against misinterpretation of data
Rutgers University David Menendez 25

Assembly Characteristics
3 types of Primitive Operations
■ ■

Perform arithmetic function on register or memory data
Transfer data between memory and register
● Load data from memory into register ● Store register data into memory
Transfer control
● Unconditional jumps to/from procedures ● Conditional branches
Rutgers University David Menendez 26

x86 Characteristics
Variable length instructions: 1-15 bytes
Can address memory directly in most instructions
Uses Little-Endian format (Least significant byte in the lowest address)
Rutgers University David Menendez 27

General format:
opcode operands
Opcode:

● movb,addl, etc. Operands:
■ ■
Example:

Instruction Format
Short mnemonic for instruction’s purpose
Immediate, register, or memory
Number of operands command-dependent
movl %ebx, (%ecx)
Rutgers University David Menendez 28

Machine Representation
Remember, each assembly instruction translated to a sequence of 1-15 bytes
Opcode
addressing mode
other bytes
First, the binary representation of the opcode Second, instruction specifies the addressing mode
■ ■
Some instructions can be single-byte because operands and addressing mode are implicitly specified by the instruction

The type of operands (registers or register and memory) How to interpret the operands
E.g., pushl
Rutgers University David Menendez 29 29

x86 Registers
General purpose registers are 32 bit

Originally categorized into two groups with different functionality


Now, the registers are mostly interchangeable
Segment registers

Although operations can access 8-bits or 16-bits portions
Data registers (EAX, EBX, ECX, EDX)
● Holds operands
Pointer and Index registers (EBP, ESP, EIP,ESI,EDI)
● Holds references to addresses as well as indexes
Holds starting address of program segments
● CS, DS, SS, ES
Rutgers University David Menendez 30 30

x86 Registers
16 BITS
8 BITS
EAX AX
AH
AL
ECX CX
CH
CL
EDX DX
DH
DL
EBX BX
BH
BL
ESP –Stack Pointer
EBP — Base register of current stack frame
ESI — Source index for string operations
EDI — Destination index for string operations
32 BITS Rutgers University David Menendez
31

x86 Programming
• Mov instructions to move data from/to memory

• Addressing modes
• Understanding swap
• Arithmetic operations
• Condition codes
• Conditional and unconditional branches • Loops and switch statements
Operands and registers
Rutgers University David Menendez 32

Byte: 8 bits

Word: 16 bits (2 bytes)

E.g., char
E.g., short int
Data Format
Double Word: 32 bits ( 4 bytes)

Quad Word: 64 bits (8 bytes)

Instructions can operate on any data size
■ movl, movw, movb
● Move double word, word, byte, respectively

E.g., int, float
E.g., double
End character specifies what data size to be used
Rutgers University David Menendez 33

MOV instruction
Most common instruction is data transfer instruction
■ Mov SRC, DEST: Move source into destination
■ SRC and DEST are operands
■ DEST is a register or a location
■ SRC can be the contents of register, memory location, constant, or a label.
■ If you use gcc, you will see movl ,
■ Alltheinstructionsinx86are32-bit
Used to copy data:
■ Constant to register (immediate)
■ Memory to register
■ Register to memory
■ Register to register
Cannot copy memory to memory in a single instruction
Rutgers University
David Menendez
34 34

Immediate Addressing
Operand is immediate
■ ■ ■ ■
Operand value is found immediately following the instruction Encoded in 1, 2, or 4 bytes
$ in front of immediate operand E.g., movl $0x4040, %eax
ADDR
000f
00A1 00A2
memory
movl %eax
4040
Rutgers University David Menendez
35

Register Mode Addressing
Use % to denote register

Source operand: use value in specified register
Destination operand: use register as destination for value
Examples:



E.g., %eax
movl %eax, %ebx
● Copy content of %eax to %ebx
movl $0x4040, %eax ! immediate addressing ● Copy 0x4040 to %eax
movl %eax, 0x0000f !Absolute addressing
● Copy content of %eax to memory location 0x0000f
Rutgers University David Menendez 36 36

Indirect Mode Addressing
Content of operand is an address

Offset can be specified as immediate mode
Examples:


Designated as parenthesis around operand
movl (%ebp), %eax
● Copy value from memory location whose address is in ebp into eax
movl -4(%ebp), %eax
● Copy value from memory location whose address is -4 away from content of ebp into eax
Rutgers University David Menendez 37

Indexed Mode Addressing
Add content of two registers to get address of operand


Useful for dealing with arrays
■ ■

movl (%ebp, %esi), %eax
● Copy value at (address = ebp + esi) into eax
movl 8(%ebp, %esi),%eax
● Copy value at (address = 8 + ebp + esi) into eax
If you need to walk through the elements of an array Use one register to hold base address, one to hold index
● E.g., implement C array access in a for loop
Index cannot be ESP
Rutgers University David Menendez 38 38

Scaled Indexed Mode Addressing
Multiply the second operand by the scale (1, 2, 4 or 8)

movl 0x80 (%ebx, %esi, 4), %eax
● Copy value at (address = ebx + esi*4 + 0x80) into eax
Where is it useful?
Rutgers University David Menendez 39 39

Address Computation Examples
%edx
0xf000
%ecx
0x100
Expression
Computation
Address
0x8(%edx)
0xf000 + 0x8
0xf008
(%edx,%ecx)
0xf000 + 0x100
0xf100
(%edx,%ecx,4)
0xf000 + 4*0x100
0xf400
0x80(,%edx,2)
2*0xf000 + 0x80
0x1e080
40

movl Operand Combinations
Source
Imm
movl
Destination C Analog
Reg Mem
movl $0x4,%eax temp = 0x4; movl$-147,(%eax) *p=-147;

Cannot do memory-memory transfers with single instruction
Reg
Mem Reg
Reg Mem
movl %eax,%edx
movl %eax,(%edx)
movl (%eax),%edx
temp2 = temp1;
*p = temp;
temp = *p;
Rutgers University David Menendez 41

Stack Operations
By convention, %esp is used to maintain a stack in memory

%esp contains the address of top of stack
Instructions to push (pop) content onto (off of) the stack


Where does the stack start? We’ll discuss later
Used to support C function calls
pushl %eax
● esp=esp–4
● Memory[esp] = eax
popl %ebx
● ebx = Memory[esp] ● esp=esp+4
Rutgers University David Menendez 42

Using Simple Addressing Modes
swap:
pushl %ebp
movl %esp,%ebp Set
pushl %ebx
Up
void swap(int *xp, int *yp)
{
int t0 = *xp;
int t1 = *yp;
*xp = t1;
*yp = t0;
}
movl 12(%ebp),%ecx
movl 8(%ebp),%edx
movl (%ecx),%eax
movl (%edx),%ebx
movl %eax,(%edx)
movl %ebx,(%ecx)
movl -4(%ebp),%ebx
movl %ebp,%esp
popl %ebp
ret
Body
Finish

Understanding Swap
Offset
12 8 4 0 -4
movl 12(%ebp),%ecx
movl 8(%ebp),%edx
movl (%ecx),%eax
movl (%edx),%ebx
movl %eax,(%edx)
movl %ebx,(%ecx)
Stack
• • •
yp
xp
Rtn adr
Old %ebp
Old %ebx
void swap(int *xp, int *yp)
{
int t0 = *xp;
int t1 = *yp;
*xp = t1;
*yp = t0;
}
%ebp
# ecx = yp
# edx = xp #eax=*yp(t1) #ebx=*xp(t0) #*xp=eax #*yp=ebx
Register Variable
%ecx yp %edx xp %eax t1 %ebx t0

Understanding Swap
Offset
Address
0x124
0x120
0x11c
0x118
0x114
0x110
0x10c
0x108
0x104
0x100
123
456
0x120
0x124
Rtn adr
%eax
%edx
%ecx
%ebx
%esi
%edi
%esp
%ebp
0x104
yp 12 xp 8 4 0 -4
%ebp
movl 12(%ebp),%ecx
movl 8(%ebp),%edx
movl (%ecx),%eax
movl (%edx),%ebx
movl %eax,(%edx)
movl %ebx,(%ecx)
# ecx = yp
# edx=xp
# eax = *yp (t1) # ebx = *xp (t0) # *xp = eax
# *yp = ebx

Understanding Swap
Offset
Address
0x124
0x120
0x11c
0x118
0x114
0x110
0x10c
0x108
0x104
0x100
123
456
0x120
0x124
Rtn adr
%eax
%edx
%ecx
0x120
%ebx
%esi
%edi
%esp
%ebp
0x104
yp 12 xp 8 4 0 -4
%ebp
movl 12(%ebp),%ecx #
ecx = yp
movl 8(%ebp),%edx
movl (%ecx),%eax
movl (%edx),%ebx
movl %eax,(%edx)
movl %ebx,(%ecx)
# # # # #
edx=xp
eax = *yp (t1) ebx = *xp (t0) *xp = eax
*yp = ebx

Understanding Swap
Offset
Address
0x124
0x120
0x11c
0x118
0x114
0x110
0x10c
0x108
0x104
0x100
123
456
0x120
0x124
Rtn adr
%eax
%edx
0x124
%ecx
0x120
%ebx
%esi
%edi
%esp
%ebp
0x104
yp 12 xp 8 4 0 -4
%ebp
movl 12(%ebp),%ecx #
ecx = yp
edx=xp
movl 8(%ebp),%edx
movl (%ecx),%eax
movl (%edx),%ebx
movl %eax,(%edx)
movl %ebx,(%ecx)
#
# # # #
eax = *yp (t1)
ebx = *xp (t0)
*xp = eax
*yp = ebx

Understanding Swap
Offset
Address
0x124
0x120
0x11c
0x118
0x114
0x110
0x10c
0x108
0x104
0x100
123
456
0x120
0x124
Rtn adr
%eax
456
%edx
0x124
%ecx
0x120
%ebx
%esi
%edi
%esp
%ebp
0x104
yp 12 xp 8 4 0 -4
%ebp
movl 12(%ebp),%ecx
movl 8(%ebp),%edx
movl (%ecx),%eax
movl (%edx),%ebx
movl %eax,(%edx)
movl %ebx,(%ecx)
# ecx = yp
# edx=xp
# eax = *yp (t1) # ebx = *xp (t0) # *xp = eax
# *yp = ebx

Understanding Swap
Offset
Address
0x124
0x120
0x11c
0x118
0x114
0x110
0x10c
0x108
0x104
0x100
123
456
0x120
0x124
Rtn adr
%eax
456
%edx
0x124
%ecx
0x120
%ebx
123
%esi
%edi
%esp
%ebp
0x104
yp 12 xp 8 4 0 -4
%ebp
movl 12(%ebp),%ecx
movl 8(%ebp),%edx
movl (%ecx),%eax
movl (%edx),%ebx
movl %eax,(%edx)
movl %ebx,(%ecx)
# ecx = yp
# edx=xp
# eax = *yp (t1) # ebx = *xp (t0) # *xp = eax
# *yp = ebx

Understanding Swap
Offset
Address
0x124
0x120
0x11c
0x118
0x114
0x110
0x10c
0x108
0x104
0x100
456
456
0x120
0x124
Rtn adr
%eax
456
%edx
0x124
%ecx
0x120
%ebx
123
%esi
%edi
%esp
%ebp
0x104
yp 12 xp 8 4 0 -4
%ebp
movl 12(%ebp),%ecx
movl 8(%ebp),%edx
movl (%ecx),%eax
movl (%edx),%ebx
movl %eax,(%edx)
movl %ebx,(%ecx)
# ecx = yp
# edx=xp
# eax = *yp (t1) # ebx = *xp (t0) # *xp = eax
# *yp = ebx

Understanding Swap
Offset
Address
0x124
0x120
0x11c
0x118
0x114
0x110
0x10c
0x108
0x104
0x100
456
123
0x120
0x124
Rtn adr
%eax
456
%edx
0x124
%ecx
0x120
%ebx
123
%esi
%edi
%esp
%ebp
0x104
yp 12 xp 8 4 0 -4
%ebp
movl 12(%ebp),%ecx
movl 8(%ebp),%edx
movl (%ecx),%eax
movl (%edx),%ebx
movl %eax,(%edx)
movl %ebx,(%ecx)
# ecx = yp
# edx=xp
# eax = *yp (t1) # ebx = *xp (t0) # *xp = eax
# *yp = ebx

Swap in x86-64: 64-bit Registers
rax
eax
rcx
ecx
rdx
edx
rbx
ebx
rsp
esp
rbp
ebp
rsi
esi
rdi
edi
r8
r9
r10
r11
r12
r13
r14
r15

Swap in x86-64 bit
swap:
movl (%rdi), %edx
movl (%rsi), %eax
movl %eax, (%rdi)
movl %edx, (%rsi)
retq
void swap(int *xp, int *yp)
{
int t0 = *xp;
int t1 = *yp;
*xp = t1;
*yp = t0;
}
Arguments passed in registers
■ ■
No stack operations
What happens with long int?
First, xp in rdi and yp in rsi
64-bit pointers, data values are 32-bit ints, so uses eax/edx

IA32 Stack


■ Register%espindicates lowest stack address
● address of top element
Stack Pointer
%esp
Stack “Bottom”
Increasing Addresses
Region of memory managed with stack discipline
Grows toward lower addresses
Stack Grows Down
Rutgers University
David Menendez
54
Stack “Top”

IA32 Stack Pushing
Pushing
■ pushl Src

■ Decrement%espby4

Stack “Bottom”
Increasing Addresses
Fetch operand at Src Write operand at address
given by %esp
Stack Pointer
-4
Stack Grows Down
%esp
Rutgers University
David Menendez
55
Stack “Top”

IA32 Stack Popping
Popping
■ popl Dest

■ Increment%espby4

Stack “Bottom”
Increasing Addresses
Read operand at address given by %esp
Write to Dest
Stack Pointer
%esp
+4
Stack Grows Down
Rutgers University
David Menendez
56
Stack “Top”

Stack Operation Examples
pushl %eax
popl %edx
123
123
213
123
213
0x110
0x10c
0x108
0x110
0x10c
0x108
0x104
%eax
%edx
%esp
0x110
0x10c
0x108
0x104
%eax
%edx
%esp
213
555
0x108
213
555
0x1084
213
525153
0x1048
%eax
%edx
%esp

Procedure Control Flow
■ Use stack to support procedure call and return Procedure call:
call label Push return address on stack; Jump to label
Return address value
■ Address of instruction beyond call ■ Example from disassembly
804854e: e8 3d 06 00 00 8048553: 50
●Return address = 0x8048553 Procedure return:
call 8048b90


pushl %eax
■ ret Pop address from stack; Jump to address
Rutgers University David Menendez 58

0x110
0x10c
0x108
%esp %eip
0x108
0x804854e
Procedure Call Example
804854e: e8 3d 06 00 00 call 8048b90


8048553: 50
pushl %eax
call 8048b90
0x110
0x10c
0x108
0x104
%esp 0x1084 %eip 0x80485b49e0
123
123
0x8048553
%eip is program counter

8048591: c3
Procedure Return Example
ret
0x110
0x10c
0x108
0x104
%esp %eip
0x110
0x10c
0x108
ret
123
0x8048553
123
0x8048553
0x104
0x8048591
%esp 0x1048 %eip 0x804855931
%eip is program counter

Address Computation Instruction
leal: compute address using addressing mode without accessing memory
leal src, dest
■ ■
Use

Example:

src is address mode expression Set dest to address specified by src
Computing address without doing memory reference
● E.g., translation of p = &x[i];
leal 7(%edx, %edx, 4), %eax
● eax=4*edx+edx+7=5*edx+7
Rutgers University David Menendez 61

Some Arithmetic Operations
Instruction addl Src,Dest subl Src,Dest imull Src,Dest sall Src,Dest sarl Src,Dest xorl Src,Dest andl Src,Dest orl Src,Dest
Computation
Dest = Dest + Src
Dest = Dest – Src
Dest = Dest * Src
Dest = Dest << Src (left shift) Dest = Dest >> Src (right shift) Dest = Dest ^ Src
Dest = Dest & Src Dest = Dest | Src
Rutgers University
David Menendez 62

Some Arithmetic Operations
Instruction
incl Dest decl Dest negl Dest notl Dest
Computation
Dest = Dest + 1 Dest = Dest – 1 Dest = – Dest Dest = ~ Dest
Rutgers University
David Menendez 63

Using leal for Arithmetic Expressions
arith:
pushl %ebp
movl %esp,%ebp
movl 8(%ebp),%eax
movl 12(%ebp),%edx
leal (%edx,%eax),%ecx
leal (%edx,%edx,2),%edx
sall $4,%edx
addl 16(%ebp),%ecx
leal 4(%edx,%eax),%eax
imull %ecx,%eax
movl %ebp,%esp
popl %ebp
ret
Set Up
int arith
(int x, int y, int z)
{
int t1 = x+y;
int t2 = z+t1;
int t3 = x+4;
int t4 = y * 48;
int t5 = t3 + t4;
int rval = t2 * t5;
return rval;
}
Body
Finish

Understanding arith Offset
16 12 8 4 0
Stack
• • •
z
y
x
Rtn adr
Old %ebp
int arith
(int x, int y, int z)
{
int t1 = x+y;
int t2 = z+t1;
int t3 = x+4;
int t4 = y * 48;
int t5 = t3 + t4;
int rval = t2 * t5;
return rval;
}
movl 8(%ebp),%eax
movl 12(%ebp),%edx
leal (%edx,%eax),%ecx
leal (%edx,%edx,2),%edx # edx = 3*y
sall $4,%edx
addl 16(%ebp),%ecx
leal 4(%edx,%eax),%eax
imull %ecx,%eax
# edx = 48*y (t4)
# ecx = z+t1 (t2)
# eax = 4+t4+x (t5)
# eax = t5*t2 (rval)
# eax = x
# edx = y
# ecx = x+y (t1)
%ebp

Understanding arith
# eax = x
movl 8(%ebp),%eax
# edx = y
movl 12(%ebp),%edx
# ecx = x+y (t1)
leal (%edx,%eax),%ecx
# edx = 3*y
leal (%edx,%edx,2),%edx
# edx = 48*y (t4)
sall $4,%edx
# ecx = z+t1 (t2)
addl 16(%ebp),%ecx
# eax = 4+t4+x (t5)
leal 4(%edx,%eax),%eax
# eax = t5*t2 (rval)
imull %ecx,%eax
int arith
(int x, int y, int z)
{
int t1 = x+y;
int t2 = z+t1;
int t3 = x+4;
int t4 = y * 48;
int t5 = t3 + t4;
int rval = t2 * t5;
return rval;
}

Another Example
logical:
pushl %ebp
movl %esp,%ebp
movl 8(%ebp),%eax
xorl 12(%ebp),%eax
sarl $17,%eax
andl $8185,%eax
movl %ebp,%esp
popl %ebp
ret
eax = x
eax = x^y (t1) eax = t1>>17 (t2) eax = t2 & 8185
Set Up
int logical(int x, int y)
{
int t1 = x^y;
int t2 = t1 >> 17;
int mask = (1<<13) - 7; int rval = t2 & mask; return rval; } Body Finish 213 =8192,213 –7=8185 movl 8(%ebp),%eax xorl 12(%ebp),%eax sarl $17,%eax andl $8185,%eax Mystery Function What does the following piece of code do? A. Add two variables B. Subtract two variables C. Swap two variables D. No idea movl 12(%ebp),%ecx movl 8(%ebp),%edx movl (%ecx),%eax movl (%edx),%ebx movl %eax,(%edx) movl %ebx,(%ecx) Rutgers University David Menendez 68 What does this function do? .globl foo .type foo, @function foo: pushl %ebp movl movl imull addl popl %ebp ret %esp, %ebp 16(%ebp), %eax 12(%ebp), %eax 8(%ebp), %eax Control Flow/Conditionals How do we represent conditionals in assembly? A conditional branch can implement all control flow constructs in higher level language ■ Examples: if/then, while, for A unconditional branch for constructs like break/ continue Condition Codes Single Bit Registers CF Carry Flag SF Sign Flag ZF Zero Flag OF Overflow Flag Can be set either implicitly or explicitly. ■ Implicitly by almost all logic and arithmetic operations ■ Explicitly by specific comparison operations Not Set by leal instruction ■ Intended for use in address computation only Rutgers University David Menendez 71 jX Instructions ■ Jumping Jump to different part of code depending on condition codes jX Condition Description jmp 1 Unconditional je ZF Equal / Zero jne ~ZF Not Equal / Not Zero js SF Negative jns ~SF Nonnegative jg ~(SF^OF)&~ZF Greater (Signed) jge ~(SF^OF) Greater or Equal (Signed) jl (SF^OF) Less (Signed) jle (SF^OF)|ZF Less or Equal (Signed) ja ~CF&~ZF Above (unsigned) jb CF Below (unsigned) Rutgers University David Menendez 72 Condition Codes Implicitly Set By Arithmetic Operations addl Src,Dest Canalog: t=a+b ■ CF set if carry out from most significant bit ●Used to detect unsigned overflow ■ ZFsetift==0 ■ SFsetift<0 ■ OF set if two’s complement overflow (a>0 && b>0 && t<0) || (a<0 && b<0 && t>=0)
Rutgers University David Menendez 73

Setting Condition Codes (cont.)
Explicit Setting by Compare Instruction
cmpl Src2,Src1
cmpl b,a like computing a-b without setting destination
NOTE: The operands are reversed. Source of confusion






CF set if carry out from most significant bit
● Used for unsigned comparisons
ZFsetifa==b
SFsetif(a-b)<0 OF set if two’s complement overflow (a>0 && b<0 && (a-b)<0) || (a<0 && b>0 && (a-
b)>0)
Rutgers University David Menendez 74

Setting Condition Codes (cont.)
Explicit Setting by Test instruction
testl Src2,Src1
■ SetsconditioncodesbasedonvalueofSrc1&Src2
● Useful to have one of the operands be a mask
■ testl b,a like computing a&b without setting destination ■ ZF set whena&b == 0
■ SFsetwhena&b<0 Rutgers University David Menendez 75 Conditional Branch Example _max: pushl %ebp movl %esp,%ebp Set Up Body movl 8(%ebp),%edx movl 12(%ebp),%eax cmpl %eax,%edx jle L9 movl %edx,%eax L9: movl %ebp,%esp popl %ebp ret Finish Conditional Branch Example _max: pushl %ebp movl %esp,%ebp Set Up Body int max(int x, int y) { if (x <= y) return y; else return x; } movl 8(%ebp),%edx movl 12(%ebp),%eax cmpl %eax,%edx jle L9 movl %edx,%eax L9: movl %ebp,%esp popl %ebp ret Finish Conditional Branch Example (Cont.) int max(int x, int y) { if (x <= y) return y; else return x; } int goto_max(int x, int y) { int rval = y; int ok = (x <= y); if (ok) goto done; rval = x; done: return rval; } movl 8(%ebp),%edx movl 12(%ebp),%eax # eax = x y goto L9 x cmpl %eax,%edx jle L9 movl %edx,%eax L9: # x : y # if <= # eax = # Done: ■ Generally considered bad coding style # edx = ■ C allows “goto” as means of transferring control ● Closer to machine- level programming style Rutgers University David Menendez 78 Skipped when x ≤ y .LC0: .string "%d” .text .globl foo .type foo, @function foo: pushl %ebp movl subl leal movl movl call scanf cmpl $4, -12(%ebp) je .L3 call explode_bomb .L3: leave .p2align 4,,3 ret Rutgers University David Menendez 79 %esp, %ebp $40, %esp -12(%ebp), %eax %eax, 4(%esp) $.LC0, (%esp) Mystery Function C Code “Do-While” Loop Example int fact_do(int x) { int result = 1; do { result *= x; x = x-1; } while (x > 1);
return result;
}
Rutgers University David Menendez 80

“Do-While” Loop Example
C Code Goto Version
int fact_do(int x)
{
int result = 1;
do {
result *= x;
x = x-1;
} while (x > 1);
return result;
}
int fact_goto(int x)
{
int result = 1;
loop:
result *= x;
x = x-1;
if (x > 1)
goto loop; return result;
}
■ ■
Use backward branch to continue looping Only take branch when “while” condition holds
Rutgers University David Menendez 81

“Do-While” Loop Compilation
Goto Version
Assembly
int fact_goto(int x)
{
int result = 1;
loop:
result *= x;
x = x-1;
if (x > 1)
goto loop; return result;
}
Registers
%edx x
%eax result
_fact_goto:
pushl %ebp
movl %esp,%ebp
movl $1,%eax
movl 8(%ebp),%edx
L11:
imull %edx,%eax
decl %edx
cmpl $1,%edx
jg L11
movl %ebp,%esp
popl %ebp
ret
# Setup
# Setup
# eax = 1
# edx = x
# result *= x
# x–
# Compare x : 1
# i