L5_1 Assembly_Data-Layout
EECS 370 – Introduction to Computer Organization – Fall 2020
EECS 370 – Introduction to Computer Organization – © Bill Arthur 1 The material in this presentation cannot be copied in any form without written permission
Learning Objectives
• Understand mapping of C-code data structures (struct) to data layout in memory (e.g., stack)
EECS 370 – Introduction to Computer Organization
2
Reminders
• P1a due Thursday!!! • Midterm exam 10/20
• Complete request for alternate or submit requests for accommodations by 9/22
• Drop by office hours
• PrOffice hours on the calendar
EECS 370 – Introduction to Computer Organization
3
Resources
• Video reviews of many topics!
• https://www.eecs.umich.edu/courses/eecs370/eecs370.f20/video_reviews/
EECS 370 – Introduction to Computer Organization
4
Converting C to assembly – Example #2
Write ARM assembly code for the following C expression (assume an int is 4 bytes, unsigned char is 1 byte)
Register to variable mapping
X1→pointer to y
C-code instructions
struct { int a; unsigned char b, c; } y; y.a =y.b+y.c;
ARM LEGv8
LDURB X2, [X1, #4] // load y.b
LDURB X3, [X1, #5] // load y.c
ADD X4, X2, X3 // calculate y.b + y.c STURW X4, [X1, #0] // store y.a
See supplemental video for detailed explanation
How do you determine offsets for struct sub-fields? THIS lecture will detail
EECS 370 – Introduction to Computer Organization
5
Calculating Load/Store Addresses
Problem: Calculate the total amount of memory needed for the struct instance x Assume data memory starts at address 1000
Datatype
short
char
int
double
size (bytes)
2
1
4
8
C-code
short a[100]; char b;
int c;
double d; short e; struct {
char f;
int g[1];
char h;
} x;
EECS 370 – Introduction to Computer Organization
6
Calculating Load/Store Addresses
Problem: Calculate the total amount of memory needed for the struct instance x Assume data memory starts at address 1000
C-code
short a[100]; char b;
int c;
double d; short e; struct {
char f;
int g[1];
char h;
} x;
Datatype
short
char
int
double
size (bytes)
2 1 4 8
EECS 370 – Introduction to Computer Organization
7
Calculating Load/Store Addresses
Problem: Calculate the total amount of memory needed for the struct instance x Assume data memory starts at address 1000
Datatype
short
char
int
double
size (bytes)
2
1
4
8
C-code
short a[100]; char b;
int c;
double d; short e; struct {
char f;
int g[1];
char h;
} x;
Solution???
a = 2 bytes * 100 = 200 b = 1 byte
c = 4 bytes
d = 8 bytes
e = 2 bytes
x = 1 + 4 + 1 = 6 bytes
Total: 221 bytes???
Correct or incorrect?
EECS 370 – Introduction to Computer Organization
8
Memory layout of variables
• Most modern ISAs require that data be aligned. • What do we mean by alignment in this context?
• An N-byte variable must start at an address A, such that (A % N) == 0
• “Golden” rule – Address of a variable is aligned based on the size of
the variable
• char is byte aligned (any address is fine)
• short is half-word (H) aligned (LSBit of address must be 0) • int is word aligned (W) (2 LSBit’s of addr must be 0)
• This greatly simplifies hardware needed for loads and stores
• Otherwise, multiple memory accesses need to be used to access one piece of
data
EECS 370 – Introduction to Computer Organization
9
Structure Alignment
• Each field is laid out in the order it is declared using the Golden Rule for alignment
• Identify largest (primitive) field
• Starting address of overall struct is aligned based on the largest field
• Size of overall struct is a multiple of the largest field
• Reason for this is so we can have an array of structs
• Guarantees that each instance of struct is aligned the same way
EECS 370 – Introduction to Computer Organization
10
Structure Alignment – Example
Datatype
size (bytes)
short
2
char
1
int
4
double
8
Problem: What boundary should this struct be aligned to? What is the total size of the struct?
C-code
struct {
char w;
int x[3]
char y;
short z;
} s;
EECS 370 – Introduction to Computer Organization
11
Structure Alignment – Example
Problem: What boundary should this struct be aligned to? What is the total size of the struct?
C-code
struct {
char w;
int x[3]
char y;
short z;
} s;
Datatype size (bytes)
short 2 char 1 int 4 double 8
EECS 370 – Introduction to Computer Organization
12
Structure Alignment – Example
Datatype
size (bytes)
short
2
char
1
int
4
double
8
Problem: What boundary should this struct be aligned to? What is the total size of the struct?
C-code
struct {
char w;
int x[3]
char y;
short z;
} s;
Largest field is 4 bytes (int), therefore:
• struct size will be multiple of 4
• structstartingaddressiswordaligned,
since a word is 4 bytes
EECS 370 – Introduction to Computer Organization
13
Assume struct starts at address 1000, what is the data layout of the struct?
Structure Alignment – Example
Datatype
size (bytes)
short
2
char
1
int
4
double
8
Problem: What boundary should this struct be aligned to? What is the total size of the struct?
C-code
struct {
char w;
int x[3]
char y;
short z;
} s;
Largest field is 4 bytes (int), therefore:
• struct size will be multiple of 4
• structstartingaddressiswordaligned,
since a word is 4 bytes
Assume struct starts at address 1000, what is the data layout of the struct?
char w -> 1000
// padding 1001-1003
x[0] -> 1004-1007 x[1] -> 1008-1011 x[2] -> 1012-1015 char y -> 1016
// padding 1017 short z -> 1018-1019
start: 1000
end: 1019
Total size = 20 bytes
Why padding?
“Golden” rule – Address of a variable is aligned based on the size of the variable
EECS 370 – Introduction to Computer Organization
14
Calculating Load/Store Addresses – 2nd Try
Datatype
size (bytes)
short
2
char
1
int
4
double
8
Problem: Calculate the total amount of memory needed for the declarations. Assume data memory starts at address 100
C-code
short a[100]; char b;
int c;
double d; short e; struct {
char f; int g[1]; char h;
} x;
EECS 370 – Introduction to Computer Organization
15
An N-byte variable must start at an address A, such that
(A % N) == 0
Calculating Load/Store Addresses – 2nd Try
Problem: Calculate the total amount of memory needed for the declarations. Assume data memory starts at address 100
C-code
short a[100]; char b;
int c;
double d; short e; struct {
char f; int g[1]; char h;
} x;
Datatype size (bytes)
short 2 char 1 int 4 double 8
An N-byte variable must start at an address A, such that
(A % N) == 0
EECS 370 – Introduction to Computer Organization
16
Calculating Load/Store Addresses – 2nd Try
Datatype
size (bytes)
short
2
char
1
int
4
double
8
Problem: Calculate the total amount of memory needed for the declarations. Assume data memory starts at address 100
C-code
short a[100]; char b;
int c;
double d; short e; struct {
char f; int g[1]; char h;
} x;
C-code
short a[100];
char b;
int c;
double d;
short e;
struct {
char f;
int g[1];
char h;
} x;
Bytes
200
1
3
4
4
8
2
2
12
1
3
4
1
3
12
Start
100
300
301
304
308
312
320
322
324
324
325
328
332
333
324
End
299
303
307
311
319
321
323
335
324
327
331
332
335
335
Notes
300
padding
padding
padding
largest field: 4 bytes
padding
padding
An N-byte variable must start at an address A, such that
(A % N) == 0
Total size: 236 bytes
EECS 370 – Introduction to Computer Organization
17
Pause
The next example is a review of Lecture 5 worksheet 1. Pause, complete the worksheet, then proceed.
EECS 370 – Introduction to Computer Organization
18
Example 2
Datatype
size (bytes)
short
2
char
1
int
4
double
8
address
4
Problem: Calculate the total amount of memory needed for the declarations. Assume data memory starts at address 200
C-code
int a; struct {
double b;
char c;
int d;
} x;
char *f; short g[20];
EECS 370 – Introduction to Computer Organization
19
Example 2
Datatype
size (bytes)
short
2
char
1
int
4
double
8
address
8
Problem: Calculate the total amount of memory needed for the declarations. Assume data memory starts at address 200
C-code
Bytes
Start
End
Notes
int a;
4
200
203
4
204
207
padding
struct {
16
208
223
largest field: 8 bytes
double b;
8
208
215
char c;
1
216
216
3
217
219
padding
int d;
4
220
223
} x;
16
208
223
char *f;
8
224
231
short g[20];
40
232
271
TOTAL:
72
200
271
C-code
int a; struct {
double b;
char c;
int d;
} x;
char *f; short g[20];
EECS 370 – Introduction to Computer Organization
20
Data Layout – Why?
• Does gcc (or another compiler) reorder variables in memory to avoid padding?
• No, a compiler will not optimize data layout to remove padding.
• C99 standard prohibits this
• Memory is laid out in order of declaration for structs.
• gcc has implemented an option for this, then later removed it
• The programmer (i.e., you) are expected to manage data layout of variables for your program and structs.
• For a start: order fields in struct by datatype size, smallest first EECS 370 – Introduction to Computer Organization
21
Logistics
• There are 3 videos for lecture 5 • L5_1 – Assembly_Data-Layout
• L5_2 – Assembly_Flow-Control
• L5_3 – C-to-Assembly_Examples
• There are two worksheets for lecture 5
1. Data Layout – you can do this now
2. C to Assembly
EECS 370 – Introduction to Computer Organization
22
L5_2 Assembly_Flow-Control
EECS 370 – Introduction to Computer Organization – Fall 2020
EECS 370 – Introduction to Computer Organization – © Bill Arthur 23 The material in this presentation cannot be copied in any form without written permission
Learning Objectives
• Recognize the set of branching instructions for ARM ISA and be able to describe the operations and operands for instructions
• LEGv8 subset
• Understand mapping of complex C-code branching instructions into
corresponding assembly code instructions
EECS 370 – Introduction to Computer Organization
24
ARM/LEGv8 Sequencing Instructions
• Sequencing instructions change the flow of instructions that are executed • This is achieved by modifying the program counter (PC)
• Unconditional branches are the most straightforward
they ALWAYS change the PC and thus “jump” to another instruction out of the usual sequence
• Conditional branches
if (condition_test) goto target_address
• condition_test examines the four flags from the processor
status word (SPSR)
• target_address is the 19-bit signed word displacement from
current PC
EECS 370 – Introduction to Computer Organization
25
LEGv8 Conditional Instructions
• Two varieties of conditional branches
1. One type compares a register to see if it is equal to zero.
2. Another type checks the condition codes set in the status register.
• Let us look at the first type: CBZ and CBNZ
• CBZ – Conditional Branch if Zero
• CBNZ – Conditional Branch if Not Zero
EECS 370 – Introduction to Computer Organization
26
LEGv8 Compare and Branch Instructions
• CBZ/CBNZ: test a register against zero and branch to a PC relative address
• The relative address is a 19-bit signed integer—the number of instructions. Recall instructions are 32 bits of 4 bytes
ARM LEGv8
Description
CBNZ X3, foo CBNZ X3, 25
• if X3 does not equal 0, then branch to label foo
• 25 is an offset from the PC of the current instruction (CBNZ)
EECS 370 – Introduction to Computer Organization
27
LEGv8 Compare and Branch Instructions
• CBZ/CBNZ: test a register against zero and branch to a PC relative address
• The relative address is a 19-bit signed integer—the number of instructions. Recall instructions are 32 bits of 4 bytes
Why does 25 in the table result in PC + 100?
Offset is # of instructions (words)
EECS 370 – Introduction to Computer Organization
28
Conditional Branch Offset Example
Problem: Calculate the numerical offset for the CBNZ instruction. ARM LEGv8
loop:
ADDI X3, X3, #1 SUBI X4, X4, #1 ADD X5, X3, X4 CBNZ X5, loop
EECS 370 – Introduction to Computer Organization
29
Conditional Branch Offset Example
Problem: Calculate the numerical offset for the CBNZ instruction.
Answer: -3
Offset field: 19-bit, signed 111 1111 1111 1111 1101
ARM LEGv8
loop: ADDI X3, X3, #1 SUBI X4, X4, #1 ADD X5, X3, X4
CBNZ X5, loop
The assembler will calculate the offset
If any instructions are added or removed while writing code, using a label saves from recalculating the offset
EECS 370 – Introduction to Computer Organization
30
Conditional Branch Offset Example
How is the branch target address calculated?
ARM LEGv8
CBNZ X5, #-3
EECS 370 – Introduction to Computer Organization
31
Conditional Branch Offset Example
How is the branch target address calculated? ARM LEGv8
CBNZ X5, #-3
EECS 370 – Introduction to Computer Organization
32
Conditional Branch Offset Example
How is the branch target address calculated?
ARM LEGv8
CBNZ X5, #-3
1. Offset field: 19 bits (-3 decimal)
2. Append two zeros
3. Sign extend to 64 bits
4. Add offset to PC of CBNZ instruction
EECS 370 – Introduction to Computer Organization
33
111 1111 1111 1111 1101
1 1111 1111 1111 1111 0100
1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 0100
LEGv8 Conditional Instructions Using FLAGS
• FLAGS: NZVC record the results of (arithmetic) operations
Negative, Zero, oVerflow, Carry—not present in LC-2K
• We explicitly set them using the “Set” modification to ADD/SUB etc.
ARM LEGv8
Description
ADDS X1, X2, X3
Causes the 4 flag bits to be set accordingly as the outcome is negative, zero, overflows, or generates a carry bit
EECS 370 – Introduction to Computer Organization
34
LEGv8 Condition Codes
• In LEGv8 only ADDS / SUBS / ADDIS / SUBIS / CMP /CMPI set the condition codes FLAGs or condition codes in PSR—the program status register
• Four primary condition codes evaluated:
• N – set if the result is negative (i.e., bit 63 is non-zero)
• Z – set if the result is zero (i.e., all 64 bits are zero)
• C – set if last addition/subtraction had a carry/borrow out of bit 63
• V – set if the last addition/subtraction produced an overflow (e.g., two negative numbers added together produce a positive result)
• Do not worry about the C and V bits per se. They are important but are tricky to understand.
• Instead we will just be using branches based on these results for signed numbers which is a lot easier to deal with.
EECS 370 – Introduction to Computer Organization
35
Conditional Branches
• CMP instruction lets you compare two registers, set NZVC flags • Could also use ADDS etc.
• B.condition lets you branch based on the flags set by CMP (ADDS, etc.)
ARM LEGv8
Description
CMP X1, X2
B.GT label1
Branches to label1 if value in register X1 greater than value in register X2
• What is the set of conditions for B.condition ? EECS 370 – Introduction to Computer Organization
36
LEGv8 Conditional Instructions
ARM LEGv8
CMP X1, X2 B.GT label1
CMP X3, X4 B.EQ label2
CMP X5, X6 B.LE label3
You need to know the 7 with red arrows
EECS 370 – Introduction to Computer Organization
37
Branching Far Away
• The underlying philosophy of ISA design and microachitecture in general is to make the common case fast
• In the case of branches, you are commonly going to branch to other instructions nearby.
• In ARMv8, the encoding for the displacement of conditional branches is 19 bits. • Having a displacement of 19 bits is usually enough
• BUT what if we need jump to a target (Label) that we cannot get to with a 19-bit displacement from the current PC?
CBZ X15, FarLabel
• The assembler is smart enough to replace that with
• The simple branch instruction (B) has a 26-bit
offset which spans about 64 million instructions!
EECS 370 – Introduction to Computer Organization
38
CBNZ X15, L1
B FarLabel L1:
Unconditional Branching Instructions
• There are three types of unconditional branches in the LEGv8 ISA.
• The first (B) is the PC relative branch with the 26-bit offset from the last slide. • The second (BR) jumps to the address contained in a register (X30 above)
• The third (BL) is like our PC relative branch but it does something else.
• It sets X30 (always) to be the current PC+4 before it branches. • Why?
• Function calls – return to next instruction
EECS 370 – Introduction to Computer Organization
39
Logistics
• There are 3 videos for lecture 5 • L5_1 – Assembly_Data-Layout
• L5_2 – Assembly_Flow-Control
• L5_3 – C-to-Assembly_Examples
• There are two worksheets for lecture 5
1. Data Layout
2. C to Assembly – wait until after next video
EECS 370 – Introduction to Computer Organization
40
L5_3 C-to-Assembly_Examples
EECS 370 – Introduction to Computer Organization – Fall 2020
EECS 370 – Introduction to Computer Organization – © Bill Arthur 41 The material in this presentation cannot be copied in any form without written permission
Learning Objectives
• Translate C-code statements to ARM assembly code
• Break down complex C-code branching instructions into a series of assembly
operations
• Map conditions in C to comparison and branch instructions in assembly
EECS 370 – Introduction to Computer Organization
42
Branching – Example
Problem: Convert the C code to LEGv8 assembly: 1: using labels, 2. without labels
Register to variable mapping
X1→x X2→y
C-code instructions
int x, y;
// update x,y
if (x == y)
x++; else
y++;
EECS 370 – Introduction to Computer Organization
43
Branching – Example
Problem: Convert the C code to LEGv8 assembly: 1: using labels, 2. without labels
Register to variable mapping
X1→x X2→y
C-code instructions
int x, y;
// update x,y
if (x == y)
x++; else
y++;
EECS 370 – Introduction to Computer Organization
44
Branching – Example
Problem: Convert the C code to LEGv8 assembly: 1: using labels, 2. without labels
ARM LEGv8 – w/labels
CMP X1, X2 B.NE lbl1
ADDI X1, X1, #1 B lbl2
lbl1: ADDI X2, X2, #1
lbl2:
Register to variable mapping
X1→x X2→y
C-code instructions
int x, y;
// update x,y
if (x == y)
x++; else
y++;
EECS 370 – Introduction to Computer Organization
45
Branching – Example
Problem: Convert the C code to LEGv8 assembly: 1: using labels, 2. without labels
ARM LEGv8 – w/o labels
CMP X1, X2 B.NE 3
ADDI X1, X1, #1 B2
lbl1: ADDI X2, X2, #1
lbl2:
Register to variable mapping
X1→x X2→y
C-code instructions
int x, y;
// update x,y
if (x == y)
x++; else
y++;
EECS 370 – Introduction to Computer Organization
46
Loop – Example
Problem: Convert the C code to LEGv8 assembly (assume no registers initialized)
Register to variable mapping
X1→i
X2→sum
X4→#10
X5→a[i]
X6→i*8
a is array of long long integers (64 bits, 8 bytes) Start of a at address 100, sum starts at address 96
C-code instructions
sum = 0;
for (i=0 ; i < 10 ; i++) {
if (a[i] >= 0) { sum += a[i];
} }
EECS 370 – Introduction to Computer Organization
47
Loop – Example
Problem: Convert the C code to LEGv8 assembly (assume no registers initialized)
C-code instructions
sum = 0;
for (i=0 ; i < 10 ; i++) {
if (a[i] >= 0) { sum += a[i];
} }
Register to variable mapping
X1→i
X2→sum
X4→#10
X5→a[i]
X6→i*8
a is array of long long integers (64 bits, 8 bytes) Start of a at address 100, sum starts at address 96
EECS 370 – Introduction to Computer Organization
48
Loop – Example
Problem: Convert the C code to LEGv8 assembly (assume no registers initialized)
Register to variable mapping
X1→i
X2→sum
X4→#10
X5→a[i]
X6→i*8
a is array of long long integers (64 bits, 8 bytes) Start of a at address 100, sum starts at address 96
ARM LEGv8
MOV X1, XZR MOV X2, XZR MOVZ X4, #10
Loop1: CMP
B.GE endLoop
LSL X6, X1, #3 LDUR X5, [X6, #100] CMPI X5, #0
B.LT endif
ADD X2, X2, X5 STURW X2,[XZR,#96]
endif: ADDI
B Loop1
endLoop:
X1, X4
X1, X1, #1
C-code instructions
sum = 0;
for (i=0 ; i < 10 ; i++) {
if (a[i] >= 0) { sum += a[i];
} }
EECS 370 – Introduction to Computer Organization
49
Loop – Example
Alternate Solution: do-while
Register to variable mapping
X1→i
X2→sum
X4→#10
X5→a[i]
X6→i*8
a is array of long long integers (64 bits, 8 bytes) Start of a at address 100, sum starts at address 96
ARM LEGv8
MOV X1, XZR MOV X2, XZR MOVZ X4, #10
Loop1: LSL
LDUR X5, [X6, #100]
CMPI X5, #0
B.LT endif
ADD X2, X2, X5 STUR X2, [XZR, #96]
endIf: ADDI
CMP X1, X4
B.LT Loop1 endLoop:
X6, X1, #3
X1, X1, #1
C-code instructions
sum = 0;
for (i=0 ; i < 10 ; i++) {
if (a[i] >= 0) { sum += a[i];
} }
EECS 370 – Introduction to Computer Organization
50
Logistics
• There are 3 videos for lecture 5 • L5_1 – Assembly_Data-Layout
• L5_2 – Assembly_Flow-Control
• L5_3 – C-to-Assembly_Examples
• There are two worksheets for lecture 5
1. Data Layout
2. C to Assembly – can do this now
EECS 370 – Introduction to Computer Organization
51