OSU CSE 2421
Required Reading: Computer Systems: A Programmer’s Perspective, 3rd Edition Chapter 3, Section 3.8 through 3.8.4 (inclusive)
J.E.Jones
OSU CSE 2421
Arrays
◦ One-dimensional
◦ Multi-dimensional (nested) ◦ Multi-level
Structures ◦ Allocation ◦ Access
◦ Alignment
J. E. Jones
OSU CSE 2421
Most General Form of the address expression Imm(Rb,Ri,S) Mem[Imm+ Reg[Rb]+S*Reg[Ri]]
or Address = Imm+Rb+Ri*S ◦ Imm: Constant “displacement”
It’s often a “displacement” of 1, 2, 4 or 8 bytes, but can be any constant value
◦ Rb:
◦ Ri:
◦ S:
◦ This form is seen often when referencing elements of arrays ◦ DON’T CONFUSE Imm and S!!
Base register: Any of the16 integer registers Index register: Any register except %rsp Scale: Only 1, 2, 4, or 8 (why these numbers?)
Imm can be *any* constant S can only be 1, 2, 4, or 8
J. E. Jones
OSU CSE 2421
Review
◦ If p is a pointer to data type T
◦ And the value of p (i.e., an address) is x_p
◦ Then, then p+i has value x_p + L*i
where, L is the size of data type T
◦ Thus, for an array A of elements, A[i] == *(A+i)
◦ In C the compiler takes care of multiplying i by L for us
Example
◦ int E[10]; /*Assume int is 4 bytes long */
C expression E
Type int *
Comment
Address to start of array
E[i] &(E[i]) E+i-1 *(E+i-3)
int int * int * int
Value of array element i Address of array element i Address of array element i-1 Value of array element i-3
J. E. Jones
OSU CSE 2421
X86-64 is consistent with what we saw in C to a point
◦ If p is a pointer to data type T
◦ And the value of p (i.e., an address) is x_p
◦ Then, then p+i has value x_p + L*i
where, L is the size of data type T
◦ Thus, for an array A of elements, A[i] == *(A+L*i)
◦ In x86-64 we must take care of multiplying i by L ourselves
Usually, the easiest way to do this is with Imm(Base, Index, Scale) constructs
Example
◦ int E[10]; /*Assume int is 4 bytes long */
Note memory address is 8-bytes
◦ Suppose rdx holds starting address of array E
C expression Type Assembly code result in rax/eax Comment
◦ Suppose rcx holds integer index i movl (%rdx,%rcx,4),%eax, or
Address copy
E int * movq %rdx, %rax
E[i] int &(E[i]) int * E+i-1 int *
movl E(,%rcx,4), %eax if E is a label leaq (%rdx,%rcx,4),%rax
leaq -4(%rdx,%rcx,4),%rax
movb -12(%rdx,%rcx,4),%al
Reference memory Generate address Generate address Reference memory
*(E+i-3) int
J. E. Jones
Note memory reference is 4-bytes
OSU CSE 2421
C declaration: type array[length]
Arrays store multiple data objects of the same type
Stored sequentially, often accessed as an offset from a pointer which points to the beginning of the array.
◦ size = length*sizeof(type)
If x is the address of the first byte of the first element in the array, then
array element i will be stored at address x+sizeof(type)*i
THIS DOESN’T CHANGE BETWEEN C AND X86!
◦ Nor should it since all C code gets compiled to assembler within gcc ◦ If it wasn’t consistent nothing would work!
J. E. Jones
OSU CSE 2421
Basic Principle T A[L];
◦ Array of data type T and length L
◦ Contiguously allocated region of L * sizeof(T) bytes in memory
char string[12]; if char *x==string
x
x + 12
val[2] val[3] val[4]
int val[5]; if int *x==val
val[0]
val[1]
double a[3]; if x==a
a[2]
char *p[3]; if x==p
p[0]
p[1]
p[2]
x
x + 4 a[0]
x + 8
x + 12 a[1]
x + 16
x + 20
x
x + 8
x + 16
x + 24 x + 24
x
x + 8
x + 16
J. E. Jones
OSU CSE 2421
Basic Principle T A[L];
◦ Array of data type T and length L
◦ Identifier A can be used as a pointer to array element 0: Type T*
Reference val[4]
Type
Value 3
x
x+4 x+8
val val+1 &val[2] val[5] *(val+1) val + i
int int * int * int * int int int *
??
5 x+4i
int val[5];
15213
x + 4 x + 8 x + 12 x + 16 x + 20
x
J. E. Jones
OSU CSE 2421
#define ZLEN 5
typedef int zip_dig[ZLEN];
zip_dig cmu = { 1, 5, 2, 1, 3 }; zip_digmit ={0,2,1,3,9}; zip_digucb ={9,4,7,2,0};
zip_dig cmu;
15213
zip_dig mit; zip_dig ucb;
02139
16 20 24 28 32 36
36 40 44 48 52 56
94720
56 60 64 68 72 76
Declaration zip_dig cmu equivalent to int cmu[5]
Example arrays were allocated in successive 20-byte blocks
◦ Not guaranteed to happen in general
J. E. Jones
OSU CSE 2421
zip_dig cmu;
15213
int get_digit(zip_dig z, int digit) {
Register %rdi contains starting address of array
return z[digit]; }
Register %rsi contains array index
X86‐64
Use memory reference (%rdi,%rsi,4)
# %rdi = z
# %rsi = digit
movl (%rdi,%rsi,4), %eax
# z[digit]
use movl instruction to move 4 bytes
16 20 24 28 32 36
Desired digit at %rdi + 4*%rsi
Use 4-byte register %eax as destination
J. E. Jones
OSU CSE 2421
# %rdi = z movq $0,%rax
#i=0
# goto Test
# loop:
# z[i]++
# i++
# middle
# i:4 (ZLEN-1)
# if <=, goto Loop
jmp Loop: incl
Test
incq Test:
%rax
cmpq
jle Loop ret
(%rdi,%rax,4)
$4, %rax
void zincr(zip_dig z) { size_t i;
for (i = 0; i < ZLEN; i++)
z[i]++; }
# ret
J. E. Jones
OSU CSE 2421
Consider the following C code: static int array[30];
static int x = array[25];
Which is equivalent to assembly code:
REMINDER: $ in assembly language with a label gives an address. array and x must have been defined in the data segment (.data section) of the program.
pushq %rbx
movq $array, %rbx
movq $25, %rcx
movl (%rbx,%rcx,4),%eax
movl %eax, $x
popq %rbx
Why can’t we combine the last 2 instructions? movl (%rbx, %rcx,4), $x
# %rbx is base register # %rcx is index register # %eax = array[25]
# x = %eax
J. E. Jones
OSU CSE 2421
Consider the following C code: static int array[30];
static int x = array[25];
Which is equivalent to assembly code:
REMINDER: $ in assembly language with a label gives an address. array and x must have been defined in the data segment (.data section) of the program.
movq $array, %rbx
movq $25, %rcx
movl (%rbx,%rcx,4),%eax movl %eax, $x
# %rbx is base register # %rcx is index register # %eax = array[25]
# x = array[25]
Why can’t we combine the last 2 instructions? movl (%rbx, %rcx,4), $x Because memory to memory moves are not legal in x86-64
J. E. Jones
OSU CSE 2421
static int array[30]; int x;
C code
.data
.align 4
array:
.skip 120, 0 x:
.long 0
X86-64 equivalent
J. E. Jones
OSU CSE 2421
C code:
int MyFunction1() {
int data[20];
... }
What are the class/scope/linkage of the array?
J. E. Jones
OSU CSE 2421
C code:
int MyFunction1() {
int data[20];
... }
What are the class/scope/linkage of the array? Where is the array located? Stack or Heap?
Automatic/Block/None
J. E. Jones
OSU CSE 2421
C code:
int MyFunction1() {
int data[20];
... }
What are the class/scope/linkage of the array?
Automatic/Block/None
Where is the array located? Stack or Heap?
Stack. So how do we do that in x86-64??
J. E. Jones
OSU CSE 2421
C code:
int MyFunction1() {
int data[20];
... }
x86-64 code:
MyFunction1: pushq %rbp
movq %rsp, %rbp subq $80,%rsp
# must do stack housekeeping first
leaq (%rsp), %rax
#using %rax as base register for int data[20] array
#OR movq %rsp, %rax movl (%rax,%rsi,4), %edx
...
#Allocate space for array
#on the stack: 20 elements, 4 bytes each=80 bytes
J. E. Jones
OSU CSE 2421
C code:
void MyFunction2() {
}
x86-64 code:
char buffer[6]; ...
MyFunction2: pushq %rbp
movq %rsp, %rbp subq $6,%rsp
# allocate 6 bytes for array
# %rax is base register for char buffer[6] # OR movq %rsp, %rax
leaq (%rsp), %rax
...
J. E. Jones
OSU CSE 2421
MyFunction2:
pushq %rbp
movq %rsp, %rbp subq $6, %rsp leaq (%rsp), %rax ...
#OR movq %rsp, %rax
When we enter MyFunction2: %rsp
Caller Ret Address 8-byte value
Higher Addresses
Lower Addresses
J. E. Jones
OSU CSE 2421
MyFunction2:
pushq %rbp
movq %rsp, %rbp subq $6, %rsp leaq (%rsp), %rax ...
#OR movq %rsp, %rax
1 byte chunks
After pushq %rbp : %rsp
Caller’s %rbp 8-byte value
Caller Ret Address 8-byte value
8
byte chunks
Lower Addresses
Higher Addresses
J. E. Jones
OSU CSE 2421
MyFunction2:
pushq %rbp
movq %rsp, %rbp
subq $6, %rsp leaq (%rsp), %rax ...
#OR movq %rsp, %rax
1 byte chunks
After movq %rsp, %rbp: %rsp
%rbp
Caller’s %rbp 8-byte value
Caller Ret Address 8-byte value
8
byte chunks
Lower Addresses
Higher Addresses
J. E. Jones
OSU CSE 2421
MyFunction2:
pushq %rbp
movq %rsp, %rbp subq $6, %rsp leaq (%rsp), %rax ...
%rax
buffer[0] 1-byte value
#OR movq %rsp, %rax
buffer[1] 1-byte value
1 byte chunks
%rsp
buffer[2] 1-byte value
After
buffer[3] 1-byte value
subq $6, %rsp
buffer[4] 1-byte value
%rax+6
buffer[5] 1-byte value
%rbp
Caller’s %rbp 8-byte value
%rax+12
Caller Ret Address 8-byte value
8
byte chunks
2nd byte of 8 byte %rbp
J. E. Jones
Lower Addresses
Higher Addresses
OSU CSE 2421
MyFunction2:
pushq %rbp
movq %rsp, %rbp subq $6, %rsp leaq (%rsp), %rax ...
%rax
buffer[0] 1-byte value
#OR movq %rsp, %rax
buffer[1] 1-byte value
1 byte chunks
%rsp
buffer[2] 1-byte value
After
buffer[3] 1-byte value
leaq (%rsp), %rax
buffer[4] 1-byte value
%rax+6
buffer[5] 1-byte value
%rbp
Caller’s %rbp 8-byte value
%rax+12
Caller Ret Address 8-byte value
8
byte chunks
2nd byte of 8 byte %rbp
J. E. Jones
Lower Addresses
Higher Addresses
OSU CSE 2421
MyFunction2:
pushq %rbp
movq %rsp, %rbp
subq $6, %rsp
leaq (%rsp), %rax #OR movq %rsp, %rax pushq %rbx %rsp
%rbx 8-byte value
...
buffer[2] 1-byte value
1 byte chunks
%rax
buffer[0] 1-byte value
%rax+6
buffer[4] 1-byte value
%rbp
Caller’s %rbp
8-byte value 8
%rax+12
2nd byte of 8 byte %rbp
Higher Addresses
J. E. Jones
buffer[1] 1-byte value
buffer[3] 1-byte value
buffer[5] 1-byte value
Caller Ret Address 8-byte value
byte chunks
Lower Addresses
OSU CSE 2421
MyFunction2:
pushq %rbp
movq %rsp, %rbp subq $6, %rsp leaq (%rsp), %rax pushq %rbx
%rax
%rbx 8-byte value
...
Note %rsp continues to change, but we still have %rax that contains the starting address of the array
buffer[2] 1-byte value
1 byte chunks
#OR movq %rsp, %rax
buffer[0] 1-byte value
%rsp
buffer[1] 1-byte value
%rax+6
buffer[4] 1-byte value
%rbp
Caller’s %rbp
8-byte value 8
%rax+12
2nd byte of 8 byte %rbp
Higher Addresses
J. E. Jones
buffer[3] 1-byte value
buffer[5] 1-byte value
Caller Ret Address 8-byte value
byte chunks
Lower Addresses
OSU CSE 2421
MyFunction2:
Lower Addresses
pushq %rbp
movq %rsp, %rbp
subq $6, %rsp
leaq (%rsp), %rax %rsp, %rax
#OR movq
%rax %rsp
%rbx 8-byte value
pushq %rbx
buffer[0] 1-byte value
...
buffer[1] 1-byte value
What happens if %rcx equals 12, and the code tries to access (%rax,%rcx,1)? For example, suppose %dl equals 5, and this instruction is executed:
buffer[2] 1-byte value
1 byte chunks
movb %dl, (%rax,%rcx,1)
buffer[5] 1-byte value
%rax+6
buffer[4] 1-byte value
%rbp
Caller’s %rbp
8-byte value 8
%rax+12
2nd byte of 8 byte %rbp
Higher Addresses
J. E. Jones
buffer[3] 1-byte value
Caller Ret Address 8-byte value
byte chunks
OSU CSE 2421
MyFunction2:
Lower Addresses
pushq %rbp
movq %rsp, %rbp
subq $6, %rsp
leaq (%rsp), %rax %rsp, %rax
#OR movq
%rbx 8-byte value
pushq %rbx
%rax %rsp
buffer[0] 1-byte value
...
What happens if %rcx equals 12, and the code tries to access (%rax,%rcx,1)? For example, suppose %dl equals 5, and this instruction is executed:
buffer[1] 1-byte value
movb %dl, (%rax,%rcx,1)
buffer[3] 1-byte value
Buffer Overflow! What was on the stack where we wrote 5??
%rax+6
buffer[4] 1-byte value
We may have written over something important!
buffer[5] 1-byte value
%rbp
Caller’s %rbp 8-byte value
5
8
byte chunks
%rax+12
Caller Ret Address 8-byte value
2nd byte of 8 byte %rbp
Higher Addresses
buffer[2] 1-byte value
1 byte chunks
J. E. Jones
OSU CSE 2421
Look for large allocation on the stack
Look for data references using a register other than %rsp or
%rbp as the base
StackArrayEx: pushq %rbp
movq %rsp, %rbp pushq %rbx
subq $520, %rsp
leaq (%rsp), %rbx
movl $0x0,(%rbx)
What options can you think of for how the array was declared?
#OR movq %rsp, %rbx #set first element to 0
J. E. Jones
OSU CSE 2421
Look for large allocation on the stack
Look for data references using a register other than %rsp or %rbp as the base StackArrayEx:
pushq %rbp
movq %rsp, %rbp
subq $520, %rsp
leaq (%rsp), %rbx
movl $0x0,(%rbx)
What options can you think of for how the array was declared?
char buffer[520];
#OR movq %rsp, %rbx #set first element to 0
J. E. Jones
OSU CSE 2421
Look for large allocation on the stack
Look for data references using a register other than %rsp or %rbp as the base StackArrayEx:
pushq %rbp
movq %rsp, %rbp
subq $520, %rsp
leaq (%rsp), %rbx
movl $0x0,(%rbx)
What options can you think of for how the array was declared?
char buffer[520];
short buffer[260]; Any of these would be options, right?
int buffer[130];
long buffer[65];
#OR movq %rsp, %rbx #set first element to 0
J. E. Jones
OSU CSE 2421
Look for large allocation on the stack
Look for data references using a register other than %rsp or %rbp as the base StackArrayEx:
pushq %rbp
movq %rsp, %rbp
subq $520, %rsp
leaq (%rsp), %rbx
movl $0x0,(%rbx)
What options can you think of for how the array was declared?
char buffer[520];
short buffer[260];
int buffer[130];
long buffer[65];
#OR movq %rsp, %rbx #set first element to 0
Any of these would be options, right?
This one is likely the correct one, since code above uses ‘l’ suffix to set first element to 0.
J. E. Jones
OSU CSE 2421
For the array on the preceding slide, how could the compiler generate code for a loop to initialize all the array elements to 0?
StackArrayEx: pushq %rbp
movq %rsp, %rbp subq $520, %rsp pushq %rbx
leaq (%rsp), %rbx
#base register
#OR movq %rsp, %rbx
# number of array elements
movq $130, %rcx
initialize:
decq %rcx
jl next
movl $0x0, (%rbx,%rcx,4) jmp initialize
# decrement index
# if less than 0 done Why not jle?? # set 4 bytes of memory to zero
# go again
next: ...
J. E. Jones
OSU CSE 2421
For the array on the preceding slides, what needs to happen with respect to cleanup before return?
buffer[0] 4-byte value
StackArrayEx:CREATION pushq %rbp
movq %rsp, %rbp
%rsp %rcx
buffer[1] 4-byte value
subq $520, %rsp leaq (%rsp), %rcx
buffer[4] 4-byte value
%rcx
#base register #OR movq %rsp,
%rbp
buffer[129] 4-byte value
buffer[2] 4-byte value
buffer[3] 4-byte value
Caller’s %rbp 8-byte value
Caller Ret Address 8-byte value
J. E. Jones
OSU CSE 2421
For the array on the preceding slides, what needs to happen with respect to cleanup before return?
buffer[0] 1-byte value
StackArrayEx:CREATION pushq %rbp
movq %rsp, %rbp
%rsp %rcx
buffer[1] 1-byte value
subq $520, %rsp leaq (%rsp), %rcx
#base register #OR movq %rsp,
buffer[4] 1-byte value
%rcx
Return:CLEANUP movq %rbp, %rsp
#leave instruction %rbp %rsp
popq %rbp ret
buffer[519] 1-byte value
%rbp
Caller’s %rbp 8-byte value
buffer[2] 1-byte value
buffer[3] 1-byte value
Caller Ret Address 8-byte value
J. E. Jones
OSU CSE 2421
“Global” Arrays (i.e. Static Class/Can be either File or Block Scope) Arrays of elements with initial values of 0 by default
◦ If stored in the .data section of application (i.e., static arrays) Accessed through a memory address
.section
.data staticArray1: .skip 48,0
# staticArray1 is 48 bytes long
# initialized to zero
# equivalent to static char staticArray1[48];
staticArray2: .long 1
.long 2
.long 3
# staticArray2 is 20 bytes long
# equivalent to
# static int staticArray2[5]={1,2,3,4,5}
.long 4 .long 5
J. E. Jones
OSU CSE 2421
“Global” Arrays (i.e. Static Class/Can be either File or Block Scope) Arrays of elements with initial values of 0 by default
◦ If stored in the data section of application (i.e., static arrays) Accessed through a memory address
MemArrayEx: pushq %rbp
movq %rsp, %rbp pushq %rbx
movq $staticArray1, %rsi movq $0x0, %rbx
movb $0x0,(%rsi,%rbx,1)
#base register #index register
#set 1st element to 0
J. E. Jones
OSU CSE 2421
If an array holds elements larger than 1 byte, the index will need to be multiplied by the size of the element
◦ The Scale element of the address computation takes care of that for us
#access to array of elements of size 4, #with scaling, where rax holds the index i, #and rbx is base register:
#e.g., arr[i] = 0x11223344
movl $0x11223344, (%rbx,%rax,4)
...
J. E. Jones
OSU CSE 2421
What if the array holds elements larger than 8 bytes? For example, what if it is an array of structures?
Recall that, in x86-64, scaling factors can only be: 1, 2, 4, or 8 so (%rbx, %rcx, 20) won’t compile!!
Therefore, for arrays with elements larger than 8 bytes, manual scaling must be used
What if we had an array of structures where each structure is 20 bytes??? #Here, two index registers are used, one
#for the conventional index (here, %rcx), 0 <= %rcx =< n -1
#and one for a scaled index register,
#here, %rax. This is “manual scaling.” 0<= %rax <= 20*(n-1)
movq $5, %rcx
# if we want to access element at index 5...
# signed multiply
# imulq aux, src, Dest
# manually scale index, NOTE! 3 operand mult
#%rax = 20*%rcx
imulq $20,%rcx,%rax
#Suppose we want: ptr = &arr[5] leaq (%rbx,%rax), %rdx
# what is in %rdx???
J. E. Jones
OSU CSE 2421
What if the array holds elements larger than 8 bytes? For example, what if it is an array of structures?
Recall that, in x86-64, scaling factors can only be: 1, 2, 4, or 8 so (%rbx, %rcx, 20) won’t compile!!
Therefore, for arrays with elements larger than 8 bytes, manual scaling must be used
What if we had an array of structures where each structure is 20 bytes??? #Here, two index registers are used, one
#for the conventional index (here, %rcx), 0 <= %rcx =< n -1
#and one for a scaled index register,
#here, %rax. This is “manual scaling.” 0<= %rax <= 20*(n-1)
movq $5, %rcx
# if we want to access element at index 5...
# signed multiply
# imulq aux, src, Dest
# manually scale index, NOTE! 3 operand mult
#%rax = 20*%rcx
imulq $20,%rcx,%rax
#Suppose we want: ptr = &arr[5] leaq (%rbx,%rax), %rdx
# what is in %rdx???
Address to beginning of structure in an array!
J. E. Jones
OSU CSE 2421
What if the array holds elements larger than 8 bytes? For example, what if it is an array of structures?
Recall that, in x86-64, scaling factors can only be: 1, 2, 4, or 8 so (%rbx, %rcx, 20) won’t compile!!
Therefore, for arrays with elements larger than 8 bytes, manual scaling must be used
What if we had an array of structures where each structure is 20 bytes??? #Here, two index registers are used, one
#for the conventional index (here, %rcx),
#and one for a scaled index register,
#here, %rax. This is “manual scaling.” movq $0, %rcx
imulq $20,%rcx,%rax
#Suppose we want: ptr = &arr[i] leaq (%rbx,%rax), %rdx
# signed multiply
# imulq aux, src, Dest
# manually scale index, NOTE! 3 operand mult
# what is in %rdx???
Address to beginning of structure in an array!
J. E. Jones