CS2421 Autumn 2013
CSE 2421
X86-64 Assembly Language – Part 3: assembler directives, first programs, arithmetic instructions, stack
Required Reading: Computer Systems: A Programmer’s Perspective, 3rd Edition Chapter 3, Section 3.5 through 3.5.4 (inclusive)
Assembler directives (“pseudo-ops”)
.file
Allows a name to be assigned to the assembly language source code file.
.section
This makes the specified section the current section.
.rodata
Specifies that the following data is to be placed in the read only memory portion of the executable
.data
Changes or sets the current section to the data section
.text
Changes or sets the current section to the text (or code) section
Assembler directives (continued)
.globl
A directive needed by the linker for symbol resolution: followed by name of function
.type
Needed by the linker to identify the label as one associated with a function, as opposed to data
.size
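Needed by the linker to identify the size of the symbol; for a function such as main, this is the size of its code (written as .-main, the current address minus the address of the label)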
Note: labels (for functions or data) in assembly language source code are followed by a colon.
Data size assembler directives
What if I want to “declare” static storage class variables?
In the .data or .rodata section of your program use:
.quad value
Places the given value, (0x prefix for hex, no prefix for decimal) in memory, encoded in 8 bytes
.long value
Places the given value, (0x prefix for hex, no prefix for decimal) in memory, encoded in 4 bytes
.word value
Places the given value, (0x prefix for hex, no prefix for decimal) in memory, encoded in 2 bytes
.byte value
Places the given value, (0x prefix for hex, no prefix for decimal) in memory, encoded in 1 byte
.string
Specifies that the characters enclosed in quotation marks are to be stored in memory, terminated by a null byte
First X86 program
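.file "first.s"
.section .rodata
.data
.align 8
Array:
.quad 0x6f
.quad 0x84
.globl main
.type main, @function
.text
main:
pushq %rbp            # save caller's frame pointer
movq %rsp, %rbp       # set our frame pointer
movq $55,%rdx         # %rdx = 55 (decimal)
movq %rdx, %rbx       # %rbx = %rdx
movq $Array, %rax     # %rax = address of Array
movq %rbx,8(%rax)     # second quad of Array = 55
movq (%rax),%rcx      # %rcx = first quad of Array (0x6f)
leave                 # restore caller's stack frame
ret
.size main, .-main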
Second X86 program
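.file "second.s"
.section .rodata
.data
.align 8
Array:
.quad 0x6f
.quad 0x84
.quad 0x55
.quad 0x44
.globl main
.type main, @function
.text
main:
pushq %rbp            # save caller's frame pointer
movq %rsp, %rbp       # set our frame pointer
movq $55,%rdx         # %rdx = 55 (decimal)
movq %rdx, %rbx       # %rbx = %rdx
movq $0x33, %r8       # %r8 = 0x33
movq $Array, %rax     # %rax = address of Array
movq %rbx,8(%rax)     # second quad of Array = 55
movq %r8, 24(%rax)    # fourth quad of Array = 0x33
movq %rax,(%rax)      # first quad of Array = address of Array
movq (%rax),%rcx      # %rcx = first quad of Array (the address of Array)
leave                 # restore caller's stack frame
ret
.size main, .-main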
Some Arithmetic Operations
Two Operand Instructions:
Format            Computation
add  Src,Dest     # Dest = Dest + Src
sub  Src,Dest     # Dest = Dest - Src
imul Src,Dest     # Dest = Dest * Src        signed multiply
mul  Src          # %rdx:%rax = %rax * Src   unsigned multiply; takes one operand only, listed here for comparison
sal  Src,Dest     # Dest = Dest << Src       Also called shl
sar  Src,Dest     # Dest = Dest >> Src       Arithmetic (fills w/copy of sign bit)
shr  Src,Dest     # Dest = Dest >> Src       Logical (fills with 0s)
xor  Src,Dest     # Dest = Dest ^ Src
and  Src,Dest     # Dest = Dest & Src
or   Src,Dest     # Dest = Dest | Src
Watch out for argument order!
Except for mul and the right shifts (sar vs. shr), there is no distinction between signed and unsigned int (why?)
Don’t forget to include a suffix for each of these instructions.
The multiply instruction has other options and the divide instruction is a completely different animal. We’ll look at them later.
Arithmetic Operation Examples
Format                      Computation
addq %rax,%rcx              # %rcx = %rcx + %rax
addl %eax, %ecx             # %ecx = %ecx + %eax
subw %ax, %cx               # %cx = %cx - %ax
subb %al, %cl               # %cl = %cl - %al
imulq %rax,%rcx             # %rcx = %rcx * %rax
imull %eax, %ecx            # %ecx = %ecx * %eax
sarw $3, %cx                # %cx = %cx >> 3
andl $0x0f0f0f0f, %ecx      # %ecx = %ecx & 0x0f0f0f0f
Some Arithmetic Operations
One Operand Instructions
inc Dest    # Dest = Dest + 1
dec Dest    # Dest = Dest - 1
neg Dest    # Dest = -Dest
not Dest    # Dest = ~Dest
See book for more instructions (Figure 3.10)
Obviously, each of these instructions must use the appropriate suffix based on the Destination size
More Arithmetic Examples
One Operand Instructions
incq %rax    # %rax = %rax + 1
decl %eax    # %eax = %eax - 1
negw %ax     # %ax = -%ax
notb %al     # %al = ~%al
Arithmetic Expression Example
(z+x+y)*((x+4)+(y*48))
long arith(long x, long y, long z)
{
long t1, t2, t3, t4, t5, rval;
t1 = x+y;
t2 = z+t1;
t3 = x+4;
t4 = y * 48;
t5 = t3 + t4;
rval = t2 * t5;
return rval;
}
arith:
leaq (%rdi,%rsi), %rax      # t1 = x + y
addq %rdx, %rax             # t2 = z + t1
leaq (%rsi,%rsi,2),%rdx     # %rdx = 3*y
salq $4, %rdx               # t4 = 16 * (3*y) = 48*y
leaq 4(%rdi,%rdx), %rcx     # t5 = x + 4 + t4
imulq %rcx, %rax            # rval = t2 * t5
ret
Register Use(s)
%rdi Argument x
%rsi Argument y
%rdx Argument z
%rax t1, t2, rval
%rdx t4 (reused once z has been added into %rax)
%rcx t5
Interesting Instructions
leaq: address computation
salq: shift arithmetic left
imulq: signed multiply
But, only used once
X86 program stack
The program stack is divided conceptually into frames.
Each procedure or function (main and any functions called from main or from another function) has its own part of the stack to use, which is called its stack frame.
The stack frame extends from the stack address held in %rbp, which is called the frame (or base) pointer, to the address held in %rsp, which points to the top of the stack while the procedure is running.
This implies that the address pointed to by %rbp is different in different procedures: %rbp must be set when the procedure is entered.
X86 Stack
Stack top address always held in register %rsp
Stack grows towards lower addresses
Where is %rbp???
That depends…
[Figure: stack diagram. The stack “Bottom” is at the highest address and the stack “Top” is at the lowest address; %rsp points to the stack “Top”.]
Uses of the stack in X86-64
To save the caller’s %rbp (frame pointer) before setting our own frame pointer;
To preserve values that are still needed before calling another function;
To pass parameters to another function (if there are more than 6 parameters to pass);
To store the return address when a call instruction is executed;
To hold temporary data when there are more values than available registers (i.e. automatic, block scope variables).
Put something on the stack
pushX source, where X is the suffix q or the suffix w
What it does:
Decrement %rsp by the number of bytes specified by the opcode suffix, then write the source value (of that size) to the memory address in %rsp. This puts the value on top of the stack.
Note: Because operands of different sizes can be pushed, the number of bytes which is subtracted from the stack pointer depends on the operand size suffix. Since %rbp and %rsp contain stack addresses, they are always referenced as an 8-byte value.
Syntax
pushX source
Examples
pushq %rax: subtract 8 bytes from the value in %rsp, and then copy the value in %rax onto the stack at the address pointed to by %rsp.
pushw %ax: subtract 2 bytes from the value in %rsp, and then copy the value in %ax onto the stack at the address pointed to by %rsp.
To avoid data alignment calculations/confusion with respect to the stack, we will only be pushing/popping 8 byte values for this class.
Note: pushl and pushb are not valid instructions in x86-64, although they are valid on 32-bit processors.
Take something off the stack
popX destination, where X is the suffix q or the suffix w
Read the number of bytes specified by the suffix from the address in %rsp and store it in destination; increment %rsp by number of bytes specified by opcode suffix so that %rsp now points to the next value on the stack.
Syntax
popX destination
Examples
popw %ax: copy 2 bytes from the stack into %ax, and add 2 bytes to %rsp.
popq %rax: copy 8 bytes from the stack into %rax, and add 8 bytes to %rsp.
To avoid data alignment calculations/confusion with respect to the stack, we will only be pushing/popping 8 byte values for this class.
Note: popl and popb are not valid instructions in x86-64, although they are valid on 32-bit processors.
Function calls and returns
To use function calls and returns in our X86 program, we have to manage the program stack and program registers correctly.
Two different aspects to this:
Maintain the stack pointer, frame pointer and associated data in relation to each function call and return. (The OS initializes these values when the program starts.)
Place appropriate values in “some” registers as expected by a calling or called function. More on this later.
Setting up the program stack
In X86 programs, for each function, you must set up the stack frame in your assembly language source code.
There are three things to do:
At the start of a function:
1. Set %rbp to point to the bottom of the current stack frame.
2. Set %rsp to point to the top of the stack (the same address as the stack bottom initially).
At the end of a function:
3. Restore %rbp and %rsp to what they were before the function was called (we will see how to do this below)
Call instruction
call Dest
Dest is a label which has been placed in the assembly language source code at the address of the procedure to be called.
call instruction does 2 things:
1. the address of the instruction immediately after the call instruction is pushed onto the stack (that is, the return address is pushed), and
2. the Dest (remember it’s an address) is assigned to the PC (%rip) register.
This means that, when the called function begins execution, the return address to the calling function is the last thing that has been pushed onto the stack.
[Figure: System stack before executing the call instruction. %rsp points to “Some stack value”. Code shown: call Swap_it …]
[Figure: System stack after executing the call instruction. The caller's return address (8 byte value) has been pushed at a lower address than “Some stack value”, and %rsp points to it. %rip holds the address of Swap_it. Code shown: call Swap_it …]
Setting up our stack frame
Part 1:
pushq %rbp # Save caller’s base pointer
movq %rsp, %rbp # Set my base pointer
Put these two instructions at the beginning of your function before any other statements!
* Notice that, since %rbp equals %rsp, our stack frame is empty.
* We are now ready to use the stack!
Part 2:
leave # set caller’s stack frame back up
Put this statement directly before the ret instruction in any function in your program.
[Figure: System stack after pushq %rbp. The caller’s %rbp (8 byte value) has been pushed at a lower address than the caller's return address (8 byte value), and %rsp points to it. Code shown: Swap_it: pushq %rbp; movq %rsp, %rbp; …]
[Figure: System stack after movq %rsp, %rbp. The stack contents are unchanged; %rsp and %rbp both point to the saved caller’s %rbp. Code shown: Swap_it: pushq %rbp; movq %rsp, %rbp; …]
Cleaning up our stack frame
leave
Sets stack pointer to the base frame address
Pops what is at top of stack into %rbp (this adds 8 bytes to %rsp)
Prepares the stack for return instruction
Syntax
leave
leave is equivalent to:
movq %rbp,%rsp
popq %rbp
Don’t use both leave and the movq/popq pair to clean up the same stack frame; the result is nasty.
[Figure: System stack when we are getting ready to finish up. The stack holds the caller’s %rbp (8 byte value) at a lower address than the caller's return address (8 byte value); %rsp and %rbp are both marked. Code shown: Swap_it: pushq %rbp; movq %rsp, %rbp; …; leave (equivalent to movq %rbp,%rsp; popq %rbp); ret]
[Figure: Next step in the sequence, with the same stack and code and only %rsp marked. leave has restored the caller’s %rbp and left %rsp pointing to the caller's return address.]
Return instruction
ret
The ret instruction does 2 things:
1. the address of the instruction immediately after the call instruction that got us here is popped from the stack, and
2. that address is assigned to the PC (%rip) register.
[Figure: System stack for the ret instruction. The stack holds the caller’s %rbp and the caller's return address (8 byte values), with %rsp marked; the caller's return address (8 byte value) is also shown in %rip. Code shown: Swap_it: pushq %rbp; movq %rsp, %rbp; …; leave (equivalent to movq %rbp,%rsp; popq %rbp); ret]
[Figure: System stack diagram captioned “After executing the call instruction”, showing the caller's return address (8 byte value) and “Some stack value”, with %rsp marked. Code shown: call Swap_it …]