Buffer Overflow Attacks
Introduction
• A simplified, high-level view of buffer overflow attacks
– x86 architecture
– overflows on the stack
– Focus on 32-bit mode, but most things directly apply to 64-bit mode as well
Introduction
• In languages like C, you have to tell the compiler how to manage the memory.
– This is hard.
• If you get it wrong, then an attacker can usually exploit
this bug to make your application run arbitrary code.
• Countless worms, attacks against SQL servers, web servers, iPhone jailbreaks, SSH servers, …
What‘s wrong with this code?
void getname() {
char buffer[32];
gets(buffer);
printf(“Your name is %s.\n”, buffer);
}
int main(void) {
printf(“Enter your name:” );
getname();
return 0;
}
Live-Demo
“Anything that can go wrong, will go wrong”
Triggering a buffer overflow
The x86 Architecture
The program code
Static variables, Strings, etc
Data in use
Registers e.g.
The Accumulator
Instruction point “Top” of Stack “Bottom” of Frame
Low address
Text
Data
Free Memory
Stack
Cmd line args, ev vars.
EAX
EIP
ESP
EBP
High address
The x86 Architecture
The program code
Static variables, Strings, etc
Data in use
Registers e.g.
The Accumulator
Instruction point “Top” of Stack “Bottom” of Frame
Low address
Text
Data
Free Memory
Current Frame Stack
Rest of Stack (older)
Stack
EAX
EIP
ESP
EBP
High address
The Stack
The stack part of the memory is mostly “Last In, First Out”.
We can write and read to the top of the stack.
…PUSH 12345 PUSH 678245 POP EAX
….
Data
Stack
EAX:
EIP: 7797F9CD
ESP: BF914EB0
EBP: BF914ED8
You write to the stack with push
The Stack
…PUSH 12345 PUSH 678245 POP EAX
….
Data
Stack
EAX:
EIP: 7797F9CD
ESP: BF914EB0
EBP: BF914ED8
The Stack
…PUSH 12345 PUSH 678245 POP EAX
….
Data
123456
Stack
EAX:
EIP: 7797F9CF
ESP: BF914EAC
EBP: BF914ED8
The Stack
…PUSH 12345 PUSH 678245 POP EAX
….
Data
123456
Stack
EAX:
EIP: 7797F9CF
ESP: BF914EAC
EBP: BF914ED8
The Stack
…PUSH 12345 PUSH 678245 POP EAX
….
Data
678245
123456
Stack
EAX:
EIP: 7797F9D3
ESP: BF914EA9
EBP: BF914ED8
The Stack
…PUSH 12345 PUSH 678245 POP EAX
….
Data
123456
Stack
EAX: 678245
EIP: 7797F9D3
ESP: BF914EAC
EBP: BF914ED8
The Stack
…PUSH 12345 PUSH 678245 POP EAX
….
Data
678245
123456
Stack
EAX: 678245
EIP: 7797F9D3
ESP: BF914EAC
EBP: BF914ED8
Function calls (32-bit)
void main () {
function (1,2);
}
Function calls (32-bit)
void main () {
function (1,2);
}
PUSH <2>
PUSH <1>
CALL
• Arguments 1 & 2 are passed on the stack.
• The CALL instruction puts the address of function into EIP and stores the old EIP on the stack.
• ESP
– Points to the TOP of the stack. When you push or pop data onto the stack, it goes here and ESP is then adjusted to reflect the new stack head.
• EBP
– Points to the BASE of the stack FRAME. When you call a function, this is saved on the stack and EBP now points to the base of the new stack FRAME. All local variables can be found relative to this pointer.
Stack Pointers
The Stack
void main () {
function (1,2);
}
PUSH 2 PUSH 1 CALL function
…function: …
Data
Stack
EAX:
EAX:
EIP: main
ESIPP: :7S7t9ac7kFT9opCED
EBSP: 0S0ta1ck8BFa9sBe10
The Stack
void main () {
function (1,2);
}
PUSH 2 PUSH 1 CALL function
…function: …
Data
EAX:
EAX:
EIP: main+1
ESIPP: :7S7t9ac7kFT9opCED
EBSP: 0S0ta1ck8BFa9sBe10
2
Stack
The Stack
void main () {
function (1,2);
}
PUSH 2 PUSH 1 CALL function
…function: …
Data
EAX:
EAX:
EIP: main+2
ESIPP: :7S7t9ac7kFT9opCED
EBSP: 0S0ta1ck8BFa9sBe10
1
2
Stack
The Stack
void main () {
function (1,2);
}
PUSH 2 PUSH 1 CALL function
…function: …
Data
EAX:
EAX:
EIP: function
ESIPP: :7S7t9ac7kFT9opCED
EBSP: 0S0ta1ck8BFa9sBe10
Old EIP
1
2
Stack
The Stack
void main () {
function (1,2);
}
PUSH 2 PUSH 1 CALL function
…function: …
Data
EAX:
EAX:
EIP: function+1
ESIPP: :7S7t9ac7kFT9opCED
EBSP: 0S0ta1ck8BFa9sBe10
Old EBP
Old EIP
1
2
Stack
The Stack
void main () {
function (1,2);
}
PUSH 2 PUSH 1 CALL function
…function: …
Data
EAX:
EAX:
EIP: function+2
ESIPP: :7S7t9ac7kFT9opCED
EBSP: 0S0ta1ck8BFa9sBe10
Old EBP
Old EIP
1
2
Stack
The Stack
void main () {
function (1,2);
}
All local variables in the current context are located relative to EBP
PUSH 2 PUSH 1 CALL function
…function: …
Data
Function Working space
EAX:
EAX:
EIP: function+3
ESIPP: :7S7t9ac7kFT9opCED
EBSP: 0S0ta1ck8BFa9sBe10
Old EBP
Old EIP
1
2
Stack
The Stack
void main () {
function (1,2);
}
PUSH 2 PUSH 1 CALL function
…function: …
Data
Function Working space
EAX: result
EAX:
EIP: function+n
ESIPP: :7S7t9ac7kFT9opCED
EBSP: 0S0ta1ck8BFa9sBe10
Old EBP
Old EIP
1
2
Stack
The Stack
void main () {
function (1,2);
}
PUSH 2 PUSH 1 CALL function
…function: …
Data
Function Working space
EAX:
EAX:
EIP: function+n
ESIPP: :7S7t9ac7kFT9opCED
EBSP: 0S0ta1ck8BFa9sBe10
Old EBP
Old EIP
1
2
Stack
The Stack
void main () {
function (1,2);
}
PUSH 2 PUSH 1 CALL function
…function: …
Data
EAX: result
EAX:
EIP: main+3
ESIPP: :7S7t9ac7kFT9opCED
EBSP: 0S0ta1ck8BFa9sBe10
1
2
Stack
Buffer Overflows
• The instruction pointer controls which code executes,
Buffer Overflows
• The instruction pointer controls which code executes, • The instruction pointer is stored on the stack,
Buffer Overflows
• The instruction pointer controls which code executes, • The instruction pointer is stored on the stack,
• I can write to the stack …
Buffer Overflows
• The instruction pointer controls which code executes, • The instruction pointer is stored on the stack,
• I can write to the stack …J
Buffers
…
getname();
…
getname() {
char buffer[16];
gets(buffer);
}
Stack
Buffers
1. Function called … getname();
…
getname() {
char buffer[16];
gets(buffer);
}
Stack
Buffers
1. Function called …
2. EIP&EBPwritten to stack
getname(); …
getname() {
char buffer[16];
gets(buffer);
}
Old EBP
Old EIP
Stack
Buffers
1. Function called …
2. EIP&EBPwritten to stack
3. Function runs
getname(); …
getname() {
char buffer[16];
gets(buffer);
}
Old EBP
Old EIP
Stack
Buffers
1. Function called …
2. EIP&EBPwritten to stack
3. Function runs
4. Buffer allocated
getname(); …
getname() {
char buffer[16];
gets(buffer);
}
ß16 byteà
Old EBP
Old EIP
Stack
Buffers
1. Function called …
2. EIP&EBPwritten to stack
3. Function runs
4. Buffer allocated
5. User inputs “Hello World”
getname(); …
getname() {
char buffer[16];
gets(buffer);
}
Hello World
Old EBP
Old EIP
Stack
If input is >16 bytes: Hello World XXXXXXXXXXXX
…
getname();
…
getname() {
char buffer[16];
gets(buffer);
Buffer Overflow
}
Stack
If input is >16 bytes: Hello World XXXXXXXXXXXX
1. Runsasbefore
…
getname();
…
getname() {
char buffer[16];
gets(buffer);
Buffer Overflow
}
Old EBP
Old EIP
Stack
If input is >16 bytes: Hello World XXXXXXXXXXXX
1. Runsasbefore
…
getname();
…
getname() {
char buffer[16];
gets(buffer);
Buffer Overflow
}
ß16 byteà
Old EBP
Old EIP
Stack
If input is >16 bytes: Hello World XXXXXXXXXXXX
1. Runsasbefore
2. But the string flows over the end of the buffer
3. EIPcorrupted, segmentation fault
…
getname();
…
getname() {
char buffer[16];
gets(buffer);
}
Buffer Overflow
Hello World XXXX
XXXX
XXXX
Stack
Once more, with malice
1. Runsasbefore
Stack
Once more, with malice
1. Runsasbefore
2. Attack send a very long message, ending with the address of some code that gives him a shell:
Hello World XXXX XXXX97F9
Hello World XXXX
Old EBP
Old EIP
Stack
Once more, with malice
1. Runsasbefore
2. Attack send a very long message, ending with the address of some code that gives him a shell:
Hello World XXXX XXXX97F9
3. The attackers value is copied over the old EIP
Hello World XXXX
XXXX
97F9
Stack
Once more, with malice
1. Runsasbefore
2. Attack send a very long message, ending with the address of some code that gives him a shell:
Hello World XXXX XXXX97F9
3. The attackers value is copied over the old EIP
4. When the function returns the attacks code (here: at 0x97f9) is run
Hello World XXXX
XXXX
97F9
Stack
Once more, with malice
1. Runsasbefore
2. Attack send a very long message, ending with the address of some code that gives him a shell:
Hello World XXXX XXXX97F9
3. The attackers value is copied over the old EIP
4. When the function returns the attacks code (here: at 0x779ff9) is run: this code can be on the stack as well
Hello World
XXXXXX
7797F9
Stack
Live-Demo
“Anything that can go wrong, will go wrong”
Debugging a buffer overflow with gdb
What To Inject
Shell code (under Linux) is assembly code for
exec(“/bin/sh”, {NULL}, NULL)
There are some defenses in modern Linux, hence use that to indirectly call a binary that first calls setuid(0) and then spawns a shell (see msh.c on Canvas)
Live-Demo
“Anything that can go wrong, will go wrong”
Exploiting a buffer overflow
Live-Demo
“Anything that can go wrong, will go wrong”
Exploiting a buffer overflow to pop a shell
Defense: The NX-bit
The program code
Static variables, Strings, etc
Data in use
Low address
Text
Data
Free Memory
Stack
High address
Defense: The NX-bit
The program code
Low address
Text
Data
Free Memory
Stack
I injected code here
High address
Defense: The NX-bit
• Code should be in the text area of the memory. Not on the stack.
• The NX-bit provides a hardware distinction between the text and stack.
• When enabled, the program will crash if the EIP ever points to the stack.
Low address
Text
Data
Free Memory
Stack
High address
Reuse Code.
The standard attack against the NX-bit is to reuse code from the executable part of memory. E.g.
• Jump to another function in the program.
• Jump to a function from the standard C library (Return to libc)
• String together little pieces of existing code (Return-oriented programming).
Return-to-libc
• Libc is the C standard library.
• It is often packaged with executables to provide a runtime environment.
• It includes lots of useful calls like “system” which runs any command.
• It links to executable memory, therefore bypasses NX- bit protections.
Address space layout randomization.
• ASLRaddsarandomoffsettothestackandcodebaseeach time the program runs.
• Jumpsintheprogramarealteredtopointtotherightline.
• Theideaisthatitsnowhardforanattackertoguessthe address of where they inject code or the address of particular functions.
• On by default in all OS. For Linux, alter it with sysctl –w kernel.randomize_va_space=X, where X is 0 (off), 1 (on for the stack and shared libraries), 2 (1, plus also data segment).
Live-Demo
“Anything that can go wrong, will go wrong”
ASLR in action
NOP slide
• In x86 the op code assembly instruction 0x90 does nothing.
• If the stack is 2MB, I could inject 999000 bytes of 0x90 followed by my shell code
• I then guess a return address and hope it is somewhere in the 2MB of NOPs.
• If it is, the program slides down the NOPs to my shell code.
• Often used with other methods of guessing the randomness.
Metasploit
• Metasploit is a framework for testing and executing known buffer overflow attacks.
• If a vulnerability in an application is well known their will be a patch for it, but also a Metasploit module for it.
• If an application is unpatched it can probably be taken over with Metasploit.
• Metasploit also includes a library of shell code which can be injected.
• Without wishing to get into another debate, using it against machines you don’t own is illegal. Do not do this.
Recommend Paper:
• “Smashing the Stack for Fun and Profit”
Elias Levy (Aleph One)
• A simple introduction to buffer overflows from the mid 90s.
• Standard defenses now stop the attacks in this paper, but it gives an excellent introduction.
Conclusion
Buffer overflows are the result of poor memory management in languages like C: even the “best” programmers will make mistakes.
Buffer overflow attacks exploit these to overwrite memory values.
This lets an attack execute arbitrary code.