Software Security
#2: Format String and Code Reuse Attacks
SFL @ TU Dortmund

Format String Attacks

Format Strings: An Introduction
• printf expects a format string as first argument
• Variable number of arguments determined by formatter
• int printf(const char *format, ...);
• As usual, parameters are consumed from registers (1-6) and stack (others)
• Example:
Format String Vulnerability in the Final Call to printf()
void print_score(char *user, int score) {
    char formatter[] = "User %s, score %d\n";
    printf(formatter, user, score);
}
Register contents at the call to printf():
rdi = &formatter
rsi = user (string pointer)
rdx = 0x42 (score)
(Figure: stack frame and function parameters; the format string is annotated in two parts, "User %s," and "score %d")
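The argument order above can be sketched with a small simulation. This is a simplified model, not how libc parses format strings: it assumes the System V x86-64 calling convention, handles only integer and pointer conversions with an optional width, and the helper name `argument_sources` is made up for illustration.

```python
import re

# System V x86-64: rdi holds the format string, the next five integer
# arguments travel in these registers, everything after that is on the stack.
ARG_REGISTERS = ["rsi", "rdx", "rcx", "r8", "r9"]

def argument_sources(fmt):
    """Return, for each conversion specifier, where printf would look for its argument."""
    sources = []
    # Simplified grammar: optional width digits plus one conversion letter.
    # "%%" prints a literal percent sign and consumes no argument.
    for match in re.finditer(r"%(%|[0-9]*[a-zA-Z])", fmt):
        if match.group(1) == "%":
            continue
        index = len(sources)
        if index < len(ARG_REGISTERS):
            sources.append(ARG_REGISTERS[index])
        else:
            sources.append("stack slot %d" % (index - len(ARG_REGISTERS)))
    return sources

print(argument_sources("User %s, score %d\n"))
print(argument_sources("AAAAAAAA.%p.%p.%p.%p.%p.%p"))
```

The second call shows why a format string with six conversions already reaches the stack: the first five are served from registers, the sixth from the first stack slot.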

Format String Attacks (1/3)
• Formatter must not be attacker-controlled
• If it is, the attacker can read from and write to almost arbitrary memory locations
• Caveat: attacker can inject and abuse arbitrary format specifiers
• Vulnerable example:
Format String Vulnerability in the Final Call to printf()
int welcome() {
    int i = 0;
    int ch; /* int rather than char, so that EOF can be detected */
    char username[256];
    printf("Please enter your user name: ");
    while ((ch = fgetc(stdin)) != EOF
           && ch != '\n'
           && (i < sizeof(username) - 1)) {
        username[i++] = ch;
        username[i] = 0;
    }
    printf("Hello, ");
    printf(username); /* vulnerable: user input is used as the format string */
    return 0;
}

Format String Attacks (2/3)
• Demo output in x64
Example Output of Program on Previous Slide (gcc strformat.c -o strformat -O2 -fno-unroll-loops -fno-omit-frame-pointer -fno-dce -fno-dse -D_FORTIFY_SOURCE=0)
# benign behavior
$ echo -ne 'Peter' | ./strformat
Hello, Peter
# reading the stack (local vars; eventually AAAA is found)
$ echo "AAAAAAAA.%p.%p.%p.%p.%p.%p" | ./strformat
Please enter your user name: Hello, AAAAAAAA.0x4006f1.0x7f3f03c5adf0.0x4006f1.0x7f3f03e7b024.(nil).0x4141414141414141
# the first five values are non-deterministic register contents (2nd to 6th parameter read from registers)
• When printf encounters a conversion specification (e.g., %x), it reads the next argument slot, i.e., data from the stack once the argument registers are used up (or, even worse, with %n it writes to an address taken from that slot)

Format String Attacks (3/3)
• Format string vulnerabilities allow the attacker to
• read arbitrary data:
- %08x consumes an 8-byte stack slot and prints its lower 4 bytes as 8 hex digits (%016lx prints all 8 bytes)
- %s reads an 8-byte string pointer from the stack and prints the data at this location as a C string
• write arbitrary data:
- %n reads 8 bytes from the stack, interprets them as an address, and writes the number of bytes printed so far (by printf) to this address
- Control the number of bytes written, e.g., %hhn (1 byte) or %hn (2 bytes)
- Control the value that is written by printing a chosen number of bytes before the %n specifier, e.g., via %64u (at least 64 bytes); place this output specifier before %n
- Direct parameter access: %6$n → write to the address stored in the 6th argument

Code-Reuse Attacks

Non-Executable Stack
• Shellcode could execute because we assumed the stack to be executable
• However, the stack should contain only non-executable data
• A non-executable stack prevents classical shellcode attacks
• Memory Management Unit (MMU) enforces non-executable stack
• Basic principle: a page is either writable (W) or executable (X), never W+X
• NX bit in page table helps to enforce this write-xor-execute (W^X) mechanism
• In Windows, this scheme is called Data Execution Prevention (DEP)
• It is no longer possible to execute code stored on the stack
• ...unless privileges are set accordingly (e.g., mprotect(), -z execstack)
• ...unless attacker stores pointers to existing code (→ code-reuse attack)

Code-Reuse Attacks
• Code execution by reusing existing code
• Assume that the stack is non-executable
• We can overrun the buffer, but cannot add shellcode
• Idea: re-use code (by definition, code is executable)
char *mystr(char *fn) {
    char mystr[32];
    register FILE *f = fopen(fn, "rb");
    fread(mystr, 2048, 1, f); /* overflow: reads up to 2048 bytes into a 32-byte buffer */
    return NULL;
}

Return-to-Libc (1/2)
• Code execution by reusing existing functions
• Attacker can re-use any of the linked system functions
• Instead of placing shellcode, attacker diverts control flow to such functions
• Attack known as return-to-libc, as the libc library is a prominent jump target
(Figure: jump target system(char *cmd))

Return-to-Libc (2/2)
• Example: return to start of system(char *cmd) to execute cmd
• In x86, the function parameter could have just been placed on the stack
• In x64, the attacker has to store the first parameter in rdi
(Figure: stack layout showing the string "/bin/bash", the 8 bytes de ad de ad de ad de ad, a buffer of 32 bytes 0x41, and the target system(char *cmd))

Return-Oriented Programming (1/2)
• Code execution by reusing existing code
• Reuse code gadgets, i.e., not entire functions, but just parts of functions
• In principle, we can return to the middle of a function (and even of an instruction!)
• Code gadget ends, e.g., in ret or jmp
• In our example, we need a gadget that stores stack element in rdi
(Figure: stack layout showing the string "/bin/bash", the 8 bytes de ad de ad de ad de ad, a buffer of 32 bytes 0x41, and the target system(char *cmd))

Return-Oriented Programming (2/2)
• We can chain several function calls on the stack
• Example: To clean up nicely after the shell terminates, also call exit(0)
(Figure: stack layout showing the string "/bin/bash", the 8 bytes de ad de ad de ad de ad, a buffer of 32 bytes 0x41, the gadget pop rdi; ret, and the targets system(char *cmd) and exit(int status))
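The chained calls can be written down as a byte string. The following is a minimal sketch, assuming a 32-byte buffer directly below the saved frame pointer (as in the mystr example) and using made-up addresses for the gadget, the string, system and exit; in a real target these values and the exact offset to the return address have to be determined first.

```python
import struct

def p64(value):
    """Pack an integer as a little-endian 64-bit word."""
    return struct.pack("<Q", value)

# All addresses below are made up for illustration.
POP_RDI_RET = 0x0000000000400001   # gadget: pop rdi; ret
BIN_BASH = 0x0000000000601060      # address of the string "/bin/bash"
SYSTEM = 0x00007FFFF7A52390        # system(char *cmd)
EXIT = 0x00007FFFF7A47030          # exit(int status)

def build_chain(buffer_size=32):
    payload = b"\x41" * buffer_size                 # fill the local buffer
    payload += bytes.fromhex("deaddeaddeaddead")    # overwrite the saved frame pointer
    payload += p64(POP_RDI_RET) + p64(BIN_BASH)     # rdi = address of "/bin/bash"
    payload += p64(SYSTEM)                          # return into system()
    payload += p64(POP_RDI_RET) + p64(0)            # rdi = 0
    payload += p64(EXIT)                            # return into exit(0)
    return payload

chain = build_chain()
print(len(chain), chain[40:48].hex())
```

When system() returns, its ret picks up the second gadget address from the stack, which loads 0 into rdi before control reaches exit().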

ROP Gadget Search
• Searching gadgets with ROPgadget
• python ROPgadget.py --binary /lib/x86_64-linux-gnu/libc-2.13.so --all | less -S
• Searching gadgets with pwntools
Gadget search in pwntools
from pwn import ELF, ROP
# load the file in which we want to find gadgets
rop = ROP(ELF('libc.so'))
# find gadgets using any search term, can also be 'pop rdi' only
rop.find_gadget(['pop rdi', 'ret'])
# find gadgets that store specified values in the specified registers
rop.setRegisters({'rdi': 0x1234, 'rsi': 0x2345})

Code-Reuse Defenses

Eliminating Unaligned Gadgets (1/2)
• x64 instructions have variable length (1 – 15 bytes)
• By default, there is no enforcement of where an instruction starts
• Unaligned gadgets might be hidden in other instructions
Aligned Instructions
0x0400000: b8 5f c3 00 00    mov eax, 0xc35f
Unaligned Gadget Hidden in Above Code at Offset +1
0x0400001: 5f    pop rdi
0x0400002: c3    ret
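A gadget search that ignores instruction boundaries finds this hidden gadget. The sketch below is a plain byte-pattern search over the five bytes from the listing; the helper name `find_pattern` is chosen for illustration, and real tools additionally disassemble backwards from each ret.

```python
def find_pattern(code, pattern):
    """Return every offset at which pattern occurs in code, aligned or not."""
    offsets = []
    position = code.find(pattern)
    while position != -1:
        offsets.append(position)
        position = code.find(pattern, position + 1)
    return offsets

BASE = 0x0400000
code = bytes.fromhex("b85fc30000")      # the single aligned instruction from the listing
POP_RDI_RET = bytes.fromhex("5fc3")     # pop rdi; ret

hits = [BASE + offset for offset in find_pattern(code, POP_RDI_RET)]
print([hex(address) for address in hits])
```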

Eliminating Unaligned Gadgets (2/2)
• Unaligned gadgets can be eliminated by rewriting
• Basic idea: Destroy byte values that encode control flow instructions (e.g., ret, jmp)
• Rewriting must not change semantics of code
• Several rewriting rules and iterations required
• Separate two consecutive instructions with nop sled between them
• Transform two-byte opcodes that contain 0xc3 (ret)
• Transform immediates (see example below)
Aligned Instructions
0x0400000: b8 5f c3 00 00          mov eax, 0xc35f
Return-Free Instruction Transformation
0x0400000: 8d 04 25 5f b3 00 00    lea eax, [0xb35f]
0x0400007: 67 8d 80 00 10 00 00    lea eax, [eax+0x1000]
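The immediate transformation boils down to splitting a constant into two summands whose encodings are free of the byte 0xc3. The following sketch illustrates that arithmetic only; the function names and the search order (first the delta 0x1000 used above, then small deltas) are assumptions for illustration, not the rewriting tool's actual strategy.

```python
import itertools
import struct

RET_OPCODE = 0xC3

def has_ret_byte(value):
    """True if the 32-bit little-endian encoding of value contains 0xc3."""
    return RET_OPCODE in struct.pack("<I", value & 0xFFFFFFFF)

def split_immediate(imm, preferred_delta=0x1000):
    """Return (base, delta) with base + delta == imm (mod 2**32) and no 0xc3 byte in either."""
    for delta in itertools.chain([preferred_delta], range(1, 0x10000)):
        base = (imm - delta) & 0xFFFFFFFF
        if not has_ret_byte(base) and not has_ret_byte(delta):
            return base, delta
    raise ValueError("no ret-free split found")

base, delta = split_immediate(0xC35F)
print(hex(base), hex(delta))
```

For 0xc35f this yields 0xb35f and 0x1000, the two constants that appear in the transformed lea instructions.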

Address Space Layout Randomization (1/2)
• ASLR randomizes (base) addresses of code and data
• Addresses of variables, shellcode or ROP gadgets are no longer deterministic
• Attacker does not know where to jump to or what to overwrite
• Randomized parts are OS-dependent
• Windows: randomize base addr. of code, heap/stack and library locations
• Linux: randomize base addr. of code, heap/stack, library locations and vDSO

Address Space Layout Randomization (2/2)
• ASLR causes problems if absolute addresses are used
• Function and data addresses vary, depending on the base address
• No problem for direct calls (which are already relative), but for indirect calls
• Example: Function pointer in C and its x64 assembly implementation
Function pointer
int fun(int arg1) {
    return arg1;
}
int main() {
    void *funcptr = &fun;
    /* … */
}
Assembly code (showing a hard-coded absolute address)
48 c7 40 08 30 06 40 00 mov [funcptr], 0x400630 …
• Two solutions to this problem in practice:
• Linux: Position-Independent Code (PIC) that addresses code and data relative to the instruction pointer
• Windows: Program ships with a relocation table listing the pointers that must be updated once the program is mapped to a specific memory address

ASLR in Linux: Position Independent Code (PIC, 1/2)
• PIC can run regardless of the address at which it is loaded
• All internal addressing is done relative to current instruction pointer
• Addressing external variables/functions:
• Global Offset Table (GOT) stores addresses of external functions/variables
• GOT is at fixed offset to code section
• Loader fills GOT with pointers to absolute addresses

ASLR in Linux: Position Independent Code (PIC, 2/2)
• Compile a shared library with PIC
• Address filled during loading (example is before loading)
• No need to change the code section
$ gcc -shared -fPIC -o pic.so pic.c
extern int ext_var;
int my_local_fun(void) {
    return ext_var;
}
Assembly code
6b0: 55                      push rbp
6b1: 48 89 e5                mov rbp,rsp
6b4: 48 8b 05 ce 02 20 00    mov rax,[rip+0x2002ce]
6bb: 8b 00                   mov eax,[rax]
6bd: 5d                      pop rbp
6be: c3                      ret
Global Offset Table (.got), entry referenced by the mov at 6b4, before loading
00 00 00 00
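Why this code needs no patching can be checked with a short calculation. The sketch assumes that the mov at offset 0x6b4 is 7 bytes long (the next listed instruction starts at 0x6bb) and uses two arbitrary, made-up load addresses; rip-relative addressing counts from the address of the following instruction.

```python
def rip_relative_target(load_base, insn_offset, insn_length, displacement):
    """Address accessed by a rip-relative operand: next instruction plus displacement."""
    next_instruction = load_base + insn_offset + insn_length
    return next_instruction + displacement

# Made-up load addresses; 0 corresponds to the unloaded file as shown in the listing.
for base in (0x0, 0x7F3F03A00000, 0x7F1122300000):
    slot = rip_relative_target(base, 0x6B4, 7, 0x2002CE)
    print(hex(base), hex(slot), hex(slot - base))
```

The distance between the load base and the GOT slot stays constant for every base, so the loader only has to write the address of ext_var into that slot and never touches the code section.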

ASLR in Linux: PLT (1/3)
• Linux supports dynamic address resolution (lazy binding)
• Immediately binding all external addresses would slow down program load
• Thus, dynamically (lazily) bind addresses upon their first use
• Procedure Linkage Table (PLT) allows looking up addresses at runtime
• Support for calling dynamically-located functions
• GOT is meant for data and code addresses, PLT for code only
• Each PLT entry has corresponding GOT entry

ASLR in Linux: PLT (2/3)
• PLT consists of…
• Address resolution function (dynamic loader, first entry PLT[0])
• One entry per external function
• Each PLT entry consists of…
• A jump instruction to the address stored in the corresponding GOT entry
– First call of PLT entry: address points to PLT entry’s lookup code
– Subsequent calls: address points to actual function address (no return to PLT)
• Lookup code: Ask dynamic loader to load library (and fill GOT accordingly)
Assembly code
PLT[0]: // special entry
    (address lookup)
PLT[n]: // n'th PLT entry
    jmp *GOT[n]
    (prepare lookup)
    jmp PLT[0]

ASLR in Linux: PLT (3/3)
• After binding, PLT points to actual function (via GOT)
• Dynamic loader overwrites address stored in GOT
• Before, address pointed to the loader code in PLT
• Now, address points to actual function’s address
• PLT code thus loads function’s address and jumps to it
GOT[n]: address of myfunc (filled in by the dynamic loader)
Assembly code
PLT[0]: // special entry
    (address lookup)
PLT[n]: // n'th PLT entry
    jmp *GOT[n]
    (prepare lookup)
    jmp PLT[0]
Assembly code
myfunc:
    (start myfunc)
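The interplay of PLT and GOT can be modeled in a few lines. This is a toy simulation, not the real dynamic loader: the class `LazyBinder`, its dictionary-based GOT, and the example function are all invented for illustration.

```python
class LazyBinder:
    """Toy model of lazy binding: GOT entries start out pointing at lookup code."""

    def __init__(self, library):
        self.library = library        # name -> function, stands in for a shared library
        self.resolver_calls = 0       # how often the PLT[0] lookup path was taken
        self.got = {name: self._lookup_stub(name) for name in library}

    def _lookup_stub(self, name):
        def stub(*args):
            # First call: resolve the symbol, overwrite the GOT entry, then jump on.
            self.resolver_calls += 1
            self.got[name] = self.library[name]
            return self.got[name](*args)
        return stub

    def call_via_plt(self, name, *args):
        # Corresponds to "jmp *GOT[n]" in the PLT entry.
        return self.got[name](*args)

def myfunc(x):
    return x + 1

binder = LazyBinder({"myfunc": myfunc})
print(binder.call_via_plt("myfunc", 41), binder.call_via_plt("myfunc", 1))
print(binder.resolver_calls)
```

The second call no longer passes through the lookup stub, which mirrors the "no return to PLT lookup code" behavior after binding.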

ASLR in Windows: Relocation
• Windows solves ASLR via relocations
• Any absolute address used in code is stored in relocation table
• Compiler has to fill relocation table to mark such absolute addresses
• Loader updates these addresses according to new base address
• Problems of shared libraries in Windows
• Relocations hardwire absolute addresses upon program load time
• This effectively changes the libraries, as addresses are part of their code
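• If a library shared between two processes were mapped at a different address in each process, its relocated code would differ per process, so the pages could no longer be shared (each process would end up with its own copy)
• Solution: The base address of a library is randomized once, and all processes map it at this same address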
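What the loader does with a relocation table can be sketched as follows. The layout is heavily simplified and assumed for illustration: the table is a plain list of offsets, each pointing at an 8-byte little-endian absolute address, and the new base address is made up. Real PE relocation tables are organized in blocks with typed entries.

```python
import struct

def apply_relocations(image, reloc_offsets, preferred_base, actual_base):
    """Add the base-address delta to every absolute address listed in the table."""
    delta = actual_base - preferred_base
    patched = bytearray(image)
    for offset in reloc_offsets:
        (address,) = struct.unpack_from("<Q", patched, offset)
        struct.pack_into("<Q", patched, offset, (address + delta) & 0xFFFFFFFFFFFFFFFF)
    return bytes(patched)

# Toy image: 8 filler bytes, then one absolute pointer to 0x400630.
image = b"\x90" * 8 + struct.pack("<Q", 0x400630)
relocated = apply_relocations(image, [8], preferred_base=0x400000, actual_base=0x7FF600000000)
print(hex(struct.unpack_from("<Q", relocated, 8)[0]))
```

Because the patched bytes depend on the chosen base, two processes using different bases would hold different copies of the same page.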

Fine-Grained ASLR
• ASLR randomizes the base address of code
• This is coarse-grained randomization only
• Single pointer leak would derandomize the entire code segment
• Fine-grained randomization aims to prevent this threat
• Instead of randomizing just the base address, fine-grained ASLR can:
– Randomize the order of functions
– Shuffle instructions within a function that are independent from each other
– Replace instructions by semantic equivalents
– Randomize the alignment of functions (e.g., random padding)
• However, not really deployed in practice (maybe soon in Linux kernel?)
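The weakness of coarse-grained randomization is plain arithmetic. The sketch below assumes the attacker knows the offsets of the leaked symbol and of a wanted gadget within the binary (both offsets and the leaked value are made up); with fine-grained randomization these offsets would no longer be constant.

```python
PAGE_MASK = 0xFFF

def rebase(leaked_address, leaked_symbol_offset, target_offset):
    """Derive a target's runtime address from one leaked pointer into the same module."""
    base = leaked_address - leaked_symbol_offset
    if base & PAGE_MASK:
        raise ValueError("computed base is not page-aligned, offsets do not match")
    return base + target_offset

# Hypothetical values: a leaked function pointer and two known file offsets.
gadget = rebase(leaked_address=0x7F3F03A45390,
                leaked_symbol_offset=0x45390,
                target_offset=0x21102)
print(hex(gadget))
```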

JIT-ROP
• JIT-ROP undermines ASLR under the assumption that the attacker
a) can leak one or several pointers to code, and
b) can use a Just-in-Time environment (e.g., browser) to dynamically read code
• In principle, ASLR randomizes code/data segments
• ROP attacks should thus be impossible, as the attacker cannot know gadget addresses
• Yet information leaks, a common vulnerability (also in browsers), may leak certain code pointers to the attacker
• Attacker can then use JavaScript to read ROP gadgets on the fly
• Basic JIT-ROP idea
• Attacker uses JIT environment to read code pages and finds
a) new pointers to other code pages (to expand the search), and
b) ROP gadgets and their addresses
• Once ROP gadgets are found, use them for classical ROP attack
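The page-harvesting loop can be illustrated on simulated memory. Everything here is a toy model under stated assumptions: `read_page` stands in for the attacker's memory-disclosure primitive, new pages are discovered only through direct calls (opcode e8 with a 32-bit relative target), and the two pages in `demo_memory` are invented.

```python
import struct

PAGE_SIZE = 0x1000
POP_RDI_RET = b"\x5f\xc3"  # pop rdi; ret

def page_of(address):
    return address & ~(PAGE_SIZE - 1)

def jit_rop_scan(read_page, leaked_pointer, max_pages=16):
    """Explore code pages reachable from one leaked pointer and collect gadget addresses."""
    worklist = [page_of(leaked_pointer)]
    visited = set()
    gadgets = []
    while worklist and len(visited) < max_pages:
        page = worklist.pop()
        if page in visited:
            continue
        visited.add(page)
        data = read_page(page)
        if data is None:              # unmapped in this simulation
            continue
        # b) record ROP gadgets and their addresses
        offset = data.find(POP_RDI_RET)
        while offset != -1:
            gadgets.append(page + offset)
            offset = data.find(POP_RDI_RET, offset + 1)
        # a) follow direct calls to find pointers into other code pages
        for i in range(len(data) - 4):
            if data[i] == 0xE8:
                (rel,) = struct.unpack_from("<i", data, i + 1)
                worklist.append(page_of(page + i + 5 + rel))
    return sorted(gadgets)

def demo_memory():
    first = bytearray(b"\x90" * PAGE_SIZE)
    # call at 0x400010 targeting 0x403020 (rel32 counts from the next instruction)
    first[0x10:0x15] = b"\xe8" + struct.pack("<i", 0x403020 - (0x400010 + 5))
    second = bytearray(b"\x90" * PAGE_SIZE)
    second[0x123:0x125] = POP_RDI_RET
    return {0x400000: bytes(first), 0x403000: bytes(second)}

memory = demo_memory()
found = jit_rop_scan(memory.get, leaked_pointer=0x400ABC)
print([hex(address) for address in found])
```

The leaked pointer only identifies the first page; the gadget is found on a second page that the scan reaches by following a call.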

Execute-no-Read (1/2)
• JIT-ROP works under the assumption that code pages are readable
• While data has to be readable, code usually just needs to be executed
• Execute-no-read (XnR) removes the read permission for code pages
• Attackers cannot learn gadgets on the fly by scanning code sections
• JIT-ROP is prevented, or at least mitigated
• Fundamental conflict
• We want to withdraw the read privilege of code pages
• Yet code has to be readable, as the CPU needs to fetch/decode instructions
• Further complications
• Code and data are not always cleanly separated
• x86 page permissions cannot express "executable but not readable"

Execute-no-Read (2/2)
• An OS can support Execute-no-Read (XnR) by treating pages as either readable or executable, which limits the number of pages exposed to JIT-ROP
• Basic XnR idea
• Map a small sliding window of code pages, unmap other code pages
• If code accesses an unmapped page, the page fault handler has to decide:
a) Map page when trying to read from a data page (and continue)
b) Map page when trying to execute code (and unmap an older page, and continue)
c) Raise exception if trying to read a code page
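The three cases of the page fault handler can be modeled with a small simulation. It is only a sketch of the sliding-window idea: the class name, the string return values, and the window size of 2 are assumptions, and a mapped code page remains readable here because x86 cannot forbid that.

```python
from collections import deque

class XnRSimulator:
    def __init__(self, code_pages, window_size=2):
        self.code_pages = set(code_pages)
        self.window_size = window_size
        self.mapped_code = deque()    # currently mapped code pages, oldest first
        self.mapped_data = set()

    def access(self, page, kind):
        """Simulate one memory access; kind is "read" or "exec"."""
        is_code = page in self.code_pages
        if kind == "exec" and not is_code:
            raise PermissionError("data pages are not executable")
        if page in self.mapped_code or page in self.mapped_data:
            return "no fault"
        # From here on we are in the simulated page fault handler.
        if not is_code:
            self.mapped_data.add(page)              # case a)
            return "mapped data page"
        if kind == "exec":
            self.mapped_code.append(page)           # case b)
            if len(self.mapped_code) > self.window_size:
                self.mapped_code.popleft()          # unmap the oldest code page
            return "mapped code page"
        raise PermissionError("read from unmapped code page")   # case c)

sim = XnRSimulator(code_pages={0x1000, 0x2000, 0x3000})
for page in (0x1000, 0x2000, 0x3000):
    sim.access(page, "exec")
print([hex(page) for page in sim.mapped_code])
```

After three executions only the two most recent code pages remain mapped, so a read of page 0x1000 would now raise an exception, while the pages inside the window stay exposed.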

Control Flow Integrity (1/2)
• Indirect control flow (jmp/call, ret, etc.) is problematic
• Might give attacker the possibility to divert into non-intended path
• Control Flow Integrity (CFI) enforces intended program paths
• Program paths unforeseen by developer are prohibited
• E.g., attackers can no longer return to arbitrary address specified on stack
• Compromise between efficiency and security guarantees
• CFI checks usually embedded during program compile time
• Assumption: Valid call/jump targets are known to compiler
• Compiler thus knows and can enforce valid (source, destination) tuples
• CFI prohibits any control flow deviating from these known pairs

Control Flow Integrity (2/2)
• Program calls sort, which in turn invokes comparator functions to sort
• Compiler analyzes forward and backward edges at compile time
• Before calling/returning, check if valid path is entered, else abort
void myfun(int arr[], int len) {
    /* … */
    sort(arr, len, lt);
    /* … */
    sort(arr, len, gt);
    /* … */
}
void sort(int a[], int len, bool (*fptr)(int, int)) {
    /* … */
    fptr(a[i], a[j]);
    /* … */
}
lt() comparator function
bool lt(int x, int y) { return x < y; }
gt() comparator function
bool gt(int x, int y) { return x > y; }
(Figure legend: forward edge = call target check; backward edge = return site check)
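The forward-edge check for the sort example can be imitated at a high level. This sketch models only the call target check, not the return site check; the call-site label "sort:fptr", the exception class, and the function `evil` are invented for illustration, and real CFI inserts such checks as machine code at compile time.

```python
class CFIViolation(Exception):
    pass

def lt(x, y):
    return x < y

def gt(x, y):
    return x > y

def evil(x, y):
    raise RuntimeError("attacker-chosen target was executed")

# Valid (call site, target) pairs, as a compiler could derive them from myfun().
VALID_TARGETS = {"sort:fptr": {lt, gt}}

def checked_call(site, target, *args):
    """Perform an indirect call only if the target is allowed at this call site."""
    if target not in VALID_TARGETS.get(site, set()):
        raise CFIViolation("%s may not call %s" % (site, target.__name__))
    return target(*args)

def sort(values, fptr):
    """Insertion sort that calls the comparator through the checked indirect call."""
    result = list(values)
    for i in range(1, len(result)):
        j = i
        while j > 0 and checked_call("sort:fptr", fptr, result[j], result[j - 1]):
            result[j], result[j - 1] = result[j - 1], result[j]
            j -= 1
    return result

print(sort([3, 1, 2], lt), sort([3, 1, 2], gt))
```

Passing `evil` as the comparator raises CFIViolation before the function body runs, which corresponds to aborting on an invalid forward edge.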
