Abstract—Memory corruption bugs in software written in low-level languages like C or C++ are one of the oldest problems in computer security. The lack of safety in these languages allows attackers to alter the program’s behavior or take full control over it by hijacking its control flow. This problem has existed for more than 30 years and a vast number of potential solutions have been proposed, yet memory corruption attacks continue to pose a serious threat. Real world exploits show that all currently deployed protections can be defeated.
This paper sheds light on the primary reasons for this by describing attacks that succeed on today’s systems. We systematize the current knowledge about various protection techniques by setting up a general model for memory corruption attacks. Using this model we show what policies can stop which attacks. The model identifies weaknesses of currently deployed techniques, as well as other proposed protections enforcing stricter policies.
We analyze the reasons why protection mechanisms implementing stricter policies are not deployed. To achieve wide adoption, protection mechanisms must support a multitude of features and must satisfy a host of requirements. Especially important is performance, as experience shows that only solutions whose overhead stays within reasonable bounds get deployed.
A comparison of different enforceable policies helps designers of new protection mechanisms find the balance between effectiveness (security) and efficiency. We identify some open research problems, and provide suggestions on improving the adoption of newer techniques.
I. INTRODUCTION
Memory corruption bugs are one of the oldest problems in computer security. Applications written in low-level languages like C or C++ are prone to these kinds of bugs. The lack of memory safety (or type safety) in such languages enables attackers to exploit memory bugs by maliciously altering the program’s behavior or even taking full control over the control-flow. The most obvious solution would be to avoid these languages and to rewrite vulnerable applications in type-safe languages. Unfortunately, this is unrealistic not only due to the billions of lines of existing C/C++ code, but also due to the low-level features needed for performance critical programs (e.g. operating systems).
The war in memory is fought on one side by offensive research that develops new attacks and malicious attackers, and on the other side by defensive researchers who develop new protections and application programmers who
try to write safe programs. The memory war is effectively an arms race between offense and defense. According to the MITRE ranking [1], memory corruption bugs are considered one of the top three most dangerous software errors. Google Chrome, one of the most secure web browsers written in C++, was exploited four times during the Pwn2Own/Pwnium hacking contests in 2012.
In the last 30 years a set of defenses has been developed against memory corruption attacks. Some of them are deployed in commodity systems and compilers, protecting applications from different forms of attacks. Stack cookies [2], exception handler validation [3], Data Execution Prevention [4] and Address Space Layout Randomization [5] make the exploitation of memory corruption bugs much harder, but several attack vectors are still effective under all these currently deployed basic protection settings. Return-Oriented Programming (ROP) [6], [7], [8], [9], [10], [11], information leaks [12], [13] and the prevalent use of user scripting and just-in-time compilation [14] allow attackers to carry out practically any attack despite all protections.
A multitude of defense mechanisms have been proposed to overcome one or more of the possible attack vectors. Yet most of them are not used in practice, due to one or more of the following factors: the performance overhead of the approach outweighs the potential protection, the approach is not compatible with all currently used features (e.g., in legacy programs), the approach is not robust and the offered protection is not complete, or the approach depends on changes in the compiler toolchain or in the source-code while the source code of the application is not available.
With all the diverse attacks and proposed defenses it is hard to see how effective and how efficient different solutions are, how they compare to each other, and what the primary challenges are. The motivation for this paper is to systematize and evaluate previously proposed approaches. The defense techniques are classified by the exploits they mitigate and by the particular phase of exploit they try to inhibit. The systematization is done by setting up a general model for memory corruption vulnerabilities and exploitation techniques. The evaluation is based on robustness, performance and compatibility. Using this evaluation, we also discuss common criteria that need to be fulfilled for successful deployment of a new software defense.
SoK: Eternal War in Memory
László Szekeres†, et al.
†Stony Brook University  ‡University of California, Berkeley  ∗Peking University
Some related work already covers different memory corruption attacks [15], provides historical overview [16] or lists different protection mechanisms [17]. This systematization of knowledge paper extends the related survey papers by developing a new general model for memory corruption attacks and evaluating and comparing proposed defense mechanisms using a new set of criteria that incorporates real-world adoption as well. The paper does not aim to cover or reference every proposed solution, but rather identifies and analyzes the main approaches in a systematic way, sorts out the most promising proposals and points out fundamental problems and unsolved challenges.
With this systematization of knowledge paper we make the following contributions:
• develop a general model of memory corruption at- tacks and identify different enforceable security policies based on the model;
• clarify what attack vectors are left unprotected by currently used and previously proposed protections by matching their enforced policies with separate phases of different exploits;
• evaluate and compare proposed solutions for perfor- mance, compatibility, and robustness;
• discuss why many proposed solutions are not adopted in practice and what the necessary criteria for a new solution are.
The paper is organized as follows. Section II sets up the main model of attacks and classifies protections based on the policies they enforce. Section III discusses currently deployed protections and their main weaknesses. Our evaluation criteria are set up in Section IV, and are used throughout the analysis of defense approaches covered by the following four sections. Section IX summarizes with a comparative analysis and Section X concludes the paper.
II. ATTACKS
To solve the problem of memory error based attacks, we first need to understand the process of carrying out such an exploit. In this section we set up a step-by-step memory exploitation model. We will base our discussion of protection techniques and the policies they enforce on this model. Figure 1 shows the different steps of exploiting a memory error. Each white rectangular node represents a building block towards successful exploitation and the oval nodes on the bottom represent a successful attack. Each rhombus represents a decision between alternative paths towards the goal. While control-flow hijacking is usually the primary goal of attacks, memory corruption can be exploited to carry out other types of attacks as well.
A. Memory corruption
Since the root cause of all vulnerabilities discussed in this systematization of knowledge paper is memory corruption, every exploit starts by triggering a memory error. The first
two steps of an exploit in Figure 1 cover the initial memory error. The first step makes a pointer invalid, and the second one dereferences the pointer, thereby triggering the error. A pointer can become invalid by going out of the bounds of its pointed object or when the object gets deallocated. A pointer pointing to a deleted object is called a dangling pointer. Dereferencing an out-of-bounds pointer causes a so-called spatial error, while dereferencing a dangling pointer causes a temporal error.
As Step 1, an attacker may force a pointer out of bounds by exploiting various programming bugs. For instance, by triggering an unchecked allocation failure, the pointer can become a null pointer (in kernel-space null-pointer dereferences are exploitable [18]). By excessively incrementing or decrementing an array pointer in a loop without proper bound checking, a buffer overflow/underflow will happen. By causing indexing bugs where an attacker has control over the index into an array, but the bounds check is missing or incomplete, the pointer can be made to point to almost any address. Indexing bugs are often caused by integer related errors like an integer overflow, truncation or signedness bug, or incorrect pointer casting. Lastly, pointers can be corrupted using the memory errors under discussion. This is represented by the backward loop in Figure 1.
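As a concrete illustration (not taken from the paper or any real code base; all identifiers are hypothetical), the following C/C++ fragment contains both a classic buffer overflow and an unchecked indexing bug:

#include <string.h>

void spatial_errors(const char *input, int idx, int value) {
    char buf[16];
    int table[8];

    // Buffer overflow: no length check, so an input longer than 15
    // bytes writes past the end of buf (Step 1: the pointer goes out
    // of bounds; Step 2: the out-of-bounds pointer is written).
    strcpy(buf, input);

    // Indexing bug: idx is attacker controlled and never validated,
    // so table + idx may point almost anywhere in the address space.
    table[idx] = value;
}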
As an alternative, the attacker may make a pointer “dangling” by, for instance, exploiting an incorrect exception handler, which deallocates an object, but does not reinitialize the pointers to it. Temporal memory errors are called use-after-free vulnerabilities because the dangling pointer is dereferenced (used) after the memory area it points to has been returned (freed) to the memory management system. Most of the attacks target heap allocated objects, but pointers to a local variable can also “escape” the local scope when assigned to a global pointer. Such escaped pointers become dangling when the function returns and the local variable on the stack is deleted.
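Both temporal-error variants can be sketched in a few lines of C/C++; the fragment below is illustrative only and uses hypothetical names:

#include <stdlib.h>

int *g_ptr;                        // global pointer

void escape(void) {
    int local = 42;
    g_ptr = &local;                // pointer to a stack variable "escapes"
}                                  // local is gone: g_ptr is now dangling

void use_after_free(void) {
    char *p = (char *)malloc(64);
    free(p);                       // p becomes a dangling pointer
    p[0] = 'A';                    // temporal error: write through dangling p
}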
Next, we show how either an out-of-bounds or a dangling pointer can be exploited when the invalid pointer is read or written in Step 2. The third step of our exploitation model is then either the corruption or the leakage of some internal data.
When a value is read from memory into a register by dereferencing a pointer controlled by the attacker, the value can be corrupted. Consider the following jump table where the function pointer defining the next function call is read from an array without bounds checking.
func_ptr jump_table[3] = {fn_0, fn_1, fn_2};
jump_table[user_input]();
The attacker can make the pointer point to a location under his or her control and divert control-flow. Any other variable read indirectly can be vulnerable.
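For comparison, a hedged sketch of the missing check (assuming the three-entry jump table above and a signed index):

if (user_input >= 0 && user_input < 3)  // reject out-of-range indices
    jump_table[user_input]();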
Besides data corruption, reading memory through an attacker specified pointer leaks information if that data is
Figure 1. Attack model demonstrating four exploit types (code corruption, control-flow hijack, data-only attack, information leak) and the policies mitigating the attacks in different stages.
included in the output. The classic example of this attack is the printf format string bug, where the format string is controlled by the attacker. By specifying the format string the attacker creates invalid pointers and reads (and writes) arbitrary memory locations.
printf(user_input); // input "%3$x" prints the
                    // 3rd integer on the stack
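Format strings can also be abused to write memory: the %n conversion stores the number of bytes printed so far through a pointer argument. The fragment below only illustrates the mechanism:

int count = 0;
printf("AAAA%n", &count);   // count is now 4
// With attacker-chosen formats such as "%7$n", the "pointer argument"
// is simply whatever lies at that position on the stack, turning
// printf into a write primitive.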
If an attacker controlled pointer is used to write the memory, then any variable, including other pointers or even code, can be overwritten. Buffer overflows and indexing bugs can be exploited to overwrite sensitive data such as a return address or virtual table (vtable) pointer. Corrupting the vtable pointer is an example of the backward loop in Figure 1. Suppose a buffer overflow makes an array pointer go out of bounds in a first round, which is then exploited (in Step 3) to corrupt a nearby vtable pointer in memory in a second round. When the corrupted vtable pointer is dereferenced (in Step 2), a bogus virtual function pointer will be used. It is important to see that with one memory error, more and more memory errors can be triggered by corrupting other pointers. Calling free() with an attacker controlled pointer can also be exploited to carry out arbitrary memory writes [19]. Write dereferences can be exploited to leak information as well.
printf("%s\n", err_msg);
For instance, the attacker is able to leak arbitrary mem- ory contents in the above line of code by corrupting the err_msg pointer.
Temporal errors, when a dangling pointer is dereferenced in Step 2, can be exploited similarly to spatial errors. A constraint for exploitable temporal errors is that the memory area of the deallocated object (the old object) is reused by another object (new object). The type mismatch between the old and new object can allow the attacker to access unintended memory.
Let us consider first reading through a dangling pointer with the old object’s type but pointing to the new object, which is controlled by the attacker. When a virtual function of the old object is called and the virtual function pointer is looked up, the contents of the new object will be interpreted as the vtable pointer of the old object. This allows the corruption of the vtable pointer, comparable to exploiting a spatial write error, but in this case the dangling pointer is only dereferenced for a read. An additional aspect of this attack is that the new object may contain sensitive information that can be leaked when read through the dangling pointer of the old object’s type.
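A hypothetical C++ fragment sketching this pattern is shown below; that the allocator reuses the freed chunk for the attacker-controlled buffer is an assumption about a typical heap implementation:

class Old {
 public:
    virtual void handler() {}   // call goes through Old's vtable pointer
};

void temporal_vtable(void) {
    Old *o = new Old();
    delete o;                             // o is now dangling

    // An attacker-controlled allocation of the same size may reuse the
    // freed memory; its first bytes are then interpreted as Old's
    // vtable pointer on the next virtual call.
    char *attacker_buf = new char[sizeof(Old)];
    (void)attacker_buf;   // ... filled with a fake vtable pointer ...

    o->handler();   // Step 2: read through the dangling pointer; the
                    // bogus vtable pointer can redirect control flow
}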
Writing through a dangling pointer is exploitable similarly to an out-of-bounds pointer, by corrupting other pointers or data inside the new object. When the dangling pointer is an escaped pointer to a local variable and points to the stack, it may be exploited to overwrite sensitive data, such as a return address. Double-free is a special case of the use-after-free vulnerability where the dangling pointer is used to call free() again. In this case, the attacker controlled contents of the new object will be interpreted wrongly as heap metadata, which is also exploitable for arbitrary memory writes [19].
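A minimal double-free sketch (hypothetical; the reallocation in between is an assumption about the attack scenario):

char *p = (char *)malloc(32);
free(p);
// ... the chunk is reallocated and filled with attacker data ...
free(p);   // double free: attacker data may be parsed as heap metadata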
Memory errors in general allow the attacker to read and modify the program’s internal state in unintended ways. We showed that any combination of the first two steps in our memory exploitation model can be used both to corrupt internal data and to leak sensitive information. Furthermore, more memory errors can be triggered by corrupting other pointers. Programming bugs which make these errors possible, such as buffer overflows and double-frees, are common in C/C++. When developing in such low-level languages, both bounds checking and memory management are fully the programmer’s responsibility, which is very error-prone.
The above described errors are a violation of the Memory Safety policy. C and C++ are inherently memory unsafe. According to the C/C++ standards, writing an array beyond its bounds, dereferencing a null-pointer, or reading an uninitialized variable result in undefined behavior. Since finding and fixing all the programming bugs is infeasible, we need automatic solutions to enforce Memory Safety in existing programs or stop attacks in their later phases. The policy mitigating a given set of attack steps is represented in the figure by a colored area surrounding the white boxes. The section number which discusses approaches enforcing a given policy is also indicated in the figure, above the policy names. We discuss approaches that try to stop any exploit in the first (two) steps by enforcing Memory Safety in Section VI. In the following subsections we discuss the steps of different exploit paths and identify the policies mitigating the given step. As shown in the figure, some policies include other, weaker policies.
B. Code corruption attack
The most obvious way to modify the execution of a program is to use one of the abovementioned bugs to overwrite the program code in memory. A Code Integrity policy enforces that program code cannot be written. Code Integrity can be achieved if all memory pages containing code are set read-only, which is supported by all modern processors. Unfortunately, Code Integrity does not support self-modifying code or Just-In-Time (JIT) compilation. Today, every major browser includes a JIT compiler for JavaScript or Flash. For these use-cases, Code Integrity cannot be fully enforced because there is a time window during which the generated code is on a writable page.
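On POSIX systems, for instance, a loader or runtime can enforce read-only code with mprotect; the sketch below assumes a page-aligned code region and omits error handling:

#include <sys/mman.h>

// Mark a page-aligned code region read-and-execute only, so that any
// later write to it faults (Code Integrity for that region).
int protect_code(void *code_pages, size_t len) {
    return mprotect(code_pages, len, PROT_READ | PROT_EXEC);
}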
C. Control-flow hijack attack
Most often, memory corruption exploits try to take control over the program by diverting its control-flow. If Code Integrity is enforced, the attacker instead uses a memory error to corrupt a code pointer in Step 3. Code Pointer Integrity policies aim to prevent the corruption of code pointers. We discuss the potential code pointer targets and limitations of this policy in Section VIII-A.
Suppose the attacker can access and modify a return address due to a buffer overflow. For a successful exploit, the attacker needs to know the correct target value (i.e., the address of the payload) as well. We represent this as a separate fourth step in Figure 1. If the target of the control-flow hijack (the code address to jump to) is not fixed, the attacker cannot specify the target and the attack fails at this step. This property can be achieved by introducing entropy to memory addresses using Address Space Randomization. We discuss techniques randomizing the address space in Section V-A.
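The effect of such randomization can be observed directly: with ASLR enabled (e.g., for a position-independent executable on Linux), the program below prints different stack, heap and code addresses on every run. This is only an illustrative check, not a defense mechanism:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int stack_var;
    void *heap_var = malloc(1);
    // The printed addresses change on each execution when the OS
    // randomizes the corresponding memory regions.
    printf("stack %p  heap %p  code %p\n",
           (void *)&stack_var, heap_var, (void *)&main);
    free(heap_var);
    return 0;
}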
Suppose a code pointer (e.g., a function pointer) has been successfully corrupted in the first four steps. In the fifth step, the corrupted pointer has to be loaded into the instruction pointer. The instruction pointer can only be updated indirectly by executing an indirect control-flow transfer instruction, e.g., an indirect function call, indirect jump or function return instruction. Diverting the execution from the control-flow defined by the source code is a violation of the Control-flow Integrity (CFI) policy. In Section VIII-B, we cover protections that enforce different CFI policies by detecting corruption at indirect control transfers.
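Conceptually, such a protection instruments every indirect transfer with a check like the one sketched below; the whitelist, its contents and the dispatch helper are hypothetical, and real implementations encode the set of allowed targets far more efficiently:

#include <stdlib.h>

typedef void (*handler_t)(void);

// Hypothetical set of valid indirect-call targets, derived from the
// program's control-flow graph by a real CFI implementation.
extern handler_t cfi_allowed_targets[];
extern unsigned cfi_allowed_count;

static int cfi_target_is_valid(handler_t target) {
    for (unsigned i = 0; i < cfi_allowed_count; i++)
        if (cfi_allowed_targets[i] == target)
            return 1;
    return 0;
}

void dispatch(handler_t target) {
    if (!cfi_target_is_valid(target))
        abort();        // corrupted code pointer: stop the exploit here
    target();           // the transfer proceeds only to allowed targets
}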
The final step of a control-flow hijack exploit is the execution of attacker specified malicious code. Classic attacks injected so-called shellcode into memory, and diverted execution to this piece of code. This kind of exploitation is prevented by the Non-executable Data policy which can be enforced using the executable bit for memory pages to make data memory pages, like the stack or the heap, non-executable. A combination of Non-executable Data and Code Integrity results in the W⊕X (Write XOR Execute) [4] policy, stating that a page can be either writable or executable, but not both. Practically all modern CPUs support setting non-executable page permissions, so combined with non-writable code, enforcing W⊕X is cheap and practical. However, in the case of JIT compilation or self-modifying code, W⊕X cannot be fully enforced. For the sake of completeness, we note that another randomization approach, Instruction Set Randomization, can also mitigate the execution of injected code.
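Returning to the JIT case: a JIT that respects W⊕X typically maps its code buffer writable, emits the generated code, and only then flips the pages to read-execute before jumping into them. The sketch below outlines that sequence on Linux and most Unix-like systems; names are hypothetical and error handling is omitted:

#include <string.h>
#include <sys/mman.h>

void *emit_jit_code(const unsigned char *code, size_t len) {
    // 1. Map a writable (but not executable) buffer.
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    // 2. Copy in the generated machine code.
    memcpy(buf, code, len);
    // 3. Flip the pages to read-execute; the region is never writable
    //    and executable at the same time.
    mprotect(buf, len, PROT_READ | PROT_EXEC);
    return buf;
}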