Binary Static Analysis
, CTO and Co-founder
March 7, 2012
Introduction to Computer Security – COMP 116

Veracode’s CTO and Co-Founder is responsible for the company’s software security analysis capabilities. In 2008 he was named one of InfoWorld’s Top 25 CTOs and one of the 100 most influential people in IT by eWeek. In 2010, he was named a SANS Security Thought Leader.
In the 1990s he was one of the original vulnerability researchers at The L0pht. He has testified on Capitol Hill in the US on the subjects of government computer security and how vulnerabilities are discovered in software. He is the lead author of “The Art of Software Security Testing,” published by Addison-Wesley.

Writing insecure code creates a system that is just as vulnerable as not using passwords, missing encryption, or neglecting to build any other security feature.

Evolution of Computer Intrusions
ü Misconfiguration of networks or hosts
ü Weak or blank passwords, world readable file shares
ü Vulnerability in underlying OS or other “infrastructure” hardware/software
ü Steady stream of updates from Microsoft, IBM, Cisco, Oracle ü “Social Engineering” or tricking the user
ü Download and run malicious codec or install free AV, phishing, clicking on link to exploit client software vulnerabilities
ü Vulnerabilities in software
ü Media players, desktop software, web applications

ü Analysis of software performed without actually executing the program
ü Full coverage of the entire source or binary
ü In theory, having full application knowledge can reveal a wider range of bugs and vulnerabilities than the “trial and error” of dynamic analysis
ü Impossible to identify vulnerabilities based on system configuration that exist only in the deployment environment
Static Analysis

Benefits of Binary Static Analysis
ü C99 specification has many unspecified, or implementation defined, constructs, e.g.:
ü Order of function argument evaluation int i = 0;
foo(i++, i++); // foo(0, 1), or foo(1, 0)
bar(a(), b(), c()); // where a(), b(), and c() have side effects
ü Order of expressions int i = 0;
a[++i] = i; // a[1] = 1, or a[1] = 0
ü Many others (Google for ”nasal demons”)
ü Detect flaws in third-party libraries
ü Is the compiler trustworthy? ( , “Reflections on Trusting Trust”)

Benefits Of Binary Analysis
ü Binary analysis has 100% coverage. All code can be analyzed, regardless of source availability.
ü Exact modeling of control flow in the presence of compiler switches is automatic, for example, buffer checks do not have to be emulated.
ü You analyze exactly what is being shipped. Backdoors inserted in source, compiled, and then removed from source will still be found.
ü Binary-level flaws such as optimizations that remove memory clearing of cryptographic keys, can be detected.
ü Code is always analyzed in its complete execution context. Analyzing ‘pieces’ of programs leads to higher false-positive rates.

Binary Static Analysis Architecture
Input: Program Binary
Components: Application Modeler, Model Analyzer
1. Load and disassemble the binary and third-party libraries into an intermediate representation
2. Create a high-level model of the application, including data flow, control flow, taint coloring, and numeric ranges
3. Perform analysis by scanning the model for coding patterns indicative of security flaws
4. Generate metrics based on raw vulnerability data, and use debug symbols to correlate security flaws with the original source code
Outputs: Security Metrics, Detailed Flaw List

Components of Static Binary Analysis
ü Binary Modeler
ü This component builds a model from the binary directly, using type information from debug symbols if available, producing a high-level representation of the program that includes reconstructed dataflow and control flow elements suitable for human consumption or machine- based inspection.
ü Intermediate Representation
ü This component is the core of the analysis, the data structure that represents the entire ‘meaning’ of the program being analyzed, designed carefully to represent everything and make assumptions about very little to nothing. There are some liberties one can take here but you have to be very careful!

Program Structure (SOM)
ü Describes how the program is organized ü Procedures and functions
ü Libraries
ü Class layout
ü Data structure layout
ü Not much analysis performed here, but provides the foundation for data flow and control flow phases to be layered on

Components of Static Binary Analysis
ü Model Querying and Condition Searching System ü This is responsible for searching the intermediate model
for characteristics.
ü In the case of Veracode, this is the part that looks for ‘security flaws’. At no point in the process before this stage does ‘security’ really come into play.
ü Static Binary Analysis could easily be looking for other things such as general code quality problems or to compare two models for equivalent pieces via graph isomorphism that might suggest code being stolen and reused elsewhere

Decompilation is Compilation in Reverse
Compilation (read top to bottom)
  Tokenization (lexical)
  Basic block generation, control flow
  Data flow transforms, register coloring, assembly generation
  Optimization, copy and constant propagation, loop unrolling
  Emit machine code
Decompilation (read bottom to top, mirroring the compilation stages)
  Evaluate and detect flaw patterns
  Reconstruct basic block graph and control flow
  Analyze expressions, discover code, optimize data flow
  Variablize: undo coloring, determine variable lifetimes
  Load executable, unlink DLLs, convert to custom IR
Veracode doesn’t decompile to source but rather makes use of decompiler concepts to build its control flow and data flow models.

A Brief Explanation of Taint

ü Track data from the time it enters the program throughout its variable lifetime
ü Multiple taint colors can be applied to any piece of data: untrusted, sensitive, decrypted, fromstorage, fromnetwork, etc.
ü Report locations where tainted data is used in a potentially dangerous situation
• ServletRequest.getParameter() • HttpServletRequest.getHeader() • HttpServletRequest.getCookies() • HttpUtils.parsePostData() • Socket.getInputStream() • DynaActionForm.get()
Propagators
• String.concat() • StringBuffer.append() • String.getBytes() • String.split() • String.toLowerCase() • StringBuilder.insert() • etc.
• Statement.execute() • PreparedStatement.execute() • ServletOutputStream.print() • ServletOutputStream.write() • Runtime.exec()
• File() • FileInputStream() • etc.
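The bookkeeping behind taint tracking can be sketched in a few lines of C. This is only an illustration on concrete values: a static analyzer applies the same rules to its program model rather than to running data. All names here (`enum taint`, `struct tainted_str`, `from_source`, `concat`, `sink_is_flagged`) are hypothetical and do not reflect Veracode's implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical taint colors, one bit each so a value can carry several. */
enum taint {
    TAINT_NONE      = 0,
    TAINT_UNTRUSTED = 1 << 0,
    TAINT_SENSITIVE = 1 << 1,
    TAINT_NETWORK   = 1 << 2
};

/* A string value together with the taint colors attached to it. */
struct tainted_str {
    char     text[128];
    unsigned colors;
};

/* Bounded append that always leaves dst NUL-terminated. */
static void append(char *dst, size_t cap, const char *src) {
    size_t used = strlen(dst);
    while (*src != '\0' && used + 1 < cap) {
        dst[used++] = *src++;
    }
    dst[used] = '\0';
}

/* Source: data entering the program receives its initial colors. */
struct tainted_str from_source(const char *text, unsigned colors) {
    struct tainted_str v;
    v.text[0] = '\0';
    append(v.text, sizeof v.text, text);
    v.colors = colors;
    return v;
}

/* Propagator: a concatenation carries the union of both inputs' colors. */
struct tainted_str concat(struct tainted_str a, struct tainted_str b) {
    struct tainted_str out = a;
    append(out.text, sizeof out.text, b.text);
    out.colors = a.colors | b.colors;
    return out;
}

/* Sink check: returns 1 if the value carries any color the sink forbids. */
int sink_is_flagged(struct tainted_str v, unsigned forbidden) {
    return (v.colors & forbidden) != 0;
}
```

For example, concatenating a constant SQL prefix (no colors) with a request parameter (untrusted, network) yields a value that a SQL-execution sink forbidding `TAINT_UNTRUSTED` would flag, while the prefix alone would pass.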

Control Flow
ü A basic block is code that has one entry point, one exit point and no jump instructions contained within it
ü A control flow graph represents all paths that might be traversed during execution; it is a group of basic blocks with directed edges
ü Consider virtual function calls; what appears to be a simple call to myFunction() may actually have 10 different control flow edges

Numeric Ranges
ü Attempt to predict the range of values for a variable at a particular location in the code
ü Use these ranges in conjunction with type information, buffer
sizes, etc. to detect memory corruption issues and other
numeric flaws
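One simple way to model such ranges is as inclusive intervals that are narrowed at each comparison. The sketch below assumes that representation; `struct range`, `narrow_lt`, `narrow_ge`, and `index_is_safe` are hypothetical names, and production analyzers track considerably more (overflow, signedness, relations between variables).

```c
#include <limits.h>

/* Hypothetical inclusive interval [lo, hi] for an int variable. */
struct range {
    int lo;
    int hi;
};

/* Range of x on the true branch of `x < bound` (assumes bound > INT_MIN). */
struct range narrow_lt(struct range r, int bound) {
    if (r.hi > bound - 1) {
        r.hi = bound - 1;
    }
    return r;
}

/* Range of x on the true branch of `x >= bound`. */
struct range narrow_ge(struct range r, int bound) {
    if (r.lo < bound) {
        r.lo = bound;
    }
    return r;
}

/* Returns 1 only if every value in r is a valid index into an array
   of `size` elements. */
int index_is_safe(struct range r, int size) {
    return r.lo >= 0 && r.hi < size;
}
```

Starting from the full `int` range, a lone `idx < 32` check leaves the lower bound at `INT_MIN`, so `index_is_safe` still reports a problem for a 32-element buffer; adding an `idx >= 0` check narrows the range to 0 through 31, which is safe.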
char lookup(int idx) {
    char buf[32];
    load_values(buf);
    return buf[idx];            // Q: What is the range of idx? A: INT_MIN to INT_MAX
}

char lookup_safe(int idx) {
    char buf[32];
    load_values(buf);
    if (idx < 32) {
        return buf[idx];        // Q: What is the range of idx? A: INT_MIN to 31 (oops, still vulnerable)
    }
    printf("idx was %d", idx);  // Q: What is the range of idx? A: 32 to INT_MAX
    return 0;
}

Veracode Query Language (VQL)
✓ Declarative scan language
✓ Abstracts away internals
✓ Makes scan intent clear
✓ Makes scan reviewable
✓ Shortens development time

scan TimeBomb1 {
    L1: now = time( _ );
    L2: VQL_COMPARE( now, bombtime );
    AlwaysConst( bombtime, L2 );
    Annotate( L2, VULN_Time_Bomb );
}

Detecting Flaws
✓ Coding flaws can be represented by patterns
✓ Pose a series of questions to the control flow and data flow models to determine if those patterns exist
✓ Example: Is user-supplied data ever concatenated into an ad-hoc SQL query?
  ✓ Relies on control flow, data flow, untrusted taint color, and knowledge of database APIs

Detecting Flaws
✓ Example: Is sensitive information ever exfiltrated from a mobile application?
  ✓ Relies on control flow, data flow, sensitive taint color, and network communication APIs for a specific mobile platform

Questions?
Thank You!