CSCI 2021: x86-64 Assembly Extras and Wrap
Chris
Updated: Mon Nov 1 02:22:23 PM CDT 2021

Logistics
Reading Bryant/O’Hallaron
Read in Full
▶ Ch 3.7 Procedure Calls
Skim the following
▶ Ch 3.8-3.9: Arrays, Structs
▶ Ch 3.10: Pointers/Security
▶ Ch 3.11: Floating Point
Goals
⊠ Asm Procedure Calls
⊠ Assembly vs C
□ Data in Assembly
□ Security Risks
□ Floating Point Instr/Regs
Date: Wed 10/27, Fri 10/29, Mon 11/01, Wed 11/03, Fri 11/05, Wed 11/10
Event: Asm Extras, Asm Extras, Asm Wrap-up, Practice Exam 2, Lab/HW 9: Review, Exam 2, P3 Due
Project 3
▶ Problem 1: Clock Assembly Functions (50%)
▶ Problem 2: Binary Bomb via debugger (50%)
Start NOW if you haven’t already

Exercise: All Models are Wrong…
▶ Rule #1: The Doctor Lies
▶ Below is our original model for memory layout of C programs
▶ Describe what is incorrect based on x86-64 assembly
▶ Will all variables have a position in the stack?
▶ What else is on the stack / control flow info?
▶ What registers are likely used?
 9: int main(…){
10:   int x = 19;
11:   int y = 31;
12:   swap(&x, &y);
13:   printf("%d %d\n",x,y);
14:   return 0;
15: }

18: void swap(int *a,int *b){
19:   int tmp = *a;
20:   *a = *b;
21:   *b = tmp;
22:   return;
23: }

STACK: Caller main(), prior to swap()
| FRAME   | ADDR  | NAME | VALUE |
|---------+-------+------+-------|
| main()  | #2048 | x    |    19 |
| line:12 | #2044 | y    |    31 |
|---------+-------+------+-------|

STACK: Callee swap() takes control
| FRAME   | ADDR  | NAME | VALUE |
|---------+-------+------+-------|
| main()  | #2048 | x    |    19 |<-+
| line:12 | #2044 | y    |    31 |<-|+
|---------+-------+------+-------|  ||
| swap()  | #2036 | a    | #2048 |--+|
| line:19 | #2028 | b    | #2044 |---+
|         | #2024 | tmp  |     ? |

Answers: All Models are Wrong, Some are Useful
STACK: Callee swap() takes control
| FRAME   | ADDR  | NAME | VALUE   |
|---------+-------+------+---------|
| main()  | #2048 | x    |      19 |
|         | #2044 | y    |      31 |
|---------+-------+------+---------|
| swap()  | #2036 | rip  | Line 13 |
|---------+-------+------+---------|

REGS as swap() starts
| REG | VALUE | NOTE         |
|-----+-------+--------------|
| rdi | #2048 | for *a       |
| rsi | #2044 | for *b       |
| rax |     ? | for tmp      |
| rip |   L19 | line in swap |
▶ main() must have stack space for locals passed by address
▶ swap() needs no stack space for arguments: in registers
▶ Return address is next value of rip register in main()
▶ Mostly don’t need to think at this level of detail but can be useful in some situations

Data In Assembly
Arrays
Usually: base + index × size
arr[i] = 12;
movl $12,(%rdi,%rsi,4)
int x = arr[j];
movl (%rdi,%rcx,4),%r8d
▶ Array starting address often held in a register
▶ Index often in a register
▶ Compiler inserts appropriate size (1,2,4,8)
Structs
Usually base+offset
typedef struct {
  int i;
  short s;
  char c[2];
} foo_t;
foo_t *f = …;
short sh = f->s;
movw 4(%rdi),%si
f->c[i] = 'X';
movb $88, 6(%rdi,%rax)

Packed Structures as Procedure Arguments
▶ Passing pointers to structs is ’normal’: registers contain addresses to main memory
▶ Passing actual structs may result in packed structs where several fields are in a single register
▶ Assembly must unpack these through shifts and masking

// packed_struct_main.c
#include <stdio.h>
typedef struct {
  short first;
  short second;
} twoshort_t;

short sub_struct(twoshort_t ti);

int main(){
  twoshort_t ts = {.first=10,
                   .second=-2};
  int sum = sub_struct(ts);
  printf("%d - %d = %d\n",
         ts.first, ts.second, sum);
  return 0;
}

### packed_struct.s
.text
.globl sub_struct
sub_struct:
        ## first arg is twoshort_t ts
        ## %rdi has 2 packed shorts in it
        ## bits 0-15 are ts.first
        ## bits 16-31 are ts.second
        ## upper bits could be anything
        movl  %edi,%eax         # eax = ts;
        andl  $0xFFFF,%eax      # eax = ts.first;
        sarl  $16,%edi          # edi = edi >> 16;
        andl  $0xFFFF,%edi      # edi = ts.second;
        subw  %di,%ax           # ax = ax - di
        ret                     # answer in ax

Example: coins_t in HW06 / Lab07
// Type for collections of coins
typedef struct {  // coins_t has the following memory layout
  char quarters;  //
  char dimes;     // |          | Pointer | Packed | Packed |
  char nickels;   // |          | Memory  | Struct | Struct |
  char pennies;   // | Field    | Offset  | Arg#   | Bits   |
} coins_t;        // |----------+---------+--------+--------|
                  // | quarters |      +0 | #1     | 0-7    |
                  // | dimes    |      +1 | #1     | 8-15   |
                  // | nickels  |      +2 | #1     | 16-23  |
                  // | pennies  |      +3 | #1     | 24-31  |

set_coins:
### int set_coins(int cents, coins_t *coins)
### %edi = int cents
### %rsi = coins_t *coins

  ## | #2048 | c->quarters | 2 |
  ## | #2049 | c->dimes    | 1 |
  ## | #2050 | c->nickels  | - |
  ## | #2051 | c->pennies  | - |
  # rsi: #2048
  # al: 0  %dl: 3
  movb %al,2(%rsi)      # coins->nickels = al;
  movb %dl,3(%rsi)      # coins->pennies = dl;
  ## | #2048 | c->quarters | 2 |
  ## | #2049 | c->dimes    | 1 |
  ## | #2050 | c->nickels  | 0 |
  ## | #2051 | c->pennies  | 3 |

total_coins:
### args are
### %rdi packed coins_t struct with struct fields
### {  0- 7: quarters,  8-15: dimes,
###   16-23: nickels,  24-31: pennies}

  ### rdi: 0x00 00 00 00 03 00 01 02
  ###                     p  n  d  q
  movq %rdi,%rdx        # extract dimes
  ### rdx: 0x00 00 00 00 03 00 01 02
  ###                     p  n  d  q
  sarq $8,%rdx          # shift dimes to low bits
  ### rdx: 0x00 00 00 00 00 03 00 01
  ###                        p  n  d
  andq $0xFF,%rdx       # rdx = dimes
  ### rdx: 0x00 00 00 00 00 00 00 01
  ###                        p  n  d

General Cautions on Layout by Compilers
▶ Compiler honors order of source code fields in struct
▶ BUT compiler may add padding between/after fields for alignment
▶ Compiler determines total struct size
Struct Layout Algorithms
▶ Baked into compiler
▶ May change from compiler to compiler
▶ May change through history of compiler
Structs in Mem/Regs
▶ Stack structs may be spread across several registers
▶ Don’t need a struct on the stack at all in some cases (just as local variables don’t always need to be on the stack)
▶ Struct arguments packed into 1+ registers
Stay Insulated
▶ Programming in C insulates you from all of this
▶ Feel the warmth of gcc’s abstraction blanket

Security Risks in C
Buffer Overflow Attacks
▶ No default bounds checking in C: Performance favored over safety
▶ Allows classic security flaws:
char buf[1024];
printf("Enter your name:");
fscanf(file,"%s",buf); // BAD
// or
gets(buf); // BAD
// my name is 1500 chars
// long, what happens?
▶ Data larger than buf begins overwriting other parts of the stack
▶ Clobber return addresses
▶ Insert executable code and run it
Counter-measures
▶ Stack protection is default in gcc in the modern era
▶ Inserts “canary” values on the stack near return address
▶ Prior to function return, checks that canaries are unchanged
▶ Stack / Text section start addresses are randomized by the kernel, so return addresses and function addresses are difficult to predict ahead of time
▶ Kernel may also vary virtual memory addresses
▶ Disabling protections is risky

Stack Smashing
▶ Explored in a recent homework
▶ See stack_smash.c for a similar example
▶ Demonstrates detection of changes to stack that could be harmful
#define END 8 // too big for array
void demo(){
  int arr[4]; // fill array off the end
  for(int i=0; i<END; i++){
    …
> cd 08-assembly-extras-code/
> gcc stack_smash1.c
> ./a.out
About to do the demo
[0]: 2
[1]: 4
[2]: 6
[3]: 8
*** stack smashing detected ***: terminat
Aborted (core dumped)

Sample Buffer Overflow Code
#include <stdio.h>  // compiled with gcc will likely result
void never(){       // only in 'stack smashing'
  printf("This should never happen\n");
  return;
}
int main(){
  union {long addr; char str[9];} never_info;
  never_info.addr = (long) never;
  never_info.str[8] = '\0';
  printf("Address of never: %p\n",(void *) never_info.addr);
  printf("Address as string: %s\n",never_info.str);
  printf("Enter a string: ");
  char buf[4];
  fscanf(stdin,"%s",buf);
  // By entering the correct length of string followed by the ASCII
  // representation of the address of never(), one might be able to
  // get that function to run (on windows…)
  printf("You entered: %s\n",buf);
  return 0;
}

Accessing Global Variables in Assembly
Global data can be set up in assembly in .data sections with labels and assembler directives like .int and .short
.data
an_int:                 # single int
        .int 17
some_shorts:            # array of shorts
        .short 10       # some_shorts[0]
        .short 12       # some_shorts[1]
        .short 14       # some_shorts[2]
Modern Access to Globals
movl an_int(%rip), %eax
leaq some_shorts(%rip), %rdi
▶ Uses %rip relative addressing
▶ Default in gcc as it plays nice with OS security features
▶ May discuss again later during Linking/ELF coverage
Traditional Access to Globals
movl an_int, %eax # ERROR
leaq (some_shorts), %rdi # ERROR
▶ Not accepted by gcc by default
▶ Yields compile/link errors
/usr/bin/ld: /tmp/ccocSiw5.o: relocation R_X86_64_32S against `.data' can not be used when making a PIE object; recompile with -fPIE

Floating Point Operations
▶ The original Intel 8086 chips didn’t have floating point ops
▶ Had to buy a co-processor, Intel 8087, to add FP ops
▶ Modern CPUs ALL have FP ops but they feel separate from the integer ops: FPU versus ALU
FP “Media” Registers
| 256-bits | 128-bits | Use            |
|----------+----------+----------------|
| %ymm0    | %xmm0    | FP Arg 1 / Ret |
| %ymm1    | %xmm1    | FP Arg 2       |
| …        | …        | …              |
| %ymm7    | %xmm7    | FP Arg 8       |
| %ymm8    | %xmm8    | Caller Save    |
| …        | …        | …              |
| %ymm15   | %xmm15   | Caller Save    |
▶ Can be used as “scalars” – single values but…
▶ xmmI is 128 bits big holding
  ▶ 2 64-bit FP values OR
  ▶ 4 32-bit FP values
▶ ymmI doubles this
Instructions
vaddss %xmm2,%xmm4,%xmm0
# xmm0[0] = xmm2[0] + xmm4[0]
# Add Scalar Single-Precision
vaddps %xmm2,%xmm4,%xmm0
# xmm0[:] = xmm2[:] + xmm4[:]
# Add Packed Single-Precision
# “Vector” Instruction
▶ Operates on single values or “vectors” of packed values
▶ 3-operands common in more “modern” assembly languages

Floating Point and ALU Conversions
▶ Recall that the bit layouts of integers and floating point numbers are quite different (how?)
▶ Leads to a series of assembly instructions to interconvert between types
# int eax = …;
# double xmm0 = (double) eax;
vcvtsi2sd %eax,%xmm0,%xmm0
# double xmm1 = …
# long rcx = (long) xmm1;
vcvttsd2siq %xmm1,%rcx
▶ These are non-trivial conversions: 5-cycle latency (delay) before completion, can have a performance impact on code which does conversions