CS计算机代考程序代写 file system cache assembler Lecture 2

Lecture 2
CS 111: Operating System Principles

Interfaces
1.0.2

Jon Eyolfson
April 1, 2021

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License

cba

http://creativecommons.org/licenses/by-sa/4.0/

CPUs Have “Rings” to Control Instruction Access

Hypervisor
(Ring -1)

Kernel / Supervisor
(Ring 0)

User
(Ring 3)

Each ring can access instructions in any of its outer rings

1

The Kernel of the Operating System Runs in Kernel Mode

User space
Kernel space

2

System Calls Transition between User and Kernel Mode

User space

Kernel space

(352 total)
read write open close stat mmap brk pipe clone fork
execve exit wait4 chdir mkdir rmdir creat mount

init_module delete_module clock_nanosleep exit_group

3

A Monolithic Kernel Runs Operating System Services in Kernel Mode

User space
Kernel space

Process SchedulingVirtual Memory IPC

Device DriversFile Systems

4

A Microkernel Runs the Minimum Amount of Services in Kernel Mode

User space
Kernel space

Process SchedulingVirtual Memory Basic IPC

Device DriversFile Systems Advanced IPC

5

Other Types of Kernels

“Hybrid” kernels are between monolithic and microkernels
Emulation services to user mode (Windows)
Device drivers to user mode (macOS)

Nanokernels and picokernels
Move even more into user mode than traditional microkernels

There’s many different lines you can draw with different trade-offs

6

Let’s Execute a 178 Byte “Hello World” on Linux x86-64

0x7F 0x45 0x4C 0x46 0x02 0x01 0x01 0x03 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x02 0x00 0x3E 0x00 0x01 0x00 0x00 0x00 0x78 0x00 0x01 0x00 0x00 0x00 0x00 0x00
0x40 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x40 0x00 0x38 0x00 0x01 0x00 0x40 0x00 0x00 0x00 0x00 0x00
0x01 0x00 0x00 0x00 0x05 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x00
0xB2 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0xB2 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x48 0xC7 0xC0 0x01 0x00 0x00 0x00 0x48
0xC7 0xC7 0x01 0x00 0x00 0x00 0x48 0xC7 0xC6 0xA6 0x00 0x01 0x00 0x48 0xC7 0xC2
0x0C 0x00 0x00 0x00 0x0F 0x05 0x48 0xC7 0xC0 0xE7 0x00 0x00 0x00 0x48 0xC7 0xC7
0x00 0x00 0x00 0x00 0x0F 0x05 0x48 0x65 0x6C 0x6C 0x6F 0x20 0x77 0x6F 0x72 0x6C
0x64 0x0A

7

ELF is the Binary Format for Unix Operating Systems

Executable and Linkable Format (ELF) is a file format

Always starts with the 4 bytes: 0x7F 0x45 0x4C 0x46
or with ASCII encoding: 0x7F ‘E’ ‘L’ ‘F’

Followed by a byte signifying 32 or 64 bit architectures
then a byte signifying little or big endian

Most file formats have different starting signatures (or magic numbers)

8

Use readelf to Read ELF File Headers

Command: readelf

Contains the following:
• Information about the machine (e.g. the ISA)
• The entry point of the program
• Any program headers (required for executables)
• Any section headers (required for libraries)

The header is 64 bytes, so we still have to account for 114 more.

9

Result of readelf -h on “Hello world”

ELF Header:
Magic: 7f 45 4c 46 02 01 01 03 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2’s complement, little endian
Version: 1 (current)
OS/ABI: UNIX – GNU
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x10078
Start of program headers: 64 (bytes into file)
Start of section headers: 0 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 1
Size of section headers: 64 (bytes)
Number of section headers: 0
Section header string table index: 0

10

ELF Program Header

Tells the operating system how to load the executable:
• Which type? Examples:

• Load directly into memory
• Use dynamic linking (libraries)
• Interpret the program

• Permissions? Read / Write / Execute
• Which virtual address to put it?

• Note that you’ll rarely ever use physical addresses (for embedded)

For “Hello world” we load everything into memory
One program header is 56 bytes

58 bytes left

11

Result of readelf -l on “Hello world”

Elf file type is EXEC (Executable file)
Entry point 0x10078
There is 1 program header, starting at offset 64

Program Headers:
Type Offset VirtAddr PhysAddr

FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000010000 0x0000000000010000

0x00000000000000b2 0x00000000000000b2 R E 0x100

12

“Hello world” Needs 2 System Calls

Command: strace

This shows all the system calls our program makes:

execve(“./hello_world”, [“./hello_world”], 0x7ffd0489de40 /* 46 vars */) = 0
write(1, “Hello world\n”, 12) = 12
exit_group(0) = ?
+++ exited with 0 +++

13

Quick Aside: API Tells You What and ABI Tells You How

Application Programming Interface (API) abstracts the details how how to
communicate

e.g. A function takes 2 integer arguments

Application Binary Interface (ABI) specifies how to layout data and how to
concretely communicate

e.g. The same function using the C calling convention

14

System Call API for “Hello world”

strace shows the API of system calls

The write system call’s API is:
• A file descriptor to write bytes to
• An address to contiguous sequence of bytes
• How many bytes to write from the sequence

The exit_group system call’s API is:
• An exit code for the program (0-255)

15

System Call ABI for Linux x86-64

Enter the kernel with a syscall instruction, using registers for arguments:
• rax — System call number
• rdi — 1st argument
• rsi — 2nd argument
• rdx — 3rd argument
• r10 — 4th argument
• r8 — 5th argument
• r9 — 6th argument

What are the limitations of this?

Note: other registers are not used, whether they’re saved isn’t important for us

16

Instructions for “Hello world”, Using the Linux x86-64 ABI

Plug in the next 46 bytes into a disassembler, such as:
https://onlinedisassembler.com/

Our disassembled instructions:
mov rax,0x1
mov rdi,0x1
mov rsi,0x100a6
mov rdx,0xc
syscall
mov rax,0xe7
mov rdi,0x0
syscall

17

https://onlinedisassembler.com/

Finishing Up “Hello world” Example

The remaining 12 bytes is the “Hello world” string itself, ASCII encoded:
0x48 0x65 0x6C 0x6C 0x6F 0x20 0x77 0x6F 0x72 0x6C 0x64 0x0A

Low level ASCII tip: bit 5 is 0/1 for upper case/lower case (values differ by 32)

This accounts for every single byte of our 178 byte program, let’s see what C does…

Can you already spot a difference between strings in our example compared to C?

18

Source Code for “Hello world” in C

#include

int main(int argc, char **argv)
{
printf(“Hello world\n”);
return 0;

}

Compile with Make in examples/lecture-02

What are other notable differences between this and our “Hello world”?

19

System Calls for “Hello world” in C, Finding Standard Library

execve(“./hello_world_c”, [“./hello_world_c”], 0x7ffcb3444f60 /* 46 vars */) = 0
brk(NULL) = 0x5636ab9ea000
openat(AT_FDCWD, “/etc/ld.so.cache”, O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=149337, …}) = 0
mmap(NULL, 149337, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f4d43846000
close(3) = 0
openat(AT_FDCWD, “/usr/lib/libc.so.6”, O_RDONLY|O_CLOEXEC) = 3
read(3, “\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000C”…, 832) = 832
lseek(3, 792, SEEK_SET) = 792
read(3, “\4\0\0\0\24\0\0\0\3\0\0\0GNU\0\201\336\t\36\251c\324″…, 68) = 68
fstat(3, {st_mode=S_IFREG|0755, st_size=2136840, …}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0x7f4d43844000

lseek(3, 792, SEEK_SET) = 792
read(3, “\4\0\0\0\24\0\0\0\3\0\0\0GNU\0\201\336\t\36\251c\324″…, 68) = 68
lseek(3, 864, SEEK_SET) = 864
read(3, “\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0”, 32) = 32

20

System Calls for “Hello world” in C, Loading Standard Library

mmap(NULL, 1848896, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f4d43680000
mprotect(0x7f4d436a2000, 1671168, PROT_NONE) = 0
mmap(0x7f4d436a2000, 1355776, PROT_READ|PROT_EXEC,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x22000) = 0x7f4d436a2000

mmap(0x7f4d437ed000, 311296, PROT_READ,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16d000) = 0x7f4d437ed000

mmap(0x7f4d4383a000, 24576, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b9000) = 0x7f4d4383a000

mmap(0x7f4d43840000, 13888, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f4d43840000

close(3) = 0
arch_prctl(ARCH_SET_FS, 0x7f4d43845500) = 0
mprotect(0x7f4d4383a000, 16384, PROT_READ) = 0
mprotect(0x5636a9abd000, 4096, PROT_READ) = 0
mprotect(0x7f4d43894000, 4096, PROT_READ) = 0
munmap(0x7f4d43846000, 149337) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x1), …}) = 0

21

System Calls for “Hello world” in C, Setting Up Heap and Printing

brk(NULL) = 0x5636ab9ea000
brk(0x5636aba0b000) = 0x5636aba0b000
write(1, “Hello world\n”, 12) = 12
exit_group(0) = ?
+++ exited with 0 +++

The C version of “Hello world” ends with the exact same system calls we made

22

Kernel Interfaces Operate Between CPU Mode Boundaries

The lessons from the lecture:
• Code running in kernel mode is part of your kernel
• Different kernel architectures shift how much code runs in kernel mode
• System calls is the interface between user and kernel mode
• Everything involved to define a simple “Hello world” (in 178 bytes)

• Difference between API and ABI
• How to explore system calls

23