SFL @ TU Dortmund
System Security
• System security governs process interactions with the system
• How do we define and confine a trusted OS kernel?
• Which objects (files, sockets, other processes, etc.) can a process access?
• How do we grant or revoke certain access privileges?
• How can we confine an attacker-controlled process to not do any harm?
Kernel vs. User
Privilege Rings (1/2)
• CPU enforces protection levels in hardware
• Intel designed four rings: ring-0 (most privileges) to ring-3 (least privileges)
• System software runs in ring-0 (kernel land)
• Unprivileged processes run in ring-3 (user space)
• Device drivers were planned to run in ring-1 or ring-2
• Additional rings (-1, etc.) come with virtualization
• Protection levels provide a hardware-enforced separation between privileged and less privileged code
Privilege Rings (2/2)
• In practice, OSes use just two rings: 0 (kernel) and 3 (user)
• No separation for device drivers, they run in ring-0 as well
• Some CPU vendors offer just two rings; to stay portable across CPUs, OSes settled for the two-ring model and accepted this downgrade in security
• Restrictions on ring-3 (“unprivileged”) code:
• Several instructions are forbidden by the CPU
• No access to other processes’ address space
• No direct I/O (files, hardware, bus, …)
• No direct access to ring-0 code and data
• No direct access to physical memory
• …
Memory Separation in Windows/Linux
• OSes (Win / Linux) split the 64-bit virtual memory in two halves
• Emulate 48-bit addressing: the 16 most significant bits replicate the 17th most significant bit (bit 47)
(addresses of this form are called canonical addresses)
• Lower half (17th MSB = 0) used for user land -> content process-specific
• Upper half (17th MSB = 1) used for kernel -> content identical for all
0x0000 0000 0000 0000 to 0x0000 7fff ffff ffff: user land (lower half)
(unused virtual address range)
0xffff 8000 0000 0000 to 0xffff ffff ffff ffff: kernel (upper half)
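The canonical-address rule can be checked with a few bit operations. The following is a minimal sketch, assuming 48-bit addressing as described on this slide; the function names `is_canonical` and `is_kernel_half` are made up for illustration.

```python
def is_canonical(addr: int) -> bool:
    """Return True if bits 63..47 of a 64-bit address are all equal."""
    if not 0 <= addr < 2**64:
        raise ValueError("expected an unsigned 64-bit address")
    top17 = addr >> 47  # the 16 most significant bits plus bit 47
    return top17 == 0 or top17 == 0x1FFFF


def is_kernel_half(addr: int) -> bool:
    """Return True for canonical addresses whose bit 47 is set (upper half)."""
    return is_canonical(addr) and (addr >> 47) & 1 == 1


print(is_canonical(0x0000_7FFF_FFFF_FFFF))    # True: last user-land address
print(is_canonical(0x0000_8000_0000_0000))    # False: inside the unused range
print(is_kernel_half(0xFFFF_8000_0000_0000))  # True: first kernel address
```

Addresses between the two halves fail the check because their upper 16 bits do not repeat bit 47.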
System Calls (1/2)
• User processes cannot directly access kernel land
• System calls provide the interface between user and kernel land
• System calls at a glance:
1. Unprivileged process invokes one of the predefined system calls
2. System call traps to OS, i.e., CPU elevates privileges of the process and runs the system call handler
3. System call handler performs operations as requested by process and returns to the user (i.e., CPU downgrades privileges)
(Figure: virtual address space. User land occupies 0x0000 0000 0000 0000 to 0x0000 7fff ffff ffff; the syscall handler lives in the kernel (ring 0) half, 0xffff 8000 0000 0000 to 0xffff ffff ffff ffff.)
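The three steps can be observed from user land by issuing a system call by number. The sketch below assumes x86-64 Linux, where `getpid` has the number 39, and uses libc's generic `syscall()` wrapper through `ctypes`; on any other platform it falls back to `os.getpid()`. The helper name `raw_getpid` is hypothetical.

```python
import ctypes
import os
import platform
import sys

SYS_GETPID_X86_64 = 39  # assumption: valid for x86-64 Linux only


def raw_getpid() -> int:
    """Ask the kernel for our PID by syscall number where that is possible."""
    if sys.platform.startswith("linux") and platform.machine() == "x86_64":
        libc = ctypes.CDLL(None, use_errno=True)
        # Steps 1 and 2: place the number in %rax and trap into the kernel.
        # Step 3: the handler returns the result to user land.
        return libc.syscall(SYS_GETPID_X86_64)
    # Fallback: the regular wrapper issues the same request for us.
    return os.getpid()


print(raw_getpid() == os.getpid())
```

Ordinary programs rarely do this by hand; library wrappers such as `os.getpid()` hide the syscall number and the trap.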
System Calls (2/2)
• Example: System call table of x64 Linux
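Syscall Number (%rax)   Name             Entry point          Implementation
0                       read             sys_read             fs/read_write.c
1                       write            sys_write            fs/read_write.c
4                       stat             sys_newstat          fs/stat.c
5                       fstat            sys_newfstat         fs/stat.c
6                       lstat            sys_newlstat         fs/stat.c
7                       poll             sys_poll             fs/select.c
8                       lseek            sys_lseek            fs/read_write.c
9                       mmap             sys_mmap             arch/x86/kernel/sys_x86_64.c
10                      mprotect         sys_mprotect         mm/mprotect.c
11                      munmap           sys_munmap           mm/mmap.c
13                      rt_sigaction     sys_rt_sigaction     kernel/signal.c
14                      rt_sigprocmask   sys_rt_sigprocmask   kernel/signal.c
15                      rt_sigreturn     stub_rt_sigreturn    arch/x86/kernel/signal.c
16                      ioctl            sys_ioctl            fs/ioctl.c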
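Conceptually, the system call table is an array of handlers indexed by the number in %rax. The following toy model is a sketch of that dispatch idea only: the handler bodies are made up, and just the numbers 0 (read), 1 (write), and 16 (ioctl) follow x86-64 Linux.

```python
import errno
from typing import Callable, Dict


# Toy stand-ins for kernel entry points; the bodies are purely illustrative.
def sys_read(fd: int, count: int) -> str:
    return f"read(fd={fd}, count={count})"


def sys_write(fd: int, data: bytes) -> str:
    return f"write(fd={fd}, {len(data)} bytes)"


def sys_ioctl(fd: int, request: int) -> str:
    return f"ioctl(fd={fd}, request={request})"


SYSCALL_TABLE: Dict[int, Callable] = {0: sys_read, 1: sys_write, 16: sys_ioctl}


def dispatch(rax: int, *args):
    """Look up the handler for a syscall number and run it."""
    handler = SYSCALL_TABLE.get(rax)
    if handler is None:
        return -errno.ENOSYS  # unknown number: "function not implemented"
    return handler(*args)


print(dispatch(0, 3, 10))
print(dispatch(9999))
```

A real kernel validates the number and every argument before acting, since all of them come from untrusted user land.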
Virtual Memory of a User-Land Process
• Process memory is organized using a stack and a heap
• Stack for per-function variables (grows toward lower addresses)
• Heap for dynamically allocated memory
Virtual Address Space (0x0000 7FFF…FF at the top, 0x0000 0000…00 at the bottom)
• Stack (last in, first out; grows downward), designed for:
• Function parameters
• Local variables
• Function return addresses
• Temporary variables
• Heap stores dynamically allocated data items:
• malloc() etc. allocate
• free() etc. release
• Other sections (e.g., .data)
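The two stack properties named here, last in first out and growth toward lower addresses, can be illustrated with a toy model. This is a sketch only: the class `ToyStack`, the start address, and the 8-byte slot size are assumptions made for the example, not how any particular ABI lays out frames.

```python
class ToyStack:
    """Model of a downward-growing stack made of fixed-size slots."""

    def __init__(self, top: int = 0x7FFF_FFFF_F000, slot: int = 8) -> None:
        self.sp = top    # stack pointer starts at a (hypothetical) high address
        self.slot = slot
        self.mem = {}    # address -> stored value

    def push(self, value) -> int:
        self.sp -= self.slot  # growing the stack lowers the stack pointer
        self.mem[self.sp] = value
        return self.sp

    def pop(self):
        value = self.mem.pop(self.sp)
        self.sp += self.slot  # shrinking the stack raises it again
        return value


stack = ToyStack()
addr_ret = stack.push("return address")
addr_local = stack.push("local variable")
print(addr_local < addr_ret)  # True: the later push sits at the lower address
print(stack.pop())            # local variable (last in, first out)
```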
Resource Permissions
Protection Domains (1/3)
• OS maintains access to several resources (files, devices, pages, etc.)
• So far we discussed coarse-grained separation between user and kernel
• How about isolation between different users and their processes?
• Protection domains
• Set of (object, rights) pairs, where rights is a (sub)set of operations a subject (e.g., user) is allowed to perform on an object (file, device, etc.)
• Users/processes are assigned to a protection domain
• Protection domains govern access privileges
Protection Domains (2/3)
(Figure: Domain 1 and Domain 2, each grouping files with per-file rights such as r, rw, and rx.)
Protection Domains (3/3)
• Protection domains can be represented in a protection matrix
• Protection matrices are usually sparse
• Two main principles to store this matrix efficiently
• By column: Per object, i.e., store domains’ privileges (Access Control Lists)
• By row: Per domain, i.e., store access rights to objects (Capabilities)
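(Figure: protection matrix with one row per domain, e.g., Domain 3, and one column per object, e.g., File 2 to File 6; non-empty cells hold rights such as r or rw.)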
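The two storage principles are two views of the same sparse matrix. Here is a minimal sketch; the matrix contents and the helper names `by_column`, `by_row`, and `allowed` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical sparse protection matrix: only non-empty cells are stored.
MATRIX = {
    ("Domain 1", "File 1"): {"r"},
    ("Domain 1", "File 2"): {"r", "w"},
    ("Domain 2", "File 3"): {"r", "w", "x"},
    ("Domain 3", "File 2"): {"r"},
}


def by_column(matrix):
    """Access control lists: for each object, the domains and their rights."""
    acls = defaultdict(dict)
    for (domain, obj), rights in matrix.items():
        acls[obj][domain] = rights
    return dict(acls)


def by_row(matrix):
    """Capability lists: for each domain, the objects and the rights on them."""
    caps = defaultdict(dict)
    for (domain, obj), rights in matrix.items():
        caps[domain][obj] = rights
    return dict(caps)


def allowed(matrix, domain, obj, right):
    """An empty cell means no rights at all."""
    return right in matrix.get((domain, obj), set())


print(sorted(by_column(MATRIX)["File 2"]))   # ['Domain 1', 'Domain 3']
print(sorted(by_row(MATRIX)["Domain 1"]))    # ['File 1', 'File 2']
print(allowed(MATRIX, "Domain 3", "File 2", "w"))  # False
```

With ACLs the rights are stored next to the object, with capabilities next to the subject, so each view makes a different question cheap to answer.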
Example: File Permissions in UNIX (1/2)
• UNIX enforces minimal access control lists (ACLs) for files
• Files belong to an owner and a group
• All types of UNIX files are maintained as inodes
• An inode is a control structure that contains file meta information
• File directories are structured in a hierarchical tree (also as inodes)
• Each file has protection bits
• 9 bits to specify RWX (read, write, execute) rights for owner, group, and others
• SetUID and SetGID bits to execute a file with the user/group ID of the file's owner
• Sticky bit: if set for a directory, only the owner (of the file or of the directory) can delete or rename files in that directory
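The protection bits can be decoded with Python's standard `stat` module. The sketch below also includes a simplified owner/group/other check; the function `may_access` is hypothetical and ignores root as well as extended ACLs.

```python
import stat


def may_access(mode, file_uid, file_gid, uid, gids, want):
    """Classic UNIX check: exactly one class (owner, group, others) applies."""
    bit = {"r": 2, "w": 1, "x": 0}[want]
    if uid == file_uid:
        shift = 6  # owner bits
    elif file_gid in gids:
        shift = 3  # group bits
    else:
        shift = 0  # bits for everyone else
    return bool((mode >> (shift + bit)) & 1)


# Regular file, SetUID bit set, rwxr-xr-x
print(stat.filemode(stat.S_IFREG | stat.S_ISUID | 0o755))  # -rwsr-xr-x
# Directory, sticky bit set, rwxrwxrwx (typical for /tmp)
print(stat.filemode(stat.S_IFDIR | stat.S_ISVTX | 0o777))  # drwxrwxrwt

# Mode 0o640: owner rw-, group r--, others ---
print(may_access(0o640, 1000, 100, uid=1000, gids={100}, want="w"))  # True
print(may_access(0o640, 1000, 100, uid=2000, gids={100}, want="w"))  # False
```

Only the first matching class is consulted, so an owner with fewer bits than the group is denied even when a group membership would grant access.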
Example: File Permissions in UNIX (2/2)
• Privileges are too coarse-grained in many cases
• Groups can help to better organize file accesses
• However, a file can only belong to one group
• Extended ACLs can grant permissions to an arbitrary number of domains
• Grant privileges with the setfacl command
• Complementary to and checked in addition to minimal ACLs
• OS iterates over extended permissions and checks if at least one (either user or group-based rule) grants access permission
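The iteration described in the last bullet can be sketched as follows. This is a simplified model of the rule as stated on the slide, not the full POSIX ACL algorithm (which also involves a mask entry and a fixed evaluation order); `AclEntry` and `acl_grants` are hypothetical names.

```python
from typing import Iterable, NamedTuple, Set


class AclEntry(NamedTuple):
    kind: str    # "user" or "group"
    name: str    # user or group name the rule applies to
    rights: str  # subset of "rwx", e.g. "rw"


def acl_grants(entries: Iterable[AclEntry], user: str,
               groups: Set[str], right: str) -> bool:
    """Grant access if at least one matching user or group rule allows it."""
    for entry in entries:
        matches = (
            (entry.kind == "user" and entry.name == user)
            or (entry.kind == "group" and entry.name in groups)
        )
        if matches and right in entry.rights:
            return True
    return False


# Illustrative rules for one file
rules = [
    AclEntry("user", "alice", "rw"),
    AclEntry("group", "students", "r"),
    AclEntry("group", "staff", "rw"),
]
print(acl_grants(rules, "bob", {"students"}, "r"))  # True
print(acl_grants(rules, "bob", {"students"}, "w"))  # False
```

Because several group rules can name the same file, this removes the one-group-per-file limitation of the minimal ACLs.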
UNIX: SUID Programs
• The SUID bit allows a user to start a process with higher rights than their own
• Every process has a real and an effective user/group ID (usually equal)
• Real IDs: Determined by the user who started the process -> e.g., login as user "XYZ"
• Effective IDs: Represent the rights of a process -> set with the system call setuid()
• suid / sgid: Bits to start a process with an effective ID that differs from the real ID
• -> Attractive target!
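A process can query its own real and effective IDs. The sketch below assumes a UNIX-like system, where Python's `os` module exposes the corresponding system calls; the helper names are made up for illustration.

```python
import os


def process_ids() -> dict:
    """Collect the real and effective user/group IDs of this process."""
    return {
        "real_uid": os.getuid(),
        "effective_uid": os.geteuid(),
        "real_gid": os.getgid(),
        "effective_gid": os.getegid(),
    }


def runs_with_foreign_rights(ids: dict) -> bool:
    """True if the effective IDs differ from the real ones, as for SUID/SGID programs."""
    return (
        ids["real_uid"] != ids["effective_uid"]
        or ids["real_gid"] != ids["effective_gid"]
    )


ids = process_ids()
print(ids)
print(runs_with_foreign_rights(ids))
```

A regular script normally shows equal real and effective IDs; in a SUID-root binary the effective user ID would be 0 while the real user ID stays that of the caller.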
UNIX: TOCTOU
• Time-of-Check, Time-of-Use attack
• Race condition between accesses to an entity
• Time-of-Check: Validity of assumption A on entity E is checked
• Time-of-Attack: Assumption A is not valid anymore
• Time-of-Use: Assumption A is believed to be still valid, E is used
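A common instance is a check by file name followed by a use by file name. The sketch below contrasts that pattern with one common mitigation, checking the already opened file descriptor; it assumes a UNIX-like system with `O_NOFOLLOW`, and both function names are hypothetical.

```python
import os
import stat


def read_racy(path: str) -> bytes:
    """Vulnerable pattern: check by name, then use by name."""
    if os.access(path, os.R_OK):     # time of check (uses the real IDs)
        # Time of attack: `path` can be swapped here, e.g., for a symlink.
        with open(path, "rb") as f:  # time of use
            return f.read()
    raise PermissionError(path)


def read_checked_fd(path: str) -> bytes:
    """Open first, then check the object that was actually opened."""
    # O_NOFOLLOW makes open fail if the final path component is a symlink.
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    try:
        info = os.fstat(fd)  # refers to the open file, not to the name
        if not stat.S_ISREG(info.st_mode):
            raise PermissionError("not a regular file")
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

In the second variant, check and use refer to the same file descriptor, so renaming or replacing the path in between no longer changes which entity E is used.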