
Recall: How does the Processor Talk to the Device?
[Figure: the processor reaches devices over the memory bus, through bus adaptors shared with other devices. A device controller presents a bus interface with registers (e.g., port 0x20) and addressable memory and/or queues, which may be memory-mapped (e.g., region 0x8f008020); the hardware controller reads/writes these control registers.]
Joseph & Kubiatowicz CS162 © UCB Spring 2022
• CPU interacts with a Controller
– Contains a set of registers that can be read and written
– May contain memory for request queues, etc.
• Processor accesses registers in two ways:
– Port-Mapped I/O: in/out instructions
» Example from the Intel architecture: out 0x21,AL
– Memory-mapped I/O: load/store instructions
» Registers/memory appear in the physical address space
» I/O accomplished with load and store instructions
[Figure labels: Interrupt Request; Address + Data; Regular Memory; Interrupt Controller]

Port-Mapped I/O in Pintos Speaker Driver
Pintos: devices/speaker.c
Pintos: threads/io.h
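The Pintos code referenced above did not survive extraction, so here is a hedged sketch of its flavor. The privileged x86 `out` instruction cannot run in user space, so the `outb()` below logs each port write instead of executing it; the port number 0x61 (the PC speaker gate) and the bit layout are illustrative assumptions, not a claim about the exact Pintos source.

```c
#include <stdint.h>
#include <stddef.h>

/* Sketch of port-mapped I/O in the style of Pintos's speaker driver
 * (devices/speaker.c calls an outb() helper from threads/io.h).
 * This mock logs port writes so the sketch can run unprivileged. */

#define SPEAKER_PORT_GATE 0x61   /* bits 0-1 gate the speaker to PIT channel 2 */

struct port_write { uint16_t port; uint8_t data; };
struct port_write io_log[16];
size_t io_log_len = 0;

static uint8_t speaker_gate = 0;  /* mock of the device's latched state */

static void outb (uint16_t port, uint8_t data) {
  /* The real helper is a single inline-asm instruction, resembling:
   *   asm volatile ("outb %b0, %w1" : : "a" (data), "Nd" (port)); */
  io_log[io_log_len++] = (struct port_write) { port, data };
  if (port == SPEAKER_PORT_GATE)
    speaker_gate = data;          /* device latches the written value */
}

static uint8_t inb_mock (uint16_t port) { (void) port; return speaker_gate; }

void speaker_on (void) {
  outb (SPEAKER_PORT_GATE, inb_mock (SPEAKER_PORT_GATE) | 0x3);
}

void speaker_off (void) {
  outb (SPEAKER_PORT_GATE, inb_mock (SPEAKER_PORT_GATE) & ~0x3);
}
```

The point is the shape of the interface: from the driver's perspective, device I/O is just a write of one byte to one port number.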
Joseph & Kubiatowicz CS162 © UCB Spring 2022

Example: Memory-Mapped Display Controller
• Memory-Mapped:
– Hardware maps control registers and display memory into physical address space
» Addresses set by HW jumpers or at boot time
– Simply writing to display memory (also called the “frame buffer”) changes image on screen
» Addr: 0x8000F000 — 0x8000FFFF
– Writing graphics description to cmd queue
» Say, enter a set of triangles describing some scene
» Addr: 0x80010000 – 0x8001FFFF
– Writing to the command register may cause on-board graphics hardware to do something
» Say, render the above scene
» Addr: 0x0007F004
• Can protect with address translation
[Figure: physical address space layout: graphics command queue from 0x80010000 to 0x80020000; display memory from 0x8000F000 to 0x80010000; command register at 0x0007F000 to 0x0007F004]
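The memory-mapped idiom above can be sketched in C. On real hardware the three pointers would hold the fixed physical addresses from the slide; here plain arrays stand in so the sketch can run anywhere, and all function names are hypothetical.

```c
#include <stdint.h>

/* Sketch of driving the memory-mapped display controller described
 * above.  fb_region stands in for the frame buffer at 0x8000F000,
 * cmdq_region for the command queue at 0x80010000, and cmd_register
 * for the command register at 0x0007F004. */

enum { FB_BYTES = 0x1000, CMDQ_WORDS = 0x400 };

uint8_t  fb_region[FB_BYTES];
uint32_t cmdq_region[CMDQ_WORDS];
uint32_t cmd_register;

/* volatile: device registers have side effects, so the compiler must
 * not cache, elide, or reorder these accesses. */
static volatile uint8_t  *frame_buffer = fb_region;
static volatile uint32_t *cmd_queue    = cmdq_region;
static volatile uint32_t *cmd_reg      = &cmd_register;

void draw_pixel (int offset, uint8_t color) {
  frame_buffer[offset] = color;        /* an ordinary store is the I/O */
}

void submit_triangle (uint32_t v0, uint32_t v1, uint32_t v2) {
  static int head = 0;
  cmd_queue[head++] = v0;              /* describe the scene... */
  cmd_queue[head++] = v1;
  cmd_queue[head++] = v2;
  *cmd_reg = 1;                        /* ...then kick the graphics HW */
}
```

Note there is no special instruction anywhere: memory-mapped I/O is exactly why plain loads and stores suffice, and why the region can be protected with ordinary address translation.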

There’s more than just a CPU in there!

Chip-scale Features of 2015 x86 (Sky Lake)
• Significant pieces:
– Four OOO cores with deeper buffers
» Intel MPX (Memory Protection Extensions)
» Intel SGX (Software Guard Extensions)
» Issue up to 6 μ-ops/cycle
– GPU, System Agent (Mem, Fast I/O)
– Large shared L3 cache with on-chip ring bus
» 2 MB/core instead of 1.5 MB/core
» High-BW access to L3 Cache
• Integrated I/O
– Integrated memory controller (IMC)
» Two independent channels of DRAM
– High-speed PCI-Express (for Graphics cards)
– Direct Media Interface (DMI) connection to PCH (Platform Controller Hub)

Sky Lake I/O: PCH
Sky Lake System Configuration
• Platform Controller Hub
– Connected to processor with proprietary bus
» Direct Media Interface (DMI)
• Types of I/O on PCH:
– USB, Ethernet
– Thunderbolt 3
– Audio, BIOS support
– More PCI Express (lower speed than on Processor)
– SATA (for Disks)

Operational Parameters for I/O
• Data granularity: Byte vs. Block
– Some devices provide a single byte at a time (e.g., keyboard)
– Others provide whole blocks (e.g., disks, networks, etc.)
• Access pattern: Sequential vs. Random
– Some devices must be accessed sequentially (e.g., tape)
– Others can be accessed “randomly” (e.g., disk, CD, etc.)
» Fixed overhead to start transfers
• Notification mechanism: Polling vs. Interrupts
– Some devices require continual monitoring
– Others generate interrupts when they need service
• Transfer Mechanism: Programmed IO and DMA

Transferring Data To/From Controller
• Programmed I/O:
– Each byte transferred via processor in/out or load/store
– Pro: Simple hardware, easy to program
– Con: Consumes processor cycles proportional to data size
• Direct Memory Access:
– Give controller access to memory bus
– Ask it to transfer data blocks to/from memory directly
• Sample interaction with DMA controller (from the OSC book):
[Figure: step-by-step DMA transfer diagram from Operating System Concepts]
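The contrast above can be sketched in C. The "controller" here is just a function standing in for hardware: the CPU only fills in a descriptor (addresses plus a count), and the transfer then proceeds without per-byte CPU instructions on a real system. All names are illustrative mocks, not a real device's register layout.

```c
#include <stdint.h>
#include <string.h>

/* Sketch contrasting programmed I/O with DMA. */

struct dma_descriptor {
  const uint8_t *src;    /* device buffer (here: ordinary memory) */
  uint8_t       *dst;    /* destination in main memory */
  size_t         count;  /* bytes to transfer */
};

int dma_interrupt_raised = 0;

/* What the DMA engine does autonomously once started. */
void dma_controller_run (struct dma_descriptor *d) {
  memcpy (d->dst, d->src, d->count);  /* bus-master transfer, no CPU */
  dma_interrupt_raised = 1;           /* interrupt on completion */
}

/* Programmed I/O for contrast: one CPU load/store per byte. */
void programmed_io_copy (const uint8_t *src, uint8_t *dst, size_t n) {
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i];
}
```

Both produce the same bytes in memory; the difference is who spends the cycles moving them, which is exactly the pro/con trade-off in the bullets above.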


I/O Device Notifying the OS
• The OS needs to know when:
– The I/O device has completed an operation
– The I/O operation has encountered an error
• I/O Interrupt:
– Device generates an interrupt whenever it needs service
– Pro: handles unpredictable events well
– Con: interrupts have relatively high overhead
• Polling:
– OS periodically checks a device-specific status register
» I/O device puts completion information in the status register
– Pro: low overhead
– Con: may waste many cycles on polling if I/O operations are infrequent or unpredictable
• Actual devices combine both polling and interrupts
– For instance, a high-bandwidth network adapter:
» Interrupt for first incoming packet
» Poll for following packets until hardware queues are empty
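The network-adapter pattern above (essentially what Linux calls NAPI) can be sketched with a simulated RX queue standing in for the NIC's hardware queue; all names are hypothetical.

```c
#include <stddef.h>

/* Sketch of the interrupt-then-poll hybrid: the "interrupt handler"
 * masks further interrupts and drains the queue by polling, then
 * re-arms interrupts once the queue is empty. */

enum { QUEUE_CAP = 8 };
static int rx_queue[QUEUE_CAP];
static size_t rx_head = 0, rx_tail = 0;

int interrupts_enabled = 1;
int packets_processed  = 0;

int rx_queue_empty (void) { return rx_head == rx_tail; }

void nic_deliver_packet (int pkt) {          /* "hardware" side */
  rx_queue[rx_tail++ % QUEUE_CAP] = pkt;
}

void poll_rx_queue (void) {
  while (!rx_queue_empty ()) {               /* cheap per packet */
    rx_head++;                               /* consume one packet */
    packets_processed++;
  }
  interrupts_enabled = 1;                    /* re-arm once drained */
}

void rx_interrupt_handler (void) {           /* fires for first packet only */
  interrupts_enabled = 0;                    /* mask further interrupts */
  poll_rx_queue ();
}
```

The design pays one interrupt per burst rather than one per packet: unpredictable arrivals get interrupt latency, while a full queue gets polling's low per-packet overhead.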

Kernel Device Structure

The System Call Interface

[Figure: the system call interface sits atop five kernel subsystems:
– Process Management: concurrency and multitasking; built over architecture-dependent code
– Memory Management: virtual memory; implemented by the memory manager
– Filesystems: files and dirs (the VFS); implemented over file system types and block devices
– Device Control: TTYs and device access; implemented by device control drivers
– Networking: connectivity; implemented by the network subsystem and IF (interface) drivers]

Recall: Device Drivers
• Device Driver: Device-specific code in the kernel that interacts directly with the device hardware
– Supports a standard, internal interface
– Same kernel I/O system can interact easily with different device drivers
– Special device-specific configuration supported with the ioctl() system call
• Device Drivers typically divided into two pieces:
– Top half: accessed in call path from system calls
» implements a set of standard, cross-device calls like open(), close(), read(), write(), ioctl(), strategy()
» This is the kernel’s interface to the device driver
» Top half will start I/O to device, may put thread to sleep until finished
– Bottom half: run as interrupt routine
» Gets input or transfers next block of output
» May wake sleeping threads if I/O now complete
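The top-half/bottom-half split can be sketched single-threaded: the top half starts the I/O and would block; the bottom half runs at interrupt time, finishes the transfer, and wakes the sleeper. Here the "interrupt" is invoked by hand so the control flow reads straight through, and all names are hypothetical.

```c
#include <string.h>

/* Single-threaded sketch of the driver top-half / bottom-half split. */

static char device_data[16] = "hello";   /* what the device will return */
char user_buffer[16];
int  io_pending    = 0;
int  thread_asleep = 0;

/* Bottom half: runs at interrupt time when the device finishes. */
void driver_bottom_half (void) {
  memcpy (user_buffer, device_data, sizeof device_data); /* move the data */
  io_pending = 0;
  thread_asleep = 0;               /* wake the sleeping top half */
}

/* Top half: called on the system-call path (e.g., from read()). */
int driver_read_top_half (void) {
  io_pending = 1;                  /* start I/O on the device */
  thread_asleep = 1;               /* thread would block here... */
  driver_bottom_half ();           /* ...simulated device interrupt */
  return io_pending ? -1 : 0;      /* 0 = request completed */
}
```

In a real kernel the two halves run in different contexts (a blocked thread vs. an interrupt), which is why they communicate only through shared driver state like these flags.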

Recall: Life Cycle of An I/O Request
User Program
Kernel I/O Subsystem
Device Driver Top Half
Device Driver Bottom Half
Device Hardware

The Goal of the I/O Subsystem
• Provide Uniform Interfaces, Despite Wide Range of Different Devices
– This code works on many different devices:
  FILE *fd = fopen("/dev/something", "w");
  for (int i = 0; i < 10; i++) {
    fprintf(fd, "Count %d\n", i);
  }
  fclose(fd);
– Why? Because the code that controls devices (the “device driver”) implements a standard interface
• We will try to get a flavor for what is involved in actually controlling devices in the rest of the lecture
– Can only scratch the surface!

Want Standard Interfaces to Devices
• Block devices: e.g., disk drives, tape drives, DVD-ROM
– Access blocks of data
– Commands include open(), read(), write(), seek()
– Raw I/O or file-system access
– Memory-mapped file access possible
• Character devices: e.g., keyboards, mice, serial ports, some USB devices
– Single characters at a time
– Commands include get(), put()
– Libraries layered on top allow line editing
• Network devices: e.g., Ethernet, Wireless, Bluetooth
– Different enough from block/character devices to have their own interface
– Unix and Windows include the socket interface
» Separates network protocol from network operation
» Includes select() functionality
– Usage: pipes, FIFOs, streams, queues, mailboxes

How Does the User Deal with Timing?
• Blocking interface: “Wait”
– When requesting data (e.g., read() system call), put the process to sleep until data is ready
– When writing data (e.g., write() system call), put the process to sleep until the device is ready for the data
• Non-blocking interface: “Don’t Wait”
– Returns quickly from a read or write request with a count of bytes successfully transferred
– Read may return nothing, write may write nothing
• Asynchronous interface: “Tell Me Later”
– When requesting data, take a pointer to the user’s buffer and return immediately; later the kernel fills the buffer and notifies the user
– When sending data, take a pointer to the user’s buffer and return immediately; later the kernel takes the data and notifies the user

Storage Devices
• Magnetic disks
– Storage that rarely becomes corrupted
– Large capacity at low cost
– Block-level random access (except for SMR – later!)
– Slow performance for random access
– Better performance for sequential access
• Flash memory
– Storage that rarely becomes corrupted
– Capacity at intermediate cost (5-20x disk)
– Block-level random access
– Good performance for reads; worse for random writes
– Erasure requirement in large blocks
– Wear patterns are an issue

Hard Disk Drives (HDDs)
[Figure: Western Digital drive (http://www.storagereview.com/gu), IBM/Hitachi Microdrive, read/write head]
• IBM Personal Computer/AT (1986)
– 30 MB hard disk, $500
– 30-40 ms seek time
– 0.7-1 MB/s (est.)

The Amazing Magnetic Disk
• Unit of transfer: sector
– Ring of sectors forms a track
– Stack of tracks forms a cylinder
– Heads position on cylinders
• Disk tracks ~1 μm (micron) wide
– Wavelength of light is ~0.5 μm
– Resolution of human eye: 50 μm
– 100K tracks on a typical 2.5" disk
• Separated by unused guard regions
– Reduces likelihood that neighboring tracks are corrupted during writes (still a small non-zero chance)

The Amazing Magnetic Disk
• Track length varies across the disk
– Outside: more sectors per track, higher bandwidth
– Disk is organized into regions of tracks with the same # of sectors/track
– Only the outer half of the radius is used
» Most of the disk area is in the outer regions of the disk
• Disks so big that some companies (like Google) reportedly only use part of the disk for active data
– Rest is archival data

Shingled Magnetic Recording (SMR)
• Overlapping tracks yield greater density, capacity
• Restrictions on writing, complex DSP for reading

Review: Magnetic Disks
• Cylinder: all the tracks under the heads at a given arm position, on all surfaces
[Figure: platter geometry, showing a track and a sector]
• Read/write data is a three-stage process:
– Seek time: position the head/arm over the proper track
– Rotational latency: wait for the desired sector to rotate under the r/w head
– Transfer time: transfer a block of bits (sector) under the r/w head
• Disk Latency = Queueing Time + Controller Time + Seek Time + Rotation Time + Xfer Time
[Figure: a request passes through the software queue (device driver) and hardware controller before the media time (seek + rotation + transfer)]

Typical Numbers for a Magnetic Disk
• Space/density: 18 TB (Seagate), 9 platters, in a 3.5-inch form factor! Areal density ≥ 1 Terabit/square inch (PMR, helium, ...)
• Average seek time: typically 4-6 milliseconds
• Average rotational latency: most laptop/desktop disks rotate at 3600-7200 RPM (16-8 ms/rotation); server disks go up to 15,000 RPM. Average latency is halfway around the disk, so 4-8 milliseconds
• Controller time: depends on controller hardware
• Transfer time: typically 50 to 270 MB/s. Depends on:
» Transfer size (usually a sector): 512 B – 1 KB per sector
» Rotation speed: 3600 RPM to 15,000 RPM
» Recording density: bits per inch on a track
» Diameter: ranges from 1 in to 5.25 in
• Cost: used to drop by a factor of two every 1.5 years (or faster), now slowing down

Disk Performance Example
• Assumptions:
– Ignoring queueing and controller times for now
– Avg seek time of 5 ms
– 7200 RPM ⇒ time per rotation: 60000 (ms/min) / 7200 (rev/min) ≈ 8 ms
– Transfer rate of 50 MB/s, block size of 4 KB ⇒ 4096 bytes / 50×10^6 (bytes/s) = 81.92×10^-6 s ≅ 0.082 ms per sector
• Read block from random place on disk:
– Seek (5 ms) + Rot. Delay (4 ms) + Transfer (0.082 ms) = 9.082 ms
– Approx 9 ms to fetch/put data: 4096 bytes / 9.082×10^-3 s ≅ 451 KB/s
• Read block from random place in same cylinder:
– Rot. Delay (4 ms) + Transfer (0.082 ms) = 4.082 ms
– Approx 4 ms to fetch/put data: 4096 bytes / 4.082×10^-3 s ≅ 1.03 MB/s
• Read next block on same track:
– Transfer (0.082 ms): 4096 bytes / 0.082×10^-3 s ≅ 50 MB/s
• Key to using the disk effectively (especially for file systems) is to minimize seek and rotational delays

Lots of Intelligence in the Controller
• Sectors contain sophisticated error-correcting codes
– Disk head magnet has a field wider than the track
– Hide corruptions due to neighboring-track writes
• Sector sparing
– Remap bad sectors transparently to spare sectors on the same surface
• Slip sparing
– Remap all sectors (when there is a bad sector) to preserve sequential behavior
• Track skewing
– Sector numbers offset from one track to the next, to allow for disk head movement for sequential ops

Hard Drive Prices over Time
[Figure: $/GB falling steadily over the decades]

Example of Current HDDs
• X18 (2020): 18 TB hard disk
» 9 platters, 18 heads
» Helium filled: reduces friction and power
– 4.16 ms average seek time
– 4096-byte physical sectors
– 7200 RPM
– Dual 6 Gbps SATA / 12 Gbps SAS interface
» 270 MB/s max transfer rate
» Cache size: 256 MB
– Price: $562 (~$0.03/GB)
• IBM Personal Computer/AT (1986)
– 30 MB hard disk
– 30-40 ms seek time
– 0.7-1 MB/s (est.)
– Price: $500 ($17K/GB, 340,000x more expensive!!)
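The disk-performance arithmetic above can be reproduced directly. The inputs are the slide's assumptions (5 ms average seek, 7200 RPM, 50 MB/s transfer, 4 KB blocks); note the slide rounds the average rotational delay to 4 ms, while this keeps it exact (60000/7200/2 ≈ 4.17 ms), so the totals come out slightly larger than 9.082 ms.

```c
/* Reproduces the disk performance example's arithmetic. */

double rotation_ms (double rpm)      { return 60000.0 / rpm; }
double avg_rot_delay_ms (double rpm) { return rotation_ms (rpm) / 2.0; }

double transfer_ms (double bytes, double mb_per_s) {
  return bytes / (mb_per_s * 1e6) * 1000.0;   /* bytes / (B/s) -> ms */
}

/* Random read = seek + average rotational delay + transfer. */
double random_read_ms (void) {
  return 5.0 + avg_rot_delay_ms (7200.0) + transfer_ms (4096.0, 50.0);
}

/* Effective bandwidth when every 4 KB read pays that full cost. */
double effective_bw_kb_s (double ms_per_block) {
  return 4096.0 / (ms_per_block / 1000.0) / 1000.0;
}
```

With exact rotational delay the random-read bandwidth comes out around 440 KB/s, consistent with the slide's ≅451 KB/s once its 4 ms rounding is accounted for; either way, the transfer term is negligible next to seek and rotation.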
Solid State Disks (SSDs)
• 1995: replace rotating magnetic media with non-volatile memory (battery-backed DRAM)
• 2009: use NAND Multi-Level Cell (2- or 3-bit/cell) flash memory
– Sector (4 KB page) addressable, but stores 4-64 “pages” per memory block
– Trapped electrons distinguish between 1 and 0
• No moving parts (no rotate/seek motors)
– Eliminates seek and rotational delay (0.1-0.2 ms access time)
– Very low power and lightweight
– Limited “write cycles”
• Rapid advances in capacity and cost ever since!

FLASH Memory
[Figure: Samsung 2015: 512 GB NAND flash]
• Like a normal transistor but:
– Has a floating gate that can hold charge
– To write: raise or lower the wordline high enough to cause charges to tunnel
– To read: turn on the wordline as if a normal transistor
» Presence of charge changes the threshold and thus the measured current
• Two varieties:
– NAND: denser, must be read and written in blocks
– NOR: much less dense, fast to read and write
• V-NAND: 3D stacking (Samsung claims 1 TB possible in 1 chip)

Flash Memory (Con’t)
• Data read and written in page-sized chunks (e.g., 4 KB)
– Cannot be addressed at the byte level
– Random access at block level for reads (no locality advantage)
– Writing of new blocks handled in order (kinda like a log)
• Before writing, must be erased (256 KB block at a time)
– Requires free-list management
– CANNOT write over an existing block (copy-on-write is the normal case)

SSD Architecture – Reads
[Figure: host interface feeds a flash memory controller and buffer manager (software queue) in front of the flash dies]
• Read 4 KB page: ~25 μs
– No seek or rotational latency
– Transfer time: transfer a 4 KB page
» SATA: 300-600 MB/s ⇒ ~4×10^3 B / 400×10^6 B/s ⇒ 10 μs
– Latency = Queueing Time + Controller Time + Xfer Time
– Highest bandwidth: sequential OR random reads

SSD Architecture – Writes
• Writing data is complex! (~200μs – 1.7ms)
– Can only write empty pages in a block
– Erasing a block takes ~1.5ms
– Controller maintains pool of empty blocks by coalescing used pages (read, erase, write), also reserves some % of capacity
• Rule of thumb: writes cost ~10x reads, and erasure ~10x writes
https://en.wikipedia.org/wiki/Solid-state_drive

SSD Architecture – Writes
• SSDs provide the same interface as HDDs to the OS
– Read and write a chunk (4 KB) at a time
• But can only overwrite data 256KB at a time!
• Why not just erase and rewrite new version of entire 256KB block?
– Erasure is very slow (milliseconds)
– Each block has a finite lifetime; can only be erased and rewritten about 10K times
– Heavily used blocks likely to wear out quickly

Solution – Two Systems Principles
• Layer of Indirection
– Maintain a Flash Translation Layer (FTL) in the SSD
– Map virtual block numbers (which the OS uses) to physical page numbers (which the flash memory controller uses)
– Can now freely relocate data without the OS knowing
• Copy on Write
– Don’t overwrite a page when the OS updates its data
– Instead, write the new version in a free page
– Update the FTL mapping to point to the new location

Flash Translation Layer
• No need to erase and rewrite entire 256KB block when making small modifications
• SSD controller can assign mappings to spread workload across pages
– Wear levelling
• What to do with old versions of pages?
– Garbage Collection in background
– Erase blocks with old pages, add to free list
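The two principles combine into a very small core, sketched below. Sizes are toy values, garbage collection is reduced to a per-page "stale" flag, and a real FTL would also track erase counts for wear levelling; all names are illustrative.

```c
#include <stdint.h>

/* Sketch of a Flash Translation Layer: an indirection table
 * (logical block -> physical page) plus copy-on-write updates. */

enum { N_LOGICAL = 4, N_PHYSICAL = 8 };
#define FTL_INVALID (-1)

int     ftl_map[N_LOGICAL];        /* logical block -> physical page */
int     page_stale[N_PHYSICAL];    /* old versions awaiting GC/erase */
int     next_free = 0;             /* pre-erased pages, handed out in order */
uint8_t flash[N_PHYSICAL];         /* one byte stands in for a 4 KB page */

void ftl_init (void) {
  for (int i = 0; i < N_LOGICAL; i++)
    ftl_map[i] = FTL_INVALID;
}

/* Copy-on-write: never overwrite a live page in place. */
void ftl_write (int logical, uint8_t data) {
  int old   = ftl_map[logical];
  int fresh = next_free++;         /* take a pre-erased page */
  flash[fresh] = data;             /* write the new version */
  ftl_map[logical] = fresh;        /* flip the mapping */
  if (old != FTL_INVALID)
    page_stale[old] = 1;           /* GC erases it in the background */
}

uint8_t ftl_read (int logical) { return flash[ftl_map[logical]]; }
```

Because the OS only ever sees logical block numbers, the controller is free to place each new version wherever wear levelling prefers; the slow 256 KB erase happens later, off the write path.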

Some “Current” (large) 3.5in SSDs
• SSD: 15.36 TB (2017)
– Dual 12 Gb/s interface
– Seq reads 860MB/s
– Seq writes 920MB/s
– Random Reads (IOPS): 102K
– Random Writes (IOPS): 15K
– Price (Amazon): $5495 ($0.36/GB)
• Nimbus SSD: 100TB (2019)
– Dual port: 12Gb/s interface
– Seq reads/writes: 500MB/s
– Random Read Ops (IOPS): 100K
– Unlimited writes for 5 years!
– Price: ~ $40K? ($0.4/GB)
» However, 50TB drive costs $12500 ($0.25/GB)

Amusing calculation:
Is a full Kindle heavier than an empty one?
• Actually, “Yes”, but not by much
• Flash works by trapping electrons:
– So, the erased state is lower energy than the written state
• Assuming that:
– Kindle has 4 GB flash
– ½ of all bits in a full Kindle are in the high-energy state
– High-energy state is about 10^-15 joules higher per bit
– Then: a full Kindle is 1 attogram (10^-18 gram) heavier (using E = mc^2)
• Of course, this is less than the most sensitive scale can measure (it can measure down to 10^-9 grams)
• Of course, this weight difference is overwhelmed by battery discharge, weight from getting warm, ...
• Source: ( Times, Oct 24, 2011)
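The arithmetic can be checked directly; the constants below are the slide's stated assumptions (4 GB of flash, half the bits high, ~10^-15 J per bit). The result lands at a fraction of an attogram, the same order of magnitude as the slide's figure.

```c
/* Back-of-envelope check of the Kindle mass claim: m = E / c^2. */

double kindle_extra_mass_grams (void) {
  const double bits           = 4.0 * 1024 * 1024 * 1024 * 8;  /* 4 GB */
  const double high_bits      = bits / 2.0;       /* half in high state */
  const double joules_per_bit = 1e-15;            /* energy above erased */
  const double c              = 2.998e8;          /* speed of light, m/s */

  double energy_j = high_bits * joules_per_bit;   /* ~1.7e-5 J total */
  double mass_kg  = energy_j / (c * c);           /* E = m c^2 */
  return mass_kg * 1000.0;                        /* kg -> grams */
}
```

That is roughly 2×10^-19 grams, some ten orders of magnitude below what the most sensitive scale can resolve.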
