Operating Systems Lecture 10a
Dr Ronald Grau School of Engineering and Informatics Spring term 2020
Previously 1 File systems
File systems implementation Kernel data structures
Block allocation
Today 2 File System Management
File system formats Data consistency Performance
Virtual file systems
Recap: Questions 3 Recap questions
1. What does the mount operation do?
2. What information does the kernel maintain for files used by processes?
3. What is reference counting in resource management?
4. What is the unit of storage allocation for files?
5. What is a file allocation table?
6. What is an i-node?
7. What is the difference between a hard and a soft link?
Recap: File abstraction 4 UNIX i-nodes
Recap: File abstraction 5
UNIX i-nodes
Accessing /usr/ast/mbox
Recap: File system implementation 6 Logical file system
Provides file interface (e.g., system calls) File organisation module
Manages block allocation to files E.g. i-nodes in UNIX
Basic file system
Block-based storage format
E.g. FAT-32, NTFS, ext4 I/O device drivers
Physical access to storage media E.g. HD, DVD, USB
Storage devices 7 E.g., a hard disk
Low-level formatting: info for disk controller,
e.g. tracks, sectors, ECCs for fixing soft errors, bad blocks
Partition Tables 8 MBR (Master Boot Record)
Cylinder-Head-Sector
(CHS) addressing: 512MB limit
Maximum 15 partitions (with logical partitions)
MBR-partition size limited to 2TB
Logical Block Addressing (LBA): Today 48 or 64bit
Partition Tables 9
GPT (Globally unique identifiers PT)
Supported by all modern operating systems
File System Formats 10
Raw format
Just blocks, no file system
E.g. Linux swap space, some databases
High-level formatting:
Initialise volume with
file system data structures
File System Examples: FAT 11 FAT-12 (MS-DOS) directory entry
12 bit block number, 512B blocks size → partition size 2MB
FAT-16: 16 bit cluster numbers, 32KB cluster size → 2GB
FAT-32: 28 bit cluster numbers, 64KB cluster size → 16TB
but 231 = 2GB file size (systems that support large files: 232 = 4GB) No file permissions
File System Examples: ISO 9660 12 ISO 9660 (CD-ROM) directory entry
2KB blocks
Contiguous allocation (mostly)
Limitations:
Length of file names, file attributes, directory nesting, . . . Rock Ridge (UNIX) and Joliet (Windows) extensions
How to choose the block size? 13
Effects of different block size 14
Mean file size In 1984: 1.1KB In 2005: 2.4KB in 2017: ?
Internal fragmentation
4KB blocks: 10% largest files occupy 93% of utilised blocks 8KB blocks: 10% largest files occupy 90% of utilised blocks
Disk access time
Rotation speed
Seek time to position head
Small block size + non-contiguous allocation = unreasonably long access times → „defragmentation”
Data Storage Consistency 15
File system operations involve many writes (e.g. appending data in a file)
1. Update free blocks list to get a new block
2. Add pointer to that block and Increase size of file in i-node
3. Write data into block
System crash → can lead to inconsistent data on storage device
How to detect inconsistencies:
E.g. UNIX fsck
Mark problems and try to recover
Drawback: time-consuming; data loss when recovery fails
Approaches: Log-Structured File Systems, RAID
Data Storage Consistency 16
Reusing ideas from database systems Transactions
Log: Store changes on hard disk
Commit them in the background
On a crash, inconsistencies are limited to interrupted transactions
Some operations appear to be much faster Also known as “journaling”
Examples: NTFS, ext4, . . .
Data Storage Consistency
17
Journaling FS example: ext4
48 bit block numbers, 4KB block size → ? 32 bit i-node numbers; 64 bit for file size 128MB journal
Avoids fragmentation:
• Pre-allocation support
• Disk partitioned into block groups
Data Storage Consistency 18 Journaling FS example: ext4
Journaling Block Device (JBD): generic journaling layer Journal entries:
Descriptor
(contains destination block number)
Data (data / metadata)
Commit
Revocation
Journaling modes:
journal / ordered / writeback
If interrupted, transaction is replayed
Data Storage Consistency
19
Redundant Arrays of Independent Disks (RAID)
Objectives
Increase reliability by redundant data storage Increase performance by parallel data access
Mirroring (RAID 1)
Assume disks fail independently
Example: MTBF 100000h, MTTR 10h
→ mean time to data loss 1000002=(2 * 10) ≈ 57000
Error correcting codes
RAID Levels (see also textbook)
Data striping (RAID 0)
Distribute data across n disks
Write every (k mod n)th chunk of m bits onto disk k → n times faster
Performance Optimisation 20
Block Caching
Typical policies: Clock, LRU, etc
Group blocks (superblock, i-nodes, data, etc)
Consistency: write dirty blocks as soon as possible, (e.g. when block full, periodically)
Write-Through Cache
Data written immediately
Requires more disk I/O
Data loss more unlikely for removable disks
Block Read Ahead
Speculatively read blocks ahead of time, e.g. when file is read sequentially
Performance Optimisation 21
Hard disks: reduce seek time Clustering, block groups, . . .
Disk scheduling (next week)
SSD issues: Relocate blocks that are often written to even out wear
Virtual File Systems 22
Virtual File Systems 23
Summary
24
File abstraction
Unified interface to various devices
System calls:
open(), read(), write(), close(), . . .
File System Management
File system formats and block size
Kernel data structures for file management Consistency: Journaling, RAID, …
Optimisations: caches, clusters, …
Virtual file systems
Directory structure and Block allocation Contiguous allocation
Linked list allocation, e.g. FAT
Indexed allocation, e.g. i-nodes
Read 25 Tanenbaum & Bos., Modern Operating Systems
Chapter 4
Silberschatz et al., Operating System Concepts Chapters 10, 11, 12
Next Lecture
26
Introduction
Operating System Architectures Processes
Threads – Programming
Process Scheduling – Evaluation Process Synchronisation
Deadlocks
Memory Management File Systems
Input / Output
Security