CSC 369 Operating Systems
File Systems – Data Layout and Organization
University of Toronto Mississauga, Department of Mathematical and Computational Sciences
Copyright By PowCoder代写 加微信 powcoder
• So far…
• File systems – Interface, structure, basic implementation
• Next up…
• Data organization and file system metadata
• UNIX inode structure
• Directories, hard links, soft links • Path name translation
University of Toronto Mississauga, Department of Mathematical and Computational Sciences 2
Review of File Systems
• Manyfilesystemimplementations,literallyfromAFStoZFS • Checkouthttps://en.wikipedia.org/wiki/List_of_file_systems
• Mostwell-known:
• Windows:FAT32,NTFS
• MACOSX:HFS+
• BSD,Solaris:UFS,ZFS
• Linux: ext2 (see A4 too!), ext3, ext4, ReiserFS, XFS, JFS, btrfs, zfs, etc.
• We’llhavealookataverysimplefilesystem(VSFS) • See readings as well!
University of Toronto Mississauga, Department of Mathematical and Computational Sciences 3
Remember: inode-based FS organization
Triple Indirect Block
mode nlinks size etc.
Data Blocks Indirect Blocks
Double Indirect Blocks
University of Toronto Mississauga, Department of Mathematical and Computational Sciences
The main idea
• Weneedtocreateafilesystemforanunformatteddisk
• Weneedtocreatesomestructureinit,sothatthings(data) will be easy to find and organize
• Key questions
• Where do we store file data and metadata structures (inodes)? • How do we keep track of data allocations?
• How do we locate file data and metadata?
• What are the limitations (max file size, etc.)?
University of Toronto Mississauga, Department of Mathematical and Computational Sciences 5
Bad programmers worry about the code.
Good programmers worry about data structures and their relationships.
— Linus Torvalds
University of Toronto Mississauga, Department of Mathematical and Computational Sciences
Implementation of a
Very Simple File System (VSFS)
University of Toronto Mississauga, Department of Mathematical and Computational Sciences
An unformatted raw disk
Total size = 256KB
University of Toronto Mississauga, Department of Mathematical and Computational Sciences
Overall Organization
The whole disk is divided into fixed-sized blocks. Block size: 4KB
Number of blocks: 64
Total size: 256KB
University of Toronto Mississauga, Department of Mathematical and Computational Sciences
Data Region
Most of the disk should be used to actually store user data, while leaving a little space for storing other things like metadata. In VSFS, we reserve the last 56 blocks as data region.
0 D D D D D D D D 15 16 D D D D D D D D D D D D D D D D 31 32 D D D D D D D D D D D D D D D D 47 48 D D D D D D D D D D D D D D D D 63
University of Toronto Mississauga, Department of Mathematical and Computational Sciences
Metadata: inode table
FS needs to track information about each file.
In VSFS, we keep the info of each file in a struct called inode. And we use 5 blocks for storing all the inodes.
ext2 inode with size of 128B
0 I I I I I D D D D D D D D 15 16 D D D D D D D D D D D D D D D D 31 32 D D D D D D D D D D D D D D D D 47 48 D D D D D D D D D D D D D D D D 63
How many files can this VSFS hold at most?
Maximum number of inodes it can hold: 5 * 4KB / 128B = 160 => can store at
most 160 files.
University of Toronto Mississauga, Department of Mathematical and Computational Sciences
Allocation Structures
• Keep track of which blocks are being used and which ones are free
• We use a data structure (a bitmap) for this purpose
• Each bit indicates if one block is free (0) or in‐use (1)
• A bitmap for the data region and a bitmap for the inode region
• Reserve one block for each bitmap.
0 IB DB I I I I I D D D D D D D D 15
16 D D D D D D D D D D D D D D D D
32 D D D D D D D D D D D D D D D D 47 48 D D D D D D D D D D D D D D D D 63
How many blocks can a 4KB data bitmap keep track of?
4KB = 32K bits, can keep track of 32K blocks, so yeah it’s overkill in this VSFS.
University of Toronto Mississauga, Department of Mathematical and Computational Sciences
Superblock contains information about this particular file system:
• what type of file system it is (“VSFS” indicated by a magic number) • how many inodes and data blocks are there (160 and 56)
• where the inode table begins (block 3), etc.
Superblock
0 S IB DB I I I I I D D D D D D D D 15
DDDDDDDDDDDDDDDD
32 D D D D D D D D D D D D D D D D 47 48 D D D D D D D D D D D D D D D D 63
When mounting a file system, the OS first reads the superblock, identifies its type and other parameters, then attaches the volume to the file system tree with proper settings.
University of Toronto Mississauga, Department of Mathematical and Computational Sciences
Formatting disk into VSFS, done!
0 S IB DB I I I I I D D D D D D D D 15 16 D D D D D D D D D D D D D D D D 31
32 D D D D D D D D D D D D D D D D 48 D D D D D D D D D D D D D D D D
University of Toronto Mississauga, Department of Mathematical and Computational Sciences
Example: Read a file with inode number 32
• Fromthesuperblock,weknow
• inodetablebeginsatBlock3,i.e.,12KB • inodesizeis128B
• Calculatetheaddressofinode32
• 12KB+32*128B=16K
inode 32 is here
0 S IB DB I I I I I D D D D D D D D 15 16 D D D D D D D D D D D D D D D D 31
32 D D D D D D D D D D D D D D D D 48 D D D D D D D D D D D D D D D D
So we have the inode, but which blocks have the data? University of Toronto Mississauga, Department of Mathematical and Computational Sciences
From inode to data
• Say the inode contains an array of 15 direct pointers that point to 15 data blocks that belong to the file.
• Maximum file size supported: • 15*4KB=60KB
• If we need a file larger than 60KB, we need to do something more sophisticated.
ext2 inode with size of 128B
University of Toronto Mississauga, Department of Mathematical and Computational Sciences
Direct pointers to disk blocks do not support large files Idea: indirect pointer
How big a file can we support now?
Multi-Level Index with Indirect Pointers
• Instead of pointing to a block of user data, it points to a block that contains more pointers
• From the 15 pointers we have in an inode, use the first 14 as direct pointers and 15th as an indirect pointer
• 14 direct pointers in total: 14 data blocks
• Indirect pointer points to a block (4KB) which can hold 1K pointers =>
1K data blocks in addition
• Total size supported: 4K * (14 + 1K) = 4152KB
What if I want even BIGGER?
University of Toronto Mississauga, Department of Mathematical and Computational Sciences
A double indirect pointer points to a block full of indirect pointers which point to blocks of direct pointers.
How big a file do we support now?
Double Indirect Pointer!
• E.g., for the 15 pointers, we use the first 13 as direct pointers, the 14th as an indirect pointer, and the 15th as a double indirect pointer.
• From direct pointers: 13 data blocks
• From indirect pointers: 1024 data blocks
• From double indirect pointers: 1024 * 1024 = 1M data blocks
• Total size = 4K * (13 + 1024 + 1024*1024) ≈ 4.004GB
Still not big enough? Use a Triple Indirect Pointer
University of Toronto Mississauga, Department of Mathematical and Computational Sciences
A tree-view of multi-level indirect pointers
Triple Indirect Block
mode nlinks size etc.
Data Blocks Indirect Blocks
Double Indirect Blocks
University of Toronto Mississauga, Department of Mathematical and Computational Sciences
A tree-view of multi-level indirect pointers
• Youmightwonder:thistreeissuperimbalanced.
• Frequentlyaccessbigfiles=>theblockswiththedouble
and/or triple indirect pointers are going to be accessed like crazy (because most of the data blocks are in their subtrees).
• True,butit’sfine:inpractice,mostfilesaresmall.
• For more practice-inspired decisions, read “A Five-Year
Study of File System Metadata” (see course web page)
University of Toronto Mississauga, Department of Mathematical and Computational Sciences
• Data structure representing a FS object (file, dir, etc.)
• Attributes,diskblocklocations
• Nofilename,justmetadata!
• Directory
• List of (name, inode) mappings
• Eachdirectoryentry:afile,otherdirectory,link,itself(.),parentdir(..),etc.
University of Toronto Mississauga, Department of Mathematical and Computational Sciences 21
Inodes and directories: Examples
• Data structure representing a FS object (file, dir, etc.)
• Attributes,diskblocklocations
• Nofilename,justmetadata!
• Directory (not the same as an inode!)
• It is a file with its own inode. This file contains a list of (name, inode) mappings (aka a directory entry)
• Adirentryrepresents:afile,asubdirectory,link,itself(.),parentdir(..),etc • Use stat or ls -if to check out information!
• Onyourmachine,infoaboutyourFS:
Don’t be afraid to explore! It’s fun!
• sudo dumpe2fs –h /dev/sda1
University of Toronto Mississauga, Department of Mathematical and Computational Sciences 22
• Hard links
Links: Examples
• Multiple file names (and directory entries) mapped to the same inode
• Reference count – only remove file when it reaches 0
• Soft (Symbolic) links
• “Pointer” to a given file • Contains the path
University of Toronto Mississauga, Department of Mathematical and Computational Sciences 23
The content of a data block
• If it belongs to a regular file • Data of the file
• If it belongs to a directory
• List of directory entries: (name, inode number) pairs, which are the entries under the directory
• If it belongs to a symbolic link
• The path of the file that it links to
University of Toronto Mississauga, Department of Mathematical and Computational Sciences 24
Path Name Translation
• Let’s say you want to open “/user/homer/doh.txt”
• Whatdoesthefilesystemdo?
• Opendirectory“/”(theroot,wellknown,canalwaysfind)
• Search for the entry “user”, get location of “user” (in directory entry)
• Opendirectory“user”,searchfor“homer”,getlocationof“homer”
• Opendirectory“homer”,searchfor“doh.txt”,getlocationof“doh.txt”
• Openfile“doh.txt”
• Systems spend a lot of time walking directory paths • Thisiswhyopenisseparatefromread/write
• OS will cache prefix lookups for performance
• /a/b, /a/bb, /a/bbb, etc., all share “/a” prefix
University of Toronto Mississauga, Department of Mathematical and Computational Sciences 25
Original Unix File System
• Basically,theFSseesstorageaslineararrayofblocks
• Each block has a block number (its index, for quick access at the right offset on disk)
Superblock
Bitmaps Inodes Data Blocks
• Unfortunatelyperformanceisaffectedbythislayout (think what happens when traversing a path)
University of Toronto Mississauga, Department of Mathematical and Computational Sciences 26
Improvements: Block Groups
• Disk partitioned into block groups (aka cylinder groups)
• Try to keep “related” files and directories close together
• Example: keep files from the same directory in the same block, keep file inode and its data blocks in the same block group, etc.
• ext2, ext3, etc. adopted this improvement
Superblock
Block Group
Block group organization
• To be continued in the lab – ext2 details, exercise to print some FS data structures
University of Toronto Mississauga, Department of Mathematical and Computational Sciences 27
FS Performance optimizations Diving deeper into file systems
Warning: lots more interesting stuff coming! 🙂
University of Toronto Mississauga, Department of Mathematical and Computational Sciences 28
Announcements
• Threelabexercisestohelpyoubetterunderstandext2
• Lab exercises 8 (due this week), 9 + 10 (due next week)
• Can work in pairs, but must be with your assignment partner (some code may be similar)
• Goal: print information from an ext2 image, to get used to the data structures, bitmaps, directory entries, etc.
• Start early, don’t wait until the deadline for the exercises • If you do, you won’t finish A4 on time!
• Attend the lab – we are helping you get started on this!
• Come prepared though!
University of Toronto Mississauga, Department of Mathematical and Computational Sciences 29
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com