CS计算机代考程序代写 data structure file system gui cache File Systems

File Systems

File Systems

DSCI 551

Wensheng Wu

1

Roadmap

• Files and directories

– CRUD operations via system calls

• Implementing CRUD

– Data structures, e.g., organization of blocks

– Access methods: turn system calls into operations
on data structures

2

Files and directories

• File content stored in blocks on storage device

– Has user defined name: hello.txt

– & low-level name, e.g., inode number: 410689

• Files are organized into directories (folders)

– each may have a list of files and/or subdirectories

– That is, directories can be nested

3

Example

4

Root directory

Empty directory

Operations on files

• Create
– open(), write()

• Read
– open(), read(), lseek()

• Update
– write(), lseek()

• Delete
– unlink()

5

System calls:
calls to functions in the API

provided by OS

Create

• User interface via GUI or touch command in Linux

• Implementation, e.g., via a C program with a
system call: open()

• int fd = open(“foo”, O_CREAT | O_WRONLY |
O_TRUNC);
– Open with flags indicating the specifics

– O_CREAT: create a file

– O_WRONLY: write only

– O_TRUNC: remove existing contents if exits

6

|: Bitwise OR operator

File descriptor

• Note open() returns a file descriptor

– Typically an integer

– Reserved fds: stdin 0, stdout, 1, stderr 2

7

Read

• read(fd, buffer, size)

– Read from file “fd” number of bytes

– And store them in buffer

• Read starts from the current offset of fd

– Initially 0

8

Write

• write(fd, buffer, size)

– Write to file fd number of bytes stored in
buffer

– Also start writing from the current offset

9

Random read and write

• off_t lseek(int fd, off_t offset, int whence)
– If whence is SEEK_SET, the offset is set to

bytes from the beginning of file

– If whence is SEEK_CUR, the offset is set to its
current location plus bytes

– If whence is SEEK_END, the offset is set to the size
of the file plus bytes (typically offset is
negative, e.g., -8 for 8 bytes from the end)

• whence: from where

10

11

Copy a file

“0” starts an octal number
=> permissions:

110 (owner) rw-
100 (group) r–
100 (others) r–

Pointer to a character array

File permission mode

12
rw-r–r–
=> 110 (owner permission) 100 (group) 100 (others)

Resources for system calls

• https://en.wikipedia.org/wiki/System_call
• open:

https://en.wikipedia.org/wiki/Open_(system_call
)

• read:
https://en.wikipedia.org/wiki/Read_(system_call)

• write:
https://en.wikipedia.org/wiki/Write_(system_call
)

• close:
https://en.wikipedia.org/wiki/Close_(system_call)

13

https://en.wikipedia.org/wiki/System_call
https://en.wikipedia.org/wiki/Open_(system_call)
https://en.wikipedia.org/wiki/Read_(system_call)
https://en.wikipedia.org/wiki/Write_(system_call)
https://en.wikipedia.org/wiki/Close_(system_call)

Resources for system calls

• man –S 2 read

– Find it in the Section 2 of the manual

14

Install gcc on EC2

• sudo yum install gcc

• Usage:

– gcc -o copy2 copy2.c

15

File and directory

• When creating a file

– Bookkeeping data structure (inode) created:
recording size of file, location of its blocks, etc.

– Linking a human-readable name to the file

– Putting the link in a directory

16

Info about file (stored in inode)
struct stat {

dev_t st_dev; /* ID of device containing file */
ino_t st_ino; /* inode number */
mode_t st_mode; /* protection */
nlink_t st_nlink; /* number of (hard) links */
uid_t st_uid; /* user ID of owner */
gid_t st_gid; /* group ID of owner */
dev_t st_rdev; /* device ID (if special device file, e.g., /etc/tty) */
off_t st_size; /* total size, in bytes */
blksize_t st_blksize; /* blocksize for filesystem I/O */
blkcnt_t st_blocks; /* number of blocks allocated */
time_t st_atime; /* last time file content was accessed */
time_t st_mtime; /* last time file content was modified */
time_t st_ctime; /* last time inode was changed */

};

Execute “man -S 2 stat” for more details…

17

inode

• Stores metadata/attributes about the file

• Also stores locations of blocks holding the
content of the file

18

Example

• a.txt

abc def

abc def

abc def

19

Block size

# of blocks allocated

Device id

Inode #

User id Group id

# of (hard) links

Access permission

noatime

20

Hard links

21

Symbolic links

22

Working with directories

• Create: mkdir() system call

– Used to implement command, e.g., mkdir xyz

• Read: opendir(), readdir(), closedir()

– ls xyz

• Delete: rmdir()

23

Roadmap

• Files and directories

– CRUD operations

• Implementation

– Data structures: how to organize the blocks

– Access methods: turn system calls into operations
on data structures

24

Organization of blocks

• Array-based

– Disk consists of a list of blocks

– We will assume this

• Tree-based, e.g., SGI XFS

– Blocks are organized into variable-length extents

– Use B+-tree to quickly find free extents

25

Blocks

• Consider a disk with 64 blocks

– 4KB/block

– 512B/sector (we assume this in this lecture)

• So there are 212/29 = 23 = 8 sectors/block

– Capacity of disk = 64 * 4KB = 256KB

26

Data region

• 56 blocks (#8-63)

– used to store data/content of files

– but see later: some blocks may store pointers

27

Metadata

• For each file, file system records its metadata

– Information in the “stat” struct

– Location of blocks that stores the content of file

28

Metadata of files stored in inodes

• Index nodes

• Stored in blocks #3 — #7 (i.e., 5 blocks)

• Together they are called the “inode” table

29

How many inodes are there?

• 256 bytes/inode

• 5 blocks, 4KB/block

=> 16 inodes/block (4K/256 = 212/28 )

=> 5 blocks, 5 * 16 = 80 inodes

=> File system can store at most 80 files

30

Free space management using bitmaps

• Bitmap: a vector of bits
– 0 for free (inode/block), 1 for in-use

• Inode bitmap (imap)
– keep track of which inodes in the inode table are

available

• Data bitmap (dmap)
– Keep track of which blocks in data region are

available

31

Bitmaps

• Each bitmap is stored in a block

– Block “i”: keep track of 80 inodes (could track 32K)

– Block “d”: keep track of the 56 data blocks

32

Superblock

• Track where i/d blocks and inode table are

– E.g., inode table starts at block 3; there are 80
inodes and 56 data blocks, etc.

• Indicate type of FS & inumber of its root dir

• Will be read first when file system is mounted

33

inumber

• Each inode is identified by a number

– Low-level number of file name

• Can figure out location of inode from inumber

34

inumber => location

• inumber = 32

=> address: offset in bytes from the beginning

=> which sector?

35

inumber => location of inode

• Address: 12K + 32 * 256 = 20K

• Sector #: 20K/512 = 40

– more generally

– ہ
ۂ

(inodeStartAddress + inumber ∗ inode size)/sector
size

36

inode => location of data blocks

• A number of direct pointers
– E.g., 8 pointers, each points to a data block

– Enough for 8*4K = 32K size of file

• Also has a slot for indirect pointer
– Pointing to a data block storing direct pointers

– Assume 4 bytes for block address (e.g.,
represented in CHS), so 1024 pointers/block

– Now file can have (8 + 1024) blocks or 4,128KB

37

Multi-level index

• Pointers may be organized into multiple levels
– Indirect pointer (as in previous slide)

• Inode (pointer1, pointer2, …, indirect pointer)
• Indirect pointer -> a block of direct pointers

– Double indirect pointers
• Inode (pointer1, pointer2, …, indirect pointer)
• Indirect pointer -> a block of indirect pointers instead

-> each points to a block of direct pointers

– Triple indirect pointers
• Indirect pointer -> a block of indirect pointers

-> each points to a block of indirect pointers
-> each points to a block of direct pointers

38

Double Indirect Pointers

39

inode

Block of indirect pointers

Indirect
pointer

Indirect
pointer

Block of direct pointers
direct pointers

Advantages of multi-level index

• Grow to more levels as needed

• Direct pointers handle most of the cases

– Many files are small

40

Directory organization

• Directory itself stored as a file

• For each file in the directory, it stores:

– name, inumber, record length, string length

41
Actual length

Record length vs string length

• String length = # of characters in file name + 1
(for \0: end of string)

• Record length >= string length

– Due to entry reuse

42

Reusing directory entries

• If file is deleted (using rm command) or a
name is unlinked (using unlink command)

– File is finally deleted when its last (hard) link is
removed

• Then inumber in its directory entry set to 0
(reserved for empty entry)

– So we know it can be reused

43

Storing a directory

• Also as a file with its own inode + data block

• inode:

– file type: directory (instead of regular file)

– pointer to block(s) in data region storing directory
entries

44

Roadmap

• Files and directories

– CRUD operations

• Implementation

– Data structures: how to organize blocks, e.g., into
array/tree

– Access methods: turn system calls to operations
on data structures

45

Open for read

• fd = open(“/foo/bar”, O_RDONLY)

46

Open for read

• fd = open(“/foo/bar”, O_RDONLY)

– Need to locate inode of the file “/foo/bar”

– Assume inumber of root, say 2, is known (e.g.,
when the file system is mounted)

47

Open for read

1. Read inode and content of / (2 reads)

– Look for “foo” in / -> foo’s inumber

2. Read inode and content of /foo (2 reads)

– Look for “bar” in /foo -> bar’s inumber

3. Read inode of /foo/bar (1 read)

– Permission check + allocate file descriptor

48

Cost of open()

• Need 5 reads of inode/data block

49

File-open table per process

File descriptor File name Inumber Position offset …

3 /foo/bar 32382 0

4 /foo/more 48482 512 …

50

Reading the file

• read(fd, buffer, size)

– Note fd is maintained in per-process open-file
table

– Table translates fd -> inumber of file

51

Reading the file

• read(fd, buffer, size)

1. Consult bar’s inode to locate a block

2. Read the block

3. Update inode with newest file access time

4. Update open-file table with new offset

5. Repeat above steps until done (with reading
data of given size)

52

Cost for reading a block
• 3 I/O’s:

– read inode, read data block, write inode

53

2

Open for write

• int fd = open(“/foo/bar”, O_WRONLY)

– Or int fd = create((“/foo/bar”)

– Assume bar is a new file under foo

– (note the difference from reading chapter!)

54

Open for write

• int fd = open(“/foo/bar”, O_WRONLY)

1. Read ‘/’ inode & content

– obtain foo’s inumber

2. Read ‘/foo’ inode & content

– check if bar exists

55

Open for write

3. Read imap, to find a free inode for bar

4. Update imap, setting 1 for allocated inode

5. Write bar’s inode

56

Open for write

6. Update foo’s content block

– Adding an entry for bar

7. Update foo’s inode

– Update its modification time

57

Cost for “open for write”

• int fd = open(“/foo/bar”, O_WRONLY)

• Need 9 I/O’s

58

data

bitmap

inode

bitmap

root

inode

foo

inode

bar

inode

root

data

foo

data

bar

data[0]

bar

data[1]

bar

data[2]

read

read

read

read

read

write

write

write

write

create()

Writing the file: /foo/bar

1. Read inode of bar (by first looking up its
inumber in the file-open table)

2. Allocate new data block

– Read and write bmap

3. Write to data block of bar

4. Update bar inode

– new modification time, add pointer to block

59

Cost of writing /foo/bar

• 5 I/O’s for write a block

60

data

bitmap

inode

bitmap

root

inode

foo

inode

bar

inode

root

data

foo

data

bar

data[0]

bar

data[1]

bar

data[2]

read

read

read

read

read

write

write

write

write

read

read

write

write

write

create()

write()

Caching for read

• First read may be slow

– But subsequent ones will speed up

• Good idea to cache popular blocks

– e.g., determined via LRU strategy

61

Buffering for delayed write

• Improve write performance via:

– Batching (e.g., two updates to the same imap)

– Scheduling (reordering for better performance)

– Avoiding writes (if file created, then quickly
deleted)

• Problem: update may be lost when system
crashes

62

Example file systems

• NTFS
– New technology file system, Microsoft proprietary

• FAT
– File allocation table
– FAT 16, 32, …
– 32 bits = # of sectors a file can occupy

512B/sector => 2TB limit on file size
4KB/sector => 16TB limit

• Ext4
– fourth extended file system, common in Linux

63