Files
CSci4061: Introduction to Operating Systems

October 7, 2021
Computer Science & Engineering, University of Minnesota

Announcements
• Project 1 due today
• Project 2 to be released
• Part 1: Implementing useful file system–related commands
• Part 2: Implementing a Unix shell
• Support file redirection and pipes (covered in today’s lecture)
• Due: 10/21 (tight)
• Harder than project 1, so start early!
• First in-class midterm: 10/19

Last lecture
• I/O on streams
• Abstraction facilitating programming
• Three standard streams: what are they, and how are they used?
• Open/close non-standard streams
• Common classes of streams
• Character-based I/O
• String-based I/O (be careful about formatting)
• Binary I/O
• Random I/O

Unix Files

Streams I/O vs. Files I/O
• Last lecture: High-level I/O (streams)
• Provide abstraction operations over files
• Different kinds of streams
• Today: Low-level I/O
• System call interfaces for file I/O operations
• High-level I/O calls low-level interfaces
• You can also call the low-level interfaces
• While streams provide fancy features like formatting, files may provide unique operations per file type

Files
• All devices are abstracted as files through device drivers
• Uniform interface for file I/O:
• A small set of system calls (open, close, read, write) with some additional ones
• Underlying actual implementations of the system calls vary
• Memory-mapped I/O, direct memory access, interrupts, etc.
• Benefits:
• Hide the complexity of actual device properties
• Programmers need to learn only a few system calls

File Types
• Files may be of different types, depending on:
• Type of device (e.g., disk vs. network)
• Type of I/O (sequential vs. random-access)
• Some additional operations may be exposed to the user based on file type
• E.g.: creating a regular file on disk vs. creating a network connection

Unix File Types
• Regular files
• Collection of data stored on disk
• Device files: Devices mapped onto files
• Block or character
• Pipes and FIFO
• Special files used for data-sharing across processes
• Directories
• Collection of files
• Symbolic Links
• Pointer to another file
• Sockets
• Files used for network communication

File I/O Operations
• Open a file
• Read/write to/from the file
• Move around read/write pointer (offset) in the file
• Close the file when done
• Additional operations for different file types
• E.g.: make, change directories
• E.g.: set up, accept network connections

File I/O Operations: System Calls
• Open a file: open
• Read/write to/from the file:
• read
• write
• Move around pointer in the file if required
• lseek (for regular/block device files)
• Close the file when done: close
• Additional system calls for different file types
• E.g.: mkdir, chdir for directories
• E.g.: connect, accept for sockets
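The five core calls listed above can be seen working together in one short round trip. This is a minimal sketch, not course-provided code: the helper name write_then_read_back and its parameters are made up for illustration, and error handling is kept to the bare minimum.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes msg to path, seeks back to the start,
 * reads the data back into out (capacity cap), and closes the file.
 * Returns the number of bytes read back, or -1 on any error. */
ssize_t write_then_read_back(const char *path, const char *msg,
                             char *out, size_t cap)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);  /* open */
    if (fd < 0)
        return -1;

    size_t len = strlen(msg);
    if (write(fd, msg, len) != (ssize_t)len) {               /* write */
        close(fd);
        return -1;
    }

    /* After the write the offset sits at the end; rewind before reading. */
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1) {               /* lseek */
        close(fd);
        return -1;
    }

    ssize_t n = read(fd, out, cap);                          /* read */

    if (close(fd) < 0)                                       /* close */
        return -1;
    return n;
}
```

Calling write_then_read_back("demo.txt", "hello", buf, sizeof buf) should hand back the 5 bytes just written. Without the lseek call, read would start at the end of the file and return 0.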

Opening a File: open
int open(const char *pathname, int flags, …);
• Opens a file or creates a new file
• Returns a file descriptor, or -1 upon error
• Process-specific handle to the file
• Used by process to identify the file
• Parameters
• pathname: name of file to be opened
• flags: Mode of opening the file
• O_RDONLY, O_WRONLY, O_RDWR, O_CREAT, …
• Additional flags: Combine using OR operation (|)
// Example (with O_CREAT, a third argument gives the permission bits of a newly created file)
int fd = open("test.txt", O_WRONLY | O_CREAT, 0644);
// Q: what if "test.txt" does not exist?
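The question on this slide can be explored with two small wrappers around open. This is only a sketch under stated assumptions: open_for_writing and open_existing are hypothetical names, and 0644 is just one reasonable permission choice (the process umask is still applied to it).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: opens path for writing, creating it if needed.
 * Returns the file descriptor, or -1 after printing the reason. */
int open_for_writing(const char *path)
{
    /* 0644 requests rw-r--r--; it is only consulted when O_CREAT
     * actually creates the file. */
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd == -1)
        perror("open");
    return fd;
}

/* Same call without O_CREAT: fails with errno set to ENOENT
 * if the file does not exist. */
int open_existing(const char *path)
{
    return open(path, O_WRONLY);
}
```

With O_CREAT a missing file is created and a valid descriptor comes back; without it, open returns -1 and errno explains why.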

Reading from a File: read
ssize_t read(int fd, void *buf, size_t count);
• Reads data from the current offset in the file
• Parameters
• fd: file descriptor of file
• buf: buffer into which data is to be read
• Should have been allocated
• DO NOT pass NULL or unallocated buffer
• count: number of bytes to read
• read is typically a blocking system call

Return value of read
ssize_t read(int fd, void *buf, size_t nbytes);
• Number of actual bytes read
• Could be less than number of bytes requested
• End-of-file reached
• Character devices: line-by-line reading
• Sockets: Network buffering
• -1: Error; errno is set
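Because read may return fewer bytes than requested, callers that need an exact amount usually loop. The following is a sketch of that pattern, not code from the course: read_fully is a hypothetical name, and retrying on EINTR (a read interrupted by a signal) is an extra assumption beyond what the slide covers.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keeps calling read until nbytes have arrived,
 * EOF is reached, or an error occurs.
 * Returns the total number of bytes read (possibly < nbytes at EOF),
 * or -1 on error. */
ssize_t read_fully(int fd, void *buf, size_t nbytes)
{
    char *p = buf;
    size_t total = 0;

    while (total < nbytes) {
        ssize_t n = read(fd, p + total, nbytes - total);
        if (n < 0) {
            if (errno == EINTR)   /* interrupted by a signal: retry */
                continue;
            return -1;            /* real error; errno is set */
        }
        if (n == 0)               /* EOF: nothing more will arrive */
            break;
        total += (size_t)n;       /* short read: loop for the rest */
    }
    return (ssize_t)total;
}
```

The return value is kept in a signed ssize_t so that the -1 error case stays distinguishable from a valid byte count.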

Reading from a File: Example
char *buf = malloc(NBYTES * sizeof(char));
size_t nbytes = NBYTES;
ssize_t bytes_read; /* signed, so that the -1 error return can be detected */
bytes_read = read(fd, buf, nbytes);
if (bytes_read < 0) /* Error */
    handle_error();
else if (bytes_read == 0) /* EOF */
    handle_eof();
else if ((size_t)bytes_read < nbytes) /* Try to read remaining bytes */
    bytes_read = read(fd, buf + bytes_read, nbytes - bytes_read);

Writing to a File: write
ssize_t write(int fd, const void *buf, size_t nbytes);
• Writes data to the current offset in the file
• Parameters
• fd: file descriptor of file
• buf: buffer from which data is to be written
• nbytes: number of bytes to write
• Return value
• The number of bytes written: success
• -1: error; the global errno is set
• A smaller value than nbytes: unlikely but possible (e.g., disk full)

Closing a File: close
int close(int fd);
• Closes an open file
• Releases resources associated with the file: file descriptors, other kernel data if last close of file
• Deletes file if marked for deletion and last close of file
• Parameter: file descriptor to close
• Returns:
• 0 if successful
• -1 if error; errno is set

Setting Offset in a File: lseek
off_t lseek(int fd, off_t offset, int whence);
• Current file offset:
• Position in file where next byte would be read from or written to
• 0 when file is opened; in append mode (O_APPEND), the offset moves to end-of-file before each write
• Returns:
• New file offset
• -1 if error

Setting Offset in a File: lseek
off_t lseek(int fd, off_t offset, int whence);
• Parameters:
• fd: file descriptor
• offset: number of bytes to skip
• It is based on whence
• whence: where to skip from
• SEEK_SET: from beginning of file
• SEEK_CUR: from current position
• SEEK_END: from end of file

Quiz 1: Using lseek
1. How to set offset to start?
2. How to get the current offset?
3. How to get the offset of the last byte?
4. How to get the file size?
What would be the parameters?
Answer:
of1 = lseek(fd, 0, SEEK_SET); // Start of file
of2 = lseek(fd, 0, SEEK_CUR); // Current offset
of3 = lseek(fd, -1, SEEK_END); // Last byte of file
size = lseek(fd, 0, SEEK_END); // Next byte after the end of the file

Buffering
• All reads/writes eventually go to the disk or I/O device
• What if we write 1 byte at a time?
• Will have to go to the disk for every byte
• Highly inefficient
• Solution: Buffer a certain number of bytes before reading/writing them
• Can be done in kernel space and/or in user space (e.g., through a library)

Buffering Policies: When to write data to the actual file
• Depends on device and file type
• Fully Buffered
• Data read/written only when whole buffer is filled
• E.g.: Disk files (buffered in chunks of blocks)
• Q: what is BUFSIZ?
• Line buffered
• Data written on newline
• E.g.: Terminal I/O, stdin, stdout
• Unbuffered
• Data read/written immediately
• E.g.: stderr

Quiz 2: In what order do the prints happen?
printf("A");
printf("B");
fprintf(stderr, "Z");
printf("C\n");
fprintf(stderr, "Y");
printf("D\n");
fprintf(stderr, "X");

Answer (with stdout line buffered, as on a terminal):
ZABC
YD
X

Forced Buffer Cleanup: fflush
Forces writing of any unwritten data in a buffer
printf("A");
fflush(stdout); // <--- data in the buffer will be written to file
printf("B");
fprintf(stderr, "Z");
printf("C\n");
fprintf(stderr, "Y");
printf("D\n");
fprintf(stderr, "X");
In what order do the prints happen?
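The effect of fflush can also be observed directly, by comparing the size the kernel reports for a file before and after the flush. This is a rough sketch rather than anything from the lecture: size_on_disk and flush_demo are hypothetical names, and the temporary file comes from the standard tmpfile call.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno and fstat */
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: size of the file behind fp as the kernel sees it,
 * or -1 on error. Data still sitting in the stdio buffer is not counted. */
long size_on_disk(FILE *fp)
{
    struct stat st;
    if (fstat(fileno(fp), &st) != 0)
        return -1;
    return (long)st.st_size;
}

/* Writes one byte through stdio and reports the file size before and
 * after fflush. Returns 0 on success, -1 on error. */
int flush_demo(long *before, long *after)
{
    FILE *fp = tmpfile();              /* anonymous temporary file */
    if (fp == NULL)
        return -1;
    setvbuf(fp, NULL, _IOFBF, BUFSIZ); /* make full buffering explicit */

    fputc('A', fp);                    /* lands in the user-space buffer */
    *before = size_on_disk(fp);

    fflush(fp);                        /* forces the write system call */
    *after = size_on_disk(fp);

    fclose(fp);
    return 0;
}
```

Under full buffering the single byte stays in the library buffer, so the first size is 0 and the second is 1. The same mechanism is what reorders the printf and fprintf output in the quiz.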
Buffering: Benefits and Limitations
• Benefits:
• Improves I/O efficiency
• Need fewer data copies, device accesses
• Limitations:
• Could have unintended consequences
• E.g.: Out of order prints
• Might result in loss of data (e.g., on unexpected crashes)

File Descriptors

File Descriptors
• Identifier returned by the operating system on opening a file
• All operations performed on file descriptors
• Each process has a file descriptor table
• Contains currently open file descriptors
• Opening a file adds a new entry to the table
• Closing a file removes its entry from the table

File Descriptor Table
(With two indirect layers)

Standard I/O
• stdin, stdout, stderr
• File descriptors 0, 1 and 2
• STDIN_FILENO, STDOUT_FILENO, STDERR_FILENO
• Always open by default for each process
• Standard I/O is just like reading/writing to a file

Inheriting File Descriptors
• After fork(), all file descriptors are copied to the new process
• Each forked process has identical file-descriptor table
• Each process has the same files open
• The file offset is the same within each file for both processes
• Each file descriptor points to the same entry in the global open-file table
• Reads/writes/lseeks are shared by the processes

File Redirection

File Redirection
• Redirect the standard input or output to a file
• Input redirection (<): Takes input from a file instead of keyboard
• sort < file
• Output redirection (>): Send output to a file instead of display
• ls -l > file
• Append output to a file (>>)
• ls -l >> file
• stderr to stdout
• 2>&1

Duplicating File Descriptors: dup
int dup(int fd);
• Creates a copy of fd
• Returns new fd where it is copied
• Lowest available entry in the file descriptor table
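A quick way to see what "copy" means here: the new descriptor refers to the same open file, so the two share one file offset. The sketch below relies on that (standard) behaviour; offset_after_dup_writes is a hypothetical name chosen for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: writes 2 bytes through fd and 2 bytes through its
 * duplicate, then returns the resulting file offset (-1 on error). */
off_t offset_after_dup_writes(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    int copy = dup(fd);          /* lowest unused descriptor, same open file */
    if (copy < 0) {
        close(fd);
        return -1;
    }

    off_t end = -1;
    if (write(fd, "ab", 2) == 2 &&      /* advances the shared offset to 2 */
        write(copy, "cd", 2) == 2)      /* continues at offset 2, not at 0 */
        end = lseek(fd, 0, SEEK_CUR);   /* query the offset without moving it */

    close(copy);
    close(fd);
    return end;
}
```

The file ends up containing "abcd" rather than "cd": the second write did not start over at offset 0.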

Duplicating File Descriptors: dup2
int dup2(int fd1, int fd2);
• Creates a copy of fd1 onto fd2
• Closes fd2 if open—basically fd2 is replaced with fd1
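As one possible illustration of dup2, the sketch below temporarily points descriptor 1 at a file, prints through printf, and then restores the original stdout. The helper name print_to_file is made up, and saving the old stdout with dup is a design choice of this sketch rather than something the slide prescribes.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: prints text with stdout redirected to path,
 * then restores the original stdout. Returns 0 on success, -1 on error. */
int print_to_file(const char *path, const char *text)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    fflush(stdout);                   /* do not carry old buffered data over */
    int saved = dup(STDOUT_FILENO);   /* remember the real stdout */
    if (saved < 0) {
        close(fd);
        return -1;
    }

    if (dup2(fd, STDOUT_FILENO) < 0) {  /* descriptor 1 now refers to the file */
        close(fd);
        close(saved);
        return -1;
    }
    close(fd);                        /* the extra descriptor is no longer needed */

    printf("%s", text);
    fflush(stdout);                   /* push buffered output into the file */

    dup2(saved, STDOUT_FILENO);       /* put the original stdout back */
    close(saved);
    return 0;
}
```

The two fflush calls matter because printf buffers in user space: without them, text could end up on the wrong side of the redirection.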

Discussion: How is File Redirection Realized?
• How does a process redirect input/output to a file instead of standard I/O?
• E.g.: in a shell, how do we do: sort < file

File Redirection Example
Idea: Modify the file-descriptor table
/* Open the desired file */
fd = open(file1, ...);
/* Duplicate fd onto stdin, so fd becomes the "stdin" */
newfd = dup2(fd, STDIN_FILENO);
/* Now exec desired command (e.g., sort) */

Pipes

What are Pipes?
• Pipes are special files
• Pipes are a mechanism for inter-process communication (IPC, week 12)
• Allow processes to communicate and share data, signals, etc.
• A pipe provides a serial data channel between two processes
• Relatively simple IPC mechanism
• Channel is one-way
• No lseek (treat as character device file)

What are Pipes?
• prog1 | prog2
• Allow multiple processes to be linked together
• Connects output of prog1 to input of prog2
• In other words, the output of prog1 automatically becomes the input of prog2
• Examples:
• ls -l | more
• cat foo | sort | head

How do Pipes Differ from Regular Files?
• Sequential read/write
• Nameless
• Finite-sized memory buffer
• cat /proc/sys/fs/pipe-max-size (1M on my desktop)
• No disk I/O involved
• Sharing has to be done through inheritance
• Disappear as soon as all their ends are closed

Using Pipes: Overview
• Create a pipe: Returns two fds
• Share between processes using fork()
• One process would read, the other would write
• Read and write data through the pipe
• Close the pipe when done

Creating a Pipe: pipe
int pipe(int fds[2]);
• Creates a pipe and returns its two ends in an array of two fds
• fds[0]: for reading
• fds[1]: for writing
• Output of fds[1] is input of fds[0]

Sharing a Pipe across Processes
• The only way to share a pipe is via fork()
• After fork(), both processes get copies of the pipe fds

Pipe I/O: Initialization
• Pipe is used to channel data from one process to another
• One process would read from the pipe
• Another process would write to the pipe
• Each process closes one end of the pipe
• Writing process closes read-end
• Reading process closes write-end

Pipe I/O: Initialization
• Suppose the parent wants to read from the pipe and the child wants to write to the pipe

Replacing Standard I/O with Pipe
• Example: ls -l | sort
• Use dup2 to replace stdin or stdout
• Replace stdin with read-end of pipe (fds[0])
• Replace stdout with write-end of pipe (fds[1])

Pipe I/O: Data Sharing
• Reading: Done from read-end of pipe
• Returns whatever data is in the pipe
• Blocks if pipe is empty
• Returns 0 (EOF) if pipe is empty and write-end of pipe is closed
• Writing: Done on write-end of pipe
• Writes into the pipe buffer
• Blocks if pipe is full (finite buffer)
• Receives SIGPIPE signal if read-end of pipe is closed

FIFO
• Named pipes
• Unidirectional shared channels like pipes
• Have name and path like a regular file
• Persistent even when both ends are closed
• Benefits:
• Can link unrelated processes
• Longer lasting than pipes

Summary
• File I/O operations
• File buffering
• File descriptors
• File redirection
• Pipes and FIFO

Quiz
• What is the return value of read syscall?
• All devices can be accessed as files. True/False?
• Does buffering make I/O operations more efficient?
• What does the following call do?
newfd = dup2(STDOUT_FILENO, fd);

References
• Robbins 4.7, 5.1, 5.2
• http://www-users.cselabs.umn.edu/classes/Spring-2018/csci4061/notes/file_io.pdf
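Worked example for the "Using Pipes: Overview" and "Pipe I/O: Initialization" slides: the child writes into the pipe and the parent reads from it, each closing the end it does not use. This is a minimal sketch under stated assumptions, not project starter code; pipe_from_child and its parameters are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: a forked child writes msg into a pipe and the
 * parent reads it into out (capacity cap).
 * Returns the number of bytes the parent read, or -1 on error. */
ssize_t pipe_from_child(const char *msg, char *out, size_t cap)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();               /* both processes now hold both ends */
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                   /* child: the writer */
        close(fds[0]);                /* writer closes the read-end */
        size_t len = strlen(msg);
        ssize_t w = write(fds[1], msg, len);
        close(fds[1]);
        _exit(w == (ssize_t)len ? 0 : 1);
    }

    /* parent: the reader */
    close(fds[1]);                    /* otherwise read would never see EOF */
    size_t total = 0;
    ssize_t n = 0;
    while (total < cap &&
           (n = read(fds[0], out + total, cap - total)) > 0)
        total += (size_t)n;           /* loop until EOF or buffer is full */
    close(fds[0]);
    waitpid(pid, NULL, 0);            /* reap the child */

    return (n < 0) ? -1 : (ssize_t)total;
}
```

A shell pipeline such as ls -l | sort follows the same shape, except that each side calls dup2 to place its pipe end on stdout or stdin before exec.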