Carnegie Mellon
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
System-Level I/O
15-213/18-213/14-513/15-513/18-613: Introduction to Computer Systems
21st Lecture, November 10, 2020
Today
Unix I/O (CSAPP 10.1-10.4)
Metadata, sharing, and redirection (CSAPP 10.6-10.9)
Standard I/O (CSAPP 10.10)
RIO (robust I/O) package (CSAPP 10.5)
Closing remarks (CSAPP 10.11)
Today: Unix I/O, C Standard I/O, and RIO
Two sets: system-level and C level
Robust I/O (RIO): 15-213 special wrappers
▪ Good coding practice: handles error checking, signals, and “short counts”
[Figure: software layers, top to bottom.
▪ C application program
▪ Standard I/O functions (fopen, fdopen, fread, fwrite, fscanf, fprintf, sscanf, sprintf, fgets, fputs, fflush, fseek, fclose) and RIO functions (rio_readn, rio_writen, rio_readinitb, rio_readlineb, rio_readnb)
▪ Unix I/O functions, accessed via system calls (open, read, write, lseek, stat, close)]
Unix I/O Overview
A Linux file is a sequence of m bytes:
▪ B0, B1, …, Bk, …, Bm-1
Cool fact: All I/O devices are represented as files:
▪ /dev/sda2 (/usr disk partition)
▪ /dev/tty2 (terminal)
Even the kernel is represented as a file:
▪ /boot/vmlinuz-3.13.0-55-generic (kernel image)
▪ /proc (kernel data structures)
Unix I/O Overview
Elegant mapping of files to devices allows kernel to export simple interface called Unix I/O:
▪ Opening and closing files
  ▪ open() and close()
▪ Reading and writing a file
  ▪ read() and write()
▪ Changing the current file position (seek)
  ▪ The file position indicates the next offset into the file to read or write
  ▪ lseek()
[Figure: a file drawn as bytes B0, B1, …, Bk-1, Bk, Bk+1, …, with the current file position k pointing at Bk.]
File Types
Each file has a type indicating its role in the system
▪ Regular file: Contains arbitrary data
▪ Directory: Index for a related group of files
▪ Socket: For communicating with a process on another machine
Other file types beyond our scope
▪ Named pipes (FIFOs)
▪ Symbolic links
▪ Character and block devices
Regular Files
A regular file contains arbitrary data
Applications often distinguish between text files and binary files
▪ Text files are regular files with only ASCII or Unicode characters
▪ Binary files are everything else
  ▪ e.g., object files, JPEG images
▪ Kernel doesn’t know the difference!
Text file is sequence of text lines
▪ Text line is sequence of chars terminated by newline char (‘\n’)
  ▪ Newline is 0xa, same as ASCII line feed character (LF)
End of line (EOL) indicators in other systems
▪ Linux and Mac OS: ‘\n’ (0xa)
  ▪ Line feed (LF)
▪ Windows and Internet protocols: ‘\r\n’ (0xd 0xa)
  ▪ Carriage return (CR) followed by line feed (LF)
Directories
Directory consists of an array of links
▪ Each link maps a filename to a file
Each directory contains at least two entries
▪ . (dot) is a link to itself
▪ .. (dot dot) is a link to the parent directory in the directory hierarchy (next slide)
Commands for manipulating directories
▪ mkdir: create empty directory
▪ ls: view directory contents
▪ rmdir: delete empty directory
Directory Hierarchy
All files are organized as a hierarchy anchored by root directory named / (slash)
/
├── bin/
│   └── bash
├── dev/
│   └── tty1
├── etc/
│   ├── group
│   └── passwd
├── home/
│   ├── droh/
│   │   └── hello.c
│   └── bryant/
└── usr/
    ├── include/
    │   ├── stdio.h
    │   └── sys/
    │       └── unistd.h
    └── bin/
        └── vim
Kernel maintains current working directory (cwd) for each process
▪ Modified using the cd command
Pathnames
Locations of files in the hierarchy denoted by pathnames
▪ Absolute pathname starts with ‘/’ and denotes path from root
  ▪ /home/droh/hello.c
▪ Relative pathname denotes path from current working directory (cwd)
  ▪ ../droh/hello.c
[Figure: the directory hierarchy from the previous slide, with cwd: /home/bryant.]
Opening Files
Opening a file informs the kernel that you are getting ready to access that file
int fd; /* file descriptor */

if ((fd = open("/etc/hosts", O_RDONLY)) < 0) {
    perror("open");
    exit(1);
}
Returns a small identifying integer file descriptor
▪ Lowest numbered file descriptor not currently open for the process
▪ fd == -1 indicates that an error occurred
Each process created by a Linux shell begins life with three open files associated with a terminal:
▪ 0: standard input (stdin)
▪ 1: standard output (stdout)
▪ 2: standard error (stderr)
Closing Files
Closing a file informs the kernel that you are finished accessing that file
int fd;     /* file descriptor */
int retval; /* return value */

if ((retval = close(fd)) < 0) {
    perror("close");
    exit(1);
}
Closing an already closed file is a recipe for disaster in threaded programs (more on this later)
Moral: Always check return codes, even for seemingly benign functions such as close()
Reading Files
Reading a file copies bytes from the current file position to memory, and then updates file position
#include <unistd.h>

char buf[512];
int fd;         /* file descriptor */
ssize_t nbytes; /* number of bytes read */

/* Open file fd */
…
/* Then read up to 512 bytes from file fd */
if ((nbytes = read(fd, buf, sizeof(buf))) < 0) {
    perror("read");
    exit(1);
}
Returns number of bytes read from file fd into buf
▪ Return type ssize_t is signed integer
▪ nbytes < 0 indicates that an error occurred
▪ Short counts (nbytes < sizeof(buf)) are possible and are not errors!
Writing Files
Writing a file copies bytes from memory to the current file position, and then updates current file position
char buf[512];
int fd;         /* file descriptor */
ssize_t nbytes; /* number of bytes written */

/* Open the file fd */
...
/* Then write up to 512 bytes from buf to file fd */
if ((nbytes = write(fd, buf, sizeof(buf))) < 0) {
    perror("write");
    exit(1);
}
Returns number of bytes written from buf to file fd
▪ nbytes < 0 indicates that an error occurred
▪ As with reads, short counts are possible and are not errors!
Simple Unix I/O example
Copying file to stdout, one byte at a time
#include "csapp.h"

int main(int argc, char *argv[])
{
    char c;
    int infd = STDIN_FILENO;
    if (argc == 2) {
        infd = Open(argv[1], O_RDONLY, 0);
    }
    while (Read(infd, &c, 1) != 0)
        Write(STDOUT_FILENO, &c, 1);
    exit(0);
}
showfile1_nobuf.c
Demo:
linux> strace ./showfile1_nobuf names.txt
On Short Counts
Short counts can occur in these situations:
▪ Encountering (end-of-file) EOF on reads
▪ Reading text lines from a terminal
▪ Reading and writing network sockets
Short counts never occur in these situations:
▪ Reading from disk files (except for EOF)
▪ Writing to disk files
Best practice is to always allow for short counts.
Home-grown buffered I/O code
Copying file to stdout, BUFSIZE bytes at a time
#include "csapp.h"
#define BUFSIZE 64

int main(int argc, char *argv[])
{
    char buf[BUFSIZE];
    ssize_t nread; /* bytes returned by the latest Read */
    int infd = STDIN_FILENO;
    if (argc == 2) {
        infd = Open(argv[1], O_RDONLY, 0);
    }
    while ((nread = Read(infd, buf, BUFSIZE)) != 0)
        Write(STDOUT_FILENO, buf, nread);
    exit(0);
}
showfile2_buf.c
Demo:
linux> strace ./showfile2_buf names.txt
Today
Unix I/O
Metadata, sharing, and redirection
Standard I/O
RIO (robust I/O) package
Closing remarks
File Metadata
Metadata is data about data, in this case file data
Per-file metadata maintained by kernel
▪ accessed by users with the stat and fstat functions
/* Metadata returned by the stat and fstat functions */
struct stat {
    dev_t         st_dev;     /* Device */
    ino_t         st_ino;     /* inode */
    mode_t        st_mode;    /* Protection and file type */
    nlink_t       st_nlink;   /* Number of hard links */
    uid_t         st_uid;     /* User ID of owner */
    gid_t         st_gid;     /* Group ID of owner */
    dev_t         st_rdev;    /* Device type (if inode device) */
    off_t         st_size;    /* Total size, in bytes */
    unsigned long st_blksize; /* Blocksize for filesystem I/O */
    unsigned long st_blocks;  /* Number of blocks allocated */
    time_t        st_atime;   /* Time of last access */
    time_t        st_mtime;   /* Time of last modification */
    time_t        st_ctime;   /* Time of last change */
};
How the Unix Kernel Represents Open Files
Two descriptors referencing two distinct open files. Descriptor 1 (stdout) points to terminal, and descriptor 4 points to open disk file
[Figure: three kernel tables.
▪ Descriptor table (one table per process): fd 0 (stdin), fd 1 (stdout), fd 2 (stderr), fd 3, fd 4. fd 1 points to open file table entry File A; fd 4 points to entry File B.
▪ Open file table (shared by all processes): entry File A (terminal) and entry File B (disk), each holding File pos and refcnt=1.
▪ v-node table (shared by all processes): one entry per file holding File access, File size, and File type (the info in the stat struct).]
File pos is maintained per open file
File Sharing
Two distinct descriptors sharing the same disk file through two distinct open file table entries
▪ E.g., Calling open twice with the same filename argument
[Figure: two descriptors point to two different open file table entries, File A and File B, each with its own File pos and refcnt=1. Both entries point to the same v-node table entry (File access, File size, File type) for the one disk file.]
Different logical but same physical file
How Processes Share Files: fork
A child process inherits its parent’s open files
▪ Note: situation unchanged by exec functions (use fcntl to change)
Before fork call:
[Figure: the parent’s descriptor table (one table per process) with fd 0 (stdin), fd 1 (stdout), fd 2 (stderr), fd 3, fd 4. Open file table (shared by all processes): File A (terminal) and File B (disk), each with File pos and refcnt=1. v-node table (shared by all processes): File access, File size, and File type for each file.]
How Processes Share Files: fork
A child process inherits its parent’s open files
After fork:
▪ Child’s table same as parent’s, and +1 to each refcnt
[Figure: parent and child each have their own descriptor table (fd 0 to fd 4), and corresponding descriptors point to the same open file table entries. File A (terminal) and File B (disk) now each have refcnt=2; the v-node table is unchanged.]
File is shared between processes
I/O Redirection
Question: How does a shell implement I/O redirection?
linux> ls > foo.txt
Answer: By calling the dup2(oldfd, newfd) function
▪ Copies (per-process) descriptor table entry oldfd to entry newfd
[Figure: descriptor table before dup2(4,1): fd 1 holds entry a and fd 4 holds entry b. Descriptor table after dup2(4,1): fd 1 and fd 4 both hold entry b.]
I/O Redirection Example
Step #1: open file to which stdout should be redirected
▪ Happens in child executing shell code, before exec
[Figure: fd 1 (stdout) points to open file table entry File A and fd 4 points to the newly opened entry File B; each entry has its own File pos and refcnt=1.]
I/O Redirection Example (cont.)
Step #2: call dup2(4,1)
▪ Causes fd=1 (stdout) to refer to disk file pointed at by fd=4
[Figure: fd 1 and fd 4 both point to open file table entry File B (refcnt=2); File A is no longer referenced by this descriptor table (refcnt=0).]
Two descriptors point to the same file
Warm-Up: I/O and Redirection Example
#include "csapp.h"

int main(int argc, char *argv[])
{
    int fd1, fd2, fd3;
    char c1, c2, c3;
    char *fname = argv[1];
    fd1 = Open(fname, O_RDONLY, 0);
    fd2 = Open(fname, O_RDONLY, 0);
    fd3 = Open(fname, O_RDONLY, 0);
    Dup2(fd2, fd3);
    Read(fd1, &c1, 1);
    Read(fd2, &c2, 1);
    Read(fd3, &c3, 1);
    printf("c1 = %c, c2 = %c, c3 = %c\n", c1, c2, c3);
    return 0;
}
ffiles1.c
What would this program print for file containing “abcde”?
Answer: c1 = a, c2 = a, c3 = b
▪ dup2(oldfd, newfd): after Dup2(fd2, fd3), fd3 shares fd2’s open file table entry, and therefore its file position, so the read on fd3 continues where the read on fd2 left off
Master Class: Process Control and I/O
#include "csapp.h"

int main(int argc, char *argv[])
{
    int fd1;
    int s = getpid() & 0x1;
    char c1, c2;
    char *fname = argv[1];
    fd1 = Open(fname, O_RDONLY, 0);
    Read(fd1, &c1, 1);
    if (fork()) { /* Parent */
        sleep(s);
        Read(fd1, &c2, 1);
        printf("Parent: c1 = %c, c2 = %c\n", c1, c2);
    } else { /* Child */
        sleep(1-s);
        Read(fd1, &c2, 1);
        printf("Child: c1 = %c, c2 = %c\n", c1, c2);
    }
    return 0;
}
ffiles2.c
What would this program print for file containing “abcde”?
Answer: parent and child share one file position, so whichever reads second gets ‘c’. Two outputs are possible:
Child: c1 = a, c2 = b
Parent: c1 = a, c2 = c
or
Parent: c1 = a, c2 = b
Child: c1 = a, c2 = c
Bonus: Which way does it go? (Hint: s is the low bit of the parent’s PID, computed before fork.)
Quiz Time!
Check out:
https://canvas.cmu.edu/courses/17808
Today
Unix I/O
Metadata, sharing, and redirection
Standard I/O
RIO (robust I/O) package
Closing remarks
Standard I/O Functions
The C standard library (libc.so) contains a collection of higher-level standard I/O functions
▪ Documented in Appendix B of K&R
Examples of standard I/O functions:
▪ Opening and closing files (fopen and fclose)
▪ Reading and writing bytes (fread and fwrite)
▪ Reading and writing text lines (fgets and fputs)
▪ Formatted reading and writing (fscanf and fprintf)
Standard I/O Streams
Standard I/O models open files as streams
▪ Abstraction for a file descriptor and a buffer in memory
C programs begin life with three open streams (defined in stdio.h)
▪ stdin (standard input)
▪ stdout (standard output)
▪ stderr (standard error)
#include <stdio.h>
extern FILE *stdin;  /* standard input (descriptor 0) */
extern FILE *stdout; /* standard output (descriptor 1) */
extern FILE *stderr; /* standard error (descriptor 2) */

int main() {
    fprintf(stdout, "Hello, world\n");
}
Buffered I/O: Motivation
Applications often read/write one character at a time
▪ getc, putc, ungetc
▪ gets, fgets
  ▪ Read line of text one character at a time, stopping at newline
Implementing as Unix I/O calls expensive
▪ read and write require Unix kernel calls
  ▪ > 10,000 clock cycles
Solution: Buffered read
▪ Use Unix read to grab block of bytes
▪ User input functions take one byte at a time from buffer
  ▪ Refill buffer when empty
[Figure: Buffer, divided into an “already read” portion followed by an “unread” portion.]
Buffering in Standard I/O
Standard I/O functions use buffered I/O
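printf("h");
printf("e");
printf("l");
printf("l");
printf("o");
printf("\n");
fflush(stdout);
[Figure: each printf appends its character to the buffer buf, which ends up holding h e l l o \n; fflush(stdout) then issues a single write(1, buf, 6).]
Buffer flushed to output fd on “\n” (when the stream is line buffered, as stdout is on a terminal), call to fflush or exit, or return from main.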
Standard I/O Buffering in Action
You can see this buffering in action for yourself, using the always fascinating Linux strace program:
#include <stdio.h>
#include <stdlib.h>

int main() {
    printf("h");
    printf("e");
    printf("l");
    printf("l");
    printf("o");
    printf("\n");
    fflush(stdout);
    exit(0);
}
linux> strace ./hello
execve("./hello", ["hello"], [/* … */]).
…
write(1, "hello\n", 6) = 6
…
exit_group(0) = ?
Standard I/O Example
Copying file to stdout, line-by-line with stdio
#include "csapp.h"
#define MLINE 1024

int main(int argc, char *argv[])
{
    char buf[MLINE];
    FILE *infile = stdin;
    if (argc == 2) {
        infile = fopen(argv[1], "r");
        if (!infile) exit(1);
    }
    while (fgets(buf, MLINE, infile) != NULL)
        fputs(buf, stdout); /* not fprintf(stdout, buf): buf is data, not a format string */
    exit(0);
}
showfile3_stdio.c
Demo:
linux> strace ./showfile3_stdio names.txt
Today
Unix I/O
Metadata, sharing, and redirection
Standard I/O
RIO (robust I/O) package
Closing remarks
Today: Unix I/O, C Standard I/O, and RIO
Two incompatible libraries building on Unix I/O
Robust I/O (RIO): 15-213 special wrappers
▪ Good coding practice: handles error checking, signals, and “short counts”
[Figure: software layers, top to bottom.
▪ C application program
▪ Standard I/O functions (fopen, fdopen, fread, fwrite, fscanf, fprintf, sscanf, sprintf, fgets, fputs, fflush, fseek, fclose) and RIO functions (rio_readn, rio_writen, rio_readinitb, rio_readlineb, rio_readnb)
▪ Unix I/O functions, accessed via system calls (open, read, write, lseek, stat, close)]
Unix I/O Recap
/* Read at most max_count bytes from file into buffer.
   Return number of bytes read, or error value */
ssize_t read(int fd, void *buffer, size_t max_count);
/* Write at most max_count bytes from buffer to file.
   Return number of bytes written, or error value */
ssize_t write(int fd, const void *buffer, size_t max_count);
Short counts can occur in these situations:
▪ Encountering (end-of-file) EOF on reads
▪ Reading text lines from a terminal
▪ Reading and writing network sockets
Short counts never occur in these situations:
▪ Reading from disk files (except for EOF)
▪ Writing to disk files
Best practice is to always allow for short counts.
The RIO Package (15-213/CS:APP Package)
RIO is a set of wrappers that provide efficient and robust I/O in apps, such as network programs that are subject to short counts
RIO provides two different kinds of functions
▪ Unbuffered input and output of binary data
  ▪ rio_readn and rio_writen
▪ Buffered input of text lines and binary data
  ▪ rio_readlineb and rio_readnb
  ▪ Buffered RIO routines are thread-safe and can be interleaved arbitrarily on the same descriptor
Download from http://csapp.cs.cmu.edu/3e/code.html
▪ src/csapp.c and include/csapp.h
Unbuffered RIO Input and Output
Same interface as Unix read and write
Especially useful for transferring data on network sockets
#include "csapp.h"
ssize_t rio_readn(int fd, void *usrbuf, size_t n);
ssize_t rio_writen(int fd, void *usrbuf, size_t n);
Return: num. bytes transferred if OK, 0 on EOF (rio_readn only), -1 on error
▪ rio_readn returns short count only if it encounters EOF
  ▪ Only use it when you know how many bytes to read
▪ rio_writen never returns a short count
▪ Calls to rio_readn and rio_writen can be interleaved arbitrarily on the same descriptor
Implementation of rio_readn
/*
 * rio_readn - Robustly read n bytes (unbuffered)
 */
ssize_t rio_readn(int fd, void *usrbuf, size_t n)
{
    size_t nleft = n;
    ssize_t nread;
    char *bufp = usrbuf;

    while (nleft > 0) {
        if ((nread = read(fd, bufp, nleft)) < 0) {
            if (errno == EINTR) /* Interrupted by sig handler return */
                nread = 0;      /* and call read() again */
            else
                return -1;      /* errno set by read() */
        }
        else if (nread == 0)
            break;              /* EOF */
        nleft -= nread;
        bufp += nread;
    }
    return (n - nleft);         /* Return >= 0 */
}
csapp.c
Buffered RIO Input Functions
Efficiently read text lines and binary data from a file partially cached in an internal memory buffer
#include "csapp.h"
void rio_readinitb(rio_t *rp, int fd);
ssize_t rio_readlineb(rio_t *rp, void *usrbuf, size_t maxlen);
ssize_t rio_readnb(rio_t *rp, void *usrbuf, size_t n);
Return: num. bytes read if OK, 0 on EOF, -1 on error
▪ rio_readlineb reads a text line of up to maxlen bytes from file fd and stores the line in usrbuf
  ▪ Especially useful for reading text lines from network sockets
▪ Stopping conditions
  ▪ maxlen bytes read
  ▪ EOF encountered
  ▪ Newline (‘\n’) encountered
Buffered RIO Input Functions (cont)
#include "csapp.h"
void rio_readinitb(rio_t *rp, int fd);
ssize_t rio_readlineb(rio_t *rp, void *usrbuf, size_t maxlen);
ssize_t rio_readnb(rio_t *rp, void *usrbuf, size_t n);
Return: num. bytes read if OK, 0 on EOF, -1 on error
▪ rio_readnb reads up to n bytes from file fd
▪ Stopping conditions
  ▪ n bytes read
  ▪ EOF encountered
▪ Calls to rio_readlineb and rio_readnb can be interleaved arbitrarily on the same descriptor
▪ Warning: Don’t interleave with calls to rio_readn
Buffered I/O: Implementation
For reading from file
File has associated buffer to hold bytes that have been read from file but not yet read by user code
[Figure: the buffer starts at rio_buf and is split into an “already read” part and an “unread” part; rio_bufptr points to the first unread byte and rio_cnt is the number of unread bytes.]
Layered on Unix file:
[Figure: the Unix file consists of a part that is no longer in the buffer, the buffered portion (already read, then unread), and an unseen part; the current file position is at the end of the buffered portion.]
Buffered I/O: Declaration
All information contained in struct
[Figure: rio_buf is the start of the buffer, rio_bufptr points to the first unread byte, and rio_cnt counts the unread bytes.]
typedef struct {
    int rio_fd;                /* descriptor for this internal buf */
    int rio_cnt;               /* unread bytes in internal buf */
    char *rio_bufptr;          /* next unread byte in internal buf */
    char rio_buf[RIO_BUFSIZE]; /* internal buffer */
} rio_t;
RIO Example
Copying file to stdout, line-by-line with rio
#include "csapp.h"
#define MLINE 1024

int main(int argc, char *argv[])
{
    rio_t rio;
    char buf[MLINE];
    int infd = STDIN_FILENO;
    ssize_t nread = 0;
    if (argc == 2) {
        infd = Open(argv[1], O_RDONLY, 0);
    }
    Rio_readinitb(&rio, infd);
    while ((nread = Rio_readlineb(&rio, buf, MLINE)) != 0)
        Rio_writen(STDOUT_FILENO, buf, nread);
    exit(0);
}
showfile4_rio.c
Demo:
linux> strace ./showfile4_rio names.txt
Today
Unix I/O
Metadata, sharing, and redirection
Standard I/O
RIO (robust I/O) package
Closing remarks
mmap Example
Copying file to stdout, loading entire file with mmap
#include "csapp.h"

int main(int argc, char **argv)
{
    struct stat stat;
    if (argc != 2) exit(1);
    int infd = Open(argv[1], O_RDONLY, 0);
    Fstat(infd, &stat);
    size_t size = stat.st_size;
    char *bufp = Mmap(NULL, size, PROT_READ, MAP_PRIVATE, infd, 0);
    Write(1, bufp, size);
    exit(0);
}
showfile5_mmap.c
Demo:
linux> strace ./showfile5_mmap names.txt
Unix I/O vs. Standard I/O vs. RIO
Standard I/O and RIO are implemented using low-level Unix I/O
[Figure: software layers, top to bottom.
▪ C application program
▪ Standard I/O functions (fopen, fdopen, fread, fwrite, fscanf, fprintf, sscanf, sprintf, fgets, fputs, fflush, fseek, fclose) and RIO functions (rio_readn, rio_writen, rio_readinitb, rio_readlineb, rio_readnb)
▪ Unix I/O functions, accessed via system calls (open, read, write, lseek, stat, close)]
Which ones should you use in your programs?
Pros and Cons of Unix I/O
Pros
▪ Unix I/O is the most general and lowest overhead form of I/O
▪ All other I/O packages are implemented using Unix I/O functions
▪ Unix I/O provides functions for accessing file metadata
▪ Unix I/O functions are async-signal-safe and can be used safely in signal handlers
Cons
▪ Dealing with short counts is tricky and error prone
▪ Efficient reading of text lines requires some form of buffering, also tricky and error prone
▪ Both of these issues are addressed by the standard I/O and RIO packages
Pros and Cons of Standard I/O
Pros:
▪ Buffering increases efficiency by decreasing the number of read and write system calls
▪ Short counts are handled automatically
Cons:
▪ Provides no function for accessing file metadata
▪ Standard I/O functions are not async-signal-safe, and not appropriate for signal handlers
▪ Standard I/O is not appropriate for input and output on network sockets
  ▪ There are poorly documented restrictions on streams that interact badly with restrictions on sockets (CS:APP3e, Sec 10.11)
Choosing I/O Functions
General rule: use the highest-level I/O functions you can
▪ Many C programmers are able to do all of their work using the standard I/O functions
▪ But, be sure to understand the functions you use!
When to use standard I/O
▪ When working with disk or terminal files
When to use raw Unix I/O
▪ Inside signal handlers, because Unix I/O is async-signal-safe
▪ In rare cases when you need absolute highest performance
When to use RIO
▪ When you are reading and writing network sockets
▪ Avoid using standard I/O on sockets
Aside: Working with Binary Files
Binary File
▪ Sequence of arbitrary bytes
  ▪ Including byte value 0x00
Functions you should never use on binary files
▪ Text-oriented I/O: such as fgets, scanf, rio_readlineb
  ▪ Interpret EOL characters.
  ▪ Use functions like rio_readn or rio_readnb instead
▪ String functions
  ▪ strlen, strcpy, strcat
  ▪ Interpret byte value 0 (end of string) as special
Extra Slides
Fun with File Descriptors (3)
#include "csapp.h"

int main(int argc, char *argv[])
{
    int fd1, fd2, fd3;
    char *fname = argv[1];
    fd1 = Open(fname, O_CREAT|O_TRUNC|O_RDWR, S_IRUSR|S_IWUSR);
    Write(fd1, "pqrs", 4);
    fd3 = Open(fname, O_APPEND|O_WRONLY, 0);
    Write(fd3, "jklmn", 5);
    fd2 = dup(fd1); /* Allocates descriptor */
    Write(fd2, "wxyz", 4);
    Write(fd3, "ef", 2);
    return 0;
}
ffiles3.c
What would be the contents of the resulting file?
Accessing Directories
Only recommended operation on a directory: read its entries
▪ dirent structure contains information about a directory entry
▪ DIR structure contains information about directory while stepping through its entries
#include <sys/types.h>
#include <dirent.h>
{
    DIR *directory;
    struct dirent *de;
    …
    if (!(directory = opendir(dir_name)))
        error("Failed to open directory");
    …
    while (0 != (de = readdir(directory))) {
        printf("Found file: %s\n", de->d_name);
    }
    …
    closedir(directory);
}
Example of Accessing File Metadata
int main (int argc, char **argv)
{
    struct stat stat;
    char *type, *readok;

    Stat(argv[1], &stat);
    if (S_ISREG(stat.st_mode))     /* Determine file type */
        type = "regular";
    else if (S_ISDIR(stat.st_mode))
        type = "directory";
    else
        type = "other";
    if ((stat.st_mode & S_IRUSR)) /* Check read access */
        readok = "yes";
    else
        readok = "no";

    printf("type: %s, read: %s\n", type, readok);
    exit(0);
}
statcheck.c
linux> ./statcheck statcheck.c
type: regular, read: yes
linux> chmod 000 statcheck.c
linux> ./statcheck statcheck.c
type: regular, read: no
linux> ./statcheck ..
type: directory, read: yes
For Further Information
The Unix bible:
▪ W. Richard Stevens & Stephen A. Rago, Advanced Programming in the Unix Environment, 3rd Edition, Addison Wesley, 2013
  ▪ Updated from Stevens’s 1993 classic text
The Linux bible:
▪ Michael Kerrisk, The Linux Programming Interface, No Starch Press, 2010
  ▪ Encyclopedic and authoritative