ICT374 Unix File Systems and Programming
Unix File Systems and Programming
Objectives
• Understand the concept of file in Unix systems
• Understand the internal data structures for open files
• Understand and be able to use low level I/O primitives
• Understand the I/O efficiency issue related to the buffer
• Understand the standard input, standard output and standard error and their redirections via dup and dup2 system calls
• Be able to use function fcntl
• Understand and be able to use C’s standard I/O library
• Understand the role and the content of i-nodes
• Be aware of the structure and content of directory files
• Understand the structure of a typical Unix file system
• Be aware of the various types of UNIX files
• Understand and be able to change the access permissions of files in programs
• Understand the role of umask value
• Understand and be able to use both hard links and symbolic links
• Be able to obtain information about a file
• Understand and be able to read and write directories in
• Stevens & Rago: Ch 3 & Ch 4
• Skim Stallings’ Chapter 12
1. The Unix File Abstraction
To an application programmer, every Unix file has the following features:
(a) a file name
(b) a sequence of bytes
(c) no further structures (such as lines or records)
Behind the scenes, a file is stored as a collection of fixed-size blocks on a storage medium such as a hard disk. These blocks may be scattered around different parts of the disk or similar block device.
2. The Kernel Data Structures for File I/O
Before reading from or writing to a file, the kernel must “open” the file first. The open file operation establishes the appropriate data structures in the kernel to facilitate the actual I/O between the disk and the RAM in the future:
(a) To locate the file on the disk according to its file name (or path);
(b) To establish data structures for accessing and sharing the file among different processes;
(c) To allocate a file descriptor for the file so that the subsequent read/write requests would use this file descriptor (instead of the file name) to refer to this open file.
The classical Unix system uses three data structures for this purpose: 1) the Per Process Table of Open Files, 2) the system-wide File Table, and 3) the system-wide V-Node Table. Understanding these data structures will help you gain an insight into how files are accessed and shared, and into various efficiency issues.
Please note that not all Unix-like operating systems (e.g., Linux) use exactly the same kernel data structures for their open files. However, they all support the abstraction that the classical Unix kernel data structures implement.
(1) The Per Process Table of Open Files
(a) Each process has one Per Process Table of Open Files, which contains the information of the files the process opened (and not yet closed), one entry for each file. The index of the entry becomes the file descriptor of the open file.
(b) In the entry of an open file, the file descriptor flag specifies whether this file descriptor should be closed upon executing one of the “exec” functions. POSIX.1 only defined one flag: FD_CLOEXEC, which means that the open file is closed when the exec function loads in a new program into the process.
(c) The “ptr” points to the entry in the File Table that describes how the file should be accessed.
(d) When a new process is created, three “files” are opened automatically, pointing to the terminal input device (usually the keyboard) and the terminal output device (usually the display or terminal window), and their file descriptors are 0, 1 and 2 respectively. These three open files are known as the standard input (file descriptor 0), the standard output (file descriptor 1) and the standard error (file descriptor 2) for the new process.
(e) Any read/write operation for a file must use its file descriptor (not the file name, why?) to refer to the file.
(2) The File Table
The kernel also keeps one system-wide table for all open files and for all processes. Each open call is given one entry in this table. This means that, if the same file is opened several times (not yet closed) from several different processes, there will be multiple entries in this table for that file.
(a) The File Status Flags indicate whether the file is opened for
• read only (O_RDONLY) or
• write only (O_WRONLY) or
• both read and write (O_RDWR)
• append on each write (O_APPEND)
• non-blocking read (O_NONBLOCK)
• wait for write to complete (O_SYNC)
(b) Current offset indicates the position where the next read/write begins;
(c) V-node ptr points to the entry in the V-node Table that contains all pertinent information on that file.
(3) The V-Node Table
Apart from the File Table, the kernel also keeps another system-wide table, known as the V-Node Table, for all open files and for all processes. Unlike the File Table, each open file has one and only one entry in this table.
(a) V-node information: the information about the type of file (regular, pipe, directory, etc.) and a pointer to the function that operates on that type of file.
(b) If the file is a regular or directory file, the entry in the V-node Table also contains the i-node of the file, which is copied from the disk. The i-node contains all pertinent information about the file, such as owner, access permission, size and access time.
(c) A file can only have one entry in the V-Node Table, no matter how many times it was opened and how many processes have opened it. This is in contrast to the File Table, where a file may have several entries, depending on how many times it was opened (and not closed).
[Figure: V-Node Table entries, each holding V-node information and i-node information (including the current file size)]
(4) Sharing Files Among Processes
The following diagram illustrates how file sharing is achieved among different processes via the three tables.
In the above diagram, there are two processes, each having its own Per Process Table of Open Files, sharing two files: File A and File B.
Both files are shared between the two processes, but in different ways: File B is shared via the same File Table entry, while File A is shared via two separate File Table entries.
What are the consequences due to the differences in the way the files are shared?
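One consequence, the current offset, can be observed directly from a program. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, and dup is used as a convenient way to obtain two descriptors that share one File Table entry within a single process. Each helper reads 3 bytes through the first descriptor and then reports the offset seen through the second.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open `path` twice, so the two descriptors refer to
 * two separate File Table entries. Returns the offset of the second
 * descriptor after 3 bytes were read via the first one, or -1 on error. */
off_t offset_after_separate_open(const char *path)
{
    char buf[3];
    off_t pos = -1;
    int fd1 = open(path, O_RDONLY);
    int fd2 = open(path, O_RDONLY);

    if (fd1 != -1 && fd2 != -1 && read(fd1, buf, sizeof buf) == 3)
        pos = lseek(fd2, 0, SEEK_CUR);   /* fd2 keeps its own offset */
    if (fd1 != -1)
        close(fd1);
    if (fd2 != -1)
        close(fd2);
    return pos;
}

/* Hypothetical helper: open `path` once and duplicate the descriptor, so
 * both descriptors refer to the same File Table entry. Returns the offset
 * seen through the duplicate after 3 bytes were read via the original. */
off_t offset_after_dup(const char *path)
{
    char buf[3];
    off_t pos = -1;
    int fd1 = open(path, O_RDONLY);
    int fd2 = (fd1 == -1) ? -1 : dup(fd1);

    if (fd1 != -1 && fd2 != -1 && read(fd1, buf, sizeof buf) == 3)
        pos = lseek(fd2, 0, SEEK_CUR);   /* offset is shared with fd1 */
    if (fd1 != -1)
        close(fd1);
    if (fd2 != -1)
        close(fd2);
    return pos;
}
```

For a file of at least 3 bytes, the first helper should report offset 0 and the second offset 3, because the current offset lives in the File Table entry rather than in the per-process table.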
3. System calls open and close
The open system call establishes the necessary data structures for a file: it allocates an entry in the Per Process Table of Open Files, adds an entry to the File Table, and adds an entry to the V-Node Table if this is the first time the file is opened by any process. It returns a file descriptor that will be used to represent the open file. The file descriptor is the index of the allocated entry in the Per Process Table of Open Files.
The close system call releases the file descriptor. The File Table entry is removed once no file descriptor refers to it any more, and the V-Node Table entry once no File Table entry refers to it.
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int open (const char *pathname, int oflag, ... /* mode_t mode */ );
Return the file descriptor if OK or -1 on error
oflag: consists of exactly one of the following three flags (mutually exclusive):
O_RDONLY – read only or
O_WRONLY – write only or
O_RDWR – both read and write
and a combination of the following optional flags (incomplete list):
O_APPEND – to append to end of the file on each write, i.e., set the offset to the end of file before each write
O_CREAT – to create the file on disk if the named file does not exist. This option requires the third argument, mode, which specifies the access permission for the new file.
O_EXCL – to generate an error if option O_CREAT is specified and the file already exists.
O_TRUNC – if the file exists and is open for write or for both read and write, truncate its length to 0 (i.e., replace the old content)
O_NONBLOCK – mainly for FIFOs, character special files or block special files. If the I/O cannot be satisfied immediately, return with an error without waiting.
The file descriptor returned from the open or creat system call is guaranteed to be the lowest number available (unused descriptor).
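The lowest-number guarantee can be checked with a small experiment. The following is a minimal sketch, assuming a single-threaded process; the helper name is hypothetical. It opens a file twice, closes the first descriptor, and opens the file again to see whether the freed number is handed out again.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a descriptor number freed by close is
 * reused by the next open, 0 if not, and -1 if a call fails.
 * Assumes no other thread opens or closes descriptors in the meantime. */
int reopen_reuses_lowest(const char *path)
{
    int a, b, c, result;

    a = open(path, O_RDONLY);
    if (a == -1)
        return -1;
    b = open(path, O_RDONLY);        /* b is some number greater than a */
    if (b == -1) {
        close(a);
        return -1;
    }
    close(a);                        /* a is now the lowest unused number */
    c = open(path, O_RDONLY);
    if (c == -1) {
        close(b);
        return -1;
    }
    result = (c == a);
    close(b);
    close(c);
    return result;
}
```

This property is what makes the close-then-open (or close-then-dup) idiom for redirecting a standard descriptor work.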
Example 1: Open an existing file for writing
    fd = open ("foo", O_WRONLY);
Example 2: Append any writing to the existing file
    fd = open ("foo", O_WRONLY | O_APPEND);
Example 3: If the file “foo” exists, open it for writing. Otherwise create a new file on the disk for writing with access mode 0766 (octal number)
    fd = open ("foo", O_WRONLY | O_CREAT, 0766);
Example 4: Same as the above except that each write is appended to the end of the file
    fd = open ("foo", O_WRONLY | O_CREAT | O_APPEND, 0766);
Example 5: Open the terminal device /dev/ttyp0 for non-blocking read
    fd = open ("/dev/ttyp0", O_RDONLY | O_NONBLOCK);
The close System Call
#include <unistd.h>
int close (int filedescriptor);
Return 0 if OK or -1 on error
4. System Calls: read and write
#include <unistd.h>
ssize_t read (int filedes, void *buffer, size_t nbytes);
Return: number of bytes read in, or 0 if the offset is pointing to the end of file, or –1 if error
Note: it is the responsibility of the caller to allocate space for the buffer!
#include <unistd.h>
ssize_t write (int filedes, const void *buf, size_t nbytes);
Return: number of bytes written if OK or –1 on error
Example: Copy file fd1 to file fd2
    void copy (int fd1, int fd2)
    {
        ssize_t nread;
        char buf[100];
        while ((nread = read(fd1, buf, 100)) > 0)
            write (fd2, buf, nread);
    }
Is the above program correct? If not, what could be the problem? (Hints: think about the write call in the program).
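One possible answer is that write may fail, or may write fewer bytes than requested, and the copy function above ignores its return value. The sketch below shows one way to handle this; the names write_all and copy_checked are hypothetical, and the 4096-byte buffer is an arbitrary choice.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write until all n bytes are written.
 * Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t n)
{
    while (n > 0) {
        ssize_t w = write(fd, buf, n);
        if (w == -1) {
            if (errno == EINTR)      /* interrupted by a signal: retry */
                continue;
            return -1;
        }
        buf += w;                    /* skip the part already written */
        n -= (size_t)w;
    }
    return 0;
}

/* Copy everything from fd1 to fd2.
 * Returns 0 on success, -1 if a read or write fails. */
int copy_checked(int fd1, int fd2)
{
    char buf[4096];
    ssize_t nread;

    while ((nread = read(fd1, buf, sizeof buf)) > 0)
        if (write_all(fd2, buf, (size_t)nread) == -1)
            return -1;
    return (nread == 0) ? 0 : -1;    /* 0 means end of file was reached */
}
```

Short writes are rare for regular files but common for pipes, terminals and sockets, so the loop in write_all matters most when fd2 is not a disk file.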
5. I/O Efficiency
Since disk files (i.e., regular and directory files) are stored block by block on the disk, and disk I/O is performed block by block, the size of the temporary buffer used in read and write operations can affect I/O performance significantly. In order to reduce the number of actual disk I/O operations, and hence increase I/O performance for large files, the buffer size in our programs should be equal to, or a multiple of, the block size.
The following example illustrates the performance variation for different buffer sizes. The program tries to copy a file of size 1,468,802 bytes. The block size of the disk is 8192 bytes (Stevens P.56).
Buffer size (bytes)   USER CPU (seconds)   SYS CPU (seconds)   Clock Time (seconds)   Number of loops
1                     23.8                 397.9               423.4                  1468802
64                    0.3                  6.6                 7.0                    22950
1024                  0.0                  0.6                 0.6                    1435
8192                  0.0                  0.3                 0.3                    180
32768                 0.0                  0.3                 0.3                    45
131072                0.0                  0.3                 0.3                    12
6. Change Offset – lseek
When a file is opened, its offset is set to zero (0), i.e., it points to the beginning of the file. Each successive read and write would advance the offset by the number of bytes read or written. The offset can be changed to any value (>=0) by the lseek system call.
#include <sys/types.h>
#include <unistd.h>
off_t lseek (int filedes, off_t offset, int WHENCE);
Returns: the new file offset if OK, or -1 on error
WHENCE:
    SEEK_SET: new_offset = 0 + offset
    SEEK_CUR: new_offset = current_offset + offset
    SEEK_END: new_offset = file_size + offset
(a) Argument offset can be non-negative as well as negative
(b) The new offset should normally be non-negative (at least for regular files)
(c) lseek does not cause any file I/O
Example Program: test lseek
    /* file name: test_lseek.c */
    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #define MODE 0666 /* access permission */
    int main(void)
    {
        char buf1[] = "abcdefghijklmnopqrstuvwxyz";
        char buf2[] = "**************************";
        off_t newpos;
        ssize_t n;
        int fd;

        if ((fd = open("testfile", O_RDWR | O_CREAT | O_TRUNC, MODE)) == -1) {
            printf("Error: unable to open or create file\n");
            exit(1);
        }
        if (write(fd, buf1, 26) != 26) {
            printf("Error: cannot write 26 bytes\n");
            exit(1);
        }
        newpos = lseek(fd, 10, SEEK_SET);    // newpos=10
        printf("Current offset = %ld\n", (long)newpos);
        newpos = lseek(fd, -10, SEEK_END);   // newpos=26-10=16
        printf("Current offset = %ld\n", (long)newpos);
        newpos = lseek(fd, -10, SEEK_CUR);   // newpos=16-10=6
        printf("Current offset = %ld\n", (long)newpos);
        n = read(fd, buf2, 10);              // read in "ghijklmnop"
        if (n == -1) {
            printf("Error: cannot read\n");
            exit(1);
        }
        printf("buf2 = %s\n", buf2);         // the rest of buf2 is still '*'
        buf2[n] = '\0';                      // terminate after the bytes read
        printf("buf2 = %s\n", buf2);
        close(fd);
        exit(0);
    }
7. The dup and dup2 system calls for Standard Input/Output/Error Redirection
The normal output of a command such as ls usually goes to the terminal output (the screen or the terminal window). We can redirect this output to a file from the shell prompt:
% ls > foo
In the above example, the normal output from the ls process would not go to the terminal output device. Instead, it is written to file foo.
Since no one has changed the program ls in any way, we must assume that the process ls knows nothing about the redirection. How then is this redirection achieved without the knowledge of the process?
Earlier, we mentioned that when a process is created, three “files” are opened automatically, with the following file descriptors:
Descriptor   Constant symbol   Common name
0            STDIN_FILENO      standard input
1            STDOUT_FILENO     standard output
2            STDERR_FILENO     standard error
These three file descriptors are initially associated with the controlling terminal of the process:
• the standard input with the terminal input device (e.g., keyboard)
• the standard output with the terminal output device (e.g., the monitor screen or the terminal window)
• the standard error with the terminal output device
Most Unix programs are written as “filters”. A filter is a program that takes its input from the standard input and then sends its normal output to the standard output and its error messages to the standard error.
Function printf sends a string to the standard output. This is why statements such as
    printf("hi, there");
would usually send their arguments to the monitor screen or the terminal window, because the standard output is linked to the terminal output device. Similarly, statements such as
    scanf("%d", &n);
would usually read their inputs from the keyboard.
However, we can change the content of entry 0, 1, or 2 in the Per Process Table of Open Files so that it points to the File Table entry of another file. This would cause the standard input, the standard output or the standard error to be directed away from the terminal device to a regular file.
To achieve this, the dup or dup2 system call is used to duplicate the entry of a given file descriptor into another entry of the Per Process Table of Open Files.
#include <unistd.h>
int dup (int filedes);
int dup2 (int filedes, int filedes2);
Both return: the new file descriptor if OK, or -1 on error
(a) With dup, the new file descriptor is guaranteed to be the lowest numbered file descriptor that was available in the Per Process Table of Open Files.
(b) With dup2, the new file descriptor value is filedes2. If filedes2 is already open, it is closed first.
(c) The duplicated file descriptor shares the same entry in the File Table as the original file descriptor.
    fd = open ("FOO", O_WRONLY|O_CREAT, 0766);
    close (STDOUT_FILENO);
    dup (fd);
[Figure: the Per Process Table of Open Files before the dup call and after the dup call]
The above close and dup calls can be replaced by a single dup2 call:
    fd = open ("FOO", O_WRONLY|O_CREAT, 0766);
    dup2 (fd, STDOUT_FILENO);
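A program that redirects its own standard output usually wants to get the terminal back afterwards. The sketch below shows one way to do this, by saving a duplicate of descriptor 1 before the dup2 call and restoring it later. The helper name and the access mode 0644 are assumptions for illustration, and write is used instead of printf so that the standard I/O library's buffering does not interfere with the demonstration.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write `len` bytes of `msg` into file `path` by
 * temporarily redirecting the standard output, then restore it.
 * Returns 0 on success, -1 on error. */
int write_via_stdout(const char *path, const char *msg, size_t len)
{
    int fd, saved, rc = -1;

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;
    saved = dup(STDOUT_FILENO);              /* remember the old target */
    if (saved != -1) {
        if (dup2(fd, STDOUT_FILENO) != -1) { /* descriptor 1 -> file */
            if (write(STDOUT_FILENO, msg, len) == (ssize_t)len)
                rc = 0;
            dup2(saved, STDOUT_FILENO);      /* descriptor 1 -> old target */
        }
        close(saved);
    }
    close(fd);
    return rc;
}
```

After the call, output written to descriptor 1 goes to wherever it went before, while the message has ended up in the named file.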
How does a shell achieve standard I/O redirection in the command below?
% ls > FOO
Step 1: The shell opens or creates the file FOO and obtains the file descriptor fd allocated to the file.
Step 2: The shell creates a copy of itself using the fork system call. The newly created child process inherits all file descriptors from the parent process, including fd. In the diagram below, the shell is C shell (csh). Other shells achieve this in the same way.
Step 3: In the child process P2, it closes file descriptor 1 and duplicates fd on 1.
Step 4: Child process P2 loads the program ls using one of the “exec” functions, and then passes the control to the ls program. The ls program would run and send its output to the standard output, which is now the file FOO!
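The sequence described above (open the file, fork, duplicate the descriptor onto 1 in the child, then exec) can be sketched in a few lines of C. This is only an illustration of the idea, not the code of any real shell; the function name run_redirected, the access mode 0644 and the exit code 127 for a failed exec are assumptions made here.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program argv[0] with its standard output
 * redirected to `outfile`. Returns the exit status of the program, or -1
 * if the redirection could not be set up. */
int run_redirected(char *const argv[], const char *outfile)
{
    int fd, status;
    pid_t pid;

    fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    pid = fork();                        /* child inherits fd */
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {                      /* child process */
        if (fd != STDOUT_FILENO) {
            if (dup2(fd, STDOUT_FILENO) == -1)   /* 1 now refers to outfile */
                _exit(127);
            close(fd);                   /* the extra descriptor is not needed */
        }
        execvp(argv[0], argv);           /* program knows nothing about this */
        _exit(127);                      /* reached only if exec failed */
    }

    close(fd);                           /* parent: keeps its own stdout */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call such as run_redirected with the argument vector {"ls", NULL} and the file name "FOO" would then behave roughly like the shell command shown earlier. The parent's own descriptor 1 is untouched, because the dup2 call happens only in the child.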
8. The fcntl Function
This function is used to change the properties of a file that is already open.
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>
int fcntl (int filedes, int cmd, ... /* int arg */ );
Returns: depends on cmd if OK, -1 on Error
Depending on cmd, fcntl can be used to perform various operations including the following:
(1) duplicate existing File Descriptor (cmd = F_DUPFD)
newfd = fcntl (fd, F_DUPFD, 3);
newfd is guaranteed to be the lowest numbered file descriptor available that is greater than or equal to 3, the third argument.
Note: both descriptors, fd and newfd, point to the same entry in the File Table.
(2) Get or set File Descriptor Flags (cmd = F_GETFD, F_SETFD)
Example: check the file descriptor flag
    fd_flag = fcntl (fd, F_GETFD, 0);
    if (fd_flag & FD_CLOEXEC)
        printf("fd flag: close-on-exec \n");
Example: set close-on-exec flag:
fcntl (fd, F_SETFD, FD_CLOEXEC );
Example: clear file descriptor flag:
fcntl (fd, F_SETFD, 0 );
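Since F_SETFD replaces the whole set of file descriptor flags, a more defensive pattern is to read the current flags first and then add FD_CLOEXEC to them. The following is a minimal sketch of that pattern; the helper name set_cloexec is hypothetical.

```c
#include <fcntl.h>

/* Hypothetical helper: turn on the close-on-exec flag of `fd` while
 * keeping any other file descriptor flags that are already set.
 * Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);      /* read the current flags */

    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1)
        return -1;
    return 0;
}
```

With only FD_CLOEXEC defined by POSIX.1 the difference is small in practice, but the read-modify-write form stays correct if a system defines further descriptor flags.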
(3) Get or set File Status Flags (cmd = F_GETFL, F_SETFL)
Example: check file status flag (Stevens P.65):
    int accmode, val;
    val = fcntl (fd, F_GETFL, 0);
    accmode = val & O_ACCMODE;
    if (accmode == O_RDONLY)
        printf("read only");
    else if (accmode == O_WRONLY)
        printf("write only");
    else if (accmode == O_RDWR)
        printf("Read and Write");
    else
        printf("error in access mode");
    if (val & O_APPEND)
        printf (", Append");
    if (val & O_NONBLOCK)
        printf (", NON-Blocking");
Example: Set non-blocking mode:
val = fcntl (fd, F_GETFL, 0);
val = val | O_NONBLOCK;
fcntl (fd, F_SETFL, val);
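The effect of the non-blocking flag is easiest to see on a pipe that contains no data. The sketch below wraps the three calls above in a function with error checking and then tries to read from an empty pipe; both function names are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: add O_NONBLOCK to the file status flags of `fd`.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int val = fcntl(fd, F_GETFL, 0);

    if (val == -1)
        return -1;
    return (fcntl(fd, F_SETFL, val | O_NONBLOCK) == -1) ? -1 : 0;
}

/* Hypothetical demo: returns 1 if reading from an empty non-blocking pipe
 * fails immediately with EAGAIN (or EWOULDBLOCK) instead of waiting,
 * 0 if it behaves differently, and -1 if the pipe cannot be created. */
int empty_pipe_read_would_block(void)
{
    int p[2], result = 0;
    char c;

    if (pipe(p) == -1)
        return -1;
    if (set_nonblocking(p[0]) == 0 &&
        read(p[0], &c, 1) == -1 &&
        (errno == EAGAIN || errno == EWOULDBLOCK))
        result = 1;
    close(p[0]);
    close(p[1]);
    return result;
}
```

Without the flag, the same read would put the process to sleep until another process wrote to the pipe or closed its write end.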
i-nodes and directory files
On the disk, each file has one i-node, which contains all information about that file except:
• the i-node number
• the file name
“i-node” stands for “information node”. An i-node contains the following information about a file:
• file type
• access permissions
• owner and group owner
• size of the file
• number of links
• device number of the device where the file is stored
• time of last access
• time of last modification
• time of last change to the i-node
• block numbers of the blocks in which the file is stored
Inside the Unix kernel, a file is identified by its i-node number, not by its file name.
A Unix directory file contains the names of the files it “contains” and the i-node numbers of these files. It provides a mapping between the file name and its i-node number:
Example: In the following diagram, the directory “assign1” contains three files: q1, q2, q3 with the corresponding i-node numbers 200, 280, and 90. The contents of the directory file would be something like:
    100   .
    140   ..
    200   q1
    280   q2
    90    q3
Note: each directory contains the i-node number of itself (the dot “.”) as well as the i-node number of its parent directory (dot-dot “..”), even for “empty” directories.
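The name-to-i-node mapping stored in a directory can be inspected from a program with the opendir, readdir and closedir library functions. The sketch below is an illustration only: the helper name lookup_inode is hypothetical, and it relies on the d_ino field of struct dirent, which is available on common Unix systems.

```c
#include <dirent.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: search directory `dirpath` for an entry called
 * `name` and return the i-node number recorded for it in the directory.
 * Returns 0 if the directory cannot be opened or has no such entry. */
ino_t lookup_inode(const char *dirpath, const char *name)
{
    DIR *dp = opendir(dirpath);
    struct dirent *entry;
    ino_t ino = 0;

    if (dp == NULL)
        return 0;
    while ((entry = readdir(dp)) != NULL) {      /* one entry per call */
        if (strcmp(entry->d_name, name) == 0) {
            ino = entry->d_ino;                  /* i-node number of name */
            break;
        }
    }
    closedir(dp);
    return ino;
}
```

Looking up "." and ".." in any directory with this helper shows the two entries mentioned in the note above, and the command ls -i prints the same mapping from the shell.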