Andrew File System
NJIT
Vishal Patel
Agenda
What is AFS?
History of AFS
Basics of AFS
Benefits of AFS
Drawbacks of AFS
Versions of AFS
References
What is AFS?
AFS is a distributed file system that enables cooperating hosts (clients and servers) to efficiently share file system resources across both local area and wide area networks
provides transparent file access between systems running AFS
Similar to Sun Microsystems’ Network File System (NFS)
Software available for most UNIX platforms
AFS runs on systems from HP, NeXT, DEC, IBM, Sun, and SGI.
History of AFS
AFS is based on a distributed file system originally developed at the Information Technology Center at Carnegie Mellon University in 1984.
The idea was to provide a campus-wide file system for home directories which would run effectively using a limited bandwidth campus backbone network.
Figure 8.11 Processes in Andrew File System
Basics of AFS
Cells
Volumes
Tokens
Cache Manager
File Protection
File Space Design
Basics of AFS (Cont’d)
File Operation
File Sharing
Login and Authentication
Venus
Vice
Implementation
AFS Commands
Cells
An AFS cell is a collection of servers grouped together administratively and presenting a single, cohesive file system.
Typically, an AFS cell is a set of hosts that use the same Internet domain name.
Normally, a variation of the domain name is used as the cell name. Users log into AFS client workstations, which request information and files from the cell’s servers on behalf of the users.
Volumes
The storage disks in a computer are divided into sections called partitions. AFS further divides partitions into units called volumes.
The volumes provide a convenient container for storing related files and directories.
A system administrator can move volumes from one file server to another without users noticing, because AFS automatically tracks a volume’s location
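For example, an administrator might relocate a volume with the vos command; a minimal sketch, where the volume, server, and partition names are illustrative:

vos move user.pat fs1.example.edu /vicepa fs2.example.edu /vicepb    # move the volume between servers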
Tokens
AFS does not use UNIX user IDs for authentication. To access AFS files that are not world-accessible, you must hold a valid AFS token. You can see which tokens you currently hold with the tokens command.
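For example (output abbreviated; the AFS ID and cell name are illustrative):

$ tokens
Tokens held by the Cache Manager:
User's (AFS ID 1234) tokens for afs@example.edu [Expires Jul 21 10:17]
   --End of list--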
Cache Manager
The Cache Manager maintains information about the identities of the users logged into the machine, finds and requests data on their behalf, and keeps chunks of retrieved files on local disk.
The effect is that as soon as a remote file is accessed, a chunk of it is copied to local disk, so subsequent accesses (warm reads) are almost as fast as local disk access and considerably faster than a cold read across the network.
File Protection
File protections do not work the same way in AFS as they do in UNIX.
AFS augments the standard UNIX file protection mechanism, using a more precise mechanism for controlling access to files: an access control list (ACL).
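For example, ACLs are listed and set with the fs command suite; the directory and user names here are illustrative:

fs listacl ~/private             # show who can access the directory
fs setacl ~/private pat rl       # grant user pat read and lookup rights
fs setacl ~/private pat none     # remove pat's entry from the ACL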
File Space Design
Hierarchical file structure like the UNIX file system
AFS root is generally named /afs and the next level is called a cell (see the example paths below)
– administrative domain: a defined set of AFS servers within a company, university, lab, etc.
– local cell: the default cell associated with your workstation
– foreign cell: other cells in the AFS file space
Subsequent levels are UNIX files
Some facilities use AFS for users’ login directory
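For example, the levels described above combine into ordinary pathnames; the cell and user names here are illustrative:

/afs/example.edu/usr/pat/notes.txt    # a file in the local cell
/afs/other.edu/pub/README             # a file in a foreign cell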
Venus and Vice
AFS client computers run a process called Venus that accesses the AFS system.
AFS server computers run a process called Vice, which acts as a front end to the replicated data (a hybrid of front end and replica manager).
File Operation
Andrew caches entire files from the servers. A client workstation interacts with Vice servers only during the opening and closing of files
Venus – caches files from Vice when they are opened, and stores modified copies of files back when they are closed
Reading and writing bytes of a file are done on the cached copy by the kernel, without Venus intervention
Venus caches contents of directories and symbolic links, for path-name translation
Exceptions to the caching policy are modifications to directories that are made directly on the server responsible for that directory
File Sharing
AFS enables users to share remote files as easily as local files. To access a file on a remote machine in AFS, you simply specify the file’s pathname. In contrast, to access a file in a remote machine’s UNIX file system, you must log into the remote machine or create a mount point on the local machine that points to a directory in the remote machine’s UNIX file system
AFS users can see and share all the files under the /afs root directory, given the appropriate privileges. An AFS user who has the necessary privileges can access a file in any AFS cell, simply by specifying the file’s pathname. File sharing in AFS is not restricted by geographical distances or operating system differences
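For example, given the right permissions, reading a file in another cell takes nothing more than its pathname; the cell and path here are illustrative:

cat /afs/other.edu/project/notes.txt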
Figure 8.12 File name space seen by clients of AFS
Login and Authentication
To become an authenticated AFS user, you need to provide a password to AFS.
On machines that use an AFS-modified login utility, logging in is a one-step process; your initial login automatically authenticates you with AFS.
On machines that do not use an AFS-modified login utility, you must perform two steps.
Log in to your local machine.
Issue the klog command with the -setpag argument to authenticate with AFS and get your token (see the example below).
Your system administrator can tell you whether your machine uses an AFS-modified login utility or not
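For example, on a machine without an AFS-modified login utility, the two-step sequence might look like this (the username is illustrative):

login pat       # step 1: log in to the local machine
klog -setpag    # step 2: authenticate with AFS and get a token
tokens          # confirm that the token was granted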
Implementation
Client processes are interfaced to a UNIX kernel with the usual set of system calls
Venus carries out path-name translation component by component
The UNIX file system is used as a low-level storage system for both servers and clients. The client cache is a local directory on the workstation’s disk
Both Venus and server processes access UNIX files directly by their inodes to avoid the expensive path name-to-inode translation routine
Venus manages two separate caches: one for status and one for data
An LRU algorithm is used to keep each of them bounded in size
Figure 8.13 System call interception in AFS
Figure 8.14 AFS File System Calls
AFS Commands
AFS commands are grouped into three categories; a few examples follow the list:
File server commands (fs)
– list AFS server information
– set and list ACLs (access control lists)
Protection commands (pts)
– create and manage ACL groups
Authentication commands
– klog, unlog, kpasswd, tokens
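A few representative invocations (the directory, user, and group names are illustrative):

fs listacl /afs/example.edu/usr/pat    # fs: show the ACL on a directory
pts creategroup pat:friends            # pts: create a protection group
pts adduser chris pat:friends          # pts: add a user to the group
kpasswd                                # change your AFS password
unlog                                  # discard the tokens you hold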
Benefits of AFS
Caching facility:
Caching significantly reduces the amount of network traffic; after the initial cold read, subsequent accesses are served from the local disk cache, improving performance
Location Independence:
AFS does its mapping (file name to location) at the server. This has the tremendous advantage of making the served file space location-independent
Benefits of AFS (Cont’d)
Scalability:
An architectural goal of the AFS designers was a client-to-server ratio of 200:1, which has been successfully exceeded at some sites.
Single systems image (SSI):
Establishing the same view of the file store from each client and server in a network of systems (the machines that comprise an AFS cell) is an order of magnitude simpler with AFS than it is with, say, NFS.
Benefits of AFS (Cont’d)
Improved security:
First, AFS uses Kerberos to authenticate users, which improves security.
Second, AFS uses access control lists (ACLs) to enable users to restrict access to their own directories.
Benefits of AFS (Cont’d)
“Easy to use” networking
Accessing remote file resources via the network becomes much simpler when using AFS
Improved system management capability
Systems administrators are able to make configuration changes from any client in the AFS cell
Improved robustness to server crash
Replicated AFS volumes
Drawbacks of AFS
Invasive install
Complexity of backend server function
Authentication issues with applications (e.g. ticket expiration)
Some useful commands on AFS
sar 2 10 (check CPU idle time)
top (show which processes are consuming the most CPU)
/usr/bin/lsof (list open ports and open files that have not been closed)
netstat -an | grep <port> (check whether the port you plan to use for a server/client program is already in use)
Bibliography
George Coulouris, Jean Dollimore and Tim Kindberg, Distributed Systems: Concepts and Design, Fourth Edition, Addison-Wesley, 2005
Figures from the Coulouris text are from the instructor’s guide and are copyrighted by Pearson Education 2005
References
AFS Tutorial: http://www.alw.nih.gov/Documentation/AFS_tutorial.html
AFS FAQ: http://www.angelfire.com/hi/plutonic/afs-faq.html#sub1.04
AFS documentation:
http://www.transarc.ibm.com/Library/documentation/afs_doc.html
http://www2.cs.cmu.edu/afs/andrew.cmu.edu/usr/shadow/www/afs.html
Figure 8.14 AFS File System Calls (the original figure tabulates the actions of the user process, UNIX kernel, Venus, network, and Vice for each call):

open(FileName, mode)
  UNIX kernel: if FileName refers to a file in shared file space, pass the request to Venus.
  Venus: check the list of files in the local cache. If the file is not present or there is no valid callback promise, send a request for the file to the Vice server that is the custodian of the volume containing the file.
  Vice: transfer a copy of the file and a callback promise to the workstation; log the callback promise.
  Venus: place the copy of the file in the local file system, enter its local name in the local cache list, and return the local name to UNIX.
  UNIX kernel: open the local file and return the file descriptor to the application.

read(FileDescriptor, Buffer, length)
  UNIX kernel: perform a normal UNIX read operation on the local copy.

write(FileDescriptor, Buffer, length)
  UNIX kernel: perform a normal UNIX write operation on the local copy.

close(FileDescriptor)
  UNIX kernel: close the local copy and notify Venus that the file has been closed.
  Venus: if the local copy has been changed, send a copy to the Vice server that is the custodian of the file.
  Vice: replace the file contents and send a callback to all other clients holding callback promises on the file.