Distributed File Systems

(DFS)

Updated by Rajkumar Buyya
* Introduction

* File service architecture

* Sun Network File System (NFS)

* Andrew File System (personal study)

* Recent advances

* Summary

Most concepts are
drawn from Chapter 12


Learning objectives

 Understand the requirements that affect the design of distributed file services

 NFS: understand how a relatively simple, widely used service is designed
– Obtain knowledge of file systems, both local and networked
– Caching as an essential design technique
– Remote interfaces are not the same as APIs
– Security requires special consideration

 Recent advances: appreciate the ongoing research that often leads to major advances
(the creation of widely used storage infrastructures such as DropBox).

Introduction

 Why do we need a DFS?
– The primary purpose of a distributed system is connecting users and resources.
– Resources…
 … can be inherently distributed
 … can actually be data (files, databases, …) and…
 … their availability becomes a crucial issue for the performance of a distributed system and its applications.

 A case for DFS (an illustrated scenario)

– Users keep asking for server storage: one wants to store a thesis on the server, another needs a book always available, others need to store their analyses and reports safely. The administrator of Server A decides it is time to buy a rack of servers.

– With Servers A, B and C in place, users can store many more documents, but now nobody remembers which server holds which files (was it Server A, or B, or C?). Perhaps what is really needed is a DFS.

– With a Distributed File System in place, users can access their folders from anywhere and no longer have to remember which server the data is stored on: the service is reliable, fault tolerant, highly available and location transparent.

Storage systems and their properties

 In the first generation of distributed systems (1974–95), file systems (e.g. NFS) were the only networked storage systems.

the only networked storage systems.

 With the advent of distributed object

systems (CORBA, Java) and the web,

the picture has become more complex.

 Current focus is on large scale, scalable

storage.
– Google File System (GFS)

– Amazon S3 (Simple Storage Service)

– Cloud Storage (e.g., DropBox,

Google Drive, Microsoft OneDrive)

Timeline: 1974–1995 (networked file systems such as NFS); 1995–2010 (distributed object systems and the web); 2010–now (large-scale, scalable cloud storage).

Storage systems and their properties

Type of system               | Sharing | Persistence | Distributed cache/replicas | Consistency maintenance | Example
Main memory                  |    X    |      X      |             X              |            1            | RAM
File system                  |    X    |      √      |             X              |            1            | UNIX file system
Distributed file system      |    √    |      √      |             √              |            √            | Sun NFS
Web                          |    √    |      √      |             √              |            X            | Web server
Distributed shared memory    |    √    |      X      |             √              |            √            | Ivy (Ch. 16)
Remote objects (RMI/ORB)     |    √    |      X      |             X              |            1            | CORBA
Persistent object store      |    √    |      √      |             X              |            1            | CORBA Persistent Object Service
Peer-to-peer storage store   |    √    |      √      |             √              |            2            | OceanStore

Types of consistency between copies:
1 – strict one-copy consistency
√ – approximate/slightly weaker guarantees
X – no automatic consistency
2 – considerably weaker guarantees

What is a file system? 1

 Persistent stored data sets

 Hierarchic name space visible to all processes

 API with the following characteristics:
– access and update operations on persistently stored data sets

– Sequential access model (with additional random facilities)

 Sharing of data between users, with access control

 Concurrent access:
– certainly for read-only access

– what about updates?

 Other features:
– mountable file stores

– more? …


What is a file system? 2

UNIX file system operations:

filedes = open(name, mode)        Opens an existing file with the given name.
filedes = creat(name, mode)       Creates a new file with the given name.
                                  Both operations deliver a file descriptor referencing the open
                                  file. The mode is read, write or both.

status = close(filedes)           Closes the open file filedes.

count = read(filedes, buffer, n)    Transfers n bytes from the file referenced by filedes to buffer.
count = write(filedes, buffer, n)   Transfers n bytes to the file referenced by filedes from buffer.
                                  Both operations deliver the number of bytes actually transferred
                                  and advance the read-write pointer.

pos = lseek(filedes, offset, whence)   Moves the read-write pointer to offset (relative or absolute,
                                  depending on whence).

status = unlink(name)             Removes the file name from the directory structure. If the file
                                  has no other names, it is deleted.

status = link(name1, name2)       Adds a new name (name2) for a file (name1).

status = stat(name, buffer)       Gets the file attributes for file name into buffer.

Class Exercise A

Write a simple C program to copy a file using the UNIX file system operations:

copyfile(char *oldfile, char *newfile)
{

}

Note: remember that read() returns 0 when you attempt to read beyond the end of the file.

A code in C – Copy File program

Write a simple C program to copy a file using the UNIX file system operations.

#include <stdio.h>
#include <fcntl.h>       /* open(), creat() */
#include <unistd.h>      /* read(), write(), close() */

#define BUFSIZE  1024
#define READ     0        /* open() mode: read-only */
#define FILEMODE 0644     /* access permissions for the new file */

void copyfile(char *oldfile, char *newfile)
{
    char buf[BUFSIZE];
    int n = 1, fdold, fdnew;

    if ((fdold = open(oldfile, READ)) >= 0) {
        fdnew = creat(newfile, FILEMODE);
        while (n > 0) {
            n = read(fdold, buf, BUFSIZE);       /* returns 0 at end of file */
            if (write(fdnew, buf, n) < 0)
                break;
        }
        close(fdold);
        close(fdnew);
    } else {
        printf("Copyfile: couldn't open file: %s\n", oldfile);
    }
}

int main(int argc, char **argv)
{
    copyfile(argv[1], argv[2]);
    return 0;
}

What is a file system? (a typical module structure for the implementation of a non-DFS)

File system modules:

Directory module:       relates file names to file IDs
File module:            relates file IDs to particular files
Access control module:  checks permission for the operation requested
File access module:     reads or writes file data or attributes
Block module:           accesses and allocates disk blocks
Device module:          disk I/O and buffering

What is a file system? 4

File attribute record structure:

updated by system:  File length, Creation timestamp, Read timestamp,
                    Write timestamp, Attribute timestamp, Reference count
updated by owner:   Owner, File type, Access control list
                    (e.g. for UNIX: rw-rw-r--)

Distributed File system/service requirements

 Transparency
 Concurrency
 Replication
 Heterogeneity
 Fault tolerance
 Consistency
 Security
 Efficiency

Transparency
– Access: same operations (client programs are unaware of the distribution of files)
– Location: same name space after relocation of files or processes (client programs should see a uniform file name space)
– Mobility: automatic relocation of files is possible (neither client programs nor system admin tables in client nodes need to be changed when files are moved)
– Performance: satisfactory performance across a specified range of system loads
– Scaling: service can be expanded to meet additional loads or growth

Concurrency properties
– Changes to a file by one client should not interfere with the operation of other clients simultaneously accessing or changing the same file.
– Isolation; file-level or record-level locking; other forms of concurrency control to minimise contention.

Replication properties
– File service maintains multiple identical copies of files:
• load-sharing between servers makes the service more scalable
• local access has better response (lower latency)
• fault tolerance
– Full replication is difficult to implement. Caching (of all or part of a file) gives most of the benefits (except fault tolerance).

Heterogeneity properties
– Service can be accessed by clients running on (almost) any OS or hardware platform.
– Design must be compatible with the file systems of different OSes.
– Service interfaces must be open - precise specifications of APIs are published.

Fault tolerance
– Service must continue to operate even when clients make errors or crash.
– Service must resume after a server machine crashes.
– If the service is replicated, it can continue to operate even during a server crash.

Consistency
– Unix offers one-copy update semantics for operations on local files - caching is completely transparent.
– Difficult to achieve the same for distributed file systems while maintaining good performance and scalability.

Security
– Must maintain access control and privacy as for local files:
• based on the identity of the user making the request
• identities of remote users must be authenticated
• privacy requires secure communication
– Service interfaces are open to all processes not excluded by a firewall:
• vulnerable to impersonation and other attacks

Efficiency
– Goal for distributed file systems is usually performance comparable to a local file system.
* File service is the most heavily loaded service in an intranet, so its functionality and performance are critical.

File Service Architecture

 An architecture that offers a clear separation of the main concerns in providing access to files is obtained by structuring the file service as three components:
– A flat file service
– A directory service
– A client module.
 The relevant modules and their relationships are shown next.
 The client module implements the interfaces exported by the flat file and directory services on the server side.

Model file service architecture

(Figure: application programs on the client computer call the client module; the client module invokes the flat file service and the directory service on the server computer. Directory service operations: Lookup, AddName, UnName, GetNames. Flat file service operations: Read, Write, Create, Delete, GetAttributes, SetAttributes.)

Responsibilities of various modules

 Flat file service:
– Concerned with the implementation of operations on the contents of files. Unique File Identifiers (UFIDs) are used to refer to files in all requests for flat file service operations. UFIDs are long sequences of bits chosen so that each file has a UFID that is unique among all of the files in a distributed system.

 Directory service:
– Provides a mapping between text names for files and their UFIDs. Clients may obtain the UFID of a file by quoting its text name to the directory service. The directory service supports the functions needed to generate directories and to add new files to directories.

 Client module:
– Runs on each client computer and provides an integrated service (flat file and directory) as a single API to application programs. For example, on UNIX hosts, a client module emulates the full set of UNIX file operations.
– Holds information about the network locations of the flat file and directory server processes, and achieves better performance through a cache of recently used file blocks at the client.

FileId: a unique identifier for files anywhere in the network, similar to the remote object references described in Section 4.3.3.

Server operations/interfaces for the model file service

Flat file service

Read(FileId, i, n) -> Data

Write(FileId, i, Data)

Create() -> FileId

Delete(FileId)

GetAttributes(FileId) -> Attr

SetAttributes(FileId, Attr)

Directory service

Lookup(Dir, Name) -> FileId

AddName(Dir, Name, File)

UnName(Dir, Name)

GetNames(Dir, Pattern) -> NameSeq
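As a sketch only (the types and return conventions here are simplifying assumptions, not definitions from the textbook), the two service interfaces above could be declared in C roughly as follows:

/* Placeholder types for the sketch. */
typedef unsigned long FileId;                         /* UFID */
typedef struct { long length; long owner; /* ... */ } Attr;

/* Flat file service: i is the position of the first byte, n the number of bytes. */
int    Read (FileId file, long i, long n, char *data);        /* -> Data   */
int    Write(FileId file, long i, long n, const char *data);
FileId Create(void);                                           /* -> FileId */
int    Delete(FileId file);
int    GetAttributes(FileId file, Attr *attr);                 /* -> Attr   */
int    SetAttributes(FileId file, const Attr *attr);

/* Directory service: a directory is itself a file, identified by its FileId. */
FileId Lookup (FileId dir, const char *name);                  /* -> FileId  */
int    AddName(FileId dir, const char *name, FileId file);
int    UnName (FileId dir, const char *name);
int    GetNames(FileId dir, const char *pattern, char **names); /* -> NameSeq */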

Pathname lookup

Pathnames such as '/usr/bin/tar' are resolved by iterative calls to Lookup(), one call for each component of the path, starting with the ID of the root directory '/', which is known in every client.

(In Read and Write, the argument i gives the position of the first byte to be transferred; the FileId returned by Create identifies the newly created file.)
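Continuing the sketch (reusing the hypothetical FileId type and Lookup() declaration from above, and assuming the client knows ROOT_DIR_ID, the UFID of '/'), iterative pathname resolution might look like:

#include <string.h>

extern FileId ROOT_DIR_ID;          /* UFID of '/', known in every client */

/* Resolve a pathname such as "/usr/bin/tar" with one Lookup() per component. */
FileId resolve(const char *pathname)
{
    FileId dir = ROOT_DIR_ID;
    char component[256];

    while (*pathname == '/') pathname++;              /* skip '/' separators   */
    while (*pathname != '\0') {
        size_t len = strcspn(pathname, "/");          /* next component length */
        if (len >= sizeof component)
            return 0;                                 /* too long for the sketch */
        memcpy(component, pathname, len);
        component[len] = '\0';
        dir = Lookup(dir, component);                 /* "usr", "bin", "tar"   */
        pathname += len;
        while (*pathname == '/') pathname++;
    }
    return dir;                                       /* UFID of the target    */
}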

File Group

A collection of files that can be
located on any server or moved

between servers while

maintaining the same names.

– Similar to a UNIX filesystem

– Helps with distributing the load of file

serving between several servers.

– File groups have identifiers which are

unique throughout the system (and

hence for an open system, they must

be globally unique).

 Identifiers are used to refer to file groups and files.

 To construct a globally unique ID we use some unique attribute of the machine on which it is created (e.g. its IP address), even though the file group may move subsequently.

File Group ID (48 bits):

  | IP address (32 bits) | date (16 bits) |
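As an illustrative sketch only (the layout follows the 32-bit + 16-bit split shown above; the function and type names are hypothetical), a file group identifier could be packed like this:

#include <stdint.h>

/* Hypothetical sketch: pack a 32-bit IPv4 address and a 16-bit date field
 * into a 48-bit file group identifier (held in the low 48 bits of a uint64_t). */
typedef uint64_t FileGroupId;

FileGroupId make_file_group_id(uint32_t ip_address, uint16_t date)
{
    /* upper 32 bits: IP address of the creating machine;
     * lower 16 bits: date of creation */
    return ((uint64_t)ip_address << 16) | date;
}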

DFS: Case Studies

 NFS (Network File System)
– Developed by Sun Microsystems (in 1985)

– Most popular, open, and widely used.

– NFS protocol standardised through IETF (RFC 1813)

 AFS (Andrew File System)
– Developed by Carnegie Mellon University as part of Andrew

distributed computing environments (in 1986)

– A research project to create a campus-wide file system.

– Public domain implementation is available on Linux (LinuxAFS)

– It was adopted as a basis for the DCE/DFS file system in the Open Software Foundation's (OSF, www.opengroup.org) DCE (Distributed Computing Environment).


Case Study: Sun NFS

 An industry standard for file sharing on local networks since the 1980s

 An open standard with clear and simple interfaces

 Closely follows the abstract file service model defined above

 Supports many of the design requirements already mentioned:

– transparency

– heterogeneity

– efficiency

– fault tolerance

 Limited achievement of:

– concurrency

– replication

– consistency

– security

NFS – History

 1985: Original Version (in-house use)

 1989: NFSv2 (RFC 1094)
– Operated entirely over UDP

– Stateless protocol (the core)

– Support for 2GB files

 1995: NFSv3 (RFC 1813)
– Support for 64 bit (> 2GB files)

– Support for asynchronous writes

– Support for TCP

– Support for additional attributes

– Other improvements

 2000-2003: NFSv4 (RFC 3010, RFC 3530)
– Collaboration with IETF

– Sun hands over the development of NFS

 2010: NFSv4.1
– Adds Parallel NFS (pNFS) for parallel data access

 2015
– RFC 7530 – NFS Version 4 Protocol

– Unlike earlier versions, it supports traditional file access while integrating support for file locking and the MOUNT protocol. It

makes NFS operate well in an Internet environment.

https://tools.ietf.org/html/rfc7530


NFS architecture

(Figure: on the client computer, application programs issue UNIX system calls to the UNIX kernel. A virtual file system (VFS) layer directs operations on local files to the local UNIX file system (or other file systems) and operations on remote files to the NFS client module. The NFS client communicates with the NFS server on the server computer using the NFS protocol (remote operations); the server's requests pass through its own virtual file system layer to the server's UNIX file system. The NFS client module resides in the kernel of each client computer and is shared by all application programs on that client.)

NFS architecture:
does the implementation have to be in the system kernel?

No:
– there are examples of NFS clients and servers that run at application level as libraries or processes (e.g. early Windows and MacOS implementations, current PocketPC, etc.)

But, for a Unix implementation there are advantages:
– Binary code compatible – no need to recompile applications

 Standard system calls that access remote files can be routed through the

NFS client module by the kernel

– Shared cache of recently-used blocks at client

– Kernel-level server can access i-nodes and file blocks directly

 but a privileged (root) application program could do almost the same.

– Security of the encryption key used for authentication.


• read(fh, offset, count) -> attr, data

• write(fh, offset, count, data) -> attr

• create(dirfh, name, attr) -> newfh, attr

• remove(dirfh, name) -> status

• getattr(fh) -> attr

• setattr(fh, attr) -> attr

• lookup(dirfh, name) -> fh, attr

• rename(dirfh, name, todirfh, toname)

• link(newdirfh, newname, dirfh, name)

• readdir(dirfh, cookie, count) -> entries

• symlink(newdirfh, newname, string) -> status

• readlink(fh) -> string

• mkdir(dirfh, name, attr) -> newfh, attr

• rmdir(dirfh, name) -> status

• statfs(fh) -> fsstats

NFS server operations (simplified)

fh = file handle:

  < Filesystem identifier | i-node number | i-node generation >
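For illustration only (in real NFS the file handle is an opaque block of bytes whose internal layout is private to the server; the field widths below are assumptions), the information it encodes could be sketched as:

#include <stdint.h>

/* Illustrative sketch of what an NFS server might encode in a file handle. */
struct nfs_file_handle {
    uint32_t filesystem_id;      /* identifies the exported file system          */
    uint32_t inode_number;       /* i-node number of the file within it          */
    uint32_t inode_generation;   /* generation number, so a handle becomes stale
                                    if the i-node is reused after a delete       */
};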

Model flat file service

Read(FileId, i, n) -> Data

Write(FileId, i, Data)

Create() -> FileId

Delete(FileId)

GetAttributes(FileId) -> Attr

SetAttributes(FileId, Attr)

Model directory service

Lookup(Dir, Name) -> FileId

AddName(Dir, Name, File)

UnName(Dir, Name)

GetNames(Dir, Pattern)

->NameSeq


NFS access control and authentication

 Stateless server, so the user’s identity and access rights must

be checked by the server on each request.
– In the local file system they are checked only on open()

 Every client request is accompanied by the userID and groupID
– which are inserted by the RPC system

 Server is exposed to imposter attacks unless the userID and

groupID are protected by encryption

 Kerberos has been integrated with NFS to provide a stronger

and more comprehensive security solution
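For reference, the Unix-style RPC credential (AUTH_SYS, formerly AUTH_UNIX) that carries the userID and groupID mentioned above contains roughly the following fields, sketched here as a C struct (the exact XDR definition is in RFC 5531):

/* Sketch of the AUTH_SYS credential sent with each NFS request. */
struct authsys_parms {
    unsigned int  stamp;          /* arbitrary ID generated by the caller          */
    char         *machinename;    /* name of the client machine                    */
    unsigned int  uid;            /* caller's effective user ID                    */
    unsigned int  gid;            /* caller's effective group ID                   */
    unsigned int  gids[16];       /* up to 16 supplementary group IDs
                                     (variable-length in the XDR definition)       */
};

Nothing in this credential is encrypted or signed by default, which is why the server is exposed to imposter attacks unless the credential is protected.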

Architecture Components (UNIX / Linux)

 Server:
– nfsd: NFS server daemon that services requests from clients.

– mountd: NFS mount daemon that carries out the mount request

passed on by nfsd.

– rpcbind: RPC port mapper used to locate the nfsd daemon.

– /etc/exports: configuration file that defines which portion of the file

systems are exported through NFS and how.

 Client:
– mount: standard file system mount command.

– /etc/fstab: file system table file.

– nfsiod: (optional) local asynchronous NFS I/O server.
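As a hypothetical example (host names and paths invented for illustration), a server export in /etc/exports and the matching client-side /etc/fstab entry could look like:

# /etc/exports on the server: export /export/people read-write to one client host
/export/people   client1.example.edu(rw,sync)

# /etc/fstab on the client: mount that export at /usr/students at boot time
server1.example.edu:/export/people   /usr/students   nfs   defaults   0 0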


Mount service

 Mount operation:

mount(remotehost, remotedirectory, localdirectory)

 Server maintains a table of clients who have

mounted filesystems at that server

 Each client maintains a table of mounted file

systems holding:

< IP address, port number, file handle>

 Hard versus soft mounts
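For example (hypothetical server names, matching the figure on the next slide), the client could issue:

mount -t nfs server1:/export/people /usr/students      # hard mount (default): retries indefinitely
mount -t nfs -o soft server2:/nfs/users /usr/staff     # soft mount: reports an error after a few retries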


Local and remote file systems accessible on an NFS client

(Figure: the client's name space contains local directories such as /usr and /vmunix, plus two remotely mounted sub-trees. Server 1 exports /export/people, containing the directories big, jon and bob, which is remote-mounted at /usr/students on the client. Server 2 exports /nfs/users, containing the directories jim, ann, jane and joe, which is remote-mounted at /usr/staff on the client.)

Note: The file system mounted at /usr/students in the client is actually the sub-tree located at /export/people in Server 1; the file system mounted at /usr/staff in the client is actually the sub-tree located at /nfs/users in Server 2.

Automounter

NFS client catches attempts to access ’empty’ mount

points and routes them to the Automounter
– the Automounter has a table of mount points and multiple candidate servers for each

– it sends a probe message to each candidate server and then uses the

mount service to mount the filesystem at the first server to respond

 Keeps the mount table small

 Provides a simple form of replication for read-only

filesystems
– E.g. if there are several servers with identical copies of /usr/lib then

each server will have a chance of being mounted at some clients.


Kerberized NFS

 Kerberos protocol is too costly to apply on each file access

request

 Kerberos is used in the mount service:
– to authenticate the user’s identity

– User’s UserID and GroupID are stored at the server with the client’s IP address

 For each file request:
– The UserID and GroupID sent must match those stored at the server

– IP addresses must also match

 This approach has some problems
– can’t accommodate multiple users sharing the same client computer

– all remote filestores must be mounted each time a user logs in


New design approaches

Distribute file data across several servers
– Exploits high-speed networks (InfiniBand, Gigabit Ethernet)

– Layered approach, lowest level is like a ‘distributed virtual disk’

– Achieves scalability even for a single heavily-used file

‘Serverless’ architecture
– Exploits processing and disk resources in all available network nodes

– Service is distributed at the level of individual files

Examples:
xFS : Experimental implementation demonstrated a substantial performance gain

over NFS and AFS

Peer-to-peer systems: Napster, OceanStore (UCB), Farsite (MSR), Publius (AT&T

research) – see web for documentation on these very recent systems

Cloud-based File Systems: DropBox

(Figure: DropBox cloud storage architecture – the Dropbox folders on each of a user's devices are kept automatically synchronized through DropBox's cloud storage.)

Summary

 Distributed file systems provide the illusion of a local file system and hide complexity from end users.

 Sun NFS is an excellent example of a distributed service designed to meet many

important design requirements

 Effective client caching can produce file service performance equal to or better than

local file systems

 Consistency versus update semantics versus fault tolerance remains an issue

 Most client and server failures can be masked

 Superior scalability can be achieved with whole-file serving (Andrew FS) or the

distributed virtual disk approach

 Modern DFSs are cloud-based file systems (e.g., Dropbox, Google Drive, OneDrive)

Advanced Features:

– support for mobile users, disconnected operation, automatic re-integration

– support for data streaming and quality of service (Tiger file system, Content Delivery

Networks)