The Sprite Network Operating System
John K. Ousterhout, Andrew R. Cherenson, Frederick Douglis, Michael N. Nelson, and Brent B. Welch
Computer Science Division
Department of Electrical Engineering and Computer Sciences, University of California
Berkeley, CA 94720
Sprite is a new operating system for networked uniprocessor and multiprocessor workstations with large physical memories. It implements a set of kernel calls much like those of 4.3 BSD UNIX, with extensions to allow processes on the same workstation to share memory and to allow processes to mi- grate between workstations. The implementation of the Sprite kernel contains several interesting features, including a remote procedure call facility for communication between kernels, the use of prefix tables to implement a single file name space and to provide flexibility in administering the net- work file system, and large variable-size file caches on both client and server machines, which provide high performance even for diskless workstations.
Keywords: Operating systems, networks, file systems, process migration, remote procedure call.
The Sprite Network Operating System November 19, 1987
1. Introduction
Sprite is an experimental network operating system under development at the University of Cali- fornia at Berkeley. It is part of a larger research project called SPUR (‘‘Symbolic Processing Using RISCs’’), whose goal is to design and construct a high-performance multiprocessor workstation with special hardware support for Lisp applications [4]. Although one of Sprite’s primary goals is to sup- port applications running on SPUR workstations, we hope that the system will work well for a variety of high-performance engineering workstations. Sprite is currently being used on Sun-2 and Sun-3 workstations.
The motivation for building a new operating system came from three general trends in computer technology: networks, large memories, and multiprocessors. In more and more research and engineer- ing organizations, computing occurs on personal workstations connected by local-area networks, with larger time-shared machines used only for those applications that cannot achieve acceptable perfor- mance on workstations. Unfortunately, workstation environments tend to suffer from poor perfor- mance and difficulties of sharing and administration, due to the distributed nature of the systems. In Sprite, we hope to hide the distribution as much as possible and make available the same ease of shar- ing and communication that is possible on time-shared machines.
The second technology trend that drove the Sprite design is the availability of ever-larger physi- cal memories. Today’s engineering workstations typically contain 4 to 32 Mbytes of physical memory, and we expect memories of 100-500 Mbytes to be commonplace within a few years. We believe that such large memories will change the traditional balance between computation and I/O by permitting all of users’ commonly-accessed files to reside in main memory. The ‘‘RAMdisks’’ avail- able on many commercial personal computers have already demonstrated this effect on a small scale. One of our goals for Sprite is to manage physical memory in a way that maximizes the potential for file caching.
The third driving force behind Sprite is the imminent arrival of multiprocessor workstations. Workstations with more than one processor are currently under development in several research organ- izations (UCB’s SPUR, DEC’s Firefly, and Xerox’s Dragon are a few prominent examples) and we
expect multiprocessor workstations to be available from several major manufacturers within a few years. We hope that Sprite will facilitate the development of multiprocessor applications, and that the operating system itself will be able to take advantage of multiple processors in providing system ser- vices.
Our overall goal for Sprite is to provide simple and efficient mechanisms that capitalize on the three technology trends. This paper is a discussion of the ways that the trends affected the design of Sprite. In areas where the technology factors did not suggest special techniques, we modelled the sys- tem as closely as possible after Berkeley UNIX.
The technology trends had only a minor impact on the facilities that Sprite provides to applica- tion programs. For the most part, Sprite’s kernel calls are similar to those provided by the 4.3 BSD version of the UNIX operating system. However, we added three additional facilities to Sprite in order to encourage resource sharing: a transparent network file system; a simple mechanism for sharing writable memory between processes on a single workstation; and a mechanism for migrating processes between workstations in order to take advantage of idle machines.
Although the technology trends did not have a large effect on Sprite’s kernel interface, they suggested dramatic changes in the kernel implementation, relative to UNIX. This is not surprising, since networks, large memories, and multiprocessors were not important issues in the early 1970’s when the UNIX kernel was designed. We built the Sprite kernel from scratch, rather than modifying an existing UNIX kernel. Some of the interesting features of the kernel implementation are:
The kernel contains a remote procedure call (RPC) facility that allows the kernel of each works- tation to invoke operations on other workstations. The RPC mechanism is used extensively in Sprite to implement other features, such as the network file system and process migration.
Although the Sprite file system is implemented as a collection of domains on different server machines, it appears to users as a single hierarchy that is shared by all the workstations. Sprite uses a simple mechanism called prefix tables to manage the name space; these dynamic struc- tures facilitate system administration and reconfiguration.
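The paper does not give the layout of a prefix table here; the following is a minimal sketch in C, assuming a table of (prefix, server) pairs searched for the longest matching prefix. The structure and field names are invented for illustration, not Sprite's actual data structures:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical prefix-table entry: maps a file-name prefix (the root of
 * a domain) to the server that stores that domain. */
struct prefix_entry {
    const char *prefix;    /* e.g. "/", "/a", "/a/b" */
    int         server_id; /* which server handles this domain */
};

/* Return the server for the longest prefix that matches `path`,
 * or -1 if nothing matches. A match must end at a path-component
 * boundary so that "/ab" does not match the prefix "/a". */
int lookup_server(const struct prefix_entry *tab, size_t n, const char *path)
{
    size_t best_len = 0;
    int server = -1;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(tab[i].prefix);
        if (strncmp(path, tab[i].prefix, len) != 0)
            continue;
        if (len != 1 && path[len] != '/' && path[len] != '\0')
            continue;          /* not at a component boundary */
        if (len > best_len) {
            best_len = len;
            server = tab[i].server_id;
        }
    }
    return server;
}
```

Because entries can be added or replaced at run time, a table like this makes it easy to move a domain to a different server without renaming any files.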
To achieve high performance in the file system, and also to capitalize on large physical memories, Sprite caches file data both on server machines and client machines. A simple cache consistency mechanism guarantees that applications running on different workstations always use the most up-to-date versions of files, in exactly the same fashion as if the applications were executing on a single machine.
The virtual memory system uses ordinary files for backing storage; this simplifies the implemen- tation, facilitates process migration, and may even improve performance relative to schemes based on a special-purpose swap area. Sprite retains the code segments for programs in main memory even after the programs complete, in order to allow quick start-up when programs are reused. Lastly, the virtual memory system negotiates with the file system over physical memory usage, in order to permit the file cache to be as large as possible without degrading virtual memory performance.
Sprite guarantees that processes behave the same whether migrated or not. This is achieved by designating a home machine for each process and forwarding location-dependent kernel calls to the process’s home machine.
The rest of this paper describes the Sprite facilities and implementation in more detail. Section 2 discusses the unusual features of Sprite’s application interface, and Sections 3-7 provide more details on the aspects of the kernel implementation summarized above. Section 8 gives a brief status report and conclusions.
2. Application Interface
Sprite’s application interface contains little that is new. The Sprite kernel calls are very similar to those provided by the Berkeley versions of UNIX (we have ported a large number of traditional UNIX applications to Sprite with relatively little effort). This section describes three unusual aspects of the application interface, all of which can be summed up in one word: sharing. First, the Sprite file system allows all of the disk storage and all of the I/O devices in the network to be shared by all processes so that processes need not worry about machine boundaries. Second, the virtual memory
mechanism allows physical memory to be shared between processes on the same workstation so that they can extract the highest possible performance from multiprocessors. Third, Sprite implements pro- cess migration, which allows jobs to be offloaded to idle workstations and thereby allows processing power to be shared.
2.1. The File System
Almost all modern network file systems, including Sprite’s, have the same ultimate goal: net- work transparency. Network transparency means that users should be able to manipulate files in the same ways they did under time-sharing on a single machine; the distributed nature of the file system and the techniques used to access remote files should be invisible to users under normal conditions. The LOCUS system was one of the first to make transparency an explicit goal [10]; other file systems with varying degrees of transparency are CMU-ITC’s Andrew [5] and Sun’s NFS [11].
Most network file systems fail to meet the transparency goal in one or more ways. The earliest systems (and even some later systems, such as 4.2 BSD) allowed remote file access only with a few special programs (e.g., rcp in 4.2 BSD); most application programs could only access files stored on local disks. Second-generation systems, such as Apollo’s Aegis [6], allow any application to access files on any machine in the network but special names must be used for remote files (e.g., ‘‘file’’ for a local file but ‘‘[ivy]file’’ for a file stored on the Ivy server). Third-generation network file systems, such as LOCUS, Andrew, NFS, and Sprite, provide name transparency: the location of a file is not indicated directly by its name, and it is possible to move groups of files from one machine to another without changing their names.
Most third-generation systems still have some non-transparent aspects. For example, in Andrew and NFS only a portion of the file system hierarchy is shared; each machine must also have a private partition that is accessible only to that machine. In addition, Andrew and NFS do not permit applica- tions running on one machine to access I/O devices on other machines. LOCUS appears to be alone among current systems in providing complete file transparency.
Sprite is similar to LOCUS in that it provides complete transparency, so that the behavior seen by applications running on different workstations is exactly the same as it would be if all the applica- tions were executing on a single time-shared machine. There is a single file hierarchy that is accessi- ble uniformly to all workstations. Although it is possible to determine where a file is stored, that infor- mation is not needed in normal operation. There are no special programs for operating on remote files as opposed to local ones, and there are no operations that can be used only on local files. Sprite also provides transparent access to remote I/O devices. As in UNIX, Sprite represents devices as special files; unlike most versions of UNIX, Sprite allows any process to access any device, regardless of the device’s location.
2.2. Shared Address Spaces
The early versions of UNIX did not permit memory to be shared between user processes, except for read-only code. Each process had private data and stack segments, as shown in Figure 1. Since then, extensions to allow read-write memory sharing have been implemented or proposed for several versions of UNIX, including System V, SunOS, Berkeley UNIX, and Mach.
There are two reasons for providing shared memory. First, the most natural way to program many applications is to use a collection of processes in a shared address space. It is particularly con- venient to use multiple processes when an application consists of mostly-independent sub-activities (e.g., one process to respond to keystrokes and another to respond to packets arriving over a network), while the shared address space allows them to cooperate to achieve a common goal (e.g., managing a collection of windows on a screen). The second motivation for shared memory is the advent of mul- tiprocessors. If an application is to be decomposed into pieces that can be executed concurrently, there must be fast communication between the pieces. The faster the communication, the greater the degree of concurrency that can be achieved. Shared memory provides the fastest possible communication, hence the greatest opportunity for concurrent execution.
Sprite provides a particularly simple form of memory sharing: when a process invokes the Proc_Fork kernel call to create a new process, it may request that the new process share the parent’s
data segment (see Figure 2). The stack segment is still private to each process; it contains procedure invocation records and private data for the process. For simplicity, Sprite’s mechanism provides all- or-nothing sharing: it isn’t possible for a process to share part of its data segment with one process and part with another.
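Sprite's shared fork has no exact counterpart in standard UNIX. As a rough analogy (an assumption of this sketch, not Sprite's Proc_Fork API), POSIX threads give the same picture: a shared data segment visible to all threads, with a private stack for each:

```c
#include <pthread.h>

/* Shared "data segment": both workers see the same counter, just as
 * two Sprite processes created by a shared fork would. */
static int shared_counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    int i;                       /* private: lives on this thread's own stack */
    (void)arg;
    for (i = 0; i < 1000; i++) {
        pthread_mutex_lock(&lock);
        shared_counter++;        /* shared: visible to both workers */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Run two workers concurrently; returns 2000 if the data segment
 * is genuinely shared and updates are synchronized. */
int run_shared_demo(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return shared_counter;
}
```

The analogy is imperfect: Sprite's sharing units are full processes with separate kernel state, whereas threads share everything except their stacks.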
We expect multiprocess applications to synchronize using hardware mutual-exclusion instruc- tions (e.g., test-and-set) directly on shared memory. In most cases it will not be necessary to invoke the kernel, so synchronization can be accomplished in just a few instructions. The kernel participates only when it is necessary to delay the execution of a process, e.g., to wait for a lock to be released. For these situations, Sprite provides kernel calls to put a process to sleep and wake it up again later.
[Figure 1 diagram: in traditional UNIX, a private stack, private data segment, and sharable code segment; in Sprite, a private stack (which may hold private static data), a sharable data segment containing static and dynamic data, and a sharable code segment.]
Figure 1. The organization of virtual memory as seen by user processes in traditional UNIX (left) and Sprite (right). In both systems there are three distinct segments. The lower portion of the data segment contains static data known at compile time, and the upper portion expands to accommodate dynamically-allocated data. In UNIX, processes may share code but not data or stack. In Sprite the data segment may be shared between processes, including both statically-allocated and dynamic data. Private static data may be stored at the top of the stack segment.
[Figure 2 diagram: (a) a traditional UNIX fork, with parent and child stacks and data segments private and only the code segment shared read-only; (b) a Sprite shared fork, with private stacks but the data segment shared read-write and the code segment shared read-only.]
Figure 2. The UNIX fork operation, shown in (a), creates a new process that shares code with its parent while using private copies of the data and stack segments. Sprite provides both the traditional fork and also a shared fork, shown in (b), in which the child shares its parent’s data as well as code.
This has allowed us to implement efficient synchronization primitives.
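As an illustration of acquiring a lock without entering the kernel, here is a sketch using C11 atomics rather than Sprite's actual primitives; a real Sprite lock would ask the kernel to put the process to sleep after spinning too long, which this sketch omits:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* A spin lock built on a test-and-set style hardware primitive.
 * Acquiring an uncontended lock costs a few instructions and never
 * traps into the kernel. */
typedef struct { atomic_flag flag; } spinlock_t;

static void spin_init(spinlock_t *l)
{
    atomic_flag_clear(&l->flag);
}

static void spin_lock(spinlock_t *l)
{
    /* test_and_set returns the previous value: loop until we observe
     * "was clear", i.e. we are the one who set it. */
    while (atomic_flag_test_and_set(&l->flag)) {
        /* busy-wait; a Sprite-style lock would eventually invoke the
         * kernel's sleep call here and be woken on release */
    }
}

static void spin_unlock(spinlock_t *l)
{
    atomic_flag_clear(&l->flag);
}
```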
2.3. Process Migration
In an environment with a workstation for each person, many of the machines will be idle at any given time. In order to allow users to harness this idle computing power, Sprite provides a new kernel call, Proc_Migrate, which will move a process or group of processes to an idle machine. Processes sharing a heap segment must migrate together. Sprite keeps track of which machines are idle and selects one as the target for the migration. The fact that a process has migrated is transparent both to the migrated process and to the user, as described below. The only noticeable difference after migra- tion will be a reduction in the load of the home machine.
Initially, we expect migration to be used in two ways. First, there are shell commands for manual migration, which allow users to migrate processes in much the same way that the UNIX shell allows users to place processes in background. Second, we have written a new version of the UNIX make utility, called pmake. Pmake, like make, carries out the recompilation of programs when their source files change. Whereas make invokes the recompilations sequentially, pmake is organized to invoke multiple recompilations concurrently, using process migration to offload the compilations to idle machines. We hope to see more and more automatic uses of migration, like pmake, in the future.
The idea of moving work to idle machines is not a new one. Unfortunately, though, the most widely available facilities (e.g., the rsh command of 4.2BSD UNIX and the rex facility of Sun’s UNIX) provide only remote invocation: the ability to initiate new processes on other machines, but not the ability to move processes once they have started execution. Process migration, which allows processes to be moved at any time, has been implemented in several systems (e.g., LOCUS [10], V [12], and Accent [15]) but is still not widely available. For Sprite we decided to implement process migration. We think that the additional flexibility provided by migration is particularly important in a workstation environment. For example, if remote invocation is used to offload work onto an idle machine and then the machine’s user returns, either the foreign processes will have to be killed or else the machine’s user will receive degraded response until the foreign processes complete. In Sprite, the
foreign processes can be migrated away.
One of the most important attributes of Sprite’s migration mechanism is its transparency, both to the process and to the user. This means that a process will produce exactly the same results when migrated as it would if it were not migrated: Sprite preserves the environment of the process as it migrates, including files, working directory, device access, environment variables, and anything else that could affect the execution of the process. In addition, a migrated process appears to its user still to be running on the user’s home machine: it will appear in listings of processes on that machine and can be stopped or killed or debugged just like the user’s other processes. In contrast, rsh does not preserve the working directory and other aspects of the environment, and neither rsh nor rex allows a remotely-executing process to be examined or manipulated in the same fashion as local processes. Even the other implementations of process migration tend not to provide complete transparency to users, although they do provide complete transparency to the migrated processes. Section 7 describes how Sprite achieves execution transparency.
3. Basic Kernel Structure
Application programs invoke kernel functions via a collection of kernel calls. The basic flow of control in a kernel call is similar in Sprite to what it is in UNIX: user processes execute ‘‘trap’’ instructions to switch to supervisor state, and the kernel executes as a privileged extension of the user process, using a small per-process kernel stack for procedure invocation within the kernel.
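The trap-and-dispatch flow can be pictured as a table of kernel entry points indexed by call number. The following is a schematic in user-level C with invented call numbers and handlers; real trap handling (saving registers, switching to the per-process kernel stack) is machine-dependent and not shown:

```c
#include <stddef.h>

/* Hypothetical kernel-call numbers for illustration only. */
enum { SYS_GETPID, SYS_ECHO, NUM_SYSCALLS };

typedef long (*syscall_fn)(long arg);

/* Placeholder handlers standing in for real kernel routines. */
static long sys_getpid(long arg) { (void)arg; return 42; /* fake pid */ }
static long sys_echo(long arg)   { return arg; }

static const syscall_fn syscall_table[NUM_SYSCALLS] = {
    [SYS_GETPID] = sys_getpid,
    [SYS_ECHO]   = sys_echo,
};

/* What the trap handler does after the trap instruction has switched
 * the processor to supervisor state: validate the call number the user
 * process supplied, then run the handler on the kernel stack. */
long dispatch(int callno, long arg)
{
    if (callno < 0 || callno >= NUM_SYSCALLS)
        return -1;              /* bad kernel-call number */
    return syscall_table[callno](arg);
}
```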
This section describes two features of Sprite’s basic kernel structure that were provided in order to support multiprocessor and network operation. First, a multi-threaded synchronization structure allows the Sprite kernel to run efficiently on multiprocessors. Second, a remote procedure call facility allows kernels to invoke operations remotely over the network.
3.1. Multi-threading
Many operating system kernels, including UNIX, are single-threaded: a single lock is acquired when a process calls the kernel and released when the process puts itself to sleep or returns to user
state. In these systems processes are never pre-empted while executing kernel code, except by inter- rupt routines. The single-threaded approach simplifies the implementation of the kernel by eliminating many po