The Structureof the “THE”-MultiprogrammingSystem Edsger W. Dijkstra
Technological University, Eindhoven, The Netherlands
A multiprogramming system is described in which all ac- tivities are divided over a number of sequential processes. Thesesequential processes are placed at various hierarchical levels, in each of which one or more independent abstractions
have been implemented. The hierarchical structure proved to be vital for the verification of the logical soundness of the design and the correctness of its implementation.
Copyright By PowCoder代写 加微信 powcoder
KEYWORDSAND PHRASES:operating system,multlprogrammlngsystem, systemhierarchy, systemstructure,real-tlme debugging, program verification, synchronizingprimitives, cooperating sequential processes,systemlevels, input-outputbufferingt mulfiprogramming,processorsharing,mulfiprocessing’
CRCATEGORIES: 4.30, 4.32
Introduction
In response to a call explicitly asldng for papers “on timely research and development efforts,” I present a progress report on the multiprogramming effort at the Department of Mathematics at the Technological Uni- versity in Eindhoven.
Having very limited resources (viz. a group of six peo- ple of, on the average, haif-time availability) and wishing to contribute to the art of system design–including all the stages of conception, construction, and verification, wewere faced with the problem of how to get the necessary experience. To solve this problem we adopted the follow- ing three guiding principles:
(1) Select a project as advanced as you can conceive, as ambitious as you can justify, in the hope that routine work earl be kept to a minimum; hold out against all pres- sure to incorporate such system expansions that would only result into a purely quantitative increase of the total amount of work to be done.
(2) Select a machine with sound basic characteristics (e.g. an interrupt system to fall in love with is certainly an inspiring feature); from then on try to keep the spe- cificproperties of the configuration for which you are pre- paring the system out of your considerations as long as possible.
(3) Be aware of the fact that experience does by no means automatically lead to wisdom and understanding; in other words, make a conscious effort to learn as much as possible fl’om your previous experiences.
Presented at an ACM Symposium on Operating System Principles, Gatlinburg, Tennessee, October 1-4, 1967.
Accordingly, I shall try to go beyond just reporting what we have done and how, and I shall try to formulate as well what we have learned.
I should like to end the introduction with two short remarks on working conditions, which I make for the sake of completeness. I shall not stress these points any further.
One remark is that production speed is severely slowed down if one works with half-time people who have other obligations as well. This is at least a factor of four; prob- ably it is worse. The people themselves lose time and energy in switching over; the group as a whole loses de- cision speed as discussions, when needed, have often to be postponed until all people concerned are available.
The other remark is that the members of the group (mostly mathematicians) have previously enjoyed as good students a university training of five to eight years and are of Master’s or Ph.D. level. I mention this explicitly because at least it1my country the intellectual level needed
for system design is in general grossly underestimated. I am convinced more than ever that this type of work is very difficult, and that every effort to do it with other than the best people is doomed to either failure or moderate success at enormous expense.
The Tool and the Goal
The system has been designed for a Dutch machine, the EL X8 (N.V. Electrologica, Rijswijk (ZH)). Charac- teristics of our configuration are:
(1) core memory cycle time 2.5usec, 27 bits; at present 32K;
(2) drum of 512K words, 1024 words per track, rev. time 40msec;
(3) an indirect addressing mechanism very well suited for stack implementation;
(4) a sound system for commanding peripherals and controlling of interrupts;
(5) a potentially great number of low capacity chan- nels; ten of them are used (3 paper tape readers at 1000char/see; 3 paper tape punches at 150char/ sec; 2 teleprinters; a plotter; a line printer);
(6) absence of a number of not unusual, awkward features.
The primary goal of the system is to process smoothly a continuous flow of user programs as a service to the Uni- versity. A multiprograrmning system has been chosen with the following objectives in mind: (1) a reduction of turn-around time for programs of short duration, (2) economic use of peripheral devices, (3) automatic control
Volume 11 / Number 5 / May, 1968
Communications of the ACM 341
of backing store to be combined with economic use of the central processor, and (4) the economic feasibility to use the machine for those applications for which only the flexi- bility of a general purpose computer is needed, but (as a rule) not the capacity nor the processing power.
The system is not intended as a multiaeeess system. There is no common data base via which independent users can communicate with each other: they only share the configuration and a procedure library (that includes a translator for ALGOL60 extended with complex numbers). The system does not eater for user programs written in machine language.
Compared with larger efforts one can state that quanti- tatively spealdng the goals have been set as modest as the equipment and our other resources. Qualitatively speak- ing, I am afraid, we became more and more immodest as the work progressed.
A Progress Report
We have made some minor mistakes of the usual type (such as paying too much attention to eliminating what was not the real bottleneck) and two major ones.
Our first major mistake was that for too long a time we confined our attention to % perfect installation”; by the time we considered howto make the best of it, one of the peripherals broke down, we were faced with nasty prob- lems. Taking care of the “pathology” took more energy than we had expected, and some of our troubles were a direct consequence of our earlier ingenuity, i.e. the com- plexity of the situation into which the system could have maneuvered itself. Had we paid attention to the pathology at an earlier stage of the design, our management rules would certainly have been less refined.
The second major mistake has been that we conceived and programmed the major part of the system without giving more than scanty thought to the problem of de- bugging it. I must decline all credit for the fact that this mistake had no serious consequences–on the contrary! one might argue as an afterthought.
As captain of the crew I had had extensive experience (dating back to 1958) in making basic software dealing with real-time interrupts, and I knew by bitter experience that as a result of the irreproducibility of the interrupt moments a program error could present itself misleadingly
like an occasional machine malfunctioning. As a result I was terribly afraid. Having fears regarding the possibility of debugging, we decided to be as careful as possible and, prevention being better than cure, to try to prevent nasty bugs from entering the construction.
This decision, inspired by fear, is at the bottom of what I regard as the group’s main contribution to the art of system design. We have found that it is possible to design a refined multiprogramming system in such a way that its logical soundness can be proved a priori and its implemen- tation can admit exhaustive testing. The only errors that
showed up during testing were trivial coding errors (occurring with a density of one error per 500 instructions), each of them located within 10 minutes (classical) inspec- tion by the machine and each of them correspondingly easy to remedy. At the time this was written the testing had not yet been completed, but the resulting system is guaranteed to be flawless. When the system is delivered we
shall not live in the perpetual fear that a system derail- ment may still occur in an unlikely situation, such as might result from an unhappy “coincidence” of two or more critical occurrences, for we shall have proved the eon’eetness of the system with a rigor and explicitness that is unusual for the great majority of mathematical proofs.
A Survey of the System gtructure
Storage Allocation. In the classical yon Neumann machine, information is identified by the address of the memory location containing the information. When we started to think about the automatic control of secondary storage we were familiar with a system (viz. GmR ALGOL) in which all information was identified by its drum address
(as in the classical yon Neumann machine) and in which the function of the core memory was nothing more than to make the information “page-wise” accessible.
We have followed another approach and, as it turned out, to great advantage. In our terminology we made a strict distinction between memory units (we called them “pages” and had “core pages” and “drum pages”) and corresponding information units (for lack of a better word we called them “segments”), a segment just fitting in a page. For segments we created a completely independent identification mechanism in which the number of possible segment identifiers is much larger than the total number of pages in primary and secondary store. The segment iden- tifier gives fast access to a so-called “segment variable” in core whose value denotes whether the segment is still empty or not, and if not empty, in which page (or pages) it can be found.
As a consequence of this approach, if a segment of in- formation, residing in a core page, has to be dumped onto the drum in order to make the core page available for other use, there is no need to return the segment to the same drum page from which it originally came. In fact, this freedom is exploited: among the free drum pages the one with minimum latency time is selected.
A next consequence is the total absence of a drum allo- cation problem: there is not the slightest reason why, say, a program should occupy consecutive drum pages. In a multiprogramming environment this is very convenient.
Processor Allocation. We have given full recognition to the fact that in a single sequential process (such as ca~ be performed by a sequential automaton) only the time succession of the various states has a logical meaning, but not the actual speed with which the sequential process is
342 Communications of the ACM
Volume 11 / Number 5 / May, 1968
performed. Therefore we have arranged the whole system as a society of sequential processes, progressing with un- defined speed ratios. To each user program accepted by the system corresponds a sequential process, to each input peripheral corresponds a sequential process (buffering input streams in synchronism with the execution of the input commands), to each output peripheral corresponds a sequential process (unbuffering output streams in syn- chronism with the execution of the output commands); furthermore, we have the “segment controller” associated with the drum and the “message interpreter” associated with the console keyboard.
This enabled us to design the whole system in terms of these abstract “sequential processes.” Their harmonious cooperation is regulated by means of explicit mutuM synchronization statements. On the one hand, this ex- plicit mutual synchronization is necessary, as we do not make any assumption about speed ratios; on the other hand, this mutual synchronization is possible because “delaying the progress of a process temporarily” can never be harmful to the interior logic of the process delayed. The fundamental consequence of this approaeh–viz, the ex- plicit mutual synchronization–is that the harmonious cooperation of a set of such sequential processes can be
established by discrete reasoning; as a further consequence the whole harmonious society of cooperating sequential processes is independent of the actual number of processors available to carry out these processes, provided the proces- sors available can switch from process to process.
System Hierarchy. The total system admits a strict hierarchical structure.
At level 0 we find the responsibility for processor allo- cation to one of the processes whose dynamic progress is logically permissible (i.e. in view of the explicit mutual synchronization). At this level the interrupt of the real- time clock is processed and introduced to prevent any process to monopolize processing power. At this level a priority rule is incorporated to achieve quick response of the system where this is needed. Our first abstraction has been achieved; above level 0 the number of processors actually shared is no longer relevant. At higher levels we
find the activity of the different sequential processes, the actual processor that had lost its identity having disap- peared from the picture.
At level 1 we have the so-called “segment controller,” a sequential process synchronized with respect to the drum interrupt and the sequential processes on higher levels. At level 1 we find the responsibility to cater to the book- keeping resulting from the automatic backing store. At this level our next abstraction has been achieved; at all higher levels identification of information takes place in terms of segments, the actual storage pages that had lost their identity having disappeared from the picture.
At level 2 we find the “message interpreter” taking care of the allocation of the console keyboard via which con-
Volume 11 / Number 5 / May, 1968
versations between the operator and any of the higher level processes can be carried out. The message interpreter works in close synchronism with the operator. When the operator presses a key, a character is sent to the machine together with an interrupt signal to announce the next keyboard character, whereas the actual printing is done through an output command generated by the machine
under control of the message interpreter. (As far as the hardware is concerned the console teleprinter is regarded as two independent peripherals: an input keyboard and an output printer.) If one of the processes opens a conversa- tion, it identifies itself in the opening sentence of the con- versation for the benefit of the operator. If, however, the operator opens a conversation, he must identify the process he is addressing, in the opening sentence of the conversation, i.e. this opening sentence must be inter- preted before it is known to which of the processes the conversation is addressed! Here lies the logical reason for
the introduction of a separate sequential process for the console teleprinter, a reason that is reflected in its name, “message interpreter.”
Above level 2 it is as if each process had its private con- versational console. The fact that they share the same physical console is translated into a resource restriction of the form “only one conversation at a time,” a restriction that is satisfied via mutual synchronization. At this level the next abstraction has been implemented; at higher levels the actual console teleprinter loses its identity. (If the message interpreter had not been on a higher level than the segment controller, then the only way to imple- ment it would have been to make a permanent reservation in core for it; as the conversational vocabulary might be- come large (as soon as our operators wish to be addressed in fancy messages), this would result in too heavy a per- manent demand upon core storage. Therefore, the vo- cabulary in which the messages are expressed is stored on segments, i.e. as information units that can reside on the drum as well. For this reason the message interpreter is one level higher than the segment controller.)
At level 3 we find the sequential processes associated with buffering of input streams and unbuffering of out- put streams. At this level the next abstraction is effeeted, viz. the abstraction of the actual peripherals used that are allocated at this level to the “logical communication units” in terms of which are worked in the still higher levels. The sequential processes associated with the peripherals are of a level above the message interpreter, because they must be able to converse with the operator (e.g. in the case of detected malfunctioning). The limited number of periph- erals again acts as a resource restriction for the processes at higher levels to be satisfied by mutual synchronizatioI~
between them.
At level 4 we find the independent:user programs and
at level 5 the operator (not implemented by us).
The system structure has been described at length in
order to make the next section intelligible. Communications of the ACM 343
Design Experience
• The conception stage took a long time. During that period of time the concepts have been born in terms of which we sketched the system in the previous section. Furthermore, we learned the art of reasoning by which we could deduce from our requirements the way in which the processes should influence each other by their inutual synchronization so that these requirements would be met. (The requirements being that no information can be used before it has been produced, that no peripheral can be set to two tasks simultaneously, etc.). Finally we learned the art of reasoning by which we could prove that the society composed of processes thus mutually synchronized by each other would indeed in its time behavior satisfy all requirements.
The construction stage has been rather traditional, perhaps even old-fashioned, that is, plain machine code. Reprogramming on account of a change of specifications has been rare, a circumstance that must have contributed greatly to the feasibility of the “steam method.” That the first two stages took more time than planned was some- what compensated by a delay in the delivery of the machine.
In the verification stage we had the machine, during short shots, completely at our disposal; these were shots during which we worked with a virgin machine without any software aids for debugging. Starting at level 0 the system was tested, each time adding (a portion of) the next level only after the previous level had been thoroughly tested. Each test shot itself contained, on top of the (par- tial) system to be tested, a number of testing processes with a double function. First, they had to force the system into all different relevant states; second, they had to verify that the system continued to react according to specifica- tion.
I shall not deny that the construction of these testing programs has been a major intellectual effort: to convince oneself that one has not overlooked “a relevant state” and to convince oneself that the testing programs generate them all is no simple matter. The encouraging thing is that (as far as we know!) it could be done.
This fact was one of the happy consequences of the hierarchical structure.
Testing level 0 (the real-time clock and processor allo- cation) implied a number of testing sequential processes on top of it, inspecting together that under all circum- stances processor time was divided among them accord- ing to the rules. This being established, sequential processes as such were implemented.
Testing the segment controller at level 1 meant that all “relevant states” could be formulated in terms of se- quential processes making (in various combinations) demands on core pages, situations that could be provoked by explicit synchronization among the testing programs. At this stage the existence of the real-time clock–al- though interrupting all the time–was so immaterial that one of the testers indeed forgot its existence!
By that time we had implemented the correct reaction upon the (mutually unsynchronized) interrupts from the reaI-time clock and the drum. If we ihad not introduced the separate levels 0 and 1, and if we had not created a ternfinology (viz. that of the rather abstract sequential proces
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com