Marks for each part of each question are indicated in square brackets Calculators are permitted
1. Debuggers and Virtual Memory in WeensyOS
In Linux and UNIX, a debugger is an application that runs as one process and facilitates stepping through the execution of a separate application (the one the user wants to debug) that runs as a second, separate process. This question concerns how operating systems provide support for debuggers and asks you to think about how one might incorporate support for debuggers into WeensyOS.
When you ask a debugger like gdb to print the value of a variable in the target application being debugged, the debugger computes the virtual address of that variable in the target application’s virtual address space, and issues a system call that asks the operating system to obtain the value at that virtual address in the process with the process ID specified by the debugger (which the debugger specifies as the process ID of the target process being debugged). The system call returns the requested value at the target address in the virtual address space of the target process. The debugger then prints out the requested value for the user.
Consider how you could design and implement a new system call for WeensyOS for this functionality—to let a debugger application running as a WeensyOS process display a value stored at a virtual memory address in the virtual address space of some other target
1 TURN OVER
WeensyOS process. Suppose the C function prototype for this new WeensyOS system call is as follows:
int debug_get_long(int processid, long *targetaddr, long *result);
where:
• processid is the WeensyOS process ID of the target process within whose virtual address space an 8-byte value should be retrieved;
• targetaddr is the virtual address within the target process’s virtual address space from which an 8-byte value should be retrieved;
• result is a pointer to a long in the calling process’s virtual address space, where the system call will store the requested value retrieved from the virtual address space of the target process;
• the int return value indicates whether the system call has completed successfully or with an error.
Design debug get long() for the WeensyOS kernel. Give your design in C code, and comment your code thoroughly, explaining each significant step taken in the code. Assume that there is already a symbolic constant SYS DEBUG GET LONG that can be used in the relevant switch statement in the WeensyOS kernel source to dispatch to your code when an application invokes the debug get long() system call. Provide the code for the system call starting with the relevant case statement. (You may find it useful to recall how you added support for fork()to the WeensyOS kernel in CW4.)
Hints: You will need to refamiliarize yourself with how process and kernel page ta- bles work in WeensyOS to solve this problem. For full marks, you will need to include error-handling code that returns an error and avoids crashing the kernel if the call to debug get long() provides an invalidtargetaddr and processid, or an in- valid combination of the two.
We will mark your answer by evaluating the correctness and completeness of your design as expressed in code. This is a design exercise and not a programming task: we won’t try to compile and run your code, or test it. We will not deduct marks for minor syntax errors. Don’t try to compile and run your code as you work on your solution; we are only asking you to write out the code to make your design concrete, as a way of demonstrating your understanding of virtual memory and how WeensyOS supports it.
[Question 1: Total 20 marks]
2 CONTINUED
2. Linux/UNIX File I/O and Process Control
Recall the notion of a file pointer in the Linux/UNIX file input/output (I/O) system calls, which tracks the current byte offset within a currently open Linux/UNIX file—where in the bytes of the file to read or write next, when the next read() or write() system call is issued. As discussed in class, when a process has a file descriptor for a currently open file, modern Linux and UNIX OSes track the file pointer for that file using state held in the kernel’s memory.
Note: throughout this question, we are referring to the UNIX file I/O system calls, such as read() and write(), and not the C library’s buffered standard I/O file I/O calls, such as fread() and fwrite().
By contrast, in an early version of UNIX, the various UNIX file I/O system calls main- tained the file pointer for a currently open file in user-level memory within the process’s virtual address space. When the file pointer advanced (after a read() or write() sys- tem call on the file, using the exact same API for these calls as in modern UNIX), the kernel would update the file’s file pointer value in the user-level memory of the process that called read() or write().
Note that in all versions of UNIX (the early version discussed above and modern ones), output to a terminal always appends to the end of the terminal, regardless of the file pointer value for the open file for that terminal. Terminals were originally teletypes, one- character-at-time physical printers onto paper, and modern terminals are windows in a windowing system; in both cases, each successive write() always adds to the end of the terminal’s output. When a shell is interactive (when it accepts commands from the user interactively), its output goes to a terminal, and thus interactive shells always append their output to the end of the terminal, regardless of the file pointer value for the open file for that terminal.
Suppose you run a fully implemented sh0019 shell (that fully solves CW5) ac- cording to the following scenario:
• You create a file /tmp/athat contains the string This is file A.
• You create a file /tmp/bthat contains the string This is File B.
• In the current working directory, you create a shell script file named script con- taining the following two commands:
cat /tmp/a
cat /tmp/b
• You start an interactive command-line sh0019 shell. At its command prompt, you run a second instance of your sh0019 shell and have it run the commands in the above shell script by typing the command:
sh0019 -q script > outfile
a. Supposeyourunsh0019intheabovescenarioontheaforementionedearlyversion of UNIX. Assume that sh0019 compiles and runs on this early version of UNIX, but experiences the early UNIX version’s different file pointer implementation de- scribed above.
3 TURN OVER
On this early version of UNIX that maintains the file pointer in memory within a process’s address space, what output results (and in what file)? Justify your answer by describing the sequence of process creation, program execution, and file I/O system calls that occurs, and how the file pointer influences the output.
[10 marks]
b. What output does the same command give (and in what file) when run on a modern version of Linux, which maintains file pointers as described in the 0019 lectures and in CS:APP/3e? Justify your answer by explaining what is different about modern Linux’s implementation of UNIX file I/O operations, and how this implementation difference gives rise to the output.
[10 marks] [Question 2: Total 20 marks]
4 CONTINUED
3. C Compilation and x86-64 Assembly
Consider the below disassembly of the compiled x86-64 object code for the C function mystery(), as output by objdump -S:
Disassembly of section .text:
0000000000000000
pushq %rbp
movq %rsp,%rbp
testl %edx,%edx
jle 1c
movl %edx,%eax
xorl %ecx,%ecx
movl (%rsi,%rcx,4),%edx
addl %edx,%edx
addl %edx,(%rdi,%rcx,4)
incq %rcx
cmpq %rcx,%rax
jne c
popq %rbp
retq
1: 48 4: 85 6: 7e 8: 89 a: 31 c: 8b f: 01
11: 01 14: 48 17: 48 1a: 75 1c: 5d 1d: c3
89
d2
14
d0
c9
14
d2
14
ff
39
f0
e5
8e
8f c1 c8
a. HowmanyargumentsdoestheCfunctionmystery()take,andwhatevidencein the disassembly supports this conclusion?
[2 marks]
b. Below is a skeleton for the C code that when compiled yields the above assembly code. In your answers document, write out the skeleton and fill in the missing code (where there are blanks in the skeleton) that matches the above assembly code (2 marks per blank):
void mystery( {
i;
for (i = 0; i ; i++) a[i] +=
)
}
;
[8 marks] [Question 3 Total: 10 marks]
5
TURN OVER
4. Concurrency with pthreads on x86-64
a. ConsiderthefollowingCcode,whichstartsfourpthreads,eachofwhichincrements each element of an array of unsigneds ten million times, and uses C11 atomics for synchronization. Include files have been omitted in the interest of brevity.
void *threadfunc(void *arg) {
_Atomic unsigned **x = (_Atomic unsigned **) arg; for (int i = 0; i != 10000000; i++) {
for (int j = 0; j < 10; j++) {
atomic_fetch_add(x[j], 1);
} }
}
int main() {
pthread_t th[4];
_Atomic unsigned n[10];
_Atomic unsigned *pn[10];
for (int i = 0; i < 10; i++) {
n[i] = ATOMIC_VAR_INIT(0);
pn[i] = &(n[i]);
}
for (int i = 0; i < 4; i++) {
pthread_create(&th[i], NULL, threadfunc, (void *) pn);
}
for (int i = 0; i != 4; i++) {
pthread_join(th[i], NULL); }
for (int i = 0; i < 10; i++) {
printf("%u\n", n[i]);
} }
Rewrite the above code to instead use spin locks for synchronization. You may not place a spin lock outside the outer for() loop in threadfunc(), as doing so reduces concurrency among threads too much. Subject to that prohibition, you must place spin lock(s) so as to ensure correct synchronization, and to minimize the overhead of locking.
Omit include files. Use the spin lock API discussed in 0019 lecture, summarized below for your reference. You need not provide code for the implementation of spin_lock() or spin_unlock().
/* allocate and initialize spin lock */
6 CONTINUED
atomic_flag spinlock = ATOMIC_FLAG_INIT;
int spin_lock(atomic_flag *lock); // acquire spin lock int spin_unlock(atomic_flag *lock); // release spin lock
[6 marks]
b. How does the C compiler enforce atomicity on the x86-64 CPU instructions that implement C11 atomic operations?
[2 marks]
c. WhichversionofthecodediscussedinthisexamquestionrequiresmoreC11atomic operations for correct synchronization: the one in the exam question paper, which directly invokes C11 atomics, or your answer that uses spin locks? Justify your answer by referring to the relevant aspects of these two versions of the code.
[2 marks] [Question 4 Total: 10 marks]
7 TURN OVER
5. Designing with Concurrency and Virtual Memory under Linux/UNIX
Unlike the preceding questions, this question is open-ended.
Your answer is expected not to exceed 1,000 words in length. It may include any number of figures or tables (including zero!) to support the arguments you make in your answer. We would expect every figure and table you include to be discussed in the text. The quantity of figures or tables you include does not in and of itself influence your mark.
You have joined the development team for a main-memory database software system implemented in C on Linux. This main-memory database, as the name suggests, ac- cepts read and write requests to the database contents, and stores all data in the database in RAM only, in a single Linux process, within which only one thread runs. A single write request can require writes to multiple disjoint memory regions within the virtual address space of the database process. When you join the project, the system provides no persistence—i.e., none of the data is stored on non-volatile storage, such that if the machine on which the database runs crashes, loses power, or reboots, the entire database contents are lost.
You are given the task of implementing a simple form of persistence for this main- memory database system. You are told to design and implement extensions to the main- memory database that take a complete, consistent, persistent snapshot of the current con- tents of the database in main memory. “Persistent” means that the contents of the database should be written to non-volatile storage: in this case, to a single file in the standard Linux filesystem stored on a sufficiently large magnetic spinning disk. “Consistent” means that the copy of the database written to disk must not reflect partially complete write opera- tions; for any given write to the database requested by a client, a snapshot must contain all or none of the changes to the database requested in that write operation. “Complete” means that at the time a snapshot operation is initiated, all prior write operations that have completed in main memory must be included in the copy of the database written to disk.
Your design must comply with the following criteria and goals (beyond the “persistent,” “consistent,” and “complete” definitions above):
• The main-memory database must continue to process subsequent read and write re- quests that arrive during the taking of a snapshot; that is, your design may not pause the handling of read and write requests to the database while a snapshot completes.
• The main-memory database does not process multiple read or write requests (or a mix of the two) concurrently; it processes a single read or write request to com- pletion in isolation before moving on to the next one. Your persistence extensions should not change this behavior.
• Once a snapshot is initiated, no changes to memory caused by writes requested after the snapshot began should be included in that snapshot.
• You may use multiple pthreads, multiple processes, or some combination of both in your design.
• You may use any of the pthread synchronization primitives described in the 0019 lectures on synchronization primitives.
• You may use any of the Linux system calls described in the 0019 lectures.
8 CONTINUED
• You should aim to make your design both physical-memory efficient (i.e., minimize the quantity of RAM your design needs) and compute-efficient (i.e., minimize the number of CPU instructions or other execution time overheads your design intro- duces).
Assume there are four CPU cores in the Intel Core i7 CPU in the server where the main- memory database runs. And assume that the mix of read and write requests that the database receives over time is not known in advance, and that an administrator’s request for a snapshot can arrive at any time.
Hint: you may find it useful to review CS:APP/3e Sections 9.8.1. and 9.8.2, which de- scribe how Linux uses copy-on-write page mappings.
Describe your design in an essay (optionally including any figures or tables you deem necessary), taking care to specify the Linux constructs, system calls, synchronization primitives, etc., that you use, and how they fit together in your design. You must also explain how your design meets the above requirements, and why you believe your design meets the physical-memory and compute efficiency goals described above.
There is no unique correct answer to this question. We expect that even experts in Systems may produce different designs, and use different arguments for why these designs meet the requirements and goals set out in the question.
In marking your answers, we will focus on your understanding of systems design prin- ciples and constructs in C on Linux, and the tradeoffs inherent in the use of different concurrency and synchronization techniques. That is, your mark will reflect the examin- ers’ assessment of your ability to reason about the material covered in 0019, and how well the logical arguments you make in support of your design provide evidence of that un- derstanding. Your answer to this question will thus be marked according to the following criteria:
• the extent to which your answer demonstrates understanding of 0019 topics, includ- ing technical correctness of the written sentences and of the observations made;
• the completeness of the answer, in terms of whether it addresses all parts of the question as stated;
• the clarity and concision of the answer: we expect that your answer will be easily grasped by the 0019 examiners, and will not exceed 1,000 words.
• the quality of the insight offered in your answer, including originality and depth of discussion, soundness of the arguments made, and ability to make correct and in- sightful connections between designs, techniques, and systems programming prim- itives.
9
[Question 5 Total: 40 marks] END OF PAPER