CS代考 15-213 Introduction to Computer Systems

15-213 Introduction to Computer Systems
Final Exam May 3, 2006
Name: ID: Recitation Section:
• This is an open-book exam.

Copyright By PowCoder代写 加微信 powcoder

• Notes and calculators are permitted, but not computers. • Write your answers legibly in the space provided.
• You have 180 minutes for this exam.
Assembly Language Out-of-Order Execution Pointer Arithmetic Caching Signals Semaphores Servers System-Level I/O

1. Assembly Language (15 points)
Consider the following declaration of a binary search tree data structure.
struct TREE {
struct TREE* left; struct TREE* right;
Describe the layout of this structure on an x86-64 architecture.
1. (3 pts) If a is the address of the beginning of the structure, then
• is the address of the data field,
• is the address of the left field, and
• is the address of the right field. Give your address calculations in bytes in decimal.
2. (1 pts) a will be aligned to 0 modulo .
3. (1 pts) The total size of a TREE structure will be bytes.

Next we consider the tree as a binary search tree, where elements to the left of a node are smaller and elements to the right of a node are larger than the data stored in the node. The following function checks whether a given integer x is stored in the global tree root.
typedef struct TREE tree; tree* root;
int member(int x) { tree* t = root; while (t != NULL) {
if (x == t->data)
if (x < t->data)
t = t->left;
t = t->right;
return 0; }
This function might compile to the following piece of assembly code, omitting some code alignment directives and three lines for you to fill in.
movq root(%rip), %rax testq %rax, %rax
___________________________ je .L13
movq 8(%rax), %rax
testq %rax, %rax jne .L14
___________________________
movq 16(%rax), %rax jmp .L11
___________________________

4. (4 pts) Complete the following table, associating C variables with machine registers or assembly expressions.
C variable
Assembly expression
return value
5. (6 pts) Fill in the missing three lines of assembly code.

2. Out-of-Order Execution (20 points)
We continue the code from the previous problem
1. (15 pts) On a machine with pipelining and out-of-order execution as the machines used in this course, the efficiency of the inner loop can be improved by the use of conditional move instructions. Rewrite the code between .L14 and .L9 by using only instructions from the original program, ordinary move instructions, and one or more of the following conditional move instructions.
cmovl S,D cmovle S,D cmove S,D cmovge S,D cmovg S,D
As usual, S stands for the source, D for the destination, and the suffix l, le, e, ge, g has the same meaning as for conditional branches.
testq %rax, %rax jne .L14
2. (5 pts) Explain why the code with conditional moves can be more efficient than the original code produced by the compiler.

3. Pointer Arithmetic (10 points)
A desparate student decided to write a dynamic memory allocator for an x86-64 machine in which each block has the following form:
• Header is a 4 byte header
• Id string is an 8 byte string
• Payload is arbitrary size, including padding • Footer is a 4 byte footer
Assume the student wants to print the Id string with the following function
void print_block(void *bp) {
printf(“Found block ID: %s\n”, GET_ID(bp));
where bp points to the beginning of the payload and is aligned to 0 modulo 8. Circle each
of the following letters A–J for which the macros will correctly print the id string. A #define GET_ID(bp) ((char *)(((long)bp)-8))
B #define GET_ID(bp) ((char *)(((int)bp)-8))
C #define GET_ID(bp) ((char *)(((char)bp)-8)) D #define GET_ID(bp) ((char *)(((long *)bp)-1))
E #define GET_ID(bp) ((char *)(((int *)bp)-2)) F #define GET_ID(bp) ((char *)(((char *)bp)-8)) G #define GET_ID(bp) ((char *)(((char**)bp)-1)) H #define GET_ID(bp) ((char *)(((char**)bp)-2)) I #define GET_ID(bp) ((char *)(((char**)bp)-4)) J #define GET_ID(bp) ((char *)(((char**)bp)-8))

4. Caching (20 points)
Assume the following situation.
• All caches are fully associative, with LRU eviction policy.
• The cache is write-back, write-allocate.
• All caches are empty at the beginning of an execution.
• Variables i, j, and k are stored in registers.
• A float is 4 bytes.
The function mm_ijk multiplies two N × N arrays A and B and puts the result in R. For simplicity, we assume R is initialized to all zeros.
void mm_ijk (float A[N][N], float B[N][N], float R[N][N]) { int i,j,k;
for (i = 0; i < N; i++) for (j = 0; j < N; j++) for (k = 0; k < N; k++) R[i][j] += A[i][k] * B[k][j]; 1. (6 pts) Consider the executions of mm_ijk with N=2 and N=4 or a 64-byte fully associative LRU cache with 4-byte lines (the cache holds 16 lines). Fill in the table below with the number of cache misses caused by accesses to each of the arrays A, B, and R, assuming that the argument arrays are 16-byte aligned. 2. (6 pts) Now suppose we consider the previous experiment on a 64-byte fully asso- ciative LRU cache with 16 byte lines (the cache holds 4 lines). Fill in the table below with the number of cache misses due to each array, assuming that the argument arrays are 16 byte aligned. 3. (5pts)EvenifRisinitializedtoallzeros,andeveniftheprogramissingle-threaded and no signals occur, after the execution of the mm_ijk function, R will not neces- sarily contain the product of A and B. Give a concrete counterexample. 4. (3 pts) Under which additional assumption is mm_ijk correct? 5. Signals (15 points) For each code segment below, give the largest value that could be printed to stdout. Re- member that when the system executes a signal handler, it blocks signals of the type currently being handled (and no others). /* Version A */ int i = 0; void handler(int s) { if (!i) { kill(getpid(), SIGINT); } int main() { signal(SIGINT, handler); kill(getpid(), SIGINT); printf("%d\n", i); return 0; 1. (5 pts) Largest value for version A: . /* Version B */ int i = 0; void handler(int s) { if (!i) { kill(getpid(), SIGINT); kill(getpid(), SIGINT); } int main() { signal(SIGINT, handler); kill(getpid(), SIGINT); printf("%d\n", i); return 0; 2. (5 pts) Largest value for version B: . 9 /* Version C */ int i = 0; void handler(int s) { if (!i) { kill(getpid(), SIGINT); kill(getpid(), SIGUSR1); } int main() { signal(SIGINT, handler); signal(SIGUSR1, handler); kill(getpid(), SIGUSR1); printf("%d\n", i); 3. (5 pts) Largest value for version C: . 6. Semaphores (30 points) We would now like to use binary search trees with (long) integer data to represent sets of integers. struct TREE { long int data; struct TREE* left; struct TREE* right; typedef struct TREE tree; tree* root = NULL; /* tree is initially empty */ int member(long int x); /* returns 1 if x is in *root, 0 otherwise */ void insert(long int x); /* insert x into tree at *root */ We assume that the functions member and insert are straightforward (as member in an earlier problem) and are not thread-safe. In this problem we explore how to write wrappers so they can be used in a thread-safe manner. We would like to allow any number of concurrent readers (via the member function), since simultaneous reading the data structure is safe. On the other hand, we would like to ensure that a writer (via the insert function) has exclusive access to the data structure (no other writers or readers allowed). To implement this, we use one global variable, readers, which counts the number of currently executing readers and two semaphores: r to ensure mutually exclusive access to the readers variable, and w to ensure that a writer has exclusive access to the data structure stored in the root variable. int readers = 0; sem_t w; 1. (15 pts) Below is the skeleton of the wrapper functions. Fill in the missing lines from the following selection. Note that you may need some commands more than once while others may remain unused. readers++; readers--; if (readers == 1) P(&w); if (readers == 1) V(&w); if (readers == 0) P(&w); if (readers == 0) V(&w); int member_safe(long int x) { int b; b = member(x); return(b); } void insert_safe(long int x) { insert(x); 2. (5 pts) Which code is needed to initialize the semaphores. Circle all needed initial- ization. • sem_init(&r, 0, 0); • sem_init(&r, 0, 1); • sem_init(&w, 0, 0); • sem_init(&w, 0, 1); Yes No Yes No Yes No Yes No 3. (5 pts) Note that it is possible for a write request never to succeed by having certain read patterns. Precisely explain the circumstances under which this may happen. 4. (5 pts) Propose a simple solution to avoid this problem so that every write request is eventually honored (in words—no need to show code). 7. Servers (20 points) We now write a server which maintains a do-not-call list of telephone numbers. Tele- marketers who call numbers in this list are subject to heavy fines. Our protocol is much simpler than HTTP. A client connects to the server and then sends either • +x\n to add the phone number x to the do-not-call list, or • ?x\n to query whether the number x is in the do-not-call list, whereupon the server responds with 0\n (not in the list) or 1\n (in the list). Here, a phone number x is just a 10-digit number (no spaces or parentheses). The server spawns a separate thread for each connection. We assume that serve(connfd) reads one line of input from connfd, parses it, and responds if appropriate according to the protocol above. For conciseness, the code below does not check return codes for system calls. void* thread(void* vargp) { int connfd = *((int *) vargp); pthread_detach(pthread_self()); serve(connfd); close(connfd); return NULL; int main() { int listenfd, connfd; pthread_t tid; /* semaphores are initialized here */ ... listenfd = open_listenfd(15213); while (1) { connfd = accept(listenfd, NULL, NULL); pthread_create(&tid, NULL, thread, &connfd); } 1. (10 pts) Even if serve is thread-safe, the code above exhibits a race condition. Ex- plain this race condition in detail. 2. (10 pts) Correct the code above to avoid the race condition. Your code should not cast integers to pointers or pointers to integers (which would be in poor taste). 8. System-Level I/O (20 points) We continue the example from the previous question. To complete the serve function below, you should use void rio_readinitb(rio_t* rp, int fd); ssize_t rio_readlineb(rio_t* rp, char* usrbuf, size_t maxlen); ssize_t rio_writen(int fd, char* usrbuf, size_t len); from the robust I/O package. The function long atol(char* s) converts the begin- ning of the string s to a long int. 1. (10 pts) Complete the following function serve. void serve(connfd) { long int phone_number; rio_t rio; ________________________________; ________________________________; if (_________________________________________) { if (buf[0] == ’?’) { phone_number = atol(buf+1); if (member_safe(phone_number)) _____________________________________; else _____________________________________; } if (buf[0] == ’+’) { phone_number = atol(buf+1); insert_safe(phone_number); 2. (5 pts) HTTP 1.1 allows for multiple client requests and and server responses on a single connection. Explain why. 3. (5 pts) We would like our protocol to support multiple interactions per connection. Show how to change the the code for the serve function above to handle multiple requests per connection. [Hint: you don’t have to change much.] 程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com