CS代考 CS 15-213, Fall 2006

Andrew login ID: Full Name:
CS 15-213, Fall 2006
Final Exam
Thursday Dec 14, 2006

Copyright By PowCoder代写 加微信 powcoder

• Make sure that your exam is not missing any sheets, then write your full name and Andrew login ID on the front.
• Write your answers in the space provided below the problem. If you make a mess, clearly indicate your final answer.
• The exam has a maximum score of 92 points.
• This exam is OPEN BOOK. You may use any books or notes you like. You may use a calculator, but no other electronic devices. Good luck!
TOTAL (92):
Page 1 of 17

Problem 1. (6 points):
Floating point encoding. Consider the following 5-bit floating point representation based on the IEEE floating point format. This format does not have a sign bit – it can only represent nonnegative numbers.
• There are k = 3 exponent bits. The exponent bias is 3.
• There are n = 2 fraction bits.
Numeric values are encoded as a value of the form V = M × 2E , where E is exponent after biasing, and M is the significand value. The fraction bits encode the significand value M using either a denormalized (exponent field 0) or a normalized representation (exponent field nonzero). Any rounding of the significand is based on round-to-even.
Below, you are given some decimal values, and your task it to encode them in floating point format. In addition, you should give the rounded value of the encoded floating point number. Give these as whole numbers (e.g., 17) or as fractions in reduced form (e.g., 3/4).
Floating Point Bits
Rounded value
Page 2 of 17

Problem 2. (9 points):
Structs and arrays. The next two problems require understanding how C code accessing structures and arrays is compiled. Assume the x86-64 conventions for data sizes and alignments.
You are given the following C code:
#include “decls.h”
typedef struct { int x[CNT2];
int z[CNT3];
} struct_a;
/* Unknown constant */ /* Unknown constant */
typedef struct{
struct_a data[CNT1]; /* Unknown constant */ int idx;
} struct_b;
void set_y(struct_b *bp, int val) {
int idx = bp->idx;
bp->data[idx].y = val; }
You do not have a copy of the file decls.h, in which constants CNT1, CNT2, and CNT3 are defined, but you have the following x86-64 code for the function set_y:
bp in %rdi, val in %esi
movslq 168(%rdi),%rax
leaq (%rax,%rax,2), %rax movl %esi, 12(%rdi,%rax,8) ret
Based on this code, determine the values of the three constants A. CNT1 =
Page 3 of 17

Problem 3. (6 points):
Structs and arrays. As in the previous problem, assume the x86-64 conventions for data sizes and align- ments.
You are given the following C code:
#include “decls.h”
typedef struct{ type_t x; int y[3];
} struct_a;
typedef struct{ int low;
/* Unknown type */
struct_a val[N]; /* Unknown constant */
} struct_b;
int get_high(struct_b *bp) {
return bp->high;
You do not have a copy of the file decls.h, in which constant N and data type type_t are declared, but you have the following x86-64 code for the function get_high:
bp in %rdi
movl 104(%rdi), %eax ret
Provide some valid combination of these two parameters for which the assembly code would be generated. A. type_t:
Page 4 of 17

Problem 4. (12 points):
Loops. Consider the following x86-64 assembly function, called looped: looped:
# a in %rdi, n in %esi
movl $0, %edx
testl %esi, %esi
jle .L4
movl $0, %ecx
movslq %ecx,%rax
movl (%rdi,%rax,4), %eax cmpl %eax, %edx
cmovl %eax, %edx
cmpl %edx, %esi
movl %edx, %eax
Fill in the blanks of the corresponding C code.
• YoumayonlyusetheCvariablenamesn,a,iandx,notregisternames. • Use array notation in showing accesses or updates to elements of a.
int looped(int a[], int n) {
int x = ______________;
for(i = ____________; ________________; i++) { if (___________________)
return x; }
x = _________________;
Page 5 of 17

Problem 5. (12 points):
Stack discipline. Below is a segment of code you will remember from your buffer lab, the section that reads a string from standard input.
int getbuf() { char buf[8];
Gets(buf);
return 1; }
The function Gets is similar to the library function gets. It reads a string from standard input (terminated by \n or end-of-file) and stores it (along with a null terminator) at the specified destination. Gets has no way of determining whether buf is large enough to store the whole input. It simply copies the entire input string, possibly overrunning the bounds of the storage allocated at the destination.
Below is the object dump of the getbuf function:
08048c4b : 8048c4b: 55
mov %esp,%ebp
sub $0x20,%esp
lea 0xfffffff0(%ebp),%eax mov %eax,(%esp)
call 8048d4e
mov $0x1,%eax
8048c4c: 89 8048c4e: 83 8048c51: 8d 8048c54: 89 8048c57: e8 8048c5c: b8 8048c61: c9 8048c62: c3
00 00 00 00
Page 6 of 17

Suppose that we set a breakpoint in function getbuf and then use gdb to run the program with an input file redirected to standard input. The program stops at the breakpoint when it has completed the sub instruction at 0x08048c4e and is poised to execute the lea instruction at 0x08048c51. At this point we run the following gdb command that lists the 12 4-byte words on the stack starting at the address in %esp:
0x08048c51 in getbuf ()
(gdb) x/12w $esp
0x55683a58: 0x003164f8 0x00000001 0x55683a98 0x0030bab6 0x55683a68: 0x003166a4 0x555832e8 0x00000001 0x00000001 0x55683a78: 0x55683ab0 0x08048bf9 0x55683ab0 0x0035b690
A. What is the address of buf? 0x_________________
B. When the program reaches the breakpoint, what is the value of %ebp? 0x__________________
C.Towhichaddresswillgetbufreturnafterexecuting? 0x__________________
D. When the program reaches the breakpoint, what is the value of %esp? 0x__________________
E. Instead of having getbuf return to its calling function, suppose we want it to return to a function smoke that has the address 0x8048b20.
Below is an incomplete sequence of the hex values of each byte in the file that was input to the program (we have given you the first 8 padding values). Fill in the remaining blank hex values so that the call to Gets will return to smoke. Note that smoke does not depend on the value stored in %ebp.
0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0x___ 0x___ 0x___ 0x___ 0x___ 0x___ 0x___ 0x___ 0x___ 0x___ 0x___ 0x___ 0x___ 0x___ 0x___ 0x___
Page 7 of 17

Problem 6. (6 points):
Consider the following function for computing the dot product of two arrays of n integers each. We have unrolled the loop by a factor of 3.
int dotprod(int a[], int b[], int n) {
int i, x1, y1, x2, y2, x3, y3; int r = 0;
for (i = 0; i < n-2; i += 3) { x1 = a[i]; x2 = a[i+1]; x3 = a[i+2]; y1 = b[i]; y2 = b[i+1]; y3 = b[i+2]; r=r+x1*y1+x2*y2+x3*y3; //Corecomputation } for (; i < n; i++) r += a[i] * b[i]; return r; } Compute the performance of this function in terms of cycles per element (CPE) for each of the following associations for the core computation. Assume that we run this code on a machine in which multiplica- tion requires 7 cycles, while addition requires 5. Further, assume that these latencies are the only factors constraining the performance of the program. Don’t worry about the cost of memory references or integer operations, resource limitations, etc. Re-association ((r + x1 * y1) + x2 * y2) + x3 * y3 (r + (x1 * y1 + x2 * y2)) + x3 * y3 r + ((x1 * y1 + x2 * y2) + x3 * y3) r + (x1 * y1 + (x2 * y2 + x3 * y3)) (r + x1 * y1) + (x2 * y2 + x3 * y3) Page 8 of 17 Problem 7. (15 points): Cache memories. This problem requires you to analyze both high-level and low-level aspects of caches. You will be required to perform part of a a cache translation, determine individual hits and misses, and analyze overall cache performance. For this problem, you should assume the following: • Memory is byte addressable. • Physical addesses are 14 bits wide. • The cache is 2-way set associative with an 8 byte block-size and 2 sets. • Least-Recently-Used (LRU) replacement policy is used. • sizeof(int) = 4 bytes. Page 9 of 17 A. The following question deals with a matrix declared as int arr[4][3]. Assume that the array has already been initialized. (a) (1 point) The box below shows the format of a physical address. Indicate (by labeling the diagram) the fields that would be used to determine the following: CO The block offset within the cache line CI The set index CT The cache tag 13 12 11 10 9 8 7 6 5 4 3 2 1 0 (b) (1 point) Given that the address of arr[0][0] has value 0x2CCC, perform a cache address translation to determine the block offset and set index for the first item in the array. CI = 0x_______ CO = 0x_______ 13 12 11 10 9 8 7 6 5 4 3 2 1 0 (c) (3 points) For each element in the matrix int arr[4][3], label the diagram below with the set index that it will map to. Row 0 Row 1 Row 2 Row 3 Page 10 of 17 B. (6 points) The following questions also deals with int arr[4][3] and the cache defined at the beginning of the problem. Assume the cache stores only the matrix elements; variables i, j, and sum are stored in registers. int sum = 0; for(i=0; i<4; i++){ for(j=0; j<3; j++){ sum += arr[i][j]; /* second access begins */ for(i=2; i>=0; i=i-2){
for(j=0; j<3; j++){ sum += arr[i][j]; sum += arr[i+1][j]; /* second access ends */ Assume the above piece of code is executed. Fill out the table to indicate if the corresponding memory access will be a hit (h) or a miss (m) when accessing the array arr[4][3] for the second time (between the comments ’second access begins’ and ’second access ends’). arr[4][3] Col 0 Col 1 Col 2 Row 0 Row 1 Row 2 Row 3 The following grids can be used as scrap space: Page 11 of 17 C. The following question deals with a different matrix, declared as int arr[5][5]. Again assume that i, j, and sum are all stored in registers. Consider the following piece of code: #define ITERATIONS 1 int i, j, k; int sum = 0; for(k=0; kCS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com