程序代写 CSE 420 Fall 2018 Module 1 Sample Questions

CSE 420 Fall 2018 Module 1 Sample Questions
Question 1 [MIPS Assembly Language Programming]
Write MIPS assembly language program for the AddOddElement routine shown below. This routine sums only the odd elements of an array. You should write a recursive function, similar to the one below. Stick to MIPS calling and register conventions as much as possible.
Hint: You can check your solution by trying it out on MARS.

Copyright By PowCoder代写 加微信 powcoder

int AddOddElement( int array[], int size ) { if ( size == 0 )
return 0 ;
else if ( array[ size – 1 ] % 2 == 1 )
return AddOddElement( array, size – 1 ) + array[ size – 1 ] ; else
return AddOddElement( array, size – 1 ) ; }
Usually, the hard part is to decide which values to save on the stack. arr is stored in $a1, and that this value doesn’t change throughout. size may need to be saved, though size – 1 appears to be more useful. Since we make calls to AddOddElement, we also need to save $ra. So, let’s save size – 1 and $ra to the stack. It turns out we also need to save array[ size – 1 ] to the stack too.
AddOddElement:
addi $sp, $sp, -12
addi $t0, $a0, -1
sw $t0, 0($sp)
sw $ra, 4($sp)
bne $a0, $zero, CASE1 li $v0, 0
addi $sp, $sp, 12 jr $ra
# Check if array[size-1] is odd CASE1:
mult $t0, $t7
add $t1, $t1, $a1
lw $t2, 0($t1)
andi $t3, $t2, 1
beq $t3, $zero, CASE2
# Adjust sp
# Compute size – 1
# Save size – 1 to stack # Save return address # branch if size != 0
# Set return value to 0 # Adjust sp
# Multiply size – 1 by 4
# Put result in t1
# Compute address of array[ size – 1 ] # t2 = array[ size – 1 ]
# is array[ size – 1 ] odd
# branch if even

# If array[size-1] is odd, recursively call function # and add the computed value array[size-1]
sw $t2, 8($sp)
move $a0, $t0
jal AddOddElement lw $t2, 8($sp)
add $v0, $v0, $t2
lw $ra, 4($sp)
addi $sp, $sp, 12
# save array[ size – 1 ] on stack # update first argument as size-1
# restore array[ size – 1 ] from stack # update return value
# restore return address from stack # Adjust stack pointer
# If array[size-1] is even, recusively call function AddOddElement CASE2:
move $a0, $t0
jal AddOddElement lw $ra, 4($sp)
addi $sp, $sp, 12
# update first argument as size-1
# restore return address from stack # Adjust sp
As it turns out, we didn’t have to save size – 1. So we can leave that off the stack if you want. As a rule of thumb, you may end up deciding later about what to save and what not to save. A compiler, of course, has to determine this without being too clever. Sometimes a compiler might save too much to the stack, just so it’s easier to generate code.
A few comments:
• In CASE1, we use mult. The result is placed in two registers HI and LO. Since the number is likely to be quite small, we can assume the result is in LO, and HI contains all 0’s. We can access the low word from LO, using mflo (move from LO).
• Note that using mult to multiply by 4 is generally a slow instruction compared to the doubling a value twice, or logical left shifting by 2 bits.
• In CASE1 we need to access array[ size – 1 ]. This then has to be placed on the stack because we’ll need to use the value after the jal call.
• We use andi (bitwise and immediate) to test for even or odd. Essentially we mask all but the least significant bit and test if that value is 0 or 1. Fortunately, this trick works for UB and 2C representations.
• CASE2 is quite short because we don’t access the array elements. We don’t even need to worry about $v0 because the call to jal AddOddElement fills it in for us. All we have to do is to restore the return address and jump back.
• Array access tends to bloat the length of recursive function code in assembly language.

Question 2 [MIPS Assembly Language Programming]
Following the MIPS subroutine procedure using the argument registers($a0-a3), translate the following C function to assembly.
int sum(int a, int b, int c, int d, int e, int f){ return a+b+c+d+e+f;
sum: add $t0, $a0, $a1 # $t0 = a + b add $t1, $a2, $a3 # $t1 = c + d add $t0, $t0, $t1
lw $t2, ($sp) #$t2 = f
lw $t3, 4($sp)# $t3 =e
add $t0, $t0, $t2
add $t0, $t0, $t3
add $v0, $t0, $0 # $v0 is one of the return parameters needed
Note: Even though the solution uses the argument registers, the stack is still needed for extra arguments.
Question 3 [MIPS Assembly Language Programming]
For each of the following pseudo-instructions, produce a minimal sequence of real MIPS instructions to accomplish the same thing. You may only use the $at register as a temporary register.
a) bltu $s1, $s2, Label b) rol $s1, $s2, 5
a) bltu $s1, $s2, Label
sltu $at, $s1, $s2
bne $at, $zero, Label
b) rol $s1, $s2, 5
srl $at, $s2, 27 sll $s1, $s2, 5 or $s1, $s1, $at
# branch less than unsigned # rol = rotate left $s2 by 5 bits
# branch less than unsigned
# rol = rotate left $s2 by 5 bits
Question 4 [MIPS Assembly Language Programming]
Consider the following program fragment in MIPS32 assembly code: li $s0, -1
srl $v0, $s0, 1 addiu $a0, $v0, 1

Which of the following statements are correct?
(I) The fragment can occupy 128 bits in memory
(ii) Register $a0 will contain the largest positive representable signed number after executing the fragment
(iii) Register $a0 will contain the least negative representable signed number after executing the fragment
(iv) The fragment will raise an overflow exception
(v) Register $v0 will contain 1 after executing the fragment
li is a pseudo instruction in MIPS assembly, which translates to two instructions. For example, li $8, 0x3BF20 translates to:
lui $8, 0x0003
ori $8, $8, 0xBF20
Therefore, the fragment can have 4 instructions, occupying 128 bits in memory.
Register $a0 will contain the least negative representable signed number after executing the fragment

Question 5 [Assembling a program]
Consider this switch code section written in C language, which has three cases.
switch( i ) {
case 1: i++ ; // falls through case 2: i += 2 ;
case 3: i += 3 ;
The following MIPS instructions implement this switch.
addi $s4, $zero, 1
bne $s1, $s4, C2_COND j C1_BODY
C2_COND: addi $s4, $zero, 2
bne $s1, $s4, C3_COND
j C2_BODY C3_COND: addi $s4, $zero, 3
bne $s1, $s4, EXIT
j C3_BODY C1_BODY: addi $s1, $s1, 1 C2_BODY: addi $s1, $s1, 2
j EXIT C3_BODY: addi $s1, $s1, 3
# case 1: set temp to 1
# false: branch to case 2 cond # true: branch to case 1 body # case 2: set temp to 2
# false: branch to case 3 cond # true: branch to case 2 body # case 3: set temp to 3
# false: branch to exit
# true: branch to case 3 body # case 1 body: i++
# case 2 body: i += 2
# case 3 body: i += 3
Problem: Translate the machine instructions into instruction fields with decimal representation.
Register Variable
$s1 i $s4 temp
Assembly Instruction
Instruction Fields (fill with decimal
0x100 addi $s4, $zero, 1 8 0x104 bne $s1, $s4, 5
0 20 1 17 20 1
0x108 j C1_BODY 2 0x124/4
0x10c addi $s4, $zero, 2 8
0x110 bne $s1, $s4, 5 C3_COND
0 20 2 17 20 1
0x114 j C2_BODY
0x118 addi $s4, $zero, 3 0x11c bne $s1, $s4, EXIT 0x120 j C3_BODY
0x124 addi $s1, $s1, 1 0x128 addi $s1, $s1, 2 0x12c j EXIT
0x130 addi $s1, $s1, 3
8 0 20 3 5 17 20 5 2 0x130/4
8 17 17 1 8 17 17 2 2 0x134/4

Question 6 [Datapath and control]
We wish to add a variant of the lw (load word) instruction to the single-cycle datapath, which increments the index register after loading word from memory. This instruction (lwi) corresponds to the following two instructions:
lw $rt, offset( $rs ) addi $rs, $rs, 4
The single-cycle datapath given in the Figure 1 (is same as in Figure 5.17, P&H edition 3, page 307). Add any necessary datapaths and control signals to Figure 1 and show the necessary additions to Table (same as Figure 5.18, P&H edition 3, page 308).
R-format 1 0 0 1 0 0 0 10
Lw 0 1 1 1 1 0 0 00
Sw X 1 X 0 0 1 0 00
Beq X 0 X 0 0 0 1 01
Lwi 0 1 1 1 1 0 0 00 1
Instruction

A modification is required for the datapath of Figure 1 to perform the autoincrement by adding 4 to the $rs register through an incrementer. Also we need a second write port to the register file because two register writes are required for this instruction. The new write port will be controlled by a new signal, “Write 2”, and a data port, “Write data 2.” We assume that the Write register 2 identifier is always the same as Read register 1 ($rs). This way “Write 2” indicates that there is a second write to register file to the register identified by “Read register 1,” and the data is fed through Write data 2.
A new line should be added to the truth table in Figure 2 for the lwi command as follows:
RegDst = 0: First write to $rt.
ALUSrc = 1: Address field for address calculation.
MemtoReg = 1: Write loaded data from memory. RegWrite = 1: Write loaded data into $rt.
Branch ALUOp Write2
= 0: Not a branch, output from the PCSrc controlled mux ignored. = 00: Address calculation.
= 1: Second register write (to $rs).
Such a modification of the register file architecture may not be required for a multiple-cycle implementation, since multiple writes to the same port can occur on different cycles.
Question 7 [Cycle time]
A single-cycle processor implementation of the simplified MIPS architecture is shown in Figure 2. The table below gives the delays of the various microarchitectural blocks in the processor. Assume that the delay of wires and anything not mentioned in the table is 0.
400ps 100ps
30ps 120ps 200ps 350ps 20ps
50ps 100ps

Problem a: What is the cycle time (in ps) of the processor? The
critical path should come from the instruction “ lw” i.e. instruction
mem → reg. file → ALU → data mem → MUX → reg. file Hence,
cycle time = 400 + 200 + 120 + 350 + 30 + 200 = 1300 ps
Note: If the latency of the blocks control + ALUcontrol is larger than that of a register file + multiplexer, then they will replace the later in the calculation of the path delay.
Problem b: What is the path delay (in ps) for executing AND instruction?
The path is – instruction mem → reg. file → mux → ALU → MUX → reg. file Hence, time to execute AND instruction is – 400 + 200 +30 +120 + 30 + 200 = 980 ps
Problem c: What is the path delay (in ps) for MIPS BNE instruction?
The path is – instruction mem → max(reg. file, control) → mux → ALU → MUX →
Hence, time to execute BNE instruction is – 400 + 200 +30 +120 + 30 + 30 = 810 ps

Once the signal (say zero) is raised by ALU, AND gate and first multiplexor follow in the path for branch. Then it goes through the second mux to select between the output of first mux and the destination for the jump.
Problem d: What is the execution time (in cycles and in ps) for code given below?
Given assembly code: add $s1, $s0, 40
LOOP: beq $s0, $s1, EXIT
lw $t0, 0($s0) add $s0, $s0, 4 add $t1, $t0, $t1 j LOOP
The loop comprises of 5 instructions and would execute for 10 times. Therefore, Total Cycles: 2 + 5*10 + 1 = 53
Execution time (in ps): Total Cycles * Cycle Time (in ps) = 53 * 1300 = 68900 ps
Problem e: Now suppose we enhance the processor with a new instruction lwi (semantics of lwi explained in Question 5). Adding this new instruction will require an extra port in the register file. Because of this, the delay of the register read or register write increases from 200 ps to 300 ps.
What is the cycle time (in ps) of the processor with enhanced functionality?
The critical path comes from the instruction “ lw”.
Therefore, cycle time = 400 + 300 + 120 + 350 + 30 + 300 = 1500 ps
Problem f: You can improve the code given in problem (d) for the processor with this
enhanced functionality. What will be the new (optimized) code?
Optimized assembly code: add $s1, $s0, 40
LOOP: beq $s0, $s1, EXIT
lwi $t0, 0($s0) add $t1, $t0, $t1 j LOOP
Problem g: What is the execution time (in cycles and in ps) for your new code?
Total Cycles: 2 + 4*10 + 1 = 43
Execution time (in ps): 43 * 1500 = 64500 ps

Question 8 [MIPS Assembly Language Programming]
(1) Write MIPS assembly code to do the following arithmetic
calculation. Note:
1) All variables A, and X are integers.
2) The data value for A, should be loaded WITHOUT using “li” instruction.
Write final value into register$t2as if you are writing to the variableX. You can use any number of registers (any type), to perform the computation below.
A = 0x12344321
X = (A >> 2)
# program-1.asm
Lui $t0, 0x1234, Ori $t0, $t0, 0x4321 Srl $t2, $t0, 2
# program-2.asm
addi $t0, $zero, 0x1234 sll $t0, $t0, 16
addi $t0, $t0, 0x4321 srl $t2, $t0, 2
/* A = 30541494510 */
/* >> denotes Right Shift */
(2) Write a MIPS assembly language translation for the following C++ code. Note –
(1) You are required to load the input data into the respective registers for “A”and “B”, using the “li” instruction. You are then required to use the memory for the pointer operations in the program, and complete the MIPS program.
(2) You are required to write the code that performs“print A”and“EXIT” functions, using the “syscall” numbers “1”, and “10” respectively.
(3) You are required to be aware of the register usage guidelines discussed in class, and use appropriate registers in the program.
void chico (int *X, int Y) {
*X = *X + ((Y + 2)<< 2); /* << denotes Left Shift */ } intA=10; intB=20; chico(&A, B); cout << A; /*A<--10 */ /*B<--20 */ # program-3.asm .text # Load the first input : A li $t0, 10 # load integer number 10 # Store data into the stack for temporary data storage addi $sp, $sp, -4 sw $t0, 4($sp) # A stored in the first stack position # Read the second input : B li $t1, 20 move $a0, $sp move $a1, $t1 jal chico lw $t0, 4($sp) # Print out and exit. li $v0, 1 move $a0, $t2 syscall li $v0, 10 syscall add $t0, $a1, 2 sll $t0, $t0, 2 lw $t1, 4($a0) add $t1, $t1, $t0 sw $t1, 4($a0) jr $ra # load integer number 20 # load &A into a first register argument # ($a1 has the value of B) # call the function : chico # load the value A from memory # syscall number 1 -- print int # update the value J into $a0 for print # print the integer # syscall number 10 -- terminate the program # exit now # $t0 = Y + 2 ($a1 holds the value Y (B) # $t0 = (Y+2) << 2 # load the value *X from the memory address in $a0 # $t1 = *X + ((Y+2) << 2) # *X = $t1 (store $t1 in the memory where X points) The important things to ensure (and evaluate for), in this program are: 1. The s register should be used to hold the address of A, which is used across the function call. If you use the stack, then $sp is used for this purpose. 2. The total number of registers required are only 2, and they are only temporary registers - $t0, and $t1. No need to use S-registers for this, as no data is saved across the function call. 3. The use of the right arguments for the function call. 4. The data arguments to the function call must be passed as pointers, to load and store values into the memory location within the function. Question 9 [Assembling a program] Consider the following MIPS assembly code snippet. .data temp: .text Do_loop: For_loop: . word 4, 0, 1, 2, 0 addi $t0, $zero, 4 addi $t0, $t0, -1 la $s0, temp addi $s1, $zero, 0 sllv $s2, $s1, $s3 add $s2, $s0, $s2 lw $s3, 32($s2) beq $s3, $zero, Exit add $s3, $s3, $s3 sw $s3, 0($s2) addi $s1, $s1, 1 j For_loop bne $t0, $zero, Do_loop add $s5, $s1, $t0 # $t0 - -; # $s0 = &temp # $s2 = $s1 << $s3; # $s2 = &temp + $s2; # load $s3M[$s2+32]; (temp[$s2+32]) # if (temp[$s2+32] == 0) goto exit; # $s3 = 2 * $s3; # temp[$s2+32]$s3; Fill in the following table as you translate the above code into machine code in hexadecimal representation. Some part is done for your convenience. Assume that: the data segment starts at address: 0x800 and the text segment at 0x100. You are required to only fill the white boxes with appropriate hexadecimal values. 0x11c 0x120 addi$t0,$zero,4 addi $t0, $t0, -1 la $s0, temp addi $s1, $zero, 0 0xFFFF (-1) 0x0000 (0) 0x12 0x0 0x4 0x12 0x0 0x20 0x20 0x0005 (5) 0x13 0x0 0x20 0x00 Instruction Machine Code 0x13 0x11 add $s2, $s0, $s2 0x10 0x12 sllv $s2, $s1, $s3 lw $s3, 32($s2) beq $s3,$zero,exit 0x13 0x00 add $s3, $s3, $s3 sw $s3, 0($s2) addi $s1, $s1,1 j For_loop 0x13 0x13 0x12 0x13 0x11 0x11 0x01 0x110/4 = 0x44 bne $t0,$0,Do_loop 0x08 0x00 0xFFF4 (-12) 0x134 add $s5, $s1, $t0 Question 10 [Machine Language] Find the corresponding Hex code for the given assembly code add $s0, $a1, $t7 Since the above code is an R-type instruction we the first write down the R-type format with opcode, rs, rt, rd, shamt and function as shown below. Opcode register s register t register d shift amount function From the register chart rs corresponds to $a1, rt corresponds to $t7 and rd corresponds to $s0 whose values are 5, 15 and 16 respectively. Convert these into binary and plug into the corresponding columns. The R-type instructions have opcode as 000000 and function as 100000 and Shamt is also 00000. 000000 00101 01111 10000 00000 100000 Converting this 32 bit binary number 000000 00101 01111 10000 00000 100000 We get the answer as 0x00af8020 Question 11 [Instruction Set Architecture] Suppose we want to implement register indirect conditional branch (beqr) as a new pseudo-instruction. The format of the instruction is as follows: beqr $s, $t, $d If the value of register s is equals to that of register t, we want to jump to the address specified as the value of register d. Give a proposal for adding such pseudo-instruction to the ISA (i.e., describe how it could be encoded in the I-Type, R-Type, or J-Type format. Ans: One way to implement pseudo-instruction beqr $s0, $s1, $s2 is: bne $s0, $s1, # if (s0==s1) skip branch skip j $s2 # branch to $s2 skip: We can encode this instruction as an R-Type instruction, because it has three register parameters. We can use opcode 0, and any funct value that has not yet been assigned. Question 12 [Performance] The table below describes the performance of two processors, the rAlpha and the c86, and two compilers on a common 'benchmark' program. Processor GHz Compiler A Avg CPI 1.2 2.2 Which is the best compiler-machine combination? Ans: Using the formula gives us total time (in seconds per program) as follows: Compiler B rAlpha 3.4 c86 2.6 1500 Instructions 7000 Instructions 5000 1000 Avg CPI 1.5 4.0 Compiler A Compiler B 11 = 2.47×10−6 5000×1.5× = 2.21×10−6 = 1.54×10−6 = . × − 1000×4× Configuration with c86 and Compiler A is the best compiler-machine combination. Question 13 [Assembling a Program] Consider following snippet of the C program – struct bar_t { int number; int type[3]; struct bar_t bar[4]; a) If the starting address of bar[0] is 0x50060000, what is the address of the member type[0] of the variable bar[1]? Each object of the type of structure bar_t contains 4*4 or 16 bytes. The member type[0] of a variable bar[1] is located 16 + 4 i.e. 20 bytes far from the beginning address of the variable bar[0]. So, the address required is 0x50060014. b) If the starting address of bar[0] is stored in the register t0, what is the instruction to load the value of the member type[1] of the variable bar[2] in re 程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com