程序代写代做代考 mips CS232 Section 6: Solutions

CS232 Section 6: Solutions
Problems
1. The following code is run on a 5-stage MIPS pipeline with full forwarding.
# Original code
lw $t0, 0($a0)
addi $t0, $t0, 1
sw $t0, 0($a0)
addi $a0, $a0, 4
# Solution to 1(b)
lw $t0, 0($a0)
addi $a0, $a0, 4
addi $t0, $t0, 1
sw $t0, -4($a0)
(a) Identify all the data hazards that cannot be resolved with forwarding.
Solution: The load-use hazard between the first two instructions cannot be resolved with forwarding. (b) Rewrite the code to eliminate stalls on the 5-stage pipeline with full forwarding.
Solution: See above.
2. The target of a j instruction (unconditional jump) is computed in the ID stage of the 5-stage MIPS pipeline. How many instructions must be flushed each time a j instruction is executed? Draw a pipeline diagram to support your answer.
Solution: One instruction, as shown in the pipeline diagram below.
3. The following code is run on the 5-stage MIPS pipeline with full forwarding. Branch targets and decisions are resolved in the ID stage, and branches are always predicted not taken.
1
2
3
4
5
6
7
j
IF
ID
EX
MEM
WB
PC+4
IXF
target
IF
ID
EX
MEM
WB
# a0 = head node of list ## do-while style for part (b)
li $v0, 0 #a0
= head node of list
$v0, 0
loop:
beq $a0, $0, done
lw $t0, 0($a0)
add $v0, $v0, $t0
lw $a0, 4($a0)
j loop
done:
# other stuff
li loop:
# restructured code (see Note 2)
loop:
lw $t0, 0($a0)
lw $a0, 4($a0)
add $v0, $v0, $t0
bne $a0, $0, loop
lw
add $v0, $v0, $t0
lw $a0, 4($a0)
bne $a0, $0, loop
# other stuff
$t0, 0($a0)
(a) If the linked list has length n, how many cycles will the above code take to execute?
Solution: The total number of instructions executed will be 1 + 5n + 1 = 5n + 2, which requires 5n + 6 cycles on the 5-stage pipeline (not counting stalls and flushes). In each iteration of the loop, we have one stall due to the load-use hazard between lw and add, and one flush due to the j instruction. (There is no hazard between the second lw and the beq at the start of the next iteration, because of the j instruction and its flush.) Also the beq is correctly predicted not taken for n iterations (no flushes), and incorrectly predicted once (1 flush) at the end. Thus, the total number of cycles needed to execute the above code is 5n + 6 + 2n + 1 = 7n + 7 cycles.
Note 1: The load-use hazard is easy to eliminate: switch the order of the add and second lw instructions. With this change, the code takes 6n + 7 cycles.
(b) Given that n > 0, would it be beneficial to convert this into a do-while style loop?
Solution: Refer to the modified code above. The total number of instructions executed will be 1 + 4n, which requires 4n + 5 cycles (not counting stalls and flushes). In each iteration of the loop, we have one stall due to the load-use hazard between lw and add, two stalls due to the load-use hazard between the second lw and the bne (because branches decisions are resolved in the ID stage!), and one flush (except in the final iteration) because the bne is usually taken. Thus the total number of cycles needed to execute the above code is 4n+5+n+2n+n−1 = 8n+4, which is worse than the while-style code when n is large. Hence, it is not beneficial to convert the code.
Note 2: If we restructure the code as shown above, we need to stall just once per loop iteration, plus incur one flush per iteration (except the last), to yield a total of 4n+5+n+n−1 = 6n+4 cycles, which is only slightly better than the original version.
1