1.
1) pipelined: The clock cycle time equals the biggest stage latency,
so it is 210ps.
non-pipelined: The clock cycle time is the sum of all stage latencies,
so it is 200 + 100 + 120 + 210 + 150 = 780ps.
2) LW uses all 5 stages.
pipelined: 210 * 5 = 1050ps
non-pipelined: 200 + 100 + 120 + 210 + 150 = 780ps
3) I would split the MEM stage, because it has the largest latency (210ps).
After the split, the largest latency is the IF stage at 200ps, so
the new cycle time is 200ps.
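The arithmetic in parts 1) to 3) can be checked with a short Python sketch. It assumes the five latencies map to IF, ID, EX, MEM, WB in the order they appear in the sum (only IF = 200ps and MEM = 210ps are confirmed by the answers above), and that the split divides the slowest stage into two equal halves. The function names are illustrative only.

```python
# Stage latencies in ps. IF = 200 and MEM = 210 follow from the answers above;
# the ID, EX and WB assignments are assumed from the order of the sum.
stages = {"IF": 200, "ID": 100, "EX": 120, "MEM": 210, "WB": 150}

def pipelined_cycle(latencies):
    """Pipelined clock period: the slowest stage sets the cycle time."""
    return max(latencies)

def single_cycle(latencies):
    """Non-pipelined clock period: one cycle covers every stage."""
    return sum(latencies)

def split_slowest(stages):
    """Return a new dict with the slowest stage split into two equal halves
    (an even split is an assumption, not something the exercise guarantees)."""
    slowest = max(stages, key=stages.get)
    result = {}
    for name, latency in stages.items():
        if name == slowest:
            result[name + "1"] = latency / 2
            result[name + "2"] = latency / 2
        else:
            result[name] = latency
    return result

lat = list(stages.values())
print(pipelined_cycle(lat))                              # 210
print(single_cycle(lat))                                 # 780
print(pipelined_cycle(lat) * len(lat))                   # 1050 (LW, pipelined)
print(pipelined_cycle(split_slowest(stages).values()))   # 200
```

Splitting MEM only helps down to 200ps because IF then becomes the limiting stage.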
2.
1) The first 2 instructions have a load-use hazard between them,
and it cannot be resolved with forwarding alone; a stall cycle is still needed.
2)
lw $t0, 0($a0)
addi $a0, $a0, 4
addi $t0, $t0, 1
sw $t0, -4($a0)
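As a rough check that the reordered sequence avoids the stall, the following Python sketch scans for a load whose destination register is read by the very next instruction. The tuple encoding and the function name `load_use_hazards` are assumptions made for illustration, and the `hypothetical` sequence is a made-up example of a hazard, not the original code from the exercise.

```python
# Each instruction: (opcode, destination register or None, source registers).
# This is the reordered sequence given in the answer above.
reordered = [
    ("lw",   "$t0", ["$a0"]),
    ("addi", "$a0", ["$a0"]),
    ("addi", "$t0", ["$t0"]),
    ("sw",   None,  ["$t0", "$a0"]),
]

# Hypothetical two-instruction example that does contain a load-use hazard.
hypothetical = [
    ("lw",   "$t0", ["$a0"]),
    ("addi", "$t0", ["$t0"]),
]

def load_use_hazards(program):
    """Return index pairs (i, i+1) where instruction i is a load whose
    destination is read by the immediately following instruction."""
    hazards = []
    for i in range(len(program) - 1):
        op, dest, _ = program[i]
        _, _, next_sources = program[i + 1]
        if op == "lw" and dest in next_sources:
            hazards.append((i, i + 1))
    return hazards

print(load_use_hazards(reordered))     # []
print(load_use_hazards(hypothetical))  # [(0, 1)]
```

Moving the `addi $a0, $a0, 4` between the load and its consumer is what fills the slot; the store offset becomes -4 because `$a0` has already been incremented.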
3.
1) l2 depends on l1 ($R1)
l3 depends on l1 ($R1)
l4 depends on l2 ($R2)
l4 depends on l3 ($R1)
2) without forwarding:
between l1 and l2
between l1 and l3
between l2 and l4
between l3 and l4
with forwarding:
between l2 and l4
between l3 and l4
3) without forwarding:
between l1 and l2
between l1 and l3
between l2 and l4
between l3 and l4
with forwarding:
between l2 and l4
between l3 and l4
4.
1)
2)
Possible forwarding situation: ALU can directly send result to the next instruction.
Possible hazard: load-use hazard.