程序代写代做代考 assembly mips clock cache C compiler CS 2506 Computer Organization II MIPS 4: Branch and Memory

CS 2506 Computer Organization II MIPS 4: Branch and Memory
You may work in pairs for this assignment. If you choose to work with a partner, make sure only one of you submits a solution, and you paste a copy of the Partners Template that contains the names and PIDs of both students at the beginning of the file.
Always justify your answer; describe your reasoning clearly. No points will be given if a GTA cannot understand your description. This will help you
Branch Prediction: Misprediction Handling Branches 24
practice for the real world; your team/manager will judge you poorly if your team/manager cannot understand your written communication/documentation.
Unsimplified Datapath with Hazard DetePcipteioline: Independent Insts 3
What to submit: Submit a word document containing your answers. No late submissions will be accepted.
beq (taken)
CS@VT
Computer Organization II
CS@VT
Computer Organization II
Pipeline B: Resolve branch early + branch prediction
Pipeline A
Both Pipelines have a cycle time of 200ps. Fetch, Execute, and Memory stages take 200ps each. Writeback stage takes 100ps; it writes to the register file in the middle of clock cycle. Decode stage may take as little as 100ps.
You are permitted to work in pairs for this assignment! 1
sub

CS 2506 Computer Organization II MIPS 4: Branch and Memory
1. Implement J in a Pipeline
The pipelines on the cover page do not implement J. This problem asks you to implement J in Pipeline B. Assume the following:
• Pipeline B already has a Jump control output, just like the single-cycle CPU; this Jump control output is just floating (i.e., not connected to anything).
• There is a strict design requirement to not increase CPU cycle time.
• Due to area constraint, the design is limited to just a few logic gates.
1) List what current connection(s) to remove, if any.
2) List all new gates/module you want to add.
3) For EACH new gate/module, describe what EACH of its ports should connect to.
You are permitted to work in pairs for this assignment! 2

CS 2506 Computer Organization II MIPS 4: Branch and Memory
2. Control Hazards
The following assembly code is generated by compiling a C program to run on a single-cycle CPU:
1:
2:
3:
4: label1: 5:
6:
7:
8:
9: label2:
beq $0, $7, label1 #$7 is currently 1 beq $0, $0, label1
add $1, $2, $0
and $4, $5, $6
or $8, $9, $10
sub $11, $7, $0 beq $11, $0, label2 slt $7, $8, $9
add $10, $11, $12
a) After the original code is recompiled for pipeline A, how many cycles of penalty will pipeline A encounter due to software bubbles and/or instruction squash? Assume a simple compiler that uses NOPs to ensure correctness. Show your work if you want partial credit.
b) After the original code is recompiled for pipeline B, how many cycles of penalty will pipeline B encounter due to software bubbles and/or instruction squash? Assume a simple compiler than uses adds NOPs to ensure correctness. Show your work if you want partial credit.
You are permitted to work in pairs for this assignment! 3

CS 2506 Computer Organization II MIPS 4: Branch and Memory
3. Dynamic Branch Prediction
The following code creates a histogram of the number of true and false in a very large Boolean array. Currently, array has a repeating value of true, false, true, false, true, false… (These array values are unknown at compile time). histogram is an array with 2 elements.
for (int bucket=0; bucket<=1; bucket++){ for(int i=0;idata==list1Iter->data){
matchesFound++;
break; }
list2Iter = list2Iter ->next; }
list1Iter=list1Iter->next; }
a) For a 16KB fully-associative LRU cache with 32B/cacheline, what is the hit rate of the two nested loops as a whole? Only two digits of significance are required for your answer.
b) For a 16KB fully-associative LRU cache with 8B/cacheline, what is the hit rate of the two nested loops as a whole? Only two digits of significance are required for your answer.
c) For a 16KB fully-associative LRU cache with 64B/cacheline, how would you optimize the nest loops to achieve >90% hit rate? After your optimization, the result of matchesFound should still be same as before. Show your optimized code.
You are permitted to work in pairs for this assignment! 7
MIPS 4: Branch and Memory