代写 R MIPS assembly We will examine a simplified MIPS implementation first, and then produce a more realistic pipelined version.

We will examine a simplified MIPS implementation first, and then produce a more realistic pipelined version.
A simple, representative subset of machine instructions, shows most aspects:
– Memory reference: lw, sw
– Arithmetic/logical: add, sub, and, or, slt
– Transfer of control: beq, j
R
I J
op
rs
rt
rd
shamt
funct
op
rs
rt
16-bit immediate
op
26-bit immediate
313029282726252423222120191817161514131211109 8 7 6 5 4 3 2 1 0
Introduction Datapath Design 1
CS@VT Computer Organization II ©2005-2016 McQuain

I PC  instruction memory, fetch instruction
II Register numbers  register file, read registers to get operand(s)
III Depending on instruction class
– May use ALU to calculate needed value
– R-type:
– Load/store:
– Branch:
need result of specified operation
need memory address to be read from/written to
need to compare registers AND need the branch target address
– May access data memory
– Load/store: access data memory to read/write value
– Set address for next instruction fetch: PC  branch target OR PC + 4 OR jump target
Instruction Execution Datapath Design 2
CS@VT Computer Organization II ©2005-2016 McQuain

Here’s what we need for sequential fetches (no beq or j):
increment by 4 for next instruction (for now)
32-bit register storing address of instruction to fetch next
assume there are separate storage for instructions and data (for now)
Executing Instruction Fetch Datapath Design 3
CS@VT Computer Organization II ©2005-2016 McQuain

Read two operands from register file
Use ALU to perform the arithmetic/logical operation Write the result to a register
GPR[rd] = GPR[rs] funct GPR[rt]
op
rs
rt
rd
shamt
funct
Register numbers
Control
Operands
Operation results
Register file
Executing R-Format Instructions Datapath Design 4
CS@VT Computer Organization II ©2005-2016 McQuain

GPR[rd] = GPR[rs] funct GPR[rt]
op
rs
rt
rd
shamt
funct
rs and rt specify where operands are
add, sub, and, or, slt
ALU result goes to register file
Control tells ALU Control to use the funct bits… and sets RegWrite to 1
ALU applies specified operation to operands
rd specifies where result goes
ALU Control sets ALU to correct operation
Executing R-Format Instructions Datapath Design 5
CS@VT Computer Organization II ©2005-2016 McQuain

Readregisteroperand GPR[rt] = Mem[GPR[rs] + imm] Calculate the address to be read using register operand and 16-bit offset from instruction
– Use ALU, but sign-extend offset …
op
rs
rt
16-bit immediate
Executing Load Instructions Datapath Design 6
CS@VT Computer Organization II ©2005-2016 McQuain

… GPR[rt] = Mem[GPR[rs] + imm] Read memory and update register
op
rs
rt
16-bit immediate
Executing Load Instructions Datapath Design 7
CS@VT Computer Organization II ©2005-2016 McQuain

Executing Load Instructions Datapath Design 8 GPR[rt] = Mem[GPR[rs] + imm]
op rs
rt 16-bit immediate
rs specifies where operand is
rt specifies where result goes
lw
16-bit immediate is extended
CS@VT
Computer Organization II
©2005-2016 McQuain
Control tells ALU Control to add operands and sets RegWrite to 1 and MemRead to 1
ALU computes address to read
Value is retrieved from memory location and sent to register file

Executing Store Instructions Datapath Design 9 Mem[GPR[rs] + imm] = GPR[rt]
op
rs rt
16-bit immediate
Control tells ALU Control to add operands, and sets MemWrite to 1
ALU computes address to read
rs specifies where operand is
rt specifies where data comes from
sw
16-bit immediate is extended
Value is retrieved from register file and sent to memory
CS@VT
Computer Organization II
©2005-2016 McQuain

In order to produce a complete datapath design, we must identify and deal with any conflicts.
First, consider the specification of the register numbers supplied to the register file. They come from the current instruction, but using which bits?
Read reg 1
Read reg 2
Write reg
R-type
25:21
20:16
15:11
load
25:21
not used
20:16
store
25:21
20:16
not used
OK OK Conflict We have a conflict regarding the write register.
To resolve the conflict, we must be able to select one set of bits for R-type instructions and a different set of bits for the load instructions… how do we make a selection in hardware?
Unifying the Designs Datapath Design 10
CS@VT Computer Organization II ©2005-2016 McQuain

We also have a conflicts regarding the source of the write data and the source of the right (lower) operand to the ALU:
Write data source
Right operand source
R-type
ALU output
register file
load
Data memory
sign-extend
store
not used
sign-extend
Conflict Conflict
To resolve these conflicts, we must be able to:
– send the ALU output to the register file for R-type instructions, but send the data read from the memory unit to the register file for load instructions
– send the data read from the register file to the ALU for R-type instructions, but send the output from the sign-extender to the ALU for load and store instructions
Unifying the Designs Datapath Design 11
CS@VT Computer Organization II ©2005-2016 McQuain

Unified R-type/Load/Store Datapath Datapath Design 12
Multiplexor to select write register number
CS@VT
Computer Organization II
©2005-2016 McQuain
By adding three multiplexors, we can resolve the conflicts that we have identified and produce a design that will (almost) handle the R-type, load and store instructions we are considering.
Multiplexor to select right operand to ALU
Multiplexor to select write data for register file
add, sub, and, or, slt, lw, sw

We’ve identified quite a few necessary control signals in our design.
Which ones are two-valued? Do any require more than two values?
How should they be set?
The datapath cannot operate correctly unless every control signal is managed properly. Two things to remember:
– Every line always carries a value. We may not know (or care) what it is in some cases. But there is always a value there.
– It doesn’t matter what value a line carries if that value is never stored or used to make a decision. (Hence, we will find that we have don’t-care conditions.)
Analyzing the Control Signals Datapath Design 13
CS@VT Computer Organization II ©2005-2016 McQuain

R-type and load instructions require writing a value into a register, but the store instruction does not.
Writing a value modifies the state of the system, so it is not a don’t-care. So, we must manage the RegWrite signal accordingly:
RegWrite
R-type
1
load
1
store
0
Value at Write data is written to the Write register iff RegWrite is 1.
Register File Control Datapath Design 14
CS@VT Computer Organization II ©2005-2016 McQuain

MemRead
– must be 1 for load instructions, since they copy a value from memory to a register – might be a don’t-care for R-type and store instructions… why? If not, should be 0.
MemWrite
– must be 1 for store instructions, since they copy a value from a register to memory
– must be 0 for R-type and load instructions; otherwise they could modify a memory
value that should not be changed
MemRead
MemWrite
R-type
?
0
load
1
0
store
?
1
Data Memory Control Datapath Design 15
CS@VT Computer Organization II ©2005-2016 McQuain

RegDst
– must be 0 for load instructions
– must be 1 for R-type instructions
– don’t-care for store instructions (why?)
ALUSrc
– must be 0 for R-type instructions
– must be 1 for load and store instructions
MemtoReg
– must be 1 for load instructions
– must be 0 for R-type instructions
– don’t-care for store instructions (why?)
see slide 10
see slide 11
see slide 15
Multiplexor Control Datapath Design 16
CS@VT Computer Organization II ©2005-2016 McQuain

There are a lot of control signals, even in our simple datapath.
At this point, almost all of them are single-bit signals (i.e., they make a choice between two alternate actions).
The ALU control needs to be different because there are more than two choices for what it will actually do:
ALU
R-type add sub and or slt
add subtract and or
slt
load
add
store
add
So, the ALU will require a multi-bit control signal… why? How many bits?
ALU Control Datapath Design 17
CS@VT Computer Organization II ©2005-2016 McQuain

This suggests separating the control logic into two modules:
– a master control module that determines the type of instruction being executed and sets the non-ALU control signals
– a secondary control module that manages the interface of the ALU itself
The master module will send :
– a specific selector pattern for the ALU if the instruction is not R-type; e.g., it sends
the ADD selector pattern for lw and sw instructions
– a flag telling the secondary module to analyze the funct bits if the instruction is R-
type
We’ll fill in the details of the two modules later, but for now we do know what each must do, at a high level.
ALU Control Datapath Design 18
CS@VT Computer Organization II ©2005-2016 McQuain

Unified Datapath Design with Control Datapath Design 19
CS@VT Computer Organization II ©2005-2016 McQuain
add, sub, and, or, slt, lw, sw

Read two operands from the register file
Use the ALU to compare the operands: subtract and check Zero signal …
op
rs
rt
16-bit immediate
if GPR[rs] == GPR[rt] then
PC = PC + 4 + (imm << 2) Executing Branch Instructions Datapath Design 20 CS@VT Computer Organization II ©2005-2016 McQuain ... Calculate the branch target address: - Sign-extend displacement (immediate from instruction) - Shift left 2 places (MIPS uses word displacement – why?) - Add to PC + 4 (already calculated PC + 4 during the instruction fetch) Send computed branch target address to the PC (if taken) op rs rt 16-bit immediate if GPR[rs] == GPR[rt] then PC = PC + 4 + (imm << 2) Executing Branch Instructions Datapath Design 21 CS@VT Computer Organization II ©2005-2016 McQuain Examine the display below of a short MIPS assembly program, and the addresses at which the instructions will be loaded into memory when the program is executed: address assembly source code --------------------------------------------- 0x00400000 0x00400004 0x00400008 0x0040000C 0x00400010 0x00400014 0x00400018 0x0040001C 0x00400020 0x00400024 main: repeat: check: exit: .text addi $s0, $zero, 10 addi $s1, $zero, 20 j $check sub $s3, $s1, $s0 sra $s3, $s3, 1 beq $s3, $zero, exit add $s0, $s0, $s3 bne $s0, $s1, repeat li $v0, 10 syscall If taken, a conditional branch causes the PC to be set to the address of the instruction that's the target of the branch: Aside: Branch Target Address Datapath Design 22 CS@VT Computer Organization II ©2005-2016 McQuain address assembly source code --------------------------------------------- 0x00400000 0x00400004 0x00400008 0x0040000C 0x00400010 0x00400014 0x00400018 0x0040001C 0x00400020 0x00400024 main: repeat: check: exit: .text addi $s0, $zero, 10 addi $s1, $zero, 20 j $check sub $s3, $s1, $s0 sra $s3, $s3, 1 beq $s3, $zero, exit add $s0, $s0, $s3 bne $s0, $s1, repeat li $v0, 10 syscall First of all, note that all the addresses are multiples of 4*. Therefore, the "distance" between two instructions is always a multiple of 4. *QTP: why isn't this surprising? Aside: Branch Target Address Datapath Design 23 CS@VT Computer Organization II ©2005-2016 McQuain The immediate we store for a beq instruction is (almost) the "distance" from the beq instruction to the instruction that is the target of the branch. address assembly source code --------------------------------------------- ... 0x0040000C repeat: sub $s3, $s1, $s0 ... 0x0040001C check: bne $s0, $s1, repeat ... Here, that "distance" is 0x0C – 0x1C = – 0x10. In 16-bit 2's complement, that's represented as: 11111111 11110000 Now, recall that the base-2 representation of a multiple of 4 ends in 00. If the "distance" always ends in 00, there's no reason to store that 00 in the instruction... Aside: Branch Target Address Datapath Design 24 CS@VT Computer Organization II ©2005-2016 McQuain Do we gain anything if we don't store those two 0's in the instruction? address assembly source code --------------------------------------------- ... 0x0040000C repeat: sub $s3, $s1, $s0 ... 0x0040001C check: bne $s0, $s1, repeat ... Yes. We can store a 16-bit "distance" in the instruction, but effectively use an 18-bit "distance" at runtime. That means that a conditional branch can "branch" farther... which is a gain. Aside: Branch Target Address Datapath Design 25 CS@VT Computer Organization II ©2005-2016 McQuain But, there's more... address assembly source code --------------------------------------------- ... 0x0040000C repeat: sub $s3, $s1, $s0 ... 0x0040001C check: bne $s0, $s1, repeat ... It takes time to evaluate the difference of the two registers; we can do other, useful things in the meantime, like calculate PC + 4: - If we don't take the branch, we need that address to fetch the next instruction. - If we do take the branch, we wasted no time computing PC + 4 along the way. But... that means the "distance" we'll store is actually from PC + 4 to the branch target. Aside: Branch Target Address Datapath Design 26 CS@VT Computer Organization II ©2005-2016 McQuain Now, consider the machine code for our example: target address machine code ------------------------------------------------------- ... 0x0040000C . . . -0x10 0x0040001C 000101 10000 10001 1111111111111100 ... bne The immediate in the beq instruction is: 1111111111111100 That's -4. If we shift that left by 2 bits (multiply by 4), we get -16. If we add -16 to the address 0x0040001C, we get 0x0040000C, which is the target address we need. Aside: Branch Target Address Datapath Design 27 CS@VT Computer Organization II ©2005-2016 McQuain Executing Branch Instructions Datapath Design 28 op rs rt rs and rt specify where operands are 16-bit immediate Control tells ALU to subtract operands, sets Branch signal Immediate is sign-extended and shifted AND gate controls which address goes to the PC ALU subtracts and sets Zero beq CS@VT Computer Organization II ©2005-2016 McQuain Adder computes branch target PC = (PC+4)[31:28]|(IR[25:0] << 2) - Shift left 2 places (just as with branch target address calculation) - Concatenate with PC + 4[31:28] (already calculated PC + 4 during the instruction fetch) Send computed jump target address to the PC Calculate the jump target address: from PC+4 p31p30p29p28 i25i24i23i22...i3i2i1i0 00 op 26-bit immediate Executing Jump Instructions Datapath Design 29 CS@VT Computer Organization II ©2005-2016 McQuain The calculation of the jump target address is similar to the calculation of the branch target address (beq). 1111111111111111111111111111 ... 0000000000000000000000000000 1111111111111111111111111111 ... 0000000000000000000000000000 1111111111111111111111111111 ... 0000000000000000000000000000 1111111111111111111111111111 ... 0000000000000000000000000000 1111111111111111111111111111 ... 0000000000000000000000000000 1111111111111111111111111111 ... 0000000000000000000000000000 1111111111111111111111111111 ... 0000000000000000000000000000 The MIPS model is to view memory as a sequence of 256 MB segments. 1111 1110 1101 The starting address of the current memory segment is given by the high 4 bits of PC + 4.* Now, 256 MB can be addressed by 28-bit addresses.* For jump (j) instructions, the 26-bit immediate is shifted 2 bits to the left, yielding a 28-bit offset, which is then added to the 0010 starting address of the current memory segment. *QTP: why? 0001 0000 4GB address range Aside: Jump Target Address Datapath Design 30 CS@VT Computer Organization II ©2005-2016 McQuain ... op 26-bit immediate Concatenate to form 32-bit address for jump MUX passes jump address back to PC High 4 bits of PC + 4 specify "segment" containing this code. Calculate address of next sequential instruction Control sets Jump signal so that the jump address will be used to fetch the next instruction Executing Jump Instructions Datapath Design 31 CS@VT Computer Organization II ©2005-2016 McQuain Unified Datapath Datapath Design 32 add, sub, and, or, slt, lw, sw, beq, j CS@VT Computer Organization II ©2005-2016 McQuain The unified datapath that we have designed: - illustrates many of the logical issues that must be solved in designing any datapath - can be extended to support additional instructions (easily for some, less so for others) - is fundamentally unsatisfactory in that it requires a single clock cycle be long enough for every path within the datapath to stabilize before the next instruction is fetched We may explore the second issue in exercises. The third issue can only be dealt with by transforming the design to incorporate a pipeline. Summary Datapath Design 33 CS@VT Computer Organization II ©2005-2016 McQuain