We will examine a simplified MIPS implementation first, and then produce a more realistic pipelined version.
A simple, representative subset of machine instructions, shows most aspects:
– Memory reference: lw, sw
– Arithmetic/logical: add, sub, and, or, slt
– Transfer of control: beq, j
R
I J
op
rs
rt
rd
shamt
funct
op
rs
rt
16-bit immediate
op
26-bit immediate
313029282726252423222120191817161514131211109 8 7 6 5 4 3 2 1 0
Introduction Datapath Design 1
CS@VT Computer Organization II ©2005-2016 McQuain
I PC instruction memory, fetch instruction
II Register numbers register file, read registers to get operand(s)
III Depending on instruction class
– May use ALU to calculate needed value
– R-type:
– Load/store:
– Branch:
need result of specified operation
need memory address to be read from/written to
need to compare registers AND need the branch target address
– May access data memory
– Load/store: access data memory to read/write value
– Set address for next instruction fetch: PC branch target OR PC + 4 OR jump target
Instruction Execution Datapath Design 2
CS@VT Computer Organization II ©2005-2016 McQuain
Here’s what we need for sequential fetches (no beq or j):
increment by 4 for next instruction (for now)
32-bit register storing address of instruction to fetch next
assume there are separate storage for instructions and data (for now)
Executing Instruction Fetch Datapath Design 3
CS@VT Computer Organization II ©2005-2016 McQuain
Read two operands from register file
Use ALU to perform the arithmetic/logical operation Write the result to a register
GPR[rd] = GPR[rs] funct GPR[rt]
op
rs
rt
rd
shamt
funct
Register numbers
Control
Operands
Operation results
Register file
Executing R-Format Instructions Datapath Design 4
CS@VT Computer Organization II ©2005-2016 McQuain
GPR[rd] = GPR[rs] funct GPR[rt]
op
rs
rt
rd
shamt
funct
rs and rt specify where operands are
add, sub, and, or, slt
ALU result goes to register file
Control tells ALU Control to use the funct bits… and sets RegWrite to 1
ALU applies specified operation to operands
rd specifies where result goes
ALU Control sets ALU to correct operation
Executing R-Format Instructions Datapath Design 5
CS@VT Computer Organization II ©2005-2016 McQuain
Readregisteroperand GPR[rt] = Mem[GPR[rs] + imm] Calculate the address to be read using register operand and 16-bit offset from instruction
– Use ALU, but sign-extend offset …
op
rs
rt
16-bit immediate
Executing Load Instructions Datapath Design 6
CS@VT Computer Organization II ©2005-2016 McQuain
… GPR[rt] = Mem[GPR[rs] + imm] Read memory and update register
op
rs
rt
16-bit immediate
Executing Load Instructions Datapath Design 7
CS@VT Computer Organization II ©2005-2016 McQuain
Executing Load Instructions Datapath Design 8 GPR[rt] = Mem[GPR[rs] + imm]
op rs
rt 16-bit immediate
rs specifies where operand is
rt specifies where result goes
lw
16-bit immediate is extended
CS@VT
Computer Organization II
©2005-2016 McQuain
Control tells ALU Control to add operands and sets RegWrite to 1 and MemRead to 1
ALU computes address to read
Value is retrieved from memory location and sent to register file
Executing Store Instructions Datapath Design 9 Mem[GPR[rs] + imm] = GPR[rt]
op
rs rt
16-bit immediate
Control tells ALU Control to add operands, and sets MemWrite to 1
ALU computes address to read
rs specifies where operand is
rt specifies where data comes from
sw
16-bit immediate is extended
Value is retrieved from register file and sent to memory
CS@VT
Computer Organization II
©2005-2016 McQuain
In order to produce a complete datapath design, we must identify and deal with any conflicts.
First, consider the specification of the register numbers supplied to the register file. They come from the current instruction, but using which bits?
Read reg 1
Read reg 2
Write reg
R-type
25:21
20:16
15:11
load
25:21
not used
20:16
store
25:21
20:16
not used
OK OK Conflict We have a conflict regarding the write register.
To resolve the conflict, we must be able to select one set of bits for R-type instructions and a different set of bits for the load instructions… how do we make a selection in hardware?
Unifying the Designs Datapath Design 10
CS@VT Computer Organization II ©2005-2016 McQuain
We also have a conflicts regarding the source of the write data and the source of the right (lower) operand to the ALU:
Write data source
Right operand source
R-type
ALU output
register file
load
Data memory
sign-extend
store
not used
sign-extend
Conflict Conflict
To resolve these conflicts, we must be able to:
– send the ALU output to the register file for R-type instructions, but send the data read from the memory unit to the register file for load instructions
– send the data read from the register file to the ALU for R-type instructions, but send the output from the sign-extender to the ALU for load and store instructions
Unifying the Designs Datapath Design 11
CS@VT Computer Organization II ©2005-2016 McQuain
Unified R-type/Load/Store Datapath Datapath Design 12
Multiplexor to select write register number
CS@VT
Computer Organization II
©2005-2016 McQuain
By adding three multiplexors, we can resolve the conflicts that we have identified and produce a design that will (almost) handle the R-type, load and store instructions we are considering.
Multiplexor to select right operand to ALU
Multiplexor to select write data for register file
add, sub, and, or, slt, lw, sw
We’ve identified quite a few necessary control signals in our design.
Which ones are two-valued? Do any require more than two values?
How should they be set?
The datapath cannot operate correctly unless every control signal is managed properly. Two things to remember:
– Every line always carries a value. We may not know (or care) what it is in some cases. But there is always a value there.
– It doesn’t matter what value a line carries if that value is never stored or used to make a decision. (Hence, we will find that we have don’t-care conditions.)
Analyzing the Control Signals Datapath Design 13
CS@VT Computer Organization II ©2005-2016 McQuain
R-type and load instructions require writing a value into a register, but the store instruction does not.
Writing a value modifies the state of the system, so it is not a don’t-care. So, we must manage the RegWrite signal accordingly:
RegWrite
R-type
1
load
1
store
0
Value at Write data is written to the Write register iff RegWrite is 1.
Register File Control Datapath Design 14
CS@VT Computer Organization II ©2005-2016 McQuain
MemRead
– must be 1 for load instructions, since they copy a value from memory to a register – might be a don’t-care for R-type and store instructions… why? If not, should be 0.
MemWrite
– must be 1 for store instructions, since they copy a value from a register to memory
– must be 0 for R-type and load instructions; otherwise they could modify a memory
value that should not be changed
MemRead
MemWrite
R-type
?
0
load
1
0
store
?
1
Data Memory Control Datapath Design 15
CS@VT Computer Organization II ©2005-2016 McQuain
RegDst
– must be 0 for load instructions
– must be 1 for R-type instructions
– don’t-care for store instructions (why?)
ALUSrc
– must be 0 for R-type instructions
– must be 1 for load and store instructions
MemtoReg
– must be 1 for load instructions
– must be 0 for R-type instructions
– don’t-care for store instructions (why?)
see slide 10
see slide 11
see slide 15
Multiplexor Control Datapath Design 16
CS@VT Computer Organization II ©2005-2016 McQuain
There are a lot of control signals, even in our simple datapath.
At this point, almost all of them are single-bit signals (i.e., they make a choice between two alternate actions).
The ALU control needs to be different because there are more than two choices for what it will actually do:
ALU
R-type add sub and or slt
add subtract and or
slt
load
add
store
add
So, the ALU will require a multi-bit control signal… why? How many bits?
ALU Control Datapath Design 17
CS@VT Computer Organization II ©2005-2016 McQuain
This suggests separating the control logic into two modules:
– a master control module that determines the type of instruction being executed and sets the non-ALU control signals
– a secondary control module that manages the interface of the ALU itself
The master module will send :
– a specific selector pattern for the ALU if the instruction is not R-type; e.g., it sends
the ADD selector pattern for lw and sw instructions
– a flag telling the secondary module to analyze the funct bits if the instruction is R-
type
We’ll fill in the details of the two modules later, but for now we do know what each must do, at a high level.
ALU Control Datapath Design 18
CS@VT Computer Organization II ©2005-2016 McQuain
Unified Datapath Design with Control Datapath Design 19
CS@VT Computer Organization II ©2005-2016 McQuain
add, sub, and, or, slt, lw, sw
Read two operands from the register file
Use the ALU to compare the operands: subtract and check Zero signal …
op
rs
rt
16-bit immediate
if GPR[rs] == GPR[rt] then
PC = PC + 4 + (imm << 2)
Executing Branch Instructions Datapath Design 20
CS@VT Computer Organization II ©2005-2016 McQuain
...
Calculate the branch target address:
- Sign-extend displacement (immediate from instruction)
- Shift left 2 places (MIPS uses word displacement – why?)
- Add to PC + 4 (already calculated PC + 4 during the instruction fetch)
Send computed branch target address to the PC (if taken)
op
rs
rt
16-bit immediate
if GPR[rs] == GPR[rt] then
PC = PC + 4 + (imm << 2)
Executing Branch Instructions Datapath Design 21
CS@VT Computer Organization II ©2005-2016 McQuain
Examine the display below of a short MIPS assembly program, and the addresses at which the instructions will be loaded into memory when the program is executed:
address assembly source code ---------------------------------------------
0x00400000
0x00400004
0x00400008
0x0040000C 0x00400010 0x00400014 0x00400018 0x0040001C
0x00400020
0x00400024
main:
repeat:
check: exit:
.text
addi $s0, $zero, 10
addi $s1, $zero, 20
j $check
sub $s3, $s1, $s0 sra $s3, $s3, 1
beq $s3, $zero, exit add $s0, $s0, $s3 bne $s0, $s1, repeat
li $v0, 10
syscall
If taken, a conditional branch causes the PC to be set to the address of the instruction that's the target of the branch:
Aside: Branch Target Address Datapath Design 22
CS@VT Computer Organization II ©2005-2016 McQuain
address assembly source code ---------------------------------------------
0x00400000
0x00400004
0x00400008
0x0040000C 0x00400010 0x00400014 0x00400018 0x0040001C
0x00400020
0x00400024
main:
repeat:
check: exit:
.text
addi $s0, $zero, 10
addi $s1, $zero, 20
j $check
sub $s3, $s1, $s0 sra $s3, $s3, 1
beq $s3, $zero, exit add $s0, $s0, $s3 bne $s0, $s1, repeat
li $v0, 10
syscall
First of all, note that all the addresses are multiples of 4*.
Therefore, the "distance" between two instructions is always a multiple of 4.
*QTP: why isn't this surprising?
Aside: Branch Target Address Datapath Design 23
CS@VT Computer Organization II ©2005-2016 McQuain
The immediate we store for a beq instruction is (almost) the "distance" from the beq instruction to the instruction that is the target of the branch.
address assembly source code --------------------------------------------- ...
0x0040000C repeat: sub $s3, $s1, $s0 ...
0x0040001C check: bne $s0, $s1, repeat ...
Here, that "distance" is 0x0C – 0x1C = – 0x10. In 16-bit 2's complement, that's represented as:
11111111 11110000
Now, recall that the base-2 representation of a multiple of 4 ends in 00.
If the "distance" always ends in 00, there's no reason to store that 00 in the instruction...
Aside: Branch Target Address Datapath Design 24
CS@VT Computer Organization II ©2005-2016 McQuain
Do we gain anything if we don't store those two 0's in the instruction?
address assembly source code --------------------------------------------- ...
0x0040000C repeat: sub $s3, $s1, $s0 ...
0x0040001C check: bne $s0, $s1, repeat ...
Yes. We can store a 16-bit "distance" in the instruction, but effectively use an 18-bit "distance" at runtime.
That means that a conditional branch can "branch" farther... which is a gain.
Aside: Branch Target Address Datapath Design 25
CS@VT Computer Organization II ©2005-2016 McQuain
But, there's more...
address assembly source code --------------------------------------------- ...
0x0040000C repeat: sub $s3, $s1, $s0 ...
0x0040001C check: bne $s0, $s1, repeat ...
It takes time to evaluate the difference of the two registers; we can do other, useful things in the meantime, like calculate PC + 4:
- If we don't take the branch, we need that address to fetch the next instruction.
- If we do take the branch, we wasted no time computing PC + 4 along the way.
But... that means the "distance" we'll store is actually from PC + 4 to the branch target.
Aside: Branch Target Address Datapath Design 26
CS@VT Computer Organization II ©2005-2016 McQuain
Now, consider the machine code for our example:
target
address machine code ------------------------------------------------------- ...
0x0040000C
. . . -0x10
0x0040001C 000101 10000 10001 1111111111111100 ...
bne
The immediate in the beq instruction is: 1111111111111100 That's -4. If we shift that left by 2 bits (multiply by 4), we get -16.
If we add -16 to the address 0x0040001C, we get 0x0040000C, which is the target address we need.
Aside: Branch Target Address Datapath Design 27
CS@VT Computer Organization II ©2005-2016 McQuain
Executing Branch Instructions Datapath Design 28
op rs
rt
rs and rt specify where operands are
16-bit immediate
Control tells ALU to subtract operands, sets Branch signal
Immediate is sign-extended and shifted
AND gate controls which address goes to the PC
ALU subtracts and sets Zero
beq
CS@VT Computer Organization II
©2005-2016 McQuain
Adder computes branch target
PC = (PC+4)[31:28]|(IR[25:0] << 2)
- Shift left 2 places (just as with branch target address calculation)
- Concatenate with PC + 4[31:28] (already calculated PC + 4 during the instruction fetch) Send computed jump target address to the PC
Calculate the jump target address:
from PC+4
p31p30p29p28 i25i24i23i22...i3i2i1i0 00
op
26-bit immediate
Executing Jump Instructions Datapath Design 29
CS@VT Computer Organization II ©2005-2016 McQuain
The calculation of the jump target address is similar to the calculation of the branch target address (beq).
1111111111111111111111111111 ... 0000000000000000000000000000
1111111111111111111111111111 ... 0000000000000000000000000000
1111111111111111111111111111 ... 0000000000000000000000000000
1111111111111111111111111111 ... 0000000000000000000000000000
1111111111111111111111111111 ... 0000000000000000000000000000
1111111111111111111111111111 ... 0000000000000000000000000000
1111111111111111111111111111 ... 0000000000000000000000000000
The MIPS model is to view memory as a sequence of 256 MB segments.
1111
1110 1101
The starting address of the current memory segment is given by the high 4 bits of PC + 4.*
Now, 256 MB can be addressed by 28-bit addresses.*
For jump (j) instructions, the 26-bit immediate is shifted 2 bits to
the left, yielding a 28-bit offset, which is then added to the 0010 starting address of the current memory segment.
*QTP: why?
0001
0000
4GB address range
Aside: Jump Target Address Datapath Design 30
CS@VT Computer Organization II ©2005-2016 McQuain
...
op
26-bit immediate
Concatenate to form 32-bit address for jump
MUX passes jump address back to PC
High 4 bits of PC + 4 specify "segment" containing this code.
Calculate address of next sequential instruction
Control sets Jump signal so that the jump address will be used to fetch the next instruction
Executing Jump Instructions Datapath Design 31
CS@VT Computer Organization II ©2005-2016 McQuain
Unified Datapath Datapath Design 32
add, sub, and, or, slt, lw, sw, beq, j
CS@VT Computer Organization II ©2005-2016 McQuain
The unified datapath that we have designed:
- illustrates many of the logical issues that must be solved in designing any datapath
- can be extended to support additional instructions (easily for some, less so for others)
- is fundamentally unsatisfactory in that it requires a single clock cycle be long enough for every path within the datapath to stabilize before the next instruction is fetched
We may explore the second issue in exercises.
The third issue can only be dealt with by transforming the design to incorporate a pipeline.
Summary Datapath Design 33
CS@VT Computer Organization II ©2005-2016 McQuain