Chapter …
Microarchitecture
Hungwen Li
CMPE 120
COMPUTER ORGANIZATION AND DESIGN
The Hardware/Software Interface
Morgan Kaufmann Publishers
22 March, 2020
Chapter 4 — The Processor
1
This slide set
Provides a review of data path design,
Provides a review of control path design,
Provides examples of designing the data/control path for basic instructions, and
Serves as a template for your project on microarchitecture design.
Build Datapaths
Time to Design a mini-MIPS Processor!
Simple subset of the 32-bit MIPS ISA
Memory reference: lw, sw
Arithmetic/logical: add, sub, and, or, slt
Control transfer: beq
§4.1 Introduction
Process to Design the mini-MIPS Processor
Assume 2 separate memories for instructions & data
For simplicity
Focus on datapath first
Treat control signals as a black magic box initially
Add their implementation later
Recall that “processor = datapath + control”
Starting small and adding features/instructions gradually
Fetching instruction
R-format instructions
Load/store instructions
BEQ instruction
First Things First – Fetch Instructions
Objective
Read the 32-bit instruction from the memory address pointed to by the PC register
Increment PC by 4
What components should we use?
Put It Together – Fetch 32bits at PC-Pointed Address
Put It Together – Increment PC by 4
Put It Together – Constant 4
How do we implement constant 4?
Use 32 wires, all tied to a low voltage (0) except the third wire from the least-significant end (bit 2), which is tied high (1): binary 0...0100 = 4
Put It Together – Clock Control
Why will PC NOT go crazy and keep on incrementing its value?
The PC register is clocked: it loads a new value only on a clock edge
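The fetch step can be sketched in software. This is a minimal Python model, not the hardware itself: `ProgramCounter`, `fetch`, and the small `instruction_memory` dictionary are hypothetical names chosen for illustration. It is meant to show that the adder output (PC + 4) is always present on the PC input, but the PC only takes the new value when `clock_edge()` is called.

```python
class ProgramCounter:
    """Edge-triggered 32-bit register (hypothetical model of the PC)."""

    def __init__(self, reset_value=0):
        self.q = reset_value      # value currently visible on the output wires
        self.d = reset_value      # value waiting on the input wires

    def clock_edge(self):
        # The register copies its input to its output only on a clock edge.
        self.q = self.d & 0xFFFFFFFF


def fetch(pc, instruction_memory):
    """Read the instruction at pc.q and present pc.q + 4 to the PC input."""
    instruction = instruction_memory[pc.q]
    pc.d = pc.q + 4               # adder output; the constant 4 is 0b100
    return instruction


# Assumed example contents: two instruction words at addresses 0 and 4.
instruction_memory = {0: 0x00221820, 4: 0x8FA4FFF4}

pc = ProgramCounter()
first = fetch(pc, instruction_memory)
still_first = fetch(pc, instruction_memory)   # no clock edge yet: PC unchanged
pc.clock_edge()
second = fetch(pc, instruction_memory)
```

Calling `fetch` twice without a clock edge returns the same instruction, which is the software analogue of the PC holding its value between clock edges.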
R-Format Instructions (Review)
Op code: 0
$rd : destination
$rs : operand #1
$rt : operand #2
“funct” decides ALU operations:
add, sub, and, or, slt
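R-type instruction format:
Field:   opcode   rs      rt      rd      shamt   funct
Bits:    31:26    25:21   20:16   15:11   10:6    5:0
Value:   0        rs      rt      rd      shamt   funct
rs, rt: read registers; rd: write register
funct: selects among the different ALU instructions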
Examples
add $3, $4, $5
slt $t0, $s0, $s1
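The field layout above can be checked with a short decoding sketch. This is an illustration in Python, assuming the instruction is available as a 32-bit integer; the function name `decode_r_format` is hypothetical.

```python
def decode_r_format(word):
    """Split a 32-bit R-format instruction word into its six fields."""
    return {
        "opcode": (word >> 26) & 0x3F,   # bits 31:26
        "rs":     (word >> 21) & 0x1F,   # bits 25:21
        "rt":     (word >> 16) & 0x1F,   # bits 20:16
        "rd":     (word >> 11) & 0x1F,   # bits 15:11
        "shamt":  (word >> 6) & 0x1F,    # bits 10:6
        "funct":  word & 0x3F,           # bits 5:0
    }


# add $3, $4, $5  ->  rd = 3, rs = 4, rt = 5, funct = 100000 (add)
ADD_3_4_5 = 0b000000_00100_00101_00011_00000_100000
fields = decode_r_format(ADD_3_4_5)
```

In hardware no shifting or masking is needed: each field is simply a group of wires taken from the 32-bit instruction bus.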
R-Format Instructions – Building Blocks
Objectives
Perform 5 ops (add, sub, and, or, slt) on 2 read registers
Write the result to a third (write) register
What components do we need?
Register file for reading/writing registers
ALU for performing 5 ops (add, sub, and, or, slt)
Notes about control signals
RegWrite is a control signal that is determined by opcode and funct fields
The ALU operation control signals are derived from the “funct” field of the instruction
Put It Together for R-Format Instructions
R-Format Instruction Example – ADD $3, $1, $2
Field:   opcode   rs      rt      rd      shamt   funct
Value:   000000   00001   00010   00011   00000   100000
Control signals: the ALU operation is decoded from funct; the ALU should perform ADD
Where is the 32-bit instruction coming from?
The instruction consists of 32 wires coming from the instruction fetch logic designed in the previous subsection
Load/Store Instructions (Review)
Op code
sw : 43 (101011₂)
lw : 35 (100011₂)
$rs – base address register
Address = $rs + off16 (signed)
$rt – read from or write to
For “sw”, $rt data is written to memory
For “lw”, $rt is written with content from memory
Load/Store (I-type) instruction format:
Field:   opcode     rs      rt      off16 (signed 16-bit)
Bits:    31:26      25:21   20:16   15:0
Value:   35 or 43   rs      rt      off16
Examples
lw $a0, -12($sp)
sw $2, 16($8)
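As a cross-check of the I-type layout, the two examples can be packed into 32-bit words. This is a sketch in Python; `encode_i_format` is a hypothetical helper, and the register numbers $a0 = $4 and $sp = $29 follow the standard MIPS register numbering.

```python
def encode_i_format(opcode, rs, rt, off16):
    """Pack opcode, rs, rt and a signed 16-bit offset into a 32-bit word."""
    assert -32768 <= off16 <= 32767, "offset must fit in 16 signed bits"
    return (opcode << 26) | (rs << 21) | (rt << 16) | (off16 & 0xFFFF)


LW, SW = 35, 43                                         # opcodes for lw and sw
lw_word = encode_i_format(LW, rs=29, rt=4, off16=-12)   # lw $a0, -12($sp)
sw_word = encode_i_format(SW, rs=8, rt=2, off16=16)     # sw $2, 16($8)
```

Masking the offset with `0xFFFF` keeps only its 16-bit two's-complement pattern, so -12 is stored as 1111 1111 1111 0100.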
Load/Store Instructions – Building Blocks
Objective
Calculate target memory address ($rs + off16)
For LW, read from memory and write to $rt
For SW, write data from $rt to memory
What components do we need?
Register file, ALU, data memory and sign extension
Sign extension: the sign-bit wire (bit 15) of the 16-bit input is replicated onto the top 16 wires of the 32-bit output
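Sign extension and the address calculation can be sketched as follows. This is an illustrative Python model with hypothetical function names, working on 32-bit patterns stored in ordinary integers.

```python
def sign_extend_16(value):
    """Extend a 16-bit value to 32 bits by copying bit 15 into bits 31:16."""
    value &= 0xFFFF
    if value & 0x8000:               # sign bit set: fill the top 16 bits with 1s
        return value | 0xFFFF0000
    return value                     # sign bit clear: top 16 bits stay 0


def effective_address(base, off16):
    """ALU add for lw/sw: base register value plus sign-extended offset."""
    return (base + sign_extend_16(off16)) & 0xFFFFFFFF
```

For example, with an assumed base value of 0x7FFFFFF0 and offset -12 (0xFFF4), the address is 0x7FFFFFE4; the final mask models the 32-bit adder dropping its carry-out.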
Put It Together for Load/Store
Store Example – sw $2, 16($8)
Opcode = 101011₂
Field:   opcode   rs      rt      off16
Value:   101011   01000   00010   0000 0000 0001 0000
Datapath for Store
A subset of all components and paths
Load Example – lw $a0, -12($sp)
$a0 = $4, $sp = $29
-12 = 1111111111110100₂
Opcode = 100011₂
Field:   opcode   rs      rt      off16
Value:   100011   11101   00100   1111 1111 1111 0100
Datapath for Load
A different subset of components and paths
What happens to components/paths not used?
Merge R-Type/Load/Store Datapath
Why do we need to merge them?
Can we not merge them?
#1: Identify common components and paths.
#2: Use a MUX to join incoming sources. Control signals select which source is used.
#3: Split wires to forward data to multiple destinations.
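The role of a MUX in the merged datapath can be sketched as a two-way selection. This is an illustrative Python model; the helper names are hypothetical, and the select polarities (0 for the R-type source, 1 for the load/store source) are assumptions that follow the usual textbook convention for the ALUSrc and MemtoReg signals introduced later in these slides.

```python
def mux2(select, in0, in1):
    """Two-input multiplexer: output in0 when select is 0, in1 when select is 1."""
    return in1 if select else in0


def alu_second_operand(alu_src, rt_value, sign_extended_offset):
    # Assumed polarity: 0 selects the register value (R-type),
    # 1 selects the sign-extended offset (lw/sw).
    return mux2(alu_src, rt_value, sign_extended_offset)


def register_write_data(mem_to_reg, alu_result, memory_data):
    # Assumed polarity: 0 selects the ALU result (R-type),
    # 1 selects the data read from memory (lw).
    return mux2(mem_to_reg, alu_result, memory_data)
```

Both sources are always computed; the control signal only decides which one is forwarded, which is why the same ALU and register file can serve R-type and load/store instructions.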
Merged R-Type/Load/Store Datapath
Branch Instruction (BEQ)
Op code : 4 (000100₂)
$rs : operand #1
$rt : operand #2
Target address:
PC + 4 + (off16 << 2) (signed)
Branch (I-type) instruction format:
Field:   opcode   rs      rt      off16 (16-bit signed)
Bits:    31:26    25:21   20:16   15:0
Value:   4        rs      rt      off16
Examples
beq $3, $4, -20
BEQ – Building Blocks
Objectives
Compare $rs and $rt (subtract and check zero flag)
Compute target address, PC + 4 + (off16<<2)
Control logic to set PC depending on zero flag
What components should we use?
How to implement "shift left 2" logic?
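One common answer to the shift question is that "shift left 2" needs no gates: input wire i is connected to output wire i+2, and output wires 1:0 are tied to 0. The sketch below models that rewiring and the target address calculation in Python; all function names are hypothetical, and the PC value used in the usage note is an assumed example.

```python
def sign_extend_16(value):
    """Extend a 16-bit value to 32 bits by copying bit 15 into bits 31:16."""
    value &= 0xFFFF
    return (value | 0xFFFF0000) if (value & 0x8000) else value


def shift_left_2(value):
    """Model of the rewiring: bit i moves to bit i+2, bits 1:0 become 0."""
    return (value << 2) & 0xFFFFFFFF


def branch_target(pc, off16):
    """Target address = PC + 4 + (sign-extended off16 << 2), modulo 2**32."""
    return (pc + 4 + shift_left_2(sign_extend_16(off16))) & 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit pattern as a signed number (for readable checks)."""
    return value - (1 << 32) if value & 0x80000000 else value
```

For beq $3, $4, -20 at an assumed PC of 0x1000, the shifted offset is -80 bytes, so the target is 0x1004 - 80 = 0xFB4.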
Datapath for Branch Instructions (BEQ)
BEQ Example – beq $3, $4, -20
Opcode = 000100₂
Field:   opcode   rs      rt      off16
Value:   000100   00011   00100   1111 1111 1110 1100
Merge Them All!
Instruction fetch + merged R-type/load/store datapath + branch datapath = ?
Full Datapath – Merge Them All!
Full Datapath – PC and Instruction Fetching
Full Datapath – R-Format
Full Datapath – Load/Store
Full Datapath – Branch (BEQ)
Something Still Fuzzy Here …
Final Datapath with Control Signals and Multiplexers
Add Controls
Where are the Control Signals?
They control clock pulses, MUXes and the ALU
They depend on the opcode, and on funct for R-type instructions
ALU Usage
Depends on the instruction
For R-type, it depends on opcode and funct
For BEQ, it performs subtraction (and generates the Zero flag)
For LW/SW, it performs ADD
opcode   Operation          ALU function
lw       load word          add
sw       store word         add
beq      branch equal       subtract (zero flag)
R-type   add                add
         subtract           subtract
         AND                AND
         OR                 OR
         set-on-less-than   set-on-less-than
What is set-on-less-than ALU operation?
Set-on-less-than ALU Operation
Function
If A < B, then Out = 1
Otherwise Out = 0
How to implement it?
Perform subtraction, A - B
Route the sign bit of the difference to Out[0]
Set all other bits Out[31:1] to 0
So it is a variation of subtraction
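The recipe above can be sketched in a few lines. This is an illustrative Python model with a hypothetical function name; like the recipe itself, it takes the sign bit of the 32-bit difference directly and does not correct for signed overflow, so it is only reliable when A - B does not overflow.

```python
MASK32 = 0xFFFFFFFF


def set_on_less_than(a, b):
    """Out = 1 if A < B (signed), else 0, built from the sign bit of A - B."""
    difference = (a - b) & MASK32        # 32-bit two's-complement subtraction
    sign_bit = (difference >> 31) & 1    # bit 31 of the difference
    return sign_bit                      # Out[0] = sign bit, Out[31:1] = 0
```

Because the result is either 0 or 1, bits Out[31:1] are automatically zero in this model.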
Strategy to Generate ALU Control
Opcode (6bit) -> ALUOp (2bit)
ALUOp (2bit) + Funct (6bit) -> ALU control (4bit)
ALU Control – Step 1
Note the same ALUOp code for lw/sw
because both instructions have the same need for ALU
i.e., add offset to base register
opcode            ALUOp
lw (100011)       00
sw (101011)       00
beq (000100)      01
R-type (000000)   10
ALU Control – Step 2
Combined with funct field
The funct field does not exist for lw/sw/beq instructions
opcode            ALUOp   Operation          funct
lw (100011)       00      load word          XXXXXX
sw (101011)       00      store word         XXXXXX
beq (000100)      01      branch equal       XXXXXX
R-type (000000)   10      add                100000
                          subtract           100010
                          AND                100100
                          OR                 100101
                          set-on-less-than   101010
ALU Control – Step 3
Assign ALU control values
We have a truth table for ALU control logic!
And we can implement with basic logic gates!
opcode            ALUOp   Operation          funct    ALU function       ALU control
lw (100011)       00      load word          XXXXXX   add                0010
sw (101011)       00      store word         XXXXXX   add                0010
beq (000100)      01      branch equal       XXXXXX   subtract           0110
R-type (000000)   10      add                100000   add                0010
                          subtract           100010   subtract           0110
                          AND                100100   AND                0000
                          OR                 100101   OR                 0001
                          set-on-less-than   101010   set-on-less-than   0111
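The truth table can be written down directly as a lookup. This Python sketch uses the ALUOp, funct, and ALU control values from the table; the function name `alu_control` is hypothetical, and a real implementation would be combinational gates rather than a dictionary.

```python
def alu_control(alu_op, funct=None):
    """Return the 4-bit ALU control value for a 2-bit ALUOp and 6-bit funct."""
    if alu_op == 0b00:            # lw / sw: funct is a don't-care
        return 0b0010             # add
    if alu_op == 0b01:            # beq: funct is a don't-care
        return 0b0110             # subtract
    if alu_op == 0b10:            # R-type: funct selects the operation
        r_type = {
            0b100000: 0b0010,     # add
            0b100010: 0b0110,     # subtract
            0b100100: 0b0000,     # AND
            0b100101: 0b0001,     # OR
            0b101010: 0b0111,     # set-on-less-than
        }
        return r_type[funct]
    raise ValueError("ALUOp value not used in this design")
```

The first two branches ignore `funct`, which mirrors the XXXXXX (don't-care) entries in the table.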
Control Signal - ALUSrc
Control Signal - MemWrite
Control Signal - MemRead
Control Signal - RegWrite
Control Signal - RegDst
"X" means "don't care". It helps simplify the design!
Control Signal - MemtoReg
Control Signal - PCSrc
Introduce "Branch" signal for BEQ
PCSrc = (Branch AND Zero)
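The PCSrc equation amounts to one AND gate driving the MUX in front of the PC. A minimal Python sketch, with a hypothetical function name and 1-bit signals represented as the integers 0 and 1:

```python
def next_pc(pc_plus_4, branch_target, branch, zero):
    """Select the next PC: the branch target only when Branch AND Zero."""
    pc_src = branch & zero        # PCSrc = (Branch AND Zero)
    return branch_target if pc_src else pc_plus_4
```

A non-branch instruction (Branch = 0) always continues at PC + 4, even if the ALU happens to produce Zero = 1.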
Summary of All Control Signals
Truth Table for Main Control Signals!
Use opcode as input
Implementation of Control Signal Logic
Inputs: Opcode[5] Opcode[4] Opcode[3] Opcode[2] Opcode[1] Opcode[0]
Decoder outputs: R-Type, LW, SW, BEQ
Outputs: Branch RegDst ALUSrc MemtoReg RegWrite MemRead MemWrite ALUOp1 ALUOp0
Opcode:
R-Type : 000000 (0)
LW : 100011 (35)
SW : 101011 (43)
BEQ : 000100 (4)
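The main control unit can be sketched as a table indexed by opcode. The signal values below are an assumption: they are the standard single-cycle values from the Patterson and Hennessy textbook, since the truth table figure itself is not reproduced in this text. The names `MAIN_CONTROL`, `main_control`, and `_signals` are hypothetical.

```python
X = None  # "don't care": the output is unused for that instruction


def _signals(reg_dst, alu_src, mem_to_reg, reg_write,
             mem_read, mem_write, branch, alu_op):
    return {
        "RegDst": reg_dst, "ALUSrc": alu_src, "MemtoReg": mem_to_reg,
        "RegWrite": reg_write, "MemRead": mem_read, "MemWrite": mem_write,
        "Branch": branch, "ALUOp": alu_op,
    }


# Assumed values (textbook convention), keyed by the 6-bit opcode.
MAIN_CONTROL = {
    0b000000: _signals(1, 0, 0, 1, 0, 0, 0, 0b10),   # R-type
    0b100011: _signals(0, 1, 1, 1, 1, 0, 0, 0b00),   # lw
    0b101011: _signals(X, 1, X, 0, 0, 1, 0, 0b00),   # sw
    0b000100: _signals(X, 0, X, 0, 0, 0, 1, 0b01),   # beq
}


def main_control(opcode):
    """Look up the control signals for one of the four supported opcodes."""
    return MAIN_CONTROL[opcode]
```

In hardware, each opcode is first decoded into one of four lines (R-Type, LW, SW, BEQ), and each output signal is an OR of the lines for which it must be 1; the don't-care entries let those OR gates stay small.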
Datapath With Control – We Are Done!
Example Walk-Through (add)
R-Type Instruction – add $t0, $s1, $s2
(Figure sequence: the instruction is traced step by step through the datapath with control.)
Grayed-out areas are not used by "add" instruction.
ALUOp = 10, funct = 100000, ALU control = 0010 (add)
Example Walk-Through (load)
Load Instruction – lw $t0, 8($s0)
(Figure sequence: the instruction is traced step by step through the datapath with control.)
ALU inputs: $s0 and the sign-extended offset 8
ALUOp = 00, ALU control = 0010 (add)
Example Walk-Through (beq)
Branch-on-Equal Instruction – beq $s0, $s1, -64
(Figure sequence: the instruction is traced step by step through the datapath with control.)
Sign-extended offset: -64
Offset shifted left by 2: -256
Branch target: (PC+4) + (-256) = PC-252
ALUOp = 01, ALU control = 0110 (sub)
Branch = 1, Zero = ? (depends on whether $s0 equals $s1)
If Zero=0, PC'=PC+4
If Zero=1, PC'=PC-252
Reference Readings
Patterson and Hennessy, "Computer Organization and Design: The Hardware/Software Interface"
Sec 4.1 – 4.4