L13_1 Pipelining_Execution- Example
EECS 370 – Introduction to Computer Organization – Fall 2020
EECS 370 – Introduction to Computer Organization – © Bill Arthur 1 The material in this presentation cannot be copied in any form without written permission
Learning Objectives
• Ability to trace the execution of instructions through the pipeline datapath implementation for LC2K.
• Understand the flow of data through the datapath and between pipeline stages.
EECS 370 – Introduction to Computer Organization
2
Sample Code (Simple)
Let us run the following code on pipelined LC2K2x:
• add 1
• nor 4
• lw2
• add 2
• sw 3
23; 56; 4 20; 55;
7 10 ;
reg3 = reg1 + reg2
reg6 = reg4 nor reg5
reg4 = Mem[reg2 + 20]
reg5 = reg2 + reg5
Mem[reg3+10]=reg7
EECS 370 – Introduction to Computer Organization
3
Time Graphs (a.k.a. Pipe Trace)
5
writeback memory execute decode
fetch
Time:1 2 3 4
6 7 8 9
add nor lw add sw
fetch decode execute memory
EECS 370 – Introduction to Computer Organization
4
fetch
decode fetch
execute decode
fetch
writeback
memory writeback
execute memory writeback
decode execute memory writeback
A vertical slice reports the entire activity of the pipeline at time 5
Pipeline Datapath
M U X
1
+
Inst mem
M U X
data dest
PC
target
PC+1
PC+1
regA
0
valB offset
M U X
+
A L U
eq?
ALU result
regB
valA
ALU result
Data memory
mdata
valB
Bits 0-2
Bits 16-18
Bits 22-24
R0 R1 R2 R3 R4 R5 R6 R7
M U X
dest
dest
dest
op
op
op
EECS 370 – Introduction to Computer Organization
5
IF/ID
ID/EX
EX/Mem
Mem/WB
Register file
instruction
Time 0 – Initial State
M U X
1
+
Inst mem
M U X
data dest
PC
0
0
R0 R1 R2 R3 R4 R5
R6 41
R7 22
Bits 0-2 Bits 16-18
Bits 22-24
0
36
0
0
9
0
12
0
18
0
7
M U X
+
A L U
Data memory
0
0 0
0
M U X
0
0
0
noop
noop
noop
EECS 370 – Introduction to Computer Organization
6
IF/ID
ID/EX
EX/Mem
Mem/WB
noop
Register file
Time 1 – Fetch: add 1 2 3
M U X
1
+
Inst mem
M U X
data dest
PC
0
1
R0 R1 R2 R3 R4 R5
R6 41
R7 22
M U X
0
36
0
0
9
0
12
0
18
0
7
M U X
+
A L U
Data memory
0
0 0
Bits 0-2
Bits 16-18
Bits 22-24
0
0
0
0
noop
noop
noop
add 1 2 3
IF/ID
ID/EX
EX/Mem
Mem/WB
EECS 370 – Introduction to Computer Organization
7
Register file
add 1 2 3
Time 2 – Fetch: nor 4 5 6
M U X
1
+
Inst mem
M U X
data dest
PC
2
R0 R1 R2 R3 R4 R5
R6 41
R7 22
M U X
36
9
1
0
1
0
0
2
12
36
18
0
7
M U X
+
A L U
0
Data memory
0
9 3
0
Bits 0-2
Bits 16-18
Bits 22-24
3
0
0
add
noop
noop
nor 4 5 6
IF/ID add 1 2 3
ID/EX
EX/Mem
Mem/WB
EECS 370 – Introduction to Computer Organization
8
nor 4 5 6
Register file
Time 3 – Fetch: lw 2 4 20
M U X
1
+
Inst mem
M U X
data dest
PC
3 1
36
9
M U X
+
4
3
2
36
4
R0 R1 R2 R3 R4 R5
R6 41
R7 22
M U X
0
18
A L U
0
9
0
5
12
18
45
7
Data memory
0
7 6
9
Bits 0-2
Bits 16-18
Bits 22-24
6
3
3
0
nor
add
noop
lw 2 4 20
IF/ID
nor 4 5 6
ID/EXadd 1 2 3
EX/Mem
Mem/WB
EECS 370 – Introduction to Computer Organization
9
Register file
lw 2 4 20
Time 4 – Fetch: add 2 5 5
M U X
1
+
Inst mem
M U X
data dest
PC
4
6 2
18
7
M U X
+
8
3
36
2
R0
R1
R2
R3
R4
R5
R6 41
R7 22
M U X
0
9
0
45
4
12
9
A L U
18
-24
7
18 20
Data memory
0
45
Bits 0-2
Bits 16-18
Bits 22-24
4
6
7
6
3
3
lw
nor
add
add 2 5 5
IF/ID
lw 2 4 20
ID/EX
nor 4 5 6
EX/Mem
add 1 2 3
Mem/WB
EECS 370 – Introduction to Computer Organization
10
add 2 5 5
Register file
Time 5 – Fetch: sw 3 7 10
M U X
1
+
Inst mem
45 M
U X
data dest
3
PC
20 3
9
M U 20 X
+
A L U
23
5
4
36
2
R0
R1
R2
R3
R4
R5
R6 41 R7 22
M U X
0
9
0
-24
5
45
9
18
29
Data memory
0
8
7 5
Bits 0-2
Bits 16-18
Bits 22-24
7
5
4
18
4
6
6
add
lw
nor
sw 3 7 10
add 2 5 5
lw 2 4 20
Mem/WB
EECS 370 – Introduction to Computer Organization
11
IF/ID
ID/EX
EX/Mem
nor 4 5 6 add 1 2 3
sw 3 7 10
Register file
Time 6 – No More Instructions
M U X
1
+
Inst mem
-24 M
U X
data dest
6
Mem/WB
PC
5 4
9
7
M U X
+
A L U
9
5
36
3
R0
R1
R2
R3
R4
R5
R6 -24 R7 22
M U X
0
9
0
29
7
45
45
18
16
Data memory
99
29
22 10
Bits 0-2
Bits 16-18
Bits 22-24
7
7
5
7
5
4
4
sw
add
lw
EECS 370 – Introduction to Computer Organization
IF/ID
sw 3 7 10
ID/EX
EX/Mem
add 2 5 5
lw 2 4 20
nor 4 5 6
12
Register file
Time 7 – No More Instructions
M U X
1
+
Inst mem
M
U 99 X
PC
10 5
45
M U 10 X
+
R0
R1
R2
R3
R4
R5
R6 -24 R7 22
M U X
0
36
9
15
A L U
99
7
0
55
16
16
45
Data memory
0
Bits 0-2
Bits 16-18
Bits 22-24
22
7
7
5
5
sw
add
EECS 370 – Introduction to Computer Organization
IF/ID
ID/EX
data dest
4
Mem/WB
sw 3 7 10
EX/Mem
add 2 5 5
lw 2 4 2013
Register file
Time 8 – No More Instructions
M U X
1
+
Inst mem
16 M
U X
data dest
5
PC
36
9
99
55
45
22
Data memory
0
Bits 0-2
Bits 16-18
Bits 22-24
R0
R1
R2
R3
R4
R5 16 R6 -24 R7 22
M U X
0
M U X
+
A L U
55
22
7
sw
EECS 370 – Introduction to Computer Organization
14
IF/ID
ID/EX
EX/Mem
sw 3 7 10
Mem/WB
add 2 5 5
Register file
Time 9 – No More Instructions
M U X
1
+
Inst mem
M U X
data dest
PC
0
36
9
45
99
M U X
+
A L U
Data memory
Bits 0-2
Bits 16-18
Bits 22-24
R0
R1
R2
R3
R4
R5 16 R6 -24 R7 22
M U X
EECS 370 – Introduction to Computer Organization
15
IF/ID
ID/EX
EX/Mem
Mem/WB
Register file
Time Graphs (a.k.a. Pipe Trace)
5
writeback memory execute decode
fetch
Time:1 2 3 4
6 7 8 9
add nor lw add sw
fetch decode execute memory
EECS 370 – Introduction to Computer Organization
16
fetch
decode fetch
execute decode
fetch
writeback
memory writeback
execute memory writeback
decode execute memory writeback
A vertical slice reports the entire activity of the pipeline at time 5
What Can Go Wrong?
• Data hazards: since register reads occur in stage 2 and register writes occur in stage 5 it is possible to read the wrong value if it is about to be written.
• Control hazards: A branch instruction may change the PC, but not until stage 4. What do we fetch before that?
• Exceptions: How do you handle exceptions in a pipelined processor with 5 instructions in flight?
EECS 370 – Introduction to Computer Organization
17
Logistics
• There are 3 videos for lecture 13
• L13_1 – Pipelining_Execution-Example
• L13_2 – Data-Hazards
• L13_3 – Data-Hazards_Detect-and-Stall
• There is one worksheet for lecture 13 1. L13 worksheet
EECS 370 – Introduction to Computer Organization
18
L13_2 Data-Hazards
EECS 370 – Introduction to Computer Organization – Fall 2020
EECS 370 – Introduction to Computer Organization – © Bill Arthur 19 The material in this presentation cannot be copied in any form without written permission
Learning Objectives
• Ability to identify data dependencies between instructions.
• To differentiate between data dependencies and data hazards. • To identify the approaches to resolving data hazards.
EECS 370 – Introduction to Computer Organization
20
Data Hazards
• Register reads occur in stage 2 and register writes occur in stage 5 • It is possible to read the wrong value if it is about to be written.
• Data hazards occur when the pipeline must be stalled because one step must wait for another to complete.
• Data hazards arise from the dependence of one instruction on an earlier one that is still in the pipeline
Example: AND gate using NOR
nor 1 1
nor 3 3
nor 5
2
4
2
4
EECS 370 – Introduction to Computer Organization
21
Pipeline Function for nor
• Fetch: read instruction from memory
• Decode: read source operands from reg • Execute: calculate nor
• Memory: pass results to next stage
• Writeback: write sum into register file
EECS 370 – Introduction to Computer Organization
22
Data Dependency – Example
Recall: registers
are read /sourced
In the “decode” stage
nor3 3 4 nor2 4 5
time
nor nor
RAW Dependency
fetch decode execute memory writeback Hazard
fetch decode execute memory writeback
If not careful, nor will read a stale value of register 4
RAW dependency: Read-After-Write
EECS 370 – Introduction to Computer Organization
23
Data Hazard – Solution
nor3 3 4 nor2 4 5
time
nor nor
Assume Register File gives the right value of register 4 when read/written during same cycle. This is consistent with most processors (ARM/x86), but not Project 3.
fetch decode execute memory writeback
fetch decode* decode* decode writeback
EECS 370 – Introduction to Computer Organization
24
Data Dependency and Data Hazard
• Data Dependency: one instruction uses the result of a previous one • Does not necessarily cause a problem
• Data Hazard: one instruction has a data dependency that will cause a problem if we do not “deal with it”
EECS 370 – Introduction to Computer Organization
25
Data Dependency – Example
Problem: Which of these instructions has a data dependency on an earlier one? Which of those are data hazards?
0 1 2 3 4
add 1 2 3
nor 3 4 5
add 6 3 7
lw 3 6 10
sw 6 2 12
0 1 2 3
add 1 2 3
beq 3 4 1
add 3 5 6
add 3 6 7
EECS 370 – Introduction to Computer Organization
26
Data Dependency – Example
Problem: Which of these instructions has a data dependency on an earlier one? Which of those are data hazards?
0 1 2 3 4
add 1 2 3
nor 3 4 5
add 6 3 7
lw 3 6 10
sw 6 2 12
0 1 2 3
add 1 2 3
beq 3 4 1
add 3 5 6
add 3 6 7
EECS 370 – Introduction to Computer Organization
27
Three Approaches to Handling Data Hazards
• Avoid
• Make sure there are no hazards in the code
• Detect and Stall
• If hazards exist, stall the processor until they go away
• Detect and Forward
• If hazards exist, fix up the pipeline to get the correct value (if possible)
EECS 370 – Introduction to Computer Organization
28
Handling Data Hazards I: Avoid all Hazards
• Assume the programmer (or the compiler) knows about the processor implementation.
• Make sure no hazards exist.
• Put noops between any dependent instructions.
write register 3 in cycle 5 read register 3 in cycle 5
add 1 2 3 noop
noop
nor 3 4 5
EECS 370 – Introduction to Computer Organization
29
Avoid all Hazards: Problems
• Old programs (legacy code) may not run correctly on new implementations
• Longer pipelines need more noops
• Programs get larger as noops are included
• Especially a problem for machines that try to execute more than one instruction every cycle
• Intel EPIC: Often 25% – 40% of instructions are noops • Program execution is slower
• CPI is 1, but some instructions are noops EECS 370 – Introduction to Computer Organization
30
Logistics
• There are 3 videos for lecture 13
• L13_1 – Pipelining_Execution-Example
• L13_2 – Data-Hazards
• L13_3 – Data-Hazards_Detect-and-Stall
• There is one worksheet for lecture 13 1. L13 worksheet
EECS 370 – Introduction to Computer Organization
31
L13_3 Data-Hazards_Detect- and-Stall
EECS 370 – Introduction to Computer Organization – Fall 2020
EECS 370 – Introduction to Computer Organization – © Bill Arthur 32 The material in this presentation cannot be copied in any form without written permission
Learning Objectives
• To identify and understand the pipeline datapath components necessary to facilitate detection and stalling for data hazards.
EECS 370 – Introduction to Computer Organization
33
Handling Data Hazards II: Detect and Stall • Detect:
• Compare regA with previous destRegs • 3bitoperandfields
• Compare regB with previous destRegs
• 3 bit operand fields
• Stall:
• Keep current instructions in fetch and decode • Pass a noop to execute
• How do we modify the pipeline to do this? EECS 370 – Introduction to Computer Organization
34
Time Graph
Time:
1
2
3
4
5
6
7
8
9
10
11
12
13
add 1 2 3
IF
ID
EX
ME
WB
nor 3 4 5
IF
ID*
ID*
ID
EX
ME
WB
EECS 370 – Introduction to Computer Organization
35
Problem: Our pipeline currently does not handle hazards – let us fix it
M U X
1
+
+
A L U
target
PC+1
M U X
data dest
PC
Inst mem
PC+1
Bits 0-2 Bits 16-18
Bits 22-24
M U X
regA
R0 R1 R2 R3 R4 R5 R6 R7
0
valB offset
dest
M U X
eq?
ALU result
Data memory
ALU result
regB
valA
mdata
valB
dest
dest
op
op
op
IF/ ID/
ID EX Mem WB
EX/ Mem/
36
instruction
Register file
Problem: Our pipeline currently does not handle hazards – let us fix it
M U X
1
+
+
A L U
target
PC+1
eq?
ALU result
M U X
data
regB
valA
PC
Inst mem
PC+1
M U X
R0 R1 R2 R3 R4 R5 R6 R7
regA
0
valB offset
dest
M U X
ALU result
Data memory
mdata
valB
dest
dest
op
op
op
IF/ ID/
ID EX Mem WB
EX/ Mem/
37
instruction
Register file
Example
• Let’s run this program with a data hazard through our 5-stage pipeline add 1 2 3
nor 3 4 5
• We will start at the beginning of cycle 3, where add is in the EX stage, and nor is in the ID stage, about to read a register value
Time:
1
2
3
add 1 2 3
IF
ID
EX
nor 3 4 5
IF
ID
Hazard!
EECS 370 – Introduction to Computer Organization
38
First half of cycle 3
M U X
1
+
PC+1
azard detection
3
14
eq?
7
ALU result
M U X
PC
Inst mem
PC+1
H
regB
14
M U X
R0
R1
R2
R3
R4
R5
R6
R7
regA
0
3
10
11
ALU result
77
7 3
M U X
+
A L U
target
data
1 8
valB
Data memory
mdata
add
op
op
IF/
ID EX Mem WB
ID/ EX/ Mem/
39
nor 345
Register file
Hazard detected
3
regA
compare
compare
compare
compare
regB
REG file
3
IF/ ID
ID/ EX
40
1
Hazard detected
compare
000
011
011
regA regB
3
EECS 370 – Introduction to Computer Organization
41
41
Handling Data Hazards II: Detect and Stall • Detect:
• Compare regA with previous destRegs • 3bitoperandfields
• Compare regB with previous destRegs
• 3 bit operand fields
• Stall:
• Keep current instructions in fetch and decode • Pass a noop to execute
EECS 370 – Introduction to Computer Organization
42
en
First half of cycle 3
M U X
1
+
+
A L U
target
2
1
Hazard
3
regA
0
14
eq?
M U X
7
ALU result
en
PC
Inst mem
regB
10
14
M U X
R0
R1
R2
R3
R4
R5
R6
R7
3
11
7
M U X
ALU result
Data memory
mdata
77
data
1 8
add
valB
IF/ ID/
ID EX Mem WB
EX/ Mem/
43
nor 3 4 5
Register file
Handling Data Hazards II: Detect and Stall • Detect:
• Compare regA with previous destRegs • 3bitoperandfields
• Compare regB with previous destRegs
• 3 bit operand fields
• Stall:
• Keep current instructions in fetch and decode • Pass a noop to execute
EECS 370 – Introduction to Computer Organization
44
End of cycle 3
M U X
1
+
+
A L U
14
regA
0
7
ALU result
M U X
PC
Inst mem
2
M U X
R0
R1
R2
R3
R4
R5
R6
R7
regB
10
3
11
77
M U X
21
Data memory
mdata
data
1 8
noop
add
IF/
ID EX Mem WB
ID/ EX/ Mem/
45
nor 3 4 5
Register file
en
First half of cycle 4
M U X
1
+
+
A L U
azard
3
regA
0
14
7
ALU result
M U X
en
PC
Inst mem
2
H
regB
10
M U X
R0 R1 R2 R3 R4 R5 R6 R7
3
11
77
M U X
21
Data memory
mdata
data
1 8
noop
add
IF/
ID EX Mem WB
ID/ EX/ Mem/
46
nor 3 4 5
Register file
End of cycle 4
M U X
1
+
+
A L U
14
7
21
M U X
PC
Inst mem
2
M U X
R0
R1
R2
R3
R4
R5
R6
R7
regA
regB
0
3
10
11
77
M U X
Data memory
data
1 8
noop
add
IF/
ID EX Mem WB
ID/ EX/ Mem/
noop
47
nor 3 4 5
Register file
First half of cycle 5
M U X
1
+
+
A L U
No Hazard
3
14
7
21
M U X
PC
Inst mem
2
M U X
R0
R1
R2
R3
R4
R5
R6
R7
regA
regB
0
3
10
11
77
M U X
Data memory
data
1 8
noop
add
IF/
ID EX Mem WB
ID/ EX/ Mem/
noop
48
nor 3 4 5
Register file
End of cycle 5
M U X
1
+
+
A L U
14
7
2
M U X
PC
Inst mem
3
M U X
R0
R1
R2
R3
R4
R5
R6
R7
regA
regB
0
21
5
21
11
77
11
M U X
Data memory
data
1 8
nor
IF/ ID
noop noop
ID/ EX/ EX Mem
49
add 637
Register file
Time Graph
Time:
1
2
3
4
5
6
7
8
9
10
11
12
13
add 1 2 3
IF
ID
EX
ME
WB
nor 3 4 5
IF
ID*
ID*
ID
EX
ME
WB
EECS 370 – Introduction to Computer Organization
50
Time Graph: Exercise
Time:
1
2
3
4
5
6
7
8
9
10
11
12
13
add 1 2 3
IF
ID
EX
ME
WB
nor 3 4 5
IF
ID*
ID*
ID
EX
ME
WB
add 6 3 7
1. Identify the data hazards in this extended program
2. Complete the time graph
lw 3 6 10
sw 6 2 12
EECS 370 – Introduction to Computer Organization
51
Time Graph: Exercise
Time:
1
2
3
4
5
6
7
8
9
10
11
12
13
add 1 2 3
IF
ID
EX
ME
WB
nor 3 4 5
IF
ID*
ID*
ID
EX
ME
WB
add 6 3 7
lw 3 6 10
sw 6 2 12
EECS 370 – Introduction to Computer Organization
52
Time Graph: Solution
Time:
1
2
3
4
5
6
7
8
9
10
11
12
13
add 1 2 3
IF
ID
EX
ME
WB
nor 3 4 5
IF
ID*
ID*
ID
EX
ME
WB
add 6 3 7
IF
ID
EX
ME
WB
lw 3 6 10
IF
ID
EX
ME
WB
sw 6 2 12
IF
ID*
ID*
ID
EX
ME
WB
EECS 370 – Introduction to Computer Organization
53
Detect and Stall – Benefits
• Benefits over “Avoid all hazards”
• Backwards compatibility: noops will effectively be injected into pipeline by the processor, not necessary to know number of noops to insert when assembling
• Fewer noops in code: smaller executables
• Smaller executables
• Less fetching of noops, fewer memory accesses
EECS 370 – Introduction to Computer Organization
54
Detect and Stall – Problems
• CPI increases every time a hazard is detected!
• Is that necessary? Not always!
• Re-route the result of the add to the nor
• nor no longer needs to read R3 from reg file • It can get the data later (when it is ready)
• This lets us complete the decode this cycle
• But we need more control to remember that the data that we are not getting from the reg file at this time will be found elsewhere in the pipeline at a later cycle.
EECS 370 – Introduction to Computer Organization
55
Logistics
• There are 3 videos for lecture 13
• L13_1 – Pipelining_Execution-Example
• L13_2 – Data-Hazards
• L13_3 – Data-Hazards_Detect-and-Stall
• There is one worksheet for lecture 13 1. L13 worksheet
EECS 370 – Introduction to Computer Organization
56