Chapter …
Chapter 4 — The Processor — 116
Branch Hazards
If branch outcome determined in MEM
§
4
.8
C
o
n
tro
l H
a
z
a
rd
s
PC
Flush these
instructions
(Set control
values to 0)
Chapter 4 — The Processor — 117
Reducing Branch Delay
Move hardware to determine outcome to ID
stage
Target address adder
Register comparator
Example: branch taken
36: sub $10, $4, $8
40: beq $1, $3, 7
44: and $12, $2, $5
48: or $13, $2, $6
52: add $14, $4, $2
56: slt $15, $6, $7
…
72: lw $4, 50($7)
Chapter 4 — The Processor — 118
Example: Branch Taken
Chapter 4 — The Processor — 119
Example: Branch Taken
Chapter 4 — The Processor — 120
Data Hazards for Branches
If a comparison register is a destination
of 2nd or 3rd preceding ALU instruction
…
IF ID EX MEM WB
IF ID EX MEM WB
IF ID EX MEM WB
IF ID EX MEM WB
add $4, $5, $6
add $1, $2, $3
beq $1, $4, target
Can resolve using forwarding
Chapter 4 — The Processor — 121
Data Hazards for Branches
If a comparison register is a destination
of preceding ALU instruction or 2nd
preceding load instruction
Need 1 stall cycle
beq stalled
IF ID EX MEM WB
IF ID EX MEM WB
IF ID
ID EX MEM WB
add $4, $5, $6
lw $1, addr
beq $1, $4, target
Chapter 4 — The Processor — 122
Data Hazards for Branches
If a comparison register is a destination
of immediately preceding load
instruction
Need 2 stall cycles
beq stalled
IF ID EX MEM WB
IF ID
ID
ID EX MEM WB
beq stalled
lw $1, addr
beq $1, $0, target
Chapter 4 — The Processor — 123
Dynamic Branch Prediction
In deeper and superscalar pipelines, branch
penalty is more significant
Use dynamic prediction
Branch prediction buffer (aka branch history
table)
Indexed by recent branch instruction addresses
Stores outcome (taken/not taken)
To execute a branch
Check table, expect the same outcome
Start fetching from fall-through or target
If wrong, flush pipeline and flip prediction
Chapter 4 — The Processor — 124
1-Bit Predictor: Shortcoming
Inner loop branches mispredicted twice!
outer: …
…
inner: …
…
beq …, …, inner
…
beq …, …, outer
Mispredict as taken on last iteration of
inner loop
Then mispredict as not taken on first
iteration of inner loop next time around
Chapter 4 — The Processor — 125
2-Bit Predictor
Only change prediction on two
successive mispredictions
Pop Quiz
If we have 4096 available bits, how
many 2-bit prediction entries can we
store?
A: 4096
B: 2048
C: 1024
D: 512
Chapter 4 — The Processor — 126
Chapter 4 — The Processor — 127
Calculating the Branch Target
Even with predictor, still need to
calculate the target address
1-cycle penalty for a taken branch
Branch target buffer
Cache of target addresses
Indexed by PC when instruction fetched
If hit and instruction is branch predicted taken,
can fetch target immediately
Chapter 4 — The Processor — 128
Exceptions and Interrupts
“Unexpected” events requiring change
in flow of control
Different ISAs use the terms differently
Exception
Arises within the CPU
e.g., undefined opcode, overflow, syscall, …
Interrupt
From an external I/O controller
Dealing with them without sacrificing
performance is hard
§
4
.9
E
x
c
e
p
tio
n
s
Chapter 4 — The Processor — 129
Handling Exceptions
In MIPS, exceptions managed by a System
Control Coprocessor (CP0)
Save PC of offending (or interrupted)
instruction
In MIPS: Exception Program Counter (EPC)
Save indication of the problem
In MIPS: Cause register
We’ll assume 1-bit
0 for undefined opcode, 1 for overflow
Jump to handler at 8000 00180
Chapter 4 — The Processor — 130
An Alternate Mechanism
Vectored Interrupts
Handler address determined by the cause
Example:
Undefined opcode: C000 0000
Overflow: C000 0020
…: C000 0040
Instructions either
Deal with the interrupt, or
Jump to real handler
Chapter 4 — The Processor — 131
Handler Actions
Read cause, and transfer to relevant
handler
Determine action required
If restartable
Take corrective action
use EPC to return to program
Otherwise
Terminate program
Report error using EPC, cause, …
Chapter 4 — The Processor — 132
Exceptions in a Pipeline
Another form of control hazard
Consider overflow on add in EX stage
add $1, $2, $1
Prevent $1 from being clobbered
Complete previous instructions
Flush add and subsequent instructions
Set Cause and EPC register values
Transfer control to handler
Similar to mispredicted branch
Use much of the same hardware
Chapter 4 — The Processor — 133
Pipeline with Exceptions
Chapter 4 — The Processor — 134
Exception Properties
Restartable exceptions
Pipeline can flush the instruction
Handler executes, then returns to the
instruction
Refetched and executed from scratch
PC saved in EPC register
Identifies causing instruction
Actually PC + 4 is saved
Handler must adjust
Chapter 4 — The Processor — 135
Exception Example
Exception on add in
40 sub $11, $2, $4
44 and $12, $2, $5
48 or $13, $2, $6
4C add $1, $2, $1
50 slt $15, $6, $7
54 lw $16, 50($7)
…
Handler
80000180 sw $25, 1000($0)
80000184 sw $26, 1004($0)
…
Chapter 4 — The Processor — 136
Exception Example
Chapter 4 — The Processor — 137
Exception Example
Chapter 4 — The Processor — 138
Multiple Exceptions
Pipelining overlaps multiple instructions
Could have multiple exceptions at once
Simple approach: deal with exception from
earliest instruction
Flush subsequent instructions
“Precise” exceptions
In complex pipelines
Multiple instructions issued per cycle
Out-of-order completion
Maintaining precise exceptions is difficult!
Chapter 4 — The Processor — 139
Imprecise Exceptions
Just stop pipeline and save state
Including exception cause(s)
Let the handler work out
Which instruction(s) had exceptions
Which to complete or flush
May require “manual” completion
Simplifies hardware, but more complex
handler software
Not feasible for complex multiple-issue
out-of-order pipelines