程序代写代做代考 mips CO101

CO101
Principle of Computer

Organization
Lecture 13: Pipelined MIPS

Processor 3
Liang Yanyan

澳門科技大學
Macau of University of Science and Technology

Pipelining: ideal example

• Ideal code segment.
• Start a new instruction every clock cycle.
• Complete one instruction every clock cycle.

• No dependency between instructions.
• Each instruction read/write distinct registers.
• Data required by current instruction is not produced by previous

instruction.
• Pipelined design can be generated easily.

2

Clock cycle
1 2 3 4 5 6 7 8 9

lw $8,4($29) IF ID EX MEM WB
sub $2,$4,$5 IF ID EX MEM WB
and $9,$10,$11 IF ID EX MEM WB
or $16,$17,$18 IF ID EX MEM WB
add $13,$14,$0 IF ID EX MEM WB

However
• There are many dependencies between instructions.

• Read after write dependency.
• E.g. instruction “and, or, add, sw” require (read) data $2, which

will be produced (written) by instruction “sub”.
• It is fine for single cycle processor.

• As an instruction will not start until the previous finish completely.
• What happen for a pipelined processor?

3

sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)

Instruction execution
• SW is fine

• When does SW read the data of $2?
• $2 has been written back to register when the ID stage of SW

starts.

4

IF ID EX MEM WB

1 2 3 4 5 6 7 8 9

Clock cycle

sub $2, $1, $3

and $12, $2, $5

or $13, $6, $2

add $14, $2, $2

sw $15, 100($2)

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

vagrant

However

• “and, or” instructions.
• $2 is not ready when they start reading $2.
• An old value of $2 will be read in this circumstance.

• Data hazard
• An instruction attempts to use data before it is ready.

• Back (red) arrows in time designate data hazard.

5

IF ID EX MEM WB

1 2 3 4 5 6 7 8 9

Clock cycle

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

sub $2, $1, $3

and $12, $2, $5

or $13, $6, $2

add $14, $2, $2

sw $15, 100($2)

How about
“add”?

Solutions for solving data hazard
• Pipeline stall

• Delay the execution of instructions such that they can read the
correct data.

• May reduce performance.
• We will address this later.

• Forwarding
• Forward the data from one stage to another stage that request

the data. In other word, the data bypass the pipeline register.
• E.g. forward the $2 from ALU output of “sub” to “and, or”.

• Question: How?
• Explained in the following slides

6

Observing the example
• When is the result ($2) produced?
• When is the result ($2) consumed?

7

IF ID EX MEM WB

1 2 3 4 5 6 7 8 9Clock cycle

sub $2, $1, $3

and $12, $2, $5

or $13, $6, $2

add $14, $2, $2

sw $15, 100($2)

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

Observing the example
• When is the result ($2) produced? EX of “sub”
• When is the result ($2) consumed? EX of “and, or”

8

IF ID EX MEM WB

1 2 3 4 5 6 7 8 9Clock cycle

sub $2, $1, $3

and $12, $2, $5

or $13, $6, $2

add $14, $2, $2

sw $15, 100($2)

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

Forwarding
• Forward the result from the ALU output of “sub” to “and”

instruction.
• Bypass the pipeline register and write back stage.

9

IF ID EX MEM WB

1 2 3 4 5 6 7 8 9Clock cycle

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

sub $2, $1, $3

and $12, $2, $5

or $13, $6, $2

add $14, $2, $2

sw $15, 100($2)

Forwarding
• Since at the beginning of clock cycle 4, pipeline register has already

contained the result of ALU ($2).
• At clock 4: forward ALU result from EX/MEM register to ALU input.
• At clock 5: forward ALU result from MEM/WB register to ALU input.

• As ALU result ($2) is now propagated to MEM/WB register.

10

1 2 3 4 5 6 7Clock cycle

sub $2, $1, $3

and $12, $2, $5

or $13, $6, $2

Reg DM Reg

A
L

UIM

Reg DM Reg

A
L

UIM

Reg DM Reg

A
L

UIM

Forwarding hardware
• Add a control unit (forwarding unit) to select the correct ALU value

for the EX stage.
• If there is no hazard, the unit selects ALU’s operands from register file,

just like before.
• If there is a hazard, the unit selects operands from either the EX/MEM

or MEM/WB pipeline registers.
• As ALU has two inputs, add two new multiplexers to select the ALU

operands.

11

sub $2, $1, $3

and $12, $2, $5

or $13, $6, $2

Reg DM Reg

A
L

UIM

Reg DM Reg

A
L

UIM

Reg DM Reg

A
L

UIM

Normal flow of ALU result
• The ALU result generated at the EX stage is normally

passed through the pipeline registers to the MEM and
WB stages, before it is finally written to the register file.

12

P
C Addr

Inst. Mem.

Read Reg #1
Reg #1

Read Reg #2

Write Reg

Reg #2
Write Data

Register File

Addr
Read Data

Write Data
Data Mem

0

1
ALU

0
1Inst[15-11]

Inst[20-16]

IF/ID ID/EX EX/MEM MEM/WB

Simplified datapath with two new multiplexers

13

P
C Addr

Inst. Mem.

Read Reg #1
Reg #1

Read Reg #2

Write Reg

Reg #2
Write Data

Register File

Addr
Read Data

Write Data
Data Mem

0

1ALU

0
1Inst[15-11]

Inst[20-16]

IF/ID ID/EX EX/MEM MEM/WB

0
2
1

0
2
1

ForwardA

ForwardB

Next step: add a forwarding unit (control unit) to
control the selection of the two multiplexers.

Question: how can the hardware know that there is
data hazard? → how to detect data hazard?

Forward Unit Output Signals

14

How to detect data hazard?
• 1. EX Forward Unit:
if (EX/MEM.RegWrite
and (EX/MEM.RegisterRd != 0)
and (EX/MEM.RegisterRd == ID/EX.RegisterRs))

ForwardA = 10
if (EX/MEM.RegWrite
and (EX/MEM.RegisterRd != 0)
and (EX/MEM.RegisterRd == ID/EX.RegisterRt))

ForwardB = 10

• 2. MEM Forward Unit:
if (MEM/WB.RegWrite
and (MEM/WB.RegisterRd != 0)
and (MEM/WB.RegisterRd == ID/EX.RegisterRs))

ForwardA = 01
if (MEM/WB.RegWrite
and (MEM/WB.RegisterRd != 0)
and (MEM/WB.RegisterRd == ID/EX.RegisterRt))

ForwardB = 01

15

EX/MEM hazard
• An EX/MEM hazard occurs between two adjacent

instructions and:
• The first instruction will write data to register file.
• The first instruction’s destination is the same as one of the

second instruction’s read operands, e.g. register $2 here.

16

Reg DM Reg

A
L

UIM

Reg DM Reg

A
L

UIM

sub $2, $1, $3

and $12, $2, $5

MEM/WB hazard
• Occur between current instruction and the instruction two

stages ahead.

17

1 2 3 4 5 6 7Clock cycle

sub $2, $1, $3

and $12, $2, $5

or $13, $6, $2

Reg DM Reg

A
L

UIM

Reg DM Reg
A

L
UIM

Reg DM Reg

A
L

UIM

How to detect data hazard?

18

P
C Addr

Inst. Mem.

Read Reg #1
Reg #1

Read Reg #2

Write Reg

Reg #2
Write Data

Register File

Addr
Read Data

Write Data
Data Mem

0

1ALU

0
1Inst[15-11]

Inst[20-16]

IF/ID ID/EX EX/MEM MEM/WB

0
2
1

0
2
1

ForwardA

ForwardB

and $12, $2, $5 sub $2, $1, $3

EX/MEM.RegWr

EX/MEM.RegRd

ID/EX.RegRt

ID/EX.RegRd

ID/EX.RegRs

Datapath with forwarding unit (control unit)

19

P
C Addr

Inst. Mem.

Read Reg #1
Reg #1

Read Reg #2

Write Reg

Reg #2
Write Data

Register File

Addr
Read Data

Write Data
Data Mem

0

1ALU

0
1RD

RT

IF/ID ID/EX EX/MEM MEM/WB

0
2
1

0
2
1

ForwardA

ForwardB

Forwarding
Unit

RS

ID/EX.RegRT

ID/EX.RegRS

EX/MEM.RegRD

MEM/WB.RegRD

EX/MEM.RegWr

MEM/WB.RegWr

Exception: not MEM/WB hazard, it is EX/MEM hazard

• One new problem is if the same register is updated twice in previous
two instructions.

add $1, $2, $3
add $1, $1, $4
sub $5, $5, $1

• Register $1 is written by both of the previous two instructions: which
instruction will actually produce the result for SUB?

• Only the most recent result (from the second ADD) should be forwarded.

20

DMReg RegIM

DMReg RegIM

DMReg RegIM

add $1, $2, $3

add $1, $1, $4

sub $5, $5, $1

Forwarding example:
• Assume that each register contains a value which is its

number plus 100. For instance, register $2 contains 102,
register $6 contains 106, and so forth.

sub $2, $1, $3
add $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)

21

Cycle 3

22

Cycle 4: forwarding $2 from EX/MEM

23

Cycle 5: forwarding $2 from MEM/WB

24

Forwarding for SW

25

CASE 1:

CASE 2:

DMReg RegIM

DMReg RegIM

add $1, $2, $3

sw $1, 0($4)

DMReg RegIM

DMReg RegIM

add $1, $2, $3

sw $4, 0($1)

1 2 3 4 5 6

1 2 3 4 5 6

CASE 1: forward to R[rs]

26

P
C Addr

Inst. Mem.

Read Reg #1
Reg #1

Read Reg #2

Write Reg

Reg #2
Write Data

Register File

Addr
Read Data

Write Data
Data Mem

0

1ALU

0
1RD

RT

0
2
1

0
2
1

ForwardA

ForwardB

Forwarding
Unit

RS

Sign
Extend

0
1

IF/ID ID/EX EX/MEM MEM/WB

EX: sw $4, 0($1) MEM: add $1, $2, $3

1

1

2

CASE 2: forward to R[rt]

27

P
C Addr

Inst. Mem.

Read Reg #1
Reg #1

Read Reg #2

Write Reg

Reg #2
Write Data

Register File

Addr
Read Data

Write Data
Data Mem

0

1ALU

0
1RD

RT

0
2
1

0
2
1

ForwardA

ForwardB

Forwarding
Unit

RS

Sign
Extend

0
1

IF/ID ID/EX EX/MEM MEM/WB

EX: sw $1, 0($4) MEM: add $1, $2, $3

11

2

CASE 3

28

DMReg RegIM

DMReg RegIM

lw $1, 0($2)

sw $1, 0($4)

1 2 3 4 5 6

Can the datapath handle case 3?

29

P
C Addr

Inst. Mem.

Read Reg #1
Reg #1

Read Reg #2

Write Reg

Reg #2
Write Data

Register File

Addr.
R. Data

W. Data
Data Mem

0

1ALU

0
1RD

RT

0
2
1

0
2
1

ForwardA

ForwardB

Forwarding
Unit

RS

Sign
Extend

0
1

IF/ID ID/EX EX/MEM MEM/WB

lw $1, 0($2)
sw $1, 0($4)

Modify the datapath

30

P
C Addr

Inst. Mem.

Read Reg #1
Reg #1

Read Reg #2

Write Reg

Reg #2
Write Data

Register File

Addr.
R. Data

W. Data
Data Mem

0

1ALU

0
1RD

RT

0
2
1

0
2
1

ForwardA

ForwardB

Forwardig
Unit

RS

Sign
Extend

0
1

IF/ID ID/EX EX/MEM MEM/WB

0
1

EX/MEM.MemWr

lw $1, 0($2)
sw $1, 0($4)

lw $1, 0($2)sw $1, 0($4)

$1

Handle case 3

31

lw $1,0($2)

A
LUIM Reg DM Reg

sw $1,0($4)
A

LUIM Reg DM Reg

Can forwarding resolve this example?

• $2 is only available at the beginning of cycle 5.
• $2 is needed by “and” at the beginning of cycle 4.
• Forwarding doesn’t work.

• Data is not available when another instruction needs it.
• Forwarding: data is already available in one of the pipeline

stages when another instruction needs it.
• “True” data hazard.

32

DMReg RegIM

DMReg RegIM

lw $2, 20($3)

and $12, $2, $5

Clock cycle
1 2 3 4 5 6

Resolve true data hazard
• By inserting NOP instructions at compile time (software approach).
• NOP is a dummy instruction doing nothing.
• A NOP can be: or $0, $0, $0

• Doesn’t change the value of any register and memory.

33

DMReg RegIM

RegIM

IM

lw $2, 20($3)

or $0, $0, $0

and $12, $2, $5

or $13, $12, $2
DMReg RegIM

DMReg Reg

Clock cycle
1 2 3 4 5 6 7 8

DM Reg

Resolve true data hazard
• By stall the pipeline + forwarding at run time (hardware approach).
• Delay the execution of instruction (adding delay slot).
• E.g. delay the EX stage of “and” by one cycle, and use forwarding to pass data from

MEM/WB register to ALU input of “and”.

• If there is no forwarding, we need to stall two cycles (adding two delay slots).

34

DMReg RegIM

DMReg RegIM

lw $2, 20($3)

and $12, $2, $5

Clock cycle
1 2 3 4 5 6 7

DMReg RegIM

DMReg RegIM

lw $2, 20($3)

and $12, $2, $5

Clock cycle
1 2 3 4 5 6 7 8

Stall
• If we stall one instruction, we need to stall the

subsequent instructions too.
• Prevent structural hazard.

• Can use stall to resolve any hazard
• But reduce performance.

35

DMReg RegIM

DMReg RegIM

DMReg RegIM

lw $2, 20($3)

and $12, $2, $5

or $13, $12, $2

Clock cycle
1 2 3 4 5 6 7 8

Adding hazard detection (stall detection)

36

PC Write: update PC or not. Used to
freeze PC.

IF/ID Write: update IF/ID register or
not. Used to freeze IF/ID register.

Hazard
Unit

P
C Addr

Inst. Mem.

Read Reg #1
Reg #1

Read Reg #2

Write Reg

Reg #2
Write Data

Register File

Addr
Read Data

Write Data
Data Mem

0

1ALU

0
1RD

RT

IF/ID

ID/EX

EX/MEM
MEM/WB

0
1
2

0
1
2

ForwardA

ForwardB

Forwarding
Unit

RS

Sign
Extend

0
1

Control

W
M

E

W

M W

0
1

ID/EX.RegRT

ID/EX.MemRead

RTRS

PC Write

IF/ID Write

0

Hazard detection unit (stall detection)
• Inputs of the hazard detection unit.

• IF/ID.RegRs and IF/ID.RegRt, the source registers for the
second instruction.

• ID/EX.MemRead and ID/EX.RegRt, to determine if the first
instruction is LW and, if so, to which register it will write.

• By inspecting these values, the detection unit generates
three outputs.
• Two new control signals PCWrite and IF/ID Write, which

determine whether the pipeline stalls (Freeze PC and IF/ID) or
continues.

• A mux select, which forces control signals of the EX and future
stages to 0 in case of a stall (NOP insertion).

37

Example: cycle 3

38

or $4, $4, $2 and $4, $2, $5 lw $2, 20($1)

lw $2, 20($1)
and $4, $2, $5
or $4, $4, $2

0

Control signal 0 means: don’t
write any register and memory.

Example: cycle 4

39

or $4, $4, $2 and $4, $2, $5 NOP

lw $2, 20($1)
and $4, $2, $5
or $4, $4, $2

0

Control signal 0 means: don’t
write any register and memory.

lw $2, 20($1)

0

Summary
• There is dependency between instruction.

• Attempt to read data before it is ready → data hazard.
• Three kinds of hazards.
• Data hazard: attempts to use data before it is ready.

• E.g. instruction depends on result of previous instructions.
• Solution:

• insert NOP at compile time; forwarding;
• stall for true hazard.

40