程序代写代做代考 cache RISC-V assembly Java Memory Allocation III CSE 351 Autumn 2016

Memory Allocation III CSE 351 Autumn 2016

Roadmap
1
car *c = malloc(sizeof(car));
c->miles = 100;
c->gals = 17;
float mpg = get_mpg(c);
free(c);
Car c = new Car();
c.setMiles(100);
c.setGals(17);
float mpg =
c.getMPG();

Java:
C:
Assembly language:
Machine code:
0111010000011000
100011010000010000000010
1000100111000010
110000011111101000011111
Computer system:
OS:

Memory & data
Arrays & structs
Integers & floats
RISC V assembly
Procedures & stacks
Executables
Memory & caches
Processor Pipeline
Performance
Parallelism

CMPT 295
L27: Datapath

1

RISC-V CPU Datapath, Control Intro

CMPT 295
L27: Datapath
Podcast link: https://gimletmedia.com/tags/6dug/super-tech-support
2

Putting it together!
3
High level languages (ex. C) become machine language through compilation, assembly, and linking.
Registers, clock circuits, gates, and other logic devices are the fundamental building blocks of digital decision- making
CPU

CMPT 295
L27: Datapath

3

Today’s goal:
4
Create a “circuit” of logic elements that, when given an assembly instruction, perform the action the instruction describes
A bunch of logic stuff
010111010100010101..1
Some RISC-V Instruction in binary (let’s say the instruction is:
addi t0 x0 6)
Register t0 now has the value 6! The addi was performed!

CMPT 295
L27: Datapath

4

Today’s goal:
5
Create a “circuit” of logic elements that, when given an assembly instruction, perform the action the instruction describes
A bunch of logic stuff
010111010100010101..1
Some RISC-V Instruction in binary (let’s say the instruction is:
addi t0 x0 6)
Register t0 now has the value 6! The addi was performed!

We’ll call the logic stuff your CPU: Central Processing Unit

CMPT 295
L27: Datapath

5

Agenda
What’s a CPU?
Building from what we know
Our CPU
Processor Design Principles
6

CMPT 295
L27: Datapath

Your CPU in two parts
Central Processing Unit (CPU):
Datapath: contains the hardware necessary to perform operations required by the processor
Reacts to what the controller tells it! (ie. “I was told to do an add, so I”ll feed these arguments through an adder)
Control: decides what each piece of the datapath should do
What operation am I performing? Do I need to get info from memory? Should I write to a register? Which register?
Has to make decisions based on the input instruction only!
7

CMPT 295
L27: Datapath

Your CPU in two parts
8
010111010100010101..1
Some RISC-V Instruction in binary (let’s say the instruction is:
addi t0 x0 6)
Control

What’s our operation? What’s rs1/rs2/rd/imm? Is this a branch? …
Datapath

I was told our operands are the value in the 0 register, and the assembled immediate “6”. I was told to perform an add, so I’ll feed these two arguments into an adder!

I was told rd is t0, so I’ll store the adder’s output in register t0.

Done~!

CMPT 295
L27: Datapath

8

Designing our Datapath: Where to start?
Let’s start with a broad question:
What operations does our datapath need to be capable of performing?

And also maybe a more specific one:
How can we ensure, when we build this, that all RISC-V instructions will be supported?

9

CMPT 295
L27: Datapath

9

Designing our Datapath: Where to start?
6 different formats: R, I , S, SB, U, UJ
Arithmetic, Immediate, Store, Branch, Upper-immediate, JAL
Instructions are classified into these formats based on their behaviours, meaning each type does something a little different!
If we’re building a CPU to run /all/ instructions, we’ll need to figure out what functionalities each type needs → support them all!
10

CMPT 295
L27: Datapath

10

Agenda
What’s a CPU?
Building from what we know
Our CPU
Processor Design Principles
11

CMPT 295
L27: Datapath

Working with an ‘R-type’…
Work with the people around you. What needs to happen for an ‘R-type’ inst to execute?
Wanna work with an example? Use:
add t0 t2 t3

Come up with a list of actions…
12

CMPT 295
L27: Datapath

12

Working with an ‘R-type’
What needs to be done before our R type is executed or evaluated?
(1) Get the instruction
(2) Parse instruction fields (rd, rs1, rs2, operation…)
(3) Read data based on parsed operands
(4) Perform operation
(5) Write result to our destination register

13

CMPT 295
L27: Datapath

Working with an ‘R-type’
(1) Get the instruction
add t0 t2 t3
(2) Parse instruction fields (rd, rs1, rs2, operation…)
rd = t0 rs1 = t2 rs2 = t3
(3) Read data based on parsed operands
R[t2] R[t3]
(4) Perform operation
R[t2] + R[t3]
(5) Write result to our destination register
R[t0] = R[t2] + R[t3]
14

CMPT 295
L27: Datapath

14

Implementing R-Types
15
IMEM

+4
pc

Control
(1) Get the instruction
PC holds the address of our current instruction
Where are the bits making up our inst stored?
How does PC change after an R-Type is executed?

CMPT 295
L27: Datapath

15

Implementing R-Types
16
IMEM

+4
pc

inst[11:7]
inst[19:15]
inst[24:20]
Control
(2) Parse instruction fields (rd, rs1, rs2, operation…)
What registers are we operating on?
Where do they lie in our instruction format?
How big is each field?

CMPT 295
L27: Datapath

16

Implementing R-Types
17
IMEM

+4
pc

inst[11:7]
inst[19:15]
inst[24:20]
Control
(3) Read data based on parsed operands
New hardware: Register File
Abstraction for all our registers (x0..x31 minus PC) and some mux’ing
Reading, Writing happens here
Reg[]

AddrA
AddrB
DataA
AddrD
DataB

R[rs1]
R[rs2]

CMPT 295
L27: Datapath

17

Storage Element: Register File
Register File consists of 31 registers:
Output ports portA and portB
Input port portW
Register selection
Place data of register RA (number) onto portA
Place data of register RB (number) onto portB
Store data on portW into register RW (number) when Write Enable is 1
Clock input (CLK)
CLK is passed to all internal registers so they can be written to if they match RW and Write Enable is 1
18
Clk
portW

Write Enable
32
32
portA
32
portB
5
5
5
RW
RA
RB
32 x 32-bit
Registers

CMPT 295
L27: Datapath

Implementing R-Types
19
IMEM

+4

ALU
pc

inst[11:7]
inst[19:15]
inst[24:20]
Control
(4) Perform operation
New hardware: ALU (Arithmetic Logic Unit)
Abstraction for adders, multipliers, dividers, etc.
How do we know what operation to execute?
Our first control bit! ALUSel(ect)
Reg[]

AddrA
AddrB
DataA
AddrD
DataB

R[rs1]
R[rs2]
ALUSel
inst[31:0]

CMPT 295
L27: Datapath

19

But wait! There are many R-Type operations!
ALU == “Arithmetic Logic Unit”
ALUSel is a control bit which encodes the operation we should perform on the given operands
The value of ALUSel is a mapping from func3 and func7 values to operations
“if func3 == 000 and func7 == 0000000, perform an add”
Multiple func3, func7 combinations might lead to the same operation! (ie. add and addi)
20

CMPT 295
L27: Datapath

Implementing R-Types
21
IMEM

+4

ALU
pc

inst[11:7]
inst[19:15]
inst[24:20]
Control
(5) Write result to our destination register
The data we want to write is the result of computing operation on operands, ie. the output from our ALU
Send it back to the regfile for writing

Reg[]

AddrA
AddrB
DataA
AddrD
DataB
DataD
wb
R[rs1]
R[rs2]
ALUSel
inst[31:0]

CMPT 295
L27: Datapath

21

What changes with an arithmetic ‘I-type’?
A: Get the instruction
B: Parse instruction fields (rd, rs1, rs2)
C: Read data based on parsed operands
D: Perform operation
E: Write result to our destination register

Think: Do we need to add more hardware, more control, or more of both?
22

CMPT 295
L27: Datapath

22

A: Get the instruction
B: Parse instruction fields (rd, rs1, rs2, operation…)
We also need to parse (and reassemble) our immediate!
C: Read data based on parsed operands
Also an okay answer!
D: Perform operation
E: Write result to our destination register
23
What changes with an arithmetic ‘I-type’?

CMPT 295
L27: Datapath

23

Implementing the addi instruction
RISC-V Assembly Instruction:
addi x15,x1,-50

24
111111001110
00001
000
01111
0010011
OP-Imm
rd=15
ADD
imm=-50
rs1=1

CMPT 295
L27: Datapath

Control Logic
Datapath for add/sub

25

+4
pc

pc+4
inst[11:7]
inst[19:15]
inst[24:20]
IMEM
inst[31:0]
RegWEn
(1=write, 0=no write)
Reg[]

AddrA
AddrB
DataA
AddrD
DataB
DataD
Reg[rs1]
Reg[rs2]

alu
ALU
ALUSel
(Add=0/Sub=1)

CMPT 295
L27: Datapath

25

Control Logic
Adding addi to datapath
26

+4
pc

pc+4
inst[11:7]
inst[19:15]
inst[24:20]
IMEM
inst[31:0]
Reg[]

AddrA
AddrB
DataA
AddrD
DataB
DataD
Reg[rs1]
Reg[rs2]

alu
ALU
ALUSel=Add

Imm.
Gen

0
1
RegWEn=1
inst[31:20]
imm[31:0]
ImmSel=I
BSel=1

CMPT 295
L27: Datapath

26

I-Format immediates

27

inst[31:0]
——inst[31]-(sign-extension)——-
inst[30:20]
imm[31:0]

Imm.
Gen
inst[31:20]
imm[31:0]
ImmSel=I
High 12 bits of instruction (inst[31:20]) copied to low 12 bits of immediate (imm[11:0])
Immediate is sign-extended by copying value of inst[31] to fill the upper 20 bits of the immediate value (imm[31:12])

CMPT 295
L27: Datapath

27

Control Logic
Adding addi to datapath

28

+4
pc

pc+4
inst[11:7]
inst[19:15]
inst[24:20]
IMEM
inst[31:0]
Reg[]

AddrA
AddrB
DataA
AddrD
DataB
DataD
Reg[rs1]
Reg[rs2]

alu
ALU
ALUSel=Add

Imm.
Gen

0
1
RegWEn=1
inst[31:20]
imm[31:0]
ImmSel=I
BSel=1
Also works for all other I-format arithmetic instruction (slti,sltiu,andi,ori,xori,slli,srli,srai) just by changing ALUSel

CMPT 295
L27: Datapath

28

But wait… Loads are ‘I-type’s also?
We know we can parse the immediate in the load-word format, but…
What do we do with the immediate?
What operation should we perform?
Maybe a better question: How do we know what the instruction does?
Any ideas?
29

CMPT 295
L27: Datapath

29

Adding lw to datapath
30
IMEM

ALU

Imm.
Gen

+4
DMEM

Reg[]

AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataR

0
1
pc

0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:20]
ALU
mem
wb
imm[31:0]

inst[31:0]
ImmSel
BSel
ALUSel
MemRW
WBSel
wb
pc+4
Reg[rs1]
Reg[rs2]

CMPT 295
L27: Datapath

30

Adding lw to datapath

31
IMEM

ALU

Imm.
Gen

+4
DMEM

Reg[]

AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataR

0
1
pc

0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:20]
alu
mem
wb
pc+4
Reg[rs1]
imm[31:0]
Reg[rs2]

inst[31:0]
ImmSel=I
RegWEn=1
Bsel=1
ALUSel=Add
MemRW=Read
WBSel=0
wb

CMPT 295
L27: Datapath

31

Storage Element: Idealized Memory
Memory (idealized)
One input port: Data In
One output port: Data Out
Memory access:
Read: Write Enable = 0, data at Address is placed on Data Out
Write: Write Enable = 1, Data In written to Address
Clock input (CLK)
CLK input is a factor ONLY during write operation
During read, behaves as a combinational logic block: Address valid → Data Out valid after “access time”
32
CLK
Data In

Write Enable
32
32
DataOut
Address

CMPT 295
L27: Datapath

A few notes on our new datapath…
33
We have a lot of different components!
IMEM, Register file, ALU, DMEM
Does every instruction need every component?
No! We got through all of the R-Types (and some of the I-Types) without DMEM
Does any instruction need every component?
Yep! Loads!
This is the instruction which exercises our “critical path”

CMPT 295
L27: Datapath

33

All RV32 Load Instructions
Supporting the narrower loads requires additional circuits to extract the correct byte/halfword from the value loaded from memory, and sign- or zero-extend the result to 32 bits before writing back to register file.
We’ll assume these are implemented in the DMEM module (not our datapath) and won’t add them to our schematic
34

funct3 field encodes size and signedness of load data

CMPT 295
L27: Datapath

Agenda
Building from what we know
Our CPU
Processor Design Principles
35

CMPT 295
L27: Datapath

This lecture is long… what have we done again?
Added ‘R-type’ and ‘I-type’ instructions to our data path!
We still need to figure out how to do S, SB, U, and UJ types
Don’t worry, we actually have all the big pieces we need!
We talked about control bits and how they instruct our hardware to deal with many different kinds of instructions

Let’s keep going!
36

CMPT 295
L27: Datapath

36

Current Datapath
37
IMEM

ALU

Imm.
Gen

+4
DMEM

Reg[]

AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataR

0
1
pc

0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:20]
ALU
mem
wb
Reg[rs1]
imm[31:0]
Reg[rs2]

inst[31:0]
ImmSel
BSel
ALUSel
MemRW
WBSel
wb
pc+4

CMPT 295
L27: Datapath

37

Adding sw to datapath

38
IMEM

ALU

Imm.
Gen

+4
DMEM

Reg[]

AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataW
DataR

0
1

pc

0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:7]
alu
mem
wb
pc+4
Reg[rs1]
imm[31:0]
Reg[rs2]

inst[31:0]
ImmSel
RegWEn
Bsel
ALUSel
MemRW
WBSel=
wb

CMPT 295
L27: Datapath

38

Adding sw to datapath

39
IMEM

ALU

Imm.
Gen

+4
DMEM

Reg[]

AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataW
DataR

0
1

pc

0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:7]
alu
mem
wb
pc+4
Reg[rs1]
imm[31:0]
Reg[rs2]

inst[31:0]
ImmSel=S
RegWEn=0
Bsel=1
ALUSel=Add
MemRW=Write
WBSel=*
wb
*= “Don’t Care”

CMPT 295
L27: Datapath

39

I-Format immediates

40

inst[31:0]
——inst[31]-(sign-extension)——-
inst[30:20]
imm[31:0]

Imm.
Gen
inst[31:20]
imm[31:0]
ImmSel=I
High 12 bits of instruction (inst[31:20]) copied to low 12 bits of immediate (imm[11:0])
Immediate is sign-extended by copying value of inst[31] to fill the upper 20 bits of the immediate value (imm[31:12])

CMPT 295
L27: Datapath

40

I & S Immediate Generator
41
imm[11:5]
rs2
rs1
funct3
imm[4:0]
S-opcode
imm[11:0]
rs1
funct3
rd
I-opcode
inst[31](sign-extension)
inst[30:25]
imm[31:0]
inst[31:0]

inst[24:20]
S
I
inst[31](sign-extension)
inst[30:25]
inst[11:7]
0
6
7
11
12
14
15
19
20
24
25
31
0
4
5
10
11
31
1
6
5
5
S
I
Just need a 5-bit mux to select between two positions where low five bits of immediate can reside in instruction
Other bits in immediate are wired to fixed positions in instruction

CMPT 295
L27: Datapath
Implementing Branches
B-format is mostly same as S-Format, with two register sources (rs1/rs2) and a 12-bit immediate
But now immediate represents values -4096 to +4094 in 2-byte increments
The 12 immediate bits encode even 13-bit signed byte offsets (lowest bit of offset is always zero, so no need to store it)
42

CMPT 295
L27: Datapath
Adding sw to datapath

43
IMEM

ALU

Imm.
Gen

+4
DMEM

Reg[]

AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataW
DataR

0
1

pc

0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:7]
alu
mem
wb
pc+4
Reg[rs1]
imm[31:0]
Reg[rs2]

inst[31:0]
ImmSel
RegWEn
Bsel
ALUSel
MemRW
WBSel=
wb

CMPT 295
L27: Datapath

43

Adding branches to datapath

44
IMEM

ALU

Imm.
Gen

+4
DMEM

Branch Comp.
Reg[]

AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataW
DataR

1
0

0
1

1
0
pc

0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:7]
alu
mem
wb
alu
pc+4
Reg[rs1]
pc
imm[31:0]
Reg[rs2]

inst[31:0]
ImmSel
RegWEn
BrUn
BrEq
BrLT
ASel
BSel
ALUSel
MemRW
WBSel
PCSel
wb

CMPT 295
L27: Datapath

44

Adding branches to datapath

45
IMEM

ALU

Imm.
Gen

+4
DMEM

Branch Comp.
Reg[]

AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataW
DataR

1
0

0
1

1
0
pc

0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:7]
alu
mem
wb
alu
pc+4
Reg[rs1]
pc
imm[31:0]
Reg[rs2]

inst[31:0]
ImmSel=B
RegWEn=0
BrUn
BrEq
BrLT
ASel=1
Bsel=1
ALUSel=Add
MemRW=Read
WBSel=*
PCSel=taken/not-taken
wb

CMPT 295
L27: Datapath

45

Branch Comparator
BrEq = 1, if A=B
BrLT = 1, if A < B BrUn =1 selects unsigned comparison for BrLT, 0=signed BGE branch: A >= B, if !(A