CS2305: Computer Architecture
Single-cycle Processor
(Computer Organization: Chapter 4)

Department of Computer Science and Engineering

Fundamentals
Levels of Interpretation: Instructions
High Level Language
• C, Java, Python, …
• Loops, control flow, variables
    for (i = 0; i < 10; i++) printf("go cucs");
Assembly Language
• No symbols (except labels)
• One operation per statement
• "human readable machine language"
    main: addi x2, x0, 10
          addi x1, x0, 0
    loop: slt  x3, x1, x2
          ...
Machine Language
• Binary-encoded assembly
• Labels become addresses
• The language of the CPU
    00000000101000010000000000010011
    00100000000000010000000000010000
    00000000001000100001100000101010
Instruction Set Architecture
Machine Implementation (Microarchitecture)
• ALU, Control, Register File, ...

Objectives
• How are MIPS instructions executed?
• Understanding the basics of a processor
• Putting it all together
  - Arithmetic Logic Unit (ALU)
  - Register File
  - SRAM: cache
  - DRAM: main memory

The Processor: Datapath & Control
• Our implementation of MIPS is simplified
  - memory-reference instructions: lw, sw
  - arithmetic-logical instructions: add, sub, and, or, slt
  - control flow instructions: beq, j
• Generic implementation
  - use the program counter (PC) to supply the instruction address and fetch the instruction from memory (and update the PC)
  - decode the instruction (and read registers)
  - execute the instruction
• All instructions (except j) use the ALU after reading the registers
  - How? Memory-reference? Arithmetic? Control flow?

How to Design a Processor: step-by-step
1. Analyze instruction set => datapath requirements
   • the meaning of each instruction is given by the register transfers
   • datapath must include storage element for ISA registers
     - possibly more
   • datapath must support each register transfer
2. Select set of datapath components and establish clocking methodology
3. Assemble datapath meeting the requirements
4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer
5. Assemble the control logic

The MIPS Instruction Formats
• All MIPS instructions are 32 bits long. The three instruction formats:

  Format   31-26     25-21     20-16     15-11     10-6        5-0
  R-type   op (6)    rs (5)    rt (5)    rd (5)    shamt (5)   funct (6)
  I-type   op (6)    rs (5)    rt (5)    address / immediate (16)
  J-type   op (6)    target address (26)

• The different fields are:
  - op: operation of the instruction
  - rs, rt, rd: the source and destination register specifiers
  - shamt: shift amount
  - funct: selects the variant of the operation in the "op" field
  - address / immediate: address offset or immediate value
  - target address: target address of the jump instruction
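
The field boundaries above can be checked with a few shifts and masks. The following is a minimal sketch, not part of the original slides; the function name `decode` and the dictionary keys are chosen here for illustration only.

```python
def decode(word: int) -> dict:
    """Split a 32-bit MIPS instruction word into every field any format uses."""
    return {
        "op":     (word >> 26) & 0x3F,       # bits 31-26
        "rs":     (word >> 21) & 0x1F,       # bits 25-21
        "rt":     (word >> 16) & 0x1F,       # bits 20-16
        "rd":     (word >> 11) & 0x1F,       # bits 15-11 (R-type)
        "shamt":  (word >> 6) & 0x1F,        # bits 10-6  (R-type)
        "funct":  word & 0x3F,               # bits 5-0   (R-type)
        "imm16":  word & 0xFFFF,             # bits 15-0  (I-type)
        "target": word & 0x03FFFFFF,         # bits 25-0  (J-type)
    }

# Hand-assembled R-type word for "add rd=3, rs=1, rt=2" (op = 0, funct = 0x20).
word = (0 << 26) | (1 << 21) | (2 << 16) | (3 << 11) | (0 << 6) | 0x20
fields = decode(word)
print(hex(word), fields["rs"], fields["rt"], fields["rd"], hex(fields["funct"]))
```

All fields are extracted unconditionally, which mirrors the hardware: the wires for every field exist in every cycle, and the control logic decides which ones matter.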

Focus on a Subset of MIPS Instructions
7 Instructions
• ADD and subtract (R-type)
  - add rd, rs, rt
  - sub rd, rs, rt
• OR Immediate (I-type):
  - ori rt, rs, imm16
• LOAD and STORE (I-type)
  - lw rt, rs, imm16
  - sw rt, rs, imm16
• BRANCH (I-type):
  - beq rs, rt, imm16
• JUMP (J-type):
  - j target

Aside: Logical Register Transfers
• RTL gives the meaning of the instructions
• All start by fetching the instruction
    op | rs | rt | rd | shamt | funct = MEM[ PC ]
    op | rs | rt | Imm16              = MEM[ PC ]

  inst    Register Transfers
  ADDU    R[rd] <- R[rs] + R[rt];                     PC <- PC + 4
  SUBU    R[rd] <- R[rs] - R[rt];                     PC <- PC + 4
  ORI     R[rt] <- R[rs] | zero_ext(Imm16);           PC <- PC + 4
  LOAD    R[rt] <- MEM[ R[rs] + sign_ext(Imm16) ];    PC <- PC + 4
  STORE   MEM[ R[rs] + sign_ext(Imm16) ] <- R[rt];    PC <- PC + 4
  BEQ     if ( R[rs] == R[rt] ) then PC <- PC + 4 + ( sign_ext(Imm16) || 00 )
                                else PC <- PC + 4

Step 1: Requirements of the Instruction Set
• Memory (MEM)
  - Instructions & data
• Registers (R: 32 x 32)
  - Read rs
  - Write rt or rd
• Extender (sign/zero extend)
• Add/Sub/OR unit for operation on register(s) or extended immediate
• Add 4 (+ maybe extended immediate) to PC

Step 2: Components of the Datapath
• Combinational Elements
• Storage Elements
  - Clocking methodology

Combinational Logic Elements
[Figure: combinational logic elements; recoverable labels: Select, A, 32]

Storage Element: Register File
• Register File consists of 32 registers:
  - Two 32-bit output busses: busA and busB
  - One 32-bit input bus: busW
• Register is selected by:
  - RA (number) selects the register to put on busA (data)
  - RB (number) selects the register to put on busB (data)
  - RW (number) selects the register to be written via busW (data) when Write Enable is 1
• Clock input (CLK)
  - The CLK input is a factor ONLY during write operation
  - During read operation, behaves as a combinational logic block:
    RA or RB valid => busA or busB valid after "access time."
[Figure: register file of 32 32-bit registers; recoverable labels: Write Enable, 5-bit register numbers, busW]
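
The read/write behavior of the register file can be sketched in a few lines. This is an illustrative model under stated assumptions: the class and method names are invented here, the clock edge is modeled as an explicit method call, and the hardwiring of register 0 to zero found in real MIPS is not modeled because the slide does not mention it.

```python
class RegisterFile:
    """32 x 32-bit register file: combinational read, clocked write."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra: int, rb: int):
        # Reads ignore the clock: busA/busB follow RA/RB directly.
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, rw: int, bus_w: int, write_enable: int):
        # State changes only on a clock edge, and only if Write Enable is 1.
        if write_enable:
            self.regs[rw] = bus_w & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(rw=5, bus_w=42, write_enable=0)   # edge without Write Enable
before = rf.read(5, 0)
rf.clock_edge(rw=5, bus_w=42, write_enable=1)   # edge with Write Enable
after = rf.read(5, 0)
print(before, after)
```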

Storage Element: Idealized Memory
• Memory (idealized)
  - One input bus: Data In
  - One output bus: Data Out
• Memory word is selected by:
  - Address selects the word to put on Data Out
  - Write Enable = 1: address selects the memory word to be written via the Data In bus
• Clock input (CLK)
  - The CLK input is a factor ONLY during write operation
  - During read operation, behaves as a combinational logic block:
    Address valid => Data Out valid after "access time."
[Figure: idealized memory block; recoverable labels: Write Enable, DataOut (32 bits)]

Aside: Clocking Methodologies
• The clocking methodology defines when data in a state element is valid and stable relative to the clock
  - State element: a memory element such as a register
  - Edge-triggered: all state changes occur on a clock edge
• Typical execution
  - read contents of state elements -> send values through combinational logic -> write results to one or more state elements
[Figure: state element -> combinational logic -> state element, within one clock cycle]
• Assumes state elements are written on every clock cycle; if not, need explicit write control signal
  - write occurs only when both the write control is asserted and the clock edge occurs

Aside: Clocking Methodologies
[Figure: clock timing diagram; recoverable labels: Setup and Hold windows around each clock edge, Don't Care regions elsewhere]
• All storage elements are clocked by the same clock edge
• Cycle Time = CLK-to-Q + Longest Delay Path + Setup + Clock Skew
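
As a worked instance of the cycle-time formula, the sketch below plugs in purely hypothetical delays (in picoseconds); none of the numbers come from the slides.

```python
def cycle_time(clk_to_q: int, longest_path: int, setup: int, skew: int) -> int:
    """Cycle Time = CLK-to-Q + Longest Delay Path + Setup + Clock Skew."""
    return clk_to_q + longest_path + setup + skew


# Hypothetical delays in picoseconds, chosen only to illustrate the sum.
t_ps = cycle_time(clk_to_q=30, longest_path=800, setup=20, skew=10)
f_mhz = 1_000_000 / t_ps   # 1 / (t_ps * 1e-12 s), expressed in MHz
print(t_ps, round(f_mhz, 1))
```

In a single-cycle design the longest delay path is set by the slowest instruction, so that one instruction determines the clock period for all of them.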

Step 3: Assemble Datapath meeting our requirements
• Register Transfer Requirements ⇒ Datapath Assembly
• Instruction Fetch
• Read Operands and Execute Operation

Generic Steps of Datapath
1. Instruction Fetch
2. Decode / Register Read
3. Execute
4. Memory
5. Register Write
[Figure: datapath stages; recoverable labels: instruction memory, data memory]

Fetching Instructions
• Fetching instructions involves
  - reading the instruction from the Instruction Memory: M[PC]
  - updating the PC value to be the address of the next (sequential) instruction: PC ← PC + 4
[Figure: Instruction Memory with a Read Address input and an Instruction output]
• PC is updated every clock cycle, so it does not need an explicit write control signal, just a clock signal
• Reading from the Instruction Memory is a combinational activity, so it doesn't need an explicit read control signal
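
The fetch step can be sketched as a pure function of the PC. This is a minimal illustration: the name `fetch`, the dictionary used as instruction memory, and the two instruction words are assumptions made for the example.

```python
def fetch(imem: dict, pc: int):
    """Return (instruction, next sequential PC) for the current PC."""
    instr = imem[pc]                   # combinational read: M[PC]
    next_pc = (pc + 4) & 0xFFFFFFFF    # PC <- PC + 4 (instructions are 4 bytes)
    return instr, next_pc


# Hypothetical instruction memory keyed by byte address.
imem = {0x0: 0x00221820, 0x4: 0x34220001}
pc = 0x0
trace = []
for _ in range(2):                     # two clock cycles
    instr, pc = fetch(imem, pc)
    trace.append(hex(instr))
print(trace, hex(pc))
```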

Decoding Instructions
• Decoding instructions involves
  - sending the fetched instruction's opcode and function field bits to the control unit
  - reading two values from the Register File
    - Register File addresses are contained in the instruction
[Figure: the Instruction feeds the Control Unit and the Register File; Register File ports: Read Addr 1, Read Addr 2, Write Addr, Write Data, Read Data 1, Read Data 2]

Executing R-type Instructions
• ADD and subtract
  - add rd, rs, rt
  - sub rd, rs, rt
• R-type format, bits 31-0: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)

Datapath of RR (R-type)
• RTL: R[rd] ← R[rs] op R[rt]
• Example: add rd, rs, rt
• Ra, Rb, Rw correspond to rs, rt, rd
• ALUctr, RegWr: control signals
[Figure: register file (32 32-bit registers; ports Rw, Ra, Rb) feeding the ALU; ALUctr selects add/sub]
• What are the control signals for "add rd, rs, rt"?
  - ALUctr = add, RegWr = 1

I-type instruction (ori)
• OR Immediate:
  - ori rt, rs, imm16
• I-type format, bits 31-0: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)

RTL: The OR Immediate Instruction
• ori rt, rs, imm16
  format, bits 31-0: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
  - M[PC]                                Instruction Fetch
  - R[rt] ← R[rs] or ZeroExt(imm16)      zero extension of the 16-bit constant, ORed with R[rs]
  - PC ← PC + 4                          update PC
• Zero extension ZeroExt(imm16): bits 31-16 are 0000 0000 0000 0000, bits 15-0 hold imm16
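
The ori register transfer can be traced with a short sketch. The helper names `zero_ext` and `ori` and the operand values are illustrative assumptions.

```python
def zero_ext(imm16: int) -> int:
    """Widen a 16-bit immediate to 32 bits by filling bits 31-16 with zeros."""
    return imm16 & 0xFFFF


def ori(rs_val: int, imm16: int) -> int:
    """R[rt] <- R[rs] | ZeroExt(imm16), as a 32-bit value."""
    return (rs_val | zero_ext(imm16)) & 0xFFFFFFFF


result = ori(0x12340000, 0x8001)
print(hex(result))
```

Because the upper half of the extended immediate is all zeros, the OR leaves bits 31-16 of R[rs] untouched even when bit 15 of the immediate is 1.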

Datapath of Immediate Instruction
• R[rt] ← R[rs] op ZeroExt[imm16]
• Example: ori rt, rs, imm16
• Write the results of R-type instructions to Rd
[Figure: datapath with a RegDst mux (inputs Rd and Rt) selecting Rw, Rs driving Ra, Rb don't-care, RegWr, register file (32 32-bit registers, busA, busB), ZeroExt of imm16, and a mux on the ALU's second input]
• Why do we need a multiplexor here?
• ori control signals: RegDst = ?; RegWr = ?; ALUctr = ?; ALUSrc = ?
  - RegDst = 1; RegWr = 1; ALUctr = or; ALUSrc = 1

Datapath for lw (memory access instruction)
• LOAD and STORE
  - lw rt, rs, imm16
• I-type format, bits 31-0: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)

RTL: The Load Instruction
• lw rt, rs, imm16
  format, bits 31-0: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
  - M[PC]                              Instruction Fetch
  - Addr ← R[rs] + SignExt(imm16)      Compute the address
  - R[rt] ← M[Addr]                    Load data to rt
  - PC ← PC + 4                        Update PC
• Why use sign extension rather than zero extension?
• Sign extension SignExt(imm16): bits 31-16 are copies of bit 15, bits 15-0 hold imm16
  - bit 15 = 0: bits 31-16 are 0000 0000 0000 0000
  - bit 15 = 1: bits 31-16 are 1111 1111 1111 1111
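
One way to see why the load address uses sign extension is to compute an address with a negative offset both ways. The sketch below is illustrative; the helper names and the base address are assumptions.

```python
def sign_ext(imm16: int) -> int:
    """Widen a 16-bit immediate to 32 bits by copying bit 15 into bits 31-16."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if (imm16 & 0x8000) else imm16


def load_address(base: int, imm16: int) -> int:
    """Addr <- R[rs] + SignExt(imm16), wrapped to 32 bits."""
    return (base + sign_ext(imm16)) & 0xFFFFFFFF


base = 0x10000010
offset = 0xFFFC                              # 16-bit two's-complement encoding of -4
signed_addr = load_address(base, offset)     # base - 4
zero_addr = (base + (offset & 0xFFFF)) & 0xFFFFFFFF   # what zero extension would give
print(hex(signed_addr), hex(zero_addr))
```

With sign extension the offset reaches 4 bytes below the base register; with zero extension the same bit pattern would point 65532 bytes above it, so negative offsets would be impossible.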

Datapath for Load Instruction
• R[rt] ← M[ R[rs] + SignExt[imm16] ]
• Example: lw rt, rs, imm16
[Figure: load datapath; recoverable labels: RegDst mux, Rs, don't-care, register file (Ra, Rb; 32 32-bit registers), ALU, Data Memory (Mem En, Adr)]
• ExtOp: 0 = zero extension, 1 = sign extension
• What are the control signals: RegDst, RegWr, ALUctr, ExtOp, ALUSrc, MemWr, MemtoReg?
  - RegDst = 1, RegWr = 1, ALUctr = add, ExtOp = 1, ALUSrc = 1, MemWr = 0, MemtoReg = 1

SW instruction
• LOAD and STORE
  - sw rt, rs, imm16
• I-type format, bits 31-0: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)

RTL: The Store Instruction
• sw rt, rs, imm16
  format, bits 31-0: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
  - Addr ← R[rs] + SignExt(imm16)
  - Mem[Addr] ← R[rt]
  - PC ← PC + 4

Datapath for SW
• M[ R[rs] + SignExt[imm16] ] ← R[rt]
• Example: sw rt, rs, imm16
[Figure: store datapath; recoverable labels: RegDst mux (Rd, Rt), Rs, Rt, RegWr, register file (Rw, Ra, Rb; 32 32-bit registers), ALU, Data Memory with Data In; callout "Why add this?"]
• RegDst = x, RegWr = 0, ALUctr = add, ExtOp = 1, ALUSrc = 1, MemWr = 1, MemtoReg = x

• BRANCH:
  - beq rs, rt, imm16
• I-type format, bits 31-0: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)

RTL: The Branch Instruction
• beq rs, rt, imm16
  format, bits 31-0: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
  - Cond ← R[rs] - R[rt]                            Compare rs and rt
  - if (Cond eq 0)                                  Calculate the next instruction's address
        PC ← PC + 4 + ( SignExt(imm16) x 4 )
    else
        PC ← PC + 4
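
The branch register transfer can be sketched as a next-PC function. The function names and the example addresses are assumptions made for illustration.

```python
def sign_ext(imm16: int) -> int:
    """Copy bit 15 of a 16-bit immediate into bits 31-16."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if (imm16 & 0x8000) else imm16


def next_pc_beq(pc: int, rs_val: int, rt_val: int, imm16: int) -> int:
    """Next PC for beq: taken if R[rs] - R[rt] == 0, else sequential."""
    cond = (rs_val - rt_val) & 0xFFFFFFFF          # the ALU subtract
    if cond == 0:
        # The immediate counts instructions, so it is scaled by 4 (shift left 2).
        return (pc + 4 + (sign_ext(imm16) << 2)) & 0xFFFFFFFF
    return (pc + 4) & 0xFFFFFFFF


taken_fwd = next_pc_beq(0x100, 7, 7, 3)        # equal, offset +3 instructions
taken_back = next_pc_beq(0x100, 7, 7, 0xFFFE)  # equal, offset -2 instructions
not_taken = next_pc_beq(0x100, 7, 8, 3)        # not equal
print(hex(taken_fwd), hex(taken_back), hex(not_taken))
```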

Datapath for beq
• Example: beq rs, rt, imm16
• We need to compare Rs and Rt!
[Figure: branch datapath; recoverable labels: RegDst mux (Rd, Rt), Rs, Rt, RegWr, register file (Rw, Ra, Rb; 32 32-bit registers; busW, busA, busB; Clk), extender (imm16, 16 to 32 bits, ExtOp), Next Addr Logic, address to Instruction Memory]
• Q: How to design the addressing logic?
• RegDst = x, RegWr = 0, ALUctr = sub, ExtOp = x, ALUSrc = 0, MemWr = 0, MemtoReg = x
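
The control values quoted on the datapath slides for add, ori, lw, sw, and beq can be collected into one lookup table. This is a sketch, not the deck's own summary table: entries the slides state are marked "(slides)", and everything else is an assumption inferred from the slides' convention that RegDst = 1 selects rt (as in ori and lw). The helper `dest_register` is hypothetical.

```python
X = "x"  # don't-care

CONTROL = {
    # add: the slides give only ALUctr and RegWr; the other values are assumed.
    "add": dict(RegDst=0, RegWr=1, ALUctr="add", ExtOp=X, ALUSrc=0, MemWr=0, MemtoReg=0),
    # ori: RegDst, RegWr, ALUctr, ALUSrc (slides); ExtOp, MemWr, MemtoReg assumed.
    "ori": dict(RegDst=1, RegWr=1, ALUctr="or", ExtOp=0, ALUSrc=1, MemWr=0, MemtoReg=0),
    # lw, sw, beq: all seven values as quoted (slides).
    "lw":  dict(RegDst=1, RegWr=1, ALUctr="add", ExtOp=1, ALUSrc=1, MemWr=0, MemtoReg=1),
    "sw":  dict(RegDst=X, RegWr=0, ALUctr="add", ExtOp=1, ALUSrc=1, MemWr=1, MemtoReg=X),
    "beq": dict(RegDst=X, RegWr=0, ALUctr="sub", ExtOp=X, ALUSrc=0, MemWr=0, MemtoReg=X),
}


def dest_register(name: str, rt: int, rd: int):
    """Register number written by the instruction, or None if RegWr is 0."""
    sig = CONTROL[name]
    if sig["RegWr"] != 1:
        return None
    return rt if sig["RegDst"] == 1 else rd   # assumed mux: 1 -> rt, 0 -> rd


for name, sig in CONTROL.items():
    print(f"{name:4}", " ".join(f"{k}={v}" for k, v in sig.items()))
```

Signals that change state (RegWr, MemWr) are never don't-care, since a stray write would corrupt a register or memory; mux selects can be "x" whenever the value they steer is discarded.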

Instruction Fetch Unit at the End of Branch
• if (Zero == 1) then PC = PC + 4 + SignExt[imm16]*4; else PC = PC + 4
[Figure: instruction fetch unit producing Instruction<31:0>; recoverable labels: busW, Clk, ALU, register file (Ra, Rb; 32 32-bit registers)]

Jump Operation
• JUMP:
  - j target
• J-type format, bits 31-0: op (6 bits) | target address (26 bits)

Executing Jump Operations
• Jump operation involves
  - replacing the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits
[Figure: Instruction Memory (Read Address, Instruction) with the Jump address path]
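
The jump address computation can be sketched as follows. The function name is an assumption, and the upper four bits are taken from PC + 4, matching the "PC+4[31-28]" label on the later datapath figure.

```python
def jump_target(pc: int, instr: int) -> int:
    """Concatenate (PC + 4)[31-28] with the 26-bit target shifted left by 2."""
    target26 = instr & 0x03FFFFFF                  # Instr[25-0]
    upper4 = (pc + 4) & 0xF0000000                 # PC+4[31-28]
    return upper4 | (target26 << 2)                # 4 + 28 = 32 bits


# Hypothetical j instruction: opcode 0x02 in bits 31-26, target field 0x0100008.
instr = (0x02 << 26) | 0x0100008
print(hex(instr), hex(jump_target(0x00400010, instr)))
```

The shift by 2 works because instructions are word-aligned, so the two low address bits are always 00 and need not be stored in the instruction.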

Single Cycle Datapath (Without Jump)
[Figure: single-cycle datapath without jump. Recoverable labels: PC; Instruction Memory (Read Address, Instr[31-0]); Control Unit driven by Instr[31-26], with signals including MemtoReg, ALUSrc, RegWrite; register file ports Read Addr 1, Read Addr 2, Write Addr, Write Data, Read Data 1, Read Data 2, fed from Instr[25-21], Instr[20-16], Instr[15-11]; Extend (Instr[15-0], 16 to 32 bits); Shift left 2; ALU control driven by Instr[5-0]; data memory ports Read Data, Write Data]

Step 4: Adding the Control
• Selecting the operations to perform (ALU, Register File and Memory read/write)
• Controlling the flow of data (multiplexor inputs)

  Format   31-26   25-21   20-16   15-11   10-6    5-0
  R-type   op      rs      rt      rd      shamt   funct
  I-type   op      rs      rt      address offset (15-0)
  J-type   op      target address (25-0)

• Observations
  - op field always in bits 31-26
  - addr of registers to be read are always specified by the rs field (bits 25-21) and rt field (bits 20-16); for lw and sw rs is the base register
  - addr. of register to be written is in one of two places: in rt (bits 20-16) for lw; in rd (bits 15-11) for R-type instructions
  - offset for beq, lw, and sw always in bits 15-0

R-type Instruction Data/Control Flow
[Figure: the single-cycle datapath (same components and labels as the datapath without jump), shown for an R-type instruction]

Load Word Instruction Data/Control Flow
[Figure: the single-cycle datapath (same components and labels as the datapath without jump), shown for a load word instruction]

Branch Instruction Data/Control Flow
[Figure: the single-cycle datapath (same components and labels as the datapath without jump), shown for a branch instruction]

Adding the Jump Operation
[Figure: the single-cycle datapath extended for jump. Additional recoverable labels: Instr[25-0] shifted to 28 bits, combined with PC+4[31-28] into a 32-bit jump address]

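Assemble Control Logic
[Figure: Instruction<31:0> from Inst Memory split into the fields Rs, Rt, Rd, Imm16 (bit ranges shown include <21:25>, <11:15>, <0:15>); control signals shown include nPC_sel, RegWr, RegDst, ExtOp, ALUSrc, and the Equal condition input; other labels are truncated]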

A Summary of the Control Signals
[Table not recoverable: control-signal values per instruction, including ALUctr<2:0>; "x" entries are marked "We Don't Care 🙂"]
• Formats covered: R-type; I-type (ori, lw, sw, beq); J-type (jump)

The Concept of Local Decoding
Two levels of decoding: Main Control and ALU Control
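
A minimal sketch of two-level decoding follows, under stated assumptions: the main control looks only at the opcode and emits a coarse ALU operation class, and the ALU control consults the funct field only for R-type instructions. The table names, the class label "rtype", and the symbolic ALUctr values are invented for illustration (the slides' 3-bit ALUctr encoding is not recoverable); the opcode and funct numbers are the standard MIPS encodings.

```python
# Level 1 (main control): opcode -> ALU operation class.
MAIN_ALU_CLASS = {
    0x00: "rtype",   # R-type: defer to the funct field
    0x0D: "or",      # ori
    0x23: "add",     # lw: address = base + offset
    0x2B: "add",     # sw: address = base + offset
    0x04: "sub",     # beq: compare by subtracting
}

# Level 2 (ALU control): funct -> ALU operation, used only for R-type.
FUNCT_ALU_OP = {0x20: "add", 0x22: "sub", 0x24: "and", 0x25: "or", 0x2A: "slt"}


def alu_control(opcode: int, funct: int) -> str:
    """Return the symbolic ALUctr value for an instruction."""
    alu_class = MAIN_ALU_CLASS[opcode]
    if alu_class == "rtype":
        return FUNCT_ALU_OP[funct]       # local decoding of the funct field
    return alu_class                     # funct bits are ignored for I-type


print(alu_control(0x00, 0x22), alu_control(0x23, 0x22), alu_control(0x04, 0x00))
```

Splitting the decode this way keeps the main control small: it never needs the 6 funct bits, and the ALU control never needs the full opcode.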
