CMPEN 331
Lecture 13

Chapter 3 — Arithmetic for Computers — 2
Review questions solved

Introduction
• CPU performance factors
  • Instruction count
    • Determined by ISA and compiler
  • CPI and cycle time
    • Determined by CPU hardware
• We will examine two MIPS implementations
  • A simplified version
  • A more realistic pipelined version
• Simple subset, shows most aspects
  • Memory reference: lw, sw
  • Arithmetic/logical: add, sub, and, or, slt
  • Control transfer: beq, j
Chapter 4 — The Processor — 3
§4.1 Introduction


Instruction Execution
Generic implementation
• Use the program counter (PC) to supply the instruction address and fetch the instruction from memory (and update the PC)
• Decode the instruction (and read registers)
• Execute the instruction
[Figure: execution cycle Fetch (PC = PC+4) → Decode → Exec]
All instructions (except j) use the ALU after reading the registers
Depending on instruction class
• Use ALU to calculate
  • Arithmetic result
  • Memory address for load/store
  • Branch target address
• Access data memory for load/store
• PC ← target address or PC+4
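The final step, choosing between a target address and PC+4, can be sketched as a small behavioral model. This is only an illustration under the usual MIPS conventions for beq and j (16-bit word displacement added to PC+4, 26-bit word address combined with the upper 4 bits of PC+4); the function names `sign_extend16` and `next_pc` are hypothetical and do not come from the slides.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def next_pc(pc, taken_branch=False, offset16=0, jump=False, target26=0):
    """Return the next PC for the sequential, taken-beq, and j cases."""
    pc_plus_4 = (pc + 4) & MASK32
    if jump:
        # Upper 4 bits of PC+4, then the 26-bit word address, then 00.
        return (pc_plus_4 & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)
    if taken_branch:
        # Word displacement: sign-extend, shift left 2, add to PC+4.
        return (pc_plus_4 + (sign_extend16(offset16) << 2)) & MASK32
    return pc_plus_4


if __name__ == "__main__":
    print(hex(next_pc(0x00400000)))                                        # sequential
    print(hex(next_pc(0x00400010, taken_branch=True, offset16=0xFFFF)))    # branch back by one word
    print(hex(next_pc(0x00400010, jump=True, target26=0x0100000)))         # jump
```

A displacement of 0xFFFF is -1 words, so a taken branch at 0x00400010 lands on the branch instruction itself: PC+4 minus 4.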

CPU Overview
n Can’t just join wires together
n Use multiplexers
Chapter 4 — The Processor — 5

Control

Adding the Control
• Selecting the operations to perform (ALU, Register File and Memory read/write)
• Controlling the flow of data (multiplexor inputs)
Instruction formats (bit positions in brackets)
R-type: op [31:26] | rs [25:21] | rt [20:16] | rd [15:11] | shamt [10:6] | funct [5:0]
I-type: op [31:26] | rs [25:21] | rt [20:16] | address offset [15:0]
J-type: op [31:26] | target address [25:0]
Observations
• op field always in bits 31-26
• addresses of registers to be read are always specified by the rs field (bits 25-21) and rt field (bits 20-16); for lw and sw, rs is the base register
• address of register to be written is in one of two places: in rt (bits 20-16) for lw; in rd (bits 15-11) for R-type instructions
• offset for beq, lw, and sw always in bits 15-0
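The bit positions listed in the observations translate directly into shifts and masks. The sketch below is one way to model that; the helper name `decode_fields` and the dictionary keys are assumptions made for illustration. All fields are extracted unconditionally, just as the hardware wires every field out of the instruction and lets the control logic decide which ones matter.

```python
def decode_fields(instr):
    """Split a 32-bit MIPS instruction word into its R/I/J-format fields."""
    instr &= 0xFFFFFFFF
    return {
        "op": (instr >> 26) & 0x3F,      # bits 31-26, present in every format
        "rs": (instr >> 21) & 0x1F,      # bits 25-21
        "rt": (instr >> 16) & 0x1F,      # bits 20-16
        "rd": (instr >> 11) & 0x1F,      # bits 15-11 (R-type)
        "shamt": (instr >> 6) & 0x1F,    # bits 10-6  (R-type)
        "funct": instr & 0x3F,           # bits 5-0   (R-type)
        "offset": instr & 0xFFFF,        # bits 15-0  (I-type)
        "target": instr & 0x03FFFFFF,    # bits 25-0  (J-type)
    }


if __name__ == "__main__":
    # add $t0, $s1, $s2 encodes as 0x02324020
    print(decode_fields(0x02324020))
    # lw $t0, 32($s3) encodes as 0x8E680020
    print(decode_fields(0x8E680020))
```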

Logic Design Basics
• Information encoded in binary
  • Low voltage = 0, High voltage = 1
  • One wire per bit
  • Multi-bit data encoded on multi-wire buses
• Combinational element
  • Operate on data
  • Output is a function of input
• State (sequential) elements
  • Store information
§4.2 Logic Design Conventions

Combinational Elements
n AND-gate n Adder nY=A&B nY=A+B
A
Y
+ B
A B
Y
n Multiplexer
n Y = S ? I1 : I0
I0 Mu Y I1 x
S
n Arithmetic/Logic Unit n Y = F(A, B)
A
ALU Y
B
F
Chapter 4 — The Processor — 9
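Because a combinational element's output depends only on its current inputs, each one can be modelled as a pure function. The following is a minimal sketch assuming 32-bit buses; the function names are illustrative only.

```python
MASK32 = 0xFFFFFFFF  # assume 32-bit buses


def and_gate(a, b):
    """Y = A & B"""
    return a & b & MASK32


def adder(a, b):
    """Y = A + B, wrapping around at 32 bits like a hardware adder."""
    return (a + b) & MASK32


def mux(s, i0, i1):
    """Y = S ? I1 : I0"""
    return i1 if s else i0
```

Calling any of these twice with the same arguments always gives the same result, which is exactly what distinguishes them from the state elements on the following slides.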


Sequential Elements
Register: stores data in a circuit
• Uses a clock signal to determine when to update the stored value
• Edge-triggered: update when Clk changes from 0 to 1
[Figure: register symbol (inputs D and Clk, output Q) with a timing diagram of Clk, D, and Q]


Sequential Elements
Register with write control
• Only updates on clock edge when write control input is 1
• Used when stored value is required later
[Figure: register symbol (inputs D, Clk, and Write, output Q) with a timing diagram of Clk, Write, D, and Q]
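The two register variants can be captured in one small behavioral model. This is a sketch, not a hardware description: the class name `Register` and its `tick` method are hypothetical, and the model simply remembers the previous clock level to detect a 0 to 1 edge.

```python
class Register:
    """Edge-triggered register with an optional write-control input."""

    def __init__(self, width=32):
        self.mask = (1 << width) - 1
        self.q = 0            # stored value, visible on output Q
        self._prev_clk = 0    # last clock level seen, used for edge detection

    def tick(self, clk, d, write=1):
        """Apply the current clk, D, and Write levels and return Q.

        Q changes only when clk goes from 0 to 1 while write is 1.
        """
        rising_edge = self._prev_clk == 0 and clk == 1
        if rising_edge and write:
            self.q = d & self.mask
        self._prev_clk = clk
        return self.q
```

Leaving `write` at its default of 1 gives the plain register from the first slide; driving it to 0 holds the stored value across clock edges.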

Clocking Methodology
• The clocking methodology defines when data in a state element is valid and stable relative to the clock
  • State elements – a memory element such as a register
  • Edge-triggered – all state changes occur on a clock edge
• Combinational logic transforms data during clock cycles
  • Between clock edges
  • Input from state elements, output to state element
  • Longest delay determines clock period
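To make "longest delay determines clock period" concrete, the sketch below sums the element delays along each instruction's path and takes the maximum. The delay values (200 ps for a memory access or ALU operation, 100 ps for a register file access) are illustrative assumptions, not figures from these slides, and `min_clock_period` is a hypothetical helper.

```python
# Assumed, illustrative element delays in picoseconds.
MEM, REG, ALU = 200, 100, 200

# Elements each instruction class passes through between two clock edges.
PATHS_PS = {
    "R-format": [MEM, REG, ALU, REG],        # fetch, read regs, ALU, write reg
    "lw":       [MEM, REG, ALU, MEM, REG],   # fetch, read regs, address, data memory, write reg
    "sw":       [MEM, REG, ALU, MEM],        # fetch, read regs, address, data memory
    "beq":      [MEM, REG, ALU],             # fetch, read regs, compare
}


def min_clock_period(paths_ps):
    """The clock period must cover the slowest path through combinational logic."""
    return max(sum(delays) for delays in paths_ps.values())


if __name__ == "__main__":
    print(min_clock_period(PATHS_PS), "ps")
```

Under these assumed numbers the load path is the longest, so every instruction would have to run at the load's pace in a single-cycle design.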

CMPEN 331
Lecture 14

Chapter 4

Building a Datapath
• Datapath
  • Elements that process data and addresses in the CPU
  • Registers, ALUs, mux’s, memories, …
• We will build a MIPS datapath incrementally
  • Refining the overview design
§4.3 Building a Datapath

Instruction Fetch
• Fetching instructions involves
  • Reading the instruction from the Instruction Memory
  • Updating the PC value to be the address of the next (sequential) instruction
• PC is updated every clock cycle, so it does not need an explicit write control signal, just a clock signal
• Reading from the Instruction Memory is a combinational activity, so it doesn’t need an explicit read control signal
[Figure: the PC (a 32-bit register) addresses the Instruction Memory; an adder increments the PC by 4 for the next instruction]

Basic Architecture
• To carry out each instruction, the control unit must:
  • Fetch – Read instruction from inst. mem.
  • Decode – Determine the operation and operands of the instruction
  • Execute – Carry out the instruction’s operation using the datapath
Example program in instruction memory I:
0: RF[0]=D[0]
1: RF[1]=D[1]
2: RF[2]=RF[0]+RF[1]
3: D[9]=RF[2]
Controller states: Init (PC=0), Fetch (IR=I[PC], PC=PC+1), Decode, Execute
[Figure: control unit (PC, IR, controller) and datapath (data memory D, n-bit 2×1 mux, register file RF, ALU) executing instruction 0. (a) Fetch: IR holds RF[0]=D[0] and PC goes 0 → 1. (b) Decode: the controller identifies a “load”. (c) Execute: D[0] = 99 is written to the register file, R[0]: ?? → 99.]


Basic Architecture (instruction 1)
[Figure: the same control unit and datapath executing instruction 1. (a) Fetch: IR holds RF[1]=D[1] and PC goes 1 → 2. (b) Decode: the controller identifies a “load”. (c) Execute: D[1] = 102 is written to the register file, R[1]: ?? → 102.]

Basic Architecture (instruction 2)
[Figure: the same control unit and datapath executing instruction 2. (a) Fetch: IR holds RF[2]=RF[0]+RF[1] and PC goes 2 → 3. (b) Decode: the controller identifies an “ALU (add)”. (c) Execute: the ALU adds 99 and 102, and the sum 201 is written to the register file, R[2]: ?? → 201.]


Basic Architecture (instruction 3)
[Figure: the same control unit and datapath executing instruction 3. (a) Fetch: IR holds D[9]=RF[2] and PC goes 3 → 4. (b) Decode: the controller identifies a “store”. (c) Execute: R[2] = 201 is written to data memory, D[9]: ?? → 201.]
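The four-instruction walk-through can be replayed with a tiny fetch/decode/execute loop. This is a sketch under simplifying assumptions: instructions are represented as Python tuples rather than encoded words, and the names `run`, `PROGRAM`, and the tuple layout are invented for illustration. The data values 99 and 102 are the ones shown in the figures.

```python
# (operation, operands...) tuples standing in for the instruction memory I.
PROGRAM = [
    ("load", 0, 0),      # 0: RF[0] = D[0]
    ("load", 1, 1),      # 1: RF[1] = D[1]
    ("add", 2, 0, 1),    # 2: RF[2] = RF[0] + RF[1]
    ("store", 9, 2),     # 3: D[9] = RF[2]
]


def run(program, data, num_regs=4):
    """Execute the program and return (register file, data memory, final PC)."""
    rf = [0] * num_regs
    pc = 0                              # Init: PC = 0
    while pc < len(program):
        ir = program[pc]                # Fetch: IR = I[PC]
        pc += 1                         #        PC = PC + 1
        op = ir[0]                      # Decode: which operation, which operands
        if op == "load":                # Execute
            rf[ir[1]] = data[ir[2]]
        elif op == "add":
            rf[ir[1]] = rf[ir[2]] + rf[ir[3]]
        elif op == "store":
            data[ir[1]] = rf[ir[2]]
        else:
            raise ValueError(f"unknown operation: {op}")
    return rf, data, pc


if __name__ == "__main__":
    rf, data, pc = run(PROGRAM, {0: 99, 1: 102})
    print(rf, data, pc)
```

Note that this toy machine increments the PC by 1 per instruction, as in the figures, whereas the MIPS datapath elsewhere in the lecture increments it by 4 because instructions are 4 bytes wide.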

Decoding Instructions
• Decoding instructions involves
  • sending the fetched instruction’s opcode and function field bits to the control unit
  and
  • reading two values from the Register File
    – Register File addresses are contained in the instruction
[Figure: the Instruction feeds the Control Unit and the Register File (inputs Read Addr 1, Read Addr 2, Write Addr, Write Data; outputs Read Data 1, Read Data 2); execution cycle Fetch (PC = PC+4) → Decode → Exec]
Fixed Program Counter
    module program_counter (
        input             update,
        input             clk,
        input             rst,
        output reg [31:0] pc
    );
        parameter INCREMENT_AMOUNT = 32'd4;

        // Asynchronous reset; otherwise advance by a fixed 4 bytes when update is 1.
        always @(posedge clk or posedge rst) begin
            if (rst)
                pc <= 0;
            else if (update)
                pc <= pc + INCREMENT_AMOUNT;
        end
    endmodule
[Figure: an adder adds 4 to PC to form New PC; Update enables the PC register]

Variable Program Counter
    module program_counter (
        input             update,
        input      [31:0] instruction_size,
        input             clk,
        input             rst,
        output reg [31:0] pc
    );
        // Same as the fixed counter, but the increment is an input.
        always @(posedge clk or posedge rst) begin
            if (rst)
                pc <= 0;
            else if (update)
                pc <= pc + instruction_size;
        end
    endmodule
[Figure: an adder adds Instr_Size to PC to form New PC; Update enables the PC register]

R-Format Instructions
• R format operations (add, sub, slt, and, or)
R-type: op [31:26] | rs [25:21] | rt [20:16] | rd [15:11] | shamt [10:6] | funct [5:0]
  • perform operation (op and funct) on values in rs and rt
  • store the result back into the Register File (into location rd)
• Note that the Register File is not written every cycle (e.g. sw), so we need an explicit write control signal for the Register File

Executing Load and Store Operations
• Load and store operations involve
  • computing the memory address by adding the base register (read from the Register File during decode) to the 16-bit sign-extended offset field in the instruction
  • store: value (read from the Register File during decode) written to the Data Memory
  • load: value, read from the Data Memory, written to the Register File
[Figure: load/store datapath. The Instruction supplies Read Addr 1, Read Addr 2, and Write Addr to the Register File (RegWrite); Read Data 1 and the sign-extended (16 → 32 bit) offset feed the ALU (ALU control; outputs overflow, zero); the ALU result is the Address of the Data Memory (MemRead, MemWrite); Read Data 2 is its Write Data, and its Read Data returns to the Register File as Write Data]

[Figure: Data path]

Branch Instructions
• Read register operands
• Compare operands
  • Use ALU, subtract and check Zero output
• Calculate target address
  • Sign-extend displacement
  • Shift left 2 places (word displacement)
  • Add to PC + 4
    • Already calculated by instruction fetch

Branch Instructions
[Figure: branch datapath, including the PC and the Add-4 unit]

Fixed Program Counter with Offset Branching
    module program_counter (
        input             update,
        input             branch,
        input      [15:0] branch_offset,
        input             clk,
        input             rst,
        output reg [31:0] pc
    );
        parameter INCREMENT_AMOUNT = 32'd4;

        always @(posedge clk or posedge rst) begin
            if (rst)
                pc <= 0;
            else if (update) begin
                // Note: the offset is zero-extended and added to pc as written here,
                // so only forward byte offsets work. MIPS beq instead sign-extends
                // the offset, shifts it left 2, and adds it to PC + 4.
                if (branch)
                    pc <= pc + {16'd0, branch_offset};
                else
                    pc <= pc + INCREMENT_AMOUNT;
            end
        end
    endmodule
[Figure: a multiplexer controlled by Branch? selects Branch Offset or 4; an adder adds the selection to PC to form New PC; Update enables the PC register]

Composing the Elements
• First-cut data path does an instruction in one clock cycle
  • Each datapath element can only do one function at a time
  • Hence, we need separate instruction and data memories
• Use multiplexers where alternate data sources are used for different instructions

[Figure: R-Type/Load/Store Datapath]

Implementing Jumps
Jump: op [31:26] | address [25:0]
• Jump uses word address
• Update PC with concatenation of
  • Top 4 bits of old PC
  • 26-bit jump address
  • 00
• Need an extra control signal decoded from opcode

[Figure: Datapath With Jumps Added]

ALU Control
• ALU used for
  • Load/Store: F = add
  • Branch: F = subtract
  • R-type: F depends on funct field

ALU control   Function
0000          AND
0001          OR
0010          add
0110          subtract
0111          set-on-less-than
1100          NOR

§4.4 A Simple Implementation Scheme

ALU Control
• Assume 2-bit ALUOp derived from opcode
  • Combinational logic derives ALU control

opcode   ALUOp   Operation          funct    ALU function       ALU control
lw       00      load word          XXXXXX   add                0010
sw       00      store word         XXXXXX   add                0010
beq      01      branch equal       XXXXXX   subtract           0110
R-type   10      add                100000   add                0010
                 subtract           100010   subtract           0110
                 AND                100100   AND                0000
                 OR                 100101   OR                 0001
                 set-on-less-than   101010   set-on-less-than   0111

MIPS ALU in Verilog
• The ALU has 7 ports.
A Verilog behavioral definition of a MIPS ALU (pseudo code)
    module MIPSALU (ALUctl, A, B, ALUOut, Zero);
        input      [3:0]  ALUctl;
        input      [31:0] A, B;
        output reg [31:0] ALUOut;
        output            Zero;

        assign Zero = (ALUOut == 0); // Zero is true if ALUOut is 0

        always @(ALUctl, A, B) begin // reevaluate if these change
            case (ALUctl)
                0:  ALUOut = A & B;
                1:  ALUOut = A | B;
                2:  ALUOut = A + B;
                6:  ALUOut = A - B;
                7:  ALUOut = ($signed(A) < $signed(B)) ? 1 : 0; // slt compares as signed
                12: ALUOut = ~(A | B); // result is nor
                default: ALUOut = 0;
            endcase
        end
    endmodule

The MIPS ALU control
This is combinational control logic. (Pseudo code)
    module ALUControl (ALUOp, FuncCode, ALUCtl);
        input      [1:0] ALUOp;
        input      [5:0] FuncCode;
        output reg [3:0] ALUCtl;

        always @(*)
            case (ALUOp)
                2'b00: ALUCtl = 2; // lw, sw: add
                2'b01: ALUCtl = 6; // beq: subtract
                default:           // R-type: decode the funct field
                    case (FuncCode)
                        32: ALUCtl = 2;  // add
                        34: ALUCtl = 6;  // subtract
                        36: ALUCtl = 0;  // and
                        37: ALUCtl = 1;  // or
                        39: ALUCtl = 12; // nor
                        42: ALUCtl = 7;  // slt
                        default: ALUCtl = 15; // should not happen
                    endcase
            endcase
    endmodule

The Main Control Unit
• Control signals derived from instruction

R-type:     0 [31:26]        | rs [25:21] | rt [20:16] | rd [15:11] | shamt [10:6] | funct [5:0]
Load/Store: 35 or 43 [31:26] | rs [25:21] | rt [20:16] | address [15:0]
Branch:     4 [31:26]        | rs [25:21] | rt [20:16] | address [15:0]

• Bits 31:26: opcode
• rs: always read
• rt: read, except for load
• rd (R-type) or rt (load): written
• address: sign-extend and add
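The ALU control encodings and the two Verilog modules in this section can be cross-checked with a short behavioral model. The sketch below assumes the ALUOp and funct encodings given in the ALU Control slides; the function names and the `FUNCT_TO_CTL` table are illustrative, and set-on-less-than is modelled as a signed comparison.

```python
MASK32 = 0xFFFFFFFF

# funct field -> 4-bit ALU control, for R-type instructions (ALUOp = 10).
FUNCT_TO_CTL = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # subtract
    0b100100: 0b0000,  # AND
    0b100101: 0b0001,  # OR
    0b101010: 0b0111,  # set-on-less-than
}


def alu_control(alu_op, funct):
    """Derive the ALU control lines from the 2-bit ALUOp and the funct field."""
    if alu_op == 0b00:          # lw, sw: add to form the address
        return 0b0010
    if alu_op == 0b01:          # beq: subtract and test Zero
        return 0b0110
    return FUNCT_TO_CTL[funct]  # R-type: operation depends on funct


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def alu(ctl, a, b):
    """Return (ALUOut, Zero) for 32-bit operands a and b."""
    a &= MASK32
    b &= MASK32
    if ctl == 0b0000:
        out = a & b
    elif ctl == 0b0001:
        out = a | b
    elif ctl == 0b0010:
        out = (a + b) & MASK32
    elif ctl == 0b0110:
        out = (a - b) & MASK32
    elif ctl == 0b0111:
        out = 1 if to_signed(a) < to_signed(b) else 0
    elif ctl == 0b1100:
        out = ~(a | b) & MASK32
    else:
        out = 0
    return out, int(out == 0)
```

For example, a beq with equal operands yields a Zero output of 1: `alu(alu_control(0b01, 0), 5, 5)` returns `(0, 1)`.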