Processor Components
Where we are now
Assembly Language Processors
Copyright By PowCoder代写 加微信 powcoder
Arithmetic Logic Units
Devices Circuits
Transistors
Finite State Machines
Flip-flops
Microprocessors
Sofar,we’vebeentalking about making devices, such as adders, counters and registers.
Theultimategoalisto
make a microprocessor, which is a digital device that processes input, can store values and produces output, according to a set of on-board instructions.
The Final Destination
PCWriteCond
Control Unit
Shift left 2
Instruction [31-26]
Instruction [25-21]
Instruction [20-16]
Instruction [15-0]
Read reg 1 Readreg2 data1
Write reg Read data 2
Write data
Memory data
Write data
Instruction Register
ALU ALU result Out
Memory data register
Sign extend
Shift left 2
Deconstructing processors
Simpleratahighlevel:
Controller Unit
“Datapath”
Storage Unit
Arithmetic Unit
Datapath vs. Control
Datapath:wherealldatacomputationstake place.
Oftenadiagramversionofrealwiredconnections. Controlunit:orchestratestheactionsthat
take place in the datapath.
Thecontrolunitisabigfinite-statemachine that instructs the datapath to perform all appropriate actions.
Datapath example
Thinking back to Lab 4
Recallthatyoucreated an ALU an a register for Lab 4.
TheALUstoresitsoutput into the register,
TheregisterprovidesaninputvaluetotheALU.
Whatiftherewerewasmorethanone
register available to store ALU output values?
Howdoweindicatewhichregistertowriteto? HowdowedeterminetheALUinputs?
Lab 4 ALU with 2 registers
LdRB SelAB
Lab 4 ALU with 2 registers
Thingstoconsider:
With2registers,weneed to indicate which register (if any) will store the ALU output value.
The register write signals LdRA and LdRB handle this.
WealsoneedtoindicatewhetherX,registerA,or register B will provide the input values to the ALU.
The multiplexers determine the inputs A and B of the ALU by setting the values for SelxA and SelAB
Datapath signals
Useinputsignalsto operate this datapath:
LdRA = Load Reg A
LdRB = Load Reg B
ALUop = Add vs Mult
SelxA = First ALU input
0 input comes from X, 1 input comes from RA
SelAB = Select second ALU input
0 input comes from RA, 1 input comes from RB Howdoweusethistoperformacalculation?
Example: Calculating x2 + 2x
Let’sassumewehaveanexternalvalueX, supplied from outside the datapath.
Howdoweusethedatapathtocalculatex2+2x?
Needtocoordinatethedatapathcomponents:
ALU(toadd,subtractandmultiplyvalues)
Multiplexers(todeterminewhattheinputsshould be to the ALU)
Registers(toholdvaluesusedinthecalculation)
Calculating x2 + 2x
To calculate x2 + 2x:
LoadXintoRA&RB
MultiplyRAbyRB Store result in RA
AddXtoRA
Store result in RA
AddXtoRAagain
Result from ALU is x2 + 2x.
Howdowemaketheseoperationshappen?
Making the calculation
High-level Steps
LoadXintoRA&RB
Multiply RA & RB
Store result in RA AddXtoRA
Store result in RA Add X to RA again
ALU output is x2 + 2x.
Who sends these signals?
Control Signals
SelxA = 0, ALUop = A, LdRA=1, LdRB=1
SelxA = 1, SelAB = 1, ALUop=Mult, LdRA=1
SelxA = 0, SelAB = 0, ALUop=Add, LdRA=1
SelxA = 0, SelAB = 0, ALUop = Add
The Control Unit
Essentially,agiantFiniteStateMachine
Synchronized to system-wide signals (clock, resetn)
Outputsthedatapathcontrolsignals
SelxA, SelAB
LdRA, LdRB
=> control mux outputs (ALU inputs) => controls ALU operation
=> controls loading for registers RA, RB
Somearchitecturesalsooutputadonesignal, when the computation is complete
Yet another output; not shown in our datapaths
clk resetn
These signals are optional, for whenever the operation starts or stops
Datapath + Control
selxA selAB
(ALU, registers, muxes)
F (ALU result)
Microprocessors
Thisdatapathexample is a combination of the major devices we’ve discussed so far:
Registerstostorevalues.
AnALU(adders,shifters,etc)toprocessdata. Finitestatemachinestocontroltheprocess.
Microprocessorsarethebasisofall computing since the 1970’s, and can be found in nearly every sort of electronics.
Microprocessor components
Tounderstandthefullmicroprocessor,we’ll visit each major component in depth.
Startingwiththearithmeticunit.
Controller Unit
Storage Unit
Arithmetic Unit
The Arithmetic Unit
aka: the Arithmetic Logic Unit (ALU)
Arithmetic Logic Unit
Thefirstmicroprocessorapplicationswere calculators.
Recalltheunitonaddersandsubtractors.
Thesearepartofalargerstructurecalledthe
Arithmetic Logic Unit (ALU).
Like the ones you made for the labs.
Thislargerstructureisresponsibleforthe processing of all data values in a basic CPU.
ALU inputs
TheALUperformsallof
the arithmetic operations covered in this course so
far, and logical operations as well (AND, OR, NOT, etc.)
InputSrepresentsselectbits(inthiscase,S2S1& S0) that specify the operation to perform.
ThecarrybitCinisusedinoperationssuchas incrementing an input value or the overall result.
ALU outputs
Inadditiontotheinput
signals, there are output ALUop, signals V, C, N & Z which Cin indicate special conditions
in the arithmetic result:
V: overflow condition
Theresultoftheoperationcouldnotbestoredinthen
bits of G, meaning that the result is incorrect.
C: carry-out bit
Usedtodetecterrorsinunsignedarithmetic.
N: Negative indicator
Z: Zero-condition indicator
ALU block diagram
In addition to data inputs and outputs, this circuit also has:
outputs indicating the different conditions,
inputs specifying the operation to perform (similar to Sub).
A1 G … G0
Data input A
Data input B
Carry input
Operation & Mode select
Data output G
Carry output
Overflow indicator Negative indicator Zero indicator
ALU Disclaimer
TherearemultiplewaysthattheALUcanbe implemented.
Allimplementationsdothesamegeneralfunction (arithmetic and logical operations).
TheoperationsthattheALUcanperform,howit performs them, and specific input and output signals can vary.
Wewillgiveyouacoupleofimplementations (that you should learn), but just keep in mind that others are possible as well.
The ALU from the labs
Thisdesigniseasyto see and build, but not optimized for size.
ALUdesignsneedto find ways to avoid duplicating logic and reducing the footprint.
Tools like Quartus reduce these designs automatically when mapping to gates.
Subtracter
Bitwise AND
Revisiting the “A” of ALU
WhenbuildingtheALUoperationsfrom scratch, we start with the arithmetic side.
Fundamentally,thissideismadeofanadder/ subtractor unit, which we’ve seen already:
Y3 Y2 Y1 Y0 X3 X2 X1 X0
FA C3 FA C2 FA C1 FA Cin S3 S2 S1 S0
What else could we put into this layer?
Arithmetic components
n-bit parallel adder
n n G=X+Y+Cin B
Inadditiontoadditionandsubtraction,manymore operations can be performed by manipulating what is added to input A, as shown in the diagram above.
B input logic
Arithmetic operations
IftheinputlogiccircuitontheleftsendsB straight through to the adder, result is G = A+B
WhatifBwasreplacedbyallonesinstead?
Result of addition operation: G = A-1
WhatifBwasreplacedbyBandCinwashigh? Result of addition operation: G = A-B
AndwhatifBwasreplacedbyallzeroes?
Result is: G = A. (Not interesting, but useful!)
Instead of a Sub signal, the operation you want is signaled using the select bits S0 & S1.
Operation selection
Select bits
Result Operation
G=A Transfer
G = A+B Addition
G = A+B Subtraction – 1
G = A-1 Decrement
Thisisagoodstart!Butsomethingismissing… Wait,whataboutthecarrybit?
Full operation selection
Select Input Operation
S1 S0 Y Cin=0 Cin=1
G = A (transfer) G = A+1 (increment)
G=A+B (add) G=A+B+1
G=A+B G=A+B+1 (subtract)
G=A-1 (decrement) G=A (transfer)
Basedonthevaluesontheselectbitsandthe carry bit, we can perform any number of basic arithmetic operations by manipulating what value is added to A.
The “L” of ALU
We also want a circuit that can perform logical operations,
in addition to arithmetic ones.
How do we tell which operation perform?
4-to-1 mux
Another select bit!
If S2 = 1, then the output of the logic circuit block
appears at the ALU output.
Multiplexer is used to determine which block (logical or arithmetic) goes to the output.
ALU: Arithmetic + Logic
Arithmetic circuit
Ai N Bi Z
Logic circuit
What about multiplication?
Multiplication(anddivision)operationsare always more complicated than other arithmetic (plus, minus) or logical (AND, OR) operations.
Threemajorwaysthatmultiplicationcanbe implemented in circuitry:
Layeredrowsofadderunits. Anadder/shiftercircuit
Booth’sAlgorithm
Binary Multiplication
Revisitinggrade3math…
1368 1368 912
Binary Multiplication
5*6 (unsigned)
Result: 30
Andnow,inbinary… 101 10
110 110 000
Binary Multiplication
Orseenanotherway….
000 000 101
Binary Multiplication
Implementation
Implementingthis in circuitry involves the summation of several AND terms.
ANDgatescombine input signals.
Adderscombinethe outputs of the AND gates.
Multiplication
Thisimplementation results in an array of adder circuits to make the multiplier circuit.
Thiscangetalittle expensive as the size of the operands grows.
N-bitnumbersO(1)time,butO(N2)size.
Isthereanalternativetothiscircuit?
Accumulator circuits
Whatifyoucouldperformeachstageofthe multiplication operation, one after the other?
Thiscircuitwouldonly need a single row of adders and a couple of shift registers.
Register A cout
Register B
Shift Left 1
Howwidedoes register R have to be?
Isthereasimpler way to do this?
Shift Left 1
Adder Register R
Accumulator, illustrated
a3 a2 a1 a0
a3b0 a2b0 a1b0 a0b0
a3b1 a2b1 a1b1 a0b1
a3b2 a2b2 a1b2 a0b2
a3b3 a2b3 a1b3 a0b3
Accumulator, illustrated
a3 a2 a1 a0
a3b0 a2b0 a1b0 a0b0
a3b1 a2b1 a1b1 a0b1
a3b2 a2b2 a1b2 a0b2
a3b3 a2b3 a1b3 a0b3
Isthereamoreefficientwaytodothis?
Booth’s Algorithm
Devisedasawaytotakeadvantageofcircuits where shifting is cheaper than adding, or where space is at a premium.
Based on the premise that when multiplying by certain values (e.g. 99), it can be easier to think of this operation as a difference between two products.
Considertheshortcutmethodwhenmultiplyinga given decimal value X by 9999:
X*9999 = X*10000 – X*1
Nowconsidertheequivalentprobleminbinary: X*001111 = X*010000 – X*1
Booth’s Algorithm
Thisideaistriggeredoncaseswheretwo neighboring digits in an operand are different.
Ifdigitsatiandi-1are0and1,themultiplicand is added to the result at position i.
Ifdigitsatiandi-1are1and0,themultiplicand is subtracted from the result at position i.
Theresultisalwaysavaluewhosesizeisthe sum of the sizes of the two multiplicands.
Booth’s Algorithm
Example: Add B here
x 00011110
+ 111110101110
0100110011100
Subtract B from here
Sign extend this before adding
Booth’s Algorithm
Weneedtomakethisworkinhardware. Option#1:Havehardwaresetuptocompare
neighbouring bits at every position in A, with adders in place for when the bits don’t match.
Problem: This is a lot of hardware, which Booth’s Algorithm is trying to avoid.
Option#2:Havehardwaresetuptocomparetw0 neighbouring bits, and have them move down through A, looking for mismatched pairs.
Problem: Hardware doesn’t move like that. Oops.
Booth’s Algorithm
Stillneedtomakethisworkinhardware…
Option#3:Havehardwaresetuptocomparetw0 neighbouring bits in the lowest position of A, and looking for mismatched pairs in A by shifting A to the right one bit at a time.
Solution! This could work, but the accumulated solution P would have to shift one bit at a time as well, so that when B is added or subtracted, it’s from the correct position.
Booth’s Algorithm
StepsinBooth’sAlgorithm:
Designate the two multiplicands as A & B, and the result as some product P.
Add an extra zero bit to the right-most side of A.
Repeat the following for each original bit in A: If the last two bits of A are the same, do nothing.
If the last two bits of A are 01, then add B to the highest bitsofP.
If the last two bits of A are 10, then subtract B from the highest bits of P.
Perform one-digit arithmetic right-shift on both P and A. The result in P is the product of A and B.
Booth’s Algorithm Example
Example:(-5) * 2 Steps#1:
A=-5 11011
Addextrazerototheright
B=2 00010
-B=-2 11110
P=0 0000000000
Booth’s Algorithm Example
Step#3(repeat5times): ChecklasttwodigitsofA:
Sincedigitsare10,subtractBfromthemost significant digits of P:
00000 00000
11110 00000
ArithmeticshiftPandAonebittotheright: A = 111011 P = 11111 00000
Booth’s Algorithm Example
Step#3(repeat4moretimes): ChecklasttwodigitsofA:
Sincedigitsare11,donothingtoP.
ArithmeticshiftPandAonebittotheright: A = 111101 P = 11111 10000
Booth’s Algorithm Example
Step#3(repeat3moretimes): ChecklasttwodigitsofA:
Sincedigitsare01,addBtothemostsignificant
digits of P:
11111 10000
00001 10000
ArithmeticshiftPandAonebittotheright: A = 111110 P = 00000 11000
Booth’s Algorithm Example
Step#3(repeat2moretimes): ChecklasttwodigitsofA:
Sincedigitsare10,subtractBfromthemost significant digits of P:
00000 11000
11110 11000
ArithmeticshiftPandAonebittotheright: A = 111111 P = 11111 01100
Booth’s Algorithm Example
Step#3(finaltime):
ChecklasttwodigitsofA:
Sincedigitsare11,donothingtoP:
ArithmeticshiftPandAonebittotheright: A = 111111 P = 11111 10110
Finalproduct:
P = 111110110 = -10
Reflections on multiplication
Apopularversionofthisalgorithminvolves copying A into the lower bits of P, so that the testing and shifting only takes place in P.
Also good for maintaining the original value of A.
Multiplicationisn’tascommonanoperationas addition or subtraction, but occurs enough that its implementation is handled in the hardware, rather than by the CPU.
Mostcommonmultiplicationanddivision operations are powers of 2. For this, the shift register is used instead of the multiplier circuit.
A Barrel Shifter unit
D3 D2 D1 D0
3 2 1 0 S1 S0
4-to-1 MUX
3 2 1 0 S1 S0
4-to-1 MUX
3 2 1 0 S1 S0
4-to-1 MUX
3 2 1 0 S1 S0
4-to-1 MUX
Y3 Y2 Y1 Y0
This barrel shifter shifts and rotates D to the left by S bits. IfS1S0 is01 => Y=D2D1D0D3
IfS1S0 is11 => Y=D0D3D2D1
This is a purely combinational circuit, unlike the shift registers in the lab.
Expanding our view
Constant in
n G select A
Address out
MuxB select
H select IR
MuxF select
So where do A and B come from?
The Storage Unit
aka: the register file and main memory
Memory and registers
Theprocessorhasregistersthatstoreasingle value (program counters, instruction registers, etc.)
TherearealsounitsintheCPUthatstorelarge amounts of data for use by the CPU:
Registerfile:Smallnumberoffastmemoryunits that allow multiple values to be read and written simultaneously.
Mainmemory:Largergridofmemorycellsthatare used to store the main information to be processed by the CPU.
An Analogy
Ifregistersarebooks…
Theregisterfileisthe
pile of books on your
desk, small in number
but available for quick access.
Mainmemoryisthelibrary.Largercapacity, but takes time to access.
Otherelements:cache(locallibrarybranch), and networks (collections around the world)
The Register File
Register 0
Register 1
Register 2
… Register
Register 31
Register File
The register file stores the data that the processor uses for quick calculations.
For the example we use in this course, the typical register file contains 32 registers.
All the registers share a single set of input and output wires.
Like an apartment building with a single shared entrance, as opposed to a street of houses.
So how do we specify a register toreadorwrite?
Writing to the registers
Register 0
Register 1
Register 2
… Register
Register 31
Towritearegister,weneed to activate the Write signal for the desired register.
Assume we have the number (from 0 to 31, in binary) of the register in this file.
This number is also called the address of the register.
Giventheregister’saddress, how do we turn on the write signal for that register?
Register File
Writing to registers
Solution:
Usedemuxto specify a single register, based on that register’s address.
Alsoknownasa one-hot decoder.
Register 0
Register 1
Register 2
… Register
Register 31
Register File
One-hot decoders (for writing)
Theone-hotdecodertakesinam-bitbinaryaddressto activate one of the 2m registers in the register file.
Address bits One-hot decoder output
Reading from registers
Solution:
Considerthesame approach that was used in writing.
Mux with address as select bits.
Register 0
Register 1
Register 2
Register 31
Register File
Register File Functionality
Register 0
Register 1
Register 2
… Register
Register 2m
Read/Write Destination Reg.
(m-bit address) Data to write
Register A (m-bit address)
Register B (m-bit address)
Register File
n-bit value from Reg. A
n-bit value from Reg. B
Handling multiple registers
Theregisterunitinourmicroprocessorstores32 registers (each register storing a 32 bit value).
Howdoweaccessorupdateasingleregister?
Need to specify ALU input A and ALU input B when
performing a read operation.
Need to specify a register to write to (and the data
value to write) when performing a write operation.
Bothofthesearedonebyspecifyingtheaddress
of each register among the 32 available.
Each address will be 5 bits (log2 32).
Register File – Write Operation
Load Enable
Reg A select
Reg B select
Destination
Reg. Address
Register File – Write Operation
Reg A select
Reg B select
Destination
Reg. Address
Register File – Read Operation
Load Enable
Reg A select
Reg B select
Note: Data, A and B, and all the registers (R0 to R3) have the same bitwidth (e.g., n bits).
Destination
Reg. Addre
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com