代写代考 Processor Components

Processor Components

Where we are now
Assembly Language Processors

Copyright By PowCoder代写 加微信 powcoder

Arithmetic Logic Units
Devices Circuits
Transistors
Finite State Machines
Flip-flops

Microprocessors
 Sofar,we’vebeentalking about making devices, such as adders, counters and registers.
 Theultimategoalisto
make a microprocessor, which is a digital device that processes input, can store values and produces output, according to a set of on-board instructions.

The Final Destination
PCWriteCond
Control Unit
Shift left 2
Instruction [31-26]
Instruction [25-21]
Instruction [20-16]
Instruction [15-0]
Read reg 1 Readreg2 data1
Write reg Read data 2
Write data
Memory data
Write data
Instruction Register
ALU ALU result Out
Memory data register
Sign extend
Shift left 2

Deconstructing processors
 Simpleratahighlevel:
Controller Unit
“Datapath”
Storage Unit
Arithmetic Unit

Datapath vs. Control
 Datapath:wherealldatacomputationstake place.
 Oftenadiagramversionofrealwiredconnections.  Controlunit:orchestratestheactionsthat
take place in the datapath.
 Thecontrolunitisabigfinite-statemachine that instructs the datapath to perform all appropriate actions.

Datapath example

Thinking back to Lab 4
 Recallthatyoucreated an ALU an a register for Lab 4.
 TheALUstoresitsoutput into the register,
 TheregisterprovidesaninputvaluetotheALU.
 Whatiftherewerewasmorethanone
register available to store ALU output values?
 Howdoweindicatewhichregistertowriteto?  HowdowedeterminetheALUinputs?

Lab 4 ALU with 2 registers
LdRB SelAB

Lab 4 ALU with 2 registers
 Thingstoconsider:
 With2registers,weneed to indicate which register (if any) will store the ALU output value.
 The register write signals LdRA and LdRB handle this.
 WealsoneedtoindicatewhetherX,registerA,or register B will provide the input values to the ALU.
 The multiplexers determine the inputs A and B of the ALU by setting the values for SelxA and SelAB

Datapath signals
 Useinputsignalsto operate this datapath:
 LdRA = Load Reg A
 LdRB = Load Reg B
 ALUop = Add vs Mult
 SelxA = First ALU input
 0  input comes from X, 1  input comes from RA
 SelAB = Select second ALU input
 0  input comes from RA, 1  input comes from RB  Howdoweusethistoperformacalculation?

Example: Calculating x2 + 2x
 Let’sassumewehaveanexternalvalueX, supplied from outside the datapath.
 Howdoweusethedatapathtocalculatex2+2x?
 Needtocoordinatethedatapathcomponents:
 ALU(toadd,subtractandmultiplyvalues)
 Multiplexers(todeterminewhattheinputsshould be to the ALU)
 Registers(toholdvaluesusedinthecalculation)

Calculating x2 + 2x
 To calculate x2 + 2x:
 LoadXintoRA&RB
 MultiplyRAbyRB  Store result in RA
 AddXtoRA
 Store result in RA
 AddXtoRAagain
 Result from ALU is x2 + 2x.
 Howdowemaketheseoperationshappen?

Making the calculation
High-level Steps
 LoadXintoRA&RB
 Multiply RA & RB
 Store result in RA  AddXtoRA
 Store result in RA  Add X to RA again
 ALU output is x2 + 2x.
 Who sends these signals?
Control Signals
 SelxA = 0, ALUop = A, LdRA=1, LdRB=1
 SelxA = 1, SelAB = 1, ALUop=Mult, LdRA=1
 SelxA = 0, SelAB = 0, ALUop=Add, LdRA=1
 SelxA = 0, SelAB = 0, ALUop = Add

The Control Unit
 Essentially,agiantFiniteStateMachine
 Synchronized to system-wide signals (clock, resetn)
 Outputsthedatapathcontrolsignals
 SelxA, SelAB
 LdRA, LdRB
=> control mux outputs (ALU inputs) => controls ALU operation
=> controls loading for registers RA, RB
 Somearchitecturesalsooutputadonesignal, when the computation is complete
 Yet another output; not shown in our datapaths

clk resetn
These signals are optional, for whenever the operation starts or stops
Datapath + Control
selxA selAB
(ALU, registers, muxes)
F (ALU result)

Microprocessors
 Thisdatapathexample is a combination of the major devices we’ve discussed so far:
 Registerstostorevalues.
 AnALU(adders,shifters,etc)toprocessdata.  Finitestatemachinestocontroltheprocess.
 Microprocessorsarethebasisofall computing since the 1970’s, and can be found in nearly every sort of electronics.

Microprocessor components
 Tounderstandthefullmicroprocessor,we’ll visit each major component in depth.
 Startingwiththearithmeticunit.
Controller Unit
Storage Unit
Arithmetic Unit

The Arithmetic Unit
aka: the Arithmetic Logic Unit (ALU)

Arithmetic Logic Unit
 Thefirstmicroprocessorapplicationswere calculators.
 Recalltheunitonaddersandsubtractors.
 Thesearepartofalargerstructurecalledthe
Arithmetic Logic Unit (ALU).
 Like the ones you made for the labs.
 Thislargerstructureisresponsibleforthe processing of all data values in a basic CPU.

ALU inputs
 TheALUperformsallof
the arithmetic operations covered in this course so
far, and logical operations as well (AND, OR, NOT, etc.)
 InputSrepresentsselectbits(inthiscase,S2S1& S0) that specify the operation to perform.
 ThecarrybitCinisusedinoperationssuchas incrementing an input value or the overall result.

ALU outputs
 Inadditiontotheinput
signals, there are output ALUop, signals V, C, N & Z which Cin indicate special conditions
in the arithmetic result:
 V: overflow condition
 Theresultoftheoperationcouldnotbestoredinthen
bits of G, meaning that the result is incorrect.
 C: carry-out bit
 Usedtodetecterrorsinunsignedarithmetic.
 N: Negative indicator
 Z: Zero-condition indicator

ALU block diagram
 In addition to data inputs and outputs, this circuit also has:
 outputs indicating the different conditions,
 inputs specifying the operation to perform (similar to Sub).
A1 G … G0
Data input A
Data input B
Carry input
Operation & Mode select
Data output G
Carry output
Overflow indicator Negative indicator Zero indicator

ALU Disclaimer
 TherearemultiplewaysthattheALUcanbe implemented.
 Allimplementationsdothesamegeneralfunction (arithmetic and logical operations).
 TheoperationsthattheALUcanperform,howit performs them, and specific input and output signals can vary.
 Wewillgiveyouacoupleofimplementations (that you should learn), but just keep in mind that others are possible as well.

The ALU from the labs
 Thisdesigniseasyto see and build, but not optimized for size.
 ALUdesignsneedto find ways to avoid duplicating logic and reducing the footprint.
 Tools like Quartus reduce these designs automatically when mapping to gates.
Subtracter
Bitwise AND

Revisiting the “A” of ALU
 WhenbuildingtheALUoperationsfrom scratch, we start with the arithmetic side.
 Fundamentally,thissideismadeofanadder/ subtractor unit, which we’ve seen already:
Y3 Y2 Y1 Y0 X3 X2 X1 X0
FA C3 FA C2 FA C1 FA Cin S3 S2 S1 S0
What else could we put into this layer?

Arithmetic components
n-bit parallel adder
n n G=X+Y+Cin B
 Inadditiontoadditionandsubtraction,manymore operations can be performed by manipulating what is added to input A, as shown in the diagram above.
B input logic

Arithmetic operations
 IftheinputlogiccircuitontheleftsendsB straight through to the adder, result is G = A+B
 WhatifBwasreplacedbyallonesinstead?
 Result of addition operation: G = A-1
 WhatifBwasreplacedbyBandCinwashigh?  Result of addition operation: G = A-B
 AndwhatifBwasreplacedbyallzeroes?
 Result is: G = A. (Not interesting, but useful!)
 Instead of a Sub signal, the operation you want is signaled using the select bits S0 & S1.

Operation selection
Select bits
Result Operation
G=A Transfer
G = A+B Addition
G = A+B Subtraction – 1
G = A-1 Decrement
 Thisisagoodstart!Butsomethingismissing…  Wait,whataboutthecarrybit?

Full operation selection
Select Input Operation
S1 S0 Y Cin=0 Cin=1
G = A (transfer) G = A+1 (increment)
G=A+B (add) G=A+B+1
G=A+B G=A+B+1 (subtract)
G=A-1 (decrement) G=A (transfer)
 Basedonthevaluesontheselectbitsandthe carry bit, we can perform any number of basic arithmetic operations by manipulating what value is added to A.

The “L” of ALU
 We also want a circuit that can perform logical operations,
in addition to arithmetic ones.
 How do we tell which operation perform?
4-to-1 mux
 Another select bit!
 If S2 = 1, then the output of the logic circuit block
appears at the ALU output.
 Multiplexer is used to determine which block (logical or arithmetic) goes to the output.

ALU: Arithmetic + Logic
Arithmetic circuit
Ai N Bi Z
Logic circuit

What about multiplication?
 Multiplication(anddivision)operationsare always more complicated than other arithmetic (plus, minus) or logical (AND, OR) operations.
 Threemajorwaysthatmultiplicationcanbe implemented in circuitry:
 Layeredrowsofadderunits.  Anadder/shiftercircuit
 Booth’sAlgorithm

Binary Multiplication
 Revisitinggrade3math…
1368 1368 912

Binary Multiplication
5*6 (unsigned)
Result: 30
 Andnow,inbinary… 101 10
110 110 000

Binary Multiplication
 Orseenanotherway….
000 000 101

Binary Multiplication

Implementation
 Implementingthis in circuitry involves the summation of several AND terms.
 ANDgatescombine input signals.
 Adderscombinethe outputs of the AND gates.

Multiplication
 Thisimplementation results in an array of adder circuits to make the multiplier circuit.
 Thiscangetalittle expensive as the size of the operands grows.
 N-bitnumbersO(1)time,butO(N2)size.
 Isthereanalternativetothiscircuit?

Accumulator circuits
 Whatifyoucouldperformeachstageofthe multiplication operation, one after the other?
 Thiscircuitwouldonly need a single row of adders and a couple of shift registers.
Register A cout
Register B
Shift Left 1
 Howwidedoes register R have to be?
 Isthereasimpler way to do this?
Shift Left 1
Adder Register R

Accumulator, illustrated
a3 a2 a1 a0
a3b0 a2b0 a1b0 a0b0
a3b1 a2b1 a1b1 a0b1
a3b2 a2b2 a1b2 a0b2
a3b3 a2b3 a1b3 a0b3

Accumulator, illustrated
a3 a2 a1 a0
a3b0 a2b0 a1b0 a0b0
a3b1 a2b1 a1b1 a0b1
a3b2 a2b2 a1b2 a0b2
a3b3 a2b3 a1b3 a0b3
 Isthereamoreefficientwaytodothis?

Booth’s Algorithm
 Devisedasawaytotakeadvantageofcircuits where shifting is cheaper than adding, or where space is at a premium.
 Based on the premise that when multiplying by certain values (e.g. 99), it can be easier to think of this operation as a difference between two products.
 Considertheshortcutmethodwhenmultiplyinga given decimal value X by 9999:
 X*9999 = X*10000 – X*1
 Nowconsidertheequivalentprobleminbinary:  X*001111 = X*010000 – X*1

Booth’s Algorithm
 Thisideaistriggeredoncaseswheretwo neighboring digits in an operand are different.
 Ifdigitsatiandi-1are0and1,themultiplicand is added to the result at position i.
 Ifdigitsatiandi-1are1and0,themultiplicand is subtracted from the result at position i.
 Theresultisalwaysavaluewhosesizeisthe sum of the sizes of the two multiplicands.

Booth’s Algorithm
 Example: Add B here
x 00011110
+ 111110101110
0100110011100
Subtract B from here
Sign extend this before adding

Booth’s Algorithm
 Weneedtomakethisworkinhardware.  Option#1:Havehardwaresetuptocompare
neighbouring bits at every position in A, with adders in place for when the bits don’t match.
 Problem: This is a lot of hardware, which Booth’s Algorithm is trying to avoid.
 Option#2:Havehardwaresetuptocomparetw0 neighbouring bits, and have them move down through A, looking for mismatched pairs.
 Problem: Hardware doesn’t move like that. Oops.

Booth’s Algorithm
 Stillneedtomakethisworkinhardware…
 Option#3:Havehardwaresetuptocomparetw0 neighbouring bits in the lowest position of A, and looking for mismatched pairs in A by shifting A to the right one bit at a time.
 Solution! This could work, but the accumulated solution P would have to shift one bit at a time as well, so that when B is added or subtracted, it’s from the correct position.

Booth’s Algorithm
 StepsinBooth’sAlgorithm:
Designate the two multiplicands as A & B, and the result as some product P.
Add an extra zero bit to the right-most side of A.
Repeat the following for each original bit in A: If the last two bits of A are the same, do nothing.
If the last two bits of A are 01, then add B to the highest bitsofP.
If the last two bits of A are 10, then subtract B from the highest bits of P.
Perform one-digit arithmetic right-shift on both P and A. The result in P is the product of A and B.

Booth’s Algorithm Example
 Example:(-5) * 2  Steps#1&#2:
A=-5 11011
 Addextrazerototheright 
B=2  00010
 -B=-2  11110
 P=0  0000000000

Booth’s Algorithm Example
 Step#3(repeat5times):  ChecklasttwodigitsofA:
 Sincedigitsare10,subtractBfromthemost significant digits of P:
00000 00000
11110 00000
 ArithmeticshiftPandAonebittotheright:  A = 111011 P = 11111 00000

Booth’s Algorithm Example
 Step#3(repeat4moretimes):  ChecklasttwodigitsofA:
 Sincedigitsare11,donothingtoP.
 ArithmeticshiftPandAonebittotheright:  A = 111101 P = 11111 10000

Booth’s Algorithm Example
 Step#3(repeat3moretimes):  ChecklasttwodigitsofA:
 Sincedigitsare01,addBtothemostsignificant
digits of P:
11111 10000
00001 10000
 ArithmeticshiftPandAonebittotheright:  A = 111110 P = 00000 11000

Booth’s Algorithm Example
 Step#3(repeat2moretimes):  ChecklasttwodigitsofA:
 Sincedigitsare10,subtractBfromthemost significant digits of P:
00000 11000
11110 11000
 ArithmeticshiftPandAonebittotheright:  A = 111111 P = 11111 01100

Booth’s Algorithm Example
 Step#3(finaltime):
 ChecklasttwodigitsofA:
 Sincedigitsare11,donothingtoP:
 ArithmeticshiftPandAonebittotheright:  A = 111111 P = 11111 10110
 Finalproduct:
P = 111110110 = -10

Reflections on multiplication
 Apopularversionofthisalgorithminvolves copying A into the lower bits of P, so that the testing and shifting only takes place in P.
 Also good for maintaining the original value of A.
 Multiplicationisn’tascommonanoperationas addition or subtraction, but occurs enough that its implementation is handled in the hardware, rather than by the CPU.
 Mostcommonmultiplicationanddivision operations are powers of 2. For this, the shift register is used instead of the multiplier circuit.

A Barrel Shifter unit
D3 D2 D1 D0
3 2 1 0 S1 S0
4-to-1 MUX
3 2 1 0 S1 S0
4-to-1 MUX
3 2 1 0 S1 S0
4-to-1 MUX
3 2 1 0 S1 S0
4-to-1 MUX
Y3 Y2 Y1 Y0
 This barrel shifter shifts and rotates D to the left by S bits.  IfS1S0 is01 => Y=D2D1D0D3
 IfS1S0 is11 => Y=D0D3D2D1
 This is a purely combinational circuit, unlike the shift registers in the lab.

Expanding our view
Constant in
n G select A
Address out
MuxB select
H select IR
MuxF select
 So where do A and B come from?

The Storage Unit
aka: the register file and main memory

Memory and registers
 Theprocessorhasregistersthatstoreasingle value (program counters, instruction registers, etc.)
 TherearealsounitsintheCPUthatstorelarge amounts of data for use by the CPU:
 Registerfile:Smallnumberoffastmemoryunits that allow multiple values to be read and written simultaneously.
 Mainmemory:Largergridofmemorycellsthatare used to store the main information to be processed by the CPU.

An Analogy
 Ifregistersarebooks…
 Theregisterfileisthe
pile of books on your
desk, small in number
but available for quick access.
 Mainmemoryisthelibrary.Largercapacity, but takes time to access.
 Otherelements:cache(locallibrarybranch), and networks (collections around the world)

The Register File
Register 0
Register 1
Register 2
… Register
Register 31
Register File
 The register file stores the data that the processor uses for quick calculations.
 For the example we use in this course, the typical register file contains 32 registers.
 All the registers share a single set of input and output wires.
 Like an apartment building with a single shared entrance, as opposed to a street of houses.
 So how do we specify a register toreadorwrite?

Writing to the registers
Register 0
Register 1
Register 2
… Register
Register 31
 Towritearegister,weneed to activate the Write signal for the desired register.
 Assume we have the number (from 0 to 31, in binary) of the register in this file.
 This number is also called the address of the register.
 Giventheregister’saddress, how do we turn on the write signal for that register?
Register File

Writing to registers
 Solution:
 Usedemuxto specify a single register, based on that register’s address.
 Alsoknownasa one-hot decoder.
Register 0
Register 1
Register 2
… Register
Register 31
Register File

One-hot decoders (for writing)
 Theone-hotdecodertakesinam-bitbinaryaddressto activate one of the 2m registers in the register file.
Address bits One-hot decoder output

Reading from registers
 Solution:
 Considerthesame approach that was used in writing.
 Mux with address as select bits.
Register 0
Register 1
Register 2
Register 31
Register File

Register File Functionality
Register 0
Register 1
Register 2
… Register
Register 2m
Read/Write Destination Reg.
(m-bit address) Data to write
Register A (m-bit address)
Register B (m-bit address)
Register File
n-bit value from Reg. A
n-bit value from Reg. B

Handling multiple registers
 Theregisterunitinourmicroprocessorstores32 registers (each register storing a 32 bit value).
 Howdoweaccessorupdateasingleregister?
 Need to specify ALU input A and ALU input B when
performing a read operation.
 Need to specify a register to write to (and the data
value to write) when performing a write operation.
 Bothofthesearedonebyspecifyingtheaddress
of each register among the 32 available.
 Each address will be 5 bits (log2 32).

Register File – Write Operation
Load Enable
Reg A select
Reg B select
Destination
Reg. Address

Register File – Write Operation
Reg A select
Reg B select
Destination
Reg. Address

Register File – Read Operation
Load Enable
Reg A select
Reg B select
Note: Data, A and B, and all the registers (R0 to R3) have the same bitwidth (e.g., n bits).
Destination
Reg. Addre

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com