CMPEN
Lecture 11
331
Dealing with Overflow
• Some languages (e.g., C) ignore overflow • UseMIPSaddu,addiu,subuinstructions
• Other languages (e.g., Ada, Fortran) require raising an exception
• UseMIPSadd,addi,subinstructions
• Onoverflow,invokeexceptionhandler
• Save PC in exception program counter (EPC) register
• Jump to predefined handler address
• mfc0 (move from coprocessor reg) instruction can retrieve EPC value, to return after corrective action
Chapter 3 — Arithmetic for Computers — 2
Multiplication
• Start with long-multiplication approach
multiplicand multiplier
1000
× 1001
1000
0000 0000
1000 1001000
product
Length of product is the sum of operand lengths
Chapter 3 — Arithmetic for Computers — 3
§3.3 Multiplication
Multiplication Hardware
Chapter 3 — Arithmetic for Computers — 4
Initially 0
Optimized Multiplier
• Perform steps in parallel: add/shift
n One cycle per partial-product addition
n That’s ok, if frequency of multiplications is low
Chapter 3 — Arithmetic for Computers — 5
Faster Multiplier
• Uses multiple adders
• Cost/performancetradeoff
n Can be pipelined
n Several multiplication performed in parallel Chapter 3 — Arithmetic for Computers — 6
MIPS Multiplication
• Two32-bitregistersforproduct • HI: most-significant 32 bits
• LO: least-significant 32-bits
• Instructions
•multrs,rt / multurs,rt
• 64-bit product in HI/LO
•mfhird / mflord
• Move from HI/LO to rd
• mul rd, rs, rt
• Least-significant 32 bits of product –> rd
Chapter 3 — Arithmetic for Computers — 7
quotient dividend
Division
• •
Check for 0 divisor
Long division approach
• If divisor ≤ dividend bits
• 1bitinquotient,subtract
• Otherwise
• 0bitinquotient,bringdownnext dividend bit
Restoring division
• Do the subtract, and if remainder goes < 0, add divisor back
Signed division
• Divide using absolute values
• Adjust sign of quotient and remainder as required
1001
1000 1001010
divisor
-1000
10 •
101
1010 -1000 10
•
remainder
n-bit operands yield n-bit quotient and remainder
Chapter 3 — Arithmetic for Computers — 8
§3.4 Division
Division Hardware
Chapter 3 — Arithmetic for Computers — 9
Initially divisor in left half
Initially dividend
Optimized Divider
• Onecycleperpartial-remaindersubtraction
• Looksalotlikeamultiplier!
• Same hardware can be used for both
Chapter 3 — Arithmetic for Computers — 10
Faster Division
• Can’t use parallel hardware as in multiplier
• Subtractionisconditionalonsignofremainder
Chapter 3 — Arithmetic for Computers — 11
MIPS Division
• Use HI/LO registers for result • HI:32-bitremainder
• LO:32-bitquotient
• Instructions
•div rs, rt / divu rs, rt
• Nooverflowordivide-by-0checking
• Software must perform checks if required
• Usemfhi,mflotoaccessresult
Chapter 3 — Arithmetic for Computers — 12
Floating Point
• Representation for non-integral numbers
• Includingverysmallandverylargenumbers
• Like scientific notation • –2.34×1056
• +0.002×10–4
• In binary
• ±1.xxxxxxx2 × 2yyyy
• Types float and double in C
Chapter 3 — Arithmetic for Computers — 13
§3.5 Floating Point
Floating Point Standard
• Defined by IEEE Std 754-1985
• Developed in response to divergence of representations
• Portabilityissuesforscientificcode
• Now almost universally adopted
• Two representations
• Singleprecision(32-bit) • Doubleprecision(64-bit)
Chapter 3 — Arithmetic for Computers — 14
IEEE Floating
single: 8 bits single: 23 bits double: 11 bits double: 52 bits
-
Point Format
S
Exponent
Fraction
S (Exponent-Bias) x=(-1) ´(1+Fraction)´2
• S: sign bit (0 Þ non-negative, 1 Þ negative)
• Normalize significand: 1.0 ≤ |significand| < 2.0
• Always has a leading pre-binary-point 1 bit, so no need to represent it explicitly (hidden bit)
• Significand is Fraction with the “1.” restored
• Exponent: excess representation: actual exponent + Bias
• Ensures exponent is unsigned
• Single: Bias = 127;; Double: Bias = 1023
Chapter 3 — Arithmetic for Computers — 15
Single
• Exponents00000000and11111111reserved • Smallestvalue
• Exponent: 00000001
Þ actual exponent = 1 – 127 = –126
• Fraction: 000...00 Þ significand = 1.0
• ±1.0 × 2–126 ≈ ±1.2 × 10–38
• Largestvalue
• exponent: 11111110
Þ actual exponent = 254 – 127 = +127
• Fraction: 111...11 Þ significand ≈ 1.11111
• ±1.11111 × 2+127 ≈ ±3.4 × 10+38
Chapter 3 — Arithmetic for Computers — 16
-
Precision Range
CMPEN
Lecture 12
331
IEEE Floating
single: 8 bits single: 23 bits double: 11 bits double: 52 bits
-
Point Format
S
Exponent
Fraction
S (Exponent-Bias) x=(-1) ´(1+Fraction)´2
• S: sign bit (0 Þ non-negative, 1 Þ negative)
• Normalize significand: 1.0 ≤ |significand| < 2.0
• Always has a leading pre-binary-point 1 bit, so no need to represent it explicitly (hidden bit)
• Significand is Fraction with the “1.” restored
• Exponent: excess representation: actual exponent + Bias
• Ensures exponent is unsigned
• Single: Bias = 127;; Double: Bias = 1023
Chapter 3 — Arithmetic for Computers — 18
Double
• Exponents0000...00and1111...11reserved • Smallestvalue
• Exponent: 00000000001
Þ actual exponent = 1 – 1023 = –1022
• Fraction: 000...00 Þ significand = 1.0
• ±1.0 × 2–1022 ≈ ±2.2 × 10–308
• Largestvalue
• Exponent: 11111111110
Þ actual exponent = 2046 – 1023 = +1023
• Fraction: 111...11 Þ significand ≈ 1.111111
• ±1.111111 × 2+1023 ≈ ±1.8 × 10+308
Chapter 3 — Arithmetic for Computers — 19
-
Precision Range
Floating
• Relative precision
• allfractionbitsaresignificant
• Single:approx2–23
• Equivalent to 23 × log102 ≈ 23 × 0.3 ≈ 6 decimal digits of precision
• Double:approx2–52
• Equivalent to 52 × log102 ≈ 52 × 0.3 ≈ 16 decimal digits of precision
Chapter 3 — Arithmetic for Computers — 20
-
Point Precision
Floating
• Represent –0.75
• –0.75 = (–1)1 × 1.12 × 2–1 • S=1
• Fraction=1000...002
• Exponent=–1+Bias
• Single: –1 + 127 = 126 = 011111102
• Double: –1 + 1023 = 1022 = 011111111102
• Single: 1011111101000...00
• Double: 1011111111101000...00
Chapter 3 — Arithmetic for Computers — 21
-
Point Example
Floating
• What number is represented by the single-precision float
11000000101000...00
• S=1
• Fraction=01000...002
• Fxponent = 100000012 = 129
• x=(–1)1 ×(1+012)×2(129–127) = (–1) × 1.25 × 22
= –5.0
Chapter 3 — Arithmetic for Computers — 22
-
Point Example
Floating
• Nowconsidera4-digitbinaryexample
• 1.0002 × 2–1 + –1.1102 × 2–2 (0.5 + –0.4375)
• 1.Alignbinarypoints
• Shift number with smaller exponent • 1.0002 × 2–1 + –0.1112 × 2–1
• 2.Addsignificands
• 1.0002 × 2–1 + –0.1112 × 2–1 = 0.0012 × 2–1
• 3.Normalizeresult&checkfor over/underflow
• 1.0002 × 2–4, with no over/underflow
• 4.Roundandrenormalizeifnecessary
• 1.0002 × 2–4 (no change) = 0.0625 Chapter 3 — Arithmetic for Computers — 23
-
Point Addition
FP Adder Hardware
• Much more complex than integer adder
• Doing it in one clock cycle would take too long • Muchlongerthanintegeroperations
• Slowerclockwouldpenalizeallinstructions
• FP adder usually takes several cycles • Canbepipelined
Chapter 3 — Arithmetic for Computers — 24
FP Adder Hardware
Step 1
Step 2
Step 3 Step 4
Chapter 3 — Arithmetic for Computers — 25
Floating
• Consider a 4-digit decimal example • 1.110 × 1010 × 9.200 × 10–5
• 1. Add exponents
• For biased exponents, subtract bias from sum • Newexponent=10+–5=5
• 2. Multiply significands
• 1.110 × 9.200 = 10.212 Þ 10.212 × 105
• 3. Normalize result & check for over/underflow
• 1.0212 × 106
• 4. Round and renormalize if necessary • 1.021 × 106
• 5. Determine sign of result from signs of operands • +1.021 × 106
Chapter 3 — Arithmetic for Computers — 26
-
Point Multiplication
Floating Point
• Multiply
(0.5 = 1.0000 ´ 2-1) x (-0.4375 = -1.1100´ 2-2)
• Step 0:
• Step 1: •
• Step 2:
• Step 3: •
• Step 4:
• Step 5:
Hidden bits restored in the representation above
Add the exponents (not in bias would be -1 + (-2) = -3 and in bias would be (-1+127) + (-2+127) – 127 = (-1 -2) + (127+127-127) = -3 + 127 = 124
Multiply the significands
1.0000 x 1.110 = 1.110000
Normalized the product, checking for exp over/underflow 1.110000 x 2-3 is already normalized
The product is already rounded, so we’re done Rehide the hidden bit before storing
Multiplication Example
27
FP Arithmetic Hardware
• FP multiplier is of similar complexity to FP adder
• Butusesamultiplierforsignificandsinsteadofan adder
• FP arithmetic hardware usually does
• Addition,subtraction,multiplication,division,reciprocal, square-root
• FP«integerconversion
• Operations usually takes several cycles
• Canbepipelined
Chapter 3 — Arithmetic for Computers — 28
Infinities and NaNs
• Exponent = 111...1, Fraction = 000...0
• ±Infinity
• Canbeusedinsubsequentcalculations,avoidingneed for overflow check
• Exponent = 111...1, Fraction ≠ 000...0
• Not-a-Number(NaN)
• Indicatesillegalorundefinedresult • e.g., 0.0 / 0.0
• Canbeusedinsubsequentcalculations
Chapter 3 — Arithmetic for Computers — 29
IEEE
point
numbers
754 encoding of
floating
-
Chapter 3 — Arithmetic for Computers — 30
MIPS Floating Point Instructions
• MIPS has a separate Floating Point Register File ($f0, $f1, ..., $f31) (whose registers are used in
pairs for double precision values) with special instructions to load to and store from them
lwc1 $f1,54($s2) #$f1 = Memory[$s2+54]
swc1 $f1,58($s4) #Memory[$s4+58] = $f1
• And supports IEEE 754 single
add.s $f2,$f4,$f6 #$f2 = $f4 + $f6
and double precision operations
add.d $f2,$f4,$f6 #$f2||$f3 =
$f4||$f5 + $f6||$f7
similarly for sub.s, sub.d, mul.s, mul.d, div.s, div.d
31
•
MIPS Floating Point Instructions, Con’t
And floating point single precision comparison operations
c.x.s $f2,$f4 #if($f2 < $f4) cond=1;
else cond=0
where x may be eq, neq, lt, le, gt, ge and double precision comparison operations
c.x.d $f2,$f4 #$f2||$f3 < $f4||$f5
cond=1; else
cond=0
bclt 25 #branch,trueif(cond==1) go to PC+4+25
bclf 25 #branch,falseif(cond==0) go to PC+4+25
•
And floating point branch operations
32
• Ccode:
FP Example:
°
float f2c (float fahr) {
return ((5.0/9.0)*(fahr - 32.0));
}
• fahr in $f12, result in $f0, literals in global memory space
• CompiledMIPScode:
f2c: lwc1 $f16, const5($gp) # $f16 =5.0
lwc1 $f18, const9($gp) # $f18 =9.0
div.s $f16, $f16, $f18 # $f16 =5.0/9.0
lwc1 $f18, const32($gp) # $f18 =32.0
sub.s $f18, $f12, $f18 # $f18 =fahr-32.0
mul.s $f0, $f16, $f18
jr $ra
Chapter 3 — Arithmetic for Computers — 33
F to
°
C
Accurate Arithmetic
• IEEEStd754specifiesadditionalroundingcontrol
• Extra bits of precision (guard, round)
• Choice of rounding modes
• Allows programmer to fine-tune numerical behavior of a computation
• NotallFPunitsimplementalloptions
• Most programming languages and FP libraries just use defaults
• Trade-offbetweenhardwarecomplexity, performance, and market requirements
Chapter 3 — Arithmetic for Computers — 34
Associativity
• Parallel programs may interleave operations in unexpected orders
• Assumptionsofassociativitymayfail
(x+y)+z
x+(y+z)
x
-1.50E+38
0.00E+00
-1.50E+38
y
1.50E+38
1.50E+38
z
1.0
1.0
1.00E+00
0.00E+00
n Need to validate parallel programs under varying degrees of parallelism
Chapter 3 — Arithmetic for Computers — 35
§3.6 Parallelism and Computer Arithmetic: Associativity
• •
•
Right Shift and Division
Left shift by i places multiplies an integer by 2i Right shift divides by 2i?
• Onlyforunsignedintegers
For signed integers
• Arithmeticrightshift:replicatethesignbit
• e.g.,–5/4
• 111111111111111111111111111110112 >> 2 = 001111111111111111111111111111102
= 1073741822
• Asolutionwouldbetohaveanarithmeticrightshiftthatextends the sign bit instead of shifting in 0s. A 2-bit arithmetic shift right of -5ten produces
• 111111111111111111111111111111102 = -2 instead of -1;; (close)
Chapter 3 — Arithmetic for Computers — 36
§3.8 Fallacies and Pitfalls
Review
• Overflow (floating point) happens when a positive exponent becomes too large to fit in the exponent field
• Underflow (floating point) happens when a negative exponent becomes too large to fit in the exponent field
• IEEE 754 floating point standard (-1)sign x (1+F) x 2E-bias
• Formats for both single and double precision
• F is stored in normalized format where the msb in F is 1,
called the hidden bit
• To simplify sorting FP numbers, E is represented in excess (biased) notation where the bias is -127 (-1023 for double precision) so the most negative is 00000001 = 21-127 = 2-126 and the most positive is 11111110 = 2254-127 = 2+127
Chapter 3 — Arithmetic for Computers — 37