Morgan Kaufmann Publishers, 16 October 2018
COMPUTER ORGANIZATION AND DESIGN
The Hardware/Software Interface, 5th Edition
Chapter 3 & Appendix B (continued)
Arithmetic for Computers
Floating Point
Representation for non-integral numbers
Including very small and very large numbers
Like scientific notation
–2.34 × 10^56 (normalized)
+0.002 × 10^–4 (not normalized)
+987.02 × 10^9 (not normalized)
In binary
±1.xxxxxxx₂ × 2^yyyy
Types float and double in C
§3.5 Floating Point
Floating Point Standard
Defined by IEEE standard 754-1985
Developed in response to divergence of
representations
Portability issues for scientific code
Now almost universally adopted
Two representations
Single precision (32-bit)
Double precision (64-bit)
IEEE Floating-Point Format
S: sign bit (0 non-negative, 1 negative)
Normalized significand: 1.0 ≤ |significand| < 2.0
Always has a leading pre-binary-point 1 bit, so no need to
represent it explicitly (hidden bit)
Significand is Fraction with the “1.” restored
Exponent: actual exponent + Bias
Ensures exponent is unsigned
Single: Bias = 127; Double: Bias = 1023
Field layout: S | Exponent | Fraction
Exponent: single 8 bits, double 11 bits
Fraction: single 23 bits, double 52 bits
x = (–1)^S × (1 + Fraction) × 2^(Exponent – Bias)
Single-Precision Range
Exponents 00000000 and 11111111 reserved
Smallest value
Exponent: 00000001
actual exponent = 1 – 127 = –126
Fraction: 000…00 significand = 1.0
±1.0 × 2^–126 ≈ ±1.2 × 10^–38
Largest value
exponent: 11111110
actual exponent = 254 – 127 = +127
Fraction: 111…11 significand ≈ 2.0
±2.0 × 2^+127 ≈ ±3.4 × 10^+38
Double-Precision Range
Exponents 0000…00 and 1111…11 reserved
Smallest value
Exponent: 00000000001
actual exponent = 1 – 1023 = –1022
Fraction: 000…00 significand = 1.0
±1.0 × 2^–1022 ≈ ±2.2 × 10^–308
Largest value
Exponent: 11111111110
actual exponent = 2046 – 1023 = +1023
Fraction: 111…11 significand ≈ 2.0
±2.0 × 2^+1023 ≈ ±1.8 × 10^+308
Floating-Point Precision
Relative precision
all fraction bits are significant
Single: approx 2^–23
Equivalent to 23 × log₁₀2 ≈ 23 × 0.3 ≈ 6 decimal digits of precision
Double: approx 2^–52
Equivalent to 52 × log₁₀2 ≈ 52 × 0.3 ≈ 16 decimal digits of precision
Binary Refresher
When we look at a number like 10110₂, we’re seeing it as:
1(2^4) + 0(2^3) + 1(2^2) + 1(2^1) + 0(2^0) = 16 + 4 + 2 = 22₁₀
Binary Decimal Points
In decimal, 12.63 is the same as
1(10^1) + 2(10^0) + 6(10^–1) + 3(10^–2)
In binary, 101.01₂ is the same as
1(2^2) + 0(2^1) + 1(2^0) + 0(2^–1) + 1(2^–2)
= 4 + 1 + 0.25 = 5.25
Useful Exponent Identities
a^b × a^c = a^(b+c)
Why memorize more than 2^10 when we can just break them down?
2^35 = 2^5 × 2^30 = 2^5 × 2^10 × 2^10 × 2^10 = 32 GB
a^–b = 1/a^b
When we have fractional terms, instead of seeing 2^–1, 2^–2, etc., use 1/2^1, 1/2^2, etc.
More Binary
Dividing (shifting right) by 2₁₀ is the same as moving the binary point one place to the left.
Same reasoning as when we divide by 10 in base 10.
Multiplying (shifting left) works the same way.
Floating-Point Example
Represent –0.75
–0.75 = (–1)^1 × 1.1₂ × 2^–1
S = 1
Fraction = 1000…00₂
Exponent = –1 + Bias
Single: –1 + 127 = 126 = 01111110₂
Double: –1 + 1023 = 1022 = 01111111110₂
Single: 1 01111110 1000…00
Double: 1 01111111110 1000…00
Floating-Point Example
What number is represented by the
single-precision float
1 10000001 01000…00
S = 1
Fraction = 01000…00₂
Exponent = 10000001₂ = 129
x = (–1)^1 × (1 + 0.01₂) × 2^(129 – 127)
  = (–1) × 1.25 × 2^2
  = –5.0
IEEE 754-1985 Specials
We reserve all 0s and all 1s in the exponent. This is why:
0 11111111 000…00 = +∞
1 11111111 000…00 = –∞
x 11111111 [non-zero fraction] = NaN
e.g., square root of a negative number
x 00000000 000…00 = 0
…there’s actually a positive zero and a negative zero
Floating-Point Addition
Consider a 4-digit decimal example
9.999 × 10^1 + 1.610 × 10^–1
1. Align decimal points
Shift number with smaller exponent
9.999 × 10^1 + 0.016 × 10^1
2. Add significands
9.999 × 10^1 + 0.016 × 10^1 = 10.015 × 10^1
3. Normalize result & check for over/underflow
1.0015 × 10^2
4. Round and renormalize if necessary
1.002 × 10^2
Floating-Point Addition
Now consider a 4-digit binary example
1.000₂ × 2^–1 + –1.110₂ × 2^–2 (0.5 + –0.4375)
1. Align binary points
Shift number with smaller exponent
1.000₂ × 2^–1 + –0.111₂ × 2^–1
2. Add significands
1.000₂ × 2^–1 + –0.111₂ × 2^–1 = 0.001₂ × 2^–1
3. Normalize result & check for over/underflow
1.000₂ × 2^–4, with no over/underflow
4. Round and renormalize if necessary
1.000₂ × 2^–4 (no change) = 0.0625
FP Instructions in MIPS
FP hardware is coprocessor 1
Adjunct processor that extends the ISA
Separate FP registers
32 single-precision: $f0, $f1, … $f31
Paired for double-precision: $f0/$f1, $f2/$f3, …
Release 2 of MIPS ISA supports 32 × 64-bit FP reg’s
FP instructions operate only on FP registers
Programs generally don’t do integer ops on FP data,
or vice versa
More registers with minimal code-size impact
FP load and store instructions
lwc1, ldc1, swc1, sdc1
e.g., ldc1 $f8, 32($sp)
FP Instructions in MIPS
Single-precision arithmetic
add.s, sub.s, mul.s, div.s
e.g., add.s $f0, $f1, $f6
Double-precision arithmetic
add.d, sub.d, mul.d, div.d
e.g., mul.d $f4, $f4, $f6
Single- and double-precision comparison
c.xx.s, c.xx.d (xx is eq, lt, le, …)
Sets or clears FP condition-code bit
e.g., c.lt.s $f3, $f4
Branch on FP condition code true or false
bc1t, bc1f
e.g., bc1t TargetLabel
FP Example: °F to °C
C code:
float f2c (float fahr) {
return ((5.0/9.0)*(fahr - 32.0));
}
fahr in $f12, result in $f0, literals in global
memory space
Compiled MIPS code:
f2c: lwc1 $f16, const5($gp)
lwc1 $f18, const9($gp)
div.s $f16, $f16, $f18
lwc1 $f18, const32($gp)
sub.s $f18, $f12, $f18
mul.s $f0, $f16, $f18
jr $ra
FP Example: Array Multiplication
X = X + Y × Z
All 32 × 32 matrices, 64-bit double-precision
elements
C code:
void mm (double x[][32],
double y[][32], double z[][32]) {
int i, j, k;
for (i = 0; i != 32; i = i + 1)
for (j = 0; j != 32; j = j + 1)
for (k = 0; k != 32; k = k + 1)
x[i][j] = x[i][j]
+ y[i][k] * z[k][j];
}
Addresses of x, y, z in $a0, $a1, $a2, and
i, j, k in $s0, $s1, $s2
FP Example: Array Multiplication
MIPS code:
li $t1, 32 # $t1 = 32 (row size/loop end)
li $s0, 0 # i = 0; initialize 1st for loop
L1: li $s1, 0 # j = 0; restart 2nd for loop
L2: li $s2, 0 # k = 0; restart 3rd for loop
sll $t2, $s0, 5 # $t2 = i * 32 (size of row of x)
addu $t2, $t2, $s1 # $t2 = i * size(row) + j
sll $t2, $t2, 3 # $t2 = byte offset of [i][j]
addu $t2, $a0, $t2 # $t2 = byte address of x[i][j]
l.d $f4, 0($t2) # $f4 = 8 bytes of x[i][j]
L3: sll $t0, $s2, 5 # $t0 = k * 32 (size of row of z)
addu $t0, $t0, $s1 # $t0 = k * size(row) + j
sll $t0, $t0, 3 # $t0 = byte offset of [k][j]
addu $t0, $a2, $t0 # $t0 = byte address of z[k][j]
l.d $f16, 0($t0) # $f16 = 8 bytes of z[k][j]
…
FP Example: Array Multiplication
…
sll $t0, $s0, 5 # $t0 = i*32 (size of row of y)
addu $t0, $t0, $s2 # $t0 = i*size(row) + k
sll $t0, $t0, 3 # $t0 = byte offset of [i][k]
addu $t0, $a1, $t0 # $t0 = byte address of y[i][k]
l.d $f18, 0($t0) # $f18 = 8 bytes of y[i][k]
mul.d $f16, $f18, $f16 # $f16 = y[i][k] * z[k][j]
add.d $f4, $f4, $f16 # f4=x[i][j] + y[i][k]*z[k][j]
addiu $s2, $s2, 1 # k = k + 1
bne $s2, $t1, L3 # if (k != 32) go to L3
s.d $f4, 0($t2) # x[i][j] = $f4
addiu $s1, $s1, 1 # j = j + 1
bne $s1, $t1, L2 # if (j != 32) go to L2
addiu $s0, $s0, 1 # i = i + 1
bne $s0, $t1, L1 # if (i != 32) go to L1
Accurate Arithmetic
IEEE Std 754 specifies additional rounding
control
Extra bits of precision (guard, round, sticky)
Choice of rounding modes
Allows programmer to fine-tune numerical behavior
of a computation
Not all FP units implement all options
Most programming languages and FP libraries just
use defaults
Trade-off between hardware complexity,
performance, and market requirements
Associativity
Parallel programs may interleave
operations in unexpected orders
Assumptions of associativity may fail
(x+y)+z ≠ x+(y+z)
            (x+y)+z      x+(y+z)
x         –1.50E+38    –1.50E+38
y          1.50E+38     1.50E+38
z               1.0          1.0
x+y        0.00E+00
y+z                     1.50E+38
result     1.00E+00     0.00E+00
Need to validate parallel programs under
varying degrees of parallelism