Chapter 3 Arithmetic for Computers
1
Numbers
• Bits are just bits
— conventions define relationship between bits and numbers
• Binary numbers (base 2)
0000 0001 0010 0011 0100 0101 0110 0111 1000 1001… decimal: 0…2n-1
• Of course it gets more complicated: numbers are finite (overflow)
fractions and real numbers
negative numbers
e.g., no MIPS subi instruction; addi can add a negative number
2
Number and Base
• Number with base b: uses b digits
– Decimal number: base = 10,
– Binary number: base = 2,
– Hexadecimal number: base = 16,
e.g., 108 ten
e.g., 1101100 two e.g., 6C hex
(uses 16 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F)
3
• (dn-1 dn-2 …d2 d0 )b
=dn-1 *bn-1 +dn-2 *bn-2 +…..+d1 *b1 +d0 *b0
108ten =1*102 +0*101 +8*100
1101110two =1*26 +1*25 +0*24 +1*23 +1*22 +1*21 +0*20 6Chex =6*161 +C*160
4
Conversion to a different base
Convert 27 ten to a binary number.
Keep dividing by a base 2, and keep track of remainders.
2 )_27
2 )_13 –1 2 )_6 –1 2 )_3 –0 2 )_1 –1
0 –1
Write in bottom up order 11011 two
5
Different representations of signed numbers
• Assume 3 bit words for simplicity
• Sign Magnitude: 000 = +0 001 = +1 010 = +2 011 = +3
100=-0 101=-1 110=-2 111=-3
One’s Complement 000 = +0 001 = +1 010 = +2 011 = +3
100=-3 101=-2 110=-1 111=-0
Two’s Complement 000 = +0 001 = +1 010 = +2 011 = +3
100 = -4 101 = -3 110 = -2 111 = -1
• Issues: balance, number of zeros, ease of operations
6
How to express negative numbers?
• Assume 3 bit words for simplicity
• Sign Magnitude à just set the left most digit to 1 for negative numbers.
– 100 (=-0)
– 101 (=-1)
– 110 (=-2)
• One’s complement à just invert all bits (0 -> 1, 1 -> 0)
– 111 (=-0)
– 110 (=-1)
– 101 (=-2)
• Two’s complement à invert all bits and add 1 – 111+1=000(=-0)
– 110+1=111(=-1)
– 101+1=110(=-2)
– 100+1=101(=-3)
• We will use Two’s complement representation
7
32 bit signed numbers in MIPS (2’s complement)
• 32 bit signed numbers:
0000 0000 0000 0000 0000 0000 0000 0000two = 0ten 0000 0000 0000 0000 0000 0000 0000 0001two = + 1ten 0000 0000 0000 0000 0000 0000 0000 0010two = + 2ten …
0111 1111 1111 1111 1111 1111 1111 1110two = + 2,147,483,646ten maxint 0111 1111 1111 1111 1111 1111 1111 1111two = + 2,147,483,647ten
1000 0000 0000 0000 0000 0000 0000 0000two = – 2,147,483,648ten 1000 0000 0000 0000 0000 0000 0000 0001two = – 2,147,483,647ten 1000 0000 0000 0000 0000 0000 0000 0010two = – 2,147,483,646ten …
1111 1111 1111 1111 1111 1111 1111 1101two = – 3ten 1111 1111 1111 1111 1111 1111 1111 1110two = – 2ten
1111 1111 1111 1111 1111 1111 1111 1111two = – 1ten
• (-x 31 * 2 31)+(x 30 * 2 30)+(x 29 * 2 29)+…+(x1 * 2 1)+(x0*20)
minint
8
Exercise
Assume 4 bit words for simplicity.
What is the decimal representation of 1010 two if two’s complement is used?
PositiveàNegative (invert and add 1) NegativeàPositive (subtract 1 and invert)
Thus,
Subtract 1 from 1010, we get 1001.
Then invert 1001, we get 0110. That is 6. So it is -6 ten
9
Unsigned Numbers
• For example, each memory address it not negative, so we should use 32 bits to store just 0 and positive numbers.
• With 32-bits, we can represent numbers from 0 to 2 32 -1 = 4,294,967,295 ten
10
Signed vs. Unsigned Comparison
• Example $s0 has
1111 1111 1111 1111 1111 1111 1111 1111 two And $s1 has
0000 0000 0000 0000 0000 0000 0000 0000 two
After “slt $t0, $s0, $s1”
$t0 has: ______________ $t0 has 1
After “sltu $t1, $s0, $s1” (sltu deals numbers in registers as unsigned)
$t1 has: _________________ $t1 has 0
11
Two’s Complement Operations
• Negating a two’s complement number: invert all bits and add 1
• Advantages of Two’s complement representations:
1. Only one representation of zero.
2. All negative numbers have the most significant digit (left most)
of 1.
3. Sign Extension is easy: Extend the number of bits by replicating the most significant bit (most left) into the other bits.
0010 -> 0000 0010
1010 -> 1111 1010
4. Addition and Subtraction (see the next page)
12
Addition & Subtraction
• Just like in grade school (carry/borrow 1s) 0111 0111 0110 + 0110 – 0110 – 0101
• Two’s complement operations easy
– subtraction using addition of negative numbers 0111
+ 1010
13
Binary Subtraction
• Consider 8-bit numbers for simplicity.
• Direct method:
7–6=1
0000 0111
– 0000 0110
0000 0001
• Or adding 2’s complement number
A – B = A + (2’s complement number of B) 7 + (-6) = 1
0000 0111
+ 1111 1010
1 0000 0001 (the left most bit 1 is ignored since it is beyond 8 bit)
14
Overflow
• The result of an operation is too large and cannot be represented in the finite computer word:
– e.g., adding two n-bit numbers does not yield an n-bit number 0111
+ 0001 1000
If words are 4 bit long, 0111
+ 0110
note that overflow term is somewhat misleading, it does not mean a carry “overflowed”
15
Detecting Overflow
• No overflow when adding a positive and a negative number
• No overflow when signs are the same for subtraction
• Overflow occurs when the value affects the sign:
– overflow when adding two positives yields a negative
– or, adding two negatives gives a positive
– or, subtract a negative from a positive and get a negative – or, subtract a positive from a negative and get a positive
Operation
Operand A
Operand B
Result indicating
overflow
A+B
>= 0
>= 0
<0
A+B
<0
<0
>= 0
A-B
>= 0
<0
<0
A-B
<0
>= 0
>= 0
16
Effects of Overflow
• An exception (interrupt) occurs
– Control jumps to predefined address for exception
– Interrupted address is saved for possible resumption
EPC – Exception Program Counter
it contains the address of the instruction that caused the exception.
17
Multiplication
• More complicated than addition
– accomplished via shifting and addition
• Let’s look at how we multiply two positive binary numbers. 0110 (multiplicand)
__x_0011 (multiplier)
• Negative numbers: convert and multiply
– there are better techniques, we won’t look at them
18
Control
We use multiplicand register, multiplier register,
and product register.
Product register’s initial value is 0.
Multiplier0 is the right most bit of the multiplier.
Multiplier0 = 1
Start
1. Test Multiplier0
Multiplier0 = 0
1a. Add multiplicand to product and place the result in Product register
2. Shift the Multiplicand register left 1 bit
3. Shift the Multiplier register right 1 bit
No: < 32 repetitions
32nd repetition?
Yes: 32 repetitions
Done
19
Example: multiply 0110 and 0011
(Assume 4-bit numbers instead of 32-bit numbers)
Iteration
Step
Multiplicand Register value
Multiplier Register value
Product Register value
0
Initial values
0110
0011
0
1st iteration
1a. Prod = Prod +Multiplicand
2.sll Multiplicand by 1 3. srl Multiplier by 1
0 1100
001
0+0110=0110
2nd iteration
1a. Prod = Prod +Multiplicand
2. sll Multplicand by 1 3. srl Multiplier by 1
01 1000
00
0110+01100 = 010010
3rd iteration
2. sll Multplicand by 1 3. srl Multiplier by 1
011 0000
0
010010
4th iteration
2. sll Multplicand by 1 3. srl Multiplier by 1
0110 0000
--
010010
Final Product
20
Multiplication Hardware
In the previous example, if we use 4-bit multiplicand, we end up 8-bit
multiplicand at the end since we kept shifting it 4 times. And the product also ends up in 8 bit. Therefore, if we use 32-bit numbers, then we need 64 bits for multiplicand and product and
we need a 64-bit adder to add those two numbers.
64 bits
Multiplicand
Shift left
64-bit ALU
Multiplier Shift right
32 bits Control test
Product
Write
64 bits
Datapath
21
Signed Multiplication
• A simple way to make both numbers positive and remember whether to complement the product when finished (leave out the most significant: run for 31 bits)
22
Multiply in MIPS
• mult vs. multu
• A pair of registers Hi, Lo to contain the 64 bits product
mfhi: move from Hi mflo: move from Lo
23
Division
• How do we perform division on binary numbers?
11 ß Quotient Divisor 10 ) 111 ß Dividend
- 10 11
-10
1 ßRemainder
Dividend = Quotient x Divisor + Remainder
Dividend and Remainder should have the same sign.
24
Control for Division
Remainder register is initialized to the Dividend value.
Divisor is shifted to left so that the left most digit is aligned with
the left most digit of the dividend.
To decide whether divisor can
be subtracted from remainder,
we need to subtract and check
if the result is negative or positive.
If we cannot subtract,
we need to add the subtracted value back to the remainder.
25
Example: Divide 0000 0111 by 0010
Iterat ion
0
1
2
3
4
5
Step
Initial value
1.Rem=Rem-Div
2b. Rem<0, Rem+=Div, sll Q, Q0=0 3. srl Div
1. Rem=Rem-Div
2b. Rem<0, Rem+=Div, sll Q, Q0=0 3. srl Div
1. Rem=Rem-Div
2b. Rem<0, Rem+=Div, sll Q, Q0=0 3. srl Div
1. Rem=Rem-Div
2a. Rem>=0, sll Q, Q0=1 3. srl Div
1. Rem=Rem-Div
2a. Rem>=0, sll Q, Q0=1 3. srl Div
Quotient
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0001
0001
0001
0001
0011
0011
Divisor
0010 0000
(sll to be 8 bit)
0010 0000 0010 0000 0001 0000
0001 0000 0001 0000 0000 1000
0000 1000 0000 1000 0000 0100
0000 0100 0000 0100 0000 0010
0000 0010 0000 0010 0000 0001
Remainder
0000 0111 (=Dividend)
1110 0111 0000 0111 0000 0111
1111 0111 0000 0111 0000 0111
1111 1111 0000 0111 0000 0111
0000 0011 0000 0011 0000 0011
0000 0001 0000 0001 0000 0001
26
• W need to move the divisor to the right one digit each time, so we start with the divisor place in the left half of the 64-bit Divisor register and shift it right to w bit each step to align it with the dividend. Divisor and Remainder registers are 64-bit long, and we also need to use 64-bit ALU.
27
Division in MIPS • div
• Registers Hi contains remainder, Lo contains the quotient.
mfhi: move from Hi mflo: move from Lo
28
Fraction
• 0.011 two = _____________________ ten?
= (0 * 2 -1) + (1 * 2 -2) + (1 * 2 -3 ) = 0.25 + 0.125 = 0.325 ten
(0.d1 d2 d3 ….)b
=(d1 *b-1)+(d2 *b-2)+(d3 *b-3)+….
29
Converting fractional decimal numbers to binary
• 0.375 ten = _____________ two?
= 0.25+0.125=2-2 +2-3 =0.011two
• 0.1 ten = _______________ two? = 0.0625+0.03125+……
When does a conversion result in a terminating expansion?
30
Floating Point
• We need a way to represent
– numbers with fractions, e.g., 3.1416
– very small numbers, e.g., .000000001 – very large numbers, e.g., 3.15576 × 109
• Scientific notation: a single digit to the left of the decimal point
• Normalized number: no leading 0: 1.0×10-9 = 0.1×10-8 =10.0×10-10
31
Floating Point
• Representation:
– sign, exponent, significand: (–1)sign × significand × 2exponent
-0.11two =(-1)*11*2-2
– more bits for significand gives more accuracy
– more bits for exponent increases range One way:
1 bit 8 bits 23 bits
1 1111 1110 0000 0000 0000 0000 0000 011
(this is not the way MIPS represents floating point.)
sign
exponent
significand
32
Floating Point
• IEEE 754 floating point standard:
– single precision: 8 bit exponent, 23 bit significand – double precision: 11 bit exponent, 52 bit significand
33
IEEE 754 floating-point standard
• Leading “1” bit of significand is implicit
• Exponent is “biased”:
– A number with all 0s is smallest exponent , a number with all 1s
is largest
– bias of 127 for single precision and 1023 for double precision – summary: (–1)sign × (1+significand) × 2exponent – bias
• Example:
– decimal: -.75=-(1⁄2+1⁄4)
– binary: -.11 = -1.1 x 2-1
– floating point: exponent = 126 = 01111110
– IEEE single precision: 10111111010000000000000000000000
34
Biasing
[actual (unbiased) exponent] = [exponent written] – bias (i.e., biased exponent)
Given
1 bit 8 bits 23 bits The number represented is
(–1)sign × (1+fraction) × 2 biased exponent – bias ‘significand’ = 1 + ‘fraction’
sign
exponent written
fraction
35
Biasing
Why do we have the exponent field before fraction field?
-It simplifies sorting of floating point numbers using integer comparison instructions, since numbers with biggest exponents look larger than numbers with smaller exponents, as long as both exponents have the same sign.
Negative exponents pose a challenge to simplified sorting.
-biasing solves this problem, by shifting the exponent by the bias 127.
36
Example
• (–1)sign × (1+fraction) × 2 biased exponent – bias
-0.75 ten = – (0.5 + 0.25) ten = -0.11 two
-0.11 two = -1.1 x 2 -1 (in normalized scientific notation)
Sign: 1 since (-1)1 = -1
Fraction: 0.1 two
Biased Exponent: 126 (= exponent + bias, where exponent = -1) Bias: 127
1 bit 8 bits 23 bits
1126 .1
37
sign
exponent written
fraction
1
01111110
10000000000000000000000
Double Precision
sign
exponent written
fraction
1 bit 11 bits 20bits 20+32 bits = 52 bits
32 bits
Bias = 1023
Underflow: A negative component is too large to fit in the exponent field.
Fraction (continued)
38
What is the range of exponents that can be represented by IEEE 754 encoding in single precision? In double precision?
Single Precision
Almost as small as 2.0 ten x 10 -38 and almost as large as 2.0 ten x 10 38
Double Precision
Almost as small as 2.0 ten x 10 -308 and almost as large as 2.0 ten x 10 308
39
• How to represent 0, +/- infinity, NaN?
Single precision
Double precision
Object represented
Exponent
Fraction
Exponent
Fraction
0
0
0
0
0
0
Nonzero
0
Nonzero
+- denormalized number
1-254
Anything
1-2046
Anything
+- floating-point number
255
0
2047
0
+- infinity
255
Nonzero
2047
Nonzero
NaN (Not a Number)
40
Floating point addition
•
Start
1. Compare the exponents of the two numbers. Shift the smaller number to the right until its exponent would match the larger exponent
2. Add the significands
3. Normalize the sum, either shifting right and incrementing the exponent or shifting left and decrementing the exponent
Yes underflow?
No
No
Still normalized?
Yes Done
Overflow or
Exception
4. Round the significand to the appropriate number of bits
41
Example
• Assume 4 bits precision: Add 0.5 ten and (-0.4375)ten
0.5ten =1⁄2ten =2-1 =0.1two =1.000two x2-1 -0.4375ten =-7/16ten =-7/24=-111two x2-4
=-0.0111two =-1.110two x2-2
Step1: The significand of the number with the lesser exponent is
shifted right until its exponent matches the larger number: -1.110two x2-2=-0.111two x2-1
Step2: Add the significands:
(1.000two x2-1 )+(-0.111two x2-1)=0.001two x2-1
Step3: Normalize the sum, checking for overflow or underflow: 0.001two x2-1=1.000two x2-4
Since 127 >= -4 >= -126, there is no overflow or underflow.
Step4: Round the sum. In this case, it already fits in 4 bits. 1.000 two x 2 -4 = 0.0625 ten
42
Floating point addition
•
Start
2. Add the significands
Sign
Exponent
Fraction
Sign
Exponent
Fraction
1. Compare the exponents of the two numbers. Shift the smaller number to the right until its exponent would match the larger exponent
Small ALU
010101
Exponent difference
3. Normalize the sum, either shifting right and incrementing the exponent or shifting left and decrementing the exponent
Control
Shift right
Big ALU
Overflow or
Yes underflow?
No
No
Still normalized?
Yes Done
01
01
Increment or decrement
Shift left or right
Rounding hardware
Exception
4. Round the significand to the appropriate number of bits
Sign
Exponent
Fraction
43
Floating point multiplication
44
Example
• Assume 4 bits precision: Multiply 0.5 ten and -0.4375 ten 0.5ten =1.000two x2-1
-0.4375ten = =-1.110two x2-2
Step 1: Adding the exponents without bias: -1 + (-2) = -3
biased representation: -3 + 127 = 124 Step2: Multiplying the significands:
1.000 x 1.110 = 0000+10000+100000+1000000=111000 two The product is 1.110000 x 2 -3
Step3: Normalize it and check for overflow or underflow. 127 >= -3 >= -126, there is no overflow or underflow
Step 4: Rounding the product. 1.110000 x 2 -3 = 1.110 x 2 -3
Step5: Since the signs of the original operands differ, make the sign of the product negative. Hence the product is: – 1.110 x 2 -3 =-0.21875 ten
45
Floating point instructions in MIPS
• green sheet
• add.s – floating point addition single, add.s $f2, $f4, $f6 #$f2 = $f4+$f6
• sub.s
sub.s $f2, $f4, $f6 #$f2 = $f4-$f6
• mul.s
• div.s
• add.d – floating point addition double
• sub.d, ……
32 floating point registers: $f0, $f1, $f2, …., $f31
MIPS floating point registers are used in pairs for double precision numbers.
46
num1: num2: num3:
main:
.data
.float 34.5454 .double 123.034 .double 3.545452e+2
.text
.globl main
la $s0, num1 lwc1 $f6, 0($s0)
la $s1, num2 lwc1 $f2, 0($s1) lwc1 $f3, 4($s1)
# $s0 = address of num1
# load the floating pt num at the address $s0
#$s1 = address of num2
# load the floating pt num at the address $s1 # get the second part of double
……..
# lwc1 = load word coprocessor 1
47
• Isx+(y+z)=(x+y)+z?
• Yes, in mathematics
• No, in computer arithmetic (because of rounding errors) • Takex=-1.5tenx1038,y=1.5tenx1038,z=1.0ten
48
Guard digits
• Guard: The first of two extra bits kept on the right during intermediate calculations of floating-point numbers; used to improve rounding accuracy.
• It is used with rounding
• Round: Method to make the intermediate floating-point result fit the floating-point format; the goal is typically to find the nearest number that can be represented in the format.
49
Example with Guard
• Assume 3 significant digits.
• Add2.56ten x100 and2.34ten x102
With Guard:
Shift the smaller num to the right to align the exponent: 2.56 x 10 0à0.0256 x 10 2 (we have 2 extra bits for guard)
2.3400 + 0.0256 2.3656
The sum is 2.3656 x 10 2
by rounding to 3 significant digits, we get 2.37 x 10 2
50
Example without Guard
• Assume 3 significant digits.
• Add2.56ten x100 and2.34ten x102
Without Guard:
Shift the smaller num to the right to align the exponent: 2.56 x 10 0à0.02 x 10 2 (we can have 3 digits only)
2.34 + 0.02 2.36
The sum is 2.36 x 10 2
off by 1 in the last digit compared to the result with Guard.
51