PowerPoint 演示文稿
CO101
Principle of Computer
Organization
Lecture 03: Arithmetic
Liang Yanyan
澳門科技大學
Macau of University of Science and
Technology
Analog vs. Digital
• Analog Signal
• Vary in a smooth way
over time.
• Analog data are
continuous valued.
• Example: audio, video
2
• Digital Signal
• Maintains a constant
level then changes to
another constant level
(generally operate in one
of the two states).
• Digital data are discrete
value.
• Example: computer data.
Number Systems
• An ordered set of symbols, called digits, with relations defined for
addition, subtraction, multiplication, and division.
• Radix or base of the number system is the total number of the
number system is the total number of digits allowed in the number
system.
• Commonly used numeral systems.
3
Conversion from Decimal Integer
• Step 1: Divide the decimal number by the radix (number
base).
• Step 2: Save the remainder (first remainder is the least
significant digit).
• Repeat step 1 and step 2 until the quotient is zero.
• Result is in reverse order of remainders.
• Exercises:
• Convert 1510 to binary value.
• Convert 1010 to octal value.
4
0010010001001000
Most significant bit (MSB) Least significant bit (LSB)
Machine number representations
• Machine number representations
• Fixed point representation
• Floating point representation
• Fixed point representation
• Numbers can be represented in any base.
• 18910, 6EF16
• Numbers in computer are represented in binary.
• A sequence of 0 and 1, each digit is called a bit.
• 0000 -> 0001 -> 0010 -> 0011 -> 0100 -> 0101 -> …
• In decimal from 0 to 2N-1 for N bits.
• A byte → 8 bits
• A word → 16 bits, 32 bits? (machine dependent)
5
Sign and magnitude
• Unsigned number
• an-1an-2…a0 = an-1x2n-1 + an-2x2n-2+…+ a0x20
• 101 = 1×22 + 0x21 + 1×20 = 5
• For a N-bit number, the range it can represent is
[0, 2N-1]
• Sign and magnitude
• Use the MSB to represent positive or negative
number, e.g. 1 → negative, 0 → positive
• 101 = -1 (since 0x21 + 1×20 = 1, MSB = 1, so it is a
negative number)
• For a N bit number, the range it can represent is
[-(2N-1-1), +(2N-1-1)]
6
Problems of sign and magnitude
• There are two zeros
• 0000 = +0
• 1000 = -0
• Waste resource
• Computer needs an extra step to check the sign
before performing computations.
• E.g. (1001 + 0010) we cannot add them directly.
Two’s complement
• The sign is still indicated by MSB.
• 1 → negative , 0 → positive.
• Negative number is obtained by two steps.
• inverse all bits of the positive number
• add 1
For example, to represent -5 using a 4-bit
representation
We know : 5 = 0101
After inverse all bits: 1010
Plus 1 : +0001
After plus 1 : 1011 → answer!
8
Two’s complement
• Given a number in two’s complement format, what is the
actual value?
Example 1: given 1100, this is a negative number.
We have : 1100
After inverse all bits: 0011
Plus 1 : +0001
We have : 0100 → 4
As a result, 1100 is -4, since the MSB is 1.
Example 2: given 0110, since this is a positive number,
we just calculate the value as unsigned format.
We have : 0110 → 6
9
Two’s complement
• The range for an N-bit representation
• [-2N-1, +2N-1-1]
• E.g. for N=8, the range is [-128, +127]
• Remember to extend the sign when we extend a
number. Consider the case that extends a 4-bit
number to 8 bits (e.g. a computer uses 4 bits,
another one uses 8 bits).
• 0011 → 0000 0011
• 1011 → 1111 1011 (the negative sign is extended)
• If the sign is not extended, 1011 → 0000 1011. After
extension, a negative number becomes positive.
10
32-bit Binary Representations
• 32-bit signed numbers (2’s complement):
11
Addition and subtraction
• Addition of X and Y
• X + Y
• Subtraction
• X – Y = X + (-Y) = X + (inverse of Y) + 1
• Overflow means a number is out of the representation range.
• E.g. unable to represent 1023 using 4 bits
• What situations may cause overflow?
• Addition of two positive numbers, or two negative numbers.
• 0100 + 0100 or 1000 + 1000
• Subtraction of a positive and a negative number.
• 0111 – 1000
• What situations don’t cause overflow?
• Addition of a positive and a negative number
• 0100 + 1111
• Subtraction of two negative numbers, or two positive numbers
• 1000 – 1111 or 0100 – 0111
12
Digital Signal Representation
• Active HIGH
• High voltage means On
• Active LOW
• Low voltage means OFF
13
Logic Gates
14
Truth Table
• A means for describing how a logic circuit’s output
depends on the logic levels present at the circuit’s inputs.
• The number of input combinations will equal 2N for an N-
input truth table.
15
Building a 1-bit Binary Half Adder
• A: input
• B: input
• S: output
• C: output
• Using two half adders
to build a full adder.
16
XOR gate
AND gate
Inputs Outputs
A B C S
0 0 0 0
1 0 0 1
0 1 0 1
1 1 1 0
Building a 1-bit Binary Full Adder
17
How can we use it to build a 32-bit adder?
How can we modify it easily to build an adder/subtractor?
Building 32-bit Adder
• Just connect the carry-out of
the least significant bit FA to
the carry-in of the next least
significant bit and
connect …
• Ripple Carry Adder (RCA)
• Advantage: simple logic, so
small (low cost).
• Disadvantage: slow and lots of
glitching (so lots of energy
consumption).
18
Overflow Detection
• Can detect overflow by:
• Carry into MSB xor Carry out of MSB.
• xor operation:
• 0 xor 0 = 0;
• 1 xor 1 = 0;
• 0 xor 1 = 1;
• 1 xor 0 = 0;
19
Floating Point Representation
• Fixed point representation of fractional number (decimal).
• an-1an-2…a0﹒b1b2…bm = an-1x2n-1 + an-2x2n-2 +…+ a0x20﹒b1x2-1 +
b2x2-2 + … + bmx2-m
• 11.101 = 1×21 + 1×20 + 1×2-1 + 0x2-2 + 1×2-3 = 3.625
• We need a way to represent
– Very small numbers, e.g., .000000001. To represent 2-98 using
fixed point, we need 99 bits.
– Very large numbers, e.g., 3.15576 ∗ 1099. To represent 299-1
using fixed point, we need 99 bits.
20radix (base)
decimal point
Mantissa
exponent
+ 6.02 x 10
23
Scientific Notation
sign
Floating Point Representation
• In mathematics, we have standard (Scientific
Notation) for representing fractional numbers.
• In computer systems, we also need a standard
to represent fractional numbers → IEEE754
floating point format.
• The most widely used standard for floating point
computation.
• Advantages of the standard (IEEE754):
• Simplify exchange of data includes fractional number.
• Simplify floating point arithmetic algorithms.
• Increase accuracy of the numbers that can be stored.
21
Floating Point Representation
• How do computer systems store a fractional number
which is in scientific notation?
• Can you figure out what the number is if I just tell you the
mantissa, exponent, and sign?
• In computer systems, fractional numbers are often
stored using 32-bit space and divide these 32 bits into 3
parts:
• S: store the sign.
• E: store the exponent.
• M: store the mantissa.
• No need to store the base.
22
radix (base)
decimal point
Mantissa
exponent
+ 1.011 x 2
35
S E M
1 8 23
sign
More details
• Scientific notation in binary: +1.011 × 235
• S is the sign bit, 0 means +ve, 1 means –ve
• E is the biased exponent, assume E is N bits.
• E = exponent + bias, e.g. E = 35 + 127
• bias = 2N-1 – 1, bias is 127 for single precision and 1023 for
double precision.
• exponent = E – bias
• M is the normalized binary number with “hidden” leading
‘1’, called significand.
• M = Mantissa – 1, e.g. M = 1.011 – 1 = .011
• Mantissa = 1 + M
23
1 8 23
S E M
1 11 52
S E M
Single precision
Double precision
actual value = (–1)S ∗ (1+M) × 2E – bias
0.7510 = 0.112= 1.12 x 2-1 (normalized)
S = 0 (positive number)
E = exponent + 127 = -1+127 = 126 =01111110
M = Mantissa – 1 = 1.1 – 1 = .10000000000000000000000
32 bits = 00111111010000000000000000000000
= 0x3F400000
Conversion
• Convert –11.510 into Single Precision FP in Hex
• Convert 0.7510 into Single Precision FP in Hex
24
-11.510 = -1011.12=-1.01112 x 23 (normalized)
S = 1 (negative number)
E = exponent + 127 = 3+127 = 130 = 10000010
M = Mantissa – 1 = 1.0111 – 1 = .01110000000000000000000
32 bits = 11000001001110000000000000000000
= 0xC1380000
Conversion
• What is the decimal value of the following IEEE Single
precision number?
25
1 8 bits 23 bits
1 01111110 10000000000000000000000
sign bit S = 1 = negative
E = 01111110 = 126
exponent = E – 127 = 126-127 = -1
significand M = 0.1000….2
Mantissa = 1 + M = 1+0.1000….2 = 1.12
Value = – 1.12 x 2-1 = -0.112 = -(0.5+0.25) = -0.75
Conversion (Exercises)
• Convert single precision FP 0xC0A00000 to decimal
• Convert single precision FP 0x41380000 to decimal
26
0xC0A0000 = 11000000101000000000000000000000
sign = 1 = negative
exponent = E – 127 = 129-127 = 2
Mantissa = 1 + M = 1.012
value = -1.012 x 22 = -1012 = -5
0x41380000 = 01000001001110000000000000000000
sign = 0 = positive
exponent = E – 127 = 130-127 = 3
Mantissa = 1 + M = 1.01112
value = 1.01112 x 23 = 1011.12 = 11.5
Advantages
• For ease of sorting
• 1st bit determine whether the number is positive or
negative.
• Sorting is then based on exponent, a larger exponent
implies the number is bigger.
• For more accuracy
• Observe that after normalization, no leading ‘0’ and
must start with ‘1’, hidden this leading ‘1’ makes the
significand actually 24 bits long (53 bits long for
double precision) → can use more bits to store the
mantissa.
27
Range
• Notice that the IEEE 754 standard reserves the smallest
E (all 0s) and largest E (all 1s) for special purpose. Will
address this issue later.
• Smallest magnitude can be represented
• Smallest E = 00000001 = 1
• Smallest M = 00000000000000000000000
• Smallest magnitude = 1.02 x 2(1-127) ≈ 1.8 x 10 -38
• Largest magnitude can be represented
• Largest E = 11111110 = 254
• Largest M = 11111111111111111111
• Largest magnitude = 1.1112… x 2(254-127) ≈ 3.4 x 1038
• Range the IEEE754 can represent
• (+1.8 x 10 -38 , +3.4 x 1038) and (-3.4 x 1038, -1.8 x 10 -38)
• How about zero and numbers outside these range?
28
Underflow and overflow
29
-1.8 x 10-38-3.4 x 1038 1.8 x 10-38 3.4 x 10380
negative
overflow
negative
underflow
positive
underflow
positive
overflow
Special number
• Representation of Zero
• All 0s in E and M
• +0 and -0 indicated by S (0 mean +)
• Representation of +/- infinity
• E: all 1s, M: all 0s
• +/- indicated by S (0 means +)
• Representation of Not a Number (NaN)
• E: all 1s, M: non zero number
• e.g. sqrt of –ve number
• HW decides what M is
• Denormalized number
• E: all 0s, M: non zero number
• Computers cause exception
30
S 11111111 00000000…0
S 11111111 Non zero
S 00000000 00000000000000000000000
S 00000000 Non zero
Arithmetic Add
31
Start
1. Compare the exponents. Right Shift the smaller
one until its exponent match the larger number
2. Add significands
3. Normalize the sum and shfit the exponent accordingly
Overflow or
underflow?
4. Round significand
Still
normalized?
End
Exception
yes
yes
No
No
• Assume 3 digits significand
and 2 digits exponent
• 6.42 x 101 + 9.51 x 102
1. 6.42 x 101 = 0.642 x 102
2. 0.642+9.51 = 10.152
3. 10.152 x 102 = 1.0152 x 103
(no over/under flow)
4. After rounding, 1.015 x 103
(normalized)
5. Ans : 1.015 x 103
Arithmetic Multiplication
32
Start
1. Get the new biased exponent, by adding E1+E2-bias
2. Multiply significands
3. Normalize the product and shift the exponent accordingly
Overflow or
underflow?
4. Round significand
Still
normalized?
End
Exception
yes
yes
No
No
5. Set the sign to +ve if two no. are in the same sign,
-ve otherwise
• Assume 3 digits significand
and 2 digits exponent
• 8.76 x 101 * 1.47 x 102
1. (1+127)+(2+127) – 127 =
(3+127)
2. 8.76 x 1.47 = 12.8772
3. 12.8772 x 103=1.28772 x 104
(No over/under flow)
4. After rounding, 1.288 x 104
(normalized)
5. Set to +ve
6. Ans : 1.288 x 104
Common Pitfall for FP
• Floating point addition is NOT associative.
• Is (a+b)+c equivalent to a+(b+c)?
• No in some circumstances, but why?
• FP numbers have limited precision, only approximate
result can be stored.
• a = -1.5 x 1038, b = 1.5 x 1038, c = 1.0
• Everyone knows (a+b)+c = a+(b+c) = 1.0 in
mathematics.
• How about do this in C program using float type?
33
Common Pitfall for FP
a = -1.5 x 1038, b = 1.5 x 1038, c = 1.0
• printf (“%f”, (a+b)+c); Result = 1.0
• printf (“%f”, a+(b+c)); Result = 0
• It is nearly no effect for adding a very small number to a
very big number.
• The small number becomes zero when normalizes the
exponents of two numbers (step 1 of addition algorithm).
• Don’t assume associate rule holds for Floating Point
Number! Always avoid adding a very small number to a
very big number first.
34
Common Pitfall for FP
• Case 1
• a = -238, b = 238, c = 1.0
• Is ((a+b)+c) equal to (a+(b+c))?
• Case 2
• a = -228, b = 228, c = 1.0
• Is ((a+b)+c) equal to (a+(b+c))?
• Case 3
• a = -22, b = 22, c = 1.0
• Is (a+b)+c) equal to (a+(b+c))?
• Q1: What is the largest exponent that these two
equations are still equal?
• Q2: What is the smallest exponent that these two
equations are not equal?
35
Decreasing the
exponent.
NOT EQUAL
EQUAL
Common Pitfall for FP
• Equality test may NOT hold for FP.
• Is 10/3*3 equal to 10?
• No, but why?
• Round off errors easily occur during each
intermediate step!
• Not wise to test ‘absolute’ equality of two FP numbers.
36
Common Pitfall for FP
float d;
int e = 10;
d = (10/3)*3;
if (d==e)
printf(“1. equality work!\n”);
else
printf(“2. equality not work!\n”);
if (d > e)
printf(“3. compare magnitude work!\n”);
37
Summary
• Fixed point representation
• Two’s complement
• Floating point representation
• IEEE754
• Conversion between single precision floating
point format and decimal format.
• Addition and multiplication
• Pitfall
38
CO101�Principle of Computer Organization
Analog vs. Digital
Number Systems
Conversion from Decimal Integer
Machine number representations
Sign and magnitude
Problems of sign and magnitude
Two’s complement
Two’s complement
Two’s complement
32-bit Binary Representations
Addition and subtraction
Digital Signal Representation
Logic Gates
Truth Table
Building a 1-bit Binary Half Adder
Building a 1-bit Binary Full Adder
Building 32-bit Adder
Overflow Detection
Floating Point Representation
Floating Point Representation
Floating Point Representation
More details
Conversion
Conversion
Conversion (Exercises)
Advantages
Range
Underflow and overflow
Special number
Arithmetic Add
Arithmetic Multiplication
Common Pitfall for FP
Common Pitfall for FP
Common Pitfall for FP
Common Pitfall for FP
Common Pitfall for FP
Summary