2. Binary Codes
Codes: general comments
• In any fixed length code the key feature is how many bits are in a word.
• If the number of bits is n, the number of different codewords possible is 2n.
• If you know how many elements are in the data set you wish to code, you must choose a value of n that is big enough. E.g. the months of the year would need 4 bit words. Why?
• On the other hand if you have a very big data set and can only represent a part of it, the value of n available will determine how big the represented part can be.
• For example, to represent unsigned whole numbers (the infinite set {0, 1, 2, 3…}) using 8- bit codewords, only 256 values will be available.
– e.g. we could represent 0 up to 255 (this is the most natural subset to choose)
• On the other hand if we want to represent integers (signed whole numbers infinite in both directions, {…, -3, -2, -1, 0, 1, 2, 3, …}) we will need some codewords for negative numbers so, using 8-bits again, we might choose codewords for -128 to +127 (256 values).
• Starting with the unsigned whole numbers and an n-bit codeword, many different codes are possible but the most natural one uses the same positional technique as mechanical (pencil and paper) arithmetic uses for decimal values.
• Representing sets of numbers is critical to computing and we shall examine that first, beginning with unsigned values.
Computer Systems Binary Codes 2
Unsigned Number Representation
• Unsigned binary codes are positional like the decimal system used in ordinary arithmetic.
– Each digit carries a weight depending on its position in the code word;
– In decimal allowed digits are 0, 1,.., 9; in binary only 1s and 0s allowed.
• Can generate positional codes with any number as base. Position of a digit is weighted using powers of the base.
• Lowest weighted digit is least significant; highest is most significant.
• Decimal positional code. E.g. 205310
= 2103 + 0102 + 5101 + 3100 = 2000+0+50+3
• Binary positional code E.g. 10012
= 123 + 022 + 021 + 120 = 8 + 0 + 0 + 1 = 910
• Subscript removes ambiguity with different number bases. 100110 ≠ 10012 ! Computer Systems Binary Codes 3
The Powers of 2
It’s useful to know the value of each bit position for binary numbers!
value
27
26
25
24
23
22
21
20
=
128
64
32
16
8
4
2
1
Don’t memorize this table! Construct it when you need it.
Computer Systems Binary Codes 4
Decimal to Binary Conversion
Problem: Convert the whole number x into binary.
It will need at least k bits where k is the smallest power of 2 bigger than x.
– Write k columns containing the powers of 2
– Starting from the left, put a 1 if x is greater than or equal to the column value. Now subtract
the column value from x to reduce it
– If x is less than the column value put a 0 in the column and leave x unchanged.
– Move to the next column and repeat.
– Stop at the rightmost column.
• Example: Convert 20310 to 8-bit binary.
– First note that largest power of 2 in 8-bit positional number is 128
x=203 x=75 x=11 x=11 x=11 x=3 x=3
x=1
128
64
32
16
8
4
2
1
1
1
0
0
1
0
1
1
– Check the result by converting 110010112 back to decimal!
Computer Systems Binary Codes
5
Unsigned Binary Code: example
• An n-bit unsigned binary code uses binary positional representations of numbers, padded out with 0s to make up full n bits of codeword.
• Example: Find the 16-bit unsigned binary code for 290.
• Solution. Here k = 9 since 28 =256 and 29 =512.
– So we need 9 bits minimum to represent 290.
– Follow procedure on previous slide:
290 34 34 34 2 2 2 2 0
– 9 bit binary representation of 290 is 100100010.
– But we’re asked for 16-bit code word, so pad out with 0s at most significant end.
– Answer: 0000 0001 0010 0010
256
128
64
32
16
8
4
2
1
1
0
0
1
0
0
0
1
0
Computer Systems Binary Codes 6
Unsigned binary addition
• Adding 2 binary values works just like mechanical decimal arithmetic.
– Start adding from units column (conventionally the rightmost). Progress to the “2s” column then to the “4s” etc.
– Every column generates a sum bit, S and a carry bit, C. The carry bit must be added into the next column
• Example: Add the two 4-bit numbers
• x=1010andy=1111
• What decimal numbers are these? 10 and 15
• Solution: Examine the table (right)
• Result is 110012 = ?10
• Note that the sum is too big to be a 4-bit number.
• This is always true if the carry from the most significant column is 1.
C
1
1
1
0
x
1
0
1
0
y
1
1
1
1
S
1
1
0
0
1
Computer Systems Binary Codes 7
Overflow
• If we add two n-bit unsigned numbers and the sum is too big to be an n-bit number, the sum cannot be represented in the n-bit code.
• This is called an (unsigned) overflow.
• A necessary and sufficient condition for unsigned overflow is that the carry from the
most significant bit (MSB) position is 1.
• An n-bit unsigned adder is a device which takes in two n-bit unsigned binary code
words and outputs an n-bit number representing the sum.
• If there is an overflow, the number representing the sum is wrong.
• A practical adder should signal when there is an overflow. This can be done using the carry from the MSB position. This is often referred to as the adder’s carry out (C).
n n
n
1
C
Computer Systems
Binary Codes
8
Hexadecimal
• Binary codewords can be very long and difficult for humans to work with.
• A common way to shorten binary strings is to convert them to a base 16 number representation (called hexadecimal).
• Hex has 16 digits: 0, 1, …, 9, a, b, c, d, e, f
• Because 16 = 24, it is easy to convert between binary and hex.
– Every group of 4 bits can be directly represented as a single hex digit.
• Don’t worry about hex arithmetic – we use hex just to make the notation for bits easier to read
Computer Systems Binary Codes 9
Hexadecimal Digits
Digit
0
1
2
3
4
5
6
7
Binary
0000
0001
0010
0011
0100
0101
0110
0111
Digit
8
9
a
b
c
d
e
f
Bits
1000
1001
1010
1011
1100
1101
1110
1111
Computer Systems Binary Codes 10
Hex to Binary
Just replace each hex digit with its 4-bit binary equivalent. Don’t use commas; put a space between the 4-bit groups
3 a6 e
0011 1010 0110 1110
Binary to hex conversions are equally easy. Remember to pad the binary number out to the left with 0s until the number of bits is a multiple of 4.
Computer Systems Binary Codes 11
Unsigned Number Positional Codes: summary.
• Standard n-bit codes for whole (non-negative) numbers. • Every number maps onto its n-bit binary positional form.
Number
0 1 2 3 4 5 6
65530 65531 65532 65533 65534 65535
Codeword
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0001 0000 0010 0000 0011 0000 0100 0000 0101 0000 0110
16-bit unsigned code
(65536 codewords)
. . .
1111 1111
1111 1111
1111 1111
1111 1111
1111 1111
1111 1111 1111 1111
1111 1010 1111 1011 1111 1100 1111 1101 1111 1110
4-bit unsigned code
(16 codewords)
Number Codeword
0 0000
1 0001
2 0010
3 0011
4 0100
5 0101
6 0110
7 0111
8 1000
9 1001
10 1010
11 1011
12 1100
13 1101
14 1110
15 1111
Number
0 1 2 3 4 5 6
250
251
252
253
254
255
. . .
Codeword
0000 0000 0000 0001 0000 0010 0000 0011 0000 0100 0000 0101 0000 0110
1111 1010 1111 1011 1111 1100 1111 1101 1111 1110 1111 1111
8-bit unsigned code
(256 codewords)
Computer Systems Binary Codes 12
Representing signed numbers
• Here numbers can be negative or positive
• Recall: a k-bit word has 2k distinct values.
• Instead of choosing all our code words as being positive (as with unsigned binary), we choose half of them to represent negative numbers
• This can be done in more than one way. e.g. in sign & magnitude the most significant bit is the sign (1 means a negative number) and the other bits the magnitude:
– +1 in 8-bit sign and mag code is 0000 0001 – -1 in 8-bit sign and mag code is 1000 0001
• Note that sign & magnitude codes have a codeword for -0 (not a valid concept).
• These codes are not used in practice because doing arithmetic with them is awkward compared to a better alternative called two’s complement codes.
Computer Systems Binary Codes 13
Two’s complement
• The commonest codes for signed numbers are two’s complement codes.
• Take an 8-bit two’s complement code as an example.
– The code word 0000 0000 represents 0.
– Code words with MSB=0 represent +ve numbers. Largest positive value is 0111 11112 = +127
– To find the code word for –x simply subtract x from 28 =256 and convert the answer to binary in the usual way.
• Example: Find the two’s complement code word for -1.
– Find the code word for 256 – 1 = 255. This is 1111 11112.
• NOTE that the code word 1111 11112 represents 255 in an unsigned 8-bit code but -1 in an 8-bit two’s complement code
– It is vital to remember which code you are working in.
Computer Systems Binary Codes 14
Properties of 2’s complement
• In the 8-bit two’s complement code the numbers represented are as follows:
Number
Code
Number
Code
0
0000 0000
-1
1111 1111
1
0000 0001
-2
1111 1110
2
0000 0010
-3
1111 1101
…
…
….
….
127
0111 1111
-128
1000 0000
• An 8-bit two’s comp code cannot represent numbers greater than +127
• Note that the MSB indicates the sign
• The inverse of any two’s complement number (with any number of bits) can be found by flipping all the bits and adding 1 to the result. This process is called negation.
• If the two’s complement codes for a number and its inverse are input to an 8-bit unsigned adder, and the carry out is ignored, the answer is 0 (try it with, for example, the 8-bit codewords for 2 and -2!)
Computer Systems Binary Codes 15
Two’s complement Arithmetic
• If we input 2 two’s complement codewords to an ordinary unsigned adder (that works with unsigned codes) and the result doesn’t fall outside the two’s complement code range, it will be correct. Ignore the carry from the MSB when doing this.
• E.g. 2 + (-1) in 8 bit code. Result is 1 (ignoring MSB carry)
+2 → 0000 0010
-1 → 1111 1111
0000 0001 → +1
• When the sum of two signed numbers exceeds the boundary of the code, we have a signed or two’s complement overflow.
• E.g. If we add 127 + 1, the unsigned adder will give result 1000 0000. In this case the inputs represent the same numbers in both unsigned and two’s complement codes. But the output (1000 0000) does not. In an unsigned code it is correct (128) but in two’s complement it is not (-128), i.e. there is a signed overflow.
+1 → 0000 0001 +127 → 0111 1111
1000 0000 → -128 X
Computer Systems Binary Codes 16
Two’s Complement Overflow
• In last example, the adder always gives out the same answer bits (1000 0000) but the answer is wrong (an overflow) in two’s complement but correct in unsigned code.
• Obviously the carry out of the MSB flags a unsigned overflow but not a two’s complement one, so the C output of the adder is of no help if we are using two’s comp.
• We need a different test for two’s complement overflow and for an adder to work with both codes it must be augmented with a second overflow output. This is often labelled V. (See last question at end of Section 2 for how to perform this test)
• When using an adder with unsigned code, look at C but ignore V. When using two’s comp do the opposite.
• With this augmentation, the same adder circuit can add both unsigned and two’s complement numbers.
– This is a major reason why two’s complement codes are preferred over sign and magnitude.
n n
n
1
1
V
C
Computer Systems
Binary Codes
17
Characters
• We write characters as geometric shapes (“character graphics”), but inside a computer a character is represented as a word (“character code”)
• A character code is a table giving the bit value that represents each character.
• ASCII (American Standard Code for Information Interchange) is an older
standard code (1960), still in common use: uses 7-bit or 8-bit code words.
• Unicode (two-byte code, 16 bits) is the current standard and includes many additional characters. Can be extended to 32-bits.
• When you declare a character string in older languages (C, C++, …), it is represented as
– a sequence of bytes
– where each byte contains the ASCII code of one character
• A String in Java is a sequence of Unicode characters
Computer Systems Binary Codes 18
Questions
1. Convert 224 to binary.
2. Convert 111101012 to hexadecimal.
3. Convert FE16 to decimal.
4. The 8-bit 2’s complement code can represent all numbers between -128 and +127. An 8-bit unsigned adder will add numbers in this range correctly unless the code overflows. Find two examples to show that 2’s complement code overflows do not occur when unsigned code overflows do (one example) and vice versa (another example)
5. Challenge. Can you find a condition for two’s complement overflow in an 8-bit code (Hint it involves the carries into and out of the MSB)
Computer Systems Binary Codes 19