Reconfigurable computing
Small Embedded Systems
Unit 2.2: Bit Manipulation and Non-Integer Types
Introduction
Bitwise operations and bit manipulation in C
Text representation
Representing numbers with fractional parts
Bitwise Operations
Embedded systems often require us to manipulate individual bits in a register
For example, the ATmega328P has registers that control which pins are inputs and which are outputs
DDRB: data direction register for port B
DDRC: data direction register for port C
DDRD: data direction register for port D
Each port governs eight of the pins on the chip
Example:
DDRB = 0b00000011;
Sets pins 0 and 1 to be outputs
Sets pins 7, 6, 5, 4, 3, 2 to be inputs
Bitwise Operations in C
Example:
DDRB=0b00000011
Pins 0 and 1 are outputs
Pins 7, 6, 5, 4, 3, 2 are inputs
We want to write a process that will change pin 7 to be an output, but at the time of writing we don’t know the state of the other pins
We just want to change pin 7 and leave the others untouched
This is the code:
DDRB = DDRB | 0b10000000;
Bitwise OR in C
Why does this work?
ORing a value with 1 will set the result to 1
ORing a value with 0 will leave the value unchanged
DDRB starts at 00000011
ORed with 10000000
Result 10000011
Pin 7 has been set to output
Other values unchanged
DDRB = DDRB | 0b10000000;
A | B | A OR B
0 | 0 | 0
0 | 1 | 1
1 | 0 | 1
1 | 1 | 1
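The claim that ORing changes only the chosen bit can be checked for every possible starting state. This is a minimal sketch that runs on a PC rather than on the ATmega328P: the names or_mask and count_or_failures are hypothetical helpers, and an ordinary uint8_t value stands in for DDRB.

```c
#include <stdint.h>

/* Return reg ORed with mask: every bit that is 1 in mask is forced to 1,
   every other bit is copied through unchanged. */
static uint8_t or_mask(uint8_t reg, uint8_t mask)
{
    return (uint8_t)(reg | mask);
}

/* Try all 256 starting values and count how many break the rule
   "bit 7 ends up set and bits 6..0 are unchanged". */
static int count_or_failures(void)
{
    int failures = 0;
    for (int start = 0; start < 256; start++) {
        uint8_t result = or_mask((uint8_t)start, 0x80);   /* 0x80 is 10000000 */
        if ((result & 0x80) == 0 || (result & 0x7F) != (start & 0x7F))
            failures++;
    }
    return failures;
}
```

Because no starting value fails the check, the code does not need to know the state of the other pins before it sets pin 7.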
Bitwise AND in C
Now we want to turn pin 1 into an input
ANDing a value with 1 will leave the value unchanged
ANDing a value with 0 will set the result to 0
DDRB starts at 10000011
ANDed with 11111101
Result 10000001
Pin 1 has been set to input
Other values unchanged
DDRB = DDRB & 0b11111101;
A | B | A AND B
0 | 0 | 0
0 | 1 | 0
1 | 0 | 0
1 | 1 | 1
Bitwise XOR in C
Now we want to reverse the state of pin 4
XORing a value with 1 will reverse the value
XORing a value with 0 will leave the value unchanged
DDRB starts at 10000001
XORed with 00010000
Result 10010001
Pin 4 has been reversed
Other values unchanged
DDRB = DDRB ^ 0b00010000;
A | B | A XOR B
0 | 0 | 0
0 | 1 | 1
1 | 0 | 1
1 | 1 | 0
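The OR, AND and XOR steps from the last three slides can be replayed in order. This is a sketch for a PC, not for the target: sim_ddrb is a hypothetical stand-in for the real DDRB register, and replay_sequence is an illustrative name.

```c
#include <stdint.h>

/* Hypothetical stand-in for DDRB so the sequence can run on a PC. */
static uint8_t sim_ddrb;

/* Replay the three operations and record the register after each one. */
static void replay_sequence(uint8_t after[3])
{
    sim_ddrb = 0x03;                /* 00000011: pins 0 and 1 are outputs */

    sim_ddrb = sim_ddrb | 0x80;     /* OR  with 10000000 sets pin 7       */
    after[0] = sim_ddrb;            /* 10000011 */

    sim_ddrb = sim_ddrb & 0xFD;     /* AND with 11111101 clears pin 1     */
    after[1] = sim_ddrb;            /* 10000001 */

    sim_ddrb = sim_ddrb ^ 0x10;     /* XOR with 00010000 reverses pin 4   */
    after[2] = sim_ddrb;            /* 10010001 */
}
```

The three recorded values match the results on the slides: 10000011, 10000001 and 10010001.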
Shift Operations
Suppose we have a value
uint8_t val = 0b00000001;
We can do a bitshift like this
uint8_t newval = (val << 3);
The result of this is newval gets the value
0b00001000
If we want to put a 1 into position 3, we do this:
uint8_t newval = (1 << 3);
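A shift can be wrapped in a small helper that builds the mask for any bit position. This is a sketch with hypothetical names (bit_mask, shift_left_8bit). It also shows one detail the slide does not mention: bits shifted past bit 7 are lost when the result is stored back in a uint8_t.

```c
#include <assert.h>
#include <stdint.h>

/* Build a mask with a single 1 in the given position (0 to 7). */
static uint8_t bit_mask(uint8_t position)
{
    assert(position < 8);
    return (uint8_t)(1u << position);
}

/* Shift an 8-bit value left. val is promoted to int for the shift,
   so the cast back to uint8_t is what discards bits above bit 7. */
static uint8_t shift_left_8bit(uint8_t val, uint8_t places)
{
    assert(places < 8);
    return (uint8_t)(val << places);
}
```

For example, bit_mask(3) gives 0b00001000, and shifting 0b10000001 left by one place gives 0b00000010 because the top bit falls off the end.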
Concise bit manipulation notations
Setting a bit to 1
Suppose we want to set pin 7 to be an output
These are the same and all work:
DDRB = DDRB | 0b10000000;
DDRB |= 0b10000000;
DDRB |= (1 << 7);    (1 shifted left 7 places)
The last notation is often used as you can see immediately that it is operating on bit 7
Concise bit manipulation notations
Clearing a bit to 0
Suppose we now want to set pin 7 to be an input
These are the same and all work:
DDRB = DDRB & 0b01111111;
DDRB &= 0b01111111;
DDRB &= ~(1 << 7);
(1 << 7) = 0b10000000
~(1 << 7) = 0b01111111 (in the lower 8 bits)
~ is the bitwise NOT operator: it reverses all bits
Don’t confuse it with !, the logical NOT operator: !x is 1 if x is zero and 0 otherwise
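The set and clear notations, together with XOR for toggling, are often collected into small helpers. This is a sketch only: the function names are hypothetical, and on the PC an ordinary variable is passed in place of a real register such as DDRB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Set one bit to 1, leaving the others unchanged. */
static void reg_set_bit(volatile uint8_t *reg, uint8_t bit)
{
    assert(bit < 8);
    *reg |= (uint8_t)(1u << bit);
}

/* Clear one bit to 0: ~ inverts the mask so only that bit is 0. */
static void reg_clear_bit(volatile uint8_t *reg, uint8_t bit)
{
    assert(bit < 8);
    *reg &= (uint8_t)~(1u << bit);
}

/* Reverse one bit. */
static void reg_toggle_bit(volatile uint8_t *reg, uint8_t bit)
{
    assert(bit < 8);
    *reg ^= (uint8_t)(1u << bit);
}

/* Report whether one bit is currently 1. */
static bool reg_bit_is_set(const volatile uint8_t *reg, uint8_t bit)
{
    assert(bit < 8);
    return (*reg & (uint8_t)(1u << bit)) != 0;
}
```

The difference between the two NOT operators shows up clearly here: (uint8_t)~(1u << 7) is 0b01111111, a usable mask, whereas !(1u << 7) is simply 0, which would clear every bit if it were ANDed with a register.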
Text Data
Main standard in early days of computing was ASCII (American Standard Code for Information Interchange)
First standard was published in 1963; the most recent revision dates from 1986
Uses 7 bits to represent A-Z, a-z, 0-9, punctuation and control codes (e.g. newline, carriage return)
$2^{7} = 128$ distinct codes could be used
Many communication protocols were originally based on the transmission of ASCII codes
ASCII
Message Hello!
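The codes behind the message can be listed with a short C function. This is a sketch assuming the compiler uses an ASCII-compatible character set, which is the usual case but is not guaranteed by the C standard. The name char_codes is hypothetical.

```c
#include <stddef.h>

/* Copy the numeric code of each character of msg into codes[],
   stopping at the end of the string or after max entries.
   Returns the number of codes written. */
static size_t char_codes(const char *msg, unsigned char *codes, size_t max)
{
    size_t n = 0;
    while (n < max && msg[n] != '\0') {
        codes[n] = (unsigned char)msg[n];
        n++;
    }
    return n;
}
```

For "Hello!" on an ASCII system the codes are 72, 101, 108, 108, 111, 33. All of them are below 128, so each fits in 7 bits.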
8-bit ASCII
Computers standardised on 8-bit bytes as their basic addressable unit
The extra bit means we can represent an additional 128 characters/codes in the range 128-255
Different countries used them for different character sets
à, á, â, ã, ä, å, æ, ç, è, é, ê, ë, ì, í, î, ï, ð, ñ, ò, ó, ô, õ, ö, ø, ù, ú, û, ü, ý
а, б, в, г, д, е, ж, з, и, й, к, л, м, н, о, п, р, с, т, у, ф, х, ц, ч, ш, щ, ъ, ы
ؤ, إ, ئ, ا, ب, ة, ت, ث, ج, ح, خ, د, ذ, ر, ز, س, ش, ص, ض, ط, ع, ػ, ؼ, ؽ, ؾ, ف, ق, ك, ل
अ, आ, इ, ई, उ, ऊ, ऋ, ऌ, ऍ, ऎ, ए, ऐ, ऑ, ऒ, ओ, औ, क, ख, ग, घ, ङ, च, छ, ज, झ
This didn’t help for languages with more than 256 characters (e.g. Chinese, Japanese)
To make things worse, different computer manufacturers used the additional codes in different ways
Much confusion resulted – standardisation was needed
UTF-8
Unicode is the Universal character encoding
First version published 1991
Drive for standardisation was accelerated by growth of the Internet
Aims to include all writing systems into common encoding scheme
UTF-8 (Unicode Transformation Format 8 bit)
Variable length code (1 to 4 bytes per character)
First 128 code points are identical to ASCII and 1 byte long
They are identified by the first bit being 0
Other code points are encoded as 2 to 4 bytes, and every byte of these has a first bit of 1
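The first byte of a character is enough to tell how many bytes it occupies. The bit patterns in this sketch come from the UTF-8 definition and are not given on the slide. The function name utf8_sequence_length is hypothetical, and the check is deliberately simple: it does not reject every invalid byte value.

```c
#include <stdint.h>

/* Return the number of bytes in the UTF-8 sequence that starts with lead.
   Returns 0 if lead is a continuation byte (10xxxxxx) or matches none of
   the lead-byte patterns. */
static int utf8_sequence_length(uint8_t lead)
{
    if ((lead & 0x80) == 0x00) return 1;   /* 0xxxxxxx: same as ASCII  */
    if ((lead & 0xE0) == 0xC0) return 2;   /* 110xxxxx                 */
    if ((lead & 0xF0) == 0xE0) return 3;   /* 1110xxxx                 */
    if ((lead & 0xF8) == 0xF0) return 4;   /* 11110xxx                 */
    return 0;
}
```

Note that the test uses exactly the AND-with-mask technique from the bitwise operations slides: the mask keeps the leading bits and the comparison checks their pattern.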
Real Numbers
Not all numbers are integers
Real numbers also have a fractional part
Two approaches:
Fixed point
Preferable when numbers are neither very large nor very small and a lot of computation must be done
Floating point
Preferable when numbers can be very large and very small and slow computation can be tolerated
Denary Fixed Point
Consider the denary number 132.36
This means
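$1 \times 10^{2} + 3 \times 10^{1} + 2 \times 10^{0} + 3 \times 10^{-1} + 6 \times 10^{-2}$
$1 \times 100 + 3 \times 10 + 2 \times 1 + 3 \times \tfrac{1}{10} + 6 \times \tfrac{1}{100}$
Digits to the left of the decimal point have non-negative powers of 10; digits to the right have negative powers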
Advantage is that arithmetic can be done by standard integer hardware, with appropriate shifting:
132.36 + 31.4 = 163.76
is the same operation as
13236 + 3140 = 16376
Shift by 2 positions is the same as division by $10^{2} = 100$
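The same idea can be written in C by storing every value as an integer number of hundredths. This is a minimal sketch: the dfx_ names and the scale of 100 are choices made for illustration, and the multiply routine truncates rather than rounds.

```c
#include <stdint.h>

#define DFX_SCALE 100   /* two denary places: 132.36 is stored as 13236 */

/* Addition needs no adjustment because both operands share the same scale. */
static int32_t dfx_add(int32_t a, int32_t b)
{
    return a + b;
}

/* The raw product is scaled by 100 x 100, so one factor of 100 is divided
   back out. This is the "appropriate shifting". The result is truncated. */
static int32_t dfx_mul(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * b) / DFX_SCALE);
}

/* Split a stored value into the parts before and after the decimal point
   (shown for non-negative values only). */
static int32_t dfx_whole(int32_t x) { return x / DFX_SCALE; }
static int32_t dfx_frac(int32_t x)  { return x % DFX_SCALE; }
```

With these helpers, dfx_add(13236, 3140) gives 16376, which splits into 163 and 76, that is 163.76. Only integer instructions are used.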
Binary Fixed Point
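Unsigned arithmetic: 101.11 means
$1 \times 2^{2} + 0 \times 2^{1} + 1 \times 2^{0} + 1 \times 2^{-1} + 1 \times 2^{-2}$
$1 \times 4 + 0 \times 2 + 1 \times 1 + 1 \times \tfrac{1}{2} + 1 \times \tfrac{1}{4}$
= 5.75 (denary)
Signed arithmetic: 101.11 means
$1 \times (-2^{2}) + 0 \times 2^{1} + 1 \times 2^{0} + 1 \times 2^{-1} + 1 \times 2^{-2}$
$1 \times (-4) + 0 \times 2 + 1 \times 1 + 1 \times \tfrac{1}{2} + 1 \times \tfrac{1}{4}$
= -2.25 (denary)
Shift by 2 positions is the same as division by $2^{2} = 4$
Unsigned: the integer 10111 is 23 denary, and 23 / 4 = 5.75
Signed: the integer 10111 is -9 denary, and -9 / 4 = -2.25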
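Both readings of the bit pattern 10111 can be reproduced in a few lines of C. This is a sketch: the names and the 5-bit word with 2 fractional bits mirror the slide example rather than any real register width, and double is used only to display the result, not for the fixed point arithmetic itself.

```c
#include <stdint.h>

#define FX_FRAC_BITS 2    /* two bits after the binary point */

/* Unsigned reading: the stored integer divided by 2^2. */
static double fx5_unsigned_value(uint8_t raw)
{
    return (raw & 0x1F) / (double)(1 << FX_FRAC_BITS);
}

/* Signed reading: the top bit of the 5-bit word has weight -16 in the
   stored integer, so 32 is subtracted when that bit is set. */
static double fx5_signed_value(uint8_t raw)
{
    int value = raw & 0x1F;
    if (value & 0x10)
        value -= 32;
    return value / (double)(1 << FX_FRAC_BITS);
}
```

For the pattern 10111 (23 as a stored integer), the unsigned reading is 5.75 and the signed reading is -2.25, matching the slide.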
Disadvantage of Fixed Point
Suppose we have to deal with numbers that can be very different sizes:
Example:
214000000000 × 0.000000000053
To represent both numbers would require a very large number of digits using fixed point:
214000000000.000000000000 × 000000000000.000000000053
If we routinely have to deal with many numbers of widely differing size, this can become unrealistic
Floating Point Types
Equivalent of scientific notation, e.g.
$-2.63 \times 10^{4}$
A floating point number has 3 parts:
Sign (-)
Mantissa (263)
Exponent (4)
float datatype in C packs binary representations of sign, exponent, and mantissa into 32-bit word
32-bit float can represent numbers with magnitudes in the range
$\pm 1.4 \times 10^{-45}$ to $\pm 3.4 \times 10^{38}$
Bit layout: sign (S) in bit 31, exponent in bits 30 to 23, mantissa in bits 22 to 0
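The three fields can be pulled out of a float with the shift and mask operations from earlier in the unit. This is a sketch that assumes float uses the 32-bit IEEE 754 format, which is true on most platforms but is not required by the C standard. The type and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t sign;       /* bit 31                      */
    uint32_t exponent;   /* bits 30 to 23, as stored    */
    uint32_t mantissa;   /* bits 22 to 0, as stored     */
} float_fields_t;

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

static float_fields_t float_unpack(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* copy the raw bit pattern safely */

    float_fields_t out;
    out.sign     = bits >> 31;
    out.exponent = (bits >> 23) & 0xFFu;
    out.mantissa = bits & 0x7FFFFFu;
    return out;
}
```

In IEEE 754 the stored exponent is the true binary exponent plus 127, and the leading 1 of the mantissa is not stored. For 1.0f the fields are therefore sign 0, exponent 127, mantissa 0.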
Floating Point Types
double datatype in C packs binary representations of sign, exponent, and mantissa into 64-bit word
64-bit double can represent numbers with magnitudes in the range
$\pm 5.0 \times 10^{-324}$ to $\pm 1.7 \times 10^{308}$
Bit layout: sign (S) in bit 63, exponent in bits 62 to 52, mantissa in bits 51 to 0
Floating Point Arithmetic
Adding or multiplying floating point numbers is very complicated
Example:
Suppose we are working in base-10 and have 4 digits to represent our numbers. We want to perform the addition:
$9.936 \times 10^{-3} + 7.355 \times 10^{-5}$
Floating Point Addition
Here are the required operations:
0: Our starting point: $9.936 \times 10^{-3} + 7.355 \times 10^{-5}$
1: Align the exponents: $9.936 \times 10^{-3} + 0.07355 \times 10^{-3}$
2: Add the mantissas: $10.00955 \times 10^{-3}$
3: Normalise: $1.000955 \times 10^{-2}$
4: Round: $1.001 \times 10^{-2}$
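The align, add, normalise and round steps can be sketched in C for the base-10, 4-digit example. This is a toy model, not how a real floating point library works: dec4_t and dec4_add are hypothetical names, the mantissa is held as a 4-digit integer (so 9.936 × 10^-3 is stored as 9936 × 10^-6), only positive values are handled, and the exponents are assumed to be close enough that the intermediate sum fits in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Toy decimal float: value = mant x 10^exp10, with mant in 1000..9999. */
typedef struct {
    int64_t mant;
    int     exp10;
} dec4_t;

static dec4_t dec4_add(dec4_t a, dec4_t b)
{
    assert(a.mant > 0 && b.mant > 0);   /* positive values only in this sketch */

    /* Make a the operand with the larger exponent. */
    if (a.exp10 < b.exp10) { dec4_t t = a; a = b; b = t; }

    /* Step 1: align the exponents by rescaling a to b's exponent. */
    int64_t sum = a.mant;
    for (int i = 0; i < a.exp10 - b.exp10; i++)
        sum *= 10;

    /* Step 2: add the mantissas. */
    sum += b.mant;
    int exp10 = b.exp10;

    /* Step 3: normalise by finding the power of ten that leaves 4 digits. */
    int64_t divisor = 1;
    while (sum / divisor >= 10000) {
        divisor *= 10;
        exp10++;
    }

    /* Step 4: round to nearest (halves round up). */
    int64_t mant = (sum + divisor / 2) / divisor;
    if (mant >= 10000) {      /* rounding carried into a fifth digit */
        mant /= 10;
        exp10++;
    }

    return (dec4_t){ mant, exp10 };
}
```

Adding {9936, -6} and {7355, -8} gives {1001, -5}, that is 1.001 × 10^-2. Even this simplified version needs two loops, a division and several comparisons for a single addition.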
Floating Point Arithmetic
Powerful computers have special floating point hardware to accomplish this efficiently
Most microcontrollers do not have floating point hardware, and must perform exponent alignment, normalisation, and rounding using complicated software routines – can be very slow
Summary
Bitwise and shift operations are often used to set individual bits in special function registers
Numbers with fractional parts can be represented as fixed point or floating point
Processors with simple hardware normally prefer fixed point