Carnegie Mellon
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
1
Carnegie Mellon
Bits, Bytes and Integers – Part 1
15-213/18-213/14-513/15-513: Introduction to Computer Systems 2nd Lecture, May 20, 2020
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
2
Carnegie Mellon
Announcements
Linux Boot Camp Tuesday, noon EDT
Lab 0 will be available via course web page and Autolab. ▪ Release estimated at 5/20, 9pm
▪ Due Fri May 29, 11:59pm ▪ No grace days
▪ No late submissions
▪ Just do it!
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
3
Carnegie Mellon
Today: Bits, Bytes, and Integers
Representing information as bits
Bit-level manipulations
Integers
▪ Representation: unsigned and signed
▪ Conversion, casting
▪ Expanding, truncating
▪ Addition, negation, multiplication, shifting ▪ Summary
Representations in memory, pointers, strings
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
4
Carnegie Mellon
Everything is bits Each bit is 0 or 1
By encoding/interpreting sets of bits in various ways ▪ Computers determine what to do (instructions)
▪ … and represent and manipulate numbers, sets, strings, etc…
Why bits? Electronic Implementation
An Amazing & Successful Abstraction.
(which we won’t dig into in 213)
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
5
Carnegie Mellon
Everything is bits Each bit is 0 or 1
By encoding/interpreting sets of bits in various ways ▪ Computers determine what to do (instructions)
▪ … and represent and manipulate numbers, sets, strings, etc…
Why bits? Electronic Implementation
▪ Easy to store with bistable elements
▪ Reliably transmitted on noisy and inaccurate wires
010
1.1V 0.9V
0.2V 0.0V
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
6
Carnegie Mellon
For example, can count in binary
Base 2 Number Representation
▪ Represent 1521310 as 111011011011012
▪ Represent 1.2010 as 1.0011001100110011[0011]…2 ▪ Represent 1.5213 X 104 as 1.11011011011012 X 213
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
7
Carnegie Mellon
Encoding Byte Values
Byte = 8 bits
▪ Binary 000000002 to 111111112
▪ Decimal: 010 to 25510
▪ Hexadecimal 0016 to FF16
▪ Base 16 number representation
▪ Use characters ‘0’ to ‘9’ and ‘A’ to ‘F’ ▪ Write FA1D37B16 in C as
– 0xFA1D37B – 0xfa1d37b
15213: 0011 1011 0110 1101
3B6D
0
0
0000
1
1
0001
2
2
0010
3
3
0011
4
4
0100
5
5
0101
6
6
0110
7
7
0111
8
8
1000
9
9
1001
A
10
1010
B
11
1011
C
12
1100
D
13
1101
E
14
1110
F
15
1111
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
8
Carnegie Mellon
Example Data Representations
C Data Type
Typical 32-bit
Typical 64-bit
x86-64
char
1
1
1
short
2
2
2
int
4
4
4
long
4
8
8
float
4
4
4
double
8
8
8
pointer
4
8
8
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
9
Carnegie Mellon
Example Data Representations
C Data Type
char
short
int
long
float
double
pointer
1
2
4
4
4
8
4
1
2
4
8
4
8
8
1
2
4
8
4
8
8
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
10
Typical 32-bit
Typical 64-bit
x86-64
Carnegie Mellon
Today: Bits, Bytes, and Integers
Representing information as bits
Bit-level manipulations
Integers
▪ Representation: unsigned and signed
▪ Conversion, casting
▪ Expanding, truncating
▪ Addition, negation, multiplication, shifting ▪ Summary
Representations in memory, pointers, strings
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
11
Carnegie Mellon
Boolean Algebra
Developed by George Boole in 19th Century ▪ Algebraic representation of logic
▪ Encode “True” as 1 and “False” as 0 And Or
◼A&B = 1 when both A=1 and B=1 ◼A|B = 1 when either A=1 or B=1
Not Exclusive-Or (Xor)
◼~A = 1 when A=0 ◼A^B = 1 when either A=1 or B=1, but not both
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
12
Carnegie Mellon
General Boolean Algebras Operate on Bit Vectors
▪ Operations applied bitwise
01101001 01101001 01101001
& 01010101 | 01010101 ^ 01010101 ~ 01010101
01000001 01111101 00111100 10101010 01000001 01111101 00111100 10101010
All of the Properties of Boolean Algebra Apply
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
13
Carnegie Mellon
Example: Representing & Manipulating Sets
Representation
▪ Width w bit vector represents subsets of {0, …, w–1} ▪ aj = 1 if j ∈ A
▪ 01101001 ▪ 76543210
▪ 01010101 ▪ 76543210
Operations
▪ &
▪ |
▪ ^
{ 0, 3, 5, 6 }
{ 0, 2, 4, 6 }
01000001 { 0, 6 } 01111101 { 0, 2, 3, 4, 5, 6 } 00111100 { 2, 3, 4, 5 } 10101010 { 1, 3, 5, 7 }
Intersection
Union
Symmetric difference Complement
▪ ~
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
14
Carnegie Mellon
Bit-Level Operations in C
Operations &, |, ~, ^ Available in C
▪ Apply to any “integral” data type ▪ long, int, short, char, unsigned
▪ View arguments as bit vectors
▪ Arguments applied bit-wise
Examples (Char data type) ▪ ~0x41 → 0xBE
▪ ~010000012 → 101111102 ▪ ~0x00 → 0xFF
▪ ~000000002 → 111111112 ▪ 0x69 & 0x55 → 0x41
▪ 011010012 & 010101012 → 010000012 ▪ 0x69 | 0x55 → 0x7D
▪ 011010012 | 010101012 → 011111012 Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
15
0
0
0000
1
1
0001
2
2
0010
3
3
0011
4
4
0100
5
5
0101
6
6
0110
7
7
0111
8
8
1000
9
9
1001
A
10
1010
B
11
1011
C
12
1100
D
13
1101
E
14
1110
F
15
1111
Carnegie Mellon
Bit-Level Operations in C
Operations &, |, ~, ^ Available in C
▪ Apply to any “integral” data type ▪ long, int, short, char, unsigned
▪ View arguments as bit vectors
▪ Arguments applied bit-wise
Examples (Char data type) ▪ ~0x41 → 0xBE
▪ ~0100000122→10111111002 2 ▪ ~0x00 → 0xFF
▪ ~0000000022→1111111112 2 ▪ 0x69 & 0x55 → 0x41
▪ 0110100122&0101001100112 2→→00110000010212 ▪ 0x69 | 0x55 → 0x7D
▪ 0110100122|0101001100112 2→→00111110110212 Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
16
0
0
0000
1
1
0001
2
2
0010
3
3
0011
4
4
0100
5
5
0101
6
6
0110
7
7
0111
8
8
1000
9
9
1001
A
10
1010
B
11
1011
C
12
1100
D
13
1101
E
14
1110
F
15
1111
Carnegie Mellon
Contrast: Logic Operations in C
Contrast to Bit-Level Operators ▪ Logic Operations: &&, ||, !
▪ View 0 as “False”
▪ Anything nonzero as “True” ▪ Always return 0 or 1
▪ Early termination
Examples (char data type) ▪ !0x41 → 0x00
▪ !0x00 → 0x01
▪ !!0x41→ 0x01
▪ 0x69 && 0x55 → 0x01 ▪ 0x69 || 0x55 → 0x01
Watch out for && vs. & (and || vs. |)… Super common C programming pitfall!
▪ p && *p (avoids null pointer access) Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
17
Carnegie Mellon
Shift Operations
Left Shift: x << y
▪ Shift bit-vector x left y positions
– Throw away extra bits on left ▪ Fill with 0’s on right
Right Shift: x >> y
▪ Shift bit-vector x right y positions
▪ Throw away extra bits on right ▪ Logical shift
▪ Fill with 0’s on left ▪ Arithmetic shift
▪ Replicate most significant bit on left Undefined Behavior
▪ Shift amount < 0 or ≥ word size Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
18
Argument x
01100010
<< 3
00010000
Log. >> 2
00011000
Arith. >> 2
00011000
Argument x
10100010
<< 3
00010000
Log. >> 2
00101000
Arith. >> 2
11101000
Carnegie Mellon
Today: Bits, Bytes, and Integers
Representing information as bits
Bit-level manipulations
Integers
▪ Representation: unsigned and signed
▪ Conversion, casting
▪ Expanding, truncating
▪ Addition, negation, multiplication, shifting ▪ Summary
Representations in memory, pointers, strings Summary
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
19
Carnegie Mellon
Encoding Integers Unsigned w−1
B2U(X) = xi2i i=0
Two’s Complement w−2
B2T(X) = −xw−12w−1+xi2i i=0
short int x = 15213;
short int y = -15213;
C does not mandate using two’s complement ▪ But, most machines do, and we will assume so
C short 2 bytes long
Sign Bit
▪ For 2’s complement, most significant bit indicates sign
▪ 0 for nonnegative
▪ 1 for negative
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
20
Sign Bit
Decimal
Hex
Binary
x
15213
3B 6D
00111011 01101101
y
-15213
C4 93
11000100 10010011
Carnegie Mellon
Two-complement: Simple Example
-16
8
4
2
1
0
1
0
1
0
10 = 8+2 = 10
-16
8
4
2
1
1
0
1
1
0
-10 = -16+4+2 = -10
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
21
Carnegie Mellon
Two-complement Encoding Example (Cont.)
x = 15213: 00111011 01101101
y = -15213: 11000100 10010011
Weight
15213
-15213
1 2 4 8
16 32 64
128 256 512
1024
2048
4096
8192
16384 -32768
11 00 14 18 00 1 32 1 64 00 1 256 1 512 00 1 2048 1 4096 1 8192 00 00
11 12 00 00 1 16 00 00 1 128 00 00 1 1024 00 00 00 1 16384 1 -32768
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Sum 15213 -15213
22
Carnegie Mellon
Numeric Ranges Unsigned Values
Two’s Complement Values
▪ UMin 000…0
▪ UMax 111…1
= 0
= 2w – 1
▪ TMin = 100…0
▪ TMax = 011…1
▪ Minus 1 111…1
–2w–1 2w–1 – 1
Values for W = 16
Decimal
Hex
Binary
UMax
65535
FF FF
11111111 11111111
TMax
32767
7F FF
01111111 11111111
TMin
-32768
80 00
10000000 00000000
-1
-1
FF FF
11111111 11111111
0
0
00 00
00000000 00000000
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
23
Carnegie Mellon
Values for Different Word Sizes
W
8
16
32
64
UMax
255
65,535
4,294,967,295
18,446,744,073,709,551,615
TMax
127
32,767
2,147,483,647
9,223,372,036,854,775,807
TMin
-128
-32,768
-2,147,483,648
-9,223,372,036,854,775,808
Observations
▪ |TMin | = TMax + 1
▪ Asymmetric range
▪ UMax = 2 * TMax + 1 ▪ Question: abs(TMin)?
C Programming
▪ #include
▪ ULONG_MAX ▪ LONG_MAX ▪ LONG_MIN
▪ Values platform specific
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
24
Carnegie Mellon
Unsigned & Signed Numeric Values Equivalence
▪ Same encodings for nonnegative values
Uniqueness
▪ Every bit pattern represents
unique integer value
▪ Each representable integer has unique bit encoding
Can Invert Mappings ▪ U2B(x) = B2U-1(x)
▪ Bit pattern for unsigned integer
▪ T2B(x) = B2T-1(x)
▪ Bit pattern for two’s comp integer
X
B2U(X) B2T(X)
0000
00
0001
11
0010
22
0011
33
0100
44
0101
55
0110
66
0111
77
1000
8 –8
1001
9 –7
1010
10 –6
1011
11 –5
1100
12 –4
1101
13 –3
1110
14 –2
1111
15 –1
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
25
Carnegie Mellon
Quiz Time!
Check out:
https://canvas.cmu.edu/courses/13182
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
26
Carnegie Mellon
Today: Bits, Bytes, and Integers
Representing information as bits
Bit-level manipulations
Integers
▪ Representation: unsigned and signed
▪ Conversion, casting
▪ Expanding, truncating
▪ Addition, negation, multiplication, shifting ▪ Summary
Representations in memory, pointers, strings
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
27
Carnegie Mellon
Mapping Between Signed & Unsigned
Two’s Complement
T2U
Unsigned
x X ux
T2B
Maintain Same Bit Pattern Unsigned U2T
Two’s Complement
ux X x
Maintain Same Bit Pattern
Mappings between unsigned and two’s complement numbers: Keep bit representations and reinterpret
U2B
B2T
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
28
B2U
Carnegie Mellon
Mapping Signed Unsigned
Bits
0000
0001
0010
0011
0100
0101
0110
0111
1000
1001
1010
1011
1100
1101
1110
1111
Signed
0
1
2
3
4
5
6
7
-8
-7
-6
-5
-4
-3
-2
-1
Unsigned
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
T2U
U2T
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
29
Carnegie Mellon
Mapping Signed Unsigned =
Bits
0000
0001
0010
0011
0100
0101
0110
0111
1000
1001
1010
1011
1100
1101
1110
1111
Signed
0
1
2
3
4
5
6
7
-8
-7
-6
-5
-4
-3
-2
-1
Unsigned
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
+/- 16
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
30
Carnegie Mellon
Relation between Signed & Unsigned
Two’s Complement
T2U
Unsigned
x X ux
Maintain Same Bit Pattern
w–1 0 ux
x
Large negative weight
becomes
Large positive weight
T2B
B2U
+
+
+
•••
+
+
+
–
+
+
•••
+
+
+
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
31
Carnegie Mellon
Conversion Visualized
2’s Comp. → Unsigned ▪ Ordering Inversion
▪ Negative → Big Positive
UMax UMax – 1
TMax + 1 TMax TMax
Unsigned Range
2’s Complement Range
00 –1
–2
TMin
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
32
Carnegie Mellon
Signed vs. Unsigned in C
Constants
▪ By default are considered to be signed integers ▪ Unsigned if have “U” as suffix
0U, 4294967259U
Casting
▪ Explicit casting between signed & unsigned same as U2T and T2U int tx, ty;
unsigned ux, uy;
tx = (int) ux;
uy = (unsigned) ty;
▪ Implicit casting also occurs via assignments and procedure calls tx = ux; int fun(unsigned u); uy = ty; uy = fun(tx);
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
33
Carnegie Mellon
Casting Surprises Expression Evaluation
▪If there is a mix of unsigned and signed in single expression, signed values implicitly cast to unsigned
▪ Including comparison operations <, >, ==, <=, >=
▪Examples for W = 32: TMIN = -2,147,483,648 , TMAX = 2,147,483,647
Constant1
0
-1
-1
Constant2
Relation Evaluation == unsigned
signed unsigned signed unsigned signed unsigned unsigned signed
0
0U
-1
0
-1
0 < 0U
2147483647
0U > -2147483647-1
2147483647
-2147483648 > -2147483647-1
2147483647U
2147483647U -1
-2147483648 < -2
-1 (unsigned)-1
-2 > -2
(unsigned) -1
-2 > 2147483648U
2147483647
2147483647
2147483648U < (int) 2147483648U
2147483647
2147483647
(int) 2147483648U > Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
34
0U
Carnegie Mellon
Summary
Casting Signed ↔ Unsigned: Basic Rules
Bit pattern is maintained
But reinterpreted
Can have unexpected effects: adding or subtracting 2w
Expression containing signed and unsigned int
▪ int is cast to unsigned!!
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
36
Carnegie Mellon
Today: Bits, Bytes, and Integers
Representing information as bits
Bit-level manipulations
Integers
▪ Representation: unsigned and signed
▪ Conversion, casting
▪ Expanding, truncating
▪ Addition, negation, multiplication, shifting ▪ Summary
Representations in memory, pointers, strings
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
37
Carnegie Mellon
Sign Extension
Task:
▪ Given w-bit signed integer x
▪ Convert it to w+k-bit integer with same value
Rule:
▪ Make k copies of sign bit:
▪X= xw–1,…,xw–1,xw–1,xw–2,…,x0
k copies of MSB
w
•••
X
•••
X
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
•••
•••
kw
38
Sign Extension: Simple Example
Positive number
Negative number
Carnegie Mellon
-16
8
4
2
1
0
1
0
1
0
10 =
-10 =
-32
16
8
4
2
1
0
0
1
0
1
0
10 =
-10 =
-16
8
4
2
1
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
39
-32
1
1
16
1
0
8
0
1
4
1
1
2
1
0
1
0
Carnegie Mellon
Larger Sign Extension Example
short int x = 15213;
int ix = (int) x;
short int y = -15213;
int iy = (int) y;
Decimal
Hex
Binary
x
15213
3B 6D
00111011 01101101
ix
15213
00 00 3B 6D
00000000 00000000 00111011 01101101
y
-15213
C4 93
11000100 10010011
iy
-15213
FF FF C4 93
11111111 11111111 11000100 10010011
Converting from smaller to larger integer data type C automatically performs sign extension
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
40
Carnegie Mellon
Truncation
Task:
▪ Given k+w-bit signed or unsigned integer X
▪ Convert it to w-bit integer X’ with same value for “small enough” X
Rule:
▪ Drop top k bits: ▪X= xw–1,xw–2,…,x0
X
kw
•••
X
w
41
•••
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
•••
•••
Carnegie Mellon
Truncation: Simple Example
No sign change
Sign change
10 =
-6 =
10 mod 16 = 10U mod 16 = 10U = -6
-10 =
6 =
-10 mod 16 = 22U mod 16 = 6U = 6
-16
8
4
2
1
0
0
0
1
0
-16
8
4
2
1
0
1
0
1
0
2 =
2 =
-8
4
2
1
0
0
1
0
-8
4
2
1
1
0
1
0
2 mod 16 = 2
-16
8
4
2
1
1
1
0
1
0
-16
8
4
2
1
1
0
1
1
0
-6 =
-6 =
-8
4
2
1
1
0
1
0
-8
4
2
1
0
1
1
0
-6 mod 16 = 26U mod 16 = 10U = -6
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
42
Carnegie Mellon
Summary:
Expanding, Truncating: Basic Rules
Expanding (e.g., short int to int) ▪ Unsigned: zeros added
▪ Signed: sign extension
▪ Both yield expected result
Truncating (e.g., unsigned to unsigned short) ▪ Unsigned/signed: bits are truncated
▪ Result reinterpreted
▪ Unsigned: mod operation
▪ Signed: similar to mod
▪ For small (in magnitude) numbers yields expected behavior
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
43
Carnegie Mellon
Summary of Today: Bits, Bytes, and Integers
Representing information as bits
Bit-level manipulations
Integers
▪ Representation: unsigned and signed
▪ Conversion, casting
▪ Expanding, truncating
▪ Addition, negation, multiplication, shifting
Representations in memory, pointers, strings Summary
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
44