Introduction to Computer Systems 15-213/18-243, spring 2009 1st Lecture, Jan. 12th
Data representation I
Copyright By PowCoder代写 加微信 powcoder
Acknowledgement: These slides are based on the textbook
(Computer Systems: A Programmer’s Perspective) and its slides.
What happens in a computer during program execution?
Our focus today:
“How is data stored in memory?”
#include
int main()
int c=a+b;
printf(“%d\n”,c);
Background on bits
Representation: unsigned and signed
Conversion, casting
Addition, negation, multiplication, shifting
Representations in memory, pointers, strings
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Represent everything in bits
Each bit is 0 or 1
Bits are used to
Encode instructions (to be used in CPU)
Represent numbers, sets, strings, etc…
Base 2 number representation
Represent 1521310 as 111011011011012
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Encoding Byte Values
1 byte = 8 bits
Binary 000000002 to 111111112
Decimal: 010 to 25510
Hexadecimal 0016 to FF16
Base 16 number representation
Use characters ‘0’ to ‘9’ and ‘A’ to ‘F’
Write FA1D37B16 in C as
#include
int main() {
printf(“%x\n”,262263675);
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Example Data Representations
C Data Type Typical 32-bit Typical 64-bit x86-64
char 1 1 1
short 2 2 2
long 4 8 8
float 4 4 4
double 8 8 8
long double − − 16
pointer 4 8 8
We will study them in lecture 3
#include
int main() {
printf(“%d\n”,sizeof(int));
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Boolean Algebra on Single Bit
Algebra of logic expression
Encode “True” as 1 and “False” as 0
A&B = 1 when both A=1 and B=1
A|B = 1 when either A=1 or B=1
~A = 1 when A=0
Exclusive-Or (Xor) ^
A^B = 1 when either A=1 or B=1, but not both
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Boolean Algebra on Bit Vectors
Operate on Bit Vectors
Operations applied bitwise
& 01010101
| 01010101
^ 01010101
~ 01010101
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Example: Representing & Manipulating Sets
Representation
Width w bit vector represents subsets of {0, …, w–1}
Examples of bit vectors:
Operations
X & Y Intersection 01000001 { 0, 6 }
X | Y Union 01111101 { 0, 2, 3, 4, 5, 6 }
X ^ Y Symmetric difference 00111100 { 2, 3, 4, 5 }
~Y Complement 10101010 { 1, 3, 5, 7 }
Position 7 6 5 4 3 2 1 0
Vector X 0 1 1 0 1 0 0 1
Vector Y 0 1 0 1 0 1 0 1
{ 0, 3, 5, 6 }
{ 0, 2, 4, 6 }
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Bit-Level Operations in C
Operations &, |, ~, ^ Available in C
Apply to any “integral” data type
long, int, short, char, unsigned
View arguments as bit vectors
Arguments applied bit-wise
Examples (char data type)
~0x41 0xBE
~010000012 101111102
~0x00 0xFF
~000000002 111111112
0x69 & 0x55 0x41
011010012 & 010101012 010000012
0x69 | 0x55 0x7D
011010012 | 010101012 011111012
#include
int main() {
char ch = 0x69 & 0x55;
printf(“%x\n”,ch);
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Contrast: Logic Operations in C
Contrast to Logical Operators
View 0 as “False”
View nonzero as “True”
Always return 0 or 1
Examples (char data type)
!0x41 0x00
!0x00 0x01
!!0x41 0x01
0x69 && 0x55 0x01
0x69 || 0x55 0x01
Watch out for && vs. & (and || vs. |)…
a common mistake in C programming
#include
int main() {
char ch = 0x69 && 0x55;
printf(“%x\n”,ch);
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Shift Operations
Left Shift: x << y
Shift bit-vector x left y positions
Throw away extra bits on left
Fill with 0’s on right
Right Shift: x >> y
Shift bit-vector x right y positions
Throw away extra bits on right
Logical shift
Fill with 0’s on left
Arithmetic shift
Replicate most significant bit on left
Undefined Behavior
Shift amount < 0 or ≥ word size
Argument x
Argument x
Arith. >> 2
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Background on bits
Representation: unsigned and signed
Conversion, casting
Addition, negation, multiplication, shifting
Representations in memory, pointers, strings
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Encoding Integers
short int x = 15213;
short int y = -15213;
C short: 2 bytes long
For 2’s complement, most significant bit indicates sign
0 for nonnegative
1 for negative
Two’s Complement
#include
int main() {
short int x = 15213;
short int y = -15213;
printf(“%x %x\n”,x,y);
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Numeric Ranges
Unsigned Values
UMax = 2w – 1
Two’s Complement Values
TMin = –2w–1
TMax = 2w–1 – 1
Other Values
Values for W = 16
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Values for Different Word Sizes
Observations
|TMin | = TMax + 1
Asymmetric range
UMax = 2 * TMax + 1
C Programming
#include
Values platform specific
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Unsigned & Signed Numeric Values
Equivalence
Same encodings for nonnegative values
Uniqueness
Every bit pattern represents unique integer value
Each representable integer has unique bit encoding
Can Invert Mappings
U2B(x) = B2U-1(x)
Bit pattern for unsigned integer
T2B(x) = B2T-1(x)
Bit pattern for two’s comp integer
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Background on bits
Representation: unsigned and signed
Conversion, casting
Addition, negation, multiplication, shifting
Representations in memory, pointers, strings
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Two’s Complement
Maintain Same Bit Pattern
Mapping Between Signed & Unsigned
Two’s Complement
Maintain Same Bit Pattern
Mappings between unsigned and two’s complement numbers:
Keep bit representations and reinterpret
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Mapping Signed Unsigned
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Mapping Signed Unsigned
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
T Max + 1
2’s Complement Range
Conversion Visualized
2’s Comp. Unsigned
Ordering Inversion
Negative Big Positive
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Signed vs. Unsigned in C
By default are considered to be signed integers
Unsigned if have “U” as suffix
0U, 4294967259U
Explicit casting between signed & unsigned same as U2T and T2U
int tx, ty;
unsigned ux, uy;
tx = (int) ux;
uy = (unsigned) ty;
Implicit casting also occurs via assignments and procedure calls
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
0 0U == unsigned
-1 0 < signed
-1 0U > unsigned
2147483647 -2147483648 > signed
2147483647U -2147483648 < unsigned
2147483647 2147483648U < unsigned
2147483647 (int) 2147483648U > signed
Expression Evaluation
If there is a mix of unsigned and signed in single expression,
signed values implicitly cast to unsigned
Including comparison operations <, >, ==, <=, >=
Examples for W = 32: TMIN = -2147483648 , TMAX = 2147483647
Constant1 Constant2 Relation Evaluation
2147483647 -2147483647-1
2147483647U -2147483647-1
2147483647 2147483648U
2147483647 (int) 2147483648U
Casting Surprises
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Background on bits
Representation: unsigned and signed
Conversion, casting
Addition, negation, multiplication, shifting
Representations in memory, pointers, strings
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Unsigned Addition
Standard Addition Function
Ignores carry output
Implements Modular Arithmetic
s = UAddw(u , v) = u + v mod 2w
True Sum: w+1 bits
Operands: w bits
Discard Carry: w bits
UAddw(u , v)
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Two’s Complement Addition (for Signed)
TAdd and UAdd have Identical Bit-Level Behavior
Signed vs. unsigned addition in C:
int s, t, u, v;
s = (int) ((unsigned) u + (unsigned) v);
Will give s == t
True Sum: w+1 bits
Operands: w bits
Discard Carry: w bits
TAddw(u , v)
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
TAdd Overflow
Functionality
True sum requires w+1 bits
Drop off MSB
Treat remaining bits as 2’s comp. integer
TAdd Result
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Unsigned Multiplication in C
Standard Multiplication Function
Ignores high order w bits
Implements Modular Arithmetic
UMultw(u , v) = u · v mod 2w
True Product: 2*w bits
Operands: w bits
Discard w bits: w bits
UMultw(u , v)
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Signed Multiplication in C
Standard Multiplication Function
Ignores high order w bits
Some of which are different for signed vs. unsigned multiplication
Lower bits are the same
True Product: 2*w bits
Operands: w bits
Discard w bits: w bits
TMultw(u , v)
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Power-of-2 Multiply with Shift
u << k gives u * 2k
Both signed and unsigned
u << 3 == u * 8
(u << 5) – (u << 3) == u * 24
Most machines shift and add faster than multiply
Compiler generates this code automatically
True Product: w+k bits
Operands: w bits
Discard k bits: w bits
UMultw(u , 2k)
TMultw(u , 2k)
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Unsigned Power-of-2 Divide with Shift
Quotient of Unsigned by Power of 2
u >> k gives u / 2k
Uses logical shift
u / 2k
Binary Point
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Unsigned vs. Signed
Understand the domain of types
Don’t use unsigned without understanding implications
Easy to make mistakes
unsigned i;
unsigned cnt=10;
for (i = cnt; i >= 0; i–)
printf(“%u %d\n”,i,i);
OK to use unsigned to represent sets
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Background on bits
Representation: unsigned and signed
Conversion, casting
Addition, negation, multiplication, shifting
Representations in memory, pointers, strings
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Byte-Oriented Memory Organization
Programs refer to data by address
Conceptually, we view it as a very large array of bytes
An address is like an index into that array
and, a pointer variable stores an address
Note: system provides private address spaces to each “process”
Think of a process as a program being executed
So, a program can read/write its own data, but not others’ data
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Machine Words
Any given computer has a “Word Size”
Nominal size of integer-valued data
and of addresses
Until recently, most machines used 32 bits (4 bytes) as word size
Limits addresses to 4GB (232 bytes)
Increasingly, machines have 64-bit word size
Potentially, could use 18 EB (exabytes) of addressable memory
That’s 18.4 X 1018
Machines still support multiple data formats
Fractions or multiples of word size
Always integral number of bytes
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Word-Oriented Memory Organization
Addresses Specify Byte Locations
Address of first byte in word
Addresses of successive words differ by 4 (32-bit) or 8 (64-bit)
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Byte Ordering
How are the bytes within a word ordered in memory?
Conventions
Big Endian: Sun, PPC Mac, Internet
Least significant byte has highest address
Little Endian: x86, ARM processors running Android, iOS, and Windows
Least significant byte has lowest address
Variable x has 4-byte value of 0x01234567
Address given by &x is 0x100
Big Endian
Little Endian
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Representing Integers
Decimal: 15213
Binary: 0011 1011 0110 1101
Hex: 3 B 6 D
IA32, x86-64
int A = 15213;
IA32, x86-64
Two’s complement representation
int B = -15213;
long int C = 15213;
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Examining Data Representations
Code to Print Byte Representation of Data
Casting pointer to unsigned char * allows treatment as a byte array
Printf directives:
%p: Print pointer
%x: Print Hexadecimal
typedef unsigned char *pointer;
void show_bytes(pointer start, size_t len){
for (i = 0; i < len; i++)
printf("%p\t0x%.2x\n",start+i, start[i]);
printf("\n");
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
show_bytes Execution Example
int a = 15213;
printf("int a = 15213;\n");
show_bytes((pointer) &a, sizeof(int));
Result (Linux x86-64):
int a = 15213;
0x7fffb7f71dbc 6d
0x7fffb7f71dbd 3b
0x7fffb7f71dbe 00
0x7fffb7f71dbf 00
Decimal: 15213
Binary: 0011 1011 0110 1101
Hex: 3 B 6 D
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Representing Pointers
Different compilers & machines assign different locations to objects
Even get different results each time run program
int B = -15213;
int *P = &B;
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
char S[6] = "18213";
Representing Strings
Strings in C
Represented by array of characters
Each character encoded in ASCII format
Standard 7-bit encoding of character set
Character “0” has code 0x30
Digit i has code 0x30+i
String should be null-terminated
Final character = 0
Compatibility
Byte ordering not an issue
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Decimal Hex Binary
3B 6D 00111011 01101101
C4 93 11000100 10010011
00111011 01101101
11000100 10010011
Decimal Hex Binary
FF FF 11111111 11111111
7F FF 01111111 11111111
80 00 10000000 00000000
FF FF 11111111 11111111
00 00 00000000 00000000
11111111 11111111
01111111 11111111
10000000 00000000
11111111 11111111
00000000 00000000
8 16 32 64
UMax 255 65,535 4,294,967,295 18,446,744,073,709,551,615
TMax 127 32,767 2,147,483,647 9,223,372,036,854,775,807
TMin -128 -32,768 -2,147,483,648 -9,223,372,036,854,775,808
4,294,967,295
18,446,744,073,709,551,615
2,147,483,647
9,223,372,036,854,775,807
-2,147,483,648
-9,223,372,036,854,775,808
Division Computed Hex Binary
15213 15213
3B 6D 00111011 01101101
7606.5 7606
1D B6 00011101 10110110
950.8125 950
03 B6 00000011 10110110
59.4257813 59
00 3B 00000000 00111011
00111011 01101101
00011101 10110110
00000011 10110110
59.4257813
00000000 00111011
/docProps/thumbnail.jpeg
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com