Introduction to Computer Systems 15-213/18-243, spring 2009 1st Lecture, Jan. 12th
Memory, Data, & Addressing I
http://xkcd.com/953/
CMPT 295
L02: Memory & Data I
1
Roadmap
C:
car *c = malloc(sizeof(car));
c->miles = 100;
c->gals = 17;
float mpg = get_mpg(c);
free(c);
Java:
Car c = new Car();
c.setMiles(100);
c.setGals(17);
float mpg = c.getMPG();
Assembly language:
Machine code:
0111010000011000
100011010000010000000010
1000100111000010
110000011111101000011111
Computer system:
OS:
Memory & data
Arrays & structs
Integers & floats
RISC-V assembly
Procedures & stacks
Executables
Memory & caches
Processor Pipeline
Performance
Parallelism
Hardware: Physical View
(Photo of a motherboard with labels: CPU (empty slot), Memory, I/O controller, Bus connections, Storage connections, USB…)
CPU
Looks funny because of heat sink
Hardware: Logical View
(Diagram: CPU, Memory, Disks, Net, USB, Etc., all attached to a shared Bus)
CPU: all computations done here
Memory: big array of bytes
Big pipe between CPU and memory, in order to keep CPU fed
Bus: connects everything else to each other
Bus just means you broadcast on it (to everyone)
Same root as the other kind of “bus”
Disks: bigger storage, but slower, farther away, and persistent
Network, etc.
Hardware: 295 View (version 0)
The CPU executes instructions
Memory stores data
Binary encoding!
Instructions are just data
(Diagram: Memory and CPU joined by a link marked “?”)
How are data and instructions represented?
The CPU and Memory are what this course focuses on.
Binary Encoding Additional Details
Because storage is finite in reality, everything is stored as “fixed” length
Data is moved and manipulated in fixed-length chunks
Multiple fixed lengths (e.g. 1 byte, 4 bytes, 8 bytes)
Leading zeros must now be included to “fill out” the fixed length
Example: the “eight-bit” representation of the number 4 is 0b00000100
Least Significant Bit (LSB)
Most Significant Bit (MSB)
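To make the fixed-length idea concrete, here is a minimal sketch in C that writes out all eight bits of a one-byte value, leading zeros included. The function name `to_binary8` is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the 8-bit binary representation of value into out, which needs
 * room for 9 chars: 8 digits plus the terminating '\0'.
 * Leading zeros are kept so the result always fills the fixed width. */
void to_binary8(uint8_t value, char out[9])
{
    assert(out != NULL);
    for (int bit = 7; bit >= 0; bit--) {
        /* bit 7 is the most significant bit, bit 0 the least significant */
        out[7 - bit] = ((value >> bit) & 1) ? '1' : '0';
    }
    out[8] = '\0';
}
```

Calling `to_binary8(4, buf)` fills `buf` with the digits `00000100`: the MSB comes first and the LSB last.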
Hardware: 295 View (version 0)
To execute an instruction, the CPU must:
Fetch the instruction
(if applicable) Fetch data needed by the instruction
Perform the specified computation
(if applicable) Write the result back to memory
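The four steps above can be sketched as a loop over a small byte array that holds both the program and its data. This is a toy accumulator machine invented for illustration: the opcodes, the two-byte instruction format, and the names `run` and `MEM_SIZE` are all assumptions, not RISC-V and not anything defined in this course.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcodes for a toy accumulator machine (not RISC-V). */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_STORE = 3 };

#define MEM_SIZE 32

/* Run the program stored at address 0 of mem until OP_HALT.
 * Every instruction is two bytes: an opcode, then an operand address. */
void run(uint8_t mem[MEM_SIZE])
{
    uint8_t pc = 0;   /* address of the next instruction */
    uint8_t acc = 0;  /* single register holding intermediate results */

    for (;;) {
        assert(pc + 1 < MEM_SIZE);
        uint8_t opcode = mem[pc];      /* fetch the instruction */
        uint8_t addr = mem[pc + 1];
        pc += 2;

        if (opcode == OP_HALT)
            return;
        assert(addr < MEM_SIZE);

        switch (opcode) {
        case OP_LOAD:  acc = mem[addr]; break;                  /* fetch data */
        case OP_ADD:   acc = (uint8_t)(acc + mem[addr]); break; /* compute */
        case OP_STORE: mem[addr] = acc; break;                  /* write back */
        default:       return;                                  /* unknown opcode: stop */
        }
    }
}
```

A program such as `{OP_LOAD, 16, OP_ADD, 17, OP_STORE, 18, OP_HALT, 0}` placed at address 0 adds the bytes at addresses 16 and 17 and writes the sum to address 18. The instructions live in the same array as the data, which is the sense in which instructions are just data.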
(Diagram: Memory and CPU joined by a link marked “?” that carries data and instructions)
Hardware: 295 View (version 1)
(Diagram: Memory and a CPU containing registers, exchanging data and instructions, with the annotation “take 300”)
We will start by learning about Memory
How does a program find its data in memory?
Byte-Oriented Memory Organization
Conceptually, memory is a single, large array of bytes,
each with a unique address (index)
Each address is just a number represented in fixed-length binary
Programs refer to bytes in memory by their addresses
Domain of possible addresses = address space
We can store addresses as data to “remember” where other data is in memory
But not all values fit in a single byte… (e.g. 351)
Many operations actually use multi-byte values
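As a rough illustration of the byte-array model, the sketch below treats a 16-byte C array as memory, uses array indices as addresses, and spreads a value too large for one byte (such as 351) over two consecutive bytes. The names `store16`, `load16`, and `MEM_BYTES` are hypothetical, and putting the low-order byte first is just one possible choice of byte ordering.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_BYTES 16  /* a tiny memory with addresses 0x0 through 0xF */

/* Store a 16-bit value in two consecutive bytes starting at addr.
 * The low-order byte goes first here; that ordering is a choice. */
void store16(uint8_t mem[MEM_BYTES], uint8_t addr, uint16_t value)
{
    assert(addr + 1 < MEM_BYTES);
    mem[addr] = (uint8_t)(value & 0xFF);
    mem[addr + 1] = (uint8_t)(value >> 8);
}

/* Read back the 16-bit value that starts at addr. */
uint16_t load16(const uint8_t mem[MEM_BYTES], uint8_t addr)
{
    assert(addr + 1 < MEM_BYTES);
    return (uint16_t)(mem[addr] | (mem[addr + 1] << 8));
}
```

Since 351 is 0x015F, `store16(mem, 4, 351)` leaves 0x5F at address 4 and 0x01 at address 5, and the value as a whole is referred to by address 4.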
(Figure: memory drawn as a column of bytes, with addresses running from 00···0 to FF···F)
An address is an n-bit number.
Each byte can be read and written
The size of the address space is determined by the number of bits in the address
So what if things don’t fit in a single byte?
How many things does 1 byte allow us to address?
Peer Instruction Question
If we choose to use 4-bit addresses, how big is our address space?
i.e. How much space can we “refer to” using our addresses?
16 bits
16 bytes
4 bits
4 bytes
We’re lost…
Machine “Words”
We have chosen to tie word size to address size/width
word size = address size = register size
word size = $w$ bits ⇒ $2^w$ addresses
Current x86 systems use 64-bit (8-byte) words
Potential address space: $2^{64}$ addresses
$2^{64}$ bytes $\approx 1.8 \times 10^{19}$ bytes
= 18 billion billion bytes = 18 EB (exabytes)
Actual physical address space: 48 bits
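The relationship between address width and address space size can be checked with a few lines of C. This is a minimal sketch; the function name `address_space_bytes` is an assumption made for this example, and it is limited to widths below 64 bits because the count for a full 64-bit address does not fit in a `uint64_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct addresses (and so addressable bytes) for a given
 * address width in bits. Valid for widths 0 through 63. */
uint64_t address_space_bytes(unsigned width_bits)
{
    assert(width_bits < 64);
    return (uint64_t)1 << width_bits;
}
```

For example, a 4-bit address gives 16 bytes, a 32-bit address gives 4294967296 bytes (4 GiB), and a 48-bit address gives 281474976710656 bytes (256 TiB).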
Instead, we typically talk about addresses in terms of “words”
Use same word size anywhere that you typically store addresses
You could count the world population with 33 fingers (using population number of 7.454 billion)
Word-Oriented Memory Organization
Addresses still specify
locations of bytes in memory
Addresses of successive words
differ by word size (in bytes):
e.g. 4 (32-bit) or 8 (64-bit)
Address of word 0, 1, … 10?
(Figure: bytes at addresses 0x00 through 0x0F, grouped into four 32-bit words and two 64-bit words; each word's address is left as “Addr = ??”)
What do we use to address bigger chunks of bytes?
Word-Oriented Memory Organization
Addresses still specify
locations of bytes in memory
Addresses of successive words
differ by word size (in bytes):
e.g. 4 (32-bit) or 8 (64-bit)
Address of word 0, 1, … 10?
Address of word
= address of first byte in word
The address of any chunk of
memory is given by the address
of the first byte
Alignment
(Figure: bytes at addresses 0x00 through 0x0F, grouped into words)
32-bit words: Addr = 0000, 0004, 0008, 0012 (in decimal)
64-bit words: Addr = 0000, 0008
Use the first byte of the word
Still count by bytes
Aside: isn’t this wasteful?
Yes, but remember how much we can address with 64 bits? Probably don’t need to worry too much.
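The rule that a word's address is the address of its first byte reduces to a multiplication when words are laid out back to back from address 0. The helper names `word_address` and `word_index` below are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Byte address of the word with the given index, for words laid out
 * back to back starting at address 0. */
uint64_t word_address(uint64_t index, uint64_t word_bytes)
{
    assert(word_bytes > 0);
    return index * word_bytes;
}

/* Index of the word that contains a given byte address. */
uint64_t word_index(uint64_t byte_addr, uint64_t word_bytes)
{
    assert(word_bytes > 0);
    return byte_addr / word_bytes;
}
```

With 4-byte words, word 3 starts at byte address 12 (0x0C) and word 10 at 40; with 8-byte words, word 1 starts at byte address 8. Byte 0x0B belongs to 32-bit word 2 and to 64-bit word 1.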
A Picture of Memory (64-bit view)
A “64-bit (8-byte) word-aligned” view of memory:
In this type of picture, each row is composed of 8 bytes
Each cell is a byte
A 64-bit pointer
will fit on one row
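(Figure: memory drawn as rows of 8 bytes, one word per row. The first row is labeled with address 0x00 and holds bytes 0x00 through 0x07; the address labels of the other nine rows are left blank as “0x”.)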
What’s the second row’s address going to be?
0x08
Third?
0x10
(Figure: the same picture with every row address filled in)
Address   Bytes in the row (one word)
0x00      0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07
0x08      0x08 0x09 0x0A 0x0B 0x0C 0x0D 0x0E 0x0F
0x10
0x18
0x20
0x28
0x30
0x38
0x40
0x48
Addresses and Pointers
An address is a location in memory
A pointer is a data object that holds an address
Address can point to any data
Value 504 stored at
address 0x08
$504_{10} = \mathrm{1F8}_{16}$
= 0x 00 … 00 01 F8
Pointer stored at
0x38 points to
address 0x08
64-bit example (pointers are 64 bits wide), big-endian
Address   Contents of the row
0x00
0x08      00 00 00 00 00 00 01 F8
0x10
0x18
0x20
0x28
0x30
0x38      00 00 00 00 00 00 00 08
0x40
0x48
Note that everything is just padded with extra 0s
Addresses and Pointers
An address is a location in memory
A pointer is a data object that holds an address
Address can point to any data
Pointer stored at
0x48 points to
address 0x38
Pointer to a pointer!
Is the data stored
at 0x08 a pointer?
Could be, depending
on how you use it
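In C, the same picture can be written with the address-of (`&`) and dereference (`*`) operators. The sketch below is illustrative only: the helper names `follow_once` and `follow_twice` are invented here, and the actual addresses chosen by a compiler will not be 0x08, 0x38, or 0x48.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Dereference once: go to the address held in p and read the value there. */
int64_t follow_once(const int64_t *p)
{
    assert(p != NULL);
    return *p;
}

/* Dereference twice: pp holds the address of a pointer, which in turn
 * holds the address of the value. */
int64_t follow_twice(int64_t **pp)
{
    assert(pp != NULL && *pp != NULL);
    return **pp;
}
```

Given `int64_t value = 504; int64_t *ptr = &value; int64_t **ptr_to_ptr = &ptr;`, both `follow_once(ptr)` and `follow_twice(ptr_to_ptr)` yield 504. The declared types are what tell the compiler whether a stored bit pattern should be treated as a pointer or as plain data.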
64-bit example (pointers are 64 bits wide), big-endian
Address   Contents of the row
0x00
0x08      00 00 00 00 00 00 01 F8
0x10
0x18
0x20
0x28
0x30
0x38      00 00 00 00 00 00 00 08
0x40
0x48      00 00 00 00 00 00 00 38
Data Representations
Sizes of data types (in bytes)
To use “bool” in C, you must #include <stdbool.h>
Java Data Type   C Data Type   32-bit   x86-64
boolean          bool          1        1
byte             char          1        1
char             -             2        2
short            short int     2        2
int              int           4        4
float            float         4        4
-                long int      4        8
double           double        8        8
long             long long     8        8
-                long double   8        16
(reference)      pointer *     4        8
address size = word size
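The sizes in such a table can be checked on whatever machine is at hand with `sizeof`. This is a small sketch; the function name `print_type_sizes` is made up, and the numbers it prints depend on the compiler and target, so they may differ from the columns above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* The C standard defines sizeof(char) as exactly 1; every other size
 * printed below is decided by the compiler and target platform. */
static_assert(sizeof(char) == 1, "char is one byte by definition");

/* Print the size in bytes of the C types discussed on the slide. */
void print_type_sizes(void)
{
    printf("bool        %zu\n", sizeof(bool));
    printf("char        %zu\n", sizeof(char));
    printf("short       %zu\n", sizeof(short));
    printf("int         %zu\n", sizeof(int));
    printf("float       %zu\n", sizeof(float));
    printf("long        %zu\n", sizeof(long));
    printf("double      %zu\n", sizeof(double));
    printf("long long   %zu\n", sizeof(long long));
    printf("long double %zu\n", sizeof(long double));
    printf("pointer     %zu\n", sizeof(void *));
}
```

Calling `print_type_sizes()` from `main` prints one line per type. The pointer line is the one tied to the address size of the target.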
This mapping is mostly an arbitrary historical artifact except for address sizes.
Memory Alignment
Aligned: Primitive object of K bytes must have an address that is a multiple of K
More about alignment later in the course
For good memory system performance, data has to be aligned.
K   Type
1   char
2   short
4   int, float
8   long, double, pointers
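The alignment rule is a divisibility test: an object of some size k bytes is aligned when its address is a multiple of k. The sketch below writes that test down, plus a helper that rounds an address up to the next aligned one. The names `is_aligned` and `align_up` are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of k, i.e. a k-byte object placed at addr
 * would be aligned. */
bool is_aligned(uint64_t addr, uint64_t k)
{
    assert(k > 0);
    return addr % k == 0;
}

/* Smallest multiple of k that is greater than or equal to addr. */
uint64_t align_up(uint64_t addr, uint64_t k)
{
    assert(k > 0);
    return (addr + k - 1) / k * k;
}
```

Address 0x14 is aligned for a 4-byte int but not for an 8-byte long; the next address at which an 8-byte object could start is `align_up(0x14, 8)`, which is 0x18.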
Byte Ordering
How should bytes within a word be ordered in memory?
Example: store the 4-byte (32-bit) int:
0x a1 b2 c3 d4
By convention, the ordering of bytes is called endianness
The two options are big-endian and little-endian
In which address does the least significant byte go?
Based on Gulliver’s Travels: tribes cut eggs on different sides
(big, little)
Byte Ordering
Big-endian (SPARC, z/Architecture)
Least significant byte has highest address
Little-endian (x86, x86-64, RISC-V)
Least significant byte has lowest address
Bi-endian (ARM, PowerPC)
Endianness can be specified as big or little
Example: 4-byte data 0xa1b2c3d4 at address 0x100
Address         0x100  0x101  0x102  0x103
Big-Endian      a1     b2     c3     d4
Little-Endian   d4     c3     b2     a1
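The two layouts for 0xa1b2c3d4 can be produced explicitly with shifts, independent of the machine running the code. The sketch below also includes a small probe for the host's own byte order. All three function names are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write a 32-bit value into 4 bytes, most significant byte first. */
void store32_big(uint8_t out[4], uint32_t value)
{
    assert(out != NULL);
    out[0] = (uint8_t)(value >> 24);
    out[1] = (uint8_t)(value >> 16);
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)value;
}

/* Write a 32-bit value into 4 bytes, least significant byte first. */
void store32_little(uint8_t out[4], uint32_t value)
{
    assert(out != NULL);
    out[0] = (uint8_t)value;
    out[1] = (uint8_t)(value >> 8);
    out[2] = (uint8_t)(value >> 16);
    out[3] = (uint8_t)(value >> 24);
}

/* Report the host's byte order by inspecting the byte stored at the
 * lowest address of a known value. Returns 1 for little-endian. */
int host_is_little_endian(void)
{
    uint32_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}
```

`store32_big` yields the bytes a1 b2 c3 d4 in increasing address order, and `store32_little` yields d4 c3 b2 a1. On x86-64 and RISC-V machines, `host_is_little_endian()` is expected to return 1.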
Big Endian
Most significant is first
Little Endian
Looks backwards
Let’s look at some more examples…
Byte Ordering Examples
Decimal: 12345
Binary:  0011 0000 0011 1001
Hex:     3 0 3 9

int x = 12345;
// or x = 0x3039;

Address                        0x00 0x01 0x02 0x03
IA32, x86-64 (little-endian)   39   30   00   00
SPARC (big-endian)             00   00   30   39

long int y = 12345;
// or y = 0x3039;
(A long int is the size of a word)

Address          0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07
32-bit IA32      39   30   00   00
64-bit x86-64    39   30   00   00   00   00   00   00
32-bit SPARC     00   00   30   39
64-bit SPARC     00   00   00   00   00   00   30   39
Peer Instruction Question:
We store the value 0x 01 02 03 04 as a word at address 0x100 in a big-endian, 64-bit machine
What is the byte of data stored at address 0x104?
0x04
0x40
0x01
0x10
We’re lost…
Endianness
Endianness only applies to memory storage
Programmers can often ignore endianness because it is handled for them
Bytes wired into correct place when reading or storing from memory (hardware)
Compiler and assembler generate correct behavior (software)
Endianness still shows up:
Logical issues: accessing a different amount of data than you stored (e.g. store an int, then access one byte of it as a char)
Need to know exact values to debug memory errors
Software emulation of machine code (assignment 2)
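One way to see endianness surface in practice is to store an int and then read back a single byte of its storage. The sketch below does that with `memcpy`, and adds a byte-swap routine of the kind used when data moves between machines with different byte orders. The names `byte_at` and `swap32` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the byte found at the given byte offset within the in-memory
 * representation of value. The result depends on the host's endianness. */
uint8_t byte_at(uint32_t value, size_t offset)
{
    assert(offset < sizeof value);
    uint8_t bytes[sizeof value];
    memcpy(bytes, &value, sizeof value);  /* copy the raw storage */
    return bytes[offset];
}

/* Reverse the byte order of a 32-bit value (big-endian <-> little-endian). */
uint32_t swap32(uint32_t v)
{
    return (v >> 24)
         | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u)
         | (v << 24);
}
```

On a little-endian machine `byte_at(0xa1b2c3d4, 0)` is 0xd4, while on a big-endian machine it is 0xa1. `swap32(0xa1b2c3d4)` is 0xd4c3b2a1 on either kind of machine, because shifts operate on the value rather than on its storage.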
Summary
Memory is a long, byte-addressed array
Word size bounds the size of the address space and memory
Different data types use different numbers of bytes
Address of chunk of memory given by address of lowest byte in chunk
Object of K bytes is aligned if it has an address that is a multiple of K
Pointers are data objects that hold addresses
Endianness determines memory storage order for multi-byte data