Introduction to Computer Systems 15-213/18-243, spring 2009 1st Lecture, Jan. 12th

Memory, Data, & Addressing I

http://xkcd.com/953/

CMPT 295
L02: Memory & Data I


Roadmap
C:
car *c = malloc(sizeof(car));
c->miles = 100;
c->gals = 17;
float mpg = get_mpg(c);
free(c);

Java:
Car c = new Car();
c.setMiles(100);
c.setGals(17);
float mpg = c.getMPG();

Assembly language:
Machine code:
0111010000011000
100011010000010000000010
1000100111000010
110000011111101000011111

OS:
Computer system:

Memory & data
Arrays & structs
Integers & floats
RISC V assembly
Procedures & stacks
Executables
Memory & caches
Processor Pipeline
Performance
Parallelism


Hardware: Physical View

CPU
(empty slot)
USB…
Bus connections
I/O
controller
Storage connections
Memory

CPU: looks funny because of the heat sink

Hardware: Logical View
CPU
Memory
Disks
Net
USB

Etc.
Bus

CPU: all computations done here
Memory: big array of bytes
Big pipe between CPU and memory, in order to keep CPU fed
Bus: connects everything else together
“Bus” just means that you broadcast on it (to everyone)
Same root as the other kind of “bus”
Disks: bigger storage, but slower, farther away, and persistent
Network, etc.

Hardware: 295 View (version 0)
The CPU executes instructions
Memory stores data

Binary encoding!
Instructions are just data
Memory

CPU
?
How are data and instructions represented?

The CPU and Memory are what this course focuses on.

Binary Encoding Additional Details
Because storage is finite in reality, everything is stored with a “fixed” length
Data is moved and manipulated in fixed-length chunks
Multiple fixed lengths (e.g. 1 byte, 4 bytes, 8 bytes)
Leading zeros must now be included to “fill out” the fixed length

Example: the “eight-bit” representation of the number 4 is 0b00000100
The leftmost bit is the Most Significant Bit (MSB); the rightmost bit is the Least Significant Bit (LSB)


Hardware: 295 View (version 0)
To execute an instruction, the CPU must:
Fetch the instruction
(if applicable) Fetch data needed by the instruction
Perform the specified computation
(if applicable) Write the result back to memory

Memory
CPU
?
data
instructions


Hardware: 295 View (version 1)
Memory

CPU
take 300
registers

data
instructions
We will start by learning about Memory
How does a program find its data in memory?


Byte-Oriented Memory Organization
Conceptually, memory is a single, large array of bytes,
each with a unique address (index)
Each address is just a number represented in fixed-length binary

Programs refer to bytes in memory by their addresses
Domain of possible addresses = address space
We can store addresses as data to “remember” where other data is in memory

But not all values fit in a single byte… (e.g. 351)
Many operations actually use multi-byte values

[Figure: memory drawn as an array of bytes, with addresses running from 00•••0 up to FF•••F]

An address is an n-bit number.
Each byte can be read and written
The address space is determined by the number of bits in the address
So what if things don’t fit in a single byte?
How many things does 1 byte allow us to address?

Peer Instruction Question
If we choose to use 4-bit addresses, how big is our address space?
i.e. How much space can we “refer to” using our addresses?

16 bits
16 bytes
4 bits
4 bytes
We’re lost…


Machine “Words”

We have chosen to tie word size to address size/width
word size = address size = register size
word size = w bits → 2^w addresses

Current x86 systems use 64-bit (8-byte) words
Potential address space: 2^64 addresses
2^64 bytes ≈ 1.8 × 10^19 bytes
= 18 billion billion bytes = 18 EB (exabytes)
Actual physical address space: 48 bits
Instead, we typically talk about addresses in terms of “words”
Use same word size anywhere that you typically store addresses
Counting in binary, 33 fingers are enough to count the world population (taking it as 7.454 billion), since 2^33 ≈ 8.59 billion

Word-Oriented Memory Organization
Addresses still specify
locations of bytes in memory
Addresses of successive words
differ by word size (in bytes):
e.g. 4 (32-bit) or 8 (64-bit)
Address of word 0, 1, … 10?

[Figure: bytes at addresses 0x00 through 0x0F, grouped into four 32-bit words and two 64-bit words, each word labeled “Addr = ??”]
What do we use to address bigger chunks of bytes?

Word-Oriented Memory Organization
Addresses still specify
locations of bytes in memory
Addresses of successive words
differ by word size (in bytes):
e.g. 4 (32-bit) or 8 (64-bit)
Address of word 0, 1, … 10?
Address of word
= address of first byte in word
The address of any chunk of
memory is given by the address
of the first byte
Alignment

Addr. (hex)    32-bit Words   64-bit Words
0x00 - 0x03    Addr = 0000    Addr = 0000
0x04 - 0x07    Addr = 0004
0x08 - 0x0B    Addr = 0008    Addr = 0008
0x0C - 0x0F    Addr = 0012

(Word addresses in the figure are written in decimal: 0012 is byte address 0x0C.)
Use the first byte of the word
Still count by bytes
Aside: isn’t this wasteful?
Yes, but remember how much we can address with 64 bits? Probably don’t need to worry too much.

A Picture of Memory (64-bit view)
A “64-bit (8-byte) word-aligned” view of memory:
In this type of picture, each row is composed of 8 bytes
Each cell is a byte
A 64-bit pointer
will fit on one row

[Figure: a table with an Address column and rows of 8 bytes; the first row, at address 0x00, holds bytes 0x00 through 0x07 and is marked “one word”; the addresses of the remaining rows are left blank]
What’s the second row’s address going to be?
0x08
Third?
0x10

A Picture of Memory (64-bit view)
A “64-bit (8-byte) word-aligned” view of memory:
In this type of picture, each row is composed of 8 bytes
Each cell is a byte
A 64-bit pointer
will fit on one row

Address   Bytes in row (one word per row)
0x00      0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07
0x08      0x08 0x09 0x0A 0x0B 0x0C 0x0D 0x0E 0x0F
0x10
0x18
0x20
0x28
0x30
0x38
0x40
0x48

Addresses and Pointers
An address is a location in memory
A pointer is a data object that holds an address
Address can point to any data
Value 504 stored at address 0x08
504 (decimal) = 1F8 (hex)
= 0x 00 … 00 01 F8
Pointer stored at
0x38 points to
address 0x08
64-bit example (pointers are 64 bits wide), big-endian

Address   Contents
0x00
0x08      00 00 00 00 00 00 01 F8
0x10
0x18
0x20
0x28
0x30
0x38      00 00 00 00 00 00 00 08
0x40
0x48
Note that everything is just padded with extra 0s


Addresses and Pointers
An address is a location in memory
A pointer is a data object that holds an address
Address can point to any data
Pointer stored at
0x48 points to
address 0x38
Pointer to a pointer!
Is the data stored
at 0x08 a pointer?
Could be, depending
on how you use it
64-bit example (pointers are 64 bits wide), big-endian

Address   Contents
0x00
0x08      00 00 00 00 00 00 01 F8
0x10
0x18
0x20
0x28
0x30
0x38      00 00 00 00 00 00 00 08
0x40
0x48      00 00 00 00 00 00 00 38
Note that everything is just padded with extra 0s


Data Representations
Sizes of data types (in bytes)

To use “bool” in C, you must #include <stdbool.h>

Java Data Type   C Data Type   32-bit   x86-64
boolean          bool          1        1
byte             char          1        1
char             -             2        2
short            short int     2        2
int              int           4        4
float            float         4        4
-                long int      4        8
double           double        8        8
long             long long     8        8
-                long double   8        16
(reference)      pointer *     4        8

address size = word size

This mapping is mostly an arbitrary historical artifact except for address sizes.

Memory Alignment
Aligned: a primitive object of K bytes must have an address that is a multiple of K
More about alignment later in the course

For good memory system performance, data has to be aligned.

K   Type
1   char
2   short
4   int, float
8   long, double, pointers


Byte Ordering
How should bytes within a word be ordered in memory?
Example: store the 4-byte (32-bit) int:
0x a1 b2 c3 d4

By convention, the ordering of bytes is called endianness
The two options are big-endian and little-endian
In which address does the least significant byte go?
Based on Gulliver’s Travels: tribes cut eggs on different sides (big, little)

Byte Ordering
Big-endian (SPARC, z/Architecture)
Least significant byte has highest address
Little-endian (x86, x86-64, RISC-V)
Least significant byte has lowest address
Bi-endian (ARM, PowerPC)
Endianness can be specified as big or little
Example: 4-byte data 0xa1b2c3d4 at address 0x100

Address         0x100  0x101  0x102  0x103
Big-Endian      a1     b2     c3     d4
Little-Endian   d4     c3     b2     a1

Big-endian: most significant byte is first
Little-endian: looks backwards
Let’s look at some more examples…

Byte Ordering Examples
Decimal: 12345
Binary:  0011 0000 0011 1001
Hex:     3 0 3 9

int x = 12345;
// or x = 0x3039;

Address                        0x00  0x01  0x02  0x03
IA32, x86-64 (little-endian)   39    30    00    00
SPARC (big-endian)             00    00    30    39

long int y = 12345;
// or y = 0x3039;
(A long int is the size of a word)

Address         0x00  0x01  0x02  0x03  0x04  0x05  0x06  0x07
32-bit IA32     39    30    00    00
64-bit x86-64   39    30    00    00    00    00    00    00
32-bit SPARC    00    00    30    39
64-bit SPARC    00    00    00    00    00    00    30    39


Peer Instruction Question:
We store the value 0x 01 02 03 04 as a word at address 0x100 in a big-endian, 64-bit machine
What is the byte of data stored at address 0x104?

0x04
0x40
0x01
0x10
We’re lost…


Endianness
Endianness only applies to memory storage
Often programmer can ignore endianness because it is handled for you
Bytes wired into correct place when reading or storing from memory (hardware)
Compiler and assembler generate correct behavior (software)
Endianness still shows up:
Logical issues: accessing a different amount of data than you stored (e.g. store an int, access one byte as a char)
Need to know exact values to debug memory errors
Software emulation of machine code (assignment 2)

Summary
Memory is a long, byte-addressed array
Word size bounds the size of the address space and memory
Different data types use different numbers of bytes
Address of a chunk of memory is given by the address of the lowest byte in the chunk
An object of K bytes is aligned if it has an address that is a multiple of K
Pointers are data objects that hold addresses
Endianness determines memory storage order for multi-byte data