Slides are adapted from the CA courses of Wisconsin, Princeton, MIT, Berkeley, etc.
The slides of this course are for educational purposes only and should be used only in conjunction with the textbook. Derivatives of the slides must acknowledge the copyright notices of this and the originals.
Instruction Set Architecture (ISA)
The “contract” between software and hardware
• Functional definition of operations, modes, and storage locations supported by hardware
• Precise description of how software can invoke and access them
Strictly speaking, the ISA is the architecture
• Informally, “architecture” is also used to talk about the big picture of implementation
• Better to call this microarchitecture
Microarchitecture
• ISA specifies what hardware does, not how it does it
• No guarantees regarding
  – How operations are implemented
  – Which operations are fast and which are slow
  – Which operations take more power and which take less
• These issues are determined by the microarchitecture
  – Microarchitecture = how hardware implements architecture
  – All Pentiums implement the x86 architecture
Aspects of an ISA
1. The Von Neumann model
   • Implicit structure of all modern ISAs
2. Format
   • Length and encoding
3. Operations
4. Operand model
   • Where are operands stored and how do we address them?
5. Datatypes and operations
6. Control
• Our running example: the MIPS ISA
• Your hardware design project: a MIPS CPU
1. Von Neumann Model
• Implicit model of all modern ISAs
• Key: program counter (PC)
  – Defines total order of dynamic instructions
  – Next PC is PC++ unless insn says otherwise
  – Order and named storage define computation
  – Value flows from insn X to Y via storage A iff X names A as output, Y names A as input, and Y is after X in total order
• Processor logically executes the loop: Fetch PC → Decode → Read Inputs → Execute → Write Output → Next PC
  – Instruction execution assumed atomic
  – Instruction X finishes before insn X+1 starts
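The loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple instruction format, the three opcodes, and the register names are invented for the sketch and are not MIPS encodings.

```python
# Toy machine: named storage (registers) plus a program counter (PC).
# The instruction tuples are a hypothetical format used only for illustration.
def run(program, regs):
    pc = 0
    trace = []                        # total order of dynamic instructions
    while pc < len(program):
        insn = program[pc]            # Fetch PC
        op, *args = insn              # Decode
        trace.append(pc)
        next_pc = pc + 1              # default: PC++
        if op == "add":               # Read Inputs, Execute, Write Output
            dst, a, b = args
            regs[dst] = regs[a] + regs[b]
        elif op == "addi":
            dst, a, imm = args
            regs[dst] = regs[a] + imm
        elif op == "bnez":            # the insn "says otherwise" about next PC
            src, target = args
            if regs[src] != 0:
                next_pc = target
        else:
            raise ValueError(f"unknown op {op!r}")
        pc = next_pc                  # Next PC
    return regs, trace

# Sum 3 + 2 + 1 by looping until r1 reaches zero.
program = [
    ("add", "r2", "r2", "r1"),   # 0: r2 += r1
    ("addi", "r1", "r1", -1),    # 1: r1 -= 1
    ("bnez", "r1", 0),           # 2: loop back while r1 != 0
]
regs, trace = run(program, {"r1": 3, "r2": 0})
```

Each iteration finishes one instruction before the next is fetched, which is the atomicity assumption stated on the slide; the value of r2 flows between iterations only because one instruction names it as output and a later one names it as input.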
2. Instruction Format
• Length
  1. Fixed length
     • 32 or 64 bits
     + Simple implementation: compute next PC using only PC
     – Code density
  2. Variable length
     – Complex implementation
     + Code density
  3. Compromise: two lengths
     • Example: MIPS16
• Encoding
  – A few simple encodings simplify decoder implementation
  – Complex encoding can improve code density
MIPS Format (instruction format)
• Length
  – 32-bits
  – MIPS16: 16-bit variants of common instructions for density
• Encoding
  – 3 formats, simple encoding
  – Q: how many operation types can be encoded in 6-bit opcode?
R-type: Op(6) Rs(5) Rt(5) Rd(5) Sh(5) Func(6)
I-type: Op(6) Rs(5) Rt(5) Immed(16)
J-type: Op(6) Target(26)
R Format (register type)
• Field widths: 6 5 5 5 5 6 bits
• e.g., add $1, $2, $3
  opcode  rs     rt     rd     shamt  funct
  000000  00010  00011  00001  00000  100000
  alu-rr  2      3      1      zero   add/signed
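As a check on the R-format layout, the field packing can be sketched in Python. The helper name `encode_r` is made up for this sketch; the field widths and the `add $1, $2, $3` values are the ones shown on the slide.

```python
def encode_r(opcode, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word."""
    fields = [(opcode, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word = (word << width) | value   # append the field on the right
    return word

# add $1, $2, $3  ->  rd=1, rs=2, rt=3, funct=0b100000
word = encode_r(opcode=0, rs=2, rt=3, rd=1, shamt=0, funct=0b100000)
bits = format(word, "032b")
```

Splitting `bits` at positions 6, 11, 16, 21 and 26 gives back the six groups on the slide.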
I Format (immediate type)
• All loads and stores use I-format
• Assembly: lw $1, 100($2)
• Machine:
  opcode  rs     rt     addr/immediate
  6       5      5      16                (field widths in bits)
  100011  00010  00001  0000000001100100
  lw      2      1      100 (in binary)
I Format (immediate type)
• ALU ops with immediates
  – addi $1, $2, 100
  – 001000 00010 00001 0000000001100100
• Conditional branches
  – beq $1, $2, 7
  – 000100 00001 00010 0000 0000 0000 0111
  – PC = PC + 4 + (0000 0111 << 2) // word offset, relative to the next instruction
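The I-format examples can be reproduced with a short Python sketch. The helper names are hypothetical, and the branch-target rule assumes the usual MIPS convention that the sign-extended word offset is added to PC + 4; branch delay slots are ignored.

```python
def encode_i(opcode, rs, rt, imm):
    """Pack I-format fields: opcode(6) rs(5) rt(5) immediate(16)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

def sign_extend16(imm):
    """Interpret a 16-bit field as a two's-complement integer."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def branch_target(pc, imm):
    """Taken-branch target: word offset shifted left by 2, relative to PC + 4."""
    return (pc + 4 + (sign_extend16(imm) << 2)) & 0xFFFFFFFF

addi = encode_i(0b001000, rs=2, rt=1, imm=100)   # addi $1, $2, 100
beq = encode_i(0b000100, rs=1, rt=2, imm=7)      # beq $1, $2, 7
target = branch_target(0x00400000, 7)            # PC value chosen arbitrarily
```

A negative offset such as 0xFFFF (that is, -1 word) branches back to the branch instruction itself, since PC + 4 - 4 = PC.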
J Format (jump type)
• Direct jump:
  opcode  addr
  6       26     (field widths in bits)
• Jump to:
  – New PC = 4 MSB of PC || addr || 00
  – 4 + 26 + 2 = 32 bits for jump target
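The concatenation rule for the jump target can be written out as bit operations. This is a minimal sketch: the function names are invented, opcode 2 is the MIPS `j` opcode, and the upper four bits are taken from PC as the slide states (real implementations take them from the address of the following instruction).

```python
def encode_j(opcode, target):
    """Pack J-format fields: opcode(6) addr(26). The target must be word aligned."""
    if target % 4:
        raise ValueError("jump target must be a multiple of 4")
    return (opcode << 26) | ((target >> 2) & 0x03FFFFFF)

def jump_target(pc, addr26):
    """New PC = 4 MSBs of PC || 26-bit addr || 00."""
    return (pc & 0xF0000000) | ((addr26 & 0x03FFFFFF) << 2)

word = encode_j(0b000010, 0x00400020)                 # j 0x00400020
new_pc = jump_target(0x00400000, word & 0x03FFFFFF)   # decode it again
```

Because only 28 bits of the target come from the instruction, a jump can only reach addresses that share the top four bits of the current PC.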
3. Operations
• Operation type encoded in instruction opcode
• Many types of operations
  – Integer arithmetic: add, sub, mul, div, mod/rem (signed/unsigned)
  – FP arithmetic: add, sub, mul, div, sqrt
  – Integer logical: and, or, xor, not, sll, srl, sra
  – ...
• What other operations might be useful?
• More operation types == better ISA?
• DEC VAX computer had LOTS of operation types
  – E.g., instruction for polynomial evaluation (no joke!)
  – But many of them were rarely/never used
4. Operand Model
• If you’re going to add, you need at least 3 operands
  – Two source operands, one destination operand
• Question #1: Where can operands come from?
• Question #2: And how are they specified?
• Running example: A = B + C
  – Several options for answering both questions
• Discuss: Memory-Only & Registers
• Optional: Accumulator & Stack
Operand Model I: Memory Only
• Memory only
  add A,B,C      mem[A] = mem[B] + mem[C]
[Figure: MEM]
Operand Model II: Accumulator
• Accumulator: implicit single-element stack
  load B         ACC = mem[B]
  add C          ACC = ACC + mem[C]
  store A        mem[A] = ACC
[Figure: ACC, MEM]
Operand Model III: Stack
• Stack: top of stack (TOS) is implicit in instructions
  push B         stack[TOS++] = mem[B]
  push C         stack[TOS++] = mem[C]
  add            stack[TOS++] = stack[--TOS] + stack[--TOS]
  pop A          mem[A] = stack[--TOS]
[Figure: TOS, MEM]
Operand Model IV: Registers
• General-purpose registers: multiple explicit accumulators
  load R1,B      R1 = mem[B]
  add R1,C       R1 = R1 + mem[C]
  store R1,A     mem[A] = R1
• Load-store: GPR and only loads/stores access memory
  load R1,B      R1 = mem[B]
  load R2,C      R2 = mem[C]
  add R1,R1,R2   R1 = R1 + R2
  store R1,A     mem[A] = R1
[Figure: MEM]
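The instruction sequences for the running example A = B + C can be compared side by side in Python. This is a sketch under assumptions: memory is modeled as a dictionary keyed by variable name, each function mirrors one of the sequences above line by line, and the returned number is simply the count of instructions in that sequence.

```python
def run_memory_only(mem):
    mem["A"] = mem["B"] + mem["C"]           # add A,B,C
    return 1                                 # instructions executed

def run_accumulator(mem):
    acc = mem["B"]                           # load B
    acc = acc + mem["C"]                     # add C
    mem["A"] = acc                           # store A
    return 3

def run_stack(mem):
    stack = []
    stack.append(mem["B"])                   # push B
    stack.append(mem["C"])                   # push C
    stack.append(stack.pop() + stack.pop())  # add
    mem["A"] = stack.pop()                   # pop A
    return 4

def run_load_store(mem):
    r = {}
    r[1] = mem["B"]                          # load R1,B
    r[2] = mem["C"]                          # load R2,C
    r[1] = r[1] + r[2]                       # add R1,R1,R2
    mem["A"] = r[1]                          # store R1,A
    return 4

results = {}
for model in (run_memory_only, run_accumulator, run_stack, run_load_store):
    mem = {"A": 0, "B": 5, "C": 7}
    count = model(mem)
    results[model.__name__] = (mem["A"], count)
```

All four models compute the same value; they differ in how many instructions they need and in how many operands each instruction has to name explicitly.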
Operand Model: Pros and Cons
• Metric I: static code size
  – Number of instructions needed to represent program, size of each
  – Evaluation: register < load-store < memory only
• Metric II: data memory traffic
  – Number of bytes moved to and from memory
  – Evaluation: load-store < register < memory only
• Metric III: instruction latency
  – Want low latency to execute instructions
  – Evaluation: load-store < register < memory only
• Current state: most current ISAs are load-store
MIPS Operand Model
• MIPS is load-store
  – 32 32-bit integer registers
    • Actually 31: r0 is hardwired to value 0; why?
  – 32 32-bit FP registers
    • Can also be treated as 16 64-bit FP registers
  – HI, LO: destination registers for multiply/divide
• Integer register conventions
  – Allows separate function-level compilation and fast function calls
Memory Addressing
• ISAs assume “virtual” address size
  – Either 32 or 64 bits
  – Program can name 2^32 bytes (4 GB) or 2^64 bytes (16 EB)
  – ISA point? no room for even one address in a 32-bit instruction
• Addressing mode: way of specifying address
  – Direct:           ld R1,(R2)           R1=mem[R2]
  – Displacement:     ld R1,8(R2)          R1=mem[R2+8]
  – Indexed:          ld R1,(R2,R3)        R1=mem[R2+R3]
  – Memory-indirect:  ld R1,@(R2)          R1=mem[mem[R2]]
  – Auto-update:      ld R1,8(R2)          R2+=8; R1=mem[R2]
  – Scaled:           ld R1,(R2,R3,32,8)   R1=mem[R2+R3*32+8]
• What high-level program idioms are these used for?
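The six addressing modes differ only in how the effective address is computed, which a small Python sketch makes explicit. The function name, its parameters, and the register and memory contents are assumptions made for the illustration; the formulas follow the table above.

```python
def effective_address(mode, regs, mem, base, index=None, scale=1, disp=0):
    """Return the address a load would read; regs is updated for auto-update."""
    if mode == "direct":             # ld R1,(R2)
        return regs[base]
    if mode == "displacement":       # ld R1,8(R2)
        return regs[base] + disp
    if mode == "indexed":            # ld R1,(R2,R3)
        return regs[base] + regs[index]
    if mode == "memory-indirect":    # ld R1,@(R2): the address itself is in memory
        return mem[regs[base]]
    if mode == "auto-update":        # ld R1,8(R2): R2 += 8, then use R2
        regs[base] += disp
        return regs[base]
    if mode == "scaled":             # ld R1,(R2,R3,32,8)
        return regs[base] + regs[index] * scale + disp
    raise ValueError(f"unknown addressing mode {mode!r}")

regs = {"R2": 1000, "R3": 2}
mem = {1000: 4000}                   # mem[R2] holds a pointer

direct = effective_address("direct", regs, mem, "R2")
displacement = effective_address("displacement", regs, mem, "R2", disp=8)
indexed = effective_address("indexed", regs, mem, "R2", index="R3")
indirect = effective_address("memory-indirect", regs, mem, "R2")
scaled = effective_address("scaled", regs, mem, "R2", index="R3", scale=32, disp=8)
auto = effective_address("auto-update", regs, mem, "R2", disp=8)  # mutates R2
```

Auto-update is evaluated last here because it is the only mode with a side effect on the base register.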
MIPS Addressing Modes: Rationale
• MIPS implements only displacement
  – Why? Experiment on VAX (ISA with every mode) found distribution
  – Disp: 61%, reg-ind: 19%, scaled: 11%, mem-ind: 5%, other: 4%
  – 80% use displacement or register indirect (= displacement 0)
  – How about the remaining 20%?
• I-type instructions: 16-bit displacement
  – Is 16 bits enough?
  – Yes! VAX experiment showed 1% of accesses use displacement > 16 bits
I-type: Op(6) Rs(5) Rt(5) Immed(16)
Addressing Issue: Endian-ness
Byte Order
• Big Endian: byte 0 is 8 most significant bits
  – IBM 360/370, Motorola 68k, MIPS, SPARC, HP PA-RISC
• Little Endian: byte 0 is 8 least significant bits
  – Intel 80×86, DEC Vax, DEC/Compaq Alpha
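The two byte orders can be seen directly with Python's built-in `int.to_bytes` and `int.from_bytes`. The value 0x0A0B0C0D is an arbitrary example chosen so each byte is recognizable.

```python
value = 0x0A0B0C0D

big = value.to_bytes(4, "big")         # byte 0 holds the 8 most significant bits
little = value.to_bytes(4, "little")   # byte 0 holds the 8 least significant bits

# Reading big-endian bytes as if they were little-endian scrambles the value.
misread = int.from_bytes(big, "little")
```

This is the mistake that occurs when raw binary data moves between machines of different endian-ness without conversion.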
Another Addressing Issue: Alignment
• Alignment: require that objects fall on address that is multiple of their size
• 32-bit integer
  – Aligned if address % 4 = 0 [% is symbol for “mod”]
  – Aligned: lw @XXXX00
  – Not: lw @XXXX10
• Question: what to do with unaligned accesses (uncommon case)?
  – Support in hardware? Makes all accesses slow
  – Trap to software routine? Possibility
  – MIPS? ISA support: unaligned access using two instructions:
    lw @XXXX10 = lwl @XXXX10; lwr @XXXX10
[Figure: bytes 0 to 3 of a word, showing an aligned and a non-aligned access]
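The alignment test and the idea of building an unaligned word from two aligned accesses can be sketched as follows. This is an illustration only: memory is a big-endian `bytes` object, the function names are invented, and the merge step captures the two-access idea without modeling the exact register-merging semantics of `lwl` and `lwr`.

```python
def is_aligned(address, size):
    """An object is aligned if its address is a multiple of its size."""
    return address % size == 0

def load_word(mem, address):
    """Aligned 32-bit big-endian load; refuses unaligned addresses."""
    if not is_aligned(address, 4):
        raise ValueError("unaligned word access")
    return int.from_bytes(mem[address:address + 4], "big")

def load_word_unaligned(mem, address):
    """Assemble a word from the two aligned words that contain its bytes."""
    shift = (address % 4) * 8
    if shift == 0:
        return load_word(mem, address)
    low = address - address % 4
    first = load_word(mem, low)          # supplies the leading bytes
    second = load_word(mem, low + 4)     # supplies the trailing bytes
    return ((first << shift) | (second >> (32 - shift))) & 0xFFFFFFFF

mem = bytes(range(0x10, 0x18))           # 8 bytes: 0x10, 0x11, ..., 0x17
word = load_word_unaligned(mem, 2)       # bytes 2..5
```

The unaligned path costs two memory accesses plus shifting and masking, which is why making every access pay for this case in hardware is unattractive.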
5. Datatypes
• Datatypes
  – Software view: property of data
  – Hardware view: data is just bits, property of operations
• Hardware datatypes
  – Integer: 8 bits (byte), 16b (half), 32b (word), 64b (long)
  – IEEE754 FP: 32b (single-precision), 64b (double-precision)
  – Packed integer: treat 64b int as 8 8b int’s or 4 16b int’s
MIPS Datatypes (and Operations)
• Datatypes: all the basic ones (byte, half, word, FP)
  – All integer operations read/write 32-bits
    • No partial dependences on registers
  – Only byte/half variants are load-store
    lb, lbu, lh, lhu, sb, sh
    • Loads sign-extend (or not) byte/half into 32-bits
• Operations: all the basic ones
  – Signed/unsigned variants for integer arithmetic
  – Immediate variants for all instructions
    add, addu, addi, addiu
  – Regularity/orthogonality: all variants available for all operations
    • Makes compiler’s “life” easier
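The difference between the signed and unsigned byte/half loads is only in how the upper bits of the 32-bit register are filled. A minimal Python sketch, assuming big-endian memory held in a `bytes` object and hypothetical helper names:

```python
def load_byte(mem, address, signed):
    """Model lb (signed=True) or lbu (signed=False): widen 8 bits to 32."""
    byte = mem[address]
    if signed and byte & 0x80:
        return byte | 0xFFFFFF00     # copy the sign bit into the upper 24 bits
    return byte                      # upper 24 bits are zero

def load_half(mem, address, signed):
    """Model lh / lhu: widen 16 bits to 32."""
    half = int.from_bytes(mem[address:address + 2], "big")
    if signed and half & 0x8000:
        return half | 0xFFFF0000     # copy the sign bit into the upper 16 bits
    return half

mem = bytes([0xFF, 0x80, 0x7F, 0x01])
lb_value = load_byte(mem, 0, signed=True)     # 0xFF read as -1
lbu_value = load_byte(mem, 0, signed=False)   # 0xFF read as 255
lh_value = load_half(mem, 0, signed=True)     # 0xFF80 read as a negative half
```

Because the load already widens the value to 32 bits, every later integer operation can work on full registers, which is the "no partial dependences" point above.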
6.1 Control Instructions I
One issue: testing for conditions
• Option I: compare and branch instructions
    blti $1,10,target
  + Simple, – two ALUs: one for condition, one for target address
• Option II: implicit condition codes
    subi $2,$1,10      // sets “negative” CC
    bn target
  + Condition codes set “for free”, – implicit dependence is tricky
• Option III: condition registers, separate branch insns
    slti $2,$1,10
    bnez $2,target
  – Additional instructions, + one ALU per, + explicit dependence
MIPS Conditional Branches
• MIPS uses combination of options I and III
  – Compare 2 registers and branch: beq, bne
    • Equality and inequality only
    + Don’t need an adder for comparison
  – Compare 1 register to zero and branch: bgtz, bgez, bltz, blez
    • Greater/less than comparisons
    + Don’t need adder for comparison
  – Set explicit condition registers: slt, sltu, slti, sltiu, etc.
• Why?
  – 86% of branches in programs are (in)equalities or comparisons to 0
6.2 Control Instructions II
Another issue: computing targets
• Option I: PC-relative
  – Position-independent within procedure
  – Used for branches and jumps within a procedure
• Option II: Absolute
  – Position independent outside procedure
  – Used for procedure calls
• Option III: Indirect (target found in register)
  – Needed for jumping to dynamic targets
  – Used for returns, dynamic procedure calls, switches
• How far do you need to jump?
  – Typically not so far within a procedure (they don’t get that big)
  – Further from one procedure to another
MIPS Control Instructions
• MIPS uses all three
• PC-relative conditional branches: bne, beq, blez, etc.
  – 16-bit relative offset, <0.1% branches need more
  – PC = PC + 4 + (immediate << 2) if condition is true (else PC = PC + 4)
  I-type: Op(6) Rs(5) Rt(5) Immed(16)
• Absolute unconditional jumps: j target
  – 26-bit offset (can address 2^28 bytes < 2^32 bytes; what gives?)
  J-type: Op(6) Target(26)
• Indirect jumps: jr $rd
  R-type: Op(6) Rs(5) Rt(5) Rd(5) Sh(5) Func(6)
6.3 Control Instructions III
Another issue: how to support procedure calls?
• We “link” (remember) address of the calling instruction + 4 (current PC + 4) so we can return to it after the procedure
• MIPS
  – Implicit return address register is $ra (= $31)
  – Direct jump-and-link: jal address
      $ra = PC+4; PC = address
  – Can then return from call with: jr $ra
  – Or can call with indirect jump-and-link: jalr $rd, $rs   // explicit return address register
      $rd = PC+4; PC = $rs
  – Then return with: jr $rd
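The link-and-return mechanism can be traced with a tiny Python model. The state dictionary, the function names, and the addresses are assumptions made for the sketch; the updates follow the register-transfer descriptions above, and branch delay slots are ignored.

```python
def jal(state, address):
    """Direct jump-and-link: $ra = PC + 4; PC = address."""
    state["ra"] = state["pc"] + 4
    state["pc"] = address

def jalr(state, rd, rs):
    """Indirect jump-and-link: $rd = PC + 4; PC = $rs."""
    target = state[rs]               # read the target first in case rd == rs
    state[rd] = state["pc"] + 4
    state["pc"] = target

def jr(state, reg):
    """Indirect jump: PC = register."""
    state["pc"] = state[reg]

# A call at 0x00400010 to a procedure at 0x00400100, followed by the return.
state = {"pc": 0x00400010, "ra": 0, "t0": 0x00400200, "t1": 0}
jal(state, 0x00400100)
callee_pc = state["pc"]
jr(state, "ra")
return_pc = state["pc"]
```

The same pattern works with `jalr(state, "t1", "t0")` followed by `jr(state, "t1")`, where the return address lives in an explicitly named register instead of `$ra`.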
Control Idiom: If-Then-Else
• Understanding programs helps with architecture
  – Know what common programming idioms look like in assembly
  – Why? How can you MCCF (make the common case fast) if you don’t know what CC is?
• First control idiom: if-then-else
    if (A < B) A++;      // A in $s1
    else B++;            // B in $s2
• What's the MIPS format?
          slt  $s3,$s1,$s2   // if $s1<$s2, then $s3=1
          beqz $s3,else      // branch to else if !condition
          addi $s1,$s1,1
          j    join          // jump to join
    else: addi $s2,$s2,1
    join:
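To check that the assembly sequence really implements the if-then-else, it can be run on a toy interpreter. This is a sketch under assumptions: only the four instructions used above are modeled, labels are resolved through a hand-written table, and branch delay slots are ignored.

```python
def run(program, labels, regs):
    """Execute a list of (op, args...) tuples until the PC runs off the end."""
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1                                  # default next PC
        if op == "slt":
            rd, rs, rt = args
            regs[rd] = 1 if regs[rs] < regs[rt] else 0
        elif op == "addi":
            rt, rs, imm = args
            regs[rt] = regs[rs] + imm
        elif op == "beqz":
            rs, label = args
            if regs[rs] == 0:
                pc = labels[label]
        elif op == "j":
            pc = labels[args[0]]
        else:
            raise ValueError(f"unknown op {op!r}")
    return regs

program = [
    ("slt", "$s3", "$s1", "$s2"),    # 0: $s3 = ($s1 < $s2)
    ("beqz", "$s3", "else"),         # 1: branch to else if !condition
    ("addi", "$s1", "$s1", 1),       # 2: A++
    ("j", "join"),                   # 3: skip the else part
    ("addi", "$s2", "$s2", 1),       # 4: else: B++
]
labels = {"else": 4, "join": 5}      # join is the first address after the code

def if_then_else(a, b):
    regs = run(program, labels, {"$s1": a, "$s2": b, "$s3": 0})
    return regs["$s1"], regs["$s2"]
```

For example, `if_then_else(1, 2)` takes the then-path and increments A, while `if_then_else(5, 2)` takes the else-path and increments B.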
Control Idiom:
Second idiom:
int A[100], sum,
for (i=0; i