slides are adapted from CA course of wisc, princeton, mit, berkeley, etc.
The uses of the slides of this course are for educational purposes only and should be used only in conjunction with the textbook. Derivatives of the slides must acknowledge the copyright notices of this and the originals.


Instruction Set Architecture (ISA)
The “contract” between software and hardware
• Functional definition of operations, modes, and storage locations supported by hardware
• Precise description of how software can invoke and access them
Strictly speaking, ISA is the architecture
• Informally, architecture is also used to talk about the big picture of implementation
• Better to call this microarchitecture


Microarchitecture
ISA specifies what hardware does, not how it does it
• No guarantees regarding
  • How operations are implemented
  • Which operations are fast and which are slow
  • Which operations take more power and which take less
• These issues are determined by the microarchitecture
  • Microarchitecture = how hardware implements architecture
  • All Pentiums implement the x86 architecture

Aspects of an ISA
1. The Von Neumann model
   • Implicit structure of all modern ISAs
2. Format
   • Length and encoding
3. Operations
4. Operand model
   • Where are operands stored and how do we address them?
5. Datatypes and operations
6. Control
Example: MIPS ISA

1. Von Neumann Model
Implicit model of all modern ISAs
Key: program counter (PC)
• Defines total order of dynamic instructions
  • Next PC is PC++ unless insn says otherwise
• Order and named storage define computation
  • Value flows from insn X to Y via storage A iff X names A as output, Y names A as input, and Y is after X in total order
Processor logically executes the loop: Fetch PC, Decode, Read Inputs, Execute, Write Output, Next PC
• Instruction execution assumed atomic
• Instruction X finishes before insn X+1 starts
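The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple-based instruction format, the opcode names, and the register dictionary are hypothetical and not part of any real ISA.

```python
# Minimal sketch of the fetch/decode/read/execute/write/next-PC loop.
# The instruction tuples and opcode names below are hypothetical.

def run(program, regs, max_steps=1000):
    """Execute instructions one at a time, in the total order defined by the PC."""
    pc = 0
    steps = 0
    while 0 <= pc < len(program) and steps < max_steps:
        insn = program[pc]                  # fetch
        op, *args = insn                    # decode
        next_pc = pc + 1                    # default: PC++
        if op == "addi":
            dst, src, imm = args
            regs[dst] = regs[src] + imm     # read inputs, execute, write output
        elif op == "add":
            dst, a, b = args
            regs[dst] = regs[a] + regs[b]
        elif op == "bne":
            a, b, target = args
            if regs[a] != regs[b]:
                next_pc = target            # the insn "says otherwise"
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc = next_pc                        # next PC
        steps += 1
    return regs


# Sum the integers 1..4 with a small loop.
program = [
    ("addi", "i", "zero", 0),
    ("addi", "sum", "zero", 0),
    ("addi", "n", "zero", 4),
    ("addi", "i", "i", 1),        # index 3: loop head
    ("add", "sum", "sum", "i"),
    ("bne", "i", "n", 3),         # branch back while i != n
]
regs = run(program, {"zero": 0, "i": 0, "sum": 0, "n": 0})
```

Each instruction finishes (its register write is visible) before the next one is fetched, which is the atomicity assumption stated above.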


2. Instruction Format
Length
1. Fixed length
   • 32 or 64 bits
   + Simple implementation: compute next PC using only PC
   – Code density
2. Variable length
   – Complex implementation
   + Code density
3. Compromise: two lengths
   • Example: MIPS16
Encoding
• A few simple encodings simplify decoder implementation
• Complex encoding can improve code density


MIPS Instruction Format
Length
• 32-bits
• MIPS16: 16-bit variants of common instructions for density
Encoding
• 3 formats, simple encoding
• Q: how many operation types can be encoded in 6-bit opcode?
  R-type: Op(6) Rs(5) Rt(5) Rd(5) Sh(5) Func(6)
  I-type: Op(6) Rs(5) Rt(5) Immed(16)
  J-type: Op(6) Target(26)

R Format (register type)
• e.g., add $1, $2, $3
  opcode   rs     rt     rd     shamt  funct
  6        5      5      5      5      6        (bits)
  000000   00010  00011  00001  00000  100000
  alu-rr   2      3      1      zero   add/signed
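As a quick check of the R-format layout, the field packing can be sketched in Python. The helper name `encode_r` is hypothetical; the field widths and the `add $1, $2, $3` values are the ones shown above.

```python
def encode_r(opcode, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6/5/5/5/5/6 bits) into one 32-bit word."""
    fields = (("opcode", opcode, 6), ("rs", rs, 5), ("rt", rt, 5),
              ("rd", rd, 5), ("shamt", shamt, 5), ("funct", funct, 6))
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
    return ((opcode << 26) | (rs << 21) | (rt << 16)
            | (rd << 11) | (shamt << 6) | funct)


# add $1, $2, $3  ->  rd=1, rs=2, rt=3, funct=100000 (add)
word = encode_r(opcode=0, rs=2, rt=3, rd=1, shamt=0, funct=0b100000)
bits = f"{word:032b}"
```

Note that the assembly operand order (destination first) differs from the field order in the encoded word, where rd comes after rs and rt.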

I Format (immediate type)
• All loads and stores use I-format
• Assembly: lw $1, 100($2)
• Machine:
  opcode   rs     rt     addr/immediate
  6        5      5      16                 (bits)
  100011   00010  00001  0000000001100100
  lw       2      1      100 (in binary)
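The I-format can be sketched the same way. The helpers `encode_i` and `decode_i` are hypothetical names; the sketch assumes the 16-bit immediate is a signed value that is sign-extended on decode, as it is for load/store displacements.

```python
def encode_i(opcode, rs, rt, imm):
    """Pack an I-format instruction: Op(6) Rs(5) Rt(5) Immed(16)."""
    if not -(1 << 15) <= imm < (1 << 15):
        raise ValueError("immediate does not fit in 16 signed bits")
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def decode_i(word):
    """Unpack an I-format word and sign-extend the immediate."""
    opcode = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    imm = word & 0xFFFF
    if imm & 0x8000:          # top bit set: negative value
        imm -= 1 << 16
    return opcode, rs, rt, imm


# lw $1, 100($2)  ->  opcode=100011, rs=2 (base), rt=1 (destination)
lw = encode_i(0b100011, rs=2, rt=1, imm=100)
```

Decoding `lw` returns the original fields, and a negative displacement such as -4 round-trips through the 16-bit field.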

I Format (immediate type)
• ALU ops with immediates
– addi $1, $2, 100
– 001000 00010 00001 0000000001100100
• Conditional branches
– beq $1, $2, 7
– 000100 00001 00010 0000 0000 0000 0111
– PC = PC + 4 + (0000 0111 << 2) // word offset, relative to the next instruction

J Format (jump type)
• Direct Jump:
  opcode   addr
  6        26       (bits)
• Jump to:
  – New PC = 4 MSB of PC || addr || 00
  – 4+26+2 = 32 bits for jump target

3. Operations
• Operation type encoded in instruction opcode
• Many types of operations
  • Integer arithmetic: add, sub, mul, div, mod/rem (signed/unsigned)
  • FP arithmetic: add, sub, mul, div, sqrt
  • Integer logical: and, or, xor, not, sll, srl, sra
  • …
• What other operations might be useful?
• More operation types == better ISA?
• DEC VAX computer had LOTS of operation types
  • E.g., instruction for polynomial evaluation (no joke!)
  • But many of them were rarely/never used

4. Operand Model
• If you’re going to add, you need at least 3 operands
  • Two source operands, one destination operand
• Question #1: Where can operands come from?
• Question #2: And how are they specified?
• Running example: A = B + C
  • Several options for answering both questions
• Discuss: Memory-Only & Registers
• Optional: Accumulator & Stack

Operand Model I: Memory Only
• Memory only
  add A,B,C        mem[A] = mem[B] + mem[C]

Operand Model II: Accumulator
• Accumulator: implicit single-element stack
  load B           ACC = mem[B]
  add C            ACC = ACC + mem[C]
  store A          mem[A] = ACC

Operand Model III: Stack
• Stack: top of stack (TOS) is implicit in instructions
  push B           stack[TOS++] = mem[B]
  push C           stack[TOS++] = mem[C]
  add              stack[TOS++] = stack[--TOS] + stack[--TOS]
  pop A            mem[A] = stack[--TOS]

Operand Model IV: Registers
• General-purpose registers: multiple explicit accumulators
  load R1,B        R1 = mem[B]
  add R1,C         R1 = R1 + mem[C]
  store R1,A       mem[A] = R1
• Load-store: GPR and only loads/stores access memory
  load R1,B        R1 = mem[B]
  load R2,C        R2 = mem[C]
  add R1,R1,R2     R1 = R1 + R2
  store R1,A       mem[A] = R1
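To make the running example A = B + C concrete, the four operand models can be mimicked in Python. This is a rough sketch: the function names are hypothetical, memory is modeled as a dictionary keyed by variable name, and each statement stands for the instruction named in its comment.

```python
def memory_only(mem):
    mem["A"] = mem["B"] + mem["C"]        # add A,B,C


def accumulator(mem):
    acc = mem["B"]                        # load B
    acc = acc + mem["C"]                  # add C
    mem["A"] = acc                        # store A


def stack_machine(mem):
    stack = []
    stack.append(mem["B"])                # push B
    stack.append(mem["C"])                # push C
    rhs = stack.pop()                     # add: pop two operands,
    lhs = stack.pop()
    stack.append(lhs + rhs)               #      push their sum
    mem["A"] = stack.pop()                # pop A


def load_store(mem):
    regs = {}
    regs["R1"] = mem["B"]                 # load R1,B
    regs["R2"] = mem["C"]                 # load R2,C
    regs["R1"] = regs["R1"] + regs["R2"]  # add R1,R1,R2
    mem["A"] = regs["R1"]                 # store R1,A


results = []
for model in (memory_only, accumulator, stack_machine, load_store):
    mem = {"A": 0, "B": 3, "C": 4}
    model(mem)
    results.append(mem["A"])
```

All four compute the same result; they differ in how many instructions are needed and in how many of those instructions touch memory.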




Operand Model: Pros and Cons of each
• Metric I: static code size
  • Number of instructions needed to represent program, size of each
  • Evaluation: register < load-store < memory only
• Metric II: data memory traffic
  • Number of bytes moved to and from memory
  • Evaluation: load-store < register < memory only
• Metric III: instruction latency
  • Want low latency to execute instructions
  • Evaluation: load-store < register < memory only
• Current status: most current ISAs are load-store

MIPS Operand Model
• MIPS is load-store
• 32 32-bit integer registers
  • Actually 31: r0 is hardwired to value 0 → why?
• 32 32-bit FP registers
  • Can also be treated as 16 64-bit FP registers
• HI, LO: destination registers for multiply/divide
• Integer register conventions
  • Allows separate function-level compilation and fast function calls

Memory Addressing
• ISAs assume “virtual” address size
  • Either 32 or 64 bits
  • Program can name 2^32 bytes (4GB) or 2^64 bytes (16EB)
• ISA point? no room for even one address in a 32-bit instruction
• Addressing mode: way of specifying address
  • Direct (register indirect): ld R1,(R2)     R1=mem[R2]
  • Displacement: ld R1,8(R2)                 R1=mem[R2+8]
  • Indexed: ld R1,(R2,R3)                    R1=mem[R2+R3]
  • Memory-indirect: ld R1,@(R2)              R1=mem[mem[R2]]
  • Auto-update: ld R1,8(R2)                  R2 += 8; R1=mem[R2]
  • Scaled: ld R1,(R2,R3,32,8)                R1=mem[R2+R3*32+8]
• What high-level program idioms are these used for?

MIPS Addressing Modes: Rationale
• MIPS implements only displacement
  • Why? Experiment on VAX (ISA with every mode) found distribution
  • Disp: 61%, reg-ind: 19%, scaled: 11%, mem-ind: 5%, other: 4%
  • 80% use displacement or register indirect (= displacement 0)
  • How about the remaining 20%?
• I-type instructions: 16-bit displacement
  • Is 16-bits enough?
  • Yes! VAX experiment showed 1% accesses use displacement >16 bits
  I-type: Op(6) Rs(5) Rt(5) Immed(16)
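The addressing modes discussed above each boil down to a small effective-address calculation. The following Python sketch is illustrative only: `effective_address` and `fits_in_16_bits` are hypothetical helpers, registers are a dictionary, and memory is a dictionary from address to value.

```python
def effective_address(mode, regs, mem, base, index=None, scale=1, disp=0):
    """Return the address a load would access under the given addressing mode."""
    if mode == "direct":             # ld R1,(R2)
        return regs[base]
    if mode == "displacement":       # ld R1,8(R2)
        return regs[base] + disp
    if mode == "indexed":            # ld R1,(R2,R3)
        return regs[base] + regs[index]
    if mode == "memory_indirect":    # ld R1,@(R2)
        return mem[regs[base]]
    if mode == "auto_update":        # R2 += 8; then access mem[R2]
        regs[base] += disp
        return regs[base]
    if mode == "scaled":             # ld R1,(R2,R3,32,8)
        return regs[base] + regs[index] * scale + disp
    raise ValueError(f"unknown addressing mode: {mode}")


def fits_in_16_bits(disp):
    """True if a displacement fits the signed 16-bit I-type immediate."""
    return -(1 << 15) <= disp < (1 << 15)


regs = {"R2": 100, "R3": 2}
mem = {100: 500}
scaled = effective_address("scaled", regs, mem, "R2", index="R3", scale=32, disp=8)
```

With only displacement mode available, register indirect is the special case `disp=0`, and the other modes have to be built from extra instructions that compute the address into a register first.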


Addressing Issue: Endian-ness
Byte Order
• Big Endian: byte 0 is 8 most significant bits
  IBM 360/370, Motorola 68k, MIPS, SPARC, HP PA-RISC
• Little Endian: byte 0 is 8 least significant bits
  Intel 80×86, DEC Vax, DEC/Compaq Alpha


Another Addressing Issue: Alignment
Alignment: require that objects fall on address that is multiple of their size
32-bit integer
• Aligned if address % 4 = 0 [% is symbol for “mod”]
• Aligned: lw @XXXX00
• Not: lw @XXXX10
[Figure: an aligned and a not-aligned word laid out over byte # 0, 1, 2, 3]
Question: what to do with unaligned accesses (uncommon case)?
• Support in hardware? Makes all accesses slow
• Trap to software routine? Possibility
• MIPS? ISA support: unaligned access using two instructions:
  lw @XXXX10 = lwl @XXXX10; lwr @XXXX10
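The alignment test and the idea of building an unaligned word from two aligned accesses can be sketched as follows. This is a simplified illustration, not the exact semantics of lwl/lwr: the helper names are hypothetical and the sketch assumes big-endian byte order and a `bytes` object standing in for memory.

```python
def is_aligned(address, size):
    """An object is aligned if its address is a multiple of its size."""
    return address % size == 0


def load_aligned_word(memory, address):
    """Load a 32-bit big-endian word; the address must be 4-byte aligned."""
    if not is_aligned(address, 4):
        raise ValueError("unaligned access")
    return int.from_bytes(memory[address:address + 4], "big")


def load_unaligned_word(memory, address):
    """Assemble a word from the two aligned words that contain its bytes."""
    offset = address % 4
    base = address - offset
    if offset == 0:
        return load_aligned_word(memory, base)
    first = load_aligned_word(memory, base)        # holds the upper bytes
    second = load_aligned_word(memory, base + 4)   # holds the lower bytes
    combined = (first << 32) | second
    shift = (4 - offset) * 8
    return (combined >> shift) & 0xFFFFFFFF


memory = bytes(range(8))     # 00 01 02 03 04 05 06 07
word = load_unaligned_word(memory, 2)
```

The unaligned path costs two memory accesses plus shifting and masking, which is why supporting it transparently in hardware can slow down the common aligned case.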


5. Datatypes
Datatypes
• Software view: property of data
• Hardware view: data is just bits, property of operations
Hardware datatypes
• Integer: 8 bits (byte), 16b (half), 32b (word), 64b (long)
• IEEE754 FP: 32b (single-precision), 64b (double-precision)
• Packed integer: treat 64b int as 8 8b int’s or 4 16b int’s


MIPS Datatypes (and Operations)
Datatypes: all the basic ones (byte, half, word, FP)
• All integer operations read/write 32-bits
  • No partial dependences on registers
• Only byte/half variants are load-store
  lb, lbu, lh, lhu, sb, sh
  • Loads sign-extend (or not) byte/half into 32-bits
Operations: all the basic ones
• Signed/unsigned variants for integer arithmetic
• Immediate variants for all instructions
  add, addu, addi, addiu
• Regularity/orthogonality: all variants available for all operations
  • Makes compiler’s “life” easier
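The difference between the sign-extending and zero-extending byte loads can be sketched in Python. The function names mirror the MIPS mnemonics lb and lbu, but the code is only a model: memory is a `bytes` object and registers are plain integers masked to 32 bits.

```python
def lb(memory, address):
    """Load byte and sign-extend: bit 7 is copied into the upper 24 bits."""
    byte = memory[address]
    value = byte - 0x100 if byte & 0x80 else byte
    return value & 0xFFFFFFFF     # the 32-bit register contents


def lbu(memory, address):
    """Load byte unsigned: the upper 24 bits are filled with zeros."""
    return memory[address]


memory = bytes([0xFF, 0x7F])
signed_reg = lb(memory, 0)
unsigned_reg = lbu(memory, 0)
```

The same byte 0xFF becomes -1 (all ones) with lb and 255 with lbu; for a byte whose top bit is clear, such as 0x7F, the two loads agree.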


6.1 Control Instructions I
One issue: testing for conditions
• Option I: compare and branch instructions
  blti $1,10,target
  + Simple, – two ALUs: one for condition, one for target address
• Option II: implicit condition codes
  subi $2,$1,10 // sets “negative” CC
  bn target
  + Condition codes set “for free”, – implicit dependence is tricky
• Option III: condition registers, separate branch insns
  slti $2,$1,10
  bnez $2,target
  – Additional instructions, + one ALU per instruction, + explicit dependence



MIPS Conditional Branches
MIPS uses combination of options I and III
• Compare 2 registers and branch: beq, bne
  • Equality and inequality only
  + Don’t need an adder for comparison
• Compare 1 register to zero and branch: bgtz, bgez, bltz, blez
  • Greater/less than comparisons
  + Don’t need adder for comparison
• Set explicit condition registers: slt, sltu, slti, sltiu, etc.
Why?
• 86% of branches in programs are (in)equalities or comparisons to 0


6.2 Control Instructions II
Another issue: computing targets
• Option I: PC-relative
  • Position-independent within procedure
  • Used for branches and jumps within a procedure
• Option II: Absolute
  • Position independent outside procedure
  • Used for procedure calls
• Option III: Indirect (target found in register)
  • Needed for jumping to dynamic targets
  • Used for returns, dynamic procedure calls, switches
• How far do you need to jump?
  • Typically not so far within a procedure (they don’t get that big)
  • Further from one procedure to another


MIPS Control Instructions
MIPS uses all three
• PC-relative conditional branches: bne, beq, blez, etc.
  • 16-bit relative offset, <0.1% branches need more
  • PC = PC + 4 + (immediate << 2) if condition is true (else PC=PC+4)
  I-type: Op(6) Rs(5) Rt(5) Immed(16)
• Absolute unconditional jumps: j target
  • 26-bit offset (can address 2^28 bytes < 2^32 → what gives?)
  J-type: Op(6) Target(26)
• Indirect jumps: jr $rs
  R-type: Op(6) Rs(5) Rt(5) Rd(5) Sh(5) Func(6)

6.3 Control Instructions III
Another issue: how to support procedure calls?
• We “link” (remember) address of the calling instruction + 4 (current PC + 4) so we can return to it after the procedure
MIPS
• Implicit return address register is $ra (=$31)
• Direct jump-and-link: jal address
  $ra = PC+4; PC = address
• Can then return from call with: jr $ra
• Or can call with indirect jump-and-link: jalr $rd, $rs // explicit return address register
  $rd = PC+4; PC = $rs
• Then return with: jr $rd
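The target calculations for PC-relative branches, absolute jumps, and jump-and-link can be sketched in Python. The function names are hypothetical. The jump helper follows the slides' simplification of taking the 4 most significant bits from the current PC (real MIPS takes them from the address of the instruction after the jump), and branch delay slots are ignored.

```python
MASK32 = 0xFFFFFFFF


def branch_target(pc, imm16):
    """PC-relative: PC + 4 + (sign-extended 16-bit word offset << 2)."""
    if imm16 & 0x8000:
        imm16 -= 1 << 16                  # sign-extend
    return (pc + 4 + (imm16 << 2)) & MASK32


def jump_target(pc, target26):
    """Absolute: 4 MSB of PC || 26-bit target || 00."""
    return (pc & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)


def jal(pc, target26, regs):
    """Jump-and-link: remember PC + 4 in $ra, then jump."""
    regs["$ra"] = (pc + 4) & MASK32
    return jump_target(pc, target26)


regs = {"$ra": 0}
callee = jal(0x10001000, 0x100, regs)     # call
return_pc = regs["$ra"]                   # jr $ra resumes here
```

The 26-bit field plus the two implied zero bits covers a 2^28-byte region, so the remaining 4 address bits must come from the PC; jumps outside that region need an indirect jump through a register.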
Control Idiom: If-Then-Else
• Understanding programs helps with architecture
  • Know what common programming idioms look like in assembly
  • Why? How can you MCCF (make the common case fast) if you don’t know what the CC (common case) is?
• First control idiom: if-then-else
    if (A < B) A++;    // A in $s1
    else B++;          // B in $s2

          slt $s3,$s1,$s2    // if $s1<$s2, then $s3=1
          beqz $s3,else      // branch to else if !condition
          addi $s1,$s1,1
          j join             // jump to join
    else: addi $s2,$s2,1
    join:

Control Idiom: Second idiom: int A[100], sum, for (i=0; i