COMPUTER ORGANIZATION AND DESIGN
The Hardware/Software Interface
Instructions: Language of the Computer
Copyright By PowCoder代写 加微信 powcoder
Instruction Set
The repertoire of instructions of a computer
Different computers have different instruction sets
But with many aspects in common
Early computers had very simple
instruction sets
Simplified implementation
Many modern computers also have simple instruction sets
Chapter 2 — Instructions: Language of the Computer — 2
§2.1 Introduction
The RISC-V Instruction Set
Used as the example throughout the book
Developed at UC Berkeley as open ISA
Now managed by the RISC-V Foundation (riscv.org)
Typical of many modern ISAs
See RISC-V Reference Data tear-out card
Similar ISAs have a large share of embedded core market
Applications in consumer electronics, network/storage equipment, cameras, printers, …
Chapter 2 — Instructions: Language of the Computer — 3
Arithmetic Operations
Add and subtract, three operands Two sources and one destination
adda,b,c //agetsb+c
All arithmetic operations have this form Design Principle 1: Simplicity favours
regularity
Regularity makes implementation simpler
Simplicity enables higher performance at lower cost
Chapter 2 — Instructions: Language of the Computer — 4
§2.2 Operations of the Computer Hardware
Arithmetic Example
f = (g + h) – (i + j);
Compiled RISC-V code:
addt0,g,h //tempt0=g+h addt1,i,j //tempt1=i+j subf,t0,t1 //f=t0-t1
Chapter 2 — Instructions: Language of the Computer — 5
Register Operands
Arithmetic instructions use register operands
RISC-V has a 32 × 64-bit register file
Use for frequently accessed data
64-bit data is called a “doubleword”
32 x 64-bit general purpose registers x0 to x31
32-bit data is called a “word”
Design Principle 2: Smaller is faster
c.f. main memory: millions of locations
Chapter 2 — Instructions: Language of the Computer — 6
§2.3 Operands of the Computer Hardware
RISC-V Registers
x0: the constant value 0 x1: return address
x2: stack pointer
x3: global pointer
x4: thread pointer
x5 – x7, x28 – x31: temporaries
x8: frame pointer
x9, x18 – x27: saved registers
x10 – x11: function arguments/results x12 – x17: function arguments
Chapter 2 — Instructions: Language of the Computer — 7
Register Operand Example
f = (g + h) – (i + j);
f, …, j in x19, x20, …, x23
Compiled RISC-V code: (p.68)
add x5, x20, x21
add x6, x22, x23
sub x19, x5, x6
Chapter 2 — Instructions: Language of the Computer — 8
Memory Operands
Main memory used for composite data Arrays, structures, dynamic data
To apply arithmetic operations
Load values from memory into registers Store result from register to memory
Memory is byte addressed
Each address identifies an 8-bit byte
RISC-V is Little Endian
Least-significant byte at least address of a word
c.f. Big Endian: most-significant byte at least address
RISC-V does not require words to be aligned in memory (note that RVS reports alignment errors)
Unlike some other ISAs
Chapter 2 — Instructions: Language of the Computer — 9
Byte/Halfword/Word Operations
RISC-V byte/halfword/word load/store
Load byte/halfword/word: Sign extend to 64 bits in rd lb rd, offset(rs1)
lh rd, offset(rs1)
lw rd, offset(rs1)
Load byte/halfword/word unsigned: Zero extend to 64 bits in rd
lbu rd, offset(rs1) lhu rd, offset(rs1) lwu rd, offset(rs1)
Store byte/halfword/word: Store rightmost 8/16/32 bits sb rs2, offset(rs1)
sh rs2, offset(rs1)
sw rs2, offset(rs1)
Chapter 2 — Instructions: Language of the Computer — 10
Memory Operand Example
A[12] = h + A[8];
h in x21, base address of A in x22 Compiled RISC-V code:(p.71)
Index 8 requires offset of 64 8 bytes per doubleword
x9, 64(x22)
x9, x21, x9
x9, 96(x22)
Chapter 2 — Instructions: Language of the Computer — 11
Registers vs. Memory
Registers are faster to access than memory
Operating on memory data requires loads and stores
More instructions to be executed
Compiler must use registers for variables
as much as possible
Only spill to memory for less frequently used variables
Register optimization is important!
Chapter 2 — Instructions: Language of the Computer — 12
Immediate Operands (p.72)
Constant data specified in an instruction
addi x22, x22, 4
Make the common case fast
Small constants are common
Immediate operand avoids a load instruction
Chapter 2 — Instructions: Language of the Computer — 13
Unsigned Binary Integers
Given an n-bit number
x = xn-12n-1 + xn-22n-2 + … + x121 + x020
Range: 0 to +2n-1 Example
0000 0000 … 0000 10112 =0+…+1×23 +0×22 +1×21 +1×20 = 0 + … + 8 + 0 + 2 + 1 = 1110
addi x5, x0, 11 //see the value in x5
Using 64 bits: 0 to +18,446,774,073,709,551,615
Chapter 2 — Instructions: Language of the Computer — 14
§2.4 Signed and Unsigned Numbers
2s-Complement Signed Integers
Given an n-bit number
x = -xn-12n-1 + xn-22n-2 + … + x121 + x020
Range: -2n-1 to +2n-1-1 Example
1111 1111 … 1111 11002
= -1×231 + 1×230 + … + 1×22 +0×21 +0×20 = -2,147,483,648 + 2,147,483,644 = -410
addi x5, x0, -4 //see the value in x5
Using 64 bits: -9,223,372,036,854,775,808
to 9,223,372,036,854,775,807
Chapter 2 — Instructions: Language of the Computer — 15
2s-Complement Signed Integers
Bit 63 is sign bit
1 for negative numbers
0 for non-negative numbers
–(–2n – 1) can’t be represented
Non-negative numbers have the same unsigned
and 2s-complement representation
Some specific numbers
0: 0000 0000 … 0000
–1: 1111 1111 … 1111
Most-negative: 1000 0000 … 0000 Most-positive: 0111 1111 … 1111
Chapter 2 — Instructions: Language of the Computer — 16
Signed Negation
Complement and add 1
Complement means 1 → 0, 0 → 1
x+x=1111…111 =−1 2
Example: negate +2
+2 = 0000 0000 … 0010two
–2=11111111…1101two +1= = 1111 1111 … 1110two
Chapter 2 — Instructions: Language of the Computer — 17
Sign Extension
Representing a number using more bits Preserve the numeric value
Replicate the sign bit to the left
c.f. unsigned values: extend with 0s
Examples: 8-bit to 16-bit
+2: 0000 0010 => 0000 0000 0000 0010 –2: 1111 1110 => 1111 1111 1111 1110
In RISC-V instruction set
lb: sign-extend loaded byte lbu: zero-extend loaded byte
Chapter 2 — Instructions: Language of the Computer — 18
Representing Instructions
Instructions are encoded in binary Called machine code
RISC-V instructions
Encoded as 32-bit instruction words
Small number of formats encoding operation code (opcode), register numbers, …
Regularity!
Chapter 2 — Instructions: Language of the Computer — 19
§2.5 Representing Instructions in the Computer
Hexadecimal
Compact representation of bit strings 4 bits per hex digit
Example: ECA8 6420
1110 1100 1010 1000 0110 0100 0010 0000
Chapter 2 — Instructions: Language of the Computer — 20
RISC-V R-format Instructions
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits
Instruction fields
opcode: operation code
rd: destination register number
funct3: 3-bit function code (additional opcode) rs1: the first source register number
rs2: the second source register number
funct7: 7-bit function code (additional opcode)
Chapter 2 — Instructions: Language of the Computer — 21
R-format Example
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits
add x9,x20,x21
0000 0001 0101 1010 0000 0100 1011 0011two = 015A04B316
Chapter 2 — Instructions: Language of the Computer — 22
RISC-V I-format Instructions
12 bits 5 bits 3 bits 5 bits 7 bits
Immediate arithmetic and load instructions
rs1: source or base address register number
immediate: constant operand, or offset added to base address 2s-complement, sign extended
Design Principle 3: Good design demands good compromises
Different formats complicate decoding, but allow 32-bit instructions uniformly
Keep formats as similar as possible
Chapter 2 — Instructions: Language of the Computer — 23
RISC-V S-format Instructions
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits
Different immediate format for store instructions rs1: base address register number
rs2: source operand register number
immediate: offset added to base address
Split so that rs1 and rs2 fields are always in the same place
Chapter 2 — Instructions: Language of the Computer — 24
Logical Operations
Instructions for bitwise manipulation
Shift left
Shift right
Bit-by-bit AND
Bit-by-bit OR
Bit-by-bit XOR
Bit-by-bit NOT
Useful for extracting and inserting groups of bits in a word
Chapter 2 — Instructions: Language of the Computer — 25
§2.6 Logical Operations
Shift Operations
6 bits 6 bits 5 bits 3 bits 5 bits 7 bits
Special treatment of the I-format (srl [30:30]=0; sra [30:30]=1)
immed: how many positions to shift
Shift left logical
Shift left and fill with 0 bits
slli by i bits multiplies by 2i
Shift right logical
Shift right and fill with 0 bits
srli by i bits divides by 2i (unsigned only)
Chapter 2 — Instructions: Language of the Computer — 26
AND Operations
Useful to mask bits in a word
Select some bits, clear others to 0
and x9,x10,x11 ;0x500=0x5C0&0x700
00000000 00000000 00000000 00000000 00000000 00000000 00000101 11000000
00000000 00000000 00000000 00000000 00000000 00000000 00000111
00000000 00000000 00000000 00000000 00000000 00000000 00000101 00000000
addi x10, x0, 0x5C0
addi x11, x0, 0x700
and x9, x10, x11 ;0x500=0x5C0&0x700
Chapter 2 — Instructions: Language of the Computer — 27
OR Operations
Useful to include bits in a word
Set some bits to 1, leave others unchanged
or x9,x10,x11 ;0x5F0=0x590|0x0F0
00000000 00000000 00000000 00000000 00000000 00000000 00000101
00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000101
addi x10, x0, 0x590
addi x11, x0, 0x0F0
or x9, x10, x11 ;0x5F0=0x590|0x0F0
Chapter 2 — Instructions: Language of the Computer — 28
XOR Operations
Differencing operation
Set some bits to 1, leave others unchanged
xor x9,x10,x11 // NOT operation
00000000 00000000 00000000 00000000 00000000 00000000 00000
11111111 11111111 11111111 11111111 11111111 11111111 11111111 11
11111111 11111111 11111111 11111111 11111111 11111111 11111010 00111111
ori x10, x0, 0x5C0
addi x11, x0, -1
xor x9, x10,x11 ;NOT operation
Chapter 2 — Instructions: Language of the Computer — 29
0000 0000 0000 0000
0000 0000 0000 0000
0000 0000 0011 1101 0000
0000 0000 0000
0000 0000 0000 0000
0000 0000 0000 0000
0000 0000 0011 1101 0000
0101 0000 0000
32-bit Constants
Most constants are small
12-bit immediate is sufficient
For the 32-bit constant lui rd, constant
Copies 20-bit constant to bits [31:12] of rd
Extends bit 31 to bits [63:32] (treated as signed!)
Clears bits [11:0] of rd to 0
lui x19, 976 // 0x003D0
addi x19,x19,128 // 0x500
Chapter 2 — Instructions: Language of the Computer — 30
§2.10 RISC-V Addressing for Wide Immediates and Addresses
64-bit Constants
What about constants that are represented by more than 32 bits? To load a 64 bit constant, for example, we could first split it into two 32 bit values, then employ lui&addi method to load the values into registers, and finally combine the two 32 bit values in one register:
c: EQU 0x1234567811223344
lui x6, (c & 0xffffffff) >> 12
addi x6, x6, c & 0xfff
lui x7, c >> 44
addi x7, x7, (c & 0xfff00000000) >> 32
slli x7, x7, 32
or x5, x6, x7
Chapter 2 — Instructions: Language of the Computer — 31
§2.10 RISC-V Addressing for Wide Immediates and Addresses
Conditional Operations
Branch to a labeled instruction if a condition is true
Otherwise, continue sequentially beq rs1, rs2, L1
if (rs1 == rs2) branch to instruction labeled L1 bne rs1, rs2, L1
if (rs1 != rs2) branch to instruction labeled L1 Chapter 2 — Instructions: Language of the Computer — 32
§2.7 Instructions for Making Decisions
Compiling If Statements
if (i==j) f = g+h;
else f = g-h;
f, g, … in x19, x20, …
Compiled RISC-V code: (p.93)
bne x22, x23, Else
add x19, x20, x21
beq x0,x0,Exit // unconditional
Else: sub x19, x20, x21
Assembler calculates addresses
Chapter 2 — Instructions: Language of the Computer — 33
Compiling Loop Statements
while (save[i] == k) i += 1;
i in x22, k in x24, address of save in x25 Compiled RISC-V code: (p.95)
Loop: slli x10, x22, 3
add x10, x10, x25
ld x9, 0(x10)
bne x9, x24, Exit
addi x22, x22, 1
beq x0, x0, Loop
Chapter 2 — Instructions: Language of the Computer — 34
Basic Blocks
A basic block is a sequence of instructions with
No embedded branches (except at end) No branch targets (except at beginning)
A compiler identifies basic blocks for optimization
An advanced processor can accelerate execution of basic blocks
Chapter 2 — Instructions: Language of the Computer — 35
More Conditional Operations
blt rs1, rs2, L1
if (rs1 < rs2) branch to instruction labeled L1
bge rs1, rs2, L1
if (rs1 >= rs2) branch to instruction labeled L1
if (a > b) a += 1;
a in x22, b in x23
bge x23, x22, Exit // branch if b >= a addi x22, x22, 1
Chapter 2 — Instructions: Language of the Computer — 36
Signed vs. Unsigned
Signed comparison: blt, bge
Unsigned comparison: bltu, bgeu
Example (32-bit):
x22 = 1111 1111 1111 1111 1111 1111 1111 1111
x23 = 0000 0000 0000 0000 0000 0000 0000 0001 x22 < x23 // signed
x22 > x23 // unsigned
+4,294,967,295 > +1
Chapter 2 — Instructions: Language of the Computer — 37
Branches and Jumps
Branch instructions such as beq, bne, etc. allow
conditional short jumps defined by 12-bit
beq x0,x0,loop
The jal (jump and link) instruction allows
unconditional long jumps defined by 20-bit offsets :
jal x0, loop
The jalr (jump and link register) instruction
allows unlimited unconditional unlimited jumps because it takes the jump address from a register:
jalr x0, 0(x1)
Chapter 2 — Instructions: Language of the Computer — 38
Branch Addressing
Branch instructions specify
Opcode, two registers, target address
Most branch targets are near branch Forward or backward
SB format:
imm[12] imm[11]
PC-relative addressing
Target address = PC + immediate × 2
imm [10:5]
Chapter 2 — Instructions: Language of the Computer — 39
Jump Addressing
Jump and link (jal) target uses 20-bit immediate for larger range
UJ format:
imm[20] imm[11]
5 bits 7 bits
imm[19:12]
For long jumps, eg, to 32-bit absolute address
lui: load address[31:12] to temp register jalr: add address[11:0] and jump to target
Chapter 2 — Instructions: Language of the Computer — 40
RISC-V Addressing Summary
Chapter 2 — Instructions: Language of the Computer — 41
RISC-V Encoding Summary
Chapter 2 — Instructions: Language of the Computer — 42
RISC-V Multiplication
Four multiply instructions: mul: multiply
Gives the lower 64 bits of the product mulh: multiply high
Gives the upper 64 bits of the product, assuming the operands are signed
mulhu: multiply high unsigned
Gives the upper 64 bits of the product, assuming the operands are unsigned
mulhsu: multiply high signed/unsigned
Gives the upper 64 bits of the product, assuming one operand is signed and the other unsigned
Use mulh result to check for 64-bit overflow
Chapter 3 — Arithmetic for Computers — 43
2’s Complement Multiplication
For multiplication of larger values that produce a result with more than 64 bits the mulh (multiply upper half), mulhu (multiply upper half unsigned), and mulhsu (multiply upper half signed unsigned), instructions are provided for calculating the upper 64 bits. Note that the mul instruction is still needed for calculating the lower 64 bits. Try the following example:
addi x5,x0,-2
addi x6,x0,16
mul x7,x5,x6 ;x7=-32
mulh x8,x5,x6 ;x8=-1 ;the sign extension is all 1’s
mulhu x9,x5,x6 ;x9=15 ;x5, the 1’s shifted 4 bits left
mulhsu x10,x5,x6 ;x10=-1 ;same as mulh because x6>=0 mulhsu x10,x6,x5 ;x10=15 ;same as mulhu because x5<0
We have chosen as a multiplicand -2 for its simple hexadecimal representation of 0xfffffffffffffffe. We have then multiplied it by 16 which effectively shifted the hexadecimal digits of the multiplicand by 1 position (4 bits) to the left.
Chapter 3 — Arithmetic for Computers — 44
2's Complement Multiplication
11111110 -2
x 11111101 x -3
---------- ----
11111110 6
11111110 11111110 ---------------- 000000000000110
•2's complement signed multiplication (size=4):
•sign extend to size*2 •multiply as if unsigned
•mul,mulh,mulhu,mulhsu. Chapter 3 — Arithmetic for Computers — 45
•retain the size*2 LSB •Uses size*2 operands to
calculate a size*4 result. •Try the example -2*16 with
RISC-V Division
Four instructions:
div, rem: signed divide, remainder
divu, remu: unsigned divide, remainder
Overflow and division-by-zero don’t produce errors
Just return defined results
Faster for the common case of no error (Note that RVS produces errors)
Chapter 3 — Arithmetic for Computers — 46
Right Shift and Division
Left shift by i places multiplies an integer by 2i
Right shift divides by 2i? Only for unsigned integers
For signed integers
Arithmetic right shift: replicate the sign bit
e.g., –5 / 4
111110112 >> 2 = 111111102 = –2 Rounds toward –∞
c.f. 111110112 >>> 2 = 001111102 = +62 Attn: (-1)/2=(-1) and the remainder is 1
Chapter 3 — Arithmetic for Computers — 47
§3.9 Fallacies and Pitfalls
FP Instructions in RISC-V
Separate FP registers: f0, …, f31
double-precision
single-precision values stored in the lower 32 bits
FP instructions operate only on FP registers
Programs generally don’t do integer ops on FP data,
or vice versa
More registers with minimal code-size impact
FP load and store instructions flw, fld
fsw, fsd
Chapter 3 — Arithmetic for Computers — 48
FP Instructions in RISC-V
Single-precision arithmetic
fadd.s, fsub.s, fmul.s, fdiv.s, fsqrt.s
e.g., fadds.s f2, f4, f6
fadd.d, fsub.d, fmul.d, fdiv.d, fsqrt.d
Double-precision arithmetic
e.g., fadd.d f2, f4, f6
Single- and double-precision comparison
feq.s, flt.s, fle.s
feq.d, flt.d, fle.d (beq slt/blt bge) Result is 0 or 1 in integer destination register
Use beq, bne to branch on comparison result
Branch on FP condition code 0 or 1, e.g. for
code in x5
beq x5, x0, label
Chapter 3 — Arithmetic for Computers — 49
Procedure Calling
Steps required
1. Placeparametersinregistersx10tox17 2. Transfercontroltoprocedure
3. Acquirestorageforprocedure
4. Performprocedure’soperations
5. Placeresultinregisterforcaller
6. Returntoplaceofcall(addressinx1)
Chapter 2 — Instructions: Language of the Computer — 50
§2.8 Supporting Procedures in Computer Hardware
Procedure Call Instructions
Procedure call: jump and link jal x1, ProcedureLabel
Address of following instruction put in x1 Jumps to target addr
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com