CS代考 COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN
The Hardware/Software Interface
Instructions: Language of the Computer

Copyright By PowCoder代写 加微信 powcoder

Instruction Set
 The repertoire of instructions of a computer
 Different computers have different instruction sets
 But with many aspects in common
 Early computers had very simple
instruction sets
 Simplified implementation
 Many modern computers also have simple instruction sets
Chapter 2 — Instructions: Language of the Computer — 2
§2.1 Introduction

The RISC-V Instruction Set
 Used as the example throughout the book
 Developed at UC Berkeley as open ISA
 Now managed by the RISC-V Foundation (riscv.org)
 Typical of many modern ISAs
 See RISC-V Reference Data tear-out card
 Similar ISAs have a large share of embedded core market
 Applications in consumer electronics, network/storage equipment, cameras, printers, …
Chapter 2 — Instructions: Language of the Computer — 3

Arithmetic Operations
 Add and subtract, three operands  Two sources and one destination
adda,b,c //agetsb+c
 All arithmetic operations have this form  Design Principle 1: Simplicity favours
regularity
 Regularity makes implementation simpler
 Simplicity enables higher performance at lower cost
Chapter 2 — Instructions: Language of the Computer — 4
§2.2 Operations of the Computer Hardware

Arithmetic Example
f = (g + h) – (i + j);
 Compiled RISC-V code:
addt0,g,h //tempt0=g+h addt1,i,j //tempt1=i+j subf,t0,t1 //f=t0-t1
Chapter 2 — Instructions: Language of the Computer — 5

Register Operands
 Arithmetic instructions use register operands
 RISC-V has a 32 × 64-bit register file
 Use for frequently accessed data
 64-bit data is called a “doubleword”
 32 x 64-bit general purpose registers x0 to x31
 32-bit data is called a “word”
 Design Principle 2: Smaller is faster
 c.f. main memory: millions of locations
Chapter 2 — Instructions: Language of the Computer — 6
§2.3 Operands of the Computer Hardware

RISC-V Registers
 x0: the constant value 0  x1: return address
 x2: stack pointer
 x3: global pointer
 x4: thread pointer
 x5 – x7, x28 – x31: temporaries
 x8: frame pointer
 x9, x18 – x27: saved registers
 x10 – x11: function arguments/results  x12 – x17: function arguments
Chapter 2 — Instructions: Language of the Computer — 7

Register Operand Example
f = (g + h) – (i + j);
 f, …, j in x19, x20, …, x23
 Compiled RISC-V code: (p.68)
add x5, x20, x21
add x6, x22, x23
sub x19, x5, x6
Chapter 2 — Instructions: Language of the Computer — 8

Memory Operands
 Main memory used for composite data  Arrays, structures, dynamic data
 To apply arithmetic operations
 Load values from memory into registers  Store result from register to memory
 Memory is byte addressed
 Each address identifies an 8-bit byte
 RISC-V is Little Endian
 Least-significant byte at least address of a word
 c.f. Big Endian: most-significant byte at least address
 RISC-V does not require words to be aligned in memory (note that RVS reports alignment errors)
 Unlike some other ISAs
Chapter 2 — Instructions: Language of the Computer — 9

Byte/Halfword/Word Operations
 RISC-V byte/halfword/word load/store
 Load byte/halfword/word: Sign extend to 64 bits in rd  lb rd, offset(rs1)
 lh rd, offset(rs1)
 lw rd, offset(rs1)
 Load byte/halfword/word unsigned: Zero extend to 64 bits in rd
 lbu rd, offset(rs1)  lhu rd, offset(rs1)  lwu rd, offset(rs1)
 Store byte/halfword/word: Store rightmost 8/16/32 bits  sb rs2, offset(rs1)
 sh rs2, offset(rs1)
 sw rs2, offset(rs1)
Chapter 2 — Instructions: Language of the Computer — 10

Memory Operand Example
A[12] = h + A[8];
 h in x21, base address of A in x22  Compiled RISC-V code:(p.71)
 Index 8 requires offset of 64  8 bytes per doubleword
x9, 64(x22)
x9, x21, x9
x9, 96(x22)
Chapter 2 — Instructions: Language of the Computer — 11

Registers vs. Memory
 Registers are faster to access than memory
 Operating on memory data requires loads and stores
 More instructions to be executed
 Compiler must use registers for variables
as much as possible
 Only spill to memory for less frequently used variables
 Register optimization is important!
Chapter 2 — Instructions: Language of the Computer — 12

Immediate Operands (p.72)
 Constant data specified in an instruction
addi x22, x22, 4
 Make the common case fast
 Small constants are common
 Immediate operand avoids a load instruction
Chapter 2 — Instructions: Language of the Computer — 13

Unsigned Binary Integers
 Given an n-bit number
 x = xn-12n-1 + xn-22n-2 + … + x121 + x020
 Range: 0 to +2n-1  Example
 0000 0000 … 0000 10112 =0+…+1×23 +0×22 +1×21 +1×20 = 0 + … + 8 + 0 + 2 + 1 = 1110
 addi x5, x0, 11 //see the value in x5
 Using 64 bits: 0 to +18,446,774,073,709,551,615
Chapter 2 — Instructions: Language of the Computer — 14
§2.4 Signed and Unsigned Numbers

2s-Complement Signed Integers
 Given an n-bit number
 x = -xn-12n-1 + xn-22n-2 + … + x121 + x020
 Range: -2n-1 to +2n-1-1  Example
 1111 1111 … 1111 11002
= -1×231 + 1×230 + … + 1×22 +0×21 +0×20 = -2,147,483,648 + 2,147,483,644 = -410
 addi x5, x0, -4 //see the value in x5
 Using 64 bits: -9,223,372,036,854,775,808
to 9,223,372,036,854,775,807
Chapter 2 — Instructions: Language of the Computer — 15

2s-Complement Signed Integers
 Bit 63 is sign bit
 1 for negative numbers
 0 for non-negative numbers
 –(–2n – 1) can’t be represented
 Non-negative numbers have the same unsigned
and 2s-complement representation
 Some specific numbers
 0: 0000 0000 … 0000
 –1: 1111 1111 … 1111
 Most-negative: 1000 0000 … 0000  Most-positive: 0111 1111 … 1111
Chapter 2 — Instructions: Language of the Computer — 16

Signed Negation
 Complement and add 1
 Complement means 1 → 0, 0 → 1
x+x=1111…111 =−1 2
 Example: negate +2
 +2 = 0000 0000 … 0010two
–2=11111111…1101two +1= = 1111 1111 … 1110two
Chapter 2 — Instructions: Language of the Computer — 17

Sign Extension
 Representing a number using more bits  Preserve the numeric value
 Replicate the sign bit to the left
 c.f. unsigned values: extend with 0s
 Examples: 8-bit to 16-bit
 +2: 0000 0010 => 0000 0000 0000 0010  –2: 1111 1110 => 1111 1111 1111 1110
 In RISC-V instruction set
 lb: sign-extend loaded byte  lbu: zero-extend loaded byte
Chapter 2 — Instructions: Language of the Computer — 18

Representing Instructions
 Instructions are encoded in binary  Called machine code
 RISC-V instructions
 Encoded as 32-bit instruction words
 Small number of formats encoding operation code (opcode), register numbers, …
 Regularity!
Chapter 2 — Instructions: Language of the Computer — 19
§2.5 Representing Instructions in the Computer

Hexadecimal
 Compact representation of bit strings  4 bits per hex digit
 Example: ECA8 6420
 1110 1100 1010 1000 0110 0100 0010 0000
Chapter 2 — Instructions: Language of the Computer — 20

RISC-V R-format Instructions
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits
 Instruction fields
 opcode: operation code
 rd: destination register number
 funct3: 3-bit function code (additional opcode)  rs1: the first source register number
 rs2: the second source register number
 funct7: 7-bit function code (additional opcode)
Chapter 2 — Instructions: Language of the Computer — 21

R-format Example
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits
add x9,x20,x21
0000 0001 0101 1010 0000 0100 1011 0011two = 015A04B316
Chapter 2 — Instructions: Language of the Computer — 22

RISC-V I-format Instructions
12 bits 5 bits 3 bits 5 bits 7 bits
 Immediate arithmetic and load instructions
 rs1: source or base address register number
 immediate: constant operand, or offset added to base address  2s-complement, sign extended
 Design Principle 3: Good design demands good compromises
 Different formats complicate decoding, but allow 32-bit instructions uniformly
 Keep formats as similar as possible
Chapter 2 — Instructions: Language of the Computer — 23

RISC-V S-format Instructions
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits
 Different immediate format for store instructions  rs1: base address register number
 rs2: source operand register number
 immediate: offset added to base address
 Split so that rs1 and rs2 fields are always in the same place
Chapter 2 — Instructions: Language of the Computer — 24

Logical Operations
 Instructions for bitwise manipulation
Shift left
Shift right
Bit-by-bit AND
Bit-by-bit OR
Bit-by-bit XOR
Bit-by-bit NOT
 Useful for extracting and inserting groups of bits in a word
Chapter 2 — Instructions: Language of the Computer — 25
§2.6 Logical Operations

Shift Operations
6 bits 6 bits 5 bits 3 bits 5 bits 7 bits
Special treatment of the I-format (srl [30:30]=0; sra [30:30]=1)
 immed: how many positions to shift
 Shift left logical
 Shift left and fill with 0 bits
 slli by i bits multiplies by 2i
 Shift right logical
 Shift right and fill with 0 bits
 srli by i bits divides by 2i (unsigned only)
Chapter 2 — Instructions: Language of the Computer — 26

AND Operations
 Useful to mask bits in a word
 Select some bits, clear others to 0
and x9,x10,x11 ;0x500=0x5C0&0x700
00000000 00000000 00000000 00000000 00000000 00000000 00000101 11000000
00000000 00000000 00000000 00000000 00000000 00000000 00000111
00000000 00000000 00000000 00000000 00000000 00000000 00000101 00000000
addi x10, x0, 0x5C0
addi x11, x0, 0x700
and x9, x10, x11 ;0x500=0x5C0&0x700
Chapter 2 — Instructions: Language of the Computer — 27

OR Operations
 Useful to include bits in a word
 Set some bits to 1, leave others unchanged
or x9,x10,x11 ;0x5F0=0x590|0x0F0
00000000 00000000 00000000 00000000 00000000 00000000 00000101
00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000101
addi x10, x0, 0x590
addi x11, x0, 0x0F0
or x9, x10, x11 ;0x5F0=0x590|0x0F0
Chapter 2 — Instructions: Language of the Computer — 28

XOR Operations
 Differencing operation
 Set some bits to 1, leave others unchanged
xor x9,x10,x11 // NOT operation
00000000 00000000 00000000 00000000 00000000 00000000 00000
11111111 11111111 11111111 11111111 11111111 11111111 11111111 11
11111111 11111111 11111111 11111111 11111111 11111111 11111010 00111111
ori x10, x0, 0x5C0
addi x11, x0, -1
xor x9, x10,x11 ;NOT operation
Chapter 2 — Instructions: Language of the Computer — 29

0000 0000 0000 0000
0000 0000 0000 0000
0000 0000 0011 1101 0000
0000 0000 0000
0000 0000 0000 0000
0000 0000 0000 0000
0000 0000 0011 1101 0000
0101 0000 0000
32-bit Constants
 Most constants are small
 12-bit immediate is sufficient
 For the 32-bit constant lui rd, constant
 Copies 20-bit constant to bits [31:12] of rd
 Extends bit 31 to bits [63:32] (treated as signed!)
 Clears bits [11:0] of rd to 0
lui x19, 976 // 0x003D0
addi x19,x19,128 // 0x500
Chapter 2 — Instructions: Language of the Computer — 30
§2.10 RISC-V Addressing for Wide Immediates and Addresses

64-bit Constants
What about constants that are represented by more than 32 bits? To load a 64 bit constant, for example, we could first split it into two 32 bit values, then employ lui&addi method to load the values into registers, and finally combine the two 32 bit values in one register:
c: EQU 0x1234567811223344
lui x6, (c & 0xffffffff) >> 12
addi x6, x6, c & 0xfff
lui x7, c >> 44
addi x7, x7, (c & 0xfff00000000) >> 32
slli x7, x7, 32
or x5, x6, x7
Chapter 2 — Instructions: Language of the Computer — 31
§2.10 RISC-V Addressing for Wide Immediates and Addresses

Conditional Operations
 Branch to a labeled instruction if a condition is true
 Otherwise, continue sequentially  beq rs1, rs2, L1
 if (rs1 == rs2) branch to instruction labeled L1  bne rs1, rs2, L1
 if (rs1 != rs2) branch to instruction labeled L1 Chapter 2 — Instructions: Language of the Computer — 32
§2.7 Instructions for Making Decisions

Compiling If Statements
if (i==j) f = g+h;
else f = g-h;
 f, g, … in x19, x20, …
 Compiled RISC-V code: (p.93)
bne x22, x23, Else
add x19, x20, x21
beq x0,x0,Exit // unconditional
Else: sub x19, x20, x21
Assembler calculates addresses
Chapter 2 — Instructions: Language of the Computer — 33

Compiling Loop Statements
while (save[i] == k) i += 1;
 i in x22, k in x24, address of save in x25  Compiled RISC-V code: (p.95)
Loop: slli x10, x22, 3
add x10, x10, x25
ld x9, 0(x10)
bne x9, x24, Exit
addi x22, x22, 1
beq x0, x0, Loop
Chapter 2 — Instructions: Language of the Computer — 34

Basic Blocks
 A basic block is a sequence of instructions with
 No embedded branches (except at end)  No branch targets (except at beginning)
 A compiler identifies basic blocks for optimization
 An advanced processor can accelerate execution of basic blocks
Chapter 2 — Instructions: Language of the Computer — 35

More Conditional Operations
 blt rs1, rs2, L1
 if (rs1 < rs2) branch to instruction labeled L1  bge rs1, rs2, L1  if (rs1 >= rs2) branch to instruction labeled L1
 if (a > b) a += 1;
 a in x22, b in x23
bge x23, x22, Exit // branch if b >= a addi x22, x22, 1
Chapter 2 — Instructions: Language of the Computer — 36

Signed vs. Unsigned
 Signed comparison: blt, bge
 Unsigned comparison: bltu, bgeu
 Example (32-bit):
 x22 = 1111 1111 1111 1111 1111 1111 1111 1111
 x23 = 0000 0000 0000 0000 0000 0000 0000 0001  x22 < x23 // signed  x22 > x23 // unsigned
 +4,294,967,295 > +1
Chapter 2 — Instructions: Language of the Computer — 37

Branches and Jumps
 Branch instructions such as beq, bne, etc. allow
conditional short jumps defined by 12-bit
 beq x0,x0,loop
 The jal (jump and link) instruction allows
unconditional long jumps defined by 20-bit offsets :
 jal x0, loop
 The jalr (jump and link register) instruction
allows unlimited unconditional unlimited jumps because it takes the jump address from a register:
 jalr x0, 0(x1)
Chapter 2 — Instructions: Language of the Computer — 38

Branch Addressing
 Branch instructions specify
 Opcode, two registers, target address
 Most branch targets are near branch  Forward or backward
 SB format:
imm[12] imm[11]
 PC-relative addressing
 Target address = PC + immediate × 2
imm [10:5]
Chapter 2 — Instructions: Language of the Computer — 39

Jump Addressing
 Jump and link (jal) target uses 20-bit immediate for larger range
 UJ format:
imm[20] imm[11]
5 bits 7 bits
imm[19:12]
 For long jumps, eg, to 32-bit absolute address
 lui: load address[31:12] to temp register  jalr: add address[11:0] and jump to target
Chapter 2 — Instructions: Language of the Computer — 40

RISC-V Addressing Summary
Chapter 2 — Instructions: Language of the Computer — 41

RISC-V Encoding Summary
Chapter 2 — Instructions: Language of the Computer — 42

RISC-V Multiplication
 Four multiply instructions:  mul: multiply
Gives the lower 64 bits of the product  mulh: multiply high
Gives the upper 64 bits of the product, assuming the operands are signed
 mulhu: multiply high unsigned
Gives the upper 64 bits of the product, assuming the operands are unsigned
 mulhsu: multiply high signed/unsigned
Gives the upper 64 bits of the product, assuming one operand is signed and the other unsigned
 Use mulh result to check for 64-bit overflow
Chapter 3 — Arithmetic for Computers — 43

2’s Complement Multiplication
For multiplication of larger values that produce a result with more than 64 bits the mulh (multiply upper half), mulhu (multiply upper half unsigned), and mulhsu (multiply upper half signed unsigned), instructions are provided for calculating the upper 64 bits. Note that the mul instruction is still needed for calculating the lower 64 bits. Try the following example:
addi x5,x0,-2
addi x6,x0,16
mul x7,x5,x6 ;x7=-32
mulh x8,x5,x6 ;x8=-1 ;the sign extension is all 1’s
mulhu x9,x5,x6 ;x9=15 ;x5, the 1’s shifted 4 bits left
mulhsu x10,x5,x6 ;x10=-1 ;same as mulh because x6>=0 mulhsu x10,x6,x5 ;x10=15 ;same as mulhu because x5<0 We have chosen as a multiplicand -2 for its simple hexadecimal representation of 0xfffffffffffffffe. We have then multiplied it by 16 which effectively shifted the hexadecimal digits of the multiplicand by 1 position (4 bits) to the left. Chapter 3 — Arithmetic for Computers — 44 2's Complement Multiplication 11111110 -2 x 11111101 x -3 ---------- ---- 11111110 6 11111110 11111110 ---------------- 000000000000110 •2's complement signed multiplication (size=4): •sign extend to size*2 •multiply as if unsigned •mul,mulh,mulhu,mulhsu. Chapter 3 — Arithmetic for Computers — 45 •retain the size*2 LSB •Uses size*2 operands to calculate a size*4 result. •Try the example -2*16 with RISC-V Division  Four instructions:  div, rem: signed divide, remainder  divu, remu: unsigned divide, remainder  Overflow and division-by-zero don’t produce errors  Just return defined results  Faster for the common case of no error (Note that RVS produces errors) Chapter 3 — Arithmetic for Computers — 46 Right Shift and Division  Left shift by i places multiplies an integer by 2i  Right shift divides by 2i?  Only for unsigned integers  For signed integers  Arithmetic right shift: replicate the sign bit  e.g., –5 / 4  111110112 >> 2 = 111111102 = –2  Rounds toward –∞
 c.f. 111110112 >>> 2 = 001111102 = +62  Attn: (-1)/2=(-1) and the remainder is 1
Chapter 3 — Arithmetic for Computers — 47
§3.9 Fallacies and Pitfalls

FP Instructions in RISC-V
 Separate FP registers: f0, …, f31
 double-precision
 single-precision values stored in the lower 32 bits
 FP instructions operate only on FP registers
 Programs generally don’t do integer ops on FP data,
or vice versa
 More registers with minimal code-size impact
 FP load and store instructions  flw, fld
 fsw, fsd
Chapter 3 — Arithmetic for Computers — 48

FP Instructions in RISC-V
 Single-precision arithmetic
 fadd.s, fsub.s, fmul.s, fdiv.s, fsqrt.s
e.g., fadds.s f2, f4, f6
 fadd.d, fsub.d, fmul.d, fdiv.d, fsqrt.d
 Double-precision arithmetic
e.g., fadd.d f2, f4, f6
 Single- and double-precision comparison
 feq.s, flt.s, fle.s
 feq.d, flt.d, fle.d (beq slt/blt bge)  Result is 0 or 1 in integer destination register
Use beq, bne to branch on comparison result
 Branch on FP condition code 0 or 1, e.g. for
code in x5
 beq x5, x0, label
Chapter 3 — Arithmetic for Computers — 49

Procedure Calling
Steps required
1. Placeparametersinregistersx10tox17 2. Transfercontroltoprocedure
3. Acquirestorageforprocedure
4. Performprocedure’soperations
5. Placeresultinregisterforcaller
6. Returntoplaceofcall(addressinx1)
Chapter 2 — Instructions: Language of the Computer — 50
§2.8 Supporting Procedures in Computer Hardware

Procedure Call Instructions
 Procedure call: jump and link jal x1, ProcedureLabel
 Address of following instruction put in x1  Jumps to target addr

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com