CS代考计算机代写 computer architecture Java compiler scheme RISC-V c++ assembler x86 flex data structure Lecture 3a:

Lecture 3a:
Instructions: Language of the Computer (2/3)
John Owens
Introduction to Computer Architecture UC Davis EEC 170, Winter 2021

From last time …
▪ What instructions look like
-add, sub, ld, sw, addi
– RISC-V: 32 bit instructions, different types (R, I, S)
– RISC-V: Instructions either compute something or move something to/from memory
▪ Numbers
– Integers, signed/unsigned integers, sign extension
– Decimal, binary, hexadecimal
– Converting bits <-> numbers
31 UC Davis EEC 170, Winter 2021 / © John Owens

Representing Instructions

Instructions are encoded in binary – Called “machine code”

RISC-V instructions
How do we get from add x5, x20, x21 to binary? ▪
– Encoded as 32-bit instruction words
– Big picture: We divide the 32-bit instruction word into “fields”, each of a few bits, and encode different pieces information from the instruction into each field
– Small number of formats encoding operation code (opcode), register numbers, …
– Regularity!
32 UC Davis EEC 170, Winter 2021 / © John Owens
§2.5 Representing Instructions in the Computer

Hexadecimal
▪ Base 16
– Compact representation of bit strings
– 4 bits (“nibble”) per hex digit
– 0x means “I’m hexadecimal”
0
0000
4
0100
8
1000
c
1100
1
0001
5
0101
9
1001
d
1101
2
0010
6
0110
a
1010
e
1110
3
0011
7
0111
b
1011
f
1111
▪ Example: 0x eca8 6420
– 1110 1100 1010 1000 0110 0100 0010 0000
33
UC Davis EEC 170, Winter 2021 / © John Owens

ReaD CYCLE upper Half I RDCYCLEH rd
I
RISC-V R-format Instructions
▪ Instruction fields
– opcode: operation code
– rd: destination register number
OR OR Immediate AND
– funct3: 3-bit function code (additional opcode)
– rs1: the first source register numberAND Immediate Compare Set < - rs2: the second source register number Set < Imm Unsigned Branches Branch = SB - funct7: 7-bit function code (additional opcode) Shift Left Immediate SLLI rd,rs1,shamt Shift Right Immediate Shift Right Arithmetic Shift Right Arith Imm Shift Right R I R I SRL rd,rs1,rs2 SRLI rd,rs1,shamt SRA rd,rs1,rs2 SRAI rd,rs1,shamt Arithmetic ADD ADD Immediate SUBtract Load Upper Imm Add Upper Imm to PC R I R U U ADD rd,rs1,rs2 ADDI rd,rs1,imm SUB rd,rs1,rs2 LUI rd,imm AUIPC rd,imm Logical XOR XOR Immediate R I R I R I R I R I XOR rd,rs1,rs2 XORI rd,rs1,imm OR rd,rs1,rs2 ORI rd,rs1,imm AND rd,rs1,rs2 ANDI rd,rs1,imm SLT rd,rs1,rs2 SLTI rd,rs1,imm SLTU rd,rs1,rs2 SLTIU rd,rs1,imm BEQ rs1,rs2,imm BNE rs1,rs2,imm BLT rs1,rs2,imm BGE rs1,rs2,imm BLTU rs1,rs2,imm ,rs2,imm JAL rd,imm J7AbLitRs rd,rs1,imm FENCE FENCE.I SCALL SBREAK Set < Immediate Set < Unsigned Branch ≠ SB funct7 rs2 rs1 funcBtr3anc h ≥ Unrsdigned SB oBpGcEoUders1 7bits 5bits 5bits Branch < Branch ≥ Branch < Unsigned Jump & Link J&L 3Jbuitmsp&LinkR5ebgitister Synch Synch thread Synch Instr & Data System System CALL System BREAK SB SB SB UJ UJ I I I I 34 Counters ReaD CYCLE I UC Davis EEC 170, Winter 2021 / © John Owens RDCYCLE rd R-format Example add x9,x20,x21 7 bits 5 bits 5 bits 3 bits 5 bits 7 bits funct7 rs2 rs1 funct3 rd opcode 0 21 20 0 9 51 0000000 10101 10100 000 01001 0110011 0000 0001 0101 1010 0000 0100 1011 0011two = 015A04B316 35 UC Davis EEC 170, Winter 2021 / © John Owens funct7 rs2 rs1 funct3 rd opcode R-type Opcode Map imm[11:0] imm[11:5] rs2 rs1 funct3 rd rs1 funct3 imm[4:0] rs1 funct3 imm[4:1|11] rd rd opcode opcode opcode opcode opcode I-type S-type B-type U-type J-type LUI AUIPC JAL JALR BEQ BNE BLT BGE BLTU BGEU LB LH LW LBU LHU SB SH SW ADDI SLTI SLTIU XORI ORI ANDI SLLI SRLI SRAI SUB SLL SLT UC Davis EEC 170, Winter 2021 / © John Owens imm[12|10:5] rs2 imm[11:0] imm[11:0] imm[11:0] imm[11:0] imm[31:12] imm[20|10:1|11|19:12] RV32I Base Instruction Set imm[31:12] imm[31:12] rd imm[20|10:1|11|19:12] rd rd 1101111 imm[11:0] rs1 000 rd 1100111 000 rs1 010 rd rd imm[4:0] rd 0000011 imm[11:0] rs1 100 0000011 imm[11:0] rs1 101 rd 0000011 imm[11:5] rs2 rs1 000 0100011 imm[11:5] rs2 rs1 001 imm[4:0] 0100011 imm[11:5] rs2 rs1 010 imm[4:0] 0100011 imm[11:0] rs1 000 rd 0010011 rs1 010 0010011 rs1 011 rd rd 0010011 imm[11:0] rs1 100 0010011 imm[11:0] rs1 110 rd 0010011 imm[11:0] rs1 111 rd 0010011 0000000 0100000 shamt shamt rs1 rs1 001 101 rd rd rd 0010011 0000000 shamt rs1 101 0010011 0010011 0000000 rs2 rs1 000 rd 0110011 ADD 0100000 rs2 rs1 000 rd 0110011 0000000 rs2 rs1 001 rd 0110011 0000000 rs2 0000000 rs2 0000000 rs2 rs1 010 rs1 011 rs1 100 rd rd rd 0110011 0110011 0110011 SLTU 36 XOR rd 0110111 0010111 imm[12|10:5] imm[12|10:5] rs2 rs1 imm[4:1|11] imm[4:1|11] 1100011 imm[12|10:5] rs2 rs1 001 imm[4:1|11] 1100011 imm[12|10:5] rs2 rs1 100 imm[4:1|11] 1100011 rs2 rs1 101 1100011 imm[12|10:5] rs2 rs1 110 imm[4:1|11] 1100011 imm[12|10:5] rs2 rs1 111 imm[4:1|11] 1100011 imm[11:0] rs1 000 rd 0000011 rs1 001 0000011 0000000 rs2 rs1 101 rd 0110011 SRL Shift Right Arithmetic R SRA rd,rs1,rs2 - immediate: constant operand, or offset added to base address - 2s-complement, sign extended - How big can this immediate be? - Why did they pick this size? Set < Unsigned Set < Imm Unsigned Branches Branch = Branch ≠ Branch < Branch ≥ Branch < Unsigned Branch ≥ Unsigned Jump & Link J&L Jump & Link Register R SLTU I SLTIU SB BEQ SB BNE SB BLT SB BGE SB BLTU SB BGEU UJ JAL UJ JALR I FENCE - Advantages/disadvantages of making it bigger/smaller? Arithmetic ADD ADD Immediate SUBtract Load Upper Imm Add Upper Imm to PC I ANDI Compare Set< R SLT Set < Immediate I SLTI R I R U U ADD rd,rs1,rs2 ADDI rd,rs1,imm SUB rd,rs1,rs2 LUI rd,imm AUIPC rd,imm ▪ Immediate arithmetic and load instructions - rs1: source or base address register number Shift Right Arith Imm I SRAI rd,rs1,shamt RISC-V I-format Instructions Logical XOR R XOR XORI OR ORI AND rd,rs1,rs2 rd,rs1,imm rd,rs1,rs2 rd,rs1,imm rd,rs1,rs2 rd,rs1,imm rd,rs1,rs2 rd,rs1,imm rd,rs1,rs2 rd,rs1,imm rs1,rs2,imm rs1,rs2,imm rs1,rs2,imm rs1,rs2,imm rs1,rs2,imm rs1,rs2,imm rd,imm rd,rs1,imm XOR Immediate I OR R I R OR Immediate AND AND Immediate Synch Synch thread Synch Instr & Data I System System CALL I SCALL FENCE.I immediate rs1 funct3 R ReaD TIME rd eaDTIME I RDT opcode pper Half I RDT 12 bits 5 bits 3 bits RDCYCLE rd RDCYCLEH rd IME rd IMEH rd I RDINSTRET rd 37 R UC Davis EEC 170, Winter 2021 / © John Owens I System BREAK I SBREAK Counters ReaD CYCLE ReaD CYCLE upper Half u ReaD INSTR RETired I I 5 bits 7 bits ReaD INSTR upper Half I RDINSTRETH rd 32-bit Instruction Formats RISC-V I-format vs. R-format ▪ ▪ ▪ I-format: R-format: 7 bits 12 bits 5 bits 3 bits 5 bits 7 bits immediate rs1 funct3 rd opcode funct7 rs2 rs1 funct3 rd opcode 5 bits 5 bits 3 bits 5 bits 7 bits Design Principle 3: Good design demands good compromises - Different formats complicate decoding, but allow 32-bit instructions uniformly - Keep formats as similar as possible 38 UC Davis EEC 170, Winter 2021 / © John Owens RISC-V S-format Instructions ▪ Different immediate format for store instructions - rs1: base address register number - rs2: source operand register number - immediate: offset added to base address - Split so that rs1 and rs2 fields always in the same place imm[11:5] rs2 rs1 funct3 imm[4:0] opcode 7 bits 5 bits 5 bits 3 bits 5 bits 7 bits 39 UC Davis EEC 170, Winter 2021 / © John Owens RISC-V I-format vs. R-format vs. S-format ▪ I-format: immediate rs1 funct3 rd opcode ▪ R-format: 7 bits ▪ S-format: 7 bits 5 bits 5 bits 3 bits 5 bits 7 bits 12 bits 5 bits 3 bits 5 bits 7 bits funct7 rs2 rs1 funct3 rd opcode imm[11:5] rs2 rs1 funct3 imm[4:0] opcode 5 bits 5 bits 3 bits 5 bits 7 bits 40 UC Davis EEC 170, Winter 2021 / © John Owens Logical Operations ▪ Instructions for bitwise manipulation Operation C Java RISC-V Shift left << << slli Shift right >>
>>>
srli
Bit-by-bit AND
&
&
and, andi
Bit-by-bit OR
|
|
or, ori
Bit-by-bit XOR
^
^
xor, xori
Bit-by-bit NOT
~
~
■ Useful for extracting and inserting groups of bits in a word
41
UC Davis EEC 170, Winter 2021 / © John Owens
§2.6 Logical Operations

Category Name Fmt RV32I Base
Shift Operations
▪ immed: how many positions to shift
▪ Shift left logical
– Shift left and fill with 0 bits
Loads Load Byte Load Halfword Load Word Load Byte Unsigned Load Half Unsigned
Stores Store Byte Store Halfword Store Word
Shifts Shift Left Shift Left Immediate Shift Right Shift Right Immediate Shift Right Arithmetic Shift Right Arith Imm
Arithmetic ADD ADD Immediate SUBtract
Load Upper Imm Add Upper Imm to PC
Logical XOR XOR Immediate
I LB I LH I LW I LBU I LHU S SB S SH S SW
rd,rs1,imm rd,rs1,imm rd,rs1,imm L rd,rs1,imm rd,rs1,imm L rs1,rs2,imm rs1,rs2,imm rs1,rs2,imm S
S S S S S S
rd,rs1,rs2 A rd,rs1,imm A
R I R I R I
SLL rd,rs1,rs2
SLLI rd,rs1,shamt
SRL rd,rs1,rs2
SRLI rd,rs1,shamt
SRA rd,rs1,rs2
SRAI rd,rs1,shamt

▪ Shift right logical
i slliby i bits multiplies by 2
R ADD I ADDI
R SUB U LUI
U AUIPC R XOR
I XORI R OR
I ORI R AND
I SLTI
– Shift right and fill with 0 bits
rd,imm
rd,imm rd,rs1,rs2 L rd,rs1,imm
rd,rs1,rs2
rd,rs1,imm
rd,rs1,rs2
rd,rs1,imm
rd,rs1,rs2
rd,rs1,imm
rd,rs1,rs2
rd,rs1,imm
rs1,rs2,imm
rs1,rs2,imm
rs1,rs2,imm
rs1,rs2,imm
rs1,rs2,imm
– –
OR Immediate srliby i bits divides by 2i (unsigned only) OR
Set < Immediate Set < Unsigned Set < Imm Unsigned Compare Set < R SLT Also arithmetic right shifts that fill with sign bit (srai) - Why not an arithmetic left shift? R I SLTU SLTIU BEQ BGE 6 bits 6 bits 5 bits 3 bits Branch ≥ Branch < Unsigned Branch ≥ Unsigned Jump & Link J&L SB 42 UC Davis EEC 170, Winter 2021 / © John Owens AND AND Immediate I ANDI Branches Branch = SB rd,rs1,rs2 S funct6 immed rs1 funct3 rd Bra Bra nch≠ SBBNE opcode nch < SB BLT 5 bits 7 bits SB SB BGEU rs1,rs2,imm BLTU UJ JAL rd,imm Jump & Link Register UJ JALR rd,rs1,imm C S A AND Operations ▪ Useful to mask bits in a word - Select some bits, clear others to 0 ▪ and x9,x10,x11 x10 x11 x9 00000000 00000000 00000000 00000000 00000000 00000000 00001101 11000000 00000000 00000000 00000000 00000000 00000000 00000000 00111100 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00001100 00000000 43 UC Davis EEC 170, Winter 2021 / © John Owens OR Operations ▪ Useful to include bits in a word - Set some bits to 1, leave others unchanged or x9,x10,x11 x10 x11 x9 00000000 00000000 00000000 00000000 00000000 00000000 0 00000000 00000000 00000000 00000000 00000000 00000000 0 00000000 00000000 00000000 00000000 00000000 00000000 0 44 UC Davis EEC 170, Winter 2021 / © John Owens 0001101 11000000 0111100 00000000 0111101 11000000 XOR Operations +RV{64,128} Free & Open Base Integer Instructions: RV32I, RV64I, and RV128I Category Name Fmt RV32I Base ▪ Differencing operation Categ CSR A At Atom Atomic Chang En R MMU Loads Load Byte I LB rd,rs1,imm Load Halfword I LH rd,rs1,imm - Set some bits to 1, leave others unchanged Load Word I LW rd,rs1,imm L{D|Q} L{W|D}U S{D|Q} SLL{W|D} rd,rs1,imm rd,rs1,imm rs1,rs2,imm rd,rs1,rs2 Load Byte Unsigned I LBU rd,rs1,imm Load Half Unsigned I LHU rd,rs1,imm xor x9,x10,x12 // NOT operation Stores Store Byte S SB rs1,rs2,imm Shifts Shift Left R R Store Halfword S Store Word S SH SW SLL rs1,rs2,imm rs1,rs2,imm rd,rs1,rs2 Shift Left Immediate I SLLI rd,rs1,shamt SLLI{W|D} rd,rs1,shamt T 00000000 00000000 00000000 00000000 00000000 00000000 00001101 11000000 x10 x12 nr Shift Right SRL rd,rs1,rs2 SRL{W|D} rd,rs1,rs2 Shift Right Immediate I SRLI rd,rs1,shamt SRLI{W|D} rd,rs1,shamt H Shift Right Arithmetic R SRA rd,rs1,rs2 SRA{W|D} rd,rs1,rs2 I 11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 Shift Right Arith Imm I SRAI rd,rs1,shamt SRAI{W|D} rd,rs1,shamt Arithmetic ADD R ADD rd,rs1,rs2 ADD{W|D} rd,rs1,rs2 Redir ypervi x9 SUBtract Load Upper Imm Add Upper Imm to PC R SUB U LUI U AUIPC XOR rd,rs1,rs2 XORI rd,rs1,imm OR rd,rs1,rs2 ORI rd,rs1,imm AND rd,rs1,rs2 ANDI rd,rs1,imm 11111111 11111111 11111111 11111111 11111111 11111111 11110010 00111111 Logical XOR XOR Immediate OR OR Immediate AND AND Immediate R I R I R I Compare Set < R SLT rd,rs1,rs2 Stores Store Word CS C.SW ADD Immediate I ADDI rd,rs1,imm ADDI{W|D} rd,rs1,imm rd,rs1,rs2 rd,imm SUB{W|D} rd,rs1,rs2 rd,imm Optional Compressed (1 Category Name Fmt Loads Load Word CL C.LW rap ter 45 Load Quad SP Load Word SP Load Double Load Double SP Load Quad CI C.LWSP CL C.LD CI C.LDSP CL C.LQ CI C.LQSP UC Davis EEC 170, Winter 2021 / © John Owens Set < Immediate I SLTI rd,rs1,imm Store Word SP CSS C.SWSP R o c A i e v e e s u 6 Up to this point, we’ve made an assumption: what happens after we run instruction n? 46 UC Davis EEC 170, Winter 2021 / © John Owens Conditional Operations ▪ Branch to a labeled instruction if a condition is true - Otherwise, continue sequentially ▪ beq rs1, rs2, L1 - if (rs1 == rs2) branch to instruction labeled L1 ▪ bne rs1, rs2, L1 - if (rs1 != rs2) branch to instruction labeled L1 47 UC Davis EEC 170, Winter 2021 / © John Owens §2.7 Instructions for Making Decisions Compiling If Statements ▪ C code: if (i==j) f = g+h; else f = g-h; - f, g, ... in x19, x20, ... ▪ Compiled RISC-V code: bne x22, x23, Else add x19, x20, x21 beq x0, x0, Exit // unconditional Else: sub x19, x20, x21 Exit: ... 48 UC Davis EEC 170, Winter 2021 / © John Owens Assembler calculates addresses Compiling Loop Statements ▪ C code: while (save[i] == k) i += 1; - i in x22, k in x24, address of save in x25 ▪ Compiled RISC-V code: Loop: slli x10, x22, 3 add x10, x10, x25 ld x9, 0(x10) // could we optimize this with an immediate? bne x9, x24, Exit addi x22, x22, 1 beq x0, x0, Loop Exit: ... 49 UC Davis EEC 170, Winter 2021 / © John Owens $ Aside on addressing modes The addressing mode determines, for an instruction that accesses a memory location, how the address for the memory location is specified. To summarize, the following diagram (from http://en.wikipedia.org/wiki/X86#Addressing_modes) shows the possible ways an address could ▪ x86 has many more addressing modes than RISC-V $ be specified: Each square bracket in the above diagram indicates an optional These parts (from left to right) are: A register used as a base add width (or scale) value to multiply the register by, and an displac integer. The address is computed as the sum of: the base registe the displacement. The Intel & AT&T syntax for various addressing modes, depend diagram are used, is shown in the table below from $ ▪ RISC-V can do: integer. The address is computed as the sum of: the base register, the index times the width, and http://simon.baymoo.org/universe/tools/symset/symset.txt (sligh Each square bracket in the above diagram indicates an optional part of the address specification. - register | Absolute | Register | Reg + Off | MOV EAX, [0100] | MOV EAX, [ESI] | MOV EAX, [EBP-8] the displacement. +-------------+----------------------------+---- These parts (from left to right) are: A register used as a base address, a register used as an index, a | Mode | Intel | AT& width (or scale) value to multiply the register by, and an displacement (aka offset) which is an +-------------+----------------------------+---- | mov | mov | mov The Intel & AT&T syntax for vari|ouRs*aWdd+resOsfinfg mo|deMs,OdVepEeAnXd,ing[oEnBXw*h4ich+p0ar1t0s0o]f the ab|ovmeov - reg+off diagram are used, is shown in the table below from https://cs.nyu.edu/courses/fall10/V22.0201-002/addressing_modes.pdf 50 UC Davis EEC 170, Winter 2021 / © John Owens | B + R*W + O | MOV EAX, [EDX + EBX*4 + 8] | mov http://simon.baymoo.org/universe/tools/symset/symset.txt (slightly modified):$ - (small) absolute +-------------+----------------------------+---- +-------------+-----N-o-t-e-t-h-a-t-,-g-i-v-e-n-t-h-e--d-e-f-in-i-t+io-n--o-f-a--l-a-b-e-l$-x-$-in--t-h-e-$.--d-a-t--a-$-se-c-t+ion | Mode | Intel | AT&T | | Register (%esi), %eax | to indicate the memory location, as in +-------------+----------------------------+-----------------------------+ | Absolute | MOV EAX, [0100] | movl 0x0100, %eax | | MOVmEoAvX, [eESaIx],x #Intel| movl | Reg + Off | MOV EAX, [EBP-8] | movl -8(%ebp), %eax | p r e r t - T - l l l l l - Basic Blocks ▪ A basic block is a sequence of instructions with - No embedded branches (except at end) - No branch targets (except at beginning) ■ ■ A compiler identifies basic blocks for optimization An advanced processor can accelerate execution of basic blocks 51 UC Davis EEC 170, Winter 2021 / © John Owens More Conditional Operations ▪ blt rs1, rs2, L1 - if (rs1 < rs2) branch to instruction labeled L1 ▪ bge rs1, rs2, L1 - if (rs1 >= rs2) branch to instruction labeled L1
▪ Example
-if (a > b) a += 1; // a in x22, b in x23
bge x23, x22, Exit // branch if b >= a
addi x22, x22, 1
Exit:
52
UC Davis EEC 170, Winter 2021 / © John Owens

Signed vs. Unsigned
▪ Signed comparison: blt, bge
▪ Unsigned comparison: bltu, bgeu ▪ Example
– x22 = 1111 1111 1111 1111 1111 1111 1111 1111
– x23 = 0000 0000 0000 0000 0000 0000 0000 0001

– –1 < +1 x22 < x23 // signed x22 > x23 // unsigned

– +4,294,967,295 > +1
53
UC Davis EEC 170, Winter 2021 / © John Owens

Let’s say you write an awesome procedure in RISC-V and I want to call it. You use registers, I use registers. What could go wrong?
54 UC Davis EEC 170, Winter 2021 / © John Owens

Procedure Calling
▪ Steps required
– Place parameters in registers x10 to x17
– Transfer control to procedure
– Acquire storage for procedure
– “Storage” may be both register and memory space
– Perform procedure’s operations
– Place result in register for caller
– Return to place of call (address in x1)
55
UC Davis EEC 170, Winter 2021 / © John Owens
§2.8 Supporting Procedures in Computer Hardware

Procedure Call Instructions ▪ Procedure call: jump and link
jal x1, ProcedureLabel
– Address of following instruction put in x1
– Jumps to target address
▪ Procedure return: jump and link register
jalr x0, 0(x1)
– Like jal, but jumps to 0 + address in x1
– Use x0 as rd (x0 cannot be changed)
– Can also be used for computed jumps
– e.g., for case/switch statements
56 UC Davis EEC 170, Winter 2021 / © John Owens

Aside: Data Types in C
▪ The actual size of the integer types varies by implementation. The standard only requires size relations between the data types and minimum sizes for each data type:
▪ The relation requirements are that the long long is not smaller than long, which is not smaller than int, which is not smaller than short. As char’s size is always the minimum supported data type, no other data types (except bit-fields) can be smaller.
▪ The minimum size for char is 8 bits, the minimum size for short and int is 16 bits, for long it is 32 bits and long long must contain at least 64 bits.
▪ The type int should be the integer type that the target processor is most efficiently working with. This allows great flexibility: for example, all types can be 64-bit. However, several different integer width schemes (data models) are popular. Because the data model defines how different programs communicate, a uniform data model is used within a given operating system application interface.
▪ In practice, char is usually eight bits in size and short is usually 16 bits in size (as are their unsigned counterparts). This holds true for platforms as diverse as 1990s SunOS 4 Unix, Microsoft MS-DOS, modern Linux, and Microchip MCC18 for embedded 8-bit
PIC microcontrollers. POSIX requires char to be exactly eight bits in size.
https://en.wikipedia.org/wiki/C_data_types 57 UC Davis EEC 170, Winter 2021 / © John Owens

Leaf Procedure Example

“leaf procedures” make no function calls
C code:
long long int leaf_example (
long long int g, long long int h,
long long int i, long long int j) {
long long int f;
f = (g + h) – (i + j);
return f;
long long int
guarantees at least a 64-bit integer
}
– f in x20
– temporaries x5, x6
– Arguments g, …, j in x10, …, x13
– Callee needs to save x5, x6, x20 on “stack” (magic data structure, we will describe shortly)
58
UC Davis EEC 170, Winter 2021 / © John Owens

Leaf Procedure Example ▪ RISC-V code:
leaf_example:
addi sp,sp,-24
sd x5,16(sp)
sd x6,8(sp)
sd x20,0(sp)
add x5,x10,x11
add x6,x12,x1
sub x20,x5,x6
addi x10,x20,0
ld x20,0(sp)
ld x6,8(sp)
ld x5,16(sp)
addi sp,sp,24
jalr x0,0(x1)
Save x5, x6, x20 on stack (caller might need those values)
x5 = g + h
x6 = i + j
f = x5 – x6
copy f to return register Restore x5, x6, x20 from stack
Return to caller
59
UC Davis EEC 170, Winter 2021 / © John Owens

What could a compiler do to optimize the previous code?
60 UC Davis EEC 170, Winter 2021 / © John Owens

Local Data on the Stack
▪ In RISC-V:
– The stack pointer points to the “top” of the stack (the most recently
used item)
– The stack grows downward
61
UC Davis EEC 170, Winter 2021 / © John Owens

Register Usage (RISC-V Convention)
▪ x5 – x7, x28 – x31: temporary registers – Not preserved by the callee
▪ x8 – x9, x18 – x27: saved registers
– If used, the callee saves and restores them
▪ Big picture: When a procedure call is made, some tasks are the responsibility of the caller and some are the responsibility of the callee
62
UC Davis EEC 170, Winter 2021 / © John Owens

Non-Leaf Procedures
▪ Procedures that call other procedures
▪ For nested call, caller needs to save on the stack:
– Its return address
– Any arguments and temporaries needed after the call ▪ Restore from the stack after the call
63 UC Davis EEC 170, Winter 2021 / © John Owens

Non-Leaf Procedure Example ▪ C code:
long long int fact (long long int n)
{
if (n < 1) return 1; else return n * fact(n - 1); } - - Argument n in x10 Result in x10 64 UC Davis EEC 170, Winter 2021 / © John Owens Non-Leaf Procedure Example ▪ RISC-V code: fact: addi sp,sp,-16 sd x1,8(sp) sd x10,0(sp) addi x5,x10,-1 bge x5,x0,L1 addi x10,x0,1 addi sp,sp,16 jalr x0,0(x1) L1: addi x10,x10,-1 jal x1,fact addi x6,x10,0 ld x10,0(sp) ld x1,8(sp) addi sp,sp,16 mul x10,x10,x6 jalr x0,0(x1) long long int fact (long long int n) { if (n < 1) return 1; else return n * fact(n - 1); } Save return address and n on stack x5 = n - 1 if n >= 1, go to L1
Else, set return value to 1
Pop stack, don’t bother restoring values
Return
n = n -1
call fact(n-1), write next instruction’s address into x1, result will be in x10 move result of fact(n – 1) to x6
Restore caller’s n
Restore caller’s return address
Pop stack
return n * fact(n-1)
return
65
UC Davis EEC 170, Winter 2021 / © John Owens

Memory Layout
▪ Text: program code
▪ Static data: global variables
– e.g., static variables in C, constant arrays and strings
– x3 (global pointer) initialized to address allowing ±offsets into this segment
▪ Dynamic data: heap
– E.g., malloc in C, new in Java
▪ Stack: automatic storage
66
UC Davis EEC 170, Winter 2021 / © John Owens

Local Data on the Stack
▪ Local data allocated by callee – e.g., C automatic variables
▪ Procedure frame (activation record)
– Used by some compilers to manage stack storage
67
UC Davis EEC 170, Winter 2021 / © John Owens

Character Data
▪ Byte-encoded character sets – ASCII: 128 characters
– 95 graphic, 33 control – Latin-1: 256 characters
– ASCII, +96 more graphic characters ▪ Unicode: 32-bit character set
– Used in Java, C++ wide characters, …
– Most of the world’s alphabets, plus symbols
– UTF-8, UTF-16: variable-length encodings
68
UC Davis EEC 170, Winter 2021 / © John Owens
§2.9 Communicating with People

Byte/Halfword/Word Operations

RISC-V byte/halfword/word load/store
– Load byte/halfword/word: Sign extend to 64 bits in rd
-lb rd, offset(rs1) -lh rd, offset(rs1) -lw rd, offset(rs1)
– Load byte/halfword/word unsigned: Zero extend to 64 bits in rd
-lbu rd, offset(rs1) -lhu rd, offset(rs1) -lwu rd, offset(rs1)
– Store byte/halfword/word: Store rightmost 8/16/32 bits
-sb rs2, offset(rs1) -sh rs2, offset(rs1) -sw rs2, offset(rs1)
69
UC Davis EEC 170, Winter 2021 / © John Owens

String Copy Example ▪ C code:
– Null-terminated string
void strcpy (char x[], char y[]) {
size_t i;
i = 0;
while ((x[i]=y[i]) != ‘\0′)
i += 1; }
// C idiom: while (*x++ = *y++);
70
UC Davis EEC 170, Winter 2021 / © John Owens

String Copy Example ▪ RISC-V code:
strcpy:
addi sp,sp,-8 // adjust stack for 1 doubleword sd x19,0(sp) // push x19
add x19,x0,x0 // i=0 (x19 contains i)
L1: add x5,x19,x10 // x10 = &y; x5 = addr of y[i]
lbu x6,0(x5) add x7,x19,x11 sb x6,0(x7) beq x6,x0,L2 addi x19,x19,1 jal x0,L1
L2: ld x19,0(sp)
addi sp,sp,8
jalr x0,0(x1)
// x6 = y[i]
// x11 = &x; x7 = addr of x[i]
// x[i] = y[i]
// if y[i] == 0 then exit
// i = i + 1
// next iteration of loop
// restore saved x19
// pop 1 doubleword from stack
// and return
71
UC Davis EEC 170, Winter 2021 / © John Owens