Lecture 3a:
Instructions: Language of the Computer (2/3)
John Owens
Introduction to Computer Architecture UC Davis EEC 170, Winter 2021
From last time …
▪ What instructions look like
-add, sub, ld, sw, addi
– RISC-V: 32 bit instructions, different types (R, I, S)
– RISC-V: Instructions either compute something or move something to/from memory
▪ Numbers
– Integers, signed/unsigned integers, sign extension
– Decimal, binary, hexadecimal
– Converting bits <-> numbers
31 UC Davis EEC 170, Winter 2021 / © John Owens
Representing Instructions
▪
Instructions are encoded in binary – Called “machine code”
–
RISC-V instructions
How do we get from add x5, x20, x21 to binary? ▪
– Encoded as 32-bit instruction words
– Big picture: We divide the 32-bit instruction word into “fields”, each of a few bits, and encode different pieces information from the instruction into each field
– Small number of formats encoding operation code (opcode), register numbers, …
– Regularity!
32 UC Davis EEC 170, Winter 2021 / © John Owens
§2.5 Representing Instructions in the Computer
Hexadecimal
▪ Base 16
– Compact representation of bit strings
– 4 bits (“nibble”) per hex digit
– 0x means “I’m hexadecimal”
0
0000
4
0100
8
1000
c
1100
1
0001
5
0101
9
1001
d
1101
2
0010
6
0110
a
1010
e
1110
3
0011
7
0111
b
1011
f
1111
▪ Example: 0x eca8 6420
– 1110 1100 1010 1000 0110 0100 0010 0000
33
UC Davis EEC 170, Winter 2021 / © John Owens
ReaD CYCLE upper Half I RDCYCLEH rd
I
RISC-V R-format Instructions
▪ Instruction fields
– opcode: operation code
– rd: destination register number
OR OR Immediate AND
– funct3: 3-bit function code (additional opcode)
– rs1: the first source register numberAND Immediate Compare Set <
- rs2: the second source register number
Set < Imm Unsigned
Branches Branch = SB - funct7: 7-bit function code (additional opcode)
Shift Left Immediate SLLI rd,rs1,shamt
Shift Right Immediate
Shift Right Arithmetic
Shift Right Arith Imm
Shift Right
R I R I
SRL rd,rs1,rs2
SRLI rd,rs1,shamt
SRA rd,rs1,rs2
SRAI rd,rs1,shamt
Arithmetic ADD ADD Immediate SUBtract
Load Upper Imm Add Upper Imm to PC
R I R
U U
ADD rd,rs1,rs2
ADDI rd,rs1,imm
SUB rd,rs1,rs2
LUI rd,imm
AUIPC rd,imm
Logical XOR XOR Immediate
R I
R I R I R I R I
XOR rd,rs1,rs2
XORI rd,rs1,imm
OR rd,rs1,rs2
ORI rd,rs1,imm
AND rd,rs1,rs2
ANDI rd,rs1,imm
SLT rd,rs1,rs2
SLTI rd,rs1,imm
SLTU rd,rs1,rs2
SLTIU rd,rs1,imm
BEQ rs1,rs2,imm
BNE rs1,rs2,imm
BLT rs1,rs2,imm
BGE rs1,rs2,imm
BLTU rs1,rs2,imm
,rs2,imm
JAL rd,imm
J7AbLitRs rd,rs1,imm FENCE
FENCE.I
SCALL
SBREAK
Set < Immediate Set < Unsigned
Branch ≠ SB
funct7
rs2
rs1
funcBtr3anc
h ≥ Unrsdigned
SB oBpGcEoUders1
7bits
5bits
5bits
Branch <
Branch ≥ Branch < Unsigned
Jump & Link J&L 3Jbuitmsp&LinkR5ebgitister
Synch Synch thread Synch Instr & Data
System System CALL System BREAK
SB SB SB
UJ UJ I I I I
34
Counters ReaD CYCLE I
UC Davis EEC 170, Winter 2021 / © John Owens
RDCYCLE rd
R-format Example
add x9,x20,x21
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits
funct7
rs2
rs1
funct3
rd
opcode
0
21
20
0
9
51
0000000
10101
10100
000
01001
0110011
0000 0001 0101 1010 0000 0100 1011 0011two = 015A04B316
35
UC Davis EEC 170, Winter 2021 / © John Owens
funct7 rs2 rs1 funct3 rd opcode R-type
Opcode Map
imm[11:0] imm[11:5] rs2
rs1 funct3 rd
rs1 funct3 imm[4:0] rs1 funct3 imm[4:1|11]
rd rd
opcode opcode opcode opcode opcode
I-type S-type B-type U-type J-type
LUI AUIPC JAL JALR BEQ BNE BLT BGE BLTU BGEU LB
LH LW LBU LHU SB
SH SW ADDI SLTI SLTIU XORI ORI ANDI SLLI SRLI SRAI
SUB
SLL
SLT
UC Davis EEC 170, Winter 2021 / © John Owens
imm[12|10:5] rs2
imm[11:0]
imm[11:0]
imm[11:0]
imm[11:0]
imm[31:12] imm[20|10:1|11|19:12]
RV32I Base Instruction Set
imm[31:12]
imm[31:12]
rd
imm[20|10:1|11|19:12]
rd
rd
1101111
imm[11:0]
rs1
000
rd
1100111
000
rs1
010
rd
rd
imm[4:0]
rd
0000011
imm[11:0]
rs1
100
0000011
imm[11:0]
rs1
101
rd
0000011
imm[11:5]
rs2
rs1
000
0100011
imm[11:5]
rs2
rs1
001
imm[4:0]
0100011
imm[11:5]
rs2
rs1
010
imm[4:0]
0100011
imm[11:0]
rs1
000
rd
0010011
rs1
010
0010011
rs1
011
rd
rd
0010011
imm[11:0]
rs1
100
0010011
imm[11:0]
rs1
110
rd
0010011
imm[11:0]
rs1
111
rd
0010011
0000000
0100000
shamt
shamt
rs1
rs1
001
101
rd
rd
rd
0010011
0000000
shamt
rs1
101
0010011
0010011
0000000 rs2 rs1 000 rd 0110011 ADD
0100000
rs2
rs1
000
rd
0110011
0000000 rs2 rs1 001
rd 0110011
0000000 rs2 0000000 rs2 0000000 rs2
rs1 010 rs1 011
rs1 100
rd rd rd
0110011
0110011
0110011
SLTU
36
XOR
rd
0110111
0010111
imm[12|10:5]
imm[12|10:5]
rs2
rs1
imm[4:1|11]
imm[4:1|11]
1100011
imm[12|10:5]
rs2
rs1
001
imm[4:1|11]
1100011
imm[12|10:5]
rs2
rs1
100
imm[4:1|11]
1100011
rs2
rs1
101
1100011
imm[12|10:5]
rs2
rs1
110
imm[4:1|11]
1100011
imm[12|10:5]
rs2
rs1
111
imm[4:1|11]
1100011
imm[11:0]
rs1
000
rd
0000011
rs1
001
0000011
0000000 rs2 rs1 101 rd 0110011 SRL
Shift Right Arithmetic R SRA rd,rs1,rs2
- immediate: constant operand, or offset added to base
address
- 2s-complement, sign extended
- How big can this immediate be?
- Why did they pick this size?
Set < Unsigned Set < Imm Unsigned
Branches Branch = Branch ≠ Branch < Branch ≥ Branch < Unsigned Branch ≥ Unsigned
Jump & Link J&L Jump & Link Register
R SLTU I SLTIU
SB BEQ SB BNE SB BLT SB BGE SB BLTU SB BGEU UJ JAL
UJ JALR I FENCE
- Advantages/disadvantages of making it bigger/smaller?
Arithmetic ADD ADD Immediate SUBtract
Load Upper Imm Add Upper Imm to PC
I ANDI Compare Set< R SLT
Set < Immediate I SLTI
R I R
U U
ADD rd,rs1,rs2
ADDI rd,rs1,imm
SUB rd,rs1,rs2
LUI rd,imm
AUIPC rd,imm
▪ Immediate arithmetic and load instructions
- rs1: source or base address register number
Shift Right Arith Imm I
SRAI
rd,rs1,shamt
RISC-V I-format Instructions
Logical XOR R
XOR XORI
OR ORI AND
rd,rs1,rs2
rd,rs1,imm
rd,rs1,rs2
rd,rs1,imm
rd,rs1,rs2
rd,rs1,imm
rd,rs1,rs2
rd,rs1,imm
rd,rs1,rs2
rd,rs1,imm
rs1,rs2,imm
rs1,rs2,imm
rs1,rs2,imm
rs1,rs2,imm
rs1,rs2,imm
rs1,rs2,imm
rd,imm
rd,rs1,imm
XOR Immediate I OR R
I R
OR Immediate AND AND Immediate
Synch Synch thread Synch Instr & Data
I System System CALL I SCALL
FENCE.I
immediate
rs1
funct3
R ReaD TIME
rd
eaDTIME I RDT
opcode
pper Half I
RDT
12 bits
5 bits
3 bits
RDCYCLE rd RDCYCLEH rd IME rd IMEH rd I RDINSTRET rd
37
R UC Davis EEC 170, Winter 2021 / © John Owens
I
System BREAK I SBREAK
Counters ReaD CYCLE ReaD CYCLE upper Half
u
ReaD INSTR RETired
I I
5 bits 7 bits
ReaD INSTR upper Half I RDINSTRETH rd
32-bit Instruction Formats
RISC-V I-format vs. R-format
▪ ▪
▪
I-format:
R-format:
7 bits
12 bits
5 bits
3 bits
5 bits
7 bits
immediate
rs1
funct3
rd
opcode
funct7
rs2
rs1
funct3
rd
opcode
5 bits
5 bits
3 bits
5 bits
7 bits
Design Principle 3: Good design demands good compromises
- Different formats complicate decoding, but allow 32-bit
instructions uniformly
- Keep formats as similar as possible
38 UC Davis EEC 170, Winter 2021 / © John Owens
RISC-V S-format Instructions
▪ Different immediate format for store instructions
- rs1: base address register number
- rs2: source operand register number
- immediate: offset added to base address
- Split so that rs1 and rs2 fields always in the same place
imm[11:5]
rs2
rs1
funct3
imm[4:0]
opcode
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits
39 UC Davis EEC 170, Winter 2021 / © John Owens
RISC-V I-format vs. R-format vs. S-format ▪ I-format:
immediate
rs1
funct3
rd
opcode
▪ R-format: 7 bits
▪ S-format: 7 bits
5 bits
5 bits
3 bits
5 bits
7 bits
12 bits
5 bits
3 bits
5 bits
7 bits
funct7
rs2
rs1
funct3
rd
opcode
imm[11:5]
rs2
rs1
funct3
imm[4:0]
opcode
5 bits
5 bits
3 bits
5 bits
7 bits
40
UC Davis EEC 170, Winter 2021 / © John Owens
Logical Operations
▪ Instructions for bitwise manipulation
Operation
C
Java
RISC-V
Shift left
<<
<<
slli
Shift right
>>
>>>
srli
Bit-by-bit AND
&
&
and, andi
Bit-by-bit OR
|
|
or, ori
Bit-by-bit XOR
^
^
xor, xori
Bit-by-bit NOT
~
~
■ Useful for extracting and inserting groups of bits in a word
41
UC Davis EEC 170, Winter 2021 / © John Owens
§2.6 Logical Operations
Category Name Fmt RV32I Base
Shift Operations
▪ immed: how many positions to shift
▪ Shift left logical
– Shift left and fill with 0 bits
Loads Load Byte Load Halfword Load Word Load Byte Unsigned Load Half Unsigned
Stores Store Byte Store Halfword Store Word
Shifts Shift Left Shift Left Immediate Shift Right Shift Right Immediate Shift Right Arithmetic Shift Right Arith Imm
Arithmetic ADD ADD Immediate SUBtract
Load Upper Imm Add Upper Imm to PC
Logical XOR XOR Immediate
I LB I LH I LW I LBU I LHU S SB S SH S SW
rd,rs1,imm rd,rs1,imm rd,rs1,imm L rd,rs1,imm rd,rs1,imm L rs1,rs2,imm rs1,rs2,imm rs1,rs2,imm S
S S S S S S
rd,rs1,rs2 A rd,rs1,imm A
R I R I R I
SLL rd,rs1,rs2
SLLI rd,rs1,shamt
SRL rd,rs1,rs2
SRLI rd,rs1,shamt
SRA rd,rs1,rs2
SRAI rd,rs1,shamt
–
▪ Shift right logical
i slliby i bits multiplies by 2
R ADD I ADDI
R SUB U LUI
U AUIPC R XOR
I XORI R OR
I ORI R AND
I SLTI
– Shift right and fill with 0 bits
rd,imm
rd,imm rd,rs1,rs2 L rd,rs1,imm
rd,rs1,rs2
rd,rs1,imm
rd,rs1,rs2
rd,rs1,imm
rd,rs1,rs2
rd,rs1,imm
rd,rs1,rs2
rd,rs1,imm
rs1,rs2,imm
rs1,rs2,imm
rs1,rs2,imm
rs1,rs2,imm
rs1,rs2,imm
– –
OR Immediate srliby i bits divides by 2i (unsigned only) OR
Set < Immediate Set < Unsigned Set < Imm Unsigned
Compare Set < R SLT Also arithmetic right shifts that fill with sign bit (srai)
- Why not an arithmetic left shift?
R I
SLTU SLTIU
BEQ
BGE
6 bits
6 bits
5 bits
3 bits
Branch ≥ Branch < Unsigned Branch ≥ Unsigned
Jump & Link J&L
SB
42
UC Davis EEC 170, Winter 2021 / © John Owens
AND AND Immediate
I ANDI
Branches Branch = SB
rd,rs1,rs2
S
funct6
immed
rs1
funct3
rd
Bra Bra
nch≠ SBBNE opcode
nch < SB BLT
5 bits
7 bits
SB
SB BGEU rs1,rs2,imm
BLTU
UJ JAL rd,imm
Jump & Link Register UJ JALR rd,rs1,imm
C
S
A
AND Operations
▪ Useful to mask bits in a word
- Select some bits, clear others to 0
▪ and x9,x10,x11 x10
x11 x9
00000000 00000000 00000000 00000000 00000000 00000000 00001101 11000000
00000000 00000000 00000000 00000000 00000000 00000000 00111100 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00001100 00000000
43
UC Davis EEC 170, Winter 2021 / © John Owens
OR Operations
▪ Useful to include bits in a word
- Set some bits to 1, leave others unchanged or x9,x10,x11
x10 x11
x9
00000000 00000000 00000000 00000000 00000000 00000000 0
00000000 00000000 00000000 00000000 00000000 00000000 0
00000000 00000000 00000000 00000000 00000000 00000000 0
44
UC Davis EEC 170, Winter 2021 / © John Owens
0001101 11000000
0111100 00000000
0111101 11000000
XOR Operations
+RV{64,128}
Free & Open
Base Integer Instructions: RV32I, RV64I, and RV128I
Category Name Fmt RV32I Base ▪ Differencing operation
Categ
CSR A
At
Atom Atomic Chang En
R
MMU
Loads Load Byte I LB rd,rs1,imm Load Halfword I LH rd,rs1,imm
- Set some bits to 1, leave others unchanged Load Word I LW rd,rs1,imm
L{D|Q}
L{W|D}U
S{D|Q}
SLL{W|D}
rd,rs1,imm
rd,rs1,imm
rs1,rs2,imm
rd,rs1,rs2
Load Byte Unsigned I LBU rd,rs1,imm Load Half Unsigned I LHU rd,rs1,imm
xor x9,x10,x12 // NOT operation
Stores Store Byte S SB rs1,rs2,imm
Shifts
Shift Left R R
Store Halfword S Store Word S
SH SW
SLL
rs1,rs2,imm
rs1,rs2,imm
rd,rs1,rs2
Shift Left Immediate I SLLI rd,rs1,shamt SLLI{W|D} rd,rs1,shamt T 00000000 00000000 00000000 00000000 00000000 00000000 00001101 11000000
x10
x12 nr
Shift Right SRL rd,rs1,rs2 SRL{W|D} rd,rs1,rs2 Shift Right Immediate I SRLI rd,rs1,shamt SRLI{W|D} rd,rs1,shamt H
Shift Right Arithmetic R SRA rd,rs1,rs2 SRA{W|D} rd,rs1,rs2 I 11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111
Shift Right Arith Imm I SRAI rd,rs1,shamt SRAI{W|D} rd,rs1,shamt Arithmetic ADD R ADD rd,rs1,rs2 ADD{W|D} rd,rs1,rs2
Redir ypervi
x9
SUBtract
Load Upper Imm Add Upper Imm to PC
R SUB U LUI
U AUIPC
XOR rd,rs1,rs2 XORI rd,rs1,imm
OR rd,rs1,rs2
ORI rd,rs1,imm
AND rd,rs1,rs2
ANDI rd,rs1,imm
11111111 11111111 11111111 11111111 11111111 11111111 11110010 00111111
Logical XOR XOR Immediate
OR OR Immediate AND AND Immediate
R I
R I R I
Compare
Set <
R SLT
rd,rs1,rs2
Stores Store Word CS C.SW
ADD Immediate I ADDI rd,rs1,imm ADDI{W|D} rd,rs1,imm
rd,rs1,rs2
rd,imm
SUB{W|D} rd,rs1,rs2
rd,imm
Optional Compressed (1 Category Name Fmt
Loads Load Word CL C.LW
rap
ter
45
Load Quad SP
Load Word SP
Load Double
Load Double SP
Load Quad
CI C.LWSP CL C.LD
CI C.LDSP
CL C.LQ
CI C.LQSP UC Davis EEC 170, Winter 2021 / © John Owens
Set < Immediate
I SLTI rd,rs1,imm
Store Word SP CSS C.SWSP
R
o
c
A i
e
v
e
e
s
u
6
Up to this point, we’ve made an assumption: what happens after we run instruction n?
46 UC Davis EEC 170, Winter 2021 / © John Owens
Conditional Operations
▪ Branch to a labeled instruction if a condition is true - Otherwise, continue sequentially
▪ beq rs1, rs2, L1
- if (rs1 == rs2) branch to instruction labeled L1
▪ bne rs1, rs2, L1
- if (rs1 != rs2) branch to instruction labeled L1
47
UC Davis EEC 170, Winter 2021 / © John Owens
§2.7 Instructions for Making Decisions
Compiling If Statements ▪ C code:
if (i==j) f = g+h;
else f = g-h;
- f, g, ... in x19, x20, ... ▪ Compiled RISC-V code:
bne x22, x23, Else
add x19, x20, x21
beq x0, x0, Exit // unconditional
Else: sub x19, x20, x21
Exit: ...
48
UC Davis EEC 170, Winter 2021 / © John Owens
Assembler calculates addresses
Compiling Loop Statements ▪ C code:
while (save[i] == k) i += 1;
- i in x22, k in x24, address of save in x25 ▪ Compiled RISC-V code:
Loop: slli x10, x22, 3
add x10, x10, x25
ld x9, 0(x10) // could we optimize this with an immediate? bne x9, x24, Exit
addi x22, x22, 1
beq x0, x0, Loop
Exit: ...
49
UC Davis EEC 170, Winter 2021 / © John Owens
$
Aside on addressing modes
The addressing mode determines, for an instruction that accesses a memory location, how the
address for the memory location is specified.
To summarize, the following diagram (from http://en.wikipedia.org/wiki/X86#Addressing_modes) shows the possible ways an address could
▪ x86 has many more addressing modes than RISC-V
$
be specified:
Each square bracket in the above diagram indicates an optional These parts (from left to right) are: A register used as a base add width (or scale) value to multiply the register by, and an displac integer. The address is computed as the sum of: the base registe the displacement.
The Intel & AT&T syntax for various addressing modes, depend diagram are used, is shown in the table below from
$
▪ RISC-V can do:
integer. The address is computed as the sum of: the base register, the index times the width, and
http://simon.baymoo.org/universe/tools/symset/symset.txt (sligh Each square bracket in the above diagram indicates an optional part of the address specification.
- register
| Absolute | Register | Reg + Off
| MOV EAX, [0100] | MOV EAX, [ESI]
| MOV EAX, [EBP-8]
the displacement.
+-------------+----------------------------+----
These parts (from left to right) are: A register used as a base address, a register used as an index, a
| Mode | Intel | AT&
width (or scale) value to multiply the register by, and an displacement (aka offset) which is an
+-------------+----------------------------+----
| mov
| mov
| mov
The Intel & AT&T syntax for vari|ouRs*aWdd+resOsfinfg mo|deMs,OdVepEeAnXd,ing[oEnBXw*h4ich+p0ar1t0s0o]f the ab|ovmeov
- reg+off
diagram are used, is shown in the table below from
https://cs.nyu.edu/courses/fall10/V22.0201-002/addressing_modes.pdf
50
UC Davis EEC 170, Winter 2021 / © John Owens
| B + R*W + O | MOV EAX, [EDX + EBX*4 + 8] | mov
http://simon.baymoo.org/universe/tools/symset/symset.txt (slightly modified):$ - (small) absolute
+-------------+----------------------------+----
+-------------+-----N-o-t-e-t-h-a-t-,-g-i-v-e-n-t-h-e--d-e-f-in-i-t+io-n--o-f-a--l-a-b-e-l$-x-$-in--t-h-e-$.--d-a-t--a-$-se-c-t+ion | Mode | Intel | AT&T |
| Register
(%esi), %eax |
to indicate the memory location, as in
+-------------+----------------------------+-----------------------------+
| Absolute
| MOV EAX, [0100] | movl
0x0100, %eax |
| MOVmEoAvX, [eESaIx],x #Intel| movl
| Reg + Off
| MOV EAX, [EBP-8]
| movl
-8(%ebp), %eax |
p r
e r
t
- T - l l l l l -
Basic Blocks
▪ A basic block is a sequence of instructions with
- No embedded branches (except at end)
- No branch targets (except at beginning)
■ ■
A compiler identifies basic blocks for optimization
An advanced processor can accelerate execution of basic blocks
51
UC Davis EEC 170, Winter 2021 / © John Owens
More Conditional Operations
▪ blt rs1, rs2, L1
- if (rs1 < rs2) branch to instruction labeled L1
▪ bge rs1, rs2, L1
- if (rs1 >= rs2) branch to instruction labeled L1
▪ Example
-if (a > b) a += 1; // a in x22, b in x23
bge x23, x22, Exit // branch if b >= a
addi x22, x22, 1
Exit:
52
UC Davis EEC 170, Winter 2021 / © John Owens
Signed vs. Unsigned
▪ Signed comparison: blt, bge
▪ Unsigned comparison: bltu, bgeu ▪ Example
– x22 = 1111 1111 1111 1111 1111 1111 1111 1111
– x23 = 0000 0000 0000 0000 0000 0000 0000 0001
–
– –1 < +1
x22 < x23 // signed x22 > x23 // unsigned
–
– +4,294,967,295 > +1
53
UC Davis EEC 170, Winter 2021 / © John Owens
Let’s say you write an awesome procedure in RISC-V and I want to call it. You use registers, I use registers. What could go wrong?
54 UC Davis EEC 170, Winter 2021 / © John Owens
Procedure Calling
▪ Steps required
– Place parameters in registers x10 to x17
– Transfer control to procedure
– Acquire storage for procedure
– “Storage” may be both register and memory space
– Perform procedure’s operations
– Place result in register for caller
– Return to place of call (address in x1)
55
UC Davis EEC 170, Winter 2021 / © John Owens
§2.8 Supporting Procedures in Computer Hardware
Procedure Call Instructions ▪ Procedure call: jump and link
jal x1, ProcedureLabel
– Address of following instruction put in x1
– Jumps to target address
▪ Procedure return: jump and link register
jalr x0, 0(x1)
– Like jal, but jumps to 0 + address in x1
– Use x0 as rd (x0 cannot be changed)
– Can also be used for computed jumps
– e.g., for case/switch statements
56 UC Davis EEC 170, Winter 2021 / © John Owens
Aside: Data Types in C
▪ The actual size of the integer types varies by implementation. The standard only requires size relations between the data types and minimum sizes for each data type:
▪ The relation requirements are that the long long is not smaller than long, which is not smaller than int, which is not smaller than short. As char’s size is always the minimum supported data type, no other data types (except bit-fields) can be smaller.
▪ The minimum size for char is 8 bits, the minimum size for short and int is 16 bits, for long it is 32 bits and long long must contain at least 64 bits.
▪ The type int should be the integer type that the target processor is most efficiently working with. This allows great flexibility: for example, all types can be 64-bit. However, several different integer width schemes (data models) are popular. Because the data model defines how different programs communicate, a uniform data model is used within a given operating system application interface.
▪ In practice, char is usually eight bits in size and short is usually 16 bits in size (as are their unsigned counterparts). This holds true for platforms as diverse as 1990s SunOS 4 Unix, Microsoft MS-DOS, modern Linux, and Microchip MCC18 for embedded 8-bit
PIC microcontrollers. POSIX requires char to be exactly eight bits in size.
https://en.wikipedia.org/wiki/C_data_types 57 UC Davis EEC 170, Winter 2021 / © John Owens
Leaf Procedure Example
▪
“leaf procedures” make no function calls
C code:
long long int leaf_example (
long long int g, long long int h,
long long int i, long long int j) {
long long int f;
f = (g + h) – (i + j);
return f;
long long int
guarantees at least a 64-bit integer
}
– f in x20
– temporaries x5, x6
– Arguments g, …, j in x10, …, x13
– Callee needs to save x5, x6, x20 on “stack” (magic data structure, we will describe shortly)
58
UC Davis EEC 170, Winter 2021 / © John Owens
Leaf Procedure Example ▪ RISC-V code:
leaf_example:
addi sp,sp,-24
sd x5,16(sp)
sd x6,8(sp)
sd x20,0(sp)
add x5,x10,x11
add x6,x12,x1
sub x20,x5,x6
addi x10,x20,0
ld x20,0(sp)
ld x6,8(sp)
ld x5,16(sp)
addi sp,sp,24
jalr x0,0(x1)
Save x5, x6, x20 on stack (caller might need those values)
x5 = g + h
x6 = i + j
f = x5 – x6
copy f to return register Restore x5, x6, x20 from stack
Return to caller
59
UC Davis EEC 170, Winter 2021 / © John Owens
What could a compiler do to optimize the previous code?
60 UC Davis EEC 170, Winter 2021 / © John Owens
Local Data on the Stack
▪ In RISC-V:
– The stack pointer points to the “top” of the stack (the most recently
used item)
– The stack grows downward
61
UC Davis EEC 170, Winter 2021 / © John Owens
Register Usage (RISC-V Convention)
▪ x5 – x7, x28 – x31: temporary registers – Not preserved by the callee
▪ x8 – x9, x18 – x27: saved registers
– If used, the callee saves and restores them
▪ Big picture: When a procedure call is made, some tasks are the responsibility of the caller and some are the responsibility of the callee
62
UC Davis EEC 170, Winter 2021 / © John Owens
Non-Leaf Procedures
▪ Procedures that call other procedures
▪ For nested call, caller needs to save on the stack:
– Its return address
– Any arguments and temporaries needed after the call ▪ Restore from the stack after the call
63 UC Davis EEC 170, Winter 2021 / © John Owens
Non-Leaf Procedure Example ▪ C code:
long long int fact (long long int n)
{
if (n < 1) return 1;
else return n * fact(n - 1);
}
- -
Argument n in x10 Result in x10
64 UC Davis EEC 170, Winter 2021 / © John Owens
Non-Leaf Procedure Example
▪ RISC-V code:
fact:
addi sp,sp,-16
sd x1,8(sp)
sd x10,0(sp)
addi x5,x10,-1
bge x5,x0,L1
addi x10,x0,1
addi sp,sp,16
jalr x0,0(x1)
L1: addi x10,x10,-1
jal x1,fact
addi x6,x10,0
ld x10,0(sp)
ld x1,8(sp)
addi sp,sp,16
mul x10,x10,x6
jalr x0,0(x1)
long long int fact (long long int n)
{
if (n < 1) return 1;
else return n * fact(n - 1);
}
Save return address and n on stack
x5 = n - 1
if n >= 1, go to L1
Else, set return value to 1
Pop stack, don’t bother restoring values
Return
n = n -1
call fact(n-1), write next instruction’s address into x1, result will be in x10 move result of fact(n – 1) to x6
Restore caller’s n
Restore caller’s return address
Pop stack
return n * fact(n-1)
return
65
UC Davis EEC 170, Winter 2021 / © John Owens
Memory Layout
▪ Text: program code
▪ Static data: global variables
– e.g., static variables in C, constant arrays and strings
– x3 (global pointer) initialized to address allowing ±offsets into this segment
▪ Dynamic data: heap
– E.g., malloc in C, new in Java
▪ Stack: automatic storage
66
UC Davis EEC 170, Winter 2021 / © John Owens
Local Data on the Stack
▪ Local data allocated by callee – e.g., C automatic variables
▪ Procedure frame (activation record)
– Used by some compilers to manage stack storage
67
UC Davis EEC 170, Winter 2021 / © John Owens
Character Data
▪ Byte-encoded character sets – ASCII: 128 characters
– 95 graphic, 33 control – Latin-1: 256 characters
– ASCII, +96 more graphic characters ▪ Unicode: 32-bit character set
– Used in Java, C++ wide characters, …
– Most of the world’s alphabets, plus symbols
– UTF-8, UTF-16: variable-length encodings
68
UC Davis EEC 170, Winter 2021 / © John Owens
§2.9 Communicating with People
Byte/Halfword/Word Operations
▪
RISC-V byte/halfword/word load/store
– Load byte/halfword/word: Sign extend to 64 bits in rd
-lb rd, offset(rs1) -lh rd, offset(rs1) -lw rd, offset(rs1)
– Load byte/halfword/word unsigned: Zero extend to 64 bits in rd
-lbu rd, offset(rs1) -lhu rd, offset(rs1) -lwu rd, offset(rs1)
– Store byte/halfword/word: Store rightmost 8/16/32 bits
-sb rs2, offset(rs1) -sh rs2, offset(rs1) -sw rs2, offset(rs1)
69
UC Davis EEC 170, Winter 2021 / © John Owens
String Copy Example ▪ C code:
– Null-terminated string
void strcpy (char x[], char y[]) {
size_t i;
i = 0;
while ((x[i]=y[i]) != ‘\0′)
i += 1; }
// C idiom: while (*x++ = *y++);
70
UC Davis EEC 170, Winter 2021 / © John Owens
String Copy Example ▪ RISC-V code:
strcpy:
addi sp,sp,-8 // adjust stack for 1 doubleword sd x19,0(sp) // push x19
add x19,x0,x0 // i=0 (x19 contains i)
L1: add x5,x19,x10 // x10 = &y; x5 = addr of y[i]
lbu x6,0(x5) add x7,x19,x11 sb x6,0(x7) beq x6,x0,L2 addi x19,x19,1 jal x0,L1
L2: ld x19,0(sp)
addi sp,sp,8
jalr x0,0(x1)
// x6 = y[i]
// x11 = &x; x7 = addr of x[i]
// x[i] = y[i]
// if y[i] == 0 then exit
// i = i + 1
// next iteration of loop
// restore saved x19
// pop 1 doubleword from stack
// and return
71
UC Davis EEC 170, Winter 2021 / © John Owens