
x86 Programming III CSE 351 Autumn 2016

ACKNOWLEDGEMENT: These slides have been modified by your CMPT 295 instructor and RISC-V ISA creators. However, please report all mistakes to your instructor.
C:
car *c = malloc(sizeof(car));
c->miles = 100;
c->gals = 17;
float mpg = get_mpg(c);
free(c);

Java:
Car c = new Car();
c.setMiles(100);
c.setGals(17);
float mpg = c.getMPG();

Assembly language:
Machine code:
0111010000011000
100011010000010000000010
1000100111000010
110000011111101000011111
OS:
Computer system:

Memory & data
Arrays & structs
Integers & floats
RISC V assembly
Procedures & stacks
Executables
Memory & caches
Processor Pipeline
Performance
Parallelism

CMPT 295
L06 – RISC V – I

From Writing to Running
sum.c (C source files)
→ Compiler (gcc -S) → sum.s (assembly files)
→ Assembler (gcc -c) → sum.o (obj files)
→ Linker (gcc -o) → sum (executable program, exists on disk)
→ loader → process (executing in memory): “It’s alive!”
When most people say “compile” they mean the entire process: compile + assemble + link

C compiler produces assembly files (contain RISCV assembly, pseudo-instructions, directives, etc.)
RISCV assembler produces object files (contain RISCV machine code, missing symbols, some layout information, etc.)
RISCV linker produces executable file (contains RISCV machine code, no missing symbols, some layout information)
OS loader gets it into memory and jumps to first instruction (machine code)

Mainstream ISAs


Macbooks & PCs
(Core i3, i5, i7, M)
x86 Instruction Set
Smartphone-like devices
(iPhone, iPad, Raspberry Pi)
ARM Instruction Set
Versatile and open-source
Relatively new, designed for cloud computing, high-end phones, small embedded sys.
RISCV Instruction Set

We have many choices as to which assembly language we could pick:
Intel x86, which is the machine language used by most of our laptops.
Why don’t we?
SUPER long and complicated

ARM, which is used in our iPhones

ARM stands for Advanced RISC Machine

Complex/Reduced Instruction Set Computing
Early trend: add more and more instructions to do elaborate operations – Complex Instruction Set Computing (CISC)
difficult to learn and comprehend language
super-complicated (slow?) hardware
Opposite philosophy later began to dominate: Reduced Instruction Set Computing (RISC)
Simpler (and smaller) instruction set makes it easier to build fast hardware
Let software do the complicated operations by composing simpler ones


x86 fell victim to the “early trend” — every time we want to do something complicated, let’s just implement it directly in the hardware! Downsides…
no one person can know the whole thing
WHAT’S THE RULE ABOUT ASSEMBLY LANGUAGE?

New open-source, license-free ISA spec
Supported by growing shared software ecosystem
Appropriate for all levels of computing system, from microcontrollers to supercomputers
32-bit, 64-bit, and 128-bit variants (we’re using 32-bit in class, textbook uses 64-bit)
Why RISC-V instead of Intel 80x86?
RISC-V is simple, elegant. Don’t want to get bogged down in gritty details.
RISC-V has exponential adoption rate

RISC-V Green Card
IBM 360 Green Card

RISC-V Architecture
Registers — Summary
In high-level languages, number of variables limited only by available memory
ISAs have a fixed, small number of operands called registers
Special locations built directly into hardware
Benefit: Registers are EXTREMELY FAST
(faster than 1 billionth of a second)
Drawback: Operations can only be performed on this small, predetermined number of registers

Registers visible to compiler (or RISCV programmer).

Aside: Registers are Inside the Processor
[Figure: the Processor (Control; Datapath with PC, Registers, and Arithmetic & Logic Unit (ALU)) connects to Memory (Program, Data; organized as Bytes) through the Processor-Memory Interface (Enable?, Read/Write, Address, Write Data, Read Data); Input and Output connect through the I/O-Memory Interfaces]

RISCV — How Many Registers?
Tradeoff between speed and availability
more registers → can house more variables simultaneously; all registers are slower.
RISCV has 32 registers (x0-x31)
Each register is 32 bits wide and holds a word
word = group of 32 bits (like how byte = group of 8 bits)
For the sake of comparison, ARM uses 16 registers.
How many bytes is a word?
what if we need more variables than that?!

RISC V Integer Registers – 32 bits wide
Names are largely arbitrary, reasons behind them are not super relevant to us
But there are conventions about how they are used
One is particularly important to not misuse: sp (x2)
Stack pointer; we’ll get to this when we talk about procedures
Each is 4 bytes wide
What if we only want 1 or 2 bytes?
Unlike x86, RISC-V has no alternate names for part of a register; byte loads and stores (lb, sb) handle narrower data

Memory vs. Registers
Property              Memory                                   Registers
Addresses vs. Names   0x7FFFD024C3DC                           x0
Big vs. Small         ~ 8 GiB                                  (32 x 4 B) = 128 B
Slow vs. Fast         ~50-100 ns                               sub-nanosecond timescale
Dynamic vs. Static    Can “grow” as needed while program runs  fixed number in hardware

RISCV Registers
Register denoted by ‘x’ can be referenced by number (x0-x31) or name:
Registers that hold programmer variables:
s0-s1 ⬌ x8-x9
s2-s11 ⬌ x18-x27
Registers that hold temporary variables:
t0-t2 ⬌ x5-x7
t3-t6 ⬌ x28-x31
You’ll learn about the other 13 registers later
Registers have no type (C concept); the operation being performed determines how register contents are treated
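The name-to-number mapping listed above can be sketched as a small lookup. This is only an illustration: the function name abi_to_x is hypothetical, and it covers just the four ranges on this slide, returning -1 for every other name.

```c
#include <assert.h>
#include <stdlib.h>

/* Map an ABI register name such as "s5" or "t3" to its x-number,
 * using only the four ranges listed on the slide. Returns -1 for
 * any other name (the remaining registers are not covered here). */
int abi_to_x(const char *name)
{
    assert(name != NULL);
    if (name[0] != 's' && name[0] != 't')
        return -1;
    if (name[1] < '0' || name[1] > '9')
        return -1;                      /* e.g. "sp" is not one of s0-s11 */

    char *end;
    long n = strtol(name + 1, &end, 10);
    if (*end != '\0')
        return -1;

    if (name[0] == 's') {
        if (n <= 1)  return 8 + (int)n;         /* s0-s1  <-> x8-x9   */
        if (n <= 11) return 18 + (int)(n - 2);  /* s2-s11 <-> x18-x27 */
    } else {
        if (n <= 2)  return 5 + (int)n;         /* t0-t2  <-> x5-x7   */
        if (n <= 6)  return 28 + (int)(n - 3);  /* t3-t6  <-> x28-x31 */
    }
    return -1;
}
```

For example, abi_to_x("s2") gives 18 and abi_to_x("t6") gives 31.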

C, Java variables vs. registers
In C (and most High Level Languages) variables declared first and given a type. E.g.,
int fahr, celsius;
char a, b, c, d, e;

Each variable can ONLY represent a value of the type it was declared as (cannot mix and match int and char variables).

In Assembly Language, the registers have no type
Operation determines how register contents are treated
RISCV Agenda
Basic Arithmetic Instructions
Comments
x0 (zero)
Immediates
Data Transfer Instructions
Decision Making Instructions
Bonus: C to RISCV Practice
Bonus: Additional Instructions

RISCV Instructions (1/2)
Instruction Syntax is rigid:
op dst, src1, src2
1 operator, 3 operands
op = operation name (“operator”)
dst = register getting result (“destination”)
src1 = first register for operation (“source 1”)
src2 = second register for operation (“source 2”)
Keep hardware simple via regularity

RISCV Instructions (2/2)
One operation per instruction,
at most one instruction per line
Assembly instructions are related to C operations (=, +, -, *, /, &, |, etc.)
Must be, since C code decomposes into assembly!
A single line of C may break up into several lines of RISCV

RISCV Instructions Example
Your very first instructions!
(assume here that the variables a, b, and c are assigned to registers s1, s2, and s3, respectively)
Integer Addition (add)
C: a = b + c
RISCV: add s1, s2, s3
Integer Subtraction (sub)
C: a = b - c
RISCV: sub s1, s2, s3

RISCV Instructions Example
Ordering of instructions matters (must follow order of operations)
Utilize temporary registers

Suppose a → s0, b → s1, c → s2, d → s3 and e → s4. Convert the following C statement to RISCV:
a = (b + c) - (d + e);

add t1, s3, s4
add t2, s1, s2
sub s0, t2, t1

add s0, s1, s2
add t0, s3, s4
sub s0, s0, t0
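Both instruction sequences above compute the same value. A minimal C sketch, treating each register as an untyped 32-bit value; the function names are hypothetical and each C statement stands for the instruction in its comment:

```c
#include <assert.h>
#include <stdint.h>

/* First sequence: two temporaries, then one subtraction. */
uint32_t eval_with_two_temps(uint32_t s1, uint32_t s2, uint32_t s3, uint32_t s4)
{
    uint32_t t1 = s3 + s4;   /* add t1, s3, s4 */
    uint32_t t2 = s1 + s2;   /* add t2, s1, s2 */
    uint32_t s0 = t2 - t1;   /* sub s0, t2, t1 */
    return s0;
}

/* Second sequence: reuse s0 for (b + c), so only one temporary is needed. */
uint32_t eval_with_one_temp(uint32_t s1, uint32_t s2, uint32_t s3, uint32_t s4)
{
    uint32_t s0 = s1 + s2;   /* add s0, s1, s2 */
    uint32_t t0 = s3 + s4;   /* add t0, s3, s4 */
    s0 = s0 - t0;            /* sub s0, s0, t0 */
    return s0;
}

/* Usage: with b=5, c=3, d=2, e=1 both give (5 + 3) - (2 + 1) = 5. */
void check_expression(void)
{
    assert(eval_with_two_temps(5, 3, 2, 1) == 5);
    assert(eval_with_one_temp(5, 3, 2, 1) == 5);
}
```

Unsigned arithmetic is used so that wraparound matches what add and sub do on 32-bit registers.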


Assembly Instructions
In assembly language, each statement (called an Instruction), executes exactly one of a short list of simple commands
Unlike in C (and most other High Level Languages), each line of assembly code contains at most 1 instruction
Instructions are related to operations (=, +, -, *, /) in C or Java
Ok, enough already…gimme my RV32!
RISC V Assembly “Data Types”
Integral data of 1, 2, 4, or 8 bytes (we focus on 4 bytes)
Data values
Addresses
Floating point data of 4, 8, 10 or 2×8 or 4×4 or 8×2
Different registers for those (e.g. f0, f31)

No aggregate types such as arrays or structures
Just contiguously allocated bytes in memory
“AT&T”: used by our course, slides, textbook, gnu tools, …
Not covered in 295

Assembly program doesn’t actually have notion of “data types,” but since encoding is so different between integral and floating point data, there are actually different instructions built into assembly because the hardware needs to deal with these numbers differently when performing arithmetic.

Three Basic Kinds of Instructions
Transfer data between memory and register
Load data from memory into register
%reg = Mem[address]
Store register data into memory
Mem[address] = %reg
Perform arithmetic operation on register or memory data
c = a + b; z = x << y; i = h & g;
Control flow: what instruction to execute next
Unconditional jumps to/from procedures
Conditional branches
Remember: Memory is indexed just like an array of bytes!

Operand types
Immediate: Constant integer data
Examples: $0x400, $-533
Like C literal, but prefixed with ‘$’
Encoded with 1, 2, 4, or 8 bytes depending on the instruction
Register: 22 integer registers
Examples: %x9 … %x31
But %x0-x4 and x8 reserved for special use
Others have special uses for particular instructions
Memory: Consecutive bytes of memory at a computed address
Simplest example: (%x18)

RISC-V Addition and Subtraction (1/4)
Syntax of Instructions:
One two, three, four
add x1, x2, x3
where:
One = operation by name
two = operand getting result (“destination”)
three = 1st operand for operation (“source1”)
four = 2nd operand for operation (“source2”)
Syntax is rigid:
1 operator, 3 operands
Why? Keep Hardware simple via regularity

Addition and Subtraction of Integers (2/4)
Addition in Assembly
Example: add x1,x2,x3 (in RISC-V)
Equivalent to: a = b + c (in C)
where C variables ⇔ RISC-V registers are: a ⇔ x1, b ⇔ x2, c ⇔ x3
Subtraction in Assembly
Example: sub x3,x4,x5 (in RISC-V)
Equivalent to: d = e - f (in C)
where C variables ⇔ RISC-V registers are: d ⇔ x3, e ⇔ x4, f ⇔ x5

Addition and Subtraction of Integers (3/4)
How to do the following C statement?
a = b + c + d - e;
Break into multiple instructions
add x10, x1, x2 # a_temp = b + c
add x10, x10, x3 # a_temp = a_temp + d
sub x10, x10, x4 # a = a_temp - e
Notice: A single line of C may break up into several lines of RISC-V.
Notice: Everything after the hash mark on each line is ignored (comments).
Check Apollo-11 comments!

Addition and Subtraction of Integers (4/4)
How do we do this?
f = (g + h) - (i + j);
Use intermediate temporary register
add x5, x20, x21 # a_temp = g + h
add x6, x22, x23 # b_temp = i + j
sub x19, x5, x6 # f = (g + h) - (i + j)

Comments in Assembly
Another way to make your code more readable: comments!
Hash (#) is used for RISC-V comments
anything from hash mark to end of line is a comment and will be ignored
This is just like the C99 //
Note: Different from C. C comments have format /* comment */ so they can span many lines

The Zero Register
Zero appears so often in code and is so useful that it has its own register!
Register zero (x0 or zero) always has the value 0 and cannot be changed!
i.e. any instruction with x0 as dst has no effect
Example Uses:
add s3, x0, x0 # c=0
add s1, s2, x0 # a=b

Immediates
Numerical constants are called immediates
Separate instruction syntax for immediates:
opi dst, src, imm
Operation names end with ‘i’, replace 2nd source register with an immediate
Example Uses:
addi s1, s2, 5 # a=b+5
addi s3, s3, 1 # c++
Why no subi instruction?
Goal of RISC is to minimize instruction set. subi dst, src, imm = addi dst, src, -imm.

Immediates
Immediates are numerical constants.
They appear often in code, so there are special instructions for them.
Add Immediate:
addi x3,x4,10 (in RISC-V)
f = g + 10 (in C)
where RISC-V registers x3,x4 are associated with C variables f, g
Syntax similar to add instruction, except that last argument is a number instead of a register.
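The zero-register rule and the add/addi forms described above can be modeled with a tiny register file. This is a sketch under stated assumptions: the RegFile type and the rf_ function names are hypothetical, and registers are addressed by x-number.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the 32 integer registers x0..x31. */
typedef struct {
    uint32_t x[32];
} RegFile;

/* Every write goes through here: a write to x0 is silently dropped,
 * so x0 always reads as 0. */
static void rf_write(RegFile *rf, int dst, uint32_t value)
{
    assert(dst >= 0 && dst < 32);
    if (dst != 0)
        rf->x[dst] = value;
}

/* add dst, src1, src2 */
void rf_add(RegFile *rf, int dst, int src1, int src2)
{
    rf_write(rf, dst, rf->x[src1] + rf->x[src2]);
}

/* addi dst, src, imm: the second source is a constant, not a register */
void rf_addi(RegFile *rf, int dst, int src, int32_t imm)
{
    rf_write(rf, dst, rf->x[src] + (uint32_t)imm);
}
```

With x4 holding 7, rf_addi(&rf, 3, 4, 10) leaves 17 in x3, while rf_add(&rf, 0, 3, 3) has no visible effect because its destination is x0.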
Immediates
There is no Subtract Immediate in RISC-V: Why?
There are add, sub, and addi, but no subi counterpart
Limit types of operations that can be done to absolute minimum
if an operation can be decomposed into a simpler operation, don’t include it
addi …, -X = subi …, X => so no subi
addi x3,x4,-10 (in RISC-V)
f = g - 10 (in C)
where RISC-V registers x3, x4 are associated with C variables f, g
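The decomposition above can be checked in C. A minimal sketch: addi_checked and subi_via_addi are hypothetical names, and the 12-bit signed immediate range (-2048 to 2047) is background on the RV32I encoding rather than something stated on these slides.

```c
#include <assert.h>
#include <stdint.h>

/* addi dst, src, imm on a 32-bit register value. The range check
 * reflects the 12-bit signed immediate of RV32I (an assumption
 * brought in from the ISA, not from the slides). */
uint32_t addi_checked(uint32_t src, int32_t imm)
{
    assert(imm >= -2048 && imm <= 2047);
    return src + (uint32_t)imm;
}

/* A hypothetical "subi" needs no hardware of its own: negate the
 * immediate and reuse addi. */
uint32_t subi_via_addi(uint32_t src, int32_t imm)
{
    return addi_checked(src, -imm);
}
```

So f = g - 10 with g = 25 is addi_checked(25, -10), which gives 15, the same as subi_via_addi(25, 10).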
Data Transfer: Load from and Store to memory
[Figure: the Processor (Control; Datapath with PC, Registers, and Arithmetic & Logic Unit (ALU)) connects to Memory (Program, Data; organized as Bytes) through the Processor-Memory Interface (Enable?, Read/Write, Address, Write Data, Read Data); Input and Output connect through the I/O-Memory Interfaces]
Write Data = Store to memory
Read Data = Load from memory
Memory: much larger place to hold values, but slower than registers!
Registers: fast but limited place to hold values

Memory Addresses are in Bytes
Data typically smaller than 32 bits, but rarely smaller than 8 bits (e.g., char type)–works fine if everything is a multiple of 8 bits
8 bit chunk is called a byte
(1 word = 4 bytes)
Memory addresses are really in bytes, not words
Word addresses are 4 bytes apart
Word address is same as address of rightmost byte – least-significant byte (i.e. Little-endian convention)
[Figure: byte addresses within each word; the least-significant byte in a word gets the smallest address]
Bits:     31-24  23-16  15-8  7-0
Word 12:    15     14    13    12
Word 8:     11     10     9     8
Word 4:      7      6     5     4
Word 0:      3      2     1     0
Big-endian and little-endian derive from Jonathan Swift’s Gulliver’s Travels in which the Big Endians were a political faction that broke their eggs at the large end (“the primitive way”) and rebelled against the Lilliputian King who required his subjects (the Little Endians) to break their eggs at the small end.
Big Endian vs. Little Endian
Big Endian
ADDR3 ADDR2 ADDR1 ADDR0
BYTE0 BYTE1 BYTE2 BYTE3
00000001 00000100 00000000 00000000

Little Endian
ADDR3 ADDR2 ADDR1 ADDR0
BYTE3 BYTE2 BYTE1 BYTE0
00000000 00000000 00000100 00000001

Consider the number 1025 as we normally write it:
BYTE3 BYTE2 BYTE1 BYTE0
00000000 00000000 00000100 00000001
The order in which BYTES are stored in memory
Bits always stored as usual. (E.g., 0xC2=0b 1100 0010)
en.wikipedia.org/wiki/Big_endian
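The two byte orders for 1025 shown above can be reproduced with a short sketch. The function names are hypothetical; mem[0] plays the role of ADDR0 and mem[3] the role of ADDR3, so the result does not depend on the byte order of the machine running the code.

```c
#include <assert.h>
#include <stdint.h>

/* Little endian: BYTE0 (least significant) goes to the lowest address. */
void store_little_endian(uint8_t mem[4], uint32_t value)
{
    for (int addr = 0; addr < 4; addr++)
        mem[addr] = (uint8_t)(value >> (8 * addr));
}

/* Big endian: BYTE3 (most significant) goes to the lowest address. */
void store_big_endian(uint8_t mem[4], uint32_t value)
{
    for (int addr = 0; addr < 4; addr++)
        mem[addr] = (uint8_t)(value >> (8 * (3 - addr)));
}

/* Usage: 1025 = 0x00000401, so BYTE0 = 0x01 and BYTE1 = 0x04. */
void check_byte_order(void)
{
    uint8_t mem[4];
    store_little_endian(mem, 1025);
    assert(mem[0] == 0x01 && mem[1] == 0x04);
    store_big_endian(mem, 1025);
    assert(mem[3] == 0x01 && mem[2] == 0x04);
}
```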

Great Idea #3: Principle of Locality / Memory Hierarchy

Speed of Registers vs. Memory
Given that
Registers: 32 words (128 Bytes)
Memory (DRAM): Billions of bytes (2 GB to 64 GB on laptop)
and physics dictates…
Smaller is faster
How much faster are registers than DRAM??
About 100-500 times faster! (in terms of latency of one access)

Load from Memory to Register
C code
int A[100];
g = h + A[3];

Using Load Word (lw) in RISC-V:
lw x10,12(x15) # Reg x10 gets A[3]
add x11,x12,x10 # g = h + A[3]

Note: x15 – base register (pointer to A[0])
12 – offset in bytes
Offset must be a constant known at assembly time
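The offset 12 in the lw above is just the index 3 scaled by the 4-byte word size. A minimal sketch, using int32_t so the element size is exactly 4 bytes; word_offset and load_word are hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of element `index` in an array of 32-bit words. */
size_t word_offset(size_t index)
{
    return index * sizeof(int32_t);
}

/* Mimics lw dst, byte_offset(base): the address is the base pointer
 * plus a byte offset, and it has to be a multiple of the word size. */
int32_t load_word(const int32_t *base, size_t byte_offset)
{
    assert(byte_offset % sizeof(int32_t) == 0);
    const unsigned char *addr = (const unsigned char *)base + byte_offset;
    return *(const int32_t *)addr;
}
```

With A[3] = 42 and h = 7, load_word(A, word_offset(3)) reads 42 from byte offset 12, and g = h + A[3] becomes 49.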

Store from Register to Memory
C code
int A[100];
A[10] = h + A[3];

Using Store Word (sw) in RISC-V:
lw x10,12(x15) # Temp reg x10 gets A[3]
add x10,x12,x10 # Temp reg x10 gets h + A[3]
sw x10,40(x15) # A[10] = h + A[3]

Note: x15 – base register (pointer)
12,40 – offsets in bytes
x15+12 and x15+40 must be multiples of 4
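The three-instruction sequence above can be traced on a toy byte-addressed memory. This is a sketch only: the 64-byte mem array, MEM_SIZE, and the function names are hypothetical, and words are laid out little-endian as on the earlier slides.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_SIZE 64              /* hypothetical tiny memory */
static uint8_t mem[MEM_SIZE];

/* lw: read four consecutive bytes, lowest address = least significant */
uint32_t lw(uint32_t addr)
{
    assert(addr % 4 == 0 && addr + 4 <= MEM_SIZE);
    return (uint32_t)mem[addr]
         | (uint32_t)mem[addr + 1] << 8
         | (uint32_t)mem[addr + 2] << 16
         | (uint32_t)mem[addr + 3] << 24;
}

/* sw: write the four bytes of value back in the same order */
void sw(uint32_t value, uint32_t addr)
{
    assert(addr % 4 == 0 && addr + 4 <= MEM_SIZE);
    for (uint32_t i = 0; i < 4; i++)
        mem[addr + i] = (uint8_t)(value >> (8 * i));
}

/* A[10] = h + A[3], with A starting at byte address x15 and h in x12 */
void store_example(uint32_t x15, uint32_t x12)
{
    uint32_t x10 = lw(x15 + 12);   /* lw  x10,12(x15) */
    x10 = x12 + x10;               /* add x10,x12,x10 */
    sw(x10, x15 + 40);             /* sw  x10,40(x15) */
}
```

Both helpers assert that the address is a multiple of 4, which is the alignment condition the slide states for x15+12 and x15+40.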

Loading and Storing Bytes
In addition to word data transfers
(lw, sw), RISC-V has byte data transfers:
load byte: lb
store byte: sb
Same format as lw, sw
E.g., lb x10,3(x11)
contents of memory location with address = sum of “3” + contents of register x11 is copied to the low byte position of register x10.

[Figure: x10 after lb. The loaded byte fills the low 8 bits (x zzz zzzz); its top bit x is copied into the upper 24 bits (xxxx xxxx xxxx xxxx xxxx xxxx) to “sign-extend”]
RISC-V also has “unsigned byte” loads (lbu) which zero extends to fill register. Why no unsigned store byte sbu?
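The difference between lb and lbu is only in how the upper 24 bits of the register are filled. A minimal sketch with hypothetical function names; the sign extension is done with explicit bit operations so it does not rely on implementation-defined conversions:

```c
#include <assert.h>
#include <stdint.h>

/* lb: copy bit 7 of the loaded byte into bits 31..8 */
uint32_t lb_extend(uint8_t byte)
{
    uint32_t reg = byte;
    if (byte & 0x80)
        reg |= 0xFFFFFF00u;
    return reg;
}

/* lbu: fill bits 31..8 with zeros */
uint32_t lbu_extend(uint8_t byte)
{
    return byte;
}

/* sb: only the low 8 bits of the register reach memory, so there is
 * nothing to extend and signedness makes no difference. */
uint8_t sb_value(uint32_t reg)
{
    return (uint8_t)(reg & 0xFF);
}
```

For the byte 0x80, lb_extend gives 0xFFFFFF80 and lbu_extend gives 0x00000080; storing either result back with sb_value yields 0x80 again, which suggests why a separate sbu would add nothing.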

Decision Making Instructions
Branch If Equal (beq)
beq reg1,reg2,label
If value in reg1 = value in reg2, go to label
Branch If Not Equal (bne)
bne reg1,reg2,label
If value in reg1 ≠ value in reg2, go to label
Jump (j)
j label
Unconditional jump to label

Types of Branches
Branch – change of control flow

Conditional Branch – change control flow depending on outcome of comparison
branch if equal (beq) or branch if not equal (bne)
Also branch if less than (blt) and branch if greater than or equal (bge)

Unconditional Branch – always branch
a RISC-V instruction for this: jump (j), as in j label

Breaking Down the If Else
C Code:
if(i==j) {
a = b; /* then */
} else {
a = -b; /* else */
}
In English:
If TRUE, execute the THEN block
If FALSE, execute the ELSE block
RISCV (beq):
# i→s0, j→s1
# a→s2, b→s3
beq s0,s1,then
else:
sub s2, x0, s3
j end
then:
add s2, s3, x0
end:
Note: the else: label is unnecessary (the else block is reached by falling through the beq)

CS61C Su18 – Lecture 5


Breaking Down the If Else
C Code:
if(i==j) {
a = b; /* then */
} else {
a = -b; /* else */
}
In English:
If TRUE, execute the THEN block
If FALSE, execute the ELSE block
RISCV (bne):
# i→s0, j→s1
# a→s2, b→s3
bne s0,s1,else
then:
add s2, s3, x0
j end
else:
sub s2, x0, s3
end:
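The bne version above maps almost line for line onto C with goto, which makes the control flow easy to trace. A sketch only: if_else_bne is a hypothetical name, and unsigned arithmetic stands in for the wraparound behavior of sub.

```c
#include <assert.h>
#include <stdint.h>

/* i -> s0, j -> s1, a -> s2, b -> s3 */
uint32_t if_else_bne(uint32_t s0, uint32_t s1, uint32_t s3)
{
    uint32_t s2;

    if (s0 != s1) goto else_block;  /* bne s0,s1,else */
    s2 = s3 + 0;                    /* then: add s2, s3, x0 */
    goto end;                       /* j end */
else_block:
    s2 = 0 - s3;                    /* else: sub s2, x0, s3 */
end:
    return s2;
}

/* Usage: equal inputs take the then block, unequal ones the else block. */
void check_if_else(void)
{
    assert(if_else_bne(1, 1, 5) == 5);
    assert(if_else_bne(1, 2, 5) == (uint32_t)-5);
}
```

The branch tests the opposite of the C condition (i != j) so that the then block can simply follow the branch.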

Branching on Conditions other than (Not) Equal
Set Less Than (slt)
slt dst, reg1,reg2
If value in reg1 < value in reg2, dst = 1, else 0
Set Less Than Immediate (slti)
slti dst, reg1,imm
If value in reg1 < imm, dst = 1, else 0

Breaking Down the If Else
C Code:
if(i […]
[…] If value in reg1 >= value in reg2, go to label

Breaking Down the If Else
C Code:
if(i […]

C to RISCV Practice
Fill in lines:
# copy String p to q
# p→s0, q→s1 (pointers)
Loop: lb t0,0(s0) # t0 = *p
sb t0,0(s1) # *q = t0
addi s0,s0,1 # p = p + 1
addi s1,s1,1 # q = q + 1
beq t0,x0,Exit # if *p==0, go to Exit
j Loop # go to Loop
Exit: # N chars in p => N*6 instructions

C to RISCV Practice
Finished code:
# copy String p to q
# p→s0, q→s1 (pointers)
Loop: lb t0,0(s0) # t0 = *p
sb t0,0(s1) # *q = t0
addi s0,s0,1 # p = p + 1
addi s1,s1,1 # q = q + 1
beq t0,x0,Exit # if *p==0, go to Exit
j Loop # go to Loop
Exit: # N chars in p => N*6 instructions

Is this the only way to write out this function? Of course not.

C to RISCV Practice
Alternate code using bne:
# copy String p to q
# p→s0, q→s1 (pointers)
Loop: lb t0,0(s0) # t0 = *p
sb t0,0(s1) # *q = t0
addi s0,s0,1 # p = p + 1
addi s1,s1,1 # q = q + 1
bne t0,x0,Loop # if *p!=0, go to Loop
# N chars in p => N*5 instructions
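The bne loop above has the shape of a C do-while: the body runs once per byte, including the terminating zero, and the test comes last. A minimal sketch with a hypothetical function name; each statement is annotated with the instruction it stands for.

```c
#include <assert.h>

/* p -> s0 (source), q -> s1 (destination); q must have room for the
 * whole string including its terminating zero byte. */
void copy_string(const char *p, char *q)
{
    char t0;

    assert(p != 0 && q != 0);
    do {
        t0 = *p;         /* lb   t0,0(s0)   */
        *q = t0;         /* sb   t0,0(s1)   */
        p = p + 1;       /* addi s0,s0,1    */
        q = q + 1;       /* addi s1,s1,1    */
    } while (t0 != 0);   /* bne  t0,x0,Loop */
}
```

Note that the test uses t0, the byte that was just copied, not the byte p now points to, so the terminator is copied before the loop exits.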

Used fewer instructions!
Instead of “exit when equal to zero”, we are now using “loop when not equal to zero.”

RISCV Arithmetic Instructions
Multiplication (mul and mulh)
mul dst, src1, src2
mulh dst, src1, src2
src1*src2: lower 32-bits through mul, upper 32-bits in mulh
Division (div)
div dst, src1, src2
rem dst, src1, src2
src1/src2: quotient via div, remainder via rem
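The split of a product into mul and mulh can be seen by forming the full 64-bit product in C. A sketch under stated assumptions: the rv_ names are hypothetical, the operands are treated as signed, and the division helpers exclude division by zero and the INT32_MIN / -1 overflow case, which the slides do not discuss.

```c
#include <assert.h>
#include <stdint.h>

/* mul: lower 32 bits of src1*src2 */
uint32_t rv_mul(int32_t src1, int32_t src2)
{
    int64_t product = (int64_t)src1 * (int64_t)src2;
    return (uint32_t)(uint64_t)product;
}

/* mulh: upper 32 bits of the signed 64-bit product */
uint32_t rv_mulh(int32_t src1, int32_t src2)
{
    int64_t product = (int64_t)src1 * (int64_t)src2;
    return (uint32_t)((uint64_t)product >> 32);
}

/* div: quotient, rounded toward zero (as C's / does) */
int32_t rv_div(int32_t src1, int32_t src2)
{
    assert(src2 != 0 && !(src1 == INT32_MIN && src2 == -1));
    return src1 / src2;
}

/* rem: remainder, taking the sign of the dividend (as C's % does) */
int32_t rv_rem(int32_t src1, int32_t src2)
{
    assert(src2 != 0 && !(src1 == INT32_MIN && src2 == -1));
    return src1 % src2;
}
```

For example, 0x10000 * 0x10000 = 2^32 does not fit in 32 bits: rv_mul returns 0 and rv_mulh returns 1.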

RISCV Bitwise Instructions
Note: a→s1, b→s2, c→s3
Instruction              C               RISCV
And                      a = b & c;      and s1,s2,s3
And Immediate            a = b & 0x1;    andi s1,s2,0x1
Or                       a = b | c;      or s1,s2,s3
Or Immediate             a = b | 0x5;    ori s1,s2,0x5
Exclusive Or             a = b ^ c;      xor s1,s2,s3
Exclusive Or Immediate   a = b ^ 0xF;    xori s1,s2,0xF


Shifting Instructions
In binary, shifting an unsigned number left is the same as multiplying by the corresponding power of 2
Shifting operations are faster
Does not work with shifting right/division
Logical shift: Add zeros as you shift
Arithmetic shift: Sign-extend as you shift
Only applies when you shift right (preserves sign)
Shift by immediate or value in a register

Shifting Instructions
Instruction Name             RISCV
Shift Left Logical           sll s1,s2,s3
Shift Left Logical Imm       slli s1,s2,imm
Shift Right Logical          srl s1,s2,s3
Shift Right Logical Imm      srli s1,s2,imm
Shift Right Arithmetic       sra s1,s2,s3
Shift Right Arithmetic Imm   srai s1,s2,imm

When using immediate, only values 0-31 are practical
When using variable, only lowest 5 bits are used (read as unsigned)


Shifting Instructions
# sample calls to shift instructions
addi t0,x0 ,-256 # t0=0xFFFFFF00
slli s0,t0,3 # s0=0xFFFFF800
srli s1,t0,8 # s1=0x00FFFFFF
srai s2,t0,8 # s2=0xFFFFFFFF

addi t1,x0 ,-22 # t1=0xFFFFFFEA
# low 5: 0b01010
sll s3,t0,t1 # s3=0xFFFC0000
# same as slli s3,t0,10
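The sample values above can be reproduced with three small helpers. A sketch with hypothetical function names; the arithmetic right shift is built from a logical shift plus explicit sign fill, because right-shifting a negative signed value is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Only the lowest 5 bits of the shift amount are used (0..31). */
uint32_t sll(uint32_t value, uint32_t amount)
{
    return value << (amount & 31);
}

uint32_t srl(uint32_t value, uint32_t amount)
{
    return value >> (amount & 31);   /* zeros shifted in from the left */
}

uint32_t sra(uint32_t value, uint32_t amount)
{
    uint32_t n = amount & 31;
    uint32_t shifted = value >> n;
    if ((value & 0x80000000u) && n > 0)
        shifted |= ~(0xFFFFFFFFu >> n);  /* copy the sign bit into the top n bits */
    return shifted;
}

/* Usage: the values from the slide, with t0 = 0xFFFFFF00 and t1 = 0xFFFFFFEA. */
void check_shift_samples(void)
{
    uint32_t t0 = 0xFFFFFF00u, t1 = 0xFFFFFFEAu;
    assert(sll(t0, 3) == 0xFFFFF800u);
    assert(srl(t0, 8) == 0x00FFFFFFu);
    assert(sra(t0, 8) == 0xFFFFFFFFu);
    assert(sll(t0, t1) == 0xFFFC0000u);  /* low 5 bits of t1 are 0b01010 = 10 */
}
```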


Shifting Instructions
Example 1:
# lb using lw: lb s1,1(s0)
lw s1,0(s0) # get word
slli s1,s1,16 # move 2nd byte to the top (0xFF00 does not fit in an andi immediate)
srai s1,s1,24 # shift into lowest, sign-extending as lb does


Shifting Instructions
Example 2:
# sb using sw: sb s1,3(s0)
lw t0,0(s0) # get current word
slli t0,t0,8 # shift the top byte out (0xFFFFFF does not fit in an andi immediate)
srli t0,t0,8 # shift back: top byte is now zero
slli t1,s1,24 # shift into highest
or t0,t0,t1 # combine
sw t0,0(s0) # store back
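What the two word-based sequences in Examples 1 and 2 compute can be written out on plain 32-bit values. A sketch with hypothetical function names, assuming the little-endian layout used throughout (byte 1 is bits 15..8, byte 3 is bits 31..24); the sign extension is spelled out with bit operations.

```c
#include <assert.h>
#include <stdint.h>

/* Example 1, lb s1,1(s0): select byte 1 of the word and sign-extend it. */
uint32_t load_byte1_from_word(uint32_t word)
{
    uint32_t top = word << 16;       /* byte 1 is now in bits 31..24 */
    uint32_t result = top >> 24;     /* bring it down to bits 7..0 */
    if (result & 0x80)
        result |= 0xFFFFFF00u;       /* sign-extend, as lb does */
    return result;
}

/* Example 2, sb s1,3(s0): replace byte 3 of the word with the low byte of s1. */
uint32_t store_byte3_into_word(uint32_t word, uint32_t s1)
{
    uint32_t kept = (word << 8) >> 8;   /* zero the top byte */
    uint32_t moved = s1 << 24;          /* low byte of s1 into bits 31..24 */
    return kept | moved;                /* combine */
}
```

With the word 0x11228344, load_byte1_from_word returns 0xFFFFFF83; with the word 0x11223344 and s1 = 0xAB, store_byte3_into_word returns 0xAB223344.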


Shifting Instructions
Extra for Experts:
Rewrite the two preceding examples to be more general
Assume that the byte offset (e.g. 1 and 3 in the examples, respectively) is contained in s2
Hint:
The variable shift instructions will come in handy
Remember, the offset can be negative

Switch Statement Example
Multiple case labels
Here: 5 & 6
Fall through cases
Here: 2
Missing cases
Here: 4

Implemented with:
Jump table
Indirect jump instruction
long switch_ex
(long x, long y, long z)
{
long w = 1;
switch (x) {
case 1:
w = y*z;
break;
case 2:
w = y/z;
/* Fall Through */
case 3:
w += z;
break;
case 5:
case 6:
w -= z;
break;
default:
w = 2;
}
return w;
}


Jump Table Structure
Switch Form
switch (x) {
case val_0:
Block 0
case val_1:
Block 1
• • •
case val_n-1:
Block n–1
}
Jump Table
JTab: Targ0, Targ1, Targ2, …, Targn-1
Jump Targets
Targ0: Code Block 0
Targ1: Code Block 1
Targ2: Code Block 2
…
Targn-1: Code Block n–1
Approximate Translation
target = JTab[x];
goto target;
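The table lookup can be imitated in portable C for the switch_ex function shown earlier. This is only a sketch: standard C has no indirect goto, so an enum of block ids (hypothetical names) stands in for the jump targets and an inner switch stands in for "goto target"; the table has one entry per value of x from 0 to 6, with the missing cases 0 and 4 pointing at default.

```c
#include <assert.h>

/* Hypothetical ids for the code blocks of switch_ex. */
enum block { BLK_DEFAULT, BLK_1, BLK_2, BLK_3, BLK_56 };

/* JTab: indexed by x, one entry for each of 0..6. */
static const enum block jtab[7] = {
    BLK_DEFAULT,  /* x = 0 (missing case) */
    BLK_1,        /* x = 1 */
    BLK_2,        /* x = 2 */
    BLK_3,        /* x = 3 */
    BLK_DEFAULT,  /* x = 4 (missing case) */
    BLK_56,       /* x = 5 */
    BLK_56        /* x = 6 */
};

long switch_ex_table(long x, long y, long z)
{
    long w = 1;
    /* One unsigned comparison rejects both x > 6 and negative x. */
    enum block target = ((unsigned long)x <= 6) ? jtab[x] : BLK_DEFAULT;

    switch (target) {                 /* stands in for "goto target" */
    case BLK_1:  w = y * z; break;
    case BLK_2:  w = y / z;           /* fall through */
    case BLK_3:  w += z;    break;
    case BLK_56: w -= z;    break;
    default:     w = 2;
    }
    return w;
}
```

The point of the table is that selecting the block costs one bounds check and one indexed load, regardless of how many cases there are.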


Jump Table Structure
C code:
switch (x) {
case 1:
break;
case 2:
case 3:
break;
case 5:
case 6:
break;
default:
}
[Figure: jump table in memory with entries 0 to 6, each pointing to a code block]
Use the jump table when x <= 6:
if (x <= 6) {
target = JTab[x];
goto target;
} else
goto default;

Jump Table
Jump table:
.section .rodata # declaring data, not instructions (.rodata is the read-only data section)
.align 8 # 8-byte memory alignment
.L4:
.quad .L8 # x = 0 (.quad: this data is 64 bits wide)
.quad .L3 # x = 1
.quad .L5 # x = 2
.quad .L9 # x = 3
.quad .L8 # x = 4
.quad .L7 # x = 5
.quad .L7 # x = 6
switch(x) {
case 1: // .L3
w = y*z;
break;
case 2: // .L5
w = y/z;
/* Fall Through */
case 3: // .L9
w += z;
break;
case 5:
case 6: // .L7
w -= z;
break;
default: // .L8
w = 2;
}

Handling Fall-Through
long w = 1;
. . .
switch (x) {
. . .
case 2: // .L5
w = y/z;
/* Fall Through */
case 3: // .L9
w += z;
break;
. . .
}
Translated form:
case 2:
w = y/z;
goto merge;
case 3:
w = 1;
merge:
w += z;
More complicated choice than “just fall-through” forced by “migration” of w = 1;
Example compilation trade-off