Digital System Design 4

Lecture 9 – Instruction Sets 3

Computer Architecture

Dr Chang Liu

Course Outline

Week  Lecture  Topic                                    Chapter  Tutorial
1     1        Introduction
1     2        A Historical Perspective
2     3        Modern Technology and Types of Computer
2     4        Computer Performance                     1
3     5        Digital Logic Review                     C
3     6        Instruction Set Architecture 1           2
4     7        Instruction Set Architecture 2           2
4     8        Processor Architecture 1                 4
5     9        Instruction Set Architecture 3           2
5     10       Processor Architecture 2                 4
               Festival of Creative Learning
6     11       Processor Architecture 3                 4
6     12       Processor Architecture 4                 4

Instruction Sets 3 – Chang Liu

This Lecture

• Addressing Modes

• RISC vs CISC


MIPS-32 ISA
• Instruction Categories

– Computational

– Load/Store

– Jump and Branch

– Floating Point
• coprocessor

– Memory Management

– Special

Registers: R0 – R31, PC, HI, LO

3 Instruction Formats: all 32 bits wide

R format: op | rs | rt | rd | sa | funct
I format: op | rs | rt | immediate
J format: op | jump target

Irwin, PSU, 2008
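
As a rough illustration of the three formats, the Python sketch below carves one 32-bit instruction word into the fields of each format. The helper name `decode_fields` and the dictionary layout are assumptions made for this example. The field widths used are the standard MIPS-32 ones: 6/5/5/5/5/6 bits for R, 6/5/5/16 for I, and 6/26 for J.

```python
def decode_fields(word: int) -> dict:
    """Split a 32-bit MIPS instruction word into the fields of each format.

    Hypothetical helper for illustration only: it does not decide which
    format applies, it just shows how the same 32 bits are divided up.
    """
    word &= 0xFFFFFFFF
    op = word >> 26              # bits 31..26, shared by all three formats
    rs = (word >> 21) & 0x1F     # bits 25..21
    rt = (word >> 16) & 0x1F     # bits 20..16
    return {
        "R": {"op": op, "rs": rs, "rt": rt,
              "rd": (word >> 11) & 0x1F,   # bits 15..11
              "sa": (word >> 6) & 0x1F,    # bits 10..6 (shift amount)
              "funct": word & 0x3F},       # bits 5..0
        "I": {"op": op, "rs": rs, "rt": rt, "immediate": word & 0xFFFF},
        "J": {"op": op, "target": word & 0x03FFFFFF},
    }


# add $t1, $t1, $s6 -> op=0, rs=9, rt=22, rd=9, sa=0, funct=32
add_word = (0 << 26) | (9 << 21) | (22 << 16) | (9 << 11) | (0 << 6) | 32
print(decode_fields(add_word)["R"])
```

Which of the three views is meaningful depends on the opcode, so a real decoder would inspect `op` first.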

Register Usage

• $a0 – $a3: arguments (reg’s 4 – 7)

• $v0, $v1: result values (reg’s 2 and 3)

• $t0 – $t9: temporaries
– Can be overwritten by callee

• $s0 – $s7: saved
– Must be saved/restored by callee

• $gp: global pointer for static data (reg 28)

• $sp: stack pointer (reg 29)

• $fp: frame pointer (reg 30)

• $ra: return address (reg 31)


Memory Addresses

Actual MIPS memory addresses and contents of memory for those words. The changed addresses are highlighted to contrast with the previous figure. Since MIPS addresses each byte, word addresses are multiples of 4: there are 4 bytes in a word.
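
The relationship between word numbers and byte addresses can be sketched in a few lines of Python. The names `word_address` and `is_word_aligned` are hypothetical helpers introduced for this example.

```python
BYTES_PER_WORD = 4  # a MIPS word is 32 bits, i.e. 4 bytes


def word_address(word_index: int) -> int:
    """Byte address of the word with the given index (word 0 is at address 0)."""
    return word_index * BYTES_PER_WORD


def is_word_aligned(byte_address: int) -> bool:
    """True if the byte address is a multiple of 4, so it can hold an aligned word."""
    return byte_address % BYTES_PER_WORD == 0


print([word_address(i) for i in range(4)])
```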


Memory Layout

• Text: program code

• Static data: global variables
– e.g., static variables in C, constant arrays and strings

– $gp initialized to address allowing ±offsets into this segment

• Dynamic data: heap
– E.g., malloc in C, new in Java

• Stack: automatic storage


Branch Addressing

• Branch instructions specify

– Opcode, two registers, target address

• Most branch targets are near branch

– Forward or backward

  op      rs      rt      constant or address
  6 bits  5 bits  5 bits  16 bits

• PC-relative addressing

– Target address = PC + offset × 4

– PC already incremented by 4 by this time
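
A minimal Python sketch of the PC-relative calculation follows. The function names are assumptions for illustration. The 16-bit field is treated as a signed word offset, which is what lets branches go backward as well as forward.

```python
def sign_extend_16(value: int) -> int:
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(branch_pc: int, offset_field: int) -> int:
    """Target = (PC + 4) + offset * 4, where branch_pc is the branch's own address."""
    pc_plus_4 = branch_pc + 4          # PC has already been incremented
    return (pc_plus_4 + sign_extend_16(offset_field) * 4) & 0xFFFFFFFF


# A branch at address 80012 with offset field 2 skips two instructions past PC + 4.
print(branch_target(80012, 2))
```

A negative offset is stored in two's complement, so an offset field of 0xFFFA (that is, -6) on a branch at 80020 would lead back to 80000.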


Jump Addressing

• Jump (j and jal) targets could be anywhere
in text segment

– Encode full address in instruction

  op      address
  6 bits  26 bits

• (Pseudo)Direct jump addressing

– Target address = PC[31..28] : (address × 4)
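
The concatenation can be sketched in Python as below. The function name is hypothetical, and the sketch assumes the upper four bits are taken from the already-incremented PC (PC + 4), as for branches.

```python
def jump_target(jump_pc: int, address_field: int) -> int:
    """Pseudo-direct target: top 4 bits of PC + 4, then the 26-bit field times 4."""
    pc_plus_4 = (jump_pc + 4) & 0xFFFFFFFF
    upper_bits = pc_plus_4 & 0xF0000000               # PC[31..28]
    lower_bits = (address_field & 0x03FFFFFF) << 2    # address x 4 fills bits 27..0
    return upper_bits | lower_bits


# A jump at address 80020 with address field 20000 goes to 20000 * 4.
print(jump_target(80020, 20000))
```

Because only 28 bits come from the instruction, a jump can only reach targets within the same 256 MB region as the instruction that follows it.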


Target Addressing Example

• Loop code from earlier example

– Assume Loop at location 80000

                                Address  Fields
Loop: sll  $t1, $s3, 2          80000    0   0   19  9   2   0
      add  $t1, $t1, $s6        80004    0   9   22  9   0   32
      lw   $t0, 0($t1)          80008    35  9   8   0
      bne  $t0, $s5, Exit       80012    5   8   21  2
      addi $s3, $s3, 1          80016    8   19  19  1
      j    Loop                 80020    2   20000
Exit: …                         80024
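
To see how the field values in this example pack into 32-bit words, here is a hedged Python sketch. The `encode_r`, `encode_i` and `encode_j` helpers are names made up for this illustration. The field values are the ones listed in the example.

```python
def encode_r(op, rs, rt, rd, sa, funct):
    """Pack R-format fields (6/5/5/5/5/6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (sa << 6) | funct


def encode_i(op, rs, rt, immediate):
    """Pack I-format fields (6/5/5/16 bits); the immediate is truncated to 16 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


def encode_j(op, address):
    """Pack J-format fields (6/26 bits)."""
    return (op << 26) | (address & 0x03FFFFFF)


loop = [
    encode_r(0, 0, 19, 9, 2, 0),    # sll  $t1, $s3, 2
    encode_r(0, 9, 22, 9, 0, 32),   # add  $t1, $t1, $s6
    encode_i(35, 9, 8, 0),          # lw   $t0, 0($t1)
    encode_i(5, 8, 21, 2),          # bne  $t0, $s5, Exit
    encode_i(8, 19, 19, 1),         # addi $s3, $s3, 1
    encode_j(2, 20000),             # j    Loop
]

for address, word in zip(range(80000, 80024, 4), loop):
    print(f"{address}: 0x{word:08X}")
```

The bne offset of 2 counts instructions from 80016 (PC + 4), giving 80016 + 2 × 4 = 80024, and the jump field 20000 is the word address of Loop, 80000 / 4.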


Branching Far Away

• If branch target is too far to encode with 16-bit offset, assembler rewrites the code

• Example

        beq $s0, $s1, L1

  becomes

        bne $s0, $s1, L2
        j   L1
  L2:   …
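
The decision the assembler has to make can be sketched in Python as follows. Both function names are hypothetical, and the rewrite shown handles only the beq case from the example, not a full assembler pass.

```python
def fits_branch_offset(branch_pc: int, target: int) -> bool:
    """True if target is reachable with a signed 16-bit word offset from PC + 4."""
    delta = target - (branch_pc + 4)
    if delta % 4 != 0:
        raise ValueError("branch targets must be word aligned")
    return -32768 <= delta // 4 <= 32767


def expand_far_beq(rs: str, rt: str, target_label: str, skip_label: str) -> list:
    """Replace an out-of-range beq by an inverted branch around a jump."""
    return [
        f"bne {rs}, {rt}, {skip_label}",   # skip the jump when the registers differ
        f"j {target_label}",               # the jump reaches the far target
        f"{skip_label}:",
    ]


print(fits_branch_offset(80012, 80024))
print(expand_far_beq("$s0", "$s1", "L1", "L2"))
```

The inverted condition matters: the jump must execute exactly when the original beq would have been taken.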


Addressing Mode Summary


Pseudoinstructions
• mov $rt, $rs Copy contents of register s to register t, i.e. R[t] = R[s].

• li $rs, immed Load immediate into register s, i.e. R[s] = immed. The way this is translated depends on whether immed is 16 bits or 32 bits.

• la $rs, addr Load address into register s, i.e. R[s] = addr.

• lw $rt, big($rs) Load a word from memory using a 32-bit offset (called big). Notice that this is normally not allowed, because only 16-bit offsets are permitted.

• Similar pseudo-instructions exist for sw, etc.

http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/Mips/pseudo.html


Pseudoinstructions

Pseudoinstruction      Translation

mov $rt, $rs           addi $rt, $rs, 0

li $rs, small          addi $rs, $0, small

li $rs, big            lui $rs, upper( big )
                       ori $rs, $rs, lower( big )

la $rs, big            lui $rs, upper( big )
                       ori $rs, $rs, lower( big )

lw $rt, big($rs)       lui $t0, upper( big )
                       ori $t0, $t0, lower( big )
                       add $t0, $rs, $t0
                       lw $rt, 0($t0)
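
The lui/ori pair works because a 32-bit constant splits cleanly into two 16-bit halves. The Python sketch below models that split. The functions `upper` and `lower` mirror the notation in the table, while `lui` and `ori` are simplified models of the instructions written for this example, not a simulator.

```python
def upper(value: int) -> int:
    """Bits 31..16 of a 32-bit value."""
    return (value >> 16) & 0xFFFF


def lower(value: int) -> int:
    """Bits 15..0 of a 32-bit value."""
    return value & 0xFFFF


def lui(imm16: int) -> int:
    """Model of lui: place the immediate in the upper half, zero the lower half."""
    return (imm16 & 0xFFFF) << 16


def ori(register: int, imm16: int) -> int:
    """Model of ori: OR with the zero-extended 16-bit immediate."""
    return register | (imm16 & 0xFFFF)


big = 0x12345678
rebuilt = ori(lui(upper(big)), lower(big))
print(hex(rebuilt))
```

Since ori zero-extends its immediate, the reconstruction also works when bit 15 of the constant is set, for example with 0xDEADBEEF. A sign-extending addi would corrupt the upper half in that case.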


ARM & MIPS Similarities
• ARM: the most popular embedded core
• Similar basic set of instructions to MIPS

                       ARM            MIPS
Date announced         1985           1985
Instruction size       32 bits        32 bits
Address space          32-bit flat    32-bit flat
Data alignment         Aligned        Aligned
Data addressing modes  9              3
Registers              15 × 32-bit    31 × 32-bit
Input/output           Memory mapped  Memory mapped


Compare and Branch in ARM

• Uses condition codes for result of an
arithmetic/logical instruction

– Negative, zero, carry, overflow

– Compare instructions to set condition codes
without keeping the result

• Each instruction can be conditional

– Top 4 bits of instruction word: condition value

– Can avoid branches over single instructions


Instruction Encoding


RISC vs CISC Examples

RISC

• ARM

• MIPS

• Xilinx Microblaze

• (Pretty much everything)

CISC

• Intel x86

• Other obsolete stuff.. (VAX)

• Newer Intel chips use a
RISC-like microcode
anyway…


Register Model

Register-Register
• Common for RISC
• All data operands must be in CPU registers
• Needs separate instruction to copy data from memory into register
• Simplifies allowed instruction formats
• (Need to spend more of memory bandwidth on fetching instructions)
• May be easier to reorder instructions
• Needs lots of general purpose registers
• (Also called Load-store)

Register-Memory
• Common for CISC
• Can mix data in registers and memory
• Makes instruction format much more complicated
• Need fewer instructions to do the same work
• (Makes better use of instruction caches / memory bandwidth)
• Makes repeated use of a few (special purpose) registers


Basic x86 Registers


Basic x86 Addressing Modes
• Two operands per instruction

  Source/dest operand   Second source operand
  Register              Register
  Register              Immediate
  Register              Memory
  Memory                Register
  Memory                Immediate

• Memory addressing modes

– Address in register

– Address = Rbase + displacement

– Address = Rbase + 2^scale × Rindex (scale = 0, 1, 2, or 3)

– Address = Rbase + 2^scale × Rindex + displacement
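
All four memory addressing modes are special cases of one formula, which the following Python sketch makes explicit. The function `effective_address` and its parameter names are assumptions for illustration, and the result is wrapped to 32 bits to model a 32-bit address space.

```python
def effective_address(base: int, index: int = 0, scale: int = 0,
                      displacement: int = 0) -> int:
    """Address = Rbase + 2**scale * Rindex + displacement, wrapped to 32 bits."""
    if scale not in (0, 1, 2, 3):
        raise ValueError("scale must be 0, 1, 2, or 3")
    return (base + (index << scale) + displacement) & 0xFFFFFFFF


# Element 3 of an array of 4-byte items that starts 8 bytes past the base register.
print(hex(effective_address(0x1000, index=3, scale=2, displacement=8)))
```

Leaving `index` and `displacement` at zero gives the plain address-in-register mode. A scale of 2 multiplies the index by 4, which suits arrays of 32-bit words.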


Instruction Encoding

Fixed Length Instructions

• Inefficient for simple
instructions (like RISC)

• Makes some addressing
modes impossible

• Simplifies instruction
decoding

• Makes worse use of memory
bandwidth

Variable Length Instructions

• More efficient for simple
instructions

• Required for complex
instructions / addressing
modes

• Complicates instruction
decoding

• May require microcode…


Implementing IA-32 (i386)

• Complex instruction set makes
implementation difficult

– Hardware translates instructions to simpler
microoperations

• Simple instructions: 1–1

• Complex instructions: 1–many

– Microengine similar to RISC

– Market share makes this economically viable

• Comparable performance to RISC

– Compilers avoid complex instructions

Fallacies

• Powerful instruction ⇒ higher performance

– Fewer instructions required

– But complex instructions are hard to implement
• May slow down all instructions, including simple ones

– Compilers are good at making fast code from simple
instructions

• Use assembly code for high performance

– But modern compilers are better at dealing with modern
processors

– More lines of code more errors and less productivity


Fallacies

• Backward compatibility ⇒ instruction set doesn’t change

– But they do accrete more instructions

x86 instruction set


Next Lecture

• Putting it all together

• Datapath & Control Signals

• How it Works (for various instructions)
