More Memory Instructions
• MIPS supports 8-bit (byte) loads and stores. These are useful for text processing (ASCII characters). The rightmost 8 bits of the register are used:
lb $t0, 0($s0) # copy 8-bit
sb $t0, 0($s1)
• MIPS supports 16-bit (half-word) loads and stores:
lh $t0, 0($s0) # copy 16-bits
sh $t0, 0($s1)
Core Memory Instructions
Instruction       MIPS Example
Load word         lw $s1,4($t0)
Store word        sw $s1,4($t0)
Load half-word    lh $s1,4($t0)
Store half-word   sh $s1,4($t0)
Load byte         lb $s1,4($t0)
Store byte        sb $s1,4($t0)
Design Example
• Write a program that copies a null-terminated string of bytes from source array to a destination array
• The program should use a loop.
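One possible solution is sketched below. It assumes the MARS/SPIM simulator (the .data/.text directives and exit service 10), and the labels src, dst and Copy, the sample string and the 16-byte destination buffer are illustrative choices rather than part of the exercise.

```mips
        .data
src:    .asciiz "Hello"         # source string, null-terminated by .asciiz
dst:    .space  16              # destination, assumed large enough for src plus the null

        .text
main:   la   $s0, src           # $s0 = address of the next source byte
        la   $s1, dst           # $s1 = address of the next destination byte
Copy:   lb   $t0, 0($s0)        # load one byte from the source
        sb   $t0, 0($s1)        # store it to the destination
        addi $s0, $s0, 1        # advance both pointers by one byte
        addi $s1, $s1, 1
        bne  $t0, $zero, Copy   # loop until the null terminator has been copied

        li   $v0, 10            # service 10: exit
        syscall
```

Because the byte is stored before it is tested, the null terminator itself is copied, so the destination is also a valid null-terminated string.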
Procedures
• A procedure (function or routine) allows a programmer to encapsulate specific tasks. This aids clarity and reduces code size.
• There is a special instruction (jump and link) which simultaneously jumps to the subroutine and saves the address of the following instruction (PC+4) in the return address register ($ra):
jal ProcedureAddress
• When the procedure is complete, a jump register instruction (jr $ra) is used to load the return address into the program counter.
Procedures
• MIPS defines special registers for procedure calls:
» $a0 – $a3 : argument registers to pass parameters
» $v0 – $v1 : value registers to return results
» $ra : return address
• The values in temporary registers need not be preserved by the called (callee) procedure:
»$t0 – $t9 : temporary
• The values in saved registers must be preserved:
» $s0 – $s7 : saved registers
• If the temporary registers are all used, extra data must be spilled to memory.
• The stack provides hardware support for spilling.
Procedures
         li   $a1, 1
         jal  CalcSum
         add  …
         syscall

CalcSum: add  $v0, $a0, $a1
         jr   $ra
Procedures
• The stack is a Last-In First-Out (LIFO) data structure.
• The Stack Pointer ($sp register) holds the address of the top of the stack (the last saved item). The stack grows downward, toward lower addresses.
addi $sp, $sp, -12      # move sp down
sw   $s2, 8($sp)        # push $s2
sw   $s1, 4($sp)        # push $s1
sw   $s0, 0($sp)        # push $s0
                        # regs available
lw   $s0, 0($sp)        # pop $s0
lw   $s1, 4($sp)        # pop $s1
lw   $s2, 8($sp)        # pop $s2
addi $sp, $sp, 12       # move sp back up
Procedures
• Note, the Stack Pointer and the Stack above the Stack Pointer must be preserved during a procedure call.
• When using a nested procedure call, the return address ($ra) must also be preserved. The simplest way is to push its contents to the stack.
Procedures
             jal PrintString
             …
             syscall

PrintString: # push $ra
             …
             jal PrintChar
             …

PrintChar:   …
             …
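A fuller version of this nested call might look like the sketch below. It assumes the MARS/SPIM services 11 (print character) and 10 (exit); the bodies of PrintString and PrintChar, the msg string and the PSLoop/PSDone labels are illustrative and not taken from the slide.

```mips
        .data
msg:    .asciiz "Hi\n"

        .text
main:   la   $a0, msg           # argument: address of the string
        jal  PrintString
        li   $v0, 10            # service 10: exit
        syscall

# PrintString: prints the null-terminated string whose address is in $a0
PrintString:
        addi $sp, $sp, -8       # make room for two words
        sw   $ra, 4($sp)        # push $ra (the nested jal will overwrite it)
        sw   $s0, 0($sp)        # push $s0 (saved register, must be preserved)
        add  $s0, $a0, $zero    # $s0 walks along the string
PSLoop: lb   $a0, 0($s0)        # next character
        beq  $a0, $zero, PSDone # stop at the null terminator
        jal  PrintChar          # nested call: overwrites $ra
        addi $s0, $s0, 1
        j    PSLoop
PSDone: lw   $s0, 0($sp)        # pop $s0
        lw   $ra, 4($sp)        # pop $ra
        addi $sp, $sp, 8        # move sp back up
        jr   $ra                # return to main

# PrintChar: prints the character in $a0
PrintChar:
        li   $v0, 11            # service 11: print character
        syscall
        jr   $ra
```

PrintChar is a leaf procedure (it calls nothing else), so it does not need to save $ra.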
Memory Model
• Text segment contains program machine instructions.
• Static data segment contains constants and other static variables.
• Dynamic data segment is used for dynamically allocated data structures (the heap), e.g. memory obtained with Java new and C malloc.
[Figure: memory layout. $sp points to the top of the stack; the dynamic data and static data segments lie below it.]
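Name          Number     Use
$zero         0          Constant zero
$at           1          Assembler temporary
$v0 – $v1     2 – 3      Result values
$a0 – $a3     4 – 7      Arguments
$t0 – $t7     8 – 15     Temporary
$s0 – $s7     16 – 23    Saved
$t8 – $t9     24 – 25    More temporary
$k0 – $k1     26 – 27    Reserved for OS kernel
$gp           28         Global pointer
$sp           29         Stack pointer
$fp           30         Frame pointer
$ra           31         Return address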
• You can refer to registers by number in assembly instructions…
lw $t1, 8($t0)
Beware of typos
What I meant:                    lw $t2, 8($t0)
What I typed:                    lw $2, 8($t0)
What the assembler assembled:    lw $v0, 8($t0)
• A library of system service functions (system calls) is provided with the MIPS simulator.
• To use one:
1. Load the service number into $v0.
2. Load the argument values (if any) into $a0, etc.
3. Issue syscall instruction.
4. Retrieve return value (if any).
li   $v0, 1             # service 1 is print integer
add  $a0, $t0, $zero    # load desired value
syscall
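The four steps, including retrieving a return value, can be seen together in the short sketch below. It assumes the MARS/SPIM service numbers 5 (read integer, result in $v0), 1 (print integer) and 10 (exit).

```mips
        .text
main:   li   $v0, 5             # step 1: service 5 is read integer (no arguments)
        syscall                 # step 3: issue the call
        add  $a0, $v0, $zero    # step 4: the integer typed by the user is returned in $v0

        li   $v0, 1             # service 1 is print integer, argument in $a0
        syscall

        li   $v0, 10            # service 10 is exit
        syscall
```

The return value is copied out of $v0 straight away because $v0 is about to be reused for the next service number.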
Character Mapped Displays
• ASCII code stored in memory for every character position
• Hardware table stores pixel on/off pattern for every ASCII code
• Text only screen
• Low memory requirements, monochrome
Bit Mapped Displays
• Grayscale: 1 integer controlling intensity stored for every pixel
• With 8 bits:
  • 0 = BLACK
  • 255 = WHITE
• Colour: 3 integers stored for every pixel: Red, Green, Blue intensity
• (0,0,0)=BLACK
• (255,255,255)=WHITE
• The number of pixels determines the resolution of the image.
• The number of bits determines the number of colour levels in the image.
Bit Mapped Displays
• MIPS Bitmap Display is organised in 'units'.
• Every word in memory controls one 'unit'.
• The programmer can set the height and width of a 'unit' in terms of pixels.
• The display memory is organised left-to-right, top-to-bottom.
• 8 bits per colour.
[Figure: layout of the display memory, starting at the base address for display.]
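As a sketch of how a program might colour one unit, the code below computes the word address of the unit at column x, row y and stores a colour there. Everything specific is an assumption: the MARS Bitmap Display tool, a base address of 0x10008000, a display that is 64 units wide, and a colour word laid out as 0x00RRGGBB. Adjust these to match the settings chosen in the tool.

```mips
        .text
main:   li   $t0, 0x10008000    # assumed base address for display
        li   $t1, 10            # x: column of the unit (0 = leftmost)
        li   $t2, 5             # y: row of the unit (0 = top)
        li   $t3, 64            # assumed display width in units

        mult $t2, $t3           # y * width: units in the rows above
        mflo $t4
        add  $t4, $t4, $t1      # + x: index of the unit, left-to-right, top-to-bottom
        sll  $t4, $t4, 2        # * 4: one word (4 bytes) per unit
        add  $t4, $t4, $t0      # + base address

        li   $t5, 0x00ff0000    # assumed 0x00RRGGBB layout: full red, no green, no blue
        sw   $t5, 0($t4)        # colour the unit

        li   $v0, 10            # service 10: exit
        syscall
```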
MIPS32 with Polled IO
[Figure: block diagram of MIPS32 with polled IO, showing main memory on the address and data buses, the instruction path, the PC and the Control Unit.]
Polled IO: Keyboard & Display
Register                              Address       Bits used
Receiver Control Register (RCR)       0xffff0000    Ready bit (least significant bit)
Receiver Data Register (RDR)          0xffff0004    Key press ASCII code (least significant byte)
Transmitter Control Register (TCR)    0xffff0008    Ready bit (least significant bit)
Transmitter Data Register (TDR)       0xffff000c    Display ASCII code (least significant byte)
Polled IO: Keyboard
[Figure: the RCR and RDR appear to the CPU as main memory locations on the address and data buses.]
Producer (keyboard hardware):
1. User presses a key
2. Stores the ASCII code for the key in the RDR
3. Sets the RCR Ready bit to 1
Consumer program (CPU):
1. Repeatedly reads the RCR until the Ready bit equals 1
2. Reads the ASCII code from the RDR
3. This automatically resets the RCR Ready bit to 0
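The consumer side can be sketched as a short polling loop. It assumes the register addresses listed earlier (RCR at 0xffff0000, RDR at 0xffff0004) and, to run it, a simulator that provides them, such as the MARS Keyboard and Display MMIO tool; the label WaitKey is illustrative.

```mips
        .text
main:   lui  $t0, 0xffff        # $t0 = 0xffff0000, the address of the RCR
WaitKey:
        lw   $t1, 0($t0)        # read the RCR
        andi $t1, $t1, 1        # isolate the Ready bit (least significant bit)
        beq  $t1, $zero, WaitKey # Ready bit is 0: keep polling
        lw   $s0, 4($t0)        # read the ASCII code from the RDR (this resets Ready)

        li   $v0, 10            # service 10: exit
        syscall
```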
Polled IO: Display
[Figure: the TCR and TDR appear to the CPU as main memory locations on the address and data buses.]
Producer (CPU):
1. Repeatedly reads the TCR until the Ready bit equals 1
2. Stores the ASCII code for the character to display in the TDR
3. This automatically resets the TCR Ready bit to 0
Consumer program (display hardware):
1. Waits until the TCR Ready bit equals 0
2. Reads the TDR and displays the character
3. Sets the TCR Ready bit to 1
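The producer side is the mirror image of the keyboard loop. The sketch below assumes the TCR at 0xffff0008 and the TDR at 0xffff000c, as in the register table, and a simulator that provides them (for example the MARS Keyboard and Display MMIO tool); the label WaitDisplay and the choice of character are illustrative.

```mips
        .text
main:   lui  $t0, 0xffff        # $t0 = 0xffff0000, base address of the IO registers
        li   $a0, 65            # ASCII code to display (65 is 'A')
WaitDisplay:
        lw   $t1, 8($t0)        # read the TCR
        andi $t1, $t1, 1        # isolate the Ready bit
        beq  $t1, $zero, WaitDisplay # display still busy: keep polling
        sw   $a0, 12($t0)       # store the character in the TDR (this resets Ready)

        li   $v0, 10            # service 10: exit
        syscall
```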
Interrupt-driven IO
• Polled IO isn't great because the CPU wastes so much time in loops, checking the Ready bit.
• Interrupt-driven IO solves this problem…
• "An interrupt is an unscheduled procedure call"
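MIPS32 with Interrupt-Driven IO
[Figure: block diagram of MIPS32 with interrupt-driven IO, showing main memory and the IO devices on the address and data buses, the instruction path, the PC, the Control Unit and co-processor zero.]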
MIPS32 with Interrupt-Driven IO
• “A co-processor is an independent processor designed to supplement the capabilities of the primary processor.”
• A co-processor often contains registers and can execute its own instructions, from a very limited instruction set.
• A co-processor might have its own, independent memory.
• Most co-processors are designed to accelerate specific tasks by allowing the programmer to offload the tasks from the main processor to the co-processor.
• Two hardware connections are added between the MIPS CPU and all IO devices:
• Interrupt Request (IRQ)
• Interrupt Acknowledge (IACK)
• A coprocessor (CP0) is added containing an additional 32 registers, including Exception Program Counter ($14), Status ($12), Cause ($13). Special instructions are added to allow the CPU to access the new registers in CP0:
mfc0 $t0, $13    # move data from CP0 Cause (CP0 register $13) to CPU register $t0
mtc0 $t1, $12    # move data to CP0 Status (CP0 register $12) from CPU register $t1
• When a device needs attention, it:
  • Sets IRQ to 1
  • Sets the bit corresponding to the device in the Cause register
• When the CPU detects IRQ equal to 1, it finishes executing the current instruction but, instead of processing the next instruction, it:
  • Saves the address of the next instruction in the Exception Program Counter (EPC)
  • Disables all interrupts by resetting the Interrupt Enable (IE) bit in the Status register
  • Jumps to the starting address of the Interrupt Service Routine (ISR)
• The ISR:
  • Checks the Cause register to see which bit is set
  • Deals with that device, i.e. sends or receives information, via memory accesses
  • Sets IACK to 1 for 1 clock period
  • Enables all interrupts
  • Finally, jumps back to the address stored in EPC
• At the same time, the IO device:
  • Detects IACK==1 and resets IRQ to 0
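The sequence above describes the hardware in general terms. As a rough sketch of what an ISR can look like in practice, the code below targets the MARS simulator and relies on several MARS/SPIM conventions that are assumptions here, not part of the slides: the ISR starts at address 0x80000180, setting bit 1 of the RCR enables keyboard interrupts, an exception code of 0 in bits 2 to 6 of Cause means "device interrupt", and eret returns to the address in EPC and re-enables interrupts. Only $k0 and $k1 (reserved for the OS kernel) are used, so no other registers need saving.

```mips
        .text
main:   lui   $t0, 0xffff
        li    $t1, 2
        sw    $t1, 0($t0)       # assumption (MARS): bit 1 of the RCR enables keyboard interrupts
Idle:   j     Idle              # stand-in for useful work; no polling loop is needed

        .ktext 0x80000180       # assumption (MARS/SPIM): fixed starting address of the ISR
        mfc0  $k0, $13          # read the Cause register
        andi  $k0, $k0, 0x7c    # keep the exception code (bits 2 to 6); 0 means interrupt
        bne   $k0, $zero, Skip  # not an interrupt: some other exception
        lui   $k0, 0xffff
        lw    $k1, 4($k0)       # deal with the device: read the key from the RDR into $k1
        eret                    # jump back to the address stored in EPC
Skip:   mfc0  $k0, $14          # other exception: EPC holds the faulting instruction,
        addiu $k0, $k0, 4       # so step past it before returning
        mtc0  $k0, $14
        eret
```

A real handler would do something with the key (for example, place it in a buffer) and would save and restore any additional registers it uses.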
Interrupt-driven IO
• MIPS interrupt-driven IO uses polled exception handling, meaning that the processor must check the Cause register to find out which device needs attention.
• A faster approach used in other processors is vectored exception handling, meaning that there is an ISR for every IO device. A table in memory gives the mapping between the IO device's interrupt number and the ISR starting address.
Some More Math
Signed Instructions
• Normally, negative numbers are stored in register or memory words in two’s complement format.
• The add and addi instructions work fine for positive and two’s complement numbers. These are called the signed addition instructions.
Signed Instructions
• The signed math instructions (add, addi) generate an exception when there is overflow.
• “Overflow occurs when the result of an operation is larger than can be represented in a given register or memory location.” E.g.
• in 32-bit 2s complement, the maximum positive value is +2^31 - 1 (0111…111 in binary) and the most negative value is -2^31 (100…000 in binary)
Signed Instructions
• Overflow example
  0111_1111_1111_1111_1111_1111_1111_1111 (binary)    max positive
+ 0000_0000_0000_0000_0000_0000_0000_0001 (binary)
= 1000_0000_0000_0000_0000_0000_0000_0000 (binary)    = -2^31 (decimal) !!!!!! max negative
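2s Complement Number Wheel
• Example 4 bits
[Figure: 4-bit two's complement number wheel. Counting up from 0000 (0), the positive numbers run from 0001 (+1) to 0111 (+7); one more step wraps round to the negative numbers, which run from 1000 (-8) through 1001 (-7) up to 1111 (-1).]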
Signed Instructions
• When overflow occurs for the signed addition instructions, an exception is generated.
• “An exception is an internally generated interrupt.” i.e. generated within the CPU.
• For arithmetic overflow, Cause = 0x30
Unsigned Instructions
• The unsigned MIPS instructions (addu, addiu) do NOT generate an exception when an overflow occurs.
• Note, for addiu, the constant is still sign extended (i.e. you can still use addiu for subtraction).
• The programmer should detect and deal with overflow themselves by comparing the sign bits of the inputs and output…
For Y = A + B:
Sign Bit(A)    Sign Bit(B)    Sign Bit(Y)    Overflow?
0              0              1              Yes
1              1              0              Yes
all other combinations                       No
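One common way to code this check is sketched below: overflow is only possible when A and B have the same sign, and it has occurred if the sign of Y then differs from the sign of A. The register assignments, the labels and the use of $s0 as an overflow flag are illustrative choices; the exit service assumes MARS/SPIM.

```mips
        .text
main:   li   $t1, 0x7fffffff    # A = maximum positive value
        li   $t2, 1             # B
        addu $t0, $t1, $t2      # Y = A + B, no exception even on overflow

        xor  $t3, $t1, $t2      # sign bit of $t3 is 1 if A and B have different signs
        slt  $t3, $t3, $zero    # $t3 = 1 if the signs differ
        bne  $t3, $zero, NoOverflow # different signs: overflow is impossible

        xor  $t3, $t0, $t1      # same signs: compare the sign of Y with the sign of A
        slt  $t3, $t3, $zero    # $t3 = 1 if the sign of the sum changed
        bne  $t3, $zero, Overflow

NoOverflow:
        li   $s0, 0             # flag: result in $t0 is valid
        j    Done
Overflow:
        li   $s0, 1             # flag: result in $t0 has overflowed
Done:   li   $v0, 10            # service 10: exit
        syscall
```

With the values chosen here both inputs are positive and the sum has its sign bit set, so the program ends with $s0 equal to 1.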
MIPS32 Multiply Unit
[Figure: block diagram of MIPS32 with the Multiply Unit added alongside main memory, the address and data buses, the instruction path, the PC and the Control Unit.]
Integer Multiplication
• Perform signed multiplication using the mult instruction, and unsigned multiplication using multu.
• The inputs to the multiplication are 32-bit so the result is 64-bit
• The result is stored in 2 special 32-bit registers:
  • $hi & $lo
• The values in $hi & $lo can be copied back to the CPU registers using the instructions mfhi and mflo.
li   $a1, 3
mult $a0, $a1
mfhi $t0    # 32 most significant bits of multiplication to $t0
mflo $t1    # 32 least significant bits of multiplication to $t1
Integer Division
• Special instructions: div (signed) and divu (unsigned)
  • $lo = quotient
  • $hi = remainder
• Again, mfhi and mflo are used to retrieve the results.
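A small worked example, with operand values chosen purely for illustration (17 divided by 5); the exit service assumes MARS/SPIM:

```mips
        .text
main:   li   $a0, 17            # dividend
        li   $a1, 5             # divisor
        div  $a0, $a1           # signed divide: $lo = 17 / 5, $hi = 17 % 5
        mflo $t0                # quotient: $t0 = 3
        mfhi $t1                # remainder: $t1 = 2

        li   $v0, 10            # service 10: exit
        syscall
```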
Multiply Unit
• Multiply and divide operations are slower than the addition and logical operations performed in the ALU.
• Therefore, there is a minimum wait time between issuing multiply and divide instructions and reading back the results from hi and lo.
• The CPU can perform other instructions while it waits for the multiply and divide to finish.
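MIPS32 Floating Point Coprocessor
[Figure: block diagram of MIPS32 with the floating point coprocessor added alongside main memory, the address and data buses, the instruction path, the PC and the Control Unit.]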
Floating Point
• Single precision 32-bits:
  sign (1 bit) | exponent (E), 8 bits | fraction (F), 23 bits
• Double precision 64-bits:
  sign (1 bit) | exponent (E), 11 bits | fraction (F), 20 bits | fraction (continued), 32 bits
Floating Point
• CP1 has 32 32-bit registers: $f0,…, $f31
• They can be used in pairs as 64-bit registers. The 64-bit register uses the even register number in its name.
• CP1 has a floating point ALU
• Data is loaded and stored from main memory:
lwc1 $f4, 0($sp)
lwc1 $f6, 4($sp)
add.s $f2, $f4, $f6
swc1 $f2, 8($sp)
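The same load, add and store pattern is shown below as a complete program that can be run in a simulator. The labels valA, valB and sum and the sample values are illustrative, and the print float service (number 2, argument in $f12) and exit service are assumptions based on MARS/SPIM.

```mips
        .data
valA:   .float 1.5
valB:   .float 2.25             # stored in the word immediately after valA
sum:    .float 0.0

        .text
main:   la    $t0, valA         # $t0 = address of the first operand
        lwc1  $f4, 0($t0)       # $f4 = 1.5
        lwc1  $f6, 4($t0)       # $f6 = 2.25
        add.s $f12, $f4, $f6    # single precision add in CP1: $f12 = 3.75
        swc1  $f12, 8($t0)      # store the result in sum

        li    $v0, 2            # service 2: print float held in $f12
        syscall
        li    $v0, 10           # service 10: exit
        syscall
```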
Category         Instruction                                MIPS Example
Arithmetic       FP add single                              add.s $f2,$f4,$f7
                 FP subtract single                         sub.s $f2,$f4,$f7
                 FP multiply single                         mul.s $f2,$f4,$f7
                 FP divide single                           div.s $f2,$f4,$f7
                 FP add double                              add.d $f2,$f4,$f6
                 FP subtract double                         sub.d $f2,$f4,$f6
                 FP multiply double                         mul.d $f2,$f4,$f6
                 FP divide double                           div.d $f2,$f4,$f6
Data transfer    load word                                  lwc1 $f1,0($s2)
                 store word                                 swc1 $f1,0($s2)
Conditional      FP compare single {eq,ne,lt,le,gt,ge}      c.lt.s $f2,$f4
                 FP compare double {eq,ne,lt,le,gt,ge}      c.lt.d $f2,$f4
Note: the double precision examples use only even register numbers, because a 64-bit value occupies an even/odd register pair.
MIPS Machine Language
Assembly Language to Machine Language
Representing Instructions
• Assembly language -> Machine language
• “The instruction format defines how the information in the assembly language instruction is coded into a binary machine language word.”
• A 32-bit word is divided into a number of segments or fields. Each field is used to represent part of the instruction.
• To ensure that all instructions are 32-bits, MIPS designers chose to support a number of different instruction formats. The formats are distinguished using the opcode field.
Representing Instructions
• R-format (register format)
  op        rs        rt        rd        shamt     funct
  6 bits    5 bits    5 bits    5 bits    5 bits    6 bits
• op = operation (opcode), e.g. arithmetic
• rs = 1st register source operand (5 bits select from 32 registers)
• rt = 2nd register source operand
• rd = register destination operand
• shamt = shift amount
• funct = function, selects variant of operation in opcode field
Representing Instructions
• E.g. add $t0, $s1, $s2
  op            rs      rt      rd      shamt    funct
  arithmetic    $s1     $s2     $t0     0        add
  0             17      18      8       0        32
Representing Instructions
• I-format (immediate format)
  op        rs        rt        constant or address
  6 bits    5 bits    5 bits    16 bits
• op = operation (opcode), e.g. lw, sw
• rs = base register
• rt = data register
• constant or address = 16-bit immediate value or address offset
Binary / Hexadecimal
• Reading binary numbers is hard for humans.
• So people often display binary numbers in hexadecimal format.
• Hexadecimal is base 16
• So there are 16 possible digits, 0-F
• So, a hexadecimal digit can be used to represent 4 binary bits
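Binary / Hexadecimal
Hexadecimal    Binary      Hexadecimal    Binary
0              0000        8              1000
1              0001        9              1001
2              0010        A              1010
3              0011        B              1011
4              0100        C              1100
5              0101        D              1101
6              0110        E              1110
7              0111        F              1111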
Binary to Hexadecimal
• To convert binary to hexadecimal, split the binary number to groups of 4 bits and convert each group to its hexadecimal equivalent, e.g.
‣ 0001010010101111 (binary)
‣ 0001_0100_1010_1111 (binary)
‣ 14AF (hexadecimal)
• Manually assemble
lw  $t0, 32($s3)      # load a[8]
add $t0, $s2, $t0     # calculate
sw  $t0, 48($s3)      # stores result

       op    rs    rt    rd    shamt    funct    constant    hexadecimal
lw     35    19    8                             32          = 8e680020
add    0     18    8     8     0        32                   = 02484020
sw     43    19    8                             48          = ae680030
Disassembler
[31]                                [0]
0000 0000 1010 1111 1000 0000 0010 0000
op        rs       rt       rd       shamt    funct
000000    00101    01111    10000    00000    100000
add $s0,$a1,$t7
• Disassemble 0x21090003
0010_0001_0000_1001_0000_0000_0000_0011
001000 = 8 => addi, which is I format
001000, 01000, 01001, 0000_0000_0000_0011
op = 8, rs = 8 ($t0), rt = 9 ($t1), constant = 3
addi $t1, $t0, +3
Representing Instructions
• Conditional branch instructions & set immediate instructions use I-format
• Jump instruction uses J-format
  op        instruction address
  6 bits    26 bits
• To save bits, the instruction address is stored as a word address: word address = byte address / 4
• To increase address space, jump and branch instructions use word addressing, not byte addressing
Addressing
• Data address: byte addressing. Allows access to characters. Little Endian (Intel) byte order: the least significant 8 bits (LSB) are stored at the lowest byte address.
• Instruction address: word addressing. Allows lots of instructions.
Representing Instructions
• What if the jump destination address is greater than 3FFFFFF (hexadecimal)? i.e. larger than can be stored in 26 bits.
• Store the address in a register & use jump register.
• Allows 32-bit jump destinations.
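A minimal sketch of the register-based jump follows. The label FarAway is illustrative; in this tiny program it is of course close by, but the same two instructions work for any 32-bit destination. The la pseudo-instruction and the exit service are assumed to behave as in MARS/SPIM.

```mips
        .text
main:   la   $t0, FarAway       # pseudo-instruction: builds the full 32-bit address in $t0
        jr   $t0                # jump register: PC = $t0
        li   $s0, 1             # skipped by the jump
FarAway:
        li   $v0, 10            # service 10: exit
        syscall
```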
Representing Instructions
• What about branches?
• They are I-format, so only a 16-bit address is allowed.
• Use relative addressing.
• Relative to the address of the next instruction.
Representing Instructions
Byte address    Instruction                      Word address relative to instruction after branch
[0x00400000]    Loop: sll  $t1, $s3, 2           # -4
[0x00400004]          add  $t1, $t1, $s6         # -3
[0x00400008]          lw   $t0, 0($t1)           # -2
[0x0040000c]          bne  $t0, $s5, Exit        # -1
[0x00400010]          addi $s3, $s3, 1           # 0
[0x00400014]          j    Loop
[0x00400018]    Exit: …

Label    Byte Address    Relative Address
Loop     0x0040_0000     0xFFFC (-4)
Exit     0x0040_0018     0x0002 (+2)
• What do you do if you need a conditional branch to an address displaced by 20 bits relative to the Program Counter?
beq $s0, $s1, L1      # BUT L1 is too far away!
• Invert the condition and branch around an unconditional jump:
bne $s0, $s1, L2      # if not equal avoid jump
j   L1                # jump unconditional, 26-bit address
L2:                   # execution continues here when $s0 != $s1
Review of Addressing Modes
• Register addressing: The data on which the instruction operates is held in a register.
• Used in add
• Immediate addressing: The data on which the instruction operates is held in the instruction.
• Used in addi
• Absolute addressing. The full address of the data on which the instruction operates is stored in the instruction.
• Used in jumps
• Relative addressing. The address of the data on which the instruction operates is specified relative to a known address in memory, often the address of the next successive instruction.
• Used in branches
• Base or displacement addressing: The address of the data on which the instruction operates is specified relative to an address held in a register.
• Used in load and store
MIPS Tool Flow
[Figure: MIPS tool flow. Elements shown: assembly language source files, object modules, system library, assembly listing, disassembler, and an instruction set simulator, e.g. MARS.]
The assembler converts assembly instructions to machine instructions:
1. Read instruction.
2. Check syntax. Ignore comments.
3. Put any labels or symbols in the symbol table.
4. Look up the instruction format.
5. Look up the instruction and register codes.
6. Write the machine instruction to the object file.
7. Once all instructions have been assembled, labels and symbols are resolved and actual values (where possible) are put into the machine code program.
Assembler: Pseudo Instructions
• “A pseudo instruction is a commonly used instruction that is not in the target instruction set but which the compiler accepts and translates to the target ISA equivalent”, e.g.:
move $t0, $t1            # move data
add  $t0, $zero, $t1     # actual instruction
nop                      # no operation
sll  $zero, $zero, 0     # actual instruction
Assembler: Pseudo Instructions
• li (load immediate) is a pseudo instruction
• For a 16-bit unsigned constant:
li $t0, 10
ori $t0, $zero, 10
• For a 16 bit