Compilers and computer architecture: The MIPS processor
Martin Berger
November 2015

Recall the function of compilers

Introduction
In previous lectures, we focussed on generating code for simple architectures like the stack machine, or accumulator machines.
Now we want to do something more interesting, generating code for a real CPU.

CISC vs RISC
Processors can roughly be classified as
• CISC (complex instruction set computer)
• RISC (reduced instruction set computer)
What is the instruction set?

Instruction set architecture
In a CPU we distinguish between
• Instruction set architecture, that is, the externally visible aspects such as the supported data types (e.g. 32-bit ints, 80-bit floats), instructions, number and kinds of registers, addressing modes, memory architecture, interrupts, etc. This is what the programmer can access through code.
• Microarchitecture, which is how the instruction set is implemented. The microarchitecture is not visible to the programmer, and CPUs with different microarchitectures can share a common instruction set. For example, Intel and AMD support very similar instruction sets (x86-derived) but have very different microarchitectures.

Instruction set architecture
There is a semantic gap between high-level programming languages and machine languages: the former have powerful features (e.g. method invocation) that translate to a large number of simple machine instructions.
In the past it was thought that making machine commands more powerful would close or narrow the semantic gap, making compiled code faster. Examples of powerful machine commands include single instructions that directly implement constructs such as procedure calls or complicated array accesses.
CPUs with such instruction sets are called CISC (complex instruction set computer). Complex because the instructions do complicated things and are complex to implement in hardware.

Instruction set architecture
In some cases CISC architectures led to faster compiled code. But in the 1970s researchers began to study instruction set architecture and compiled code carefully and noticed several things.
• Compilers rarely make use of the complex instructions provided by CISC machines.
• Complex operations tended to be slower than a sequence of simpler operations doing the same thing.
• Often real-world programs spend most of their time executing simple operations.
• Implementing complex operations leads to complicated CPU architecture that can slow down the execution of simple instructions.

RISC
These empirical insights led to a radical rethink of instruction set architecture.
Now the focus was on providing just a few very simple operations, but making them exceedingly fast.
This makes the task of the compiler (much) harder, but the compiler has to compile a program only once, whereas the CPU would have to support complex instructions all the time.

RISC
RISC processors like MIPS are the outcome.
One key feature of RISC is that external memory is only accessible by a load or store instruction. All other instructions are limited to internal registers.
This drastically simplifies processor design: it allows instructions to be fixed-length, simplifies pipelines, and confines the logic for dealing with the delay in completing a memory access (cache miss, etc.) to only two instructions. Hence RISC is also called a load/store architecture.

RISC vs CISC today
Although RISC is technically better, the most popular desktop/server family of chips (Intel x86) is still not RISC. Reasons:
• Large amount of legacy x86 code (e.g. Microsoft products), locking PC users into x86 CISC.
• Intel earns much more money than producers of RISC chips so can spend much more on research, design and manufacturing, keeping CISC chips competitive with RISC.
• ‘Under the hood’ modern Intel processors are also RISC: the complicated x86 machine instructions are translated at run-time into much simpler underlying RISC microcode (which is not user-visible).

The MIPS processor
MIPS stands for Microprocessor without Interlocked Pipeline Stages. It was conceived by John Hennessy at Stanford, and then further developed by a startup called MIPS Technologies (founded by Hennessy). MIPS was bought by Imagination Technologies (which is partially owned by Intel and Apple).
Like ARM, MIPS / Imagination Technologies is a design company: it does not manufacture the chips itself but licenses the design to third-party vendors. This has been successful primarily in the market for embedded microprocessors, but also in game consoles (e.g. Sony PlayStation 2 and PlayStation Portable). Architectures like ARM are strongly influenced by MIPS.

MIPS in space: New Horizons
The New Horizons Pluto probe uses radiation-hardened versions of the MIPS 32-bit CPU.

MIPS in China
China is pushing hard to build their own high-performance CPUs that can compete with Intel CPUs. For this purpose, they have developed a new processor family called Loongson based on a 64-bit variant of MIPS. The main target of these CPUs is high-performance computing in supercomputers.

The MIPS processor
Originally the MIPS architectures were 32-bit, and later versions were 64-bit. Multiple revisions of the MIPS instruction set exist, including MIPS I, MIPS II, MIPS III, MIPS IV, MIPS V, MIPS32, and MIPS64. The current revisions are MIPS32 (for 32-bit implementations) and MIPS64 (for 64-bit implementations).
We will use MIPS32.
We use MIPS via a simulator such as
• SPIM: http://spimsimulator.sourceforge.net/
• MARS: http://courses.missouristate.edu/KenVollmar/MARS/index.htm

SPIM & MARS
You need to learn SPIM or MARS by yourself and in tutorials (SPIM is MIPS spelled backwards). I prefer MARS …

MIPS
Here is a basic overview of MIPS. We’ll only cover issues that are relevant for code generation. You are expected to familiarise yourself with MIPS programming on your own.
This should not be difficult with SPIM/MARS, as MIPS is an exceptionally clean architecture.

MIPS registers
MIPS has the following registers (all are 32 bits).
• 32 general purpose registers
• A pair of special-purpose registers to hold the results of integer multiply, divide, and multiply-accumulate operations (HI and LO)
• A program counter (PC).
[Figure 2-8, CPU Registers: the general purpose registers r0 (hardwired to zero) to r31, the special-purpose registers HI and LO, and the PC; each register spans bits 31 to 0.]

MIPS registers
Two of the CPU general-purpose registers have assigned functions:
• r0 is hard-wired to 0, and can be used as target register for any instruction whose result is to be discarded. r0 can also be used as a source of 0 if needed.
• r31 is the destination register (return address) used by JAL, BLTZAL, BLTZALL, BGEZAL, and BGEZALL without being explicitly specified in the instruction word. Otherwise r31 is used as a normal register.
• r1-r30 are for general purpose use.

MIPS registers: PC
The program counter (PC) register points to the instruction to be executed next. The PC cannot directly be written or read using load/store instructions. It can only be influenced by executing instructions which change the PC as a side-effect.

MIPS registers: HI/LO
When we multiply two 32-bit integers, the result may need up to 64 bits, and when we divide we obtain both a quotient and a remainder. Neither fits into a single 32-bit register. To deal with this, the HI and LO registers are used. For example, after a multiply operation HI holds the upper 32 bits of the product and LO the lower 32 bits; after a divide operation LO holds the quotient and HI the remainder. (Addition and subtraction do not use HI and LO.)
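As a rough illustration of why two registers are needed, the following Python sketch models a signed 32-bit multiply and divide. The function names mult and div mirror the MIPS mnemonics, but the helper to_signed32 and the tuple return values are assumptions made for this sketch, not part of any MIPS tool. Division is assumed to truncate toward zero, and division by zero is not handled.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mult(rs, rt):
    """Signed 32x32 -> 64-bit multiply; returns (HI, LO) as 32-bit words."""
    product = (to_signed32(rs) * to_signed32(rt)) & MASK64
    return (product >> 32) & MASK32, product & MASK32


def div(rs, rt):
    """Signed divide; returns (HI, LO) = (remainder, quotient).

    Assumes rt != 0 and a quotient truncated toward zero.
    """
    a, b = to_signed32(rs), to_signed32(rt)
    quotient = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        quotient = -quotient
    remainder = a - quotient * b
    return remainder & MASK32, quotient & MASK32
```

For instance, mult(0x10000, 0x10000) yields HI = 1 and LO = 0, because the product 2^32 does not fit into 32 bits.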

MIPS Datatypes
• Bit (b)
• Byte (8 bits, B)
• Halfword (16 bits, H)
• Word (32 bits, W)
The CPU uses byte addressing for halfword and word accesses with the following alignment constraints:
• Halfword accesses must be aligned on an even byte boundary (0, 2, 4…).
• Word accesses must be aligned on a byte boundary divisible by four (0, 4, 8…).
The assembler will help you with this.
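The alignment constraints above amount to a divisibility test on the byte address. Here is a minimal Python sketch of that test; the names is_aligned and align_up are hypothetical helpers chosen for illustration.

```python
HALFWORD = 2  # size in bytes
WORD = 4      # size in bytes


def is_aligned(address, size_in_bytes):
    """True if address is a multiple of the access size."""
    return address % size_in_bytes == 0


def align_up(address, size_in_bytes):
    """Smallest aligned address that is greater than or equal to address."""
    return (address + size_in_bytes - 1) // size_in_bytes * size_in_bytes
```

An address such as 2102 is fine for a halfword access but not for a word access; align_up(2102, WORD) gives 2104, the next address at which a word could be placed.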

MIPS instructions
CPU instructions are organized into the following functional groups:
• Load and store
• Computational
• Jump and branch
• Miscellaneous
• Coprocessor
Each instruction is 32 bits long in memory.
MIPS processors use a simple load/store architecture; all operations are performed on operands held in processor registers. Main memory is accessed only through load and store instructions.

MIPS instructions
The command
lw reg1 offset(reg2)
(where offset is a 16-bit signed integer) reads the number in register reg2, adds the 16-bit value offset to that number, obtaining a new number n, and then looks up the 32-bit value stored in memory at address n. That value is then loaded into register reg1 as a signed integer.
The sum of reg2 and offset must be word aligned (i.e. the two least significant bits must be 0), otherwise an error will occur.

MIPS instructions
Example: lw r2 100(r3)
Before: r3 = 2000, and the memory word at address 2100 holds 17.
After: r2 = 17 (r3 and memory are unchanged); the PC points to the next instruction.

MIPS instructions
The command
add reg1 reg2 reg3
adds the contents of registers reg2 and reg3, and stores the result in reg1.

MIPS instructions
Example: add r1 r2 r3
Before: r2 = 666, r3 = 999.
After: r1 = 1665 (r2 and r3 are unchanged); the PC points to the next instruction.

MIPS instructions
The command
sw reg1 offset(reg2)
(where offset is a 16-bit signed integer) stores the 32-bit word currently in reg1 at the address obtained by adding the 16-bit value offset to the content of register reg2.
The sum of reg2 and offset must be word aligned (i.e. the two least significant bits must be 0), otherwise an error will occur.

MIPS instructions
Example: sw r1 100(r2)
Before: r1 = 33, r2 = 2000.
After: the memory word at address 2100 holds 33 (r1 and r2 are unchanged); the PC points to the next instruction.
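The behaviour of lw and sw described above can be summarised in a small Python sketch. This is only a model under simplifying assumptions: registers are a list of 32 integers, memory is a dictionary from word-aligned addresses to 32-bit values, and uninitialised memory reads as 0. The names sign_extend16 and effective_address are made up for this sketch.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(regs, offset, base):
    """Base register plus sign-extended offset, checked for word alignment."""
    address = (regs[base] + sign_extend16(offset)) & 0xFFFFFFFF
    if address & 0b11:  # the two least significant bits must be 0
        raise ValueError(f"unaligned word access at address {address}")
    return address


def lw(regs, memory, dest, offset, base):
    """lw dest offset(base): load a word from memory into a register."""
    value = memory.get(effective_address(regs, offset, base), 0)
    if dest != 0:  # r0 is hard-wired to zero, so writes to it are discarded
        regs[dest] = value


def sw(regs, memory, src, offset, base):
    """sw src offset(base): store the word held in a register into memory."""
    memory[effective_address(regs, offset, base)] = regs[src] & 0xFFFFFFFF
```

With r3 = 2000 and the value 17 stored at address 2100, lw(regs, memory, 2, 100, 3) leaves 17 in r2, as in the lw example; an offset of 101 would instead raise the alignment error.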

MIPS instructions
The command
addiu reg1 reg2 imm
adds the 16-bit signed integer imm to the word currently in reg2, storing the result in register reg1. Here the ‘u’ in addiu means unsigned. To a first approximation that means overflow is not checked (no error is raised when overflowing).
Not checking overflow is useful e.g. when you want a sum to ‘wrap around’ at 0 or 2^32 − 1. You want this e.g. when doing cryptography. In addition we consider e.g. the SP an unsigned integer.

MIPS instructions
Example: addiu r1 r2 55
Before: r2 = 2000.
After: r1 = 2055 (r2 is unchanged); the PC points to the next instruction.
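The wrap-around behaviour of addiu can be made concrete with a few lines of Python. This is a sketch, not a full instruction model: the function takes the register value directly rather than a register number, and sign_extend16 is a helper introduced here for illustration.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed integer."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def addiu(reg_value, imm):
    """Add a sign-extended 16-bit immediate; overflow silently wraps modulo 2^32."""
    return (reg_value + sign_extend16(imm)) & MASK32
```

For example, addiu(2000, 55) gives 2055 as in the example above, while addiu(0xFFFFFFFF, 1) wraps around to 0 and addiu(0, -1) wraps to 0xFFFFFFFF without any error.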

MIPS instructions
The pseudo instruction
li reg imm
stores the 32-bit integer imm in register reg.
It is a pseudo instruction in that there is no single MIPS machine instruction that directly implements it (MIPS cannot load a 32-bit constant in one instruction). Instead the MIPS assembler will automatically expand li reg imm into a sequence of real instructions. When compiling you can simply treat pseudo instructions as real instructions.

MIPS instructions
Example: li r2 123
After: r2 = 123; the PC points to the next instruction.
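To give an idea of what such an expansion might look like, here is a Python sketch that splits a 32-bit constant into its upper and lower 16 bits and emits lui (load upper immediate) and ori (or immediate). This is one common expansion and an assumption of this sketch; actual assemblers such as SPIM or MARS may choose a different sequence, for example for small negative constants. The function name expand_li is hypothetical.

```python
def expand_li(reg, imm):
    """Return a list of real MIPS instructions that load imm into reg."""
    imm &= 0xFFFFFFFF
    upper = imm >> 16      # upper 16 bits, loaded with lui
    lower = imm & 0xFFFF   # lower 16 bits, merged in with ori
    if upper == 0:
        return [f"ori {reg}, $zero, {lower}"]
    if lower == 0:
        return [f"lui {reg}, {upper}"]
    return [f"lui {reg}, {upper}", f"ori {reg}, {reg}, {lower}"]
```

A small constant needs only one instruction: expand_li("$a0", 7) returns ["ori $a0, $zero, 7"]. A full 32-bit constant such as 0x12345678 needs two: lui $t0, 4660 followed by ori $t0, $t0, 22136.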

Note on learning assembler
To understand MIPS assembly code you will have to read the documentation. The following texts are good.
• MIPS Architecture For Programmers Volume II: The MIPS32 Instruction Set, which you can download for free from http://www.imgtec.com/mips/architectures/mips32.asp (registration required), has a clear and simple explanation of all commands.
• Assemblers, Linkers, and the SPIM Simulator by J. Larus, the author of SPIM, linked from Study Direct.
Most commands are not needed in the simple compiler we are writing.
In my experience the best way of learning an assembler is to write a simple simulator for it. This is surprisingly easy if you don’t want it to be usable (e.g. no GUI), or indeed precise (e.g. trapping overflows), or comprehensive (implementing all commands).

Our first MIPS program
Let’s write the program 7+5. We want the result in register r5.
li r6 7
li r5 5
add r5 r5 r6

Our second MIPS program
Let’s write 7+5, in accumulator machine form.
• One argument is in the accumulator.
• Remaining arguments on the stack.
• Result should be in accumulator.
MIPS doesn’t have an explicit SP and explicit accumulator?
No problem, because every general purpose register can serve as stack pointer and as accumulator! Just choose them as you like (but then be consistent about it).
Convention: we use register r29 as stack pointer, and register r4 as accumulator.
It is customary in MIPS assembly to write $sp for the stack pointer (r29) and $a0 for register r4.

Our second MIPS program
Recall that in the accumulator machine model, memory operations work only via the accumulator. With this in mind, here is the program 7+5 we are seeking to translate to MIPS in pseudo-code.
acc <- 7
push acc
acc <- 5
acc <- acc + top of stack
pop

Our second MIPS program
To translate this pseudo-code into MIPS we adhere to the conventions that
• The stack grows downwards (i.e. from high to low addresses).
• The stack pointer $sp points to the first free memory cell below (in terms of addresses) the top of the stack.
Address   Content
1500      166
1496      99
1492      66
1488      22     <- top of stack element
1484      ...    <- SP = 1484 (first free cell)

Our second MIPS program
acc <- 7                   li $a0 7
push acc                   sw $a0 0($sp)
                           addiu $sp $sp -4
acc <- 5                   li $a0 5
acc <- acc+topOfStack      lw $t1 4($sp)
                           add $a0 $a0 $t1
pop                        addiu $sp $sp 4
Note that the program on the right is really doing almost exactly what we did a few weeks ago when we looked at the accumulator machine, except that
• we use a temporary t1
• we use MIPS assembly
• we have to adjust the stack ‘by hand’, rather than using built-in push and pop

MIPS
We will soon write a compiler that compiles a simple language with procedures to MIPS code.
To understand this, you need to familiarise yourself with MIPS in the tutorials and in self study.
MIPS machine code is really straightforward, and not really different from the pseudo machine code we used a few weeks back, except that the assembler syntax is slightly different.

Interlude on (MIPS) assembler
Assembler language is a programming language that is close to machine language but not the same.
Why bother with yet another language? Why not program straight in machine language?
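Returning briefly to the 7+5 program: its stack handling can be traced with a small Python sketch that executes the seven MIPS instructions one by one. The memory model (a dictionary from addresses to values) and the function name run_seven_plus_five are assumptions made for illustration; the starting value 1484 for $sp is taken from the stack picture.

```python
def run_seven_plus_five(initial_sp=1484):
    """Trace the MIPS code for 7+5; returns (a0, sp, memory)."""
    memory = {}      # hypothetical memory model: address -> word
    sp = initial_sp  # $sp points to the first free cell below the stack top

    a0 = 7                      # li $a0 7
    memory[sp + 0] = a0         # sw $a0 0($sp)     push: fill the free cell
    sp = (sp - 4) & 0xFFFFFFFF  # addiu $sp $sp -4  push: move to next free cell
    a0 = 5                      # li $a0 5
    t1 = memory[sp + 4]         # lw $t1 4($sp)     top of stack is one word above $sp
    a0 = a0 + t1                # add $a0 $a0 $t1
    sp = (sp + 4) & 0xFFFFFFFF  # addiu $sp $sp 4   pop: release the cell

    return a0, sp, memory
```

After the run the accumulator holds 12 and $sp is back at its initial value, which is exactly the invariant a compiler relies on: evaluating an expression leaves the stack as it found it.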
That’s why
001001111011110111111111111000001010111110111111000000
000001010010101111101001000000000000100000101011111010
010100000000001001001010111110100000000000000001100010
101111101000000000000000011100100011111010111000000000
000111001000111110111000000000000001100000000001110011
100000000000011001001001011100100000000000000000010010
100100000001000000000110010110101111101010000000000000
011100000000000000000001111000000100100000001100001111
110010000010000100010100001000001111111111110111101011
111011100100000000000110000011110000000100000100000000
000010001111101001010000000000011000000011000001000000
000000111011000010010010000100000001000011000010001111
101111110000000000010100001001111011110100000000001000
000000001111100000000000000000100000000000000000000001
000000100001

That’s why
Here is the same code written in assembly language, but no symbolic labels are used as names of registers or memory locations.
addiu $29, $29, -32
sw $31, 20($29)
sw $4, 32($29)
sw $5, 36($29)
sw $0, 24($29)
sw $0, 28($29)
lw $14, 28($29)
lw $24, 24($29)
multu $14, $14
addiu $8, $14, 1
slti $1, $8, 101
sw $8, 28($29)
mflo $15
addu $25, $24, $15
bne $1, $0, -9
sw $25, 24($29)
lui $4, 4096
lw $5, 24($29)
jal 1048812
addiu $4, $4, 1072
lw $31, 20($29)
addiu $29, $29, 32
jr $31
move $2, $0

That’s why
It gets even better with symbolic names such as $sp or loop.
.text
.align 2
.globl main
main:
subu $sp, $sp, 32
sw $ra, 20($sp)
sd $a0, 32($sp)
sw $0, 24($sp)
sw $0, 28($sp)
loop:
lw $t6, 28($sp)
mul $t7, $t6, $t6
lw $t8, 24($sp)
addu $t9, $t8, $t7
sw $t9, 24($sp)
addu $t0, $t6, 1
sw $t0, 28($sp)
ble $t0, 100, loop
la $a0, str
lw $a1, 24($sp)
jal printf
move $v0, $0
lw $ra, 20($sp)
addu $sp, $sp, 32
jr $ra
.data
.align 0
str:
.asciiz "The sum from 0 .. 100 is %d\n"

Assembler vs assembly language
We must carefully distinguish between
• Assembly language, the symbolic representation of a computer’s binary machine language.
• Assembler, a program (a mini-compiler) that translates assembly language into real machine code (long sequences of 0s and 1s).

Assembler, the program
The assembler primarily does two things.
• Translate commands in assembly language like addu $t3 $t6 $t8 into machine code.
• Convert symbolic addresses such as main or loop into machine addresses such as 100011010011010011010011010101001. This task is sometimes deferred to the linker.
The symbolic names in assembly language name commonly occurring bit patterns, such as opcodes and register names, so humans can read and remember them. In addition, assembly language permits programmers to use labels to identify and name particular memory words that hold instructions or data, or that the program can jump to.

Assembler, the program
The programmer writes source files. They may contain labels that are not defined in the same source file, i.e. references to external code (e.g. print).
[Figure: each source file is translated by the assembler into an object file; the linker combines the object files and libraries into an executable.]
The assembler translates source files to object files, which are machine code, but contain ‘holes’ (basically references to external code). Because of the holes, object files cannot be executed directly. The holes arise because the assembler translates each file separately.
The linker gets all object files and libraries and puts the right addresses into holes, yielding an executable.
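The first of the assembler's two tasks, translating a command into machine code, can be illustrated for instructions with a 16-bit immediate such as addiu, lw and sw. The sketch below packs the fields of such an instruction into a 32-bit word. The opcode values are those of the MIPS32 instruction set; the names OPCODES, encode_i_type and to_bits are made up for this sketch, and only three instructions are covered.

```python
# 6-bit opcodes of three MIPS32 instructions that carry a 16-bit immediate
OPCODES = {"addiu": 0b001001, "lw": 0b100011, "sw": 0b101011}


def encode_i_type(mnemonic, rs, rt, imm):
    """Pack opcode (6 bits), rs (5), rt (5) and a 16-bit immediate into a word.

    For addiu, rt is the destination and rs the source register.
    For lw and sw, rs is the base register and rt the loaded or stored register.
    """
    return (OPCODES[mnemonic] << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def to_bits(word):
    """Render a 32-bit word as a string of 0s and 1s."""
    return format(word, "032b")
```

Encoding addiu $29, $29, -32 as to_bits(encode_i_type("addiu", 29, 29, -32)) and sw $31, 20($29) as to_bits(encode_i_type("sw", 29, 31, 20)) reproduces the first 64 bits of the binary listing shown earlier.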
Assembler, the program
Here is an example of using names: main is a global name in the sense that other programs can use it. On the other hand, loop is a local name: it can only be used (jumped to) inside this program.
.text
.align 2
.globl main
main:
subu $sp, $sp, 32
sw $ra, 20($sp)
...
loop:
lw $t6, 28($sp)
...
ble $t0, 100, loop
It is the declaration (assembler directive) .globl main that makes main global.

Assembler, the program
The assembler processes a source file line by line, translating assembly commands. It keeps track of the size of each command.
loop:
subu $sp, $sp, 32
sw $ra, 20($sp)
When the assembler encounters a line starting with a label, like loop: ..., it calculates what address in memory the command just below would be at, and stores the pair of label and address in its symbol table. If it encounters this label later, e.g. in ble $t0, 100, loop, the assembler replaces the label with the address (if local, otherwise the linker does this).

Helpers
Assembly languages typically offer various features making assembly programming easier. Here are some MIPS examples.
• Data layout directives
• Pseudo instructions
• Alignment instructions

Data layout directives
Data layout directives describe data in a more concise and natural manner than its binary representation. Example:
.asciiz "The sum from 0 .. 100 is %d\n"
stores characters from the string in memory. Alternatively we can use the .byte directive to obtain the same effect.
.byte 84, 104, 101, 32, 115, 117, 109, 32
.byte 102, 114, 111, 109, 32, 48, 32, 46
.byte 46, 32, 49, 48, 48, 32, 105, 115
.byte 32, 37, 100, 10, 0
The .asciiz directive is easier to read for text strings.
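Going back to how the assembler resolves labels: the symbol-table procedure described above can be sketched in Python as two passes over the source lines. This is a simplified model under several assumptions: every command occupies 4 bytes (pseudo instructions can expand to more), each label stands on a line of its own, and a label operand is replaced by its absolute address (a real MIPS branch encodes an offset relative to the PC). The function names are hypothetical.

```python
def build_symbol_table(lines, base_address=0, instruction_size=4):
    """Pass 1: map each label to the address of the command that follows it."""
    symbols = {}
    address = base_address
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            symbols[line[:-1]] = address  # label names the next command
        else:
            address += instruction_size   # keep track of the size of each command
    return symbols


def resolve_labels(lines, symbols):
    """Pass 2: drop label lines and replace a trailing label operand by its address."""
    resolved = []
    for line in lines:
        line = line.strip()
        if not line or line.endswith(":"):
            continue
        head, _, last = line.rpartition(" ")
        if last in symbols:
            line = f"{head} {symbols[last]}"
        resolved.append(line)
    return resolved
```

Applied to the lines main:, subu $sp, $sp, 32, sw $ra, 20($sp), loop:, lw $t6, 28($sp), ble $t0, 100, loop with base address 0, the first pass records main at 0 and loop at 8, and the second pass rewrites the last line to ble $t0, 100, 8. A label that is not in the table would be left untouched, which corresponds to the ‘holes’ that the linker fills in later.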
Pseudo instructions
You remember li reg imm?
Turns out that li is not a real MIPS instruction: MIPS cannot load a 32-bit word in one instruction. Two instructions are needed (one for the lower 16 bits of imm, and another one for the upper 16 bits).
Instead li is a pseudo instruction: the assembler replaces each occurrence of li appropriately.
The MARS simulator shows pseudo instructions together with the instructions they are translated to.

MIPS register naming conventions
Register   Assembler name   Typical usage
0          $zero            always equal to 0
1          $at              used by the assembler
2-3        $v0-$v1          Return value from a function call
4-7        $a0-$a3          First four procedure parameters
8-15       $t0-$t7          Temporary variables; need not be preserved
16-23      $s0-$s7          Function variables; must be preserved
24-25      $t8-$t9          Two more temporary variables
26-27      $k0-$k1          Kernel registers for interrupts; may change unexpectedly
28         $gp              Global pointer
29         $sp              Stack pointer
30         $fp              Stack frame pointer
31         $ra              Return address of last call
Don’t use registers $at, $k0, $k1.
