MIPS® Architecture For Programmers Volume I-A: Introduction to the MIPS32® Architecture
Document Number: MD00082 Revision 6.01
August 20, 2014
Strictly Confidential. Neither the whole nor any part of this document/material, nor the product described herein, may be adapted or reproduced in any material form except with the written permission of . All logos, products
and trade marks are the property of their respective owners. This document may only be distributed subject to the terms of an applicable Non-Disclosure or Licence Agreement with MIPS.
Contents
Chapter 1: About This Book
  1.1: Typographical Conventions
    1.1.1: Italic Text
    1.1.2: Bold Text
    1.1.3: Courier Text
    1.1.4: Colored Text
  1.2: UNPREDICTABLE and UNDEFINED
    1.2.1: UNPREDICTABLE
    1.2.2: UNDEFINED
    1.2.3: UNSTABLE
  1.3: Special Symbols in Pseudocode Notation
  1.4: Notation for Register Field Accessibility
  1.5: For More Information
Chapter 2: Overview of the MIPS® Architecture
  2.1: Historical Perspective
  2.2: Components of the MIPS® Architecture
    2.2.1: MIPS Instruction Set Architecture (ISA)
    2.2.2: MIPS Privileged Resource Architecture (PRA)
    2.2.3: MIPS Modules and Application Specific Extensions (ASEs)
    2.2.4: MIPS User Defined Instructions (UDIs)
  2.3: Evolution of the Architecture
    2.3.1: MIPS I through MIPS V Architectures
    2.3.2: MIPS32 Architecture Release 2
    2.3.3: MIPS32 Architecture Releases 2.5+
    2.3.4: MIPS32 Release 3 Architecture (MIPSr3™)
    2.3.5: MIPS32 Architecture Release 5
    2.3.6: MIPS32 Architecture Release 6
  2.4: Compliance and Subsetting
    2.4.1: Subsetting of Non-Privileged Architecture
    2.4.2: Subsetting of Privileged Architecture
Chapter 3: Modules and Application Specific Extensions
  3.1: Description of Optional Components
  3.2: Application Specific Instructions
    3.2.1: MIPS16e™ Application Specific Extension
    3.2.2: MDMX™ Application Specific Extension
    3.2.3: MIPS-3D® Application Specific Extension
    3.2.4: SmartMIPS® Application Specific Extension
    3.2.5: MIPS® DSP Module
    3.2.6: MIPS® MT Module
    3.2.7: MIPS® MCU Application Specific Extension
    3.2.8: MIPS® Virtualization Module
    3.2.9: MIPS® SIMD Architecture Module
Chapter 4: CPU Programming Model
  4.1: CPU Data Formats
  4.2: Coprocessors (CP0-CP3)
  4.3: CPU Registers
    4.3.1: CPU General-Purpose Registers
    4.3.2: CPU Special-Purpose Registers
  4.4: Byte Ordering and Endianness
    4.4.1: Big-Endian Order
    4.4.2: Little-Endian Order
    4.4.3: MIPS Bit Endianness
  4.5: Memory Alignment
    4.5.1: Addressing Alignment Constraints
    4.5.2: Unaligned Load and Store Instructions (Removed in Release 6)
  4.6: Memory Access Types
    4.6.1: Uncached Memory Access
    4.6.2: Cached Memory Access
    4.6.3: Uncached Accelerated Memory Access
  4.7: Implementation-Specific Access Types
  4.8: Cacheability and Coherency Attributes and Access Types
  4.9: Mixing Access Types
  4.10: Instruction Fetch
    4.10.1: Instruction Fields
    4.10.2: MIPS32 and MIPS64 Instruction Placement and Endianness
    4.10.3: Instruction Fetch Using Uncached Access Without Side-effects
    4.10.4: Instruction Fetch Using Uncached Access With Side-effects
    4.10.5: Instruction Fetch Using Cacheable Access
    4.10.6: Instruction Fetches and Exceptions
      4.10.6.1: Precise Exception Model for Instruction Fetches
      4.10.6.2: Instruction Fetch Exceptions on Branch Delay Slots and Forbidden Slots
    4.10.7: Self-modified Code
Chapter 5: CPU Instruction Set
  5.1: CPU Load and Store Instructions
    5.1.1: Types of Loads and Stores
    5.1.2: Load and Store Access Types
    5.1.3: List of CPU Load and Store Instructions
      5.1.3.1: PC-relative Loads (Release 6)
    5.1.4: Loads and Stores Used for Atomic Updates
    5.1.5: Coprocessor Loads and Stores
  5.2: Computational Instructions
    5.2.1: ALU Immediate and Three-Operand Instructions
    5.2.2: ALU Two-Operand Instructions
    5.2.3: Shift Instructions
    5.2.4: Width Doubling Multiply and Divide Instructions (Removed in Release 6)
    5.2.5: Same-Width Multiply and Divide Instructions (Release 6)
  5.3: Jump and Branch Instructions
    5.3.1: Types of Jump and Branch Instructions
    5.3.2: Branch Delay Slots and Branch Likely versus Compact Branches and Forbidden Slots
      5.3.2.1: Control Transfer Instructions in Delay Slots and Forbidden Slots
      5.3.2.2: Exceptions and Delay and Forbidden Slots
      5.3.2.3: Delay Slots and Forbidden Slots Performance Considerations
      5.3.2.4: Examples of Delay Slots and Forbidden Slots
      5.3.2.5: Deprecation of Branch Likely Instructions
    5.3.3: Jump and Branch Instructions
      5.3.3.1: Release 6 Compact Branch and Jump Instructions
      5.3.3.2: Delayed Branch instructions
  5.4: Address Computation and Large Constant Instructions (Release 6)
  5.5: Miscellaneous Instructions
    5.5.1: Instruction Serialization (SYNC and SYNCI)
    5.5.2: Exception Instructions
    5.5.3: Conditional Move Instructions
    5.5.4: Prefetch Instructions
    5.5.5: NOP Instructions
  5.6: Coprocessor Instructions
    5.6.1: What Coprocessors Do
    5.6.2: System Control Coprocessor 0 (CP0)
    5.6.3: Floating Point Coprocessor 1 (CP1)
      5.6.3.1: Coprocessor Load and Store Instructions
  5.7: CPU Instruction Formats
    5.7.1: Advanced Instruction Encodings (Release 6)
    5.7.2: CPU Instruction Field Formats
Chapter 6: FPU Programming Model
  6.1: Enabling the Floating Point Coprocessor
  6.2: IEEE Standard 754
  6.3: FPU Data Types
    6.3.1: Floating Point Formats
      6.3.1.1: Normalized and Denormalized Numbers
      6.3.1.2: Reserved Operand Values—Infinity and NaN
      6.3.1.3: Infinity and Beyond
      6.3.1.4: Signalling Non-Number (SNaN)
      6.3.1.5: Quiet Non-Number (QNaN)
      6.3.1.6: Paired-Single Exceptions
      6.3.1.7: Paired-Single Condition Codes
    6.3.2: Fixed Point Formats
  6.4: Floating Point Registers
    6.4.1: FPU Register Models
    6.4.2: Binary Data Transfers (32-Bit and 64-Bit)
    6.4.3: FPRs and Formatted Operand Layout
  6.5: Floating Point Control Registers (FCRs)
    6.5.1: Floating Point Implementation Register (FIR, CP1 Control Register 0)
    6.5.2: User Floating Point Register Mode Control (UFR, CP1 Control Register 1) (Release 5 Only)
    6.5.3: User Negated FP Register Mode Control (UNFR, CP1 Control Register 4) (Removed in Release 6)
    6.5.4: Floating Point Control and Status Register (FCSR, CP1 Control Register 31)
    6.5.5: Floating Point Condition Codes Register (FCCR, CP1 Control Register 25) (pre-Release 6)
    6.5.6: Floating Point Exceptions Register (FEXR, CP1 Control Register 26)
    6.5.7: Floating Point Enables Register (FENR, CP1 Control Register 28)
  6.6: Formats and Sizes of Floating Point Data
    6.6.1: Formats of Values Used in FP Registers
    6.6.2: Sizes of Floating Point Data
  6.7: FPU Exceptions
    6.7.1: Precise Exception Mode
    6.7.2: Exception Conditions
      6.7.2.1: Invalid Operation Exception
      6.7.2.2: Division By Zero Exception
      6.7.2.3: Underflow Exception
      6.7.2.4: Alternate Flush to Zero Underflow Handling
      6.7.2.5: Overflow Exception
      6.7.2.6: Inexact Exception
      6.7.2.7: Unimplemented Operation Exception
Chapter 7: FPU Instruction Set
  7.1: Binary Compatibility
  7.2: FPU Instructions
    7.2.1: Data Transfer Instructions
      7.2.1.1: Data Alignment in Loads, Stores, and Moves
      7.2.1.2: Addressing Used in Data Transfer Instructions
    7.2.2: Arithmetic Instructions
      7.2.2.1: FPU IEEE Arithmetic Instructions
      7.2.2.2: FPU non-IEEE-approximate Arithmetic Instructions
      7.2.2.3: FPU Multiply-add Instructions
      7.2.2.4: FPU Fused Multiply-Accumulate instructions (Release 6)
      7.2.2.5: Floating Point Comparison Instructions
    7.2.3: Conversion Instructions
    7.2.4: Formatted Operand-Value Move Instructions
    7.2.5: FPU Conditional Branch Instructions
    7.2.6: Miscellaneous Instructions (Removed in Release 6)
  7.3: Valid Operands for FPU Instructions
  7.4: FPU Instruction Formats
Appendix A: Pipeline Architecture
  A.1: Pipeline Stages and Execution Rates
  A.2: Parallel Pipeline
  A.3: Superpipeline
  A.4: Superscalar Pipeline
Appendix B: Misaligned Memory Accesses
  B.1: Terminology
  B.2: Hardware Versus Software Support for Misaligned Memory Accesses
  B.3: Detecting Misaligned Support
  B.4: Misaligned Semantics
    B.4.1: Misaligned Fundamental Rules: Single-Thread Atomic, but not Multi-thread
    B.4.2: Permissions and Misaligned Memory Accesses
    B.4.3: Misaligned Memory Accesses Past the End of Memory
    B.4.4: TLBs and Misaligned Memory Accesses
    B.4.5: Memory Types and Misaligned Memory Accesses
    B.4.6: Misaligneds, Memory Ordering, and Coherence
      B.4.6.1: Misaligneds are Single-Thread Atomic
      B.4.6.2: Misaligneds are not Multiprocessor/Multithread Atomic
      B.4.6.3: Misaligneds and Multiprocessor Memory Ordering
  B.5: Pseudocode
    B.5.1: Pseudocode Distinguishing Actually Aligned from Actually Misaligned
    B.5.2: Actually Aligned
    B.5.3: Byte Swapping
    B.5.4: Pseudocode Expressing Most General Misaligned Semantics
    B.5.5: Example Pseudocode for Possible Implementations
      B.5.5.1: Example Byte-by-byte Pseudocode
      B.5.5.2: Example Pseudocode Handling Splits and non-Splits Separately
  B.6: Misalignment and MSA Vector Memory Accesses
    B.6.1: Semantics
    B.6.2: Pseudocode for MSA Memory Operations with Misalignment
Appendix C: Revision History
Figures
Figure 2.1: MIPS Architecture Evolution
Figure 3.1: MIPS ISAs, ASEs, and Modules
Figure 4.1: CPU Registers for MIPS32
Figure 4.2: Big-Endian Byte Ordering
Figure 4.3: Little-Endian Byte Ordering
Figure 4.4: Big-Endian Data in Doubleword Format
Figure 4.5: Little-Endian Data in Doubleword Format
Figure 4.6: Big-Endian Misaligned Word Addressing
Figure 4.7: Little-Endian Misaligned Word Addressing
Figure 4.8: Two Instructions Placed in a 64-bit Wide, Little-endian Memory
Figure 4.9: Two instructions Placed in a 64-bit Wide, Big-endian Memory
Figure 5.1: Register (R-Type) CPU Instruction Format
Figure 5.2: Immediate (I-Type) CPU Instruction Formats (Release 6)
Figure 5.3: Immediate (I-Type) Imm16 CPU Instruction Format
Figure 5.4: Immediate (I-Type) Off21 CPU Instruction Format (Release 6)
Figure 5.5: Immediate (I-Type) Off26 CPU Instruction Format (Release 6)
Figure 5.6: Immediate (I-Type) Off11 CPU Instruction Format (Release 6)
Figure 5.7: Immediate (I-Type) Off9 CPU Instruction Format (Release 6)
Figure 5.8: Jump (J-Type) CPU Instruction Format
Figure 6.1: Single-Precision Floating Point Format (S)
Figure 6.2: Double-Precision Floating Point Format (D)
Figure 6.3: Paired-Single Floating Point Format (PS)
Figure 6.4: Word Fixed Point Format (W)
Figure 6.5: Longword Fixed Point Format (L)
Figure 6.6: FPU Word Load and Move-to Operations
Figure 6.7: FPU Doubleword Load and Move-to Operations
Figure 6.8: Single Floating Point or Word Fixed Point Operand in an FPR
Figure 6.9: Double Floating Point or Longword Fixed Point Operand in an FPR
Figure 6.10: Paired-Single Floating Point Operand in an FPR (Removed in Release 6)
Figure 6.11: FIR Register Format
Figure 6.12: UFR Register Format (pre-Release 6)
Figure 6.13: UNFR Register Format (pre-Release 6)
Figure 6.14: FCSR Register Format
Figure 6.15: FCCR Register Format
Figure 6.16: FEXR Register Format
Figure 6.17: FENR Register Format
Figure 7.1: I-Type (Immediate) FPU Instruction Format
Figure 7.2: R-Type (Register) FPU Instruction Format
Figure 7.3: Register-Immediate FPU Instruction Format
Figure 7.4: Condition Code, Immediate FPU Instruction Format (Removed in Release 6)
Figure 7.5: Formatted FPU Compare Instruction Format (Removed in Release 6)
Figure 7.6: FP Register Move, Conditional Instruction Format (Removed in Release 6)
Figure 7.7: Four-Register Formatted Arithmetic FPU Instruction Format (Removed in Release 6)
Figure 7.8: Register Index FPU Instruction Format (Removed in Release 6)
Figure 7.9: Register Index Hint FPU Instruction Format (Removed in Release 6)
Figure 7.10: Condition Code, Register Integer FPU Instruction Format (Removed in Release 6)
Figure A.1: One-Deep Single-Completion Instruction Pipeline
Figure A.2: Four-Deep Single-Completion Pipeline
Figure A.3: Four-Deep Superpipeline
Figure A.4: Four-Way Superscalar Pipeline
Figure B.1: LoadPossiblyMisaligned/StorePossiblyMisaligned Pseudocode
Figure B.2: LoadAligned / StoreAligned Pseudocode
Figure B.3: LoadRawMemory Pseudocode Function
Figure B.4: StoreRawMemory Pseudocode Function
Figure B.5: Byte Swapping Pseudocode Functions
Figure B.6: LoadMisaligned most general pseudocode
Figure B.7: Byte-by-byte Pseudocode for LoadMisaligned / StoreMisaligned
Figure B.8: LoadTYPEVector / StoreTYPEVector used by MSA specification
Figure B.9: Pseudocode for LoadVector
Figure B.10: Pseudocode for StoreVector
Tables
Table 1.1: Symbols Used in Instruction Operation Statements
Table 1.2: Read/Write Register Field Notation
Table 4.1: Unaligned Load and Store Instructions
Table 4.2: Speculative Instruction Fetches
Table 5.1: Load and Store Operations
Table 5.2: Naturally Aligned CPU Load/Store Instructions
Table 5.3: Unaligned CPU Load and Store Instructions
Table 5.4: PC-relative Loads
Table 5.5: Atomic Update CPU Load and Store Instructions
Table 5.6: Coprocessor Load and Store Instructions
Table 5.7: FPU Load and Store Instructions Using Register + Register Addressing
Table 5.8: ALU Instructions With a 16-bit Immediate Operand
Table 5.9: Three-Operand ALU Instructions
Table 5.10: Two-Operand ALU Instructions
Table 5.11: Shift Instructions
Table 5.12: Multiply/Divide Instructions
Table 5.13: Same-width Multiply/Divide Instructions (Release 6)
Table 5.14: Release 6 Compact Branch and Jump Instructions (Release 6)
Table 5.15: Unconditional Jump Within a 256-Megabyte Region
Table 5.16: Unconditional Jump using Absolute Address
Table 5.17: PC-Relative Conditional Branch Instructions Comparing Two Registers
Table 5.18: PC-Relative Conditional Branch Instructions Comparing With Zero
Table 5.20: Address Computation and Large Constant Instructions
Table 5.19: Deprecated Branch Likely Instructions
Table 5.21: Serialization Instruction
Table 5.22: System Call and Breakpoint Instructions
Table 5.23: Trap-on-Condition Instructions Comparing Two Registers
Table 5.24: Trap-on-Condition Instructions Comparing an Immediate Value
Table 5.25: CPU Conditional Move Instructions (Removed in Release 6)
Table 5.26: CPU Conditional Select Instructions (Release 6)
Table 5.27: Prefetch Instructions
Table 5.28: NOP Instructions
Table 5.29: Coprocessor Definition and Use in the MIPS Architecture
Table 5.30: CPU Instruction Format Fields
Table 6.1: Parameters of Floating Point Data Types
Table 6.2: Value of Single or Double Floating Point Data Type Encoding
Table 6.3: Value Supplied When a New Quiet NaN Is Created
Table 6.4: FPU Register Models Availability and Compliance
Table 6.5: FIR Register Field Descriptions
Table 6.6: UFR Register Field Descriptions (pre-Release 6)
Table 6.7: UNFR Register Field Descriptions (pre-Release 6)
Table 6.8: FCSR Register Field Descriptions
Table 6.9: Cause, Enable, and Flag Bit Definitions
Table 6.10: Rounding Mode Definitions
Table 6.11: FCCR Register Field Descriptions
Table 6.12: FEXR Register Field Descriptions
Table 6.13: FENR Register Field Descriptions
Table 6.14: Default Result for IEEE Exceptions Not Trapped Precisely
Table 7.1: FPU Data Transfer Instructions
Table 7.2: FPU Loads and Stores Using Register+Offset Address Mode
Table 7.3: FPU Loads and Stores Using Register+Register Address Mode (Removed in Release 6)
Table 7.4: FPU Move To and From Instructions
Table 7.5: FPU IEEE Arithmetic Operations
Table 7.6: FPU-Approximate Arithmetic Operations
Table 7.7: FPU Multiply-Accumulate Arithmetic Operations (Removed in Release 6)
Table 7.8: FPU Fused Multiply-Accumulate Arithmetic Operations (Release 6)
Table 7.9: Floating Point Comparison Instructions
Table 7.10: FPU Conversion Operations Using the FCSR Rounding Mode
Table 7.11: FPU Conversion Operations Using a Directed Rounding Mode
Table 7.12: FPU Formatted Unconditional Operand Move Instructions
Table 7.13: FPU Conditional Move on True/False Instructions (Removed in Release 6)
Table 7.14: FPU Conditional Move on Zero/Nonzero Instructions (Removed in Release 6)
Table 7.15: FPU Conditional Select Instructions (Release 6)
Table 7.16: FPU Conditional Branch Instructions (Removed in Release 6)
Table 7.18: FPU Conditional Branch Instructions (Release 6)
Table 7.19: Miscellaneous Instructions (Removed in Release 6)
Table 7.17: Deprecated FPU Conditional Branch Likely Instructions (Removed in Release 6)
Table 7.20: FPU Operand Format Field (fmt, fmt3) Encoding
Table 7.21: Valid Formats for FPU Operations
Table 7.22: FPU Instruction Fields
Copyright © 2001-2003, 2005, 2008-2014 MIPS Technologies Inc. All rights reserved.
Chapter 1
About This Book
The MIPS® Architecture For Programmers Volume I-A: Introduction to the MIPS32® Architecture comes as part of a multi-volume set.
• Volume I-A describes conventions used throughout the document set, and provides an introduction to the MIPS32® Architecture.
• Volume I-B describes conventions used throughout the document set, and provides an introduction to the microMIPS32™ Architecture.
• Volume II-A provides detailed descriptions of each instruction in the MIPS32® instruction set.
• Volume II-B provides detailed descriptions of each instruction in the microMIPS32™ instruction set.
• Volume III describes the MIPS32® and microMIPS32™ Privileged Resource Architecture, which defines and governs the behavior of the privileged resources included in a MIPS® processor implementation.
• Volume IV-a describes the MIPS16e™ Application-Specific Extension to the MIPS32® Architecture. Beginning with Release 3 of the Architecture, microMIPS is the preferred solution for smaller code size. Release 6 removes MIPS16e: MIPS16e cannot be implemented with Release 6.
• Volume IV-b describes the MDMX™ Application-Specific Extension to the MIPS64® Architecture and microMIPS64™. It is not applicable to the MIPS32® document set or the microMIPS32™ document set. With Release 5 of the Architecture, MDMX is deprecated. MDMX and MSA cannot be implemented at the same time. Release 6 removes MDMX: MDMX cannot be implemented with Release 6.
• Volume IV-c describes the MIPS-3D® Application-Specific Extension to the MIPS® Architecture. Release 6 removes MIPS-3D: MIPS-3D cannot be implemented with Release 6.
• Volume IV-d describes the SmartMIPS® Application-Specific Extension to the MIPS32® Architecture and the microMIPS32™ Architecture. Release 6 removes SmartMIPS: SmartMIPS cannot be implemented with Release 6.
• Volume IV-e describes the MIPS® DSP Module to the MIPS® Architecture.
• Volume IV-f describes the MIPS® MT Module to the MIPS® Architecture.
• Volume IV-h describes the MIPS® MCU Application-Specific Extension to the MIPS® Architecture.
• Volume IV-i describes the MIPS® Virtualization Module to the MIPS® Architecture.
• Volume IV-j describes the MIPS® SIMD Architecture Module to the MIPS® Architecture.
1.1 Typographical Conventions
This section describes the use of italic, bold, and courier fonts in this book.
1.1.1 Italic Text
• is used for emphasis.
• is used for bits, fields, and registers that are important from a software perspective (for instance, address bits used by software and programmable fields and registers), and various floating-point instruction formats, such as S and D.
• is used for the memory access types, such as cached and uncached.
1.1.2 Bold Text
• represents a term that is being defined.
• is used for bits and fields that are important from a hardware perspective (for instance, register bits, which are not programmable but accessible only to hardware).
• is used for ranges of numbers; the range is indicated by an ellipsis. For instance, 5..1 indicates numbers 5 through 1.
• is used to emphasize UNPREDICTABLE and UNDEFINED behavior, as defined below.
1.1.3 Courier Text
Courier fixed-width font is used for text that is displayed on the screen, and for examples of code and instruction pseudocode.
1.1.4 Colored Text
RegisterGreen color and italic font are used for CP0 registers and CP0 register bits and fields. RegisterGreen color and italic subscript fonts are used for CP0 register bits and fields when appended to the register name.
1.2 UNPREDICTABLE and UNDEFINED
The terms UNPREDICTABLE and UNDEFINED are used throughout this book to describe the behavior of the processor in certain cases. UNDEFINED behavior or operations can occur only as the result of executing instructions in a privileged mode (i.e., in Kernel Mode or Debug Mode, or with the CP0 usable bit set in the Status register). Unprivileged software can never cause UNDEFINED behavior or operations. Conversely, both privileged and unprivileged software can cause UNPREDICTABLE results or operations.
1.2.1 UNPREDICTABLE
UNPREDICTABLE results may vary from processor implementation to implementation, instruction to instruction, or as a function of time on the same implementation or instruction. Software can never depend on results that are UNPREDICTABLE. UNPREDICTABLE operations may cause a result to be generated or not. If a result is generated, it is UNPREDICTABLE. UNPREDICTABLE operations may cause arbitrary exceptions.
UNPREDICTABLE results or operations have several implementation restrictions:
• Implementations of operations generating UNPREDICTABLE results must not depend on any data source (memory or internal state) which is inaccessible in the current processor mode.
• UNPREDICTABLE operations must not read, write, or modify the contents of memory or internal state which is inaccessible in the current processor mode. For example, UNPREDICTABLE operations executed in user mode must not access memory or internal state that is only accessible in Kernel Mode or Debug Mode or in another process.
• UNPREDICTABLE operations must not halt or hang the processor.
1.2.2 UNDEFINED
UNDEFINED operations or behavior may vary from processor implementation to implementation, instruction to instruction, or as a function of time on the same implementation or instruction. UNDEFINED operations or behavior may vary from nothing to creating an environment in which execution can no longer continue. UNDEFINED operations or behavior may cause data loss.
UNDEFINED operations or behavior has one implementation restriction:
• UNDEFINED operations or behavior must not cause the processor to hang (that is, enter a state from which there is no exit other than powering down the processor). The assertion of any of the reset signals must restore the processor to an operational state.
• UNDEFINED behavior in privileged modes such as Kernel mode becomes UNPREDICTABLE behavior when virtualized and executed in Guest Kernel mode, as described in Volume IV-i, the MIPS® Virtualization Module to the MIPS® Architecture.
1.2.3 UNSTABLE
UNSTABLE results or values may vary as a function of time on the same implementation or instruction. Unlike UNPREDICTABLE values, software may depend on the fact that a sampling of an UNSTABLE value results in a legal transient value that was correct at some point in time prior to the sampling.
UNSTABLE values have one implementation restriction:
• Implementations of operations generating UNSTABLE results must not depend on any data source (memory or internal state) which is inaccessible in the current processor mode.
1.3 Special Symbols in Pseudocode Notation
In this book, algorithmic descriptions of an operation are written in a high-level language pseudocode resembling Pascal. Special symbols used in the pseudocode notation are listed in Table 1.1.
Table 1.1 Symbols Used in Instruction Operation Statements

Symbol                  Meaning
------                  -------
←                       Assignment
=, ≠                    Tests for equality and inequality
||                      Bit string concatenation
x^y                     A y-bit string formed by y copies of the single-bit value x
b#n                     A constant value n in base b. For instance, 10#100 represents the decimal value 100, 2#100 represents the binary value 100 (decimal 4), and 16#100 represents the hexadecimal value 100 (decimal 256). If the "b#" prefix is omitted, the default base is 10.
0bn                     A constant value n in base 2. For instance, 0b100 represents the binary value 100 (decimal 4).
0xn                     A constant value n in base 16. For instance, 0x100 represents the hexadecimal value 100 (decimal 256).
x_{y..z}                Selection of bits y through z of bit string x. Little-endian bit notation (rightmost bit is 0) is used. If y is less than z, this expression is an empty (zero length) bit string.
x.bit[y]                Bit y of bitstring x. Alternative to the traditional MIPS notation x_y.
x.bits[y..z]            Selection of bits y through z of bit string x. Alternative to the traditional MIPS notation x_{y..z}.
x.byte[y]               Byte y of bitstring x. Equivalent to the traditional MIPS notation x_{8*y+7..8*y}.
x.bytes[y..z]           Selection of bytes y through z of bit string x. Alternative to the traditional MIPS notation x_{8*y+7..8*z}.
x.halfword[y],          Similar extraction of particular bitfields (used in, e.g., MSA packed SIMD vectors).
x.word[i],
x.doubleword[i]
x.bit31, x.byte0, etc.  Examples of abbreviated form of x.bit[y], etc. notation, when y is a constant.
x.fieldy                Selection of a named subfield of bitstring x, typically a register or instruction encoding.
                        More formally described as "Field y of register x".
                        For example, FIR_D = "the D bit of the Coprocessor 1 Floating-point Implementation Register (FIR)".
+, −                    2's complement or floating point arithmetic: addition, subtraction
*, ×                    2's complement or floating point multiplication (both used for either)
div                     2's complement integer division
mod                     2's complement modulo
/                       Floating point division
<                       2's complement less-than comparison
>                       2's complement greater-than comparison
≤                       2's complement less-than or equal comparison
≥                       2's complement greater-than or equal comparison
nor                     Bitwise logical NOR
xor                     Bitwise logical XOR
and                     Bitwise logical AND
or                      Bitwise logical OR
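As an informal illustration of the bit-string notation in Table 1.1, the following C sketch models bit selection (x_{y..z}), replication of a single-bit value (x^y), and concatenation (||) on 32-bit values. The helper names bits, replicate, concat, and sign_extend16 are hypothetical and are not part of the architecture. The sketch assumes that every bit string fits in 32 bits and models an empty bit string as 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers modeling Table 1.1 notation on strings of at most 32 bits. */

/* x_{y..z}: bits y down to z of x (bit 0 is the rightmost bit). */
static uint32_t bits(uint32_t x, unsigned y, unsigned z)
{
    assert(y < 32 && z < 32);
    if (y < z)
        return 0;                       /* empty bit string, modeled as 0 */
    unsigned width = y - z + 1;
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (x >> z) & mask;
}

/* x^y: a y-bit string formed by y copies of the single-bit value x. */
static uint32_t replicate(unsigned x, unsigned y)
{
    assert(x <= 1 && y <= 32);
    if (x == 0 || y == 0)
        return 0;
    return (y == 32) ? 0xFFFFFFFFu : ((1u << y) - 1u);
}

/* a || b: concatenation, where b is b_width bits wide and already fits in b_width bits. */
static uint32_t concat(uint32_t a, uint32_t b, unsigned b_width)
{
    assert(b_width < 32);
    return (a << b_width) | b;
}

/* Sign extension of a 16-bit value, written as (imm_15)^16 || imm_{15..0}. */
static uint32_t sign_extend16(uint32_t imm)
{
    return concat(replicate(bits(imm, 15, 15), 16), bits(imm, 15, 0), 16);
}
```

For instance, under these assumptions sign_extend16(0x8000) evaluates to 0xFFFF8000, and bits(0x100, 8, 8) evaluates to 1.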
Table 1.1 Symbols Used in Instruction Operation Statements (Continued)

Symbol                  Meaning
------                  -------
not                     Bitwise inversion
&&                      Logical (non-Bitwise) AND
<<                      Logical Shift left (shift in zeros at right-hand-side)
>>                      Logical Shift right (shift in zeros at left-hand-side)
GPRLEN                  The length in bits (32 or 64) of the CPU general-purpose registers
GPR[x]                  CPU general-purpose register x. The content of GPR[0] is always zero. In Release 2 of the Architecture, GPR[x] is a short-hand notation for SGPR[SRSCtl_CSS, x].
SGPR[s,x]               In Release 2 of the Architecture and subsequent releases, multiple copies of the CPU general-purpose registers may be implemented. SGPR[s,x] refers to GPR set s, register x.
FPR[x]                  Floating Point operand register x
FCC[CC]                 Floating Point condition code CC. FCC[0] has the same value as COC[1]. Release 6 removes the floating point condition codes.
FPR[x]                  Floating Point (Coprocessor unit 1), general register x
CPR[z,x,s]              Coprocessor unit z, general register x, select s
CP2CPR[x]               Coprocessor unit 2, general register x
CCR[z,x]                Coprocessor unit z, control register x
CP2CCR[x]               Coprocessor unit 2, control register x
COC[z]                  Coprocessor unit z condition signal
Xlat[x]                 Translation of the MIPS16e GPR number x into the corresponding 32-bit GPR number
BigEndianMem            Endian mode as configured at chip reset (0 → Little-Endian, 1 → Big-Endian). Specifies the endianness of the memory interface (see LoadMemory and StoreMemory pseudocode function descriptions) and the endianness of Kernel and Supervisor mode execution.
BigEndianCPU            The endianness for load and store instructions (0 → Little-Endian, 1 → Big-Endian). In User mode, this endianness may be switched by setting the RE bit in the Status register. Thus, BigEndianCPU may be computed as (BigEndianMem XOR ReverseEndian).
ReverseEndian           Signal to reverse the endianness of load and store instructions. This feature is available in User mode only, and is implemented by setting the RE bit of the Status register. Thus, ReverseEndian may be computed as (SR_RE and User mode).
LLbit                   Bit of virtual state used to specify operation for instructions that provide atomic read-modify-write. LLbit is set when a linked load occurs and is tested by the conditional store. It is cleared, during other CPU operation, when a store to the location would no longer be atomic. In particular, it is cleared by exception return instructions.
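The relationship among BigEndianMem, ReverseEndian, and BigEndianCPU stated in the table can be sketched in C as follows. The function names and the boolean parameters are illustrative assumptions. How an implementation determines User mode or reads the RE bit of the Status register is not modeled here.

```c
#include <stdbool.h>

/* ReverseEndian = (SR_RE and User mode). */
static bool reverse_endian(bool status_re, bool user_mode)
{
    return status_re && user_mode;
}

/* BigEndianCPU = (BigEndianMem XOR ReverseEndian). */
static bool big_endian_cpu(bool big_endian_mem, bool status_re, bool user_mode)
{
    /* For two boolean values, "not equal" is the same as exclusive OR. */
    return big_endian_mem != reverse_endian(status_re, user_mode);
}
```

In this sketch the RE bit changes the result only when user_mode is true, which matches the statement that the feature is available in User mode only.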
Table 1.1 Symbols Used in Instruction Operation Statements (Continued)

Symbol                  Meaning
------                  -------
I:, I+n:, I-n:          This occurs as a prefix to Operation description lines and functions as a label. It indicates the instruction time during which the pseudocode appears to "execute." Unless otherwise indicated, all effects of the current instruction appear to occur during the instruction time of the current instruction. No label is equivalent to a time label of I. Sometimes effects of an instruction appear to occur either earlier or later, that is, during the instruction time of another instruction. When this happens, the instruction operation is written in sections labeled with the instruction time, relative to the current instruction I, in which the effect of that pseudocode appears to occur. For example, an instruction may have a result that is not available until after the next instruction. Such an instruction has the portion of the instruction operation description that writes the result register in a section labeled I+1.
                        The effect of pseudocode statements for the current instruction labeled I+1 appears to occur "at the same time" as the effect of pseudocode statements labeled I for the following instruction. Within one pseudocode sequence, the effects of the statements take place in order. However, between sequences of statements for different instructions that occur "at the same time," there is no defined order. Programs must not depend on a particular order of evaluation between such sections.
PC                      The Program Counter value. During the instruction time of an instruction, this is the address of the instruction word. The address of the instruction that occurs during the next instruction time is determined by assigning a value to PC during an instruction time. If no value is assigned to PC during an instruction time by any pseudocode statement, it is automatically incremented by either 2 (in the case of a 16-bit MIPS16e instruction) or 4 before the next instruction time. A taken branch assigns the target address to the PC during the instruction time of the instruction in the branch delay slot.
                        In the MIPS Architecture, the PC value is only visible indirectly, such as when the processor stores the restart address into a GPR on a jump-and-link or branch-and-link instruction, or into a Coprocessor 0 register on an exception. Release 6 adds PC-relative address computation and load instructions. The PC value contains a full 32-bit address, all bits of which are significant during a memory reference.
ISA Mode                In processors that implement the MIPS16e Application Specific Extension or the microMIPS base architectures, the ISA Mode is a single-bit register that determines in which mode the processor is executing, as follows:
                        Encoding  Meaning
                        0         The processor is executing 32-bit MIPS instructions
                        1         The processor is executing MIPS16e or microMIPS instructions
                        In the MIPS Architecture, the ISA Mode value is only visible indirectly, such as when the processor stores a combined value of the upper bits of PC and the ISA Mode into a GPR on a jump-and-link or branch-and-link instruction, or into a Coprocessor 0 register on an exception.
PABITS                  The number of physical address bits implemented is represented by the symbol PABITS. As such, if 36 physical address bits were implemented, the size of the physical address space would be 2^PABITS = 2^36 bytes.
FP32RegistersMode       Indicates whether the FPU has 32-bit or 64-bit floating point registers (FPRs). In MIPS32 Release 1, the FPU has 32, 32-bit FPRs, in which 64-bit data types are stored in even-odd pairs of FPRs. In MIPS64 (and optionally in MIPS32 Release 2 and Release 3), the FPU has 32 64-bit FPRs in which 64-bit data types are stored in any FPR.
                        In MIPS32 Release 1 implementations, FP32RegistersMode is always a 0. MIPS64 implementations have a compatibility mode in which the processor references the FPRs as if it were a MIPS32 implementation. In such a case FP32RegistersMode is computed from the FR bit in the Status register. If this bit is a 0, the processor operates as if it had 32, 32-bit FPRs. If this bit is a 1, the processor operates with 32 64-bit FPRs.
                        The value of FP32RegistersMode is computed from the FR bit in the Status register.
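The "combined value of the upper bits of PC and the ISA Mode" mentioned in the ISA Mode entry can be pictured with a small C sketch. It assumes that the ISA Mode occupies bit 0 of the combined 32-bit value and that the upper bits are PC bits 31..1. The table itself does not spell out that layout, and the function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: combined value = PC_{31..1} || ISAMode, with ISA Mode in bit 0. */
static uint32_t combine_pc_isa_mode(uint32_t pc, unsigned isa_mode)
{
    return (pc & 0xFFFFFFFEu) | (isa_mode & 1u);
}

/* Recover the instruction address (bit 0 cleared) from a combined value. */
static uint32_t combined_to_pc(uint32_t combined)
{
    return combined & 0xFFFFFFFEu;
}

/* Recover the ISA Mode: 0 = 32-bit MIPS, 1 = MIPS16e or microMIPS. */
static unsigned combined_to_isa_mode(uint32_t combined)
{
    return (unsigned)(combined & 1u);
}
```

Under these assumptions, a return address of 0x80001234 saved while executing in ISA Mode 1 would be stored as 0x80001235, and both parts can be recovered from that single value.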
Table 1.1 Symbols Used in Instruction Operation Statements (Continued)

Symbol                                Meaning
------                                -------
InstructionInBranchDelaySlot          Indicates whether the instruction at the Program Counter address was executed in the delay slot of a branch or jump. This condition reflects the dynamic state of the instruction, not the static state. That is, the value is false if a branch or jump occurs to an instruction whose PC immediately follows a branch or jump, but which is not executed in the delay slot of a branch or jump.
SignalException(exception, argument)  Causes an exception to be signaled, using the exception parameter as the type of exception and the argument parameter as an exception-specific argument. Control does not return from this pseudocode function; the exception is signaled at the point of the call.
1.4 Notation for Register Field Accessibility
In this document, the read/write properties of register fields use the notations shown in Table 1.2.
Table 1.2 Read/Write Register Field Notation
Read/Write Notation: R/W
Hardware and Software Interpretation:
  A field in which all bits are readable and writable by software and, potentially, by hardware.
  Hardware updates of this field are visible by software read. Software updates of this field are visible by hardware read.
  If the Reset State of this field is "Undefined", either software or hardware must initialize the value before the first read will return a predictable value. This should not be confused with the formal definition of UNDEFINED behavior.

Read/Write Notation: R
Hardware Interpretation:
  A field which is either static or is updated only by hardware.
  If the Reset State of this field is either "0", "Preset", or "Externally Set", hardware initializes this field to zero or to the appropriate state, respectively, on power-up. The term "Preset" is used to suggest that the processor establishes the appropriate state, whereas the term "Externally Set" is used to suggest that the state is established via an external source (e.g., personality pins or initialization bit stream). These terms are suggestions only, and are not intended to act as a requirement on the implementation.
  If the Reset State of this field is "Undefined", hardware updates this field only under those conditions specified in the description of the field.
Software Interpretation:
  A field to which the value written by software is ignored by hardware. Software may write any value to this field without affecting hardware behavior. Software reads of this field return the last value updated by hardware.
  If the Reset State of this field is "Undefined", software reads of this field result in an UNPREDICTABLE value except after a hardware update done under the conditions specified in the description of the field.

Read/Write Notation: R0
Hardware and Software Interpretation:
  Reserved, read as zero, ignore writes by software.
Hardware Interpretation:
  Hardware ignores software writes to an R0 field. Neither the occurrence of such writes, nor the values written, affects hardware behavior.
  Hardware always returns 0 to software reads of R0 fields.
  The Reset State of an R0 field must always be 0.
  If software performs an mtc0 instruction which writes a non-zero value to an R0 field, the write to the R0 field will be ignored, but permitted writes to other fields in the register will not be affected.
Software Interpretation:
  Architectural Compatibility: R0 fields are reserved, and may be used for not-yet-defined purposes in future revisions of the architecture.
  When writing an R0 field, current software should only write either all 0s, or, preferably, write back the same value that was read from the field.
  Current software should not assume that the value read from R0 fields is zero, because this may not be true on future hardware.
  Future revisions of the architecture may redefine an R0 field, but must do so in such a way that software which is unaware of the new definition and either writes zeros or writes back the value it has read from the field will continue to work correctly.
  Writing back the same value that was read is guaranteed to have no unexpected effects on current or future hardware behavior (except for non-atomicity of such read-writes).
  Writing zeros to an R0 field may not be preferred because in the future this may interfere with the operation of other software which has been updated for the new field definition.

Read/Write Notation: 0 (Release 6)
Hardware and Software Interpretation:
  Release 6 legacy "0" behaves like R0: read as zero, nonzero writes ignored.
  Legacy "0" should not be defined for any new control register fields; R0 should be used instead.
Hardware Interpretation:
  HW returns 0 when read. HW ignores writes.
Software Interpretation:
  Only zero should be written, or the value read from the register.

Read/Write Notation: 0 (pre-Release 6)
Hardware and Software Interpretation:
  pre-Release 6 legacy "0": read as zero, nonzero writes are UNDEFINED.
Hardware Interpretation:
  A field which hardware does not update, and for which hardware can assume a zero value.
Software Interpretation:
  A field to which the value written by software must be zero. Software writes of non-zero values to this field may result in UNDEFINED behavior of the hardware. Software reads of this field return zero as long as all previous software writes are zero.
  If the Reset State of this field is "Undefined", software must write this field with zero before it is guaranteed to read as zero.

Read/Write Notation: R/W0
Hardware and Software Interpretation:
  Like R/W, except that writes of non-zero to an R/W0 field are ignored (e.g., write to Status_NMI).
Hardware Interpretation:
  Hardware may set or clear an R/W0 bit.
  Hardware ignores software writes of nonzero to an R/W0 field. Neither the occurrence of such writes, nor the values written, affects hardware behavior.
  Software writes of 0 to an R/W0 field may have an effect.
  Hardware may return 0 or nonzero to software reads of an R/W0 bit.
  If software performs an mtc0 instruction which writes a non-zero value to an R/W0 field, the write to the R/W0 field will be ignored, but permitted writes to other fields in the register will not be affected.
Software Interpretation:
  Software can only clear an R/W0 bit.
  Software writes 0 to an R/W0 field to clear the field.
  Software writes nonzero to an R/W0 bit in order to guarantee that the bit is not affected by the write.

Read/Write Notation: W0
Hardware and Software Interpretation:
  Like R/W0, except that the field cannot be read directly, but only through a level of indirection. An example is the UNFR COP1 register. Writes of non-zero to a W0 field are ignored.
Hardware Interpretation:
  Hardware may clear a W0 bit.
  Hardware ignores software writes of nonzero to a W0 field. Neither the occurrence of such writes, nor the values written, affects hardware behavior.
Software Interpretation:
  Software can only clear a W0 bit.
  Software writes 0 to a W0 field to clear the field.
  Software writes nonzero to a W0 bit in order to guarantee that the bit is not affected by the write.
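The write behavior described in Table 1.2 can be illustrated with a simplified C model of a single 32-bit register. This is only a sketch under stated assumptions: the struct field_masks layout and the function names are hypothetical, every R/W0 field is assumed to be a single bit, hardware-initiated updates are not modeled, and bits not covered by any mask are treated as zero.

```c
#include <stdint.h>

/* Hypothetical field layout of one 32-bit register. Each mask marks the bits
   that carry the given Table 1.2 notation; the masks must not overlap. */
struct field_masks {
    uint32_t rw;    /* R/W fields                               */
    uint32_t r;     /* R fields                                 */
    uint32_t r0;    /* R0 fields                                */
    uint32_t rw0;   /* R/W0 fields, assumed to be single bits   */
};

/* Model of how hardware applies a software write (for example, by mtc0). */
static uint32_t apply_write(uint32_t old, uint32_t written, struct field_masks m)
{
    uint32_t next = 0;
    next |= written & m.rw;         /* R/W: the software value is stored       */
    next |= old & m.r;              /* R: the software write is ignored        */
                                    /* R0: bits stay 0, writes are ignored     */
    next |= old & written & m.rw0;  /* R/W0: writing 0 clears, 1 has no effect */
    return next;
}

/* Software side: change one R/W field and write all other bits back as read,
   which is the approach the table prefers for reserved fields. */
static uint32_t modify_field(uint32_t read_value, uint32_t field_mask, uint32_t new_bits)
{
    return (read_value & ~field_mask) | (new_bits & field_mask);
}
```

In this model, writing all ones changes only the R/W bits, while writing all zeros clears the R/W and R/W0 bits and leaves the R bits as they were.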
1.5 For More Information
MIPS processor manuals and additional information about MIPS products can be found at http://www.mips.com.
Chapter 2
Overview of the MIPS® Architecture
MIPS32® and MIPS64® architectures are high performance, industry-standard architectures that provide a robust and streamlined instruction set, with scalability from 32 bits to 64 bits, and are supported by a broad array of hardware and software development tools, including compilers, debuggers, in-circuit emulators, middleware, application platforms, and reference designs.
The MIPS architecture is based on a fixed-length, regularly encoded instruction set and uses a load/store data model, in which all operations are performed on operands in processor registers, and main memory is accessed only by load and store instructions. The load/store model reduces the number of memory accesses, thus easing memory bandwidth requirements, simplifies the instruction set, and makes it easier for compilers to optimize register allocation.
2.1 Historical Perspective
The MIPS Architecture has evolved over time from the original MIPS I™, through the MIPS V™, to the current MIPS32, MIPS64, and microMIPS™ architectures. Throughout the evolution of the architecture, each new ISA has been backward-compatible with previous ISAs. In the MIPS III™ ISA, 64-bit integers and addresses were added to the instruction set. The MIPS IV™ and MIPS V™ ISAs added improved floating-point operations and a new set of instructions that improved the efficiency of generated code and of data movement. Because of the strict backward-compatible requirement of ISAs, such changes were unavailable to 32-bit implementations of the ISA that were, by definition, MIPS I™ or MIPS II™ implementations. The MIPS32 Release 6 ISA maintains backward-compatibility, with the exception of a few rarely used instructions, through the use of trap-and-emulate or trap-and-patch; all pre-Release 6 binaries can execute under binary translation.
While the user-mode ISA was always backward-compatible, the PRA and the privileged-mode ISA were allowed to change on a per-implementation basis. As a result, the R3000® privileged environment was different from the R4000® privileged environment, and subsequent implementations, while similar to the R4000 privileged environment, included subtle differences. Because the privileged environment was never part of the MIPS ISA, an implementation had the flexibility to make changes to suit that particular implementation. Unfortunately, this required kernel software changes to every operating system or kernel environment on which that implementation was intended to run.
Many of the original MIPS implementations were targeted at computer-like applications such as workstations and servers. In recent years MIPS implementations have had significant success in embedded applications. Today, most of the MIPS parts that are shipped go into some sort of embedded application. Such applications tend to have different trade-offs than computer-like applications including a focus on cost of implementation, and performance as a function of cost and power.
The MIPS32 and MIPS64 Architectures are intended to address the need for a high-performance but cost-sensitive MIPS instruction set. The MIPS32 Architecture is based on the MIPS II ISA, adding selected instructions from MIPS III, MIPS IV, and MIPS V to improve the efficiency of generated code and of data movement. The MIPS64 Architecture is based on the MIPS V ISA and is backward compatible with the MIPS32 Architecture. Both the MIPS32 and MIPS64 Architectures bring the privileged environment into the Architecture definition to address the needs of operating systems and other kernel software. Both also include provision for adding optional components (Modules of the base architecture, MIPS Application Specific Extensions (ASEs), User Defined Instructions (UDIs), and custom coprocessors) to address the specific needs of particular markets.
The MIPS32 and MIPS64 Architectures provide a substantial cost/performance advantage over microprocessor implementations based on traditional architectures. This advantage is a result of improvements made in several contiguous disciplines: VLSI process technology, CPU organization, system-level architecture, and operating system and compiler design.
The microMIPS32 and microMIPS64 Architectures provide the same functionality as MIPS32 and MIPS64, with the additional benefit of smaller code size. The microMIPS architectures are supersets of MIPS32/MIPS64 architectures, with almost the same sets of 32-bit sized instructions and additional 16-bit instructions that reduce code size. Unlike the earlier versions of the architecture, microMIPS provides assembler-source code compatibility with its predecessors instead of binary compatibility.
2.2 Components of the MIPS® Architecture
2.2.1 MIPS Instruction Set Architecture (ISA)
The MIPS32 and MIPS64 Instruction Set Architectures define a compatible family of instructions that handle 32-bit data and 64-bit data (respectively) within the framework of the overall MIPS Architecture. Included in the ISA are all instructions, both privileged and unprivileged, by which the programmer interfaces with the processor. The ISA guarantees object-code compatibility for unprivileged programs executing on any MIPS32 or MIPS64 processor; all instructions in the MIPS64 ISA are backward compatible with those instructions in the MIPS32 ISA. In many cases, privileged programs are also object-code compatible: using conditional compilation or assembly language macros, it is often possible to write privileged programs that run on both MIPS32 and MIPS64 implementations.
In Release 6 implementations, object-code compatibility is not guaranteed when directly executing pre-Release 6 code, because certain pre-Release 6 instruction encodings are allocated to completely different instructions on Release 6. Nevertheless, there is a useful subset of instructions that have the same encodings in both Release 6 and pre-Release 6, and an even larger subset that can be trapped and emulated. Furthermore, using conditional compilation or assembly language macros, it is often possible to write software that runs on both Release 6 and pre-Release 6 implementations. Binary compatibility can be obtained by binary translation; Release 6 is designed so that simple instruction replacement can accomplish all such binary translation, minimizing remapping of instruction addresses.
For example, to binary translate/patch a pre-Release 6 binary, the Release 6 compact branch instructions, with no delay slots, mean that any instruction can be replaced by a BALC single instruction call to an emulation function – assuming that the emulation function can be reached by the branch target with its 26 bit / 256MB span, and that the link register can be overwritten – which a binary translator can usually arrange. A single BC instruction avoids using the link register, at the cost of more emulation entry points. JC/JIALC can also be used, although their smaller 16-bit offset probably requires the binary translator to use an extra register.
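The "26 bit / 256MB span" quoted above follows from simple arithmetic: a 26-bit signed offset counted in 4-byte instruction words covers 2^26 × 4 = 2^28 bytes, or 256 MB, roughly 128 MB on either side of the branch. The C sketch below shows how a binary translator might check whether an emulation function is within reach. It assumes that the offset is applied relative to the address following the branch (PC + 4) and that address wrap-around can be ignored; neither detail is stated in this volume, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for this sketch: a 26-bit signed offset counted in 4-byte
   instruction words, applied relative to the address after the branch. */
#define OFFSET_BITS 26

static bool balc_can_reach(uint32_t branch_pc, uint32_t target)
{
    /* Byte displacement from the instruction that follows the branch. */
    int64_t disp = (int64_t)target - ((int64_t)branch_pc + 4);

    /* Smallest offset is -2^25 words, which is -2^27 bytes (-128 MB). */
    int64_t min_disp = -((int64_t)1 << (OFFSET_BITS + 1));

    /* Largest offset is 2^25 - 1 words, which is 2^27 - 4 bytes. */
    int64_t max_disp = ((((int64_t)1) << (OFFSET_BITS - 1)) - 1) * 4;

    /* The target must be word-aligned and inside the signed range. */
    return (disp % 4 == 0) && disp >= min_disp && disp <= max_disp;
}
```

A translator built along these lines would fall back to another replacement sequence, such as the JC/JIALC form mentioned above, whenever the check fails.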
Release 6 and subsequent releases will be backward compatible going forward, i.e., Release 6 code will run on all subsequent releases.
2.2.2 MIPS Privileged Resource Architecture (PRA)
The MIPS32 and MIPS64 Privileged Resource Architectures define a set of environments and capabilities on which the ISA operates. The effects of some components of the PRA are visible to unprivileged programs; for instance, the virtual memory layout. Many other components are visible only to privileged programs and the operating system. The PRA provides the mechanisms necessary to manage the resources of the processor: virtual memory, caches, exceptions, user contexts, etc.
2.2.3 MIPS Modules and Application Specific Extensions (ASEs)
The MIPS32 and MIPS64 Architectures provide support for optional components, known as either Modules or application specific extensions. As optional extensions to the base architecture, the Modules/ASEs do not burden every implementation of the architecture with instructions or capability that are not needed in a particular market. An ASE/Module can be used with the appropriate ISA and PRA to meet the needs of a specific application or an entire class of applications.
2.2.4 MIPS User Defined Instructions (UDIs)
In addition to support for ASEs and Modules as described above, the MIPS32 and MIPS64 Architectures define specific instructions for use by each implementation. The Special2 and/or COP2 major opcodes and Coprocessor 2 are reserved for capabilities defined by each implementation. In Release 6, use of the Special2 opcode is not permitted.
2.3 Evolution of the Architecture
The evolution of an architecture is a dynamic process that takes into account both the need to provide a stable platform for implementations, as well as new market and application areas that demand new capabilities. Enhancements to an architecture are appropriate when they:
• are applicable to a wide market
• provide long-term benefit
• maintain architectural scalability
• are standardized to prevent fragmentation
• are a superset of the existing architecture
Taking into account these criteria, architects at MIPS Technologies constantly evaluate suggestions for architectural changes and enhancements, and new releases of the architecture, while infrequent, have been made at appropriate points:
• Release 1, the original version of the MIPS32 Architecture, released in 1985
• Release 2, added in 2002
• Release 3 (MIPSr3™), added in 2010
• Release 4, added in 2012. For internal use only.
• Release 5, added in 2013
• Release 6, added in 2014
The evolution of the MIPS architecture is summarized in Figure 2.1.
offset be aligned, but the implication is that the base register is also aligned, and this is more consistent with the indexed load/store instructions which have no offset field. The restriction that the base register be naturally-aligned is eliminated in the MIPS32 Architecture, leaving the restriction that the effective address be naturally-aligned.
• Early MIPS implementations required two instructions separating a MFLO or MFHI from the next integer multiply or divide operation. This hazard was eliminated in the MIPS IV ISA, although the MIPS RISC Architecture Specification does not clearly explain this fact. The MIPS32 Architecture explicitly eliminates this hazard and requires that the hi and lo registers be fully interlocked in hardware for all integer multiply and divide instructions (including, but not limited to, the MADD, MADDU, MSUB, MSUBU, and MUL instructions introduced in this specification).
• The Implementation and Programming Notes included in the instruction descriptions for the MADD, MADDU, MSUB, MSUBU, and MUL instructions should be applied to all integer multiply and divide instructions in the MIPS RISC Architecture Specification.
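The alignment rule in the list above, under which only the effective address must be naturally aligned, can be sketched in C as follows. The function names are hypothetical, the access size is assumed to be a power of two, and the 16-bit signed offset is an assumption about the load/store format rather than something stated in this section.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when addr is a multiple of size; size must be a power of two (1, 2, 4, 8). */
static bool is_naturally_aligned(uint32_t addr, uint32_t size)
{
    return (addr & (size - 1u)) == 0;
}

/* Only the effective address (base + offset) is checked, not the base
   register or the offset on their own. */
static bool access_is_aligned(uint32_t base, int16_t offset, uint32_t size)
{
    uint32_t effective = base + (uint32_t)(int32_t)offset;
    return is_naturally_aligned(effective, size);
}
```

With this rule, a word access with base 0x1002 and offset 2 is acceptable because the effective address 0x1004 is word-aligned, even though the base register itself is not.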
2.3.2 MIPS32 Architecture Release 2
Enhancements in Release 2 of the MIPS32 Architecture are:
• Vectored interrupts: This enhancement provides the ability to vector interrupts directly to a handler for that interrupt. Vectored interrupts are an option in Release 2 implementations and the presence of that option is denoted by the Config3VInt bit.
• Support for an external interrupt controller: This enhancement reconfigures the on-core interrupt logic to take full advantage of an external interrupt controller. This support is an option in Release 2 implementations and the presence of that option is denoted by the Config3EIC bit.
• Programmable exception vector base: This enhancement allows the base address of the exception vectors to be moved for exceptions that occur when StatusBEV is 0. Doing so allows multi-processor systems to have separate exception vectors for each processor, and allows any system to place the exception vectors in memory that is appropriate to the system environment. This enhancement is required in a Release 2 implementation.
• Atomic interrupt enable/disable: Two instructions have been added to atomically enable or disable interrupts, and return the previous value of the Status register. These instructions are required in a Release 2 implementation.
• The ability to disable the Count register for highly power-sensitive applications. This enhancement is required in a Release 2 implementation.
• GPR shadow registers: This enhancement adds GPR shadow registers and the ability to bind these registers to a vectored interrupt or exception. Shadow registers are an option in Release 2 implementations and the presence of that option is denoted by a non-zero value in SRSCtlHSS. While shadow registers are most useful when either vectored interrupts or support for an external interrupt controller is also implemented, neither is required.
• Field, Rotate and Shuffle instructions: These instructions add additional capability in processing bit fields in registers. These instructions are required in a Release 2 implementation.
• Explicit hazard management: This enhancement provides a set of instructions to explicitly manage hazards, in place of the cycle-based SSNOP method of dealing with hazards. These instructions are required in a Release 2 implementation.
• Access to a new class of hardware registers and state from an unprivileged mode. This enhancement is required in a Release 2 implementation.
• Coprocessor 0 Register changes: These changes add or modify CP0 registers to indicate the existence of new and optional state, provide L2 and L3 cache identification, add trigger bits to the Watch registers, and add support for 64-bit performance counter count registers. This enhancement is required in a Release 2 implementation.
• Support for 64-bit coprocessors with 32-bit CPUs: These changes allow a 64-bit coprocessor (including an FPU) to be attached to a 32-bit CPU. This enhancement is optional in a Release 2 implementation.
• New Support for Virtual Memory: These changes provide support for a 1KByte page size. This change is optional in Release 2 implementations, and support is denoted by Config3SP.
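To make the Field, Rotate and Shuffle item in the list above more concrete, the following Python sketch models the register-level effect of a few such instructions on 32-bit values. The mnemonics (EXT, INS, ROTR, WSBH, SEB) are taken from the MIPS32 instruction set rather than from this section, and the Python function names and argument order are illustrative assumptions; the instruction set volume remains the authoritative definition, including restrictions on the pos and size operands that this model does not check.

```python
MASK32 = 0xFFFFFFFF  # model registers as unsigned 32-bit values

def ext(rs: int, pos: int, size: int) -> int:
    """EXT-like: extract `size` bits of rs starting at bit `pos`, zero-extended."""
    return (rs >> pos) & ((1 << size) - 1)

def ins(rt: int, rs: int, pos: int, size: int) -> int:
    """INS-like: insert the low `size` bits of rs into rt at bit `pos`."""
    mask = ((1 << size) - 1) << pos
    return ((rt & ~mask) | ((rs << pos) & mask)) & MASK32

def rotr(rt: int, sa: int) -> int:
    """ROTR-like: rotate a 32-bit value right by `sa` bits."""
    sa &= 31
    return ((rt >> sa) | (rt << (32 - sa))) & MASK32

def wsbh(rt: int) -> int:
    """WSBH-like: swap the two bytes within each halfword."""
    return (((rt & 0x00FF00FF) << 8) | ((rt >> 8) & 0x00FF00FF)) & MASK32

def seb(rt: int) -> int:
    """SEB-like: sign-extend the low byte to 32 bits."""
    value = rt & 0xFF
    return value | 0xFFFFFF00 if value & 0x80 else value
```

In this model, extracting the second byte of 0x12345678 is `ext(0x12345678, 8, 8)`, and a full byte reversal can be composed as `rotr(wsbh(x), 16)`.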
2.3.3 MIPS32 Architecture Releases 2.5+
Some optional features were added after Revision 2.5:
• Support for a MMU with more than 64 TLB entries. This feature aids in reducing the frequency of TLB misses.
• Scratch registers within Coprocessor 0 for kernel-mode software. This feature speeds up exception handling because user-mode registers need not be saved onto the stack before kernel-mode software uses those registers.
• A MMU configuration which supports both larger set-associative TLBs and variable page-sizes. This feature aids in reducing the frequency of TLB misses.
• The CDMM memory scheme for the placement of small I/O devices into the physical address space. This scheme allows for efficient placement of such I/O devices into a small memory region.
• An EIC interrupt mode where the EIC controller supplies a 16-bit interrupt vector. This allows different interrupts to share code.
• The PAUSE instruction to deallocate a (virtual) processor when arbitration for a lock doesn’t succeed. This allows for lower power consumption as well as lower snoop traffic when multiple (virtual) processors are arbitrating for a lock.
• More flavors of memory barriers that are available through the stype field of the SYNC instruction. The newer memory barriers attempt to minimize the number of pipeline stalls while doing memory synchronization operations.
2.3.4 MIPS32 Release 3 Architecture (MIPSr3™)
MIPSr3™ is a family of architectures that includes Release 3.0 of the MIPS32 Architecture and the first release of the microMIPS32 architecture.
Enhancements in the MIPS Release 3 Architecture are:
• microMIPS instruction set.
• This instruction set contains both 16-bit and 32-bit sized instructions.
• The microMIPS32 ISA has all of the functionality of MIPS32 with smaller code sizes.
• microMIPS32 is assembler source code compatible with MIPS32.
• microMIPS32 replaces the MIPS16e ASE.
• microMIPS32 is an additional base instruction set architecture that is supported along with MIPS32.
• A device can implement either the base ISA or both. The ISA field of the Config3 register denotes which ISA is implemented.
• A device can implement any other Module/ASE with either base architecture.
• microMIPS32 shares the same privileged resource architecture with MIPS32.
• Branch Likely instructions are not supported in the microMIPS hardware architecture. The microMIPS toolchain replaces these instructions with equivalent code sequences.
• A more flexible version of the Context Register that can point to any power-of-two sized data structure. This optional feature is denoted by the CTXTC field of Config3.
• Additional protection bits in the TLB entries that allow for non-executable and write-only virtual pages. This optional feature is denoted by the RXI field of Config3.
• A more programmable virtual address space map without fixed cacheability and mapability attributes is introduced as an optional feature. This allows implementations to decide how large/small uncached/unmapped segments need to be. These capabilities are implemented through the Segmentation Control registers. This optional feature is denoted by the SC field of Config3.
• Along with a programmable virtual address map, it is possible to create separate user-mode & kernel-mode views of segments. This allows a larger kernel virtual address space to be defined. To access both this larger kernel address space and the overlapping user-space, additional load/store instructions are introduced. These new optional instructions are denoted by the EVA field of Config5.
• Support for certain IEEE-754-2008 FPU behaviors (as opposed to behaviors of the older IEEE-754-1985 standard) is now defined. These behaviors are indicated by the Has2008 field of the FIR register in the FPU and bits ABS2008 or NAN2008 in the FCSR register.
• Optional TLB invalidate instructions are introduced. These are required for Segmentation Control that allows creation of a virtual address map without unmapped segments.
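The effect of the additional TLB protection bits listed above can be sketched as a simplified permission check. This section only states that non-executable and write-only pages become possible; the per-page flag names used below (valid, writable, read-inhibit, execute-inhibit), the `PagePermissions` type, and the `access_allowed` helper are assumptions chosen for illustration, and the privileged architecture volume defines the actual bits and the exceptions raised.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PagePermissions:
    """Hypothetical per-page flags for a simplified TLB entry model."""
    valid: bool
    writable: bool          # assumed write-enable flag
    read_inhibit: bool      # assumed flag that blocks data reads
    execute_inhibit: bool   # assumed flag that blocks instruction fetches

def access_allowed(page: PagePermissions, access: str) -> bool:
    """Return True if `access` ("read", "write" or "execute") is permitted."""
    if access not in ("read", "write", "execute"):
        raise ValueError(f"unknown access type: {access}")
    if not page.valid:
        return False
    if access == "read":
        return not page.read_inhibit
    if access == "execute":
        return not page.execute_inhibit
    return page.writable
```

Under these assumptions, a write-only page sets both inhibit flags together with the write-enable flag, while a non-executable data page sets only the execute-inhibit flag.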
2.3.5 MIPS32 Architecture Release 5
Release 5 is a family of architectures (MIPS32, MIPS64, microMIPS32, and microMIPS64) that adds the following capabilities:
• The Multi-threading module is now an optional component of all of the base architectures. Previously the MT ASE was licensed as a separate architecture product.
• The DSP module is now an optional component of all the base architectures. Previously, the DSP ASE was licensed as a separate architecture product.
• The Virtualization module is now an optional component of all the base architectures.
• The MIPS SIMD Architecture (MSA) module is now an optional component of all the base architectures.
Release 5 has the following changes:
• The MDMX ASE is formally deprecated. The equivalent functionality is provided by the MSA module.
• The 64-bit versions of the DSP ASE are formally deprecated. The equivalent functionality is provided by the MSA module.
• If an FPU is present, it must be a 64-bit FPU.
• The MIPS32 and MIPS64 Release 5 architectures provide no features that support IEEE-754-2008 fused multiply-add without intermediate rounding. (In Release 6, unfused multiply-adds are removed, and fused multiply-adds are added.)
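The distinction between fused and unfused multiply-add in the last item above comes down to how many times the result is rounded. The Python sketch below illustrates this with IEEE-754 double-precision values; it models the arithmetic only and is not MIPS code. The function names are illustrative, and the fused variant is emulated with exact rational arithmetic, which assumes finite inputs.

```python
from fractions import Fraction

def unfused_multiply_add(a: float, b: float, c: float) -> float:
    """Compute a*b + c with two roundings: one after the multiply, one after the add."""
    product = a * b       # rounded to double precision here
    return product + c    # rounded again here

def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Compute a*b + c with a single rounding at the end (emulated exactly)."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # no intermediate rounding
    return float(exact)   # the only rounding step

# Inputs chosen so that the exact product, 1 - 2**-54, is not representable as a double.
a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0
```

With these inputs the unfused form loses the low-order term when the product is rounded to 1.0 and yields 0.0, whereas the fused form returns -2**-54.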
2.3.6 MIPS32 Architecture Release 6
Release 6 is a family of architectures (MIPS32 and MIPS64) that adds the following capabilities:
• The instruction set has been simplified by removing infrequently used instructions and rearranging instruction encodings so as to free a significant part of the opcode map for future expansion.
• CPU Enhancements
• Some 3-source instructions (conditional moves) are removed.
• Branch Likely instructions are removed (they were deprecated in earlier releases).
• A powerful family of compact branches with no delay slot, including: unconditional branch (BC) and branch-and-link (BALC) with a very large 26-bit offset (+/- 128 MB span); conditional branch on zero/non-zero with a large 21-bit offset (+/- 4MB span); a full set of signed and unsigned conditional branches that compare two registers (e.g., BGTUC), or which compare a register against zero (e.g., BGTZC); and a full set of branch and link instructions that compare a register against zero (e.g., BGTZALC).
• Integer overflow: Some trapping instructions are removed, specifically those with 16-bit immediate fields (e.g., ADDI (trap on integer overflow), TGEI (compare and trap)), mitigated by compact branches on overflow / no-overflow, which are easier for software to use.
• Compact Indexed Jump instructions with no delay slot: Designed to support large absolute addresses.
• Instructions to generate large constants, loading (adding) constants to bits 16-31, 32-47, and 48-63.
• PC-relative instructions: In addition to branches and jumps, loads of 32- and 64-bit data and address generation with large relative offsets. Release 6 has true PC+offset relative-addressing control-transfer instructions that can span up to 26 bits (256MB), without the alignment restriction of the Jump (J) instruction (which can still be used in Release 6).
• Integer accumulator instructions and the HI/LO registers are removed from the Release 6 base instruction set and moved to the DSP Module.
• Bit-reversal and byte-alignment instructions migrated from DSP to Release 6 base instruction set.
• Multiply and Divide instructions are redefined to produce a single GPR result.
• The unaligned memory instructions are removed (e.g., LWL/LWR) and replaced by requiring misaligned memory access for most ordinary load/store instructions (possibly via trap-and-emulate).
• New instruction BALIGN can be used to emulate a misaligned load without using LWL/LWR, following a pair of ordinary load words.
• CPU truth values changed from single-bit to multi-bit: pre-Release 6 instructions that only looked at bit 0 of the register containing a truth value are replaced by Release 6 instructions that generate truth values of all zeroes or all ones (suitable for logical operations involving masks) and interpret all zeroes as false and any non-zero bit as true, which is compatible with programming languages such as C. There are also related changes to branches and conditional move instructions.
• Indexed addressing is removed for FPU loads and stores (e.g., LWXC1), mitigated by left shift add instructions (e.g., LSA rd:=rs<