CONDITION FLAGS
Processor Status Register (PSR)
This subset is the Application Processor Status Register (APSR)
Bit:   31  30  29  28  27  26-0
Flag:  N   Z   C   V   Q   (not condition flags)
Q: DSP overflow and saturation flag. A value of 1 indicates that a saturated arithmetic instruction limited its result.
V: Overflow flag. A value of 1 indicates a 2’s complement overflow during an addition, subtraction or compare.
C: Carry or Borrow flag. A value of 1 indicates a carry out from an addition or NO borrow out during a subtraction.
Z: Zero flag. A value of 1 indicates a result (or difference) of zero.
N: Negative flag. A value of 1 indicates a negative result.
ADDITION AND SUBTRACTION
  Binary       Unsigned     2’s Complement
   0011            3            (+3)
 + 1010         + 10          + (−6)
   1101           13            (−3)
A single ADD (or SUB) instruction works for both unsigned and 2’s comp.
Carries and Overflow
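[Figure: 4-bit addition examples such as (−5) + (+6), showing the carries C4, C3, C2 and marking which sums overflow as unsigned and as 2’s complement.]
Overflow detection (addition):
  Unsigned: C flag = 1
  2’s Comp: V flag = 1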
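The overflow rules for addition can be modeled in C. The following is only a sketch of how ADDS sets the flags for 32-bit operands; the type flags_t and the function adds_flags are hypothetical names introduced for illustration, not part of any ARM header.

```c
#include <stdint.h>
#include <stdbool.h>

// Hypothetical container for the four condition flags (illustration only).
typedef struct { bool n, z, c, v; } flags_t;

// Sketch of the flag results of ADDS on 32-bit operands.
flags_t adds_flags(uint32_t a, uint32_t b, uint32_t *sum)
{
    uint32_t r = a + b;                 // wraps modulo 2^32, like a register
    flags_t f;
    f.n = (r >> 31) != 0;               // N: sign bit of the result
    f.z = (r == 0);                     // Z: result is zero
    f.c = (r < a);                      // C: carry out = unsigned overflow
    // V: operands have the same sign but the result's sign differs
    f.v = ((~(a ^ b) & (a ^ r)) >> 31) != 0;
    *sum = r;
    return f;
}
```

For example, 0xFFFFFFFF + 1 sets C (unsigned overflow) but not V, while 0x7FFFFFFF + 1 sets V (2’s complement overflow) but not C.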
SUBTRACTION
Carries and Overflow
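[Figure: 4-bit subtraction examples with an operand read as 12 (unsigned) or as −4 (2’s complement), showing the carries C4 to C0 and marking which differences overflow in each interpretation.]
Overflow detection (subtraction):
  Unsigned: C flag = 0
  2’s Comp: V flag = 1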
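For subtraction the C flag is inverted relative to a borrow, which is easy to get backwards. A minimal sketch, assuming 32-bit operands; sub_flags_t and subs_flags are hypothetical names used only here.

```c
#include <stdint.h>
#include <stdbool.h>

// Hypothetical container for the two overflow-related flags.
typedef struct { bool c, v; } sub_flags_t;

// Sketch of the C and V results of SUBS on 32-bit operands.
sub_flags_t subs_flags(uint32_t a, uint32_t b, uint32_t *diff)
{
    uint32_t r = a - b;                 // wraps modulo 2^32
    sub_flags_t f;
    f.c = (a >= b);                     // C = 1 means NO borrow; C = 0 is unsigned overflow
    // V: operands have different signs and the result's sign differs from a
    f.v = (((a ^ b) & (a ^ r)) >> 31) != 0;
    *diff = r;
    return f;
}
```

Here 3 − 5 clears C (a borrow occurred, so the unsigned result is wrong), while 0x80000000 − 1 sets V because the most negative 2’s complement value cannot be decremented.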
ADDITION AND SUBTRACTION
Instruction            Format               Operation
Add                    ADD{S} Rd,Rn,Op2     Rd ← Rn + Op2
Add with Carry         ADC{S} Rd,Rn,Op2     Rd ← Rn + Op2 + Carry
Subtract               SUB{S} Rd,Rn,Op2     Rd ← Rn − Op2
Subtract with Carry    SBC{S} Rd,Rn,Op2     Rd ← Rn − Op2 − ~Carry
Reverse Subtract       RSB{S} Rd,Rn,Op2     Rd ← Op2 − Rn
“Op2” may only be a constant, a register, or a shifted register.
“S” must be appended to affect the flags!
ARM Thumb-2 Instruction Encodings
• Two instruction sizes: 16-bits and 32-bits
• Some operations are available in both versions
• The assembler always uses a 16-bit version if available
ADD Immediate
  16-bit: ADDS Rd,Rn,imm3               Registers R0-R7, 3-bit constants, “S” required
  32-bit: ADD{S} Rd,Rn,imm12            Registers R0-R15, 12-bit constants, “S” optional
ADD Register
  16-bit: ADDS Rd,Rn,Rm                 Registers R0-R7, “S” required, Rm not shifted
  32-bit: ADD{S} Rd,Rn,Rm{,shift bits}  Registers R0-R15, “S” optional, shifted Rm optional
Reducing Code Size
• Use 16-bit instructions whenever possible.
• Most 16-bit instructions always affect the flags (“S” appended)
– Append the letter “S” – even if you don’t need the flags
– Don’t append the letter “S” if flags must be unchanged.
• If in doubt, add the “narrow” (.N) suffix, as in ADD.N
– If 16-bit version is not available, assembler will issue an error message
ADDITION AND SUBTRACTION
y = x + 5 ;

// This works but is inefficient
        LDR     R0,=x           // R0 <-- &x
        LDR     R0,[R0]         // R0 <-- x
        LDR     R1,=5           // R1 <-- 5
        ADDS.N  R2,R0,R1        // R2 <-- R0 + R1
        LDR     R0,=y           // R0 <-- &y
        STR     R2,[R0]         // R2 --> y

// Reuse registers when possible
        LDR     R0,=x           // R0 <-- &x
        LDR     R0,[R0]         // R0 <-- x
        ADDS.N  R0,R0,5         // R0 <-- R0 + 5
        LDR     R1,=y           // R1 <-- &y
        STR     R0,[R1]         // R0 --> y
Common mistake:
        LDR     R0,x+5          // R0 ≠ x + 5
“x+5” is an address expression evaluated at time of assembly, equivalent to “&x + 5” in C.
// Op2 may be a (small) constant:
        LDR     R0,=x           // R0 <-- &x
        LDR     R0,[R0]         // R0 <-- x
        ADDS.N  R1,R0,5         // R1 <-- R0 + 5
        LDR     R2,=y           // R2 <-- &y
        STR     R1,[R2]         // R1 --> y
MULTIPLE-PRECISION ADDITION
// int64_t Add64(int64_t num1, int64_t num2) ;
        .global Add64
        .align
        .thumb_func
Add64:
        ADDS.N  R0,R0,R2        // R0 = sum bits 31-0
        ADCS.N  R1,R1,R3        // R1 = sum bits 63-32
        BX      LR
        .end
Append “S” to ADD so it will record any carry out.
Use an ADC so that the carry is included in the second sum.
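The ADDS/ADC pairing can be mirrored in C by adding the two 32-bit halves separately. This is a sketch only; add64_halves is a hypothetical name, and a C compiler would normally generate the two-instruction sequence by itself from a plain 64-bit addition.

```c
#include <stdint.h>

// Sketch: 64-bit addition built from two 32-bit additions plus a carry.
uint64_t add64_halves(uint64_t num1, uint64_t num2)
{
    uint32_t lo1 = (uint32_t)num1, hi1 = (uint32_t)(num1 >> 32);
    uint32_t lo2 = (uint32_t)num2, hi2 = (uint32_t)(num2 >> 32);

    uint32_t lo    = lo1 + lo2;         // like ADDS: low words, records the carry
    uint32_t carry = (lo < lo1);        // carry out of the low-word addition
    uint32_t hi    = hi1 + hi2 + carry; // like ADC: high words plus the carry

    return ((uint64_t)hi << 32) | lo;
}
```

If the carry were dropped (a plain ADD for the high words), 0x00000000FFFFFFFF + 1 would produce 0 instead of 0x0000000100000000.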
DECIMAL MULTIPLICATION
 99 ×  99 =   9801   (2 digits × 2 digits → 4 digits)
999 × 999 = 998001   (3 digits × 3 digits → 6 digits)
The product may require as many digits as the total # of digits in the two operands.
BINARY MULTIPLICATION
The product may require as many bits as the total # of bits in the two operands.
12 × 13 = 156 (decimal):  1100 × 1101 = 10011100 (binary)
 3 ×  2 =   6 (decimal):  0011 × 0010 = 00000110 (binary)
A “double length product” uses the full product width: 2N bits ← N bits × N bits
A “single length product” keeps only the least-significant half: N bits ← N bits × N bits
BINARY MULTIPLICATION
Unsigned:        12 × 13 = 156         1100 × 1101 = 1001 1100
2’s Complement:  (−4) × (−3) = +12     1100 × 1101 = 0000 1100
The signed and unsigned products are different for identical operand patterns.
But the least-significant halves of both products will always be the same.
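Single-Length Products
Why Signed & Unsigned are Identical
Let $\hat{A} = 2^{n-2}a_{n-2} + \cdots + 2^{0}a_{0}$ be the value of all bits of A below the most-significant bit, and likewise $\hat{B}$ for B.
\[ A_u = +2^{n-1}a_{n-1} + (2^{n-2}a_{n-2} + \cdots + 2^{0}a_{0}) = +2^{n-1}a_{n-1} + \hat{A} \]
\[ A_s = -2^{n-1}a_{n-1} + (2^{n-2}a_{n-2} + \cdots + 2^{0}a_{0}) = -2^{n-1}a_{n-1} + \hat{A} \]
\[ A_u B_u = (2^{n-1}a_{n-1} + \hat{A})(2^{n-1}b_{n-1} + \hat{B}) = 2^{2n-2}a_{n-1}b_{n-1} + 2^{n-1}(a_{n-1}\hat{B} + b_{n-1}\hat{A}) + \hat{A}\hat{B} \]
\[ A_s B_s = (-2^{n-1}a_{n-1} + \hat{A})(-2^{n-1}b_{n-1} + \hat{B}) = 2^{2n-2}a_{n-1}b_{n-1} - 2^{n-1}(a_{n-1}\hat{B} + b_{n-1}\hat{A}) + \hat{A}\hat{B} \]
\[ A_u B_u - A_s B_s = 2^{n}(a_{n-1}\hat{B} + b_{n-1}\hat{A}) \]
The difference is a multiple of $2^{n}$, so the difference between signed and unsigned double-length products can only be in their most-significant halves.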
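The result can be checked numerically for n = 32. The sketch below computes both halves of the double-length product under each interpretation; the four function names are hypothetical, and the cast from uint32_t to int32_t assumes the usual 2’s complement wraparound (implementation-defined in C, but what common ARM compilers do).

```c
#include <stdint.h>

// Unsigned interpretation: 64-bit product of two 32-bit patterns.
uint32_t low_half_unsigned(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint64_t)a * (uint64_t)b);
}

uint32_t high_half_unsigned(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}

// Signed interpretation of the same bit patterns.
// Assumption: uint32_t -> int32_t conversion wraps (2's complement).
uint32_t low_half_signed(uint32_t a, uint32_t b)
{
    int64_t p = (int64_t)(int32_t)a * (int64_t)(int32_t)b;
    return (uint32_t)(uint64_t)p;
}

uint32_t high_half_signed(uint32_t a, uint32_t b)
{
    int64_t p = (int64_t)(int32_t)a * (int64_t)(int32_t)b;
    return (uint32_t)((uint64_t)p >> 32);   // shift the bit pattern, not the signed value
}
```

With the patterns 0xFFFFFFFC and 0xFFFFFFFD (−4 and −3 when signed), both low halves are 12, while the high halves are 0xFFFFFFF9 (unsigned) and 0 (signed).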
MULTIPLICATION IN C
Consider how integer multiplication works in C:
c = a * b ;     // int32_t ← int32_t × int32_t
z = x * y ;     // uint32_t ← uint32_t × uint32_t
The data type (and size) of the product is the same as that of the operands. Thus: 32 bits × 32 bits → 32 bits.
The result is often stored in a variable of the same type, so a single-length product is sufficient.
Since the result is a single-length product, the same instruction can be used for signed and unsigned.
MULTIPLICATION IN C
In C, operator results have the same data type as their operands.
→ C uses single-length products.
uint32_t a32, b32 ;
uint64_t c64 ;
c64 = a32 * b32 ;   // 32 bits × 32 bits → 32-bit product, then zero-extended to 64 bits
c64 = a32 * c64 ;   // a32 zero-extended to 64 bits to match data type of c64: 64 bits × 64 bits
Storing a 32-bit product in a 64-bit variable simply extends the 32-bit result.
a32 is promoted to 64 bits to match c64; the 64×64 product requires a library function call.
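The practical consequence is that assigning to a 64-bit variable does not widen the multiplication. A small sketch, assuming a platform where int is 32 bits (as on ARM Cortex-M); product_truncated and product_full are hypothetical names.

```c
#include <stdint.h>

// The multiply is done in 32 bits; only the truncated result is extended.
uint64_t product_truncated(uint32_t a32, uint32_t b32)
{
    uint64_t c64 = a32 * b32;           // 32 x 32 -> 32, then zero-extended
    return c64;
}

// Casting one operand first makes the multiply itself 64 bits wide.
uint64_t product_full(uint32_t a32, uint32_t b32)
{
    uint64_t c64 = (uint64_t)a32 * b32; // b32 is promoted to match: 64 x 64 -> 64
    return c64;
}
```

For a32 = 0xFFFFFFFF and b32 = 2, the first function returns 0x00000000FFFFFFFE and the second returns 0x00000001FFFFFFFE.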
MULTIPLICATION IN C
On a 32-bit CPU, 8 and 16-bit operands are first promoted to 32 bits.
int8_t a8 ; int16_t b16 ;
int32_t c32 ;
c32 = a8 * b16 ;    // each operand extended to 32 bits: 32 bits × 32 bits
Thus the product of a8 and b16 becomes a 32×32 single-length product.
All integer multiplications compile to either a single 32×32 multiply instruction or a 64×64 library function call.
MULTIPLICATION
For Single-Length Products
Instruction                        Format              Operation
32-bit Multiply                    MUL{S} Rd,Rn,Rm     Rd ← (int32_t) Rn×Rm
32-bit Multiply with Accumulate    MLA Rd,Rn,Rm,Ra     Rd ← Ra + (int32_t) Rn×Rm
32-bit Multiply & Subtract         MLS Rd,Rn,Rm,Ra     Rd ← Ra – (int32_t) Rn×Rm
MULS affects flags N and Z. No other multiply instruction affects the flags.
Note: MLA and MLS use the product of the middle two registers.
All multiply instructions require their operands to be in registers. No constants or memory operands.
MULTIPLICATION
For Double-Length Products
Instruction                                 Format                   Operation
64-bit Unsigned Multiply                    UMULL Rdlo,Rdhi,Rn,Rm    Rdhi.Rdlo ← (uint64_t) Rn×Rm
64-bit Unsigned Multiply with Accumulate    UMLAL Rdlo,Rdhi,Rn,Rm    Rdhi.Rdlo ← Rdhi.Rdlo + (uint64_t) Rn×Rm
64-bit Signed Multiply                      SMULL Rdlo,Rdhi,Rn,Rm    Rdhi.Rdlo ← (int64_t) Rn×Rm
64-bit Signed Multiply with Accumulate      SMLAL Rdlo,Rdhi,Rn,Rm    Rdhi.Rdlo ← Rdhi.Rdlo + (int64_t) Rn×Rm
MULTIPLICATION OVERFLOW
Overflow: The correct result exceeds the range that can be represented by the number of bits allocated to hold it.
Double-Length Products (signed or unsigned):
• Overflow is not possible
Single-Length Unsigned Products:
• Overflow occurs when the most-significant half
of the double-length product is non-zero.
Single-Length Signed Products:
• Overflow occurs when the most-significant half
of the double-length product is not a sign-extension of the least-significant half.
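Example (4-bit operands, 8-bit double-length product):
Unsigned:        1110 × 0111 = 14 × 7 = 98 = 0110 0010 (binary). The upper half 0110 is non-zero, so the single-length product 0010 has overflowed.
2’s Complement:  1110 × 0111 = (−2) × (+7) = −14 = 1111 0010 (binary). The upper half 1111 is not a sign-extension of 0010, so the single-length product has overflowed.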
The overflow flag (V) is not affected. Recognizing overflow is virtually impossible if only a single-length product is available.
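When a double-length product is available, the two tests above are straightforward. The following sketch applies them to 32-bit operands; umul32_overflows and smul32_overflows are hypothetical helpers, and the 64-bit multiplies stand in for UMULL and SMULL.

```c
#include <stdint.h>
#include <stdbool.h>

// Unsigned: overflow when the upper half of the 64-bit product is non-zero.
bool umul32_overflows(uint32_t a, uint32_t b, uint32_t *low)
{
    uint64_t wide = (uint64_t)a * b;            // double-length product (UMULL)
    *low = (uint32_t)wide;                      // single-length product
    return (uint32_t)(wide >> 32) != 0;
}

// Signed: overflow when the upper half is not a sign-extension of the lower half.
bool smul32_overflows(int32_t a, int32_t b, uint32_t *low)
{
    int64_t  wide = (int64_t)a * b;             // double-length product (SMULL)
    uint32_t lo   = (uint32_t)(uint64_t)wide;   // bit pattern of the lower half
    uint32_t hi   = (uint32_t)((uint64_t)wide >> 32);
    uint32_t sign_ext = (lo >> 31) ? 0xFFFFFFFFu : 0u;
    *low = lo;
    return hi != sign_ext;
}
```

The signed helper returns the low half as a bit pattern (uint32_t) so that the sketch does not depend on how an out-of-range value converts to int32_t.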
64×64-bit MULTIPLICATION
How to compute single-length product
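Split each 64-bit operand into an upper half (HI) and a lower half (LO):
\[ A = 2^{32}A_{HI} + A_{LO}, \qquad B = 2^{32}B_{HI} + B_{LO} \]
\[ A \times B = (2^{32}A_{HI} + A_{LO})(2^{32}B_{HI} + B_{LO}) \]
\[ A \times B = 2^{64}A_{HI}B_{HI} + 2^{32}(A_{HI}B_{LO} + A_{LO}B_{HI}) + A_{LO}B_{LO} \]
For the single-length (64-bit) product:
• 2^64 × AHI×BHI lies entirely above bit 63 and is discarded
• 2^32 × LSW of (AHI×BLO): MUL
• 2^32 × LSW of (ALO×BHI): MLA
• All 64 bits of ALO×BLO: UMULL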
MULTIPLICATION Single-Length 64×64-Bit Product
// int64_t Mult64x64(int64_t a, int64_t b) ;
.global Mult64x64
.thumb_func
Mult64x64:
// R1.R0 = a, R3.R2 = b
        MUL     R1,R1,R2        // R1  = Ahi x Blo
        MLA     R1,R0,R3,R1     // R1 += Alo x Bhi
        UMULL   R0,R2,R0,R2     // R2.R0 = Alo x Blo
        ADDS.N  R1,R1,R2        // R1 += MSHalf of Alo x Blo
        BX      LR
        .end
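The same four steps can be written in C to check the decomposition against the compiler's own 64-bit multiply. This is a sketch under the assumption that operands are passed as unsigned bit patterns (the single-length result is the same for signed operands); mult64x64_halves is a hypothetical name.

```c
#include <stdint.h>

// Sketch: single-length 64 x 64 product built from 32-bit multiplies.
uint64_t mult64x64_halves(uint64_t a, uint64_t b)
{
    uint32_t alo = (uint32_t)a, ahi = (uint32_t)(a >> 32);
    uint32_t blo = (uint32_t)b, bhi = (uint32_t)(b >> 32);

    uint32_t hi  = ahi * blo;               // like MUL:   low word of Ahi x Blo
    hi += alo * bhi;                        // like MLA:   add low word of Alo x Bhi
    uint64_t low = (uint64_t)alo * blo;     // like UMULL: all 64 bits of Alo x Blo
    hi += (uint32_t)(low >> 32);            // like ADDS:  add upper half of Alo x Blo

    return ((uint64_t)hi << 32) | (uint32_t)low;
}
```

The Ahi x Bhi term never appears because it only contributes to bits 64 and above.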
Multiplication Summary
32 ← 32 x 32 (single length), signed or unsigned:
        MUL     R3,R1,R2        // R3 ← R1 × R2

64 ← 32 x 32 (double length):
        UMULL   R2,R3,R0,R1     // R3.R2 ← R0 × R1 (unsigned)
        SMULL   R2,R3,R0,R1     // R3.R2 ← R0 × R1 (signed)

64 ← 64 x 64 (single length), signed or unsigned:
        UMULL   R4,R5,R0,R2     // R5.R4 ← R1.R0 × R3.R2
        MLA     R5,R1,R2,R5
        MLA     R5,R0,R3,R5
DIVISION IN C
Consider how integer division works in C:
int8_t a8 ; int16_t b16 ; int32_t c32 ; int64_t d64 ;
… = a8 / b16 ;
8 and 16-bit operands are first promoted to 32 bits; this becomes a single 32÷32 divide instruction that produces a 32-bit quotient.
… = d64 / c32 ;
c32 is promoted to 64 bits to match d64; this becomes a library function call for 64÷64 division that returns a 64-bit quotient.
All integer divisions compile to either a single 32÷32 divide instruction or a library function call for 64÷64.
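SINGLE-LENGTH DIVISION
Two different instructions are required for signed versus unsigned division.
Example: in 2’s complement, (−16) ÷ (+4) = −4, but the same operand bit patterns read as unsigned numbers give a different quotient.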
Instruction        Format            Operation
Unsigned Divide    UDIV Rd,Rn,Rm     Rd ← (uint32_t) Rn ÷ Rm
Signed Divide      SDIV Rd,Rn,Rm     Rd ← (int32_t) Rn ÷ Rm
COMPUTING A REMAINDER
remainder = dividend – divisor × quotient
        LDR     R0,[⋯]          // R0 <-- dividend
        LDR     R1,[⋯]          // R1 <-- divisor
        SDIV    R2,R0,R1        // R2 = R0/R1
        STR     R2,[⋯]          // R2 --> quotient
        MLS     R3,R1,R2,R0     // R3 = R0 – R1*R2
        STR     R3,[⋯]          // R3 --> remainder
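Division             Quotient    Remainder
(+14) ÷ (+3)           +4           +2
(+14) ÷ (−3)           −4           +2
(−14) ÷ (+3)           −4           −2
(−14) ÷ (−3)           +4           −2
The quotient is truncated toward zero, so the remainder takes the sign of the dividend.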
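The SDIV plus MLS sequence corresponds to the following C sketch; divmod_t and sdiv_with_remainder are hypothetical names. It assumes the caller has already excluded a zero divisor and the INT32_MIN ÷ −1 case.

```c
#include <stdint.h>

// Hypothetical result pair for illustration.
typedef struct { int32_t quotient, remainder; } divmod_t;

// Sketch: quotient by division, remainder by multiply-and-subtract.
// Precondition (assumed): divisor != 0 and not (dividend == INT32_MIN && divisor == -1).
divmod_t sdiv_with_remainder(int32_t dividend, int32_t divisor)
{
    divmod_t r;
    r.quotient  = dividend / divisor;               // like SDIV: truncates toward zero
    r.remainder = dividend - divisor * r.quotient;  // like MLS: dividend - divisor x quotient
    return r;
}
```

Because C integer division also truncates toward zero, the remainder computed this way matches the % operator.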
DIVISION OVERFLOW
Overflow during division means that the result exceeds the quotient’s range of representation.
The smaller range of a single-length dividend drastically reduces the number of operand combinations that result in an overflow, leaving only the following possibilities:
• Unsigned or 2’s complement: Division by zero
• 2’s complement: Full-scale negative (−2^31) divided by −1
There is no hardware detection of overflow during integer division. V flag (overflow) is not affected.
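Since the flags give no indication, software that cannot rule out these operand combinations has to test for them before dividing. A minimal sketch with hypothetical helper names:

```c
#include <stdint.h>
#include <stdbool.h>

// True when a signed 32-bit division would overflow.
bool sdiv_would_overflow(int32_t dividend, int32_t divisor)
{
    return divisor == 0 ||
           (dividend == INT32_MIN && divisor == -1);  // +2^31 is not representable
}

// True when an unsigned 32-bit division would overflow.
bool udiv_would_overflow(uint32_t divisor)
{
    return divisor == 0;
}
```

In C both cases are undefined behavior for the / operator, so a check of this kind belongs before the division rather than after it.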
Summary of Instructions for Integer Arithmetic
Integer Addition
• Instructions: ADD{S}, ADC{S}
• Format Examples:
  – ADDS.N R0,R1,5          // R0 ← R1 + 5
  – ADDS.N R0,R1,R2         // R0 ← R1 + R2
  – ADD    R0,R1,R2,LSL 2   // R0 ← R1 + (R2 << 2)
• Flags: Append “S” to capture result characteristics in NZCV
• Overflow:
  – Unsigned: Carry Flag (C) = 1
  – Signed: Overflow Flag (V) = 1 (set when the carry out of the sign bit differs from the carry into it: C_N ≠ C_{N−1})
• Multiple precision addition: ADDS→ADC
Integer Subtraction
• Instructions: SUB{S}, SBC{S}, RSB{S}
• Format Examples:
  – SUBS.N R0,R1,5          // R0 ← R1 - 5
  – SUBS.N R0,R1,R2         // R0 ← R1 - R2
  – SUB    R0,R1,R2,LSL 2   // R0 ← R1 - (R2 << 2)
• Flags: Append “S” to capture result characteristics in NZCV
• Overflow:
  – Unsigned: Carry Flag (C) = 0
  – Signed: Overflow Flag (V) = 1 (set when the carry out of the sign bit differs from the carry into it: C_N ≠ C_{N−1})
• Multiple precision subtraction: SUBS→SBC
Single-Length 32x32 Integer Multiplication
• Instructions: MUL{S}, MLA, MLS
• Signed vs. Unsigned: Use same instruction
• Format Examples:
  – MUL R0,R1,R2            // R0 ← R1 × R2
  – MLA R0,R1,R2,R3         // R0 ← R3 + R1 × R2
  – MLS R0,R1,R2,R3         // R0 ← R3 - R1 × R2
• Flags: Append “S” (only MUL) to capture result characteristics in NZ
• Overflow: May happen but impossible to detect
Double-Length 32x32 Integer Multiplication
• Instructions: UMULL, SMULL, UMLAL, SMLAL
• Signed vs. Unsigned: Use different instructions!
• Signed Instruction Formats:
  – SMULL R0,R1,R2,R3       // R1.R0 ← R2 × R3
  – SMLAL R0,R1,R2,R3       // R1.R0 ← R1.R0 + R2 × R3
• Unsigned Instruction Formats:
  – UMULL R0,R1,R2,R3       // R1.R0 ← R2 × R3
  – UMLAL R0,R1,R2,R3       // R1.R0 ← R1.R0 + R2 × R3
• Flags: Unaffected
• Overflow: Can’t happen
Single-Length 32÷32 Integer Division
• Instructions: SDIV, UDIV
• Signed vs. Unsigned: Use different instructions!
• Signed Instruction Format:
  – SDIV R0,R1,R2           // R0 ← R1 ÷ R2
• Unsigned Instruction Format:
  – UDIV R0,R1,R2           // R0 ← R1 ÷ R2
• Flags: Unaffected
• Overflow:
  – Division by zero
  – Full-scale negative divided by -1
• Remainder: Use MLS (dividend – divisor × quotient)