Introduction to ARMv7 Assembly Language Programming
Profs. Leod & ECE3375, Winter 2022
This lesson provides an introduction to the ARMv7 assembly language, which is used in the online simulator. As mentioned, almost all actual coding in this course can be done with C (and, at this level, C is almost the same as Java, with mostly just some differences in syntax), so it isn’t crucial to memorize all of this material. But assembly language is fun! And we may ask you to interpret some sample code on the exam. This lesson covers the basics of ARMv7 assembly language, with respect to moving and manipulating data, evaluating tests and conditions, and branching.
Data Movement Instructions
These are the types of instructions that those who are only familiar with high-level computer languages may not even think of as instructions. If you want to create a variable x with the value y + 5, then you would just enter the instruction x=y+5, right? In assembly it is not so simple. There are three basic instructions for data movement: moving data between registers, from memory into a register, and from a register into memory. These concepts were already discussed last week, but I want to revisit them here briefly just to make sure you are familiar with ARMv7 syntax.
Mnemonic: To move data between registers, use the mov mnemonic seen previously. Two operands are always required: the first is the register the data is moved into, the second is the register the data is moved from.
Mnemonic: To load data into a register from memory, use the ldr mnemonic. Two operands are always required: the first is the register the data is moved into, the second is the address in memory that the data is moved from.
Mnemonic: To store data from a register into memory, use the str mnemonic. Two operands are always required: the first is the register the data is moved from, the second is the address in memory that the data is moved into.
The second operand does not always have to be a register: for mov it can be a decimal literal, and for ldr it can be the label of a pre-defined value. Some examples:
.text
mov r0, #10         @ put the literal 10 into r0
mov r1, r0          @ copy the contents of r0 into r1
ldr r1, big_word    @ load the word stored at the label big_word into r1
big_word: .word 53400524
Note the use of directives: .text is used to identify the main code, and .word is used to identify a piece of data as a 32-bit word (rather than a .byte or some other size).
Using a register as the second operand will always use the value in that register as data. To use it as an address, enclose it in square brackets. This is needed for str and ldr. For example, to store the value in r0 into the memory address contained in r1:
str r0, [r1]      @this works
str r0, r1        @this will not compile
str [r0], r1      @this also will not compile
str [r0], [r1]    @now we are just being silly
Initializing a register with a literal (as in mov r1, #12) only works with “small” numbers. 1
• Use a “variable” (a label defined with a .word of data, as in the example above) to get larger values into registers.
Another trick is to use ldr with a literal as the second operand. The syntax is a bit different from using a literal with mov: the literal is preceded by = rather than #. Any 32-bit value can be used, and it is usually written as a hexadecimal number:
ldr r1, =0xff200020
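To make the “small numbers” restriction a little more concrete: in the standard 32-bit ARM instruction encoding, a literal used with mov must be an 8-bit value rotated right by an even number of bits. The C sketch below (C being the other language used in this course) checks whether a number fits that pattern. The function names are made up for illustration, and the sketch ignores other tricks an assembler may apply (for example, substituting a different instruction), so treat it as a rough guide rather than a description of what the simulator's assembler actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n must be in 0..31). */
static uint32_t rotate_left(uint32_t value, unsigned n)
{
    assert(n < 32u);
    return n == 0u ? value : (value << n) | (value >> (32u - n));
}

/* Hypothetical helper: returns true if `value` can be written as an
 * 8-bit number rotated right by an even amount (0, 2, 4, ..., 30).
 * Undoing that rotation means rotating left by the same amount. */
bool fits_mov_immediate(uint32_t value)
{
    for (unsigned rotation = 0u; rotation < 32u; rotation += 2u) {
        if (rotate_left(value, rotation) <= 0xFFu) {
            return true;
        }
    }
    return false;
}
```

With this check, 10 and 0x140 both fit, while 0xff200020 and 53400524 do not, which is why the last two are loaded with ldr in the examples above.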
As previously mentioned, the program counter is a 32-bit register called pc. This can be read and written in code, but it probably shouldn’t be. For example:
mov r1 , pc
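will move the address held in pc into the general register r1. Because of the instruction pipeline, the value read from pc points slightly ahead of the instruction being executed (in ARM mode, it is the address of the current instruction plus 8). This is a safe thing to do, although it may not be very useful. You can also have “fun” and try something like this: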
mov r1, #0x140
mov pc, r1 @wow, this is a stupid thing to do!
mov r1, r2 @this line will never execute!
1 This is because the opcode for mov has some spare bits that can be used to record the literal directly in the instruction.
As the second line of code overwrites the program counter to the address 0x140, the last line of code will (probably) never run. 2 Instead, whatever mystery number was stored at address 0x140 in memory (assuming that spot is memory mapped) will be executed as an opcode — probably with terrible results.
2 Unless 0x140 is actually the correct address for the next line of code, or 0x140 already contained the appropriate opcode to return back to the appropriate spot.
Memory Addressing Modes
Reading and writing ordered lists (or arrays) of data is a common process in computer programming. To simplify this in assembly language, there is more than one method for looking up an address in memory.
• Typically, the memory address for the starting point of some construct will be stored in a register.
Assume that register r0 stores such an address. In addition to the basic loading and storing data to/from that address, there are three other possibilities.
Definition: Addressing with an offset refers to accessing a memory element that is a given distance from the base address.
Definition: Pre-indexed addressing refers to changing the base address in the register before looking up the memory element at the new address.
Definition: Post-indexed addressing refers to looking up the memory element at the present address in the register, then shifting the address stored in that register.
These techniques are useful for reading/writing arrays of data, or for interacting with some memory-mapped I/O peripherals. These techniques work with any instruction that uses a register value as an address. As an example:
mov r1, #10
ldr r0, =0x00ff1000
str r1, [r0]        @direct address
str r1, [r0, #4]    @offset
str r1, [r0, #4]!   @pre-indexed
str r1, [r0], #4    @post-indexed
The first line initializes register r1 with the value 10. The second line initializes register r0 with the memory address 0x00ff1000. Then the following things happen:
• Direct addressing is used to write the number 10 to the memory address 0x00ff1000.
• Addressing with an offset is used to write the number 10 to the memory address 0x00ff1004. The content of register r0 is still 0x00ff1000.
• Pre-indexed addressing is used to write the number 10 to the memory address 0x00ff1004 (again!). The content of register r0 is changed to 0x00ff1004.
• Post-indexed addressing is used to write the number 10 to the memory address 0x00ff1004 (third time’s the charm!). The content of register r0 is then changed to 0x00ff1008.
Because pre-indexing and post-indexing change the base address, they can be used to efficiently cycle through an array of data. 3
3 Conceptually, it should be possible to combine pre- and post-indexing as: str r1, [r0, #4]!, #4, meaning the address is incremented by 4, the data is written to that address, then the address is incremented by 4 again. However, ARMv7 won’t let you do this.
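If you are more comfortable with C, the four addressing modes map quite closely onto pointer operations. The sketch below is only an analogy under some assumptions: the function name is made up, a pointer to 32-bit words stands in for register r0, and adding 1 to that pointer plays the role of the #4 byte offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: repeats the four str instructions from the
 * example above on an ordinary C array. Returns the final "r0". */
uint32_t *store_with_each_mode(uint32_t *memory, size_t word_count,
                               uint32_t value)
{
    assert(memory != NULL && word_count >= 2);

    uint32_t *r0 = memory;   /* base address, like register r0        */

    *r0 = value;             /* str r1, [r0]       direct             */
    *(r0 + 1) = value;       /* str r1, [r0, #4]   offset, r0 same    */
    *++r0 = value;           /* str r1, [r0, #4]!  pre-indexed        */
    *r0++ = value;           /* str r1, [r0], #4   post-indexed       */

    return r0;
}
```

Starting from a zeroed three-word array, calling store_with_each_mode(memory, 3, 10) leaves 10 in memory[0] and memory[1], leaves memory[2] untouched, and returns memory + 2. That matches the assembly example, where r0 ends at 0x00ff1008.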
Memory Alignment
The memory width of a microcontroller is often smaller than the word size of the microprocessor. This is the case with the ARM® Cortex-A9: the registers are 4 bytes, but the memory cells are only 1 byte wide.
• The data movement instructions discussed above are all based on manipulating a word of data.
• Consequently, in the above examples, addresses are always given, and incremented, in multiples of 4 bytes.
Because of this difference, it is important to keep numbers aligned in memory. If you store an array of 32-bit values to memory, but when trying to read those values the base address is accidentally offset by 1 byte, the result is probably useless. To help protect data and instruction sets from becoming misaligned, the following rules must be obeyed:
• Numbers stored as words (32 bits) can only be at addresses that are integer multiples of 4.
• Numbers stored as half-words (16 bits) can only be at addresses that are integer multiples of 2.
• Numbers stored as bytes (8 bits) can be at any address.
Consequently, the operation:
mov r1, #1
str r1, [r0]
uses 4 sequential bytes of memory to store the number 1. Reading from, or writing to, these 4 sequential bytes is handled automatically by ldr and str. If the address stored in r0 is not a multiple of 4, executing this code in a simulator will generate an error message. 4
4 The code will compile just fine, but the simulator will terminate once it tries to write a word to an “unaligned” memory address. What would happen in actual hardware? I don’t know, I haven’t tested it. But nothing good, I imagine.
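The alignment rules above boil down to one remainder test. Here is a minimal C sketch of that test; the function name is an assumption made for illustration, not something provided by the simulator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if `address` is a legal location
 * for a value of the given size (1 = byte, 2 = half-word, 4 = word). */
bool is_aligned(uint32_t address, uint32_t size_in_bytes)
{
    assert(size_in_bytes == 1u || size_in_bytes == 2u || size_in_bytes == 4u);
    return (address % size_in_bytes) == 0u;
}
```

For example, the address 0x00ff1002 is fine for a half-word or a byte, but not for a word, because it is a multiple of 2 and not of 4.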
Words, Half-Words, & Endian-ness
Storing numbers across multiple memory cells also raises the problem of endian-ness: in what order do the contents of each memory cell combine to make the large number? Consider storing the number 0x1A2B in a system with a 16-bit address space and 8-bit memory cells, starting at address 0x1000. There are two ways to store this number, as shown in Figure 1.
Definition: A computer system is big endian if the most significant byte in a larger number is stored at the lowest memory address.
Definition: A computer system is little endian if the most significant byte in a larger number is stored at the highest memory address.
There are reasonable arguments for and against both methods of storing data.
• As shown in Figure 1, the big endian convention is easier for a human to read. If we examine the memory cells sequentially from the lowest address to the highest address, multi-byte numbers are presented in the same way we have learned to read them: from most significant digit to least significant digit.
• However, logically, big endian is silly because the largest part of the number is at the lowest address.
• Little endian is logically more sensible, because the least significant byte is stored at the lowest memory address, and the most significant byte is stored at the highest memory address.
• However, little endian is less human readable.
Figure 1: Two ways to order bytes in memory for the half-word 0x1A2B: (a) big-endian, and (b) little-endian. (The data at address 0x1002 is for the next byte/half-word/word of data.)
By default, ARM processors are little endian, although they can be reconfigured if necessary. In most circumstances, programmers and users never know, or care, about the endian-ness of the system. If two microprocessors with different conventions need to exchange data, however, then some kind of hardware or software conversion is required to ensure data is read correctly by each system.
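The two conventions can be written out explicitly in C by storing the half-word one byte at a time. This is a sketch for illustration only: the function names are made up, and a two-element byte array stands in for the memory cells at addresses 0x1000 and 0x1001.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Big endian: most significant byte goes to the lowest address. */
void store_half_word_big_endian(uint8_t *memory, uint16_t value)
{
    assert(memory != NULL);
    memory[0] = (uint8_t)(value >> 8);
    memory[1] = (uint8_t)(value & 0xFFu);
}

/* Little endian: least significant byte goes to the lowest address. */
void store_half_word_little_endian(uint8_t *memory, uint16_t value)
{
    assert(memory != NULL);
    memory[0] = (uint8_t)(value & 0xFFu);
    memory[1] = (uint8_t)(value >> 8);
}

/* Reads two bytes back, assuming they were stored little endian. */
uint16_t load_half_word_little_endian(const uint8_t *memory)
{
    assert(memory != NULL);
    return (uint16_t)(memory[0] | (memory[1] << 8));
}
```

Storing 0x1A2B big endian gives the bytes 0x1A, 0x2B, while storing it little endian gives 0x2B, 0x1A. Reading the big endian bytes with the little endian loader returns 0x2B1A, which is exactly the kind of mix-up a conversion step between two systems has to prevent.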
Loading and Storing Half-Words & Bytes
As mentioned above, the mnemonics str and ldr are both based on manipulating a word of data. For this course, usually that is sufficient, but sometimes you might want to access the contents of only a single byte or (probably even less frequently) manipulate 16-bit half-words.
Mnemonic: To load a single byte from memory, use the ldrb mnemonic. All unused bits in the register are filled with zeros. Any integer can be used as the memory address, not just integers evenly divisible by 4. Otherwise this mnemonic is the same as the ldr mnemonic discussed previously.
Mnemonic: To store a single byte to memory, use the strb mnemonic. Any integer can be used as the memory address, not just integers evenly divisible by 4. Otherwise this mnemonic is the same as the str mnemonic discussed previously.
Mnemonic: To load a half-word from memory, use the ldrh mnemonic. All unused bits in the register are filled with zeros. Any even integer can be used as the memory address, not just integers evenly divisible by 4. Otherwise this mnemonic is the same as the ldr mnemonic discussed previously.
Mnemonic: To store a half-word to memory, use the strh mnemonic. Any even integer can be used as the memory address, not just integers evenly divisible by 4. Otherwise this mnemonic is the same as the str mnemonic discussed previously.
When loading a word, it is irrelevant whether the data is signed or unsigned. However, since the CPU registers are all 32 bits wide, it is important to know whether the data is signed or unsigned when loading a half-word or a byte.
• The representation of a signed number depends on the number of bits available.
• For example, −7 (in decimal) is represented 5 in 8 bits as 0b1111 1001, while in 32 bits it is 0b1111 1111 1111 1111 1111 1111 1111 1001.
• Clearly, for a signed negative number, all higher-order, unused bits in the register should be set to 1.
For this reason, there are special mnemonics for loading signed bytes and half-words.
Mnemonic: To load a single signed byte from memory, use the ldrsb mnemonic.
5 Using 2’s-complement convention, of course.
Mnemonic: To load a signed half-word from memory, use the ldrsh mnemonic.
The mnemonics to load signed bytes and half-words aren’t terribly important for this course, but as a conceptual matter hopefully everyone clearly understands why it doesn’t matter whether or not a word is signed, but it does matter whether or not a byte or a half-word is signed.
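The difference between the unsigned and signed loads can be sketched in C as the difference between filling the upper bits of the register with zeros and filling them with copies of the sign bit. The function names below are hypothetical and only mirror the mnemonics; this is an illustration of the idea, not a model of the hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Copies the sign bit of a `bits`-wide value into all higher bits
 * of a 32-bit register. `bits` must be 8 or 16. */
uint32_t sign_extend(uint32_t value, unsigned bits)
{
    assert(bits == 8u || bits == 16u);
    uint32_t sign_bit = 1u << (bits - 1u);
    uint32_t upper_bits = ~((1u << bits) - 1u);  /* bits above the data */
    return (value & sign_bit) ? (value | upper_bits) : value;
}

/* Like ldrb: the unused upper 24 bits are filled with zeros. */
uint32_t load_byte_unsigned(uint8_t byte)
{
    return (uint32_t)byte;
}

/* Like ldrsb: the unused upper 24 bits copy the sign bit. */
uint32_t load_byte_signed(uint8_t byte)
{
    return sign_extend(byte, 8u);
}

/* Like ldrh and ldrsh, for 16-bit half-words. */
uint32_t load_half_word_unsigned(uint16_t half_word)
{
    return (uint32_t)half_word;
}

uint32_t load_half_word_signed(uint16_t half_word)
{
    return sign_extend(half_word, 16u);
}
```

Loading the byte 0xF9 (which is −7 as a signed 8-bit number) gives 0x000000F9 with the unsigned version, which a 32-bit register would interpret as 249, and 0xFFFFFFF9 with the signed version, which is still −7.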
The ARMv7 language has mnemonics for storing signed bytes and signed half-words as well (namely, strsb and strsh) but the user manual states that these function exactly the same as an unsigned store.
Loading and Storing Multiple Words
In complement to loading and storing data smaller than a word, sometimes it is useful to load and store multiple words of data. Each register can only store a single word, so in principle this is just done by applying ldr or str multiple times to different registers. However, to streamline this process there is the option to move words to or from a set of registers and a sequential range of memory.
Mnemonic: To load multiple words from a sequential range in memory to a set of registers, use the ldmia, ldmda, ldmib, or ldmdb mnemonics. Unusually, the first operand is the register containing the base address in memory, and this operand should be followed by a ! so that the updated address is written back to that register. The second operand is a set of brace brackets with a list of registers to load data into, such as {r0,r1,r3}.
Mnemonic: To store multiple words from a set of registers to a sequential range in memory, use the stmia, stmda, stmib, or stmdb mnemonics. This mnemonic has the same syntax as that for loading multiple words.
Four mnemonics each are listed above for loading or storing. The difference between these mnemonics is in how the base address is changed for each register. This is summarized in Table 1.
Code   Description
ia     Increment After
da     Decrement After
ib     Increment Before
db     Decrement Before
Table 1: Summary of addressing codes used in storing or loading multiple registers.
• “Increment” means the base address is increased by 4 for each register that is loaded/stored, while “decrement” means the base address is decreased by 4 for each register that is loaded/stored.
• “After” means this address shift occurs after loading/storing to each register, “before” means this address shift occurs prior to loading/storing to each register.
The “before” and “after” codes are similar to pre-indexing and post-indexing discussed above. An example of storing data from multiple registers is given below.
ldr r0 , =0x2000
mov r1 , #12
mov r2 , #140
mov r5 , #10
stmia r0!, {r5, r1, r2}
The first line loads a memory address into r0, the next few lines initialize some registers with values. Then these registers are written to memory.
• Interestingly, the ARMv7 assembler always sorts the registers in ascending numerical order, so r1 is stored to memory at the lowest address, and r5 is stored to memory at the highest address, regardless of how the registers are listed or whether the store operation is incrementing or decrementing. 6
• The mnemonic is for incrementing the base address after storing, so r1 is stored to the address 0x2000.
• r2 is then stored to address 0x2004, and r5 is stored to address 0x2008.
• After this operation is completed, r0 holds the value 0x200C.
6 I will never stop being peeved about this. What is the point of writing in Assembly if the assembler cleans up all the arbitrary and stupid things you want to do?
If the following code is then immediately executed:
ldmda r0!, {r2, r1, r5}
the data in memory starting at the address in r0 is loaded into the registers.
• Again, the ARMv7 assembler always sorts the registers, but this time in descending numerical order, so r5 will be loaded from the base address first (as it is at the highest address), and r1 is loaded from memory last (as it is at the lowest address).
• Following from the previous example, this means r5 will read the value stored in memory at 0x200C, r2 the value stored in memory at 0x2008, and r1 the value stored in memory at 0x2004.
• After this operation is completed, r0 again holds the value 0x2000.
The point of this odd-seeming rearrangement of registers in these operations is that commands like the following:
stmia r0!, {r3, r1, r2, r4} @ register order
ldmdb r0!, {r4, r2, r3, r1} @ does not matter
result in the system being in the exact same state as when it started. These commands may seem strange, but they exist for a useful reason.
Figure 2: Visual demonstration of storing and loading multiple registers, using stmia r0!, {r1, r2, r5} and ldmda r0!, {r5, r2, r1}. Here the registers have been written in the appropriate order for each operation.
• In a higher-level code (C, Java, Python, etc.) you can always define extra user variables whenever you want.
• In assembly, however, you are limited by the number of registers that physically exist in the CPU architecture.
• If your program needs to perform some complicated calculation, and requires space for temporary variables, you can perform a stmia r7!, {r0-r5} (for example) to store the contents of several registers to memory. 7
• Then your program can use registers r0 to r5 for the complicated calculation.
• After obtaining the result, your program can restore its original state (i.e. overwrite the temporary variables with the original values) using ldmdb r7!, {r0-r5}.
This process of temporarily storing registers to memory, then restoring them later, will be revisited in a later unit when we discuss stacks.
7 As shown in this sample code, you do not need to list every register if they fall in numeric sequence.
Figure 3: Visual representation of matching pairs of store and load multiple instructions that return the system to the original state. Each panel shows memory addresses 0x1000 to 0x1020 and the position of r0 before and after the operation. The four pairs are: stmdb r0!, {r1-r4} with ldmia r0!, {r1-r4}; stmda r0!, {r1-r4} with ldmib r0!, {r1-r4}; stmib r0!, {r1-r4} with ldmda r0!, {r1-r4}; and stmia r0!, {r1-r4} with ldmdb r0!, {r1-r4}.
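To experiment with which store and load pairs undo each other, here is a small C model of the store-multiple and load-multiple instructions. It is a sketch under several assumptions: the Machine structure, the function names, and the 16-word memory starting at 0x2000 are all invented for illustration, write-back (the !) is always applied, and the base register is not allowed to appear in the register list.

```c
#include <assert.h>
#include <stdint.h>

#define MEMORY_BASE  0x2000u
#define MEMORY_WORDS 16u
#define REG(n) ((uint16_t)(1u << (n)))   /* one bit per listed register */

typedef enum { MODE_IA, MODE_IB, MODE_DA, MODE_DB } BlockMode;

typedef struct {
    uint32_t registers[16];
    uint32_t memory[MEMORY_WORDS];  /* words at MEMORY_BASE, +4, +8, ... */
} Machine;

static uint32_t *word_at(Machine *m, uint32_t address)
{
    assert(address % 4u == 0u);     /* words must be aligned */
    assert(address >= MEMORY_BASE &&
           address < MEMORY_BASE + 4u * MEMORY_WORDS);
    return &m->memory[(address - MEMORY_BASE) / 4u];
}

/* Lowest address touched when `count` registers are transferred. */
static uint32_t lowest_address(uint32_t base, BlockMode mode, uint32_t count)
{
    switch (mode) {
    case MODE_IA: return base;
    case MODE_IB: return base + 4u;
    case MODE_DA: return base - 4u * count + 4u;
    default:      return base - 4u * count;      /* MODE_DB */
    }
}

static void block_transfer(Machine *m, int is_load, BlockMode mode,
                           unsigned base_register, uint16_t register_list)
{
    assert(base_register < 16u);
    assert(((register_list >> base_register) & 1u) == 0u);

    uint32_t count = 0u;
    for (unsigned r = 0u; r < 16u; r++) {
        count += (register_list >> r) & 1u;
    }

    uint32_t base = m->registers[base_register];
    uint32_t address = lowest_address(base, mode, count);

    /* The lowest-numbered register always uses the lowest address. */
    for (unsigned r = 0u; r < 16u; r++) {
        if (((register_list >> r) & 1u) == 0u) {
            continue;
        }
        if (is_load) {
            m->registers[r] = *word_at(m, address);
        } else {
            *word_at(m, address) = m->registers[r];
        }
        address += 4u;
    }

    /* Write-back: the effect of the ! after the base register. */
    m->registers[base_register] =
        (mode == MODE_IA || mode == MODE_IB) ? base + 4u * count
                                             : base - 4u * count;
}

void stm(Machine *m, BlockMode mode, unsigned base_register,
         uint16_t register_list)
{
    block_transfer(m, 0, mode, base_register, register_list);
}

void ldm(Machine *m, BlockMode mode, unsigned base_register,
         uint16_t register_list)
{
    block_transfer(m, 1, mode, base_register, register_list);
}
```

With r0 = 0x2000, r1 = 12, r2 = 140 and r5 = 10, calling stm(&m, MODE_IA, 0, REG(1) | REG(2) | REG(5)) followed by ldm(&m, MODE_DB, 0, ...) with the same list restores all three registers and returns r0 to 0x2000. Using MODE_DA for the load instead reads from 0x2004 to 0x200C, so the registers come back shifted by one word, as described in the ldmda example earlier.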
Data Manipulation Instructions
Now that you know how to get data into and out of memory, what to do with it? The basic ways of manipulating data are by arithmetic operations, logic operations, and bit shifting.
• Arithmetic and logic operations use the arithmetic logic unit (ALU) in the microprocessor to manipulate the contents of one or two registers.
• Bit shifting can also use the ALU, but in ARMv7 architecture a special piece of hardware called the barrel shifter is often used.
• These kinds of data manipulations are obviously important for performing calculations, but they can also be used for making comparisons between two numbers.
Most microprocessors support addition, subtraction, multiplication, and division. Multiplication and division will not be discussed in this course. Addition and subtraction were introduced last lesson, but just to review:
Mnemonic: To add two numbers together, use the add mnemonic. Two operands are always required, the first is the register where the sum is stored, the second is the register with a number. If a third operand is provided, i