Embedded Systems with ARM Cortex-M Microcontrollers in Assembly Language and C (Dr. Yifeng Zhu)
Chapter 1 Computer and Assembly Language
ECE3375B Electrical and Computer Engineering Western University
Winter 2019
Embedded Systems
Amazon Warehouse
Kiva Robot
Assembly Programs
http://www.andysinger.com/
Why do we learn Assembly?
Assembly isn’t “just another language”.
It helps you understand how the processor works.
Hand-written assembly can run faster than compiled high-level code, so performance-critical code is sometimes written in assembly.
Use profiling tools to find the performance bottleneck and rewrite that code section in assembly
Latency-sensitive applications, such as aircraft controllers
Standard C has no direct operator for some operations available on ARM processors, such as ROR (Rotate Right) and RRX (Rotate Right Extended).
Hardware/processor-specific code:
Processor booting code
Device drivers
A test-and-set atomic assembly instruction can be used to implement locks and semaphores.
Cost-sensitive applications
Embedded devices where the size of code is limited, such as washing machine controllers and automobile controllers
The best applications are written by those who’ve mastered assembly language or fully understand the low-level implementation of the high-level language statements they’re choosing.
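The test-and-set idea mentioned in the list above can be sketched in portable C11, where `atomic_flag_test_and_set_explicit` is the standard library's test-and-set primitive. This is a minimal sketch, not the book's implementation: the names `spinlock_t`, `demo_lock`, `spin_try_lock`, `spin_lock`, `spin_unlock`, and `increment_shared` are hypothetical, and which machine instructions the compiler emits for the atomic operation depends on the target core.

```c
#include <stdatomic.h>

/* Hypothetical spinlock type built on the C11 atomic flag. */
typedef struct {
    atomic_flag locked;
} spinlock_t;

static spinlock_t demo_lock = { ATOMIC_FLAG_INIT };
static int shared_counter = 0;

/* Test-and-set: atomically sets the flag and returns its previous value.
 * The lock is acquired only if the flag was previously clear. */
static int spin_try_lock(spinlock_t *lock) {
    return !atomic_flag_test_and_set_explicit(&lock->locked,
                                              memory_order_acquire);
}

/* Busy-wait until the test-and-set succeeds. */
static void spin_lock(spinlock_t *lock) {
    while (!spin_try_lock(lock)) {
        /* spin */
    }
}

/* Clear the flag so another context can acquire the lock. */
static void spin_unlock(spinlock_t *lock) {
    atomic_flag_clear_explicit(&lock->locked, memory_order_release);
}

/* Example critical section protected by the lock. */
static int increment_shared(void) {
    int value;
    spin_lock(&demo_lock);
    value = ++shared_counter;
    spin_unlock(&demo_lock);
    return value;
}
```

A second call to `spin_try_lock` on a lock that is already held returns 0, which is exactly the property that locks and semaphores rely on.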
Why ARM processors?
As of 2005, 98% of the more than one billion mobile phones sold each year used ARM processors
As of 2009, ARM processors accounted for approximately 90% of all embedded 32-bit RISC processors
In 2010 alone, 6.1 billion ARM-based processors were shipped, representing 95% of smartphones, 35% of digital televisions and set-top boxes, and 10% of mobile computers
As of 2014, over 50 billion ARM processors have been produced
iPhone 7 Teardown
A10 processor:
• 64-bit system on chip (SoC)
• ARMv8-A core
Apple Watch
Apple S1 Processor
32-bit ARMv7-A compatible
Number of cores: 1
CMOS technology: 28 nm
L1 cache: 32 KB data
L2 cache: 256 KB
GPU: PowerVR SGX543
Kindle HD Fire
source: http://www.ifixit.com
Texas Instruments OMAP 4460 dual-core processor
Fitbit Flex Teardown
STMicroelectronics 32L151C6 Ultra Low Power ARM Cortex-M3 Microcontroller
Nordic Semiconductor nRF8001 Bluetooth Low Energy Connectivity IC
source: www.ifixit.com
Samsung Galaxy Gear
source: ifixit.com
STMicroelectronics STM32F401B ARM Cortex-M4 MCU with 128 KB Flash
Pebble Smartwatch
source: ifixit.com
STMicroelectronics STM32F205RE ARM Cortex-M3 MCU, with a maximum speed of 120 MHz
Oculus VR
Facebook’s $2 Billion Acquisition Of Oculus in 2014
STMicroelectronics STM32F072VB ARM Cortex-M0 32-bit RISC Core Microcontroller
source: ifixit.com
HTC Vive
STMicroelectronics 32F072R8 ARM Cortex-M0 Microcontroller
source: ifixit.com
Nest Learning Thermostat
ST Microelectronics STM32L151VB ultra-low-power 32 MHz ARM Cortex-M3 MCU
source: ifixit.com
Samsung Gear Fit Fitness Tracker
STMicroelectronics STM32F439ZI 180 MHz, 32 bit ARM Cortex-M4 CPU
source: ifixit.com
Memory
Memory is arranged as a series of “locations”
Each location has a unique “address”
Each location holds a byte (byte-addressable)
e.g. the memory location at address 0x080001B0 contains the byte value 0x70, i.e., 112
The number of locations in memory is limited, e.g. 4 GB of RAM
1 Gigabyte (GB) = 2^30 bytes
4 GB = 2^32 locations = 4,294,967,296 locations!
Values stored at each location can represent either program data or program instructions
e.g. the value 0x70 might be the code used to tell the processor to add two values together
Address (32 bits)    Data (8 bits)
0xFFFFFFFF           ...
0x080001B0           70
0x080001AF           BC
0x080001AE           18
0x080001AD           01
0x080001AC           A0
0x00000000           ...
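The byte-addressable view described above can be modelled in a few lines of C. This is only a sketch under stated assumptions: the array `mem_bytes` models just the five bytes listed for addresses 0x080001AC to 0x080001B0, and the names `BASE_ADDR`, `read_byte`, and `address_space_size` are hypothetical helpers, not part of any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Lowest address of the small window modelled here (assumption). */
#define BASE_ADDR 0x080001ACu

/* One byte per address, lowest address first: 0x080001AC .. 0x080001B0. */
static const uint8_t mem_bytes[] = { 0xA0, 0x01, 0x18, 0xBC, 0x70 };

/* Return the byte stored at addr, or -1 if addr is outside the window. */
static int read_byte(uint32_t addr) {
    if (addr < BASE_ADDR || (size_t)(addr - BASE_ADDR) >= sizeof mem_bytes) {
        return -1;
    }
    return mem_bytes[addr - BASE_ADDR];
}

/* Number of distinct byte locations reachable with the given address width. */
static uint64_t address_space_size(unsigned address_bits) {
    return (uint64_t)1 << address_bits;
}
```

With these definitions, `read_byte(0x080001B0u)` yields 0x70 (112 in decimal), and `address_space_size(32)` yields 4,294,967,296, matching the figures on the slide.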
Computer Architecture
Von-Neumann
Instructions and data are stored in the same memory.
Harvard
Data and instructions are stored in separate memories.
ARM Cortex-M Series Family
Von-Neumann: instructions and data are stored in the same memory.
    ARM Cortex-M0     ARMv6-M
    ARM Cortex-M0+    ARMv6-M
    ARM Cortex-M1     ARMv6-M
    ARM Cortex-M23    ARMv8-M
Harvard: data and instructions are stored in separate memories.
    ARM Cortex-M3     ARMv7-M
    ARM Cortex-M4     ARMv7E-M
    ARM Cortex-M7     ARMv7E-M
    ARM Cortex-M33    ARMv8-M
Levels of Program Code
High-level language (C Program)
Level of abstraction closer to problem domain
Provides for productivity and portability
    int main(void) {
        int i;
        int total = 0;
        for (i = 0; i < 10; i++) {
            total += i;
        }
        while (1);  // Dead loop
    }
Compile
Assembly language (Assembly Program)
Textual representation of instructions
Assemble
Hardware representation (Machine Program)
Binary digits (bits)
Encoded instructions and data
    0010000100000000
    0010000000000000
    1110000000000001
    0100010000000001
    0001110001000000
    0010100000001010
    1101110011111011
    1011111100000000
    1110011111111110
See How a Program Runs
C Code
    int main(void) {
        int a = 0;
        int b = 1;
        int c;
        c = a + b;
        return 0;
    }
compiler
Assembly Code
    MOVS r1, #0x00     ; int a = 0
    MOVS r2, #0x01     ; int b = 1
    ADDS r3, r1, r2    ; c = a + b
    MOVS r0, #0x00     ; set return value
    BX   lr            ; return
Machine Code
    In Binary           In Hex
    0010000100000000    2100    ; MOVS r1, #0x00
    0010001000000001    2201    ; MOVS r2, #0x01
    0001100010001011    188B    ; ADDS r3, r1, r2
    0010000000000000    2000    ; MOVS r0, #0x00
    0100011101110000    4770    ; BX lr
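To make the link between the assembly and the hex values above concrete, the following sketch pulls the register and immediate fields out of two of the 16-bit encodings. It covers only the two Thumb forms that appear in this example (MOVS Rd, #imm8 and ADDS Rd, Rn, Rm); the struct and function names are hypothetical, and a real decoder handles many more formats.

```c
#include <stdint.h>

typedef struct { unsigned rd, imm8; } movs_imm_t;      /* hypothetical */
typedef struct { unsigned rd, rn, rm; } adds_reg_t;    /* hypothetical */

/* MOVS Rd, #imm8 is encoded as: 00100 | Rd (3 bits) | imm8 (8 bits). */
static int decode_movs_imm(uint16_t hw, movs_imm_t *out) {
    if ((hw & 0xF800u) != 0x2000u) {
        return 0;                       /* not a MOVS immediate */
    }
    out->rd = (hw >> 8) & 0x7u;
    out->imm8 = hw & 0xFFu;
    return 1;
}

/* ADDS Rd, Rn, Rm is encoded as: 0001100 | Rm (3) | Rn (3) | Rd (3). */
static int decode_adds_reg(uint16_t hw, adds_reg_t *out) {
    if ((hw & 0xFE00u) != 0x1800u) {
        return 0;                       /* not an ADDS register form */
    }
    out->rm = (hw >> 6) & 0x7u;
    out->rn = (hw >> 3) & 0x7u;
    out->rd = hw & 0x7u;
    return 1;
}

/* Inverse of decode_movs_imm: build the halfword from its fields. */
static uint16_t encode_movs_imm(unsigned rd, unsigned imm8) {
    return (uint16_t)(0x2000u | ((rd & 0x7u) << 8) | (imm8 & 0xFFu));
}
```

Decoding 0x2201 gives Rd = 2 and imm8 = 1, and decoding 0x188B gives Rd = 3, Rn = 1, Rm = 2, which agrees with the listing.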
Processor Registers
Fastest way to read and write
Registers are within the processor chip
A register stores a 32-bit value
STM32L has
    R0-R12: 13 general-purpose registers
        R0-R7: low registers
        R8-R12: high registers
    R13: Stack pointer (SP), a shadow of MSP or PSP
    R14: Link register (LR)
    R15: Program counter (PC)
    Special-purpose registers (32 bits each): xPSR, BASEPRI, PRIMASK, FAULTMASK, CONTROL
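The register set listed above can be pictured as a small C structure. This is a simplified model for illustration, not a description of the hardware: the type `core_regs_t` and the helper names are hypothetical, only the CONTROL register is modelled among the special-purpose registers, and the stack-pointer selection uses bit 1 of CONTROL while ignoring handler mode, in which the core always uses MSP.

```c
#include <stdint.h>

enum { REG_SP = 13, REG_LR = 14, REG_PC = 15 };

/* Hypothetical model of the core registers. */
typedef struct {
    uint32_t r[13];    /* R0-R12: general-purpose registers */
    uint32_t msp;      /* main stack pointer */
    uint32_t psp;      /* process stack pointer */
    uint32_t lr;       /* R14: link register */
    uint32_t pc;       /* R15: program counter */
    uint32_t control;  /* bit 1 selects PSP (simplification: thread mode only) */
} core_regs_t;

static int is_low_register(unsigned n)  { return n <= 7u; }
static int is_high_register(unsigned n) { return n >= 8u && n <= 12u; }

/* Read register Rn; R13 resolves to whichever stack pointer is selected. */
static uint32_t read_reg(const core_regs_t *c, unsigned n) {
    if (n <= 12u) {
        return c->r[n];
    }
    if (n == REG_SP) {
        return ((c->control >> 1) & 1u) ? c->psp : c->msp;
    }
    return (n == REG_LR) ? c->lr : c->pc;
}
```

The point of `read_reg` is that code always names R13, while the value seen comes from MSP or PSP depending on the current selection.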
Program Execution
Program Counter (PC) is a register that holds the memory address of the next instruction to be fetched from memory.
1. Fetch instruction at PC address
2. Decode the instruction
3. Execute the instruction
PC = 0x080001B0 Instruction = 188B or 2000188B or 8B180020
Memory Address    Instruction
0x080001B4        4770
0x080001B2        2000
0x080001B0        188B
0x080001AE        2201
0x080001AC        2100
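The three candidate values shown for PC = 0x080001B0 correspond to three ways of reading the same bytes. The sketch below assumes little-endian storage, so the halfword 188B is held as the byte 8B followed by the byte 18; the array `code_bytes` and the helper names are hypothetical. Read as one 16-bit halfword the result is 188B, which is the instruction in the table; read as a 32-bit little-endian word it is 2000188B; listed byte by byte in address order it is 8B180020.

```c
#include <stddef.h>
#include <stdint.h>

#define CODE_BASE 0x080001ACu

/* The five halfwords 2100 2201 188B 2000 4770, low byte first (assumed). */
static const uint8_t code_bytes[] = {
    0x00, 0x21,  0x01, 0x22,  0x8B, 0x18,  0x00, 0x20,  0x70, 0x47
};

/* Fetch one 16-bit instruction; the caller keeps pc inside the window. */
static uint16_t fetch_halfword(uint32_t pc) {
    size_t off = pc - CODE_BASE;
    return (uint16_t)(code_bytes[off] | (code_bytes[off + 1] << 8));
}

/* Read 32 bits as a little-endian word (two consecutive halfwords). */
static uint32_t fetch_word(uint32_t pc) {
    return (uint32_t)fetch_halfword(pc) |
           ((uint32_t)fetch_halfword(pc + 2u) << 16);
}

/* The same four bytes written out in increasing address order. */
static uint32_t bytes_in_memory_order(uint32_t pc) {
    size_t off = pc - CODE_BASE;
    return ((uint32_t)code_bytes[off] << 24) |
           ((uint32_t)code_bytes[off + 1] << 16) |
           ((uint32_t)code_bytes[off + 2] << 8) |
           (uint32_t)code_bytes[off + 3];
}
```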
Three-stage pipeline: Fetch, Decode, Execution
Pipelining allows hardware resources to be fully utilized
One 32-bit instruction or two 16-bit instructions can be fetched at a time.
Pipeline of 32-bit instructions
Pipeline of 16-bit instructions
Clock cycle          1        2        3        4        5        6
Instruction i        Fetch    Decode   Execute
Instruction i + 1             Fetch    Decode   Execute
Instruction i + 2                      Fetch    Decode   Execute
Instruction i + 3                               Fetch    Decode   Execute
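The overlap of fetch, decode, and execute described above can be expressed as a small timing model. This is an idealized sketch that assumes one instruction enters the pipeline every clock cycle with no stalls or branches; the names `stage_at`, `pipelined_cycles`, and `unpipelined_cycles` are hypothetical.

```c
enum stage { STAGE_NONE, STAGE_FETCH, STAGE_DECODE, STAGE_EXECUTE };

/* Stage occupied by instruction number instr (0-based) during clock
 * cycle number cycle (0-based) in an ideal three-stage pipeline. */
static enum stage stage_at(unsigned instr, unsigned cycle) {
    if (cycle < instr) {
        return STAGE_NONE;              /* not yet fetched */
    }
    switch (cycle - instr) {
    case 0:  return STAGE_FETCH;
    case 1:  return STAGE_DECODE;
    case 2:  return STAGE_EXECUTE;
    default: return STAGE_NONE;         /* already completed */
    }
}

/* Cycles needed for n instructions when the three stages overlap. */
static unsigned pipelined_cycles(unsigned n) {
    return (n == 0u) ? 0u : n + 2u;
}

/* Cycles needed if each instruction finished before the next started. */
static unsigned unpipelined_cycles(unsigned n) {
    return 3u * n;
}
```

For four instructions the model gives 6 cycles with pipelining against 12 without, and in the third cycle all three stages are busy at once, which is the sense in which the hardware is fully utilized.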
Machine codes are stored in memory
Memory Address    Data
0xFFFFFFFF        ...
0x080001B4        4770
0x080001B2        2000
0x080001B0        188B
0x080001AE        2201
0x080001AC        2100
0x00000000        ...
CPU: ALU and registers r0-r12, r13 (sp), r14 (lr), r15 (pc)
Fetch Instruction: pc = 0x080001AC
Decode Instruction: 2100 = MOVS r1, #0x00
Registers: pc = 0x080001AC
Execute Instruction: MOVS r1, #0x00
Registers: pc = 0x080001AC, r1 = 0x00000000
Fetch Next Instruction: pc = pc + 2
Decode & Execute: 2201 = MOVS r2, #0x01
Registers: pc = 0x080001AE, r2 = 0x00000001, r1 = 0x00000000
Fetch Next Instruction: pc = pc + 2
Decode & Execute: 188B = ADDS r3, r1, r2
Registers: pc = 0x080001B0, r3 = 0x00000001, r2 = 0x00000001, r1 = 0x00000000
Fetch Next Instruction: pc = pc + 2
Decode & Execute: 2000 = MOVS r0, #0x00
Registers: pc = 0x080001B2, r3 = 0x00000001, r2 = 0x00000001, r1 = 0x00000000, r0 = 0x00000000
Fetch Next Instruction: pc = pc + 2
Decode & Execute: 4770 = BX lr
Registers: pc = 0x080001B4, r3 = 0x00000001, r2 = 0x00000001, r1 = 0x00000000, r0 = 0x00000000
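The fetch, decode, and execute steps traced above can be replayed with a toy interpreter. This is a sketch under stated assumptions, not a Cortex-M emulator: it understands only the three instruction forms in this program (MOVS immediate, ADDS register, and BX lr), does not model the condition flags, keeps pc at the address of the instruction being executed as the slides do, and simply stops at BX lr. The names `cpu_t`, `program`, and `run` are hypothetical.

```c
#include <stdint.h>

#define CODE_BASE   0x080001ACu
#define PROGRAM_LEN 5u

/* Machine code from the example, one 16-bit instruction per entry. */
static const uint16_t program[PROGRAM_LEN] = {
    0x2100u, 0x2201u, 0x188Bu, 0x2000u, 0x4770u
};

typedef struct {
    uint32_t r[16];                     /* r[15] is the program counter */
} cpu_t;

/* Run from CODE_BASE until BX lr; return the number of instructions run. */
static int run(cpu_t *cpu) {
    int executed = 0;
    cpu->r[15] = CODE_BASE;
    for (;;) {
        uint32_t index = (cpu->r[15] - CODE_BASE) / 2u;
        uint16_t hw;
        if (index >= PROGRAM_LEN) {
            return executed;            /* ran off the end of the program */
        }
        hw = program[index];                                  /* fetch   */
        if ((hw & 0xF800u) == 0x2000u) {                      /* decode  */
            cpu->r[(hw >> 8) & 0x7u] = hw & 0xFFu;            /* MOVS    */
        } else if ((hw & 0xFE00u) == 0x1800u) {
            cpu->r[hw & 0x7u] = cpu->r[(hw >> 3) & 0x7u]
                              + cpu->r[(hw >> 6) & 0x7u];     /* ADDS    */
        } else if (hw == 0x4770u) {
            return executed + 1;        /* BX lr: stop, pc stays here */
        } else {
            return executed;            /* unsupported instruction */
        }
        executed++;
        cpu->r[15] += 2u;               /* pc = pc + 2 */
    }
}
```

After `run` returns, the register values can be compared with the final step of the trace: r0 = 0, r1 = 0, r2 = 1, r3 = 1, and pc = 0x080001B4.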
Example:
Calculate the Sum of an Array
    int a[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    int total;

    int main(void) {
        int i;
        total = 0;
        for (i = 0; i < 10; i++) {
            total += a[i];
        }
        while (1);
    }
Instruction Memory (Flash), starting memory address 0x08000000: the code of main()
Data Memory (RAM), starting memory address 0x20000000: int a[10] and int total
CPU
I/O Devices
Instruction Memory (Flash), starting memory address 0x08000000
Machine code (binary)                            Assembly
0010 0001 0000 0000                                     MOVS r1, #0x00
0100 1010 0000 1000                                     LDR  r2, =total_addr
0110 0000 0001 0001                                     STR  r1, [r2, #0x00]
0010 0000 0000 0000                                     MOVS r0, #0x00
1110 0000 0000 1000                                     B    Check
0100 1001 0000 0111                              Loop:  LDR  r1, =a_addr
1111 1000 0101 0001 0001 0000 0010 0000                 LDR  r1, [r1, r0, LSL #2]
0100 1010 0000 0100                                     LDR  r2, =total_addr
0110 1000 0001 0010                                     LDR  r2, [r2, #0x00]
0100 0100 0001 0001                                     ADD  r1, r1, r2
0100 1010 0000 0011                                     LDR  r2, =total_addr
0110 0000 0001 0001                                     STR  r1, [r2, #0x00]
0001 1100 0100 0000                                     ADDS r0, r0, #1
0010 1000 0000 1010                              Check: CMP  r0, #0x0A
1101 1011 1111 0100                                     BLT  Loop
1011 1111 0000 0000                                     NOP
1110 0111 1111 1110                              Self:  B    Self
Data Memory (RAM)
int a[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
int total;
Assume the starting memory address of the data memory is 0x20000000
Memory address (in bytes)    Memory content    Variable
0x20000000                   0x0001            a[0] = 0x00000001
0x20000002                   0x0000
0x20000004                   0x0002            a[1] = 0x00000002
0x20000006                   0x0000
0x20000008                   0x0003            a[2] = 0x00000003
0x2000000A                   0x0000
0x2000000C                   0x0004            a[3] = 0x00000004
0x2000000E                   0x0000
0x20000010                   0x0005            a[4] = 0x00000005
0x20000012                   0x0000
0x20000014                   0x0006            a[5] = 0x00000006
0x20000016                   0x0000
0x20000018                   0x0007            a[6] = 0x00000007
0x2000001A                   0x0000
0x2000001C                   0x0008            a[7] = 0x00000008
0x2000001E                   0x0000
0x20000020                   0x0009            a[8] = 0x00000009
0x20000022                   0x0000
0x20000024                   0x000A            a[9] = 0x0000000A
0x20000026                   0x0000
0x20000028                   0x0000            total = 0x00000000
0x2000002A                   0x0000
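The layout above follows from each int occupying four bytes: element a[i] sits at the base address plus i shifted left by 2, which is the same scaling the LSL #2 addressing in the assembly listing performs. The sketch below models the 44 bytes of data memory and repeats the loop with explicit loads and stores. It assumes little-endian byte order and uses unsigned arithmetic for simplicity; `ram`, `element_address`, `total_address`, `load_word`, `store_word`, `init_ram`, and `sum_array` are hypothetical names.

```c
#include <stdint.h>

#define RAM_BASE  0x20000000u
#define ARRAY_LEN 10u
#define RAM_SIZE  44u                   /* 10 array words + 1 word for total */

static uint8_t ram[RAM_SIZE];           /* models 0x20000000 .. 0x2000002B */

/* Address of a[index]: base + index * 4, written as a shift like LSL #2. */
static uint32_t element_address(uint32_t base, uint32_t index) {
    return base + (index << 2);
}

/* total is placed directly after the ten array elements. */
static uint32_t total_address(void) {
    return element_address(RAM_BASE, ARRAY_LEN);
}

static uint32_t load_word(uint32_t addr) {
    uint32_t off = addr - RAM_BASE;
    return (uint32_t)ram[off] | ((uint32_t)ram[off + 1] << 8) |
           ((uint32_t)ram[off + 2] << 16) | ((uint32_t)ram[off + 3] << 24);
}

static void store_word(uint32_t addr, uint32_t value) {
    uint32_t off = addr - RAM_BASE;
    ram[off]     = (uint8_t)(value & 0xFFu);
    ram[off + 1] = (uint8_t)((value >> 8) & 0xFFu);
    ram[off + 2] = (uint8_t)((value >> 16) & 0xFFu);
    ram[off + 3] = (uint8_t)((value >> 24) & 0xFFu);
}

/* Fill the model with a[] = {1, ..., 10} and total = 0. */
static void init_ram(void) {
    uint32_t i;
    for (i = 0; i < ARRAY_LEN; i++) {
        store_word(element_address(RAM_BASE, i), i + 1u);
    }
    store_word(total_address(), 0u);
}

/* Load a[i], load total, add, store total back, once per iteration. */
static uint32_t sum_array(void) {
    uint32_t i;
    store_word(total_address(), 0u);
    for (i = 0; i < ARRAY_LEN; i++) {
        uint32_t element = load_word(element_address(RAM_BASE, i));
        uint32_t total = load_word(total_address());
        store_word(total_address(), total + element);
    }
    return load_word(total_address());
}
```

In this model `element_address(RAM_BASE, 3)` is 0x2000000C and `total_address()` is 0x20000028, matching the table, and the sum stored in total after the loop is 55.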
Loading Code and Data into Memory
• Stack is mandatory
• Heap is used only if dynamic allocation (e.g. malloc, calloc) is used.
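The two bullets above can be illustrated with the array-sum example. In this sketch the global array has static storage, the locals of `sum_with_stack` live on the stack (or in registers), and only `sum_with_heap` touches the heap because it calls `malloc`. The function names are hypothetical, and where a particular toolchain places each region is decided by its linker configuration, which this sketch does not show.

```c
#include <stdlib.h>

/* Static storage: placed in data memory when the program is loaded. */
static int a[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

/* Uses only static data and stack-allocated locals. */
static int sum_with_stack(void) {
    int i;
    int local_total = 0;                /* local variable: stack or register */
    for (i = 0; i < 10; i++) {
        local_total += a[i];
    }
    return local_total;
}

/* Copies the array into heap memory first, so the heap is needed. */
static int sum_with_heap(void) {
    int i;
    int local_total = 0;
    int *copy = malloc(sizeof a);       /* dynamic allocation uses the heap */
    if (copy == NULL) {
        return -1;                      /* allocation failed */
    }
    for (i = 0; i < 10; i++) {
        copy[i] = a[i];
    }
    for (i = 0; i < 10; i++) {
        local_total += copy[i];
    }
    free(copy);
    return local_total;
}
```

A program that only ever calls `sum_with_stack` needs no heap at all, which is why small embedded programs often avoid dynamic allocation.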
View of a Binary Program
from st.com
STM32L4
from st.com
Memory Map