2. Embedded Development Process
hello world
2
AAME TechCon 2013
TC003v02
2
Development
Environment
Standalone Embedded
Application
In the process of moving from a development environment to a
standalone embedded application, several issues need to be considered:
Application startup
Target memory map
C library use of hardware
4. Introduction
Software Tools Debug Adapters Development Targets
Software Tools
Keil® Microcontroller
Development Kit (MDK)
DS-5
– Basic
– Professional
Debug Adapters
ULINKpro
ULINK2
ULINK-ME
Development Targets
MCU development boards
µVision Simulator
MCU development boards
Fast Models
DSTREAM
5. Keil MDK
Low cost tools for ARM7, ARM9, ARM Cortex-M and ARM Cortex-R4 MCUs
– Support for a wide range of devices
– Core and peripheral simulation
– Flash support
Microcontroller Development Kit (MDK)
– µVision IDE
– ARM Compiler, optimized run-time library, KEIL RTX RTOS
– Real-time trace (for ARM Cortex-M3 and ARM Cortex-M4 based devices)
Real-Time Library
– KEIL RTX RTOS + Source Code
– TCP networking suite, Flash File System, CAN Driver Library, USB Device Interface
Debug Hardware
Evaluation boards
6. Keil MDK: µVision IDE
The µVision Device Database
automatically configures the
development tools for the target
microcontroller:
– http://www.keil.com/dd/whatisdd.asp
The µVision IDE integrates additional
third-party tools like VCS, CASE, and
FLASH/Device Programming
µVision incorporates a project
manager, compiler, editor, and
debugger in a single environment
Identical Target Debugger and
Simulator User Interface
7. Keil MDK: ARM Compiler
Tools
– Compiler: armcc
– Assembler: armasm
– Linker: armlink
– Format Converter: fromelf
– Librarian: armar
– Libraries
Tool flow
– armcc compiles C/C++ source into objects
– armasm assembles assembler source into objects
– armar collects objects into a library
– armlink links objects and libraries into an executable image and reports image stats
– fromelf converts the image to binary
These are all command line tools
– Easy to script
All tools emit useful statistics
– e.g. code size, data size, symbol table, call graph, image memory map etc.
– See the documentation
9. Download programs to your target hardware
Examine memory and registers
Single-step through programs and insert multiple breakpoints
Run programs in real-time
Program Flash Memory
Connect using JTAG or Serial Wire modes
On-the-fly debug of ARM Cortex-M based devices
ULINK Debug Adapters
Examine Trace information from ARM Cortex-M3 and ARM Cortex-M4 devices
ULINKpro
ULINK2
ULINK-ME
10. Development Boards (1)
There are many evaluation boards available from Keil including support for:
– ARM Cortex-M family processors, ARM Cortex-R4 and classic ARM processors like
the ARM7TDMI
Hardware support for Ethernet, CAN, USB, USB Host/OTG, SD Card, serial
interfaces, and QVGA LCDs
Example projects
Available as starter kits which include a ULINK-ME debug adapter
All boards and starter kits include evaluation software, cables, documentation, and
example programs
– Keil ARM Cortex-M boards: http://www.keil.com/boards/cortexm.asp
– Keil ARM Cortex-R boards: http://www.keil.com/boards/cortexr.asp
– Keil ARM7 family boards: http://www.keil.com/boards/arm7.asp
– Keil ARM9 family boards: http://www.keil.com/boards/arm9.asp
Other ARM MCU development boards are available from our partners:
– http://www.keil.com/boards/thirdparty.asp
12. Agenda
Tools
The Embedded Software Development Process
System Configuration
Bit-Banding
Memory Ordering
13. Embedded Development Process
“Out-of-the-box”
hello world
Standalone Embedded
Application
In the process of moving from an “out-of-the-box” build to a standalone
embedded application, several issues need to be considered:
C library use of hardware
Target memory map
Application startup
14. Language Support
Single compiler armcc can compile standard ISO C/C++
Source language modes
– ISO C90
– 1990 C standard, compile option --c90 (default)
– ISO C99
– 1999 C standard, compile option --c99
– ISO C++
– 2003 C++ standard, compile option --cpp
Language compliance
– Default mode supports several common extensions
– Strict mode enforces compliance with language standard: --strict
– GNU mode offers partial support for GCC extensions: --gnu
15. C++ Support
The ARM Compiler supports the full ISO C++ standard, including exceptions
– Plus numerous extensions to the ISO C++ standard
C++ exception handling is off by default
– To use exceptions, enable with --exceptions
– The Rogue Wave Standard C++ library is provided with C++ exceptions enabled
The C++ initialization sequence is defined by the ABI
– Constructors handled by __cpp_initialize__aeabi_
– As part of this, destructors are registered using __aeabi_atexit
The new and delete operators call malloc() and free() functions
– Retarget malloc()/free() to provide your own heap support, not the
new/delete operators
16. Variable types supported
The compiler supports these basic types
Type         Size     Description
int / long   32-bit   (word) integer
short        16-bit   (half-word) integer
char         8-bit    byte, unsigned by default
long long    64-bit   integer
float        32-bit   single-precision IEEE floating point
double       64-bit   double-precision IEEE floating point
bool         8-bit    Boolean (C++ only)
wchar_t      16-bit   “wide character” type (C++ only)
Pointers     32-bit   integer addresses
Take care when porting legacy code from other vendors’
architectures
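When porting, one way to avoid depending on the sizes and signedness listed above is to use the fixed-width types from <stdint.h> and to state signedness explicitly. The following is a minimal sketch, assuming <stdint.h> is available in your toolchain; the helper names are hypothetical and chosen only for illustration.

```c
#include <limits.h>
#include <stdint.h>

/* Returns 1 when plain char is unsigned on this compiler (the ARM Compiler
   default described above), 0 when it is signed. */
int plain_char_is_unsigned(void)
{
    return CHAR_MIN == 0;
}

/* Spell out the signedness instead of relying on plain char, so that the
   comparison behaves the same on every toolchain. */
int byte_is_negative(signed char value)
{
    return value < 0;
}

/* Fixed-width types keep the same size on every toolchain, unlike int or
   long, whose sizes differ between architectures. */
uint32_t pack_halfwords(uint16_t high, uint16_t low)
{
    return ((uint32_t)high << 16) | (uint32_t)low;
}
```

Legacy code that stores negative values in a plain char is a typical porting hazard, because a test such as `c < 0` is never true when char is unsigned.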
17. Optimization Levels
Level of optimizations carried out by the compiler is selectable
-O0
– Minimum optimization
– The least optimized code, but with the best debug view
-O1
– Restricted optimization
– Optimized code and a good debug view
-O2 (default)
– High optimization
– Well optimized code but with limited debug view
-O3
– More aggressive optimization, weighted toward -Ospace / -Otime choice
– Enables multifile compilation by default (more later)
Select optimization for code size or execution speed with -Ospace (default)
or -Otime
Use -g or --debug to generate source level debug information
18. Selecting an Architecture or Core
Each new version of the ARM Architecture typically supports extra
instructions and modes of operation
Implementation of an architecture version may vary between cores
– Use the most specific setting you can when compiling
Inform the compiler of the architecture or processor
– The default CPU setting is ARM7TDMI (Architecture 4T)
– Either specify an architecture version, or a specific core
--cpu 7-M (Do not prefix with a ‘v’)
--cpu Cortex-M3
Some examples of features the compiler and libraries can take advantage of:
– UDIV and SDIV (7-M and 7-R)
– REV (v6) can be used to reverse byte endianness
– Unaligned memory access (v6)
When using the ARM Cortex-M3 it is essential to specify 7-M or ARM Cortex-M3
to ensure the correct (Thumb only) libraries are used
19. Default C Library
Figure: layers of the default C library
– ANSI C library (input/output, error handling, stack & heap setup, other): functions called by your application, e.g. fputc()
– Semihosting Support: device driver level, uses semihosting, e.g. _sys_write()
– Debug Agent: implemented by the debugging environment
Target-dependent C library functionality is supported “out-of-the-box” by
a device driver level that makes use of debug target semihosting support
20. Default Memory Map
By default code is linked to load and execute at 0x8000
The heap is placed directly above the data region
The stack base location is read from the debugging environment by C library startup code
– ISSM => from setting on configuration dialog in RVD “Connection properties”
default = 0x10000000 or 0x22000000
– RVI => from debugger internal variable $top_of_memory
default = 0x80000
Changed in “Connection properties” in RVD: Advanced_information > Default > ARM_Config
Figure: default memory map
– RO, RW and ZI regions starting at 0x8000 (decided at link time)
– Heap directly above ZI
– Stack provided by the debugging environment
21. Application Startup
C Library
– __main (image entry point)
• copy code
• copy/decompress RW data
• zero uninitialized data
– __rt_entry
• set up application stack and heap
• initialize library functions
• call top-level constructors (C++)
• call main( )
• Exit from application
User Code
– main( ) causes the linker to pull in library initialization code
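The data-initialization steps performed before main( ) can be pictured with a simplified C sketch. This is not the ARM library's implementation: the struct, its field names and the function name are hypothetical, and decompression of RW data is left out. In a real image the addresses and sizes come from linker-generated information.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of an image's data regions. */
struct data_regions {
    const uint8_t *rw_load;  /* RW initial values as stored in ROM      */
    uint8_t *rw_exec;        /* where RW data lives at run time (RAM)   */
    size_t rw_size;          /* number of bytes of RW data              */
    uint8_t *zi_exec;        /* start of zero-initialized data in RAM   */
    size_t zi_size;          /* number of bytes of ZI data              */
};

/* Copy the initialized data to its execution address, then zero the
   uninitialized (ZI) data, in the order listed for __main. */
void init_data_regions(const struct data_regions *regions)
{
    memcpy(regions->rw_exec, regions->rw_load, regions->rw_size);
    memset(regions->zi_exec, 0, regions->zi_size);
}
```

Only after these steps do global variables hold their expected initial values, which is why user code normally starts at main( ) rather than at the image entry point.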
22. Thumb C Libraries Provided
RVDS provides a number of ready-built Thumb and Thumb-2 C libraries,
e.g:
c_w.l - Little-endian C library
h_w.b - Big-endian helper library
f_ws.l/.b - Software floating-point library
– See RVCT Compiler and Libraries Guide, section 5.16, “Library naming conventions”
These are “semihosting” libraries
– Use ‘BKPT 0xAB’ to communicate with debug system
Normally, you do not have to list any of these libraries explicitly on the
linker command line
– The ARM linker automatically selects the correct C or C++ libraries to use (and it
might use several), based on the accumulation of the object attributes
Smaller, lower functionality ‘Microlib’ C libraries help to reduce overall
code size
23. CMSIS
ARM Cortex Microcontroller Software Interface Standard (CMSIS)
– Vendor-independent hardware abstraction layer for the Cortex-M series of cores
Provides C language access to core features
– Access to internal registers
– Helper functions for common core tasks
– Internal address definitions for core memory map
– Intrinsics for certain common assembly tasks
Example: function to set interrupt priority mask
__ASM void __set_PRIMASK(uint32_t priMask)
{
msr primask, r0
bx lr
}
Most functions will usually be inlined as a single instruction, so no function call or
code size penalty
Available for download from http://www.onarm.com/
24. Semihosting
Library code runs on ARM target, but low-level I/O handled by host
Host access initiated by BKPT instruction
– “BKPT 0xAB” reserved for semihosting
– Traditional SWI/SVC vector not required
– Simpler than traditional semihosting on other cores
– No need to modify your SVC handler
Interface common for DS-5 Development Studio / DSTREAM
Debug tools must be connected to provide this functionality
Figure: semihosting flow
– Application code calls printf("hello\n");
– Library code executes BKPT
– Communication with debugger running on host, which prints “hello”
25. Retargeting the C Library (1)
You can replace the C library’s device driver level functionality with an
implementation that is tailored to your target hardware
– For example: printf() should go to LCD screen, not debugger console
Figure: retargeting the C library
– Before: ANSI C library input/output goes through Semihosting Support to the Debug Agent
– After retargeting: ANSI C library input/output goes through User Code to the Target Hardware
26. Retargeting the C Library (2)
To ‘Retarget’ the C library, simply replace those C library functions which use
semihosting with your own implementations, to suit your system
– For example, the printf() family of functions (except sprintf()) all ultimately call fputc()
– The default implementation of fputc() uses semihosting
– Replace this with:
#include <stdio.h>
extern void sendchar(char *ch); /* user-supplied output routine */
int fputc(int ch, FILE *f)
{
    /* e.g. write a character to an LCD */
    char tempch = (char)ch;
    (void)f; /* the stream argument is not used here */
    sendchar(&tempch);
    return ch;
}
See retarget.c in the “emb_sw_dev” example directory of your tools installation
for further examples of retargeting
How can you be sure that no semihosting-using functions will be linked in from the
C library?
27. Avoiding C Library Semihosting
To ensure that no functions which use semihosting are linked in from the C library,
import the ‘guard’ symbol __use_no_semihosting
#pragma import(__use_no_semihosting)
If there are still semihosting functions being linked in, the linker will report:
Error: L6915E: Library reports error: __use_no_semihosting was
requested but <function> was referenced.
To fix this, provide your own implementations of these functions
RVCT Compiler and Libraries Guide, Table 5-3 gives a full list of semihosting C
library functions
Note: The linker will NOT report any functions in the user’s own application code
that use semihosting or SVC calls
28. Agenda
Tools
The Embedded Software Development Process
System Configuration
Bit-Banding
Memory Ordering
29. Coming Out of Reset
Processor will be in Thread mode with privileged
operation
Processor will use Main stack
Core will fetch the MSP and PC from the vector table
– Vector table will be located at address 0x0 (generally non-volatile memory)
– PSP (if used) can be set later in reset handler using MSR instruction
All interrupts are disabled
– Vector table must contain valid value for NMI Handler and Hard Fault
Handler
– PRIMASK, FAULTMASK and BASEPRI are cleared
MPU is disabled
– Default memory map is used
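The start of a vector table that satisfies the points above can be sketched in C. This is an illustration only: the struct, handler names and stack size are hypothetical, and placing the table at address 0x0 is done by the linker configuration, which is not shown. On the target, pointers are 32 bits wide, so the four members occupy the first four words of the table.

```c
#include <stdint.h>

#define MAIN_STACK_WORDS 256u   /* hypothetical stack size */

typedef void (*handler_t)(void);

/* First entries of the vector table, in the order the core reads them:
   initial MSP value, then the Reset, NMI and HardFault handlers. */
struct vector_table_head {
    uint32_t *initial_msp;
    handler_t reset;
    handler_t nmi;
    handler_t hard_fault;
};

uint32_t main_stack[MAIN_STACK_WORDS];

void reset_handler(void)      { /* system setup, then branch to __main */ }
void nmi_handler(void)        { for (;;) { } }
void hard_fault_handler(void) { for (;;) { } }

/* The stack grows downward, so the initial MSP is the address just past
   the end of the stack array. */
const struct vector_table_head vector_table = {
    &main_stack[MAIN_STACK_WORDS],
    reset_handler,
    nmi_handler,
    hard_fault_handler
};
```

NMI and HardFault cannot be disabled, which is why valid handler addresses for both must be present even in the smallest image.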
30. Modes and Stacks
Core can support privileged and unprivileged operation
– Used to control access to certain memory and registers
Two different modes – Handler mode and Thread mode
Two possible stacks – Main stack and Process stack
Handler mode
– Entered when taking any exception
– Always privileged
– Always uses main stack
Thread mode
– Core enters thread mode out of reset
– Typically used for User application code
– Runs either privileged or unprivileged (should be configured by the reset handler)
– Can use either main or process stack
– Typically uses process stack if Thread mode is unprivileged
31. Thread Mode Usage Models
Four models supported
Thread Mode Stack    Thread Mode Privilege
Main                 Privileged
Main                 Unprivileged
Process              Privileged
Process              Unprivileged
Selection made by CONTROL[1:0]
Out of reset, Thread Mode uses Main Stack and is Privileged
32. Special Purpose Registers
Unlike other ARM cores, special purpose registers are architecturally defined
– No coprocessor implementation
Registers are described in the v7-M ARM ARM
Special Purpose Register: CONTROL
– Bit 0 (nPRIV) – Thread mode privilege
– Bit 1 (SPSEL) – Thread mode stack
– Bit 2 (FPCA) – FP extension is active (M4 only)
Other registers
– Exception priority mask registers: PRIMASK, FAULTMASK, BASEPRI, BASEPRI_MAX
– Main and process stack pointer: MSP and PSP
– Program Status registers: APSR, EPSR, IPSR, xPSR
Registers can be read or written to using
– MRS and MSR instructions
– CPS instruction (for PRIMASK and FAULTMASK)
– Named registers or intrinsics in C code
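The CONTROL bit assignments can be made concrete with a few helpers that work on a plain 32-bit value. This is a sketch with hypothetical function names; on the target the value would be read with MRS and written with MSR (typically followed by an ISB), which is not shown so that the code stays independent of the toolchain.

```c
#include <stdint.h>

#define CONTROL_NPRIV_MASK (1u << 0)  /* bit 0: 1 = Thread mode is unprivileged */
#define CONTROL_SPSEL_MASK (1u << 1)  /* bit 1: 1 = Thread mode uses the PSP    */

/* Returns 1 if Thread mode runs privileged for this CONTROL value. */
int thread_is_privileged(uint32_t control)
{
    return (control & CONTROL_NPRIV_MASK) == 0u;
}

/* Returns 1 if Thread mode uses the Process stack for this CONTROL value. */
int thread_uses_process_stack(uint32_t control)
{
    return (control & CONTROL_SPSEL_MASK) != 0u;
}

/* Builds CONTROL[1:0] for one of the four Thread mode usage models. */
uint32_t control_for_thread_mode(int privileged, int use_process_stack)
{
    uint32_t control = 0u;
    if (!privileged) {
        control |= CONTROL_NPRIV_MASK;
    }
    if (use_process_stack) {
        control |= CONTROL_SPSEL_MASK;
    }
    return control;
}
```

A value of 0 corresponds to the reset state: privileged Thread mode on the Main stack.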
33. System Memory Interface
Cores have fixed memory map
– Internal configuration registers are mapped onto fixed locations
– External memory assigned to specific address ranges
Cores have three bus interfaces
– ICode memory interface – instruction fetches from 0x00000000 to 0x1FFFFFFF
– DCode memory interface – data accesses from 0x00000000 to 0x1FFFFFFF
– System memory interface – all accesses from 0x20000000 to 0xFFFFFFFF
(excluding Private Peripheral Bus region)
Core can be configured with optional Memory Protection Unit (MPU)
– MPU provides access control for up to 8 memory regions
– MPU configured via memory-mapped control registers
– No extra latency added to memory accesses when using MPU
Memory map includes bit-banding regions
– Writes to a word address in the bit-band alias affect a single bit in the bit-band region
34. Fixed Memory Map
Region                   Start        End          Size    Bus
Code                     0x00000000   0x1FFFFFFF   512MB   ICode or DCode Bus
SRAM                     0x20000000   0x3FFFFFFF   512MB   System Bus
Peripheral               0x40000000   0x5FFFFFFF   512MB   System Bus
External SRAM            0x60000000   0x9FFFFFFF   1GB     System Bus
External Peripheral      0xA0000000   0xDFFFFFFF   1GB     System Bus
System (XN), incl. PPB   0xE0000000   0xFFFFFFFF   512MB
Bit-band areas
– SRAM: 0x20000000 to 0x23FFFFFF
– Peripheral: 0x40000000 to 0x43FFFFFF
Private Peripheral Bus: Internal (0xE0000000 to 0xE003FFFF)
– ITM        0xE0000000
– DWT        0xE0001000
– FPB        0xE0002000
– RESERVED   0xE0003000
– SCS        0xE000E000
– RESERVED   0xE000F000
Private Peripheral Bus: Debug/External (0xE0040000 to 0xE00FFFFF)
– TPIU           0xE0040000
– ETM            0xE0041000
– External PPB   0xE0042000
– ROM Table      0xE00FF000
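Because the memory map is fixed, an address alone identifies its region and the bus interface used to reach it. The lookup below is a sketch of that idea; the enum and function names are hypothetical, and the boundaries are those of the fixed map for this slide.

```c
#include <stdint.h>

enum memory_region {
    REGION_CODE,
    REGION_SRAM,
    REGION_PERIPHERAL,
    REGION_EXTERNAL_SRAM,
    REGION_EXTERNAL_PERIPHERAL,
    REGION_PPB,
    REGION_SYSTEM
};

enum bus_interface { BUS_ICODE_DCODE, BUS_SYSTEM, BUS_PPB };

/* Maps an address to its region in the fixed memory map. */
enum memory_region memory_region_of(uint32_t address)
{
    if (address < 0x20000000u) return REGION_CODE;
    if (address < 0x40000000u) return REGION_SRAM;
    if (address < 0x60000000u) return REGION_PERIPHERAL;
    if (address < 0xA0000000u) return REGION_EXTERNAL_SRAM;
    if (address < 0xE0000000u) return REGION_EXTERNAL_PERIPHERAL;
    if (address < 0xE0100000u) return REGION_PPB;
    return REGION_SYSTEM;
}

/* Maps an address to the bus interface that carries the access. */
enum bus_interface bus_for_address(uint32_t address)
{
    if (address < 0x20000000u) return BUS_ICODE_DCODE;
    if (address >= 0xE0000000u && address < 0xE0100000u) return BUS_PPB;
    return BUS_SYSTEM;
}
```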
35. System Control Space
4K address range (0xE000E000 to 0xE000EFFF) for configuration, control,
and status registers
Registers in the System Control Space are architecturally defined and
described in the v7-M ARM ARM
Groups of registers:
– System ID block – Processor version and manufacturer info
– System Control Block – System configuration and internal exception management
– SysTick system timer
– Nested Vectored Interrupt Controller (NVIC)
– Memory Protection Unit (MPU)
– Debug control and configuration
Registers can be accessed through regular memory mapped accesses
36. SysTick Timer
24-bit countdown timer intended for use by RTOS
2 clock sources
– Internal core clock
– External clock (driven by STCLK signal)
2 modes
– Generate interrupt
– Do not interrupt
Can be used as a general timer if not used by RTOS
Programmer’s Interface
– Control and Status Register - SYST_CSR: sets mode and has self-clearing
overflow bit
– Reload Value Register - SYST_RVR: sets the SysTick period
– Current Value Register - SYST_CVR: shows current SysTick value
– Calibration Value Register - SYST_CALIB: for configuring a 10ms period
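The value written to SYST_RVR follows from the clock frequency and the desired tick rate. The counter counts from the reload value down to zero, so a period of N clock cycles needs a reload value of N - 1, and the result must fit in 24 bits. The function below is a sketch of that arithmetic; its name and the convention of returning 0 for an unusable request are assumptions made for this example.

```c
#include <stdint.h>

#define SYSTICK_MAX_RELOAD 0x00FFFFFFu  /* the counter is 24 bits wide */

/* Returns the reload value for the requested tick rate, or 0 if the
   period cannot be represented by the 24-bit counter. */
uint32_t systick_reload_value(uint32_t clock_hz, uint32_t tick_hz)
{
    uint32_t cycles_per_tick;

    if (tick_hz == 0u) {
        return 0u;
    }
    cycles_per_tick = clock_hz / tick_hz;
    if (cycles_per_tick < 2u || cycles_per_tick - 1u > SYSTICK_MAX_RELOAD) {
        return 0u;
    }
    return cycles_per_tick - 1u;
}
```

For example, a hypothetical 72 MHz core clock and a 1 kHz tick give 72000 cycles per tick, so the reload value is 71999.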
37. Power Management
Software can put ARM Cortex-M3 or M4 to sleep
– Processor waits for bus cycles and internal operations to complete
– Processor asserts SLEEPING signal (in case external logic can be clock gated)
– Processor then turns off internal clock gates
– NVIC interrupt interface stays awake
Core can also enter Deep Sleep mode
– Intended for longer duration sleep – different output signal to external clocking
logic
– For example, Deep Sleep may disable clock logic for additional power savings
– If implemented, WIC stays awake and can bring core out of deep sleep mode
Sleep-now
– Entered by executing WFE or WFI instruction
– WFE intended for multi-core systems
Sleep-on-exit
– Entered when core exits the lowest priority pending interrupt service routine
– Configured by setting the SLEEPONEXIT bit in the System Control Register
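The sleep options are selected through bits in the System Control Register (SCR). The sketch below only computes the new register value; the function name is hypothetical, and the bit positions (SLEEPONEXIT in bit 1, SLEEPDEEP in bit 2) are taken from the ARMv7-M architecture rather than from these slides. On the target, the result would be written to the SCR before executing WFI or WFE.

```c
#include <stdint.h>

#define SCR_SLEEPONEXIT_MASK (1u << 1)  /* sleep when returning to Thread mode */
#define SCR_SLEEPDEEP_MASK   (1u << 2)  /* select Deep Sleep instead of Sleep  */

/* Returns the SCR value with the two sleep options set as requested,
   leaving all other bits unchanged. */
uint32_t scr_with_sleep_options(uint32_t scr, int sleep_on_exit, int deep_sleep)
{
    scr &= ~(SCR_SLEEPONEXIT_MASK | SCR_SLEEPDEEP_MASK);
    if (sleep_on_exit) {
        scr |= SCR_SLEEPONEXIT_MASK;
    }
    if (deep_sleep) {
        scr |= SCR_SLEEPDEEP_MASK;
    }
    return scr;
}
```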
38. Power Management (2)
Waking up from sleep
– Interrupt received by the WIC (if that logic is implemented on the chip)
– Interrupt received by the NVIC, if not powered down
– External event in multicore system (generated through a SEV command)
– Debug event
Power management
– Very implementation-dependent
– Entered by configuring custom power controller, and then entering sleep mode
– Power controller shuts off all or part of the core when CPU asserts sleep signals
Recovering from power down modes
– Depends on what was powered down – program registers, system control
registers, memory, etc.
– System state may have to be saved before powering down
– Recovery may look like a reset (warm or cold)
– All very implementation-dependent
39. Agenda
Tools
The Embedded Software Development Process
System Configuration
Bit-Banding
Memory Ordering
40. v7-M Bit-banding
Traditional Method of Read-Modify-Write Manipulation
1. Read byte from SRAM (0x20000000): 0 0 0 0 0 0 0 0
2. Mask and modify bit element: x x x x x 1 x x
3. Write byte to SRAM (0x20000000): 0 0 0 0 0 1 0 0
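In C, the traditional sequence corresponds to three separate steps. The sketch below uses a hypothetical function name and makes each step explicit. An interrupt taken between the read and the write can change the byte in the meantime, and that change is then overwritten, which is the problem bit-banding avoids.

```c
#include <stdint.h>

/* Sets one bit of a byte in memory using read-modify-write. */
void set_bit_read_modify_write(volatile uint8_t *byte, unsigned bit)
{
    uint8_t value = *byte;                    /* 1. read the byte        */
    value = (uint8_t)(value | (1u << bit));   /* 2. modify the bit       */
    *byte = value;                            /* 3. write the byte back  */
}
```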
41. v7-M Bit-banding
Writes to a word address in the
bit-band alias affect a single bit in
the bit-band region
The write is translated to an atomic
read-modify-write by the Cortex-M3
bus matrix
Bit 0 of the stored register is written
to the appropriate bit
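Figure: bit-band mapping (shown twice, once for SRAM and once for peripherals)
– Each word in the 32MB bit-band alias corresponds to one physical bit in the 1MB bit-band region
– A 31MB gap separates the bit-band region from its alias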
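The mapping from a bit in the bit-band region to its word in the alias is simple arithmetic: each byte of the region expands to 32 bytes of alias (8 bits, one word each), and each bit adds 4 bytes. The function below is a sketch of that calculation with hypothetical names; the alias is assumed to start 0x02000000 above the region base, as it does for both the SRAM and peripheral regions.

```c
#include <assert.h>
#include <stdint.h>

#define BITBAND_REGION_SIZE  0x00100000u  /* 1MB bit-band region        */
#define BITBAND_ALIAS_OFFSET 0x02000000u  /* alias base minus region base */

/* Returns the alias word address for one bit of the word or byte at
   'address' inside the bit-band region that starts at 'region_base'. */
uint32_t bitband_alias_address(uint32_t region_base, uint32_t address,
                               unsigned bit)
{
    uint32_t byte_offset = address - region_base;

    assert(byte_offset < BITBAND_REGION_SIZE);
    assert(bit < 32u);
    return region_base + BITBAND_ALIAS_OFFSET + byte_offset * 32u + bit * 4u;
}
```

For instance, bit 0 of the SRAM word at 0x20004000 maps to 0x22000000 + 0x4000 * 32 = 0x22080000.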
42. Bit-banding C Code Example
Two regions: SRAM & Peripheral, both work in the same way
– Bit 0 of the Read/Write word contains the ‘Bit Data’
– Writing to 0x20000000 will modify the SRAM word
– Writing to 0x22000000 will modify a bit within the SRAM word
#define BITBAND_SRAM_REF 0x20000000
#define BITBAND_SRAM_BASE 0x22000000
#define BITBAND_SRAM(a,b) (BITBAND_SRAM_BASE + ((a)-BITBAND_SRAM_REF)*32 + ((b)*4)) // Convert SRAM address
#define BITBAND_PERI_REF 0x40000000
#define BITBAND_PERI_BASE 0x42000000
#define BITBAND_PERI(a,b) (BITBAND_PERI_BASE + ((a)-BITBAND_PERI_REF)*32 + ((b)*4)) // Convert PERI address
#define MAILBOX 0x20004000
#define TIMER 0x40004000
#define MBX_B0 *((volatile unsigned int *)(BITBAND_SRAM(MAILBOX,0))) // Bit 0
#define MBX_B7 *((volatile unsigned int *)(BITBAND_SRAM(MAILBOX,7))) // Bit 7
#define TIMER_B0 *((volatile unsigned char *)(BITBAND_PERI(TIMER,0))) // Bit 0
#define TIMER_B7 *((volatile unsigned char *)(BITBAND_PERI(TIMER,7))) // Bit 7
int main(void)
{
unsigned int temp = 0;
MBX_B0 = 1; // Word write
temp = MBX_B7; // Word read
TIMER_B0 = temp; // Byte write
return TIMER_B7; // Byte read
}
main LDR r0,|L1.16| // Pointer to MAILBOX
MOVS r1,#1
STR r1,[r0,#0]
LDR r0,[r0,#0x1c]
LDR r1,|L1.20| // Pointer to TIMER
STRB r0,[r1,#0]
LDRB r0,[r1,#0x1c]
BX lr
|L1.16| DCD 0x22080000 // Bitband addr of MAILBOX
|L1.20| DCD 0x42080000 // BitBand addr of TIMER
43. Bit-band Attribute & Switch
The ARM Compiler toolchain includes a “bitband” attribute
typedef struct {
int i:1;
int j:1;
int k:1;
} BB __attribute__((bitband));
BB bb __attribute__((at(0x20000004))); // bb in bit-banded space
void set(void)
{
bb.i = 1; // causes a write to address 0x22000080
// aliased region for address 0x20000004
}
The --bitband option applies the “bitband” attribute to all structures, to avoid
having to modify existing (legacy) source code
44. Agenda
Tools
The Embedded Software Development Process
System Configuration
Bit-Banding
Memory Ordering
45. Is Memory Access Order Important?
LDR r0, <address 1>
LDR r1, <address 2>
SUBS r2,r0,r1
Figure: two cases for the loads above
– RAM: address 1 holds 23 and address 2 holds 5
– FIFO: address 1 = address 2, and the FIFO holds 23 and 5
How do we tell the processor when order is important?
46. Memory Access Order
Strongly Ordered memory
– Force all pending memory accesses to complete first
– Subsequent accesses will not complete until current access has completed
– No order defined with respect to normal and strongly-ordered memory accesses
Device memory - Shared
– Accesses to this memory type will complete in program order
– No order defined with respect to non-shared device or normal memory accesses
Device memory - Non-Shared
– Accesses to this memory type will complete in program order
– No order defined with respect to shared device or normal memory accesses
Normal memory
– Accesses to this memory type may complete in any order
– Accesses to the same location will complete in order (e.g. RAW hazards)
– Accesses to this memory type can be repeated without altering the result
47. Memory Ordering Restrictions
A1 ↓  A2 →           Normal   Device Non-Shared   Device Shared   Strongly Ordered
Normal               -        -                   -               -
Device Non-Shared    -        <                   -               <
Device Shared        -        -                   <               <
Strongly Ordered     -        <                   <               <
< Access order defined
- Access order not defined
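The ordering restrictions can be expressed as a small lookup table, which is sometimes handy when reasoning about a sequence of accesses in a review or a test. The sketch below uses hypothetical names and encodes the table for two accesses to different addresses, where A1 comes before A2 in program order.

```c
enum memory_type {
    MEM_NORMAL,
    MEM_DEVICE_NON_SHARED,
    MEM_DEVICE_SHARED,
    MEM_STRONGLY_ORDERED,
    MEM_TYPE_COUNT
};

/* Returns 1 if the order of access A1 followed by access A2 is defined
   for the given memory types, 0 if it is not. */
int access_order_defined(enum memory_type a1, enum memory_type a2)
{
    static const unsigned char ordered[MEM_TYPE_COUNT][MEM_TYPE_COUNT] = {
        /* A2:                   Normal  NonShared  Shared  Strongly */
        /* Normal            */ { 0,      0,         0,      0 },
        /* Device Non-Shared */ { 0,      1,         0,      1 },
        /* Device Shared     */ { 0,      0,         1,      1 },
        /* Strongly Ordered  */ { 0,      1,         1,      1 }
    };
    return ordered[a1][a2];
}
```

Where the table gives no defined order and the order still matters, a memory barrier instruction is needed between the two accesses.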
48. Memory Barrier Instructions
Classic ARM processors, such as the ARM7TDMI, execute instructions
and complete data accesses in program order
Some of the latest ARM processors can optimize the order of instruction
execution and data accesses
In some situations it might be necessary to ensure that an operation has
completed before continuing execution
ARM processors such as ARM Cortex-M0+ and ARM Cortex-A15 provide
barrier instructions which can guarantee completion of any preceding
load and store instructions and flush the processor pipeline
ARMv7-M processors, e.g. ARM Cortex-M3, do not perform out-of-order
execution. However, there are some situations where barriers are
required, for example:
– Implementation requirements (Cortex-M3 and Cortex-M4), for example:
– Memory map switching and self-modifying code
– ARMv6-M architectural and portable code requirements, for example:
– CONTROL register or VTOR updates and MPU Programming
49. Data Memory Barrier (DMB)
Ensures that all explicit memory transactions complete before it finishes
No explicit memory transactions in code after this instruction may start until the
Memory Barrier instruction completes
Memory accesses to Strongly-ordered memory, such as the System Control Block,
do not require the use of DMB instructions
LDR r3,<address 1>
DMB
ADD r0,r1,r2 ; can be executed before DMB completes
STR r0,<address 2> ; memory access cannot be performed until DMB completes
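A typical use of DMB from C is a mailbox in normal memory: the data is written first, then the barrier, then the flag that announces it. The sketch below is an assumption-laden illustration, not code from the toolchain: the macro, struct and function names are hypothetical, the ARM branch uses GCC-style inline assembly, and a C11 fence stands in on other hosts so the example can be built anywhere. CMSIS offers __DMB() as a portable spelling on Cortex-M devices.

```c
#include <stdint.h>

#if defined(__GNUC__) && defined(__arm__)
#define DATA_MEMORY_BARRIER() __asm volatile ("dmb" ::: "memory")
#else
#include <stdatomic.h>
/* Host stand-in so that the sketch builds on a non-ARM machine. */
#define DATA_MEMORY_BARRIER() atomic_thread_fence(memory_order_seq_cst)
#endif

struct mailbox {
    volatile uint32_t payload;  /* data passed to the consumer        */
    volatile uint32_t ready;    /* non-zero when payload is valid     */
};

/* Producer: the payload store must complete before the flag store. */
void mailbox_post(struct mailbox *box, uint32_t value)
{
    box->payload = value;
    DATA_MEMORY_BARRIER();
    box->ready = 1u;
}

/* Consumer: returns 1 and stores the payload in *value if one is ready. */
int mailbox_take(struct mailbox *box, uint32_t *value)
{
    if (box->ready == 0u) {
        return 0;
    }
    DATA_MEMORY_BARRIER();  /* flag load completes before payload load */
    *value = box->payload;
    box->ready = 0u;
    return 1;
}
```

On a Cortex-M3 the accesses already complete in program order, so the barrier here mainly keeps the code portable to processors that can reorder them.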
50. Data Synchronization Barrier (DSB)
Ensures that all explicit memory transactions complete before it finishes
No further instructions may complete execution or change the interrupt masks
until the Memory Barrier instruction completes
LDR r3,<address 1>
DSB
ADD r0,r1,r2 ; cannot be executed until DSB completes
STR r0,<address 2> ; cannot be executed until DSB completes
51. Instruction Synchronization Barrier
Ensures that the pipeline of the processor is
flushed
– Instructions in pipeline stages are cleaned out
– Including all instruction buffers
– Instructions are fetched again from memory system