SlideShare ist ein Scribd-Unternehmen logo
1 von 40
Downloaden Sie, um offline zu lesen
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 1
MANCHEstER
1824
The
University
of
Manchester
ARM integer cores
❏ Outline:
❍ the ARM 3-stage pipeline
❍ the ARM7TDMI core
❍ the ARM 5-stage pipeline
❍ the ARM9TDMI core
❍ the ARM10TDMI core
❍ StrongARM & XScale
❍ the ARM11 core
❍ Cortex
☞ hands-on: system software – interrupts
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 2
MANCHEstER
1824
The
University
of
Manchester
ARM integer cores
❏ Outline:
➜ the ARM 3-stage pipeline
❍ the ARM7TDMI core
❍ the ARM 5-stage pipeline
❍ the ARM9TDMI core
❍ the ARM10TDMI core
❍ StrongARM & XScale
❍ the ARM11 core
❍ Cortex
☞ hands-on: system software – interrupts
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 3
MANCHEstER
1824
The
University
of
Manchester
The 3-stage ARM pipeline
(The original architecture which has affected on the instruction set.)
❏ fetch
❍ the instruction is fetched from memory
❏ decode
❍ the instruction is decoded and the datapath control signals
prepared for the next cycle
❏ execute
❍ the operands are read from the register bank, shifted,
combined in the ALU and the result written back
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 4
MANCHEstER
1824
The
University
of
Manchester
The 3-stage ARM pipeline
❏ Single cycle instructions
❍ complete at a rate of one per clock cycle
❍ intended to use (a single) memory efficiently
fetch decode execute
fetch decode execute
fetch decode execute
1
2
3
instruction
time
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 5
MANCHEstER
1824
The
University
of
Manchester
The 3-stage ARM pipeline
❏ More complex instructions:
❍ ‘STR’ causes a stall while transfer occurs
fetch ADD decode execute
1
2
3
instruction
time
4
5
fetch STR decode calc. addr.
fetch ADD decode execute
fetch ADD decode execute
fetch ADD decode execute
data xfer
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 6
MANCHEstER
1824
The
University
of
Manchester
The 3-stage ARM pipeline
❏ PC behaviour
❍ r15 increments twice before an instruction executes
– due to pipeline operation
❍ therefore r15 = address of instruction + 8
– (+12 if used after first cycle, though this is architecturally
undefined)
– in Thumb code the offset is +4
❍ normally the assembler makes the necessary adjustments,
e.g. in branches
This behaviour is consistent for all ARMs,
although the pipeline structures may vary.
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 7
MANCHEstER
1824
The
University
of
Manchester
3-stage ARM organization
❏ ARM components:
❍ register bank
– 2 read ports, 1 write port
– plus additional read and write ports for r15
❍ barrel shifter
❍ ALU
❍ address register and incrementer
❍ memory data registers
❍ instruction decoder and control
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 8
MANCHEstER
1824
The
University
of
Manchester
3-stage ARM organization
❍ Separate address
incrementer
❍ Two register read ports
❍ Barrel shifter in series
with ALU
data out register data in register
address register
incrementer
register
bank
barrel
shifter
ALU
multiply
register
instruction
control
and
decode
instruction
control
register
A[31:0]
D[31:0]
ALU
bus
A
bus
B
bus
PC
PC
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 9
MANCHEstER
1824
The
University
of
Manchester
Why does this matter?
The internal structure of the processor can result in pipeline stalls
(⇒ loss of performance) in some circumstances.
❏ Instruction dependencies
❍ pipeline needs flushing and refilling when a branch is taken
❍ alleviated by branch prediction
❏ Data dependencies
❍ instruction may have to wait for the result from a previous one
❍ alleviated by forwarding
❍ particular problem with loads (memory is long latency)
– code reordering can help
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 10
MANCHEstER
1824
The
University
of
Manchester
ARM integer cores
❏ Outline:
❍ the ARM 3-stage pipeline
➜ the ARM7TDMI core
❍ the ARM 5-stage pipeline
❍ the ARM9TDMI core
❍ the ARM10TDMI core
❍ StrongARM & XScale
❍ the ARM11 core
❍ Cortex
☞ hands-on: system software – interrupts
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 11
MANCHEstER
1824
The
University
of
Manchester
The ARM7TDMI
❏ The ARM7TDMI is …
❍ an ARM7 3-stage pipeline core, with
❍ T - support for the Thumb instruction set
❍ D - support for debug
– the processor can stop on a debug event
❍ M - support for long multiplies
❍ I - the EmbeddedICE macrocell
– provides breakpoint and watchpoint hardware
• described later
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 12
MANCHEstER
1824
The
University
of
Manchester
ARM7TDMI organization
❍ Scan chains provide access to signals for debugging
EmbeddedICE
JTAG TAP
controller
bus
splitter
processor
core
extern0
extern1
Din[31:0]
Dout[31:0]
D[31:0]
A[31:0]
control
other
signals
JTAG
scan chain 0
scan chain 2
scan chain 1
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 13
MANCHEstER
1824
The
University
of
Manchester
The
ARM7TDMI
core interface
signals
mclk
wait
eclk
bigend
irq
fiq
isync
reset
enin
enout
enouti
abe
ale
ape
dbe
tbe
busen
highz
busdis
ecapclk
dbgrq
breakpt
dbgack
exec
extern1
extern0
dbgen
rangeout0
rangeout1
dbgrqi
commrx
commtx
opc
cpi
cpa
cpb
A[31:0]
Din[31:0]
Dout[31:0]
D[31:0]
bl[3:0]
r/w
mas[1:0]
mreq
seq
lock
trans
mode[4:0]
abort
Tbit
tapsm[3:0]
ir[3:0]
tdoen
tck1
tck0
screg[3:0]
drivebs
ecapclkbs
icapclkbs
highz
pclkbs
rstclkbs
sdinbs
sdoutbs
shclkbs
shclk2bs
TRST
TCK
TMS
TDI
TDO
Vdd
Vss
ARM7TDMI
core
boundary
scan
extension
memory
interface
state
TAP
information
JTAG
controls
MMU
interface
clock
control
configuration
interrupts
initialization
bus
control
debug
coprocessor
interface
power
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 14
MANCHEstER
1824
The
University
of
Manchester
ARM7TDMI core interface signals
❏ Memory interface
❍ ~mreq - memory request
❍ seq - sequential address
– together these signals indicate the sort of bus cycle which will
happen next
– they are pipelined ahead to aid memory design
mreq seq Cycle Use
0 0 N Non-sequential memory access
0 1 S Sequential memory access
1 0 I Internal cycle bus and memory inactive
1 1 C Coprocessor register transfer memory inactive
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 15
MANCHEstER
1824
The
University
of
Manchester
ARM7TDMI core interface signals
❏ Notes:
❍ request for memory is in clock cycle before transfer occurs
❍ wait state insertion is possible
mclk
Din[31:0]
Dout[31:0]
mreq, seq
abort
A[31:0]
& control
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 16
MANCHEstER
1824
The
University
of
Manchester
ARM7TDMI
❏ ARM7TDMI debug support
❍ the EmbeddedICE module
❍ supports breakpoints and watchpoints
– controlled via the JTAG test access port
– EmbeddedICE & JTAG are covered later
❏ ARM7TDMI characteristics:
Process 0.35 µm Transistors 74,209 MIPS 60
Metal layers 3 Core area 2.1 mm2 Power 87 mW
Vdd 3.3V Clock 0 to 66 MHz MIPS/W 690
A ‘modern’ ARM7 (0.18µm) is < mm2
1
2
--
-
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 17
MANCHEstER
1824
The
University
of
Manchester
ARM integer cores
❏ Outline:
❍ the ARM 3-stage pipeline
❍ the ARM7TDMI core
➜ the ARM 5-stage pipeline
❍ the ARM9TDMI core
❍ the ARM10TDMI core
❍ StrongARM & XScale
❍ the ARM11 core
❍ Cortex
☞ hands-on: system software – interrupts
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 18
MANCHEstER
1824
The
University
of
Manchester
Getting higher performance
❏ Increase the clock rate
❍ the clock rate is limited by the slowest pipeline stage
– decrease the logic complexity per stage
– increase the pipeline depth (number of stages)
❏ improve the CPI (clocks per instruction)
❍ fewer wasted cycles
– better memory bandwidth
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 19
MANCHEstER
1824
The
University
of
Manchester
The 5-stage ARM pipeline
❏ Fetch
❏ Decode
❍ instruction decode and register read
❏ Execute
❍ shift and ALU
❏ Memory
❍ data memory access
❏ Write-back
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 20
MANCHEstER
1824
The
University
of
Manchester
The 5-stage ARM pipeline
❏ Reducing the CPI
❍ ARM7 uses the memory on nearly every clock cycle
– for either instruction fetch or data transfer
❍ therefore a reduced CPI requires
more than one memory access per clock cycle
❏ Possible solutions are:
❍ separate instruction and data memories
❍ double-bandwidth memory (e.g. ARM8)
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 21
MANCHEstER
1824
The
University
of
Manchester
ARM integer cores
❏ Outline:
❍ the ARM 3-stage pipeline
❍ the ARM7TDMI core
❍ the ARM 5-stage pipeline
➜ the ARM9TDMI core
❍ the ARM10TDMI core
❍ StrongARM & XScale
❍ the ARM11 core
❍ Cortex
☞ hands-on: system software – interrupts
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 22
MANCHEstER
1824
The
University
of
Manchester
ARM9TDMI
❏ The ARM9TDMI is …
❍ a ‘classic’ Harvard architecture 5-stage pipeline
– separate instruction and data memory ports
❍ with full support for Thumb and EmbeddedICE debug
❍ aimed at significantly higher performance than the
ARM7TDMI
– enhanced pipeline (then) operated at 100-200 MHz
– (now) up to 250MHz
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 23
MANCHEstER
1824
The
University
of
Manchester
ARM9TDMI pipeline
❍ Thumb instructions are decoded directly
instruction
fetch
data mem.
access
ARM
decode read
reg.
write
reg.
shift/ALU
shift/ALU
read
write
reg.
Thumb
decompress
instruction
fetch
decode
Decode Execute
Fetch
Decode Execute
Fetch Memory Write
ARM7TDMI:
ARM9TDMI:
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 24
MANCHEstER
1824
The
University
of
Manchester
ARM9TDMI
pipeline
❍ Pipeline
introduces more
dependencies
❍ alleviated with
forwarding paths
register write
+ 4
shifter
ALU
mul
register
read
mux
byte repl.
D-cache
rot/sgnex
I decode
I cache
+ 4
next PC
PC+4
PC+8
(r15)
LDM/STM
B, BL,
MOV pc,
SUB pc
LDR pc
reg shift
address
post- index
pre-
index
FETCH
DECODE
EXECUTE
BUFFER/DATA
WRITE
immediate
forwarding
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 25
MANCHEstER
1824
The
University
of
Manchester
ARM9TDMI
❏ EmbeddedICE
❍ as ARM7TDMI, plus:
– hardware single-stepping
– breakpoints on exceptions
❏ On-chip coprocessor support
❍ for floating-point, DSP, and so on
Process 0.25 µm Transistors 111,000 MIPS 220
Metal layers 3 Core area 2.1 mm2 Power 150 mW
Vdd 2.5 V Clock 0-200 MHz MIPS/W 1,500
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 26
MANCHEstER
1824
The
University
of
Manchester
ARM integer cores
❏ Outline:
❍ the ARM 3-stage pipeline
❍ the ARM7TDMI core
❍ the ARM 5-stage pipeline
❍ the ARM9TDMI core
➜ the ARM10TDMI core
❍ StrongARM & XScale
❍ the ARM11 core
❍ Cortex
☞ hands-on: system software – interrupts
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 27
MANCHEstER
1824
The
University
of
Manchester
ARM10TDMI
❏ The ARM10TDMI is …
❍ aimed at significantly higher performance than the
ARM9TDMI
❍ achieved through use of:
– higher clock rate
– 64-bit I- and D-memory buses
– branch prediction
– hit-under-miss D-memory interface
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 28
MANCHEstER
1824
The
University
of
Manchester
ARM10TDMI pipeline
❏ 6-stage pipeline
❍ Additional time allowed for
– I- and D-memory accesses
– instruction decode
multiply
Write
instruction
fetch
decode
decode
read multiplier
partial add write
reg.
shift/ALU
Memory
Execute
Decode
Issue
Fetch
data mem.
access write
data
addr.
calc.
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 29
MANCHEstER
1824
The
University
of
Manchester
ARM integer cores
❏ Outline:
❍ the ARM 3-stage pipeline
❍ the ARM7TDMI core
❍ the ARM 5-stage pipeline
❍ the ARM9TDMI core
❍ the ARM10TDMI core
➜ StrongARM & XScale
❍ the ARM11 core
❍ Cortex
☞ hands-on: system software – interrupts
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 30
MANCHEstER
1824
The
University
of
Manchester
StrongARM & XScale
❏ Most ARM implementations are designed by ARM Ltd.
❏ Two notable exceptions:
❍ StrongARM
– designed by Digital, later bought by Intel
– now largely obsolete
❍ XScale
– designed by Intel
– current
– up to 800 MHz
Both of these were designed primarily for high performance
Used for proprietary devices
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 31
MANCHEstER
1824
The
University
of
Manchester
XScale™ pipeline
❍ another deep pipeline
❍ (unpredicted) branches are expensive
❍ data (memory) dependencies cause stalls
BTB
fetch
data mem.
access
early termination
multiply
accumulate
fetch2
read/
shift
decode
ALU
exec.
state
exec.
ALU
writeback
data
writeback
MAC
writeback
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 32
MANCHEstER
1824
The
University
of
Manchester
ARM integer cores
❏ Outline:
❍ the ARM 3-stage pipeline
❍ the ARM7TDMI core
❍ the ARM 5-stage pipeline
❍ the ARM9TDMI core
❍ the ARM10TDMI core
❍ StrongARM & XScale
➜ the ARM11 core
❍ Cortex
☞ hands-on: system software – interrupts
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 33
MANCHEstER
1824
The
University
of
Manchester
ARM11
❏ Most recent available ARM core (from ARM Ltd.)
❏ process portable
❏ high speed
❍ faster clock (~550MHz)
❍ deeper pipeline
– also to accommodate DSP operations
❍ improved branch prediction
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 34
MANCHEstER
1824
The
University
of
Manchester
ARM11 pipeline
❏ 8-stage pipeline
❍ Parallel ALU and data access (especially during LDM/STM)
❍ ‘Hit under Miss’ in ‘data1’ alleviates cache miss stalls
instruction
fetch
Fetch 1
data mem.
access
branch
prediction
Fetch 2
instruction
decode
Decode
register
read
Issue
shift
shift/addr./
ALU saturation
register
write
Writeback
multiply
accumulate
address
generation
register
write
MAC1
ALU/data1/
MAC2
Sat./data2/
MAC3
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 35
MANCHEstER
1824
The
University
of
Manchester
ARM 11 branch prediction
❍ Two-level dynamic branch prediction
– constant offset branches
– 128 entry, direct mapped (addr[9:3])
– cost: 1 or 0 cycles if successful (taken/not taken)
❍ Static branch prediction
– constant offset branches …
– … that miss the dynamic predictor
– cost: 4 cycles if successful
❍ Return stack
– three entries
– Pushes on BL or BLX (inc. BLX Rn)
– Pops on: {BX lr; MOV pc, lr; LDR pc, [sp], #n; LDMIA sp!, {…,pc}}
– cost: 4 cycles if successful
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 36
MANCHEstER
1824
The
University
of
Manchester
ARM 11 dependencies
❍ Data operations forward results
❍ Load operations impose an extra two cycle penalty (cache hit)
❍ Memory operations require their address registers early
Thus: ADD r2, r1, r0 ; produce R2
ADD r4, r3, r2 ; consume R2 - no stall
LDR r2, [r1] ; load r2
ADD r4, r3, r2 ; lose 2 cycles waiting
ADD r2, r1, r0 ; produce R2
LDR r4, [r2] ; lose one cycle
LDR r2, [r1] ;
LDR r4, [r2] ; lose 3 cycles in total
– a few other dependencies may be seen from the pipeline figure
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 37
MANCHEstER
1824
The
University
of
Manchester
ARM integer cores
❏ Outline:
❍ the ARM 3-stage pipeline
❍ the ARM7TDMI core
❍ the ARM 5-stage pipeline
❍ the ARM9TDMI core
❍ the ARM10TDMI core
❍ StrongARM & XScale
❍ the ARM11 core
➜ Cortex
☞ hands-on: system software – interrupts
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 38
MANCHEstER
1824
The
University
of
Manchester
Cortex™
❏ Three ‘flavours’:
❍ Cortex-A Series
– applications processors
– ARM, Thumb and Thumb-2 instruction sets
❍ Cortex-R Series
– embedded processors for real-time systems.
– ARM, Thumb, and Thumb-2 instruction sets
❍ Cortex-M Series
– deeply embedded/cost sensitive processors
– Thumb-2 instruction set only
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 39
MANCHEstER
1824
The
University
of
Manchester
Cortex-M3
❏ First of the line
❍ Thumb-2 only
❍ new design
– 3-stage pipeline
– Harvard core
❍ small core
– ~ mm2
(excluding cache)
❍ performance (0.18µm):
– ~100 MHz
– ~8000 MIPS/W
1
2
--
-
© 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 40
MANCHEstER
1824
The
University
of
Manchester
Hands-on: system software
– interrupts
❏ Look further into ARM system software issues
❍ Write an interrupt handler
❍ Use it to trigger a context switch
☞ Follow the ‘Hands-on’ instructions

Weitere ähnliche Inhalte

Ähnlich wie 07_ARM_Cores.pdf

Arm Processors Architectures
Arm Processors ArchitecturesArm Processors Architectures
Arm Processors ArchitecturesMohammed Hilal
 
Arm cortex-m3 by-joe_bungo_arm
Arm cortex-m3 by-joe_bungo_armArm cortex-m3 by-joe_bungo_arm
Arm cortex-m3 by-joe_bungo_armPrashant Ahire
 
ARM - Advance RISC Machine
ARM - Advance RISC MachineARM - Advance RISC Machine
ARM - Advance RISC MachineEdutechLearners
 
Arm DynamIQ: Intelligent Solutions Using Cluster Based Multiprocessing
Arm DynamIQ: Intelligent Solutions Using Cluster Based MultiprocessingArm DynamIQ: Intelligent Solutions Using Cluster Based Multiprocessing
Arm DynamIQ: Intelligent Solutions Using Cluster Based MultiprocessingArm
 
unit 1ARM INTRODUCTION.pptx
unit 1ARM INTRODUCTION.pptxunit 1ARM INTRODUCTION.pptx
unit 1ARM INTRODUCTION.pptxKandavelEee
 
Custom Computer Engine for Optimizing for the Inner kernel of Matrix Multipli...
Custom Computer Engine for Optimizing for the Inner kernel of Matrix Multipli...Custom Computer Engine for Optimizing for the Inner kernel of Matrix Multipli...
Custom Computer Engine for Optimizing for the Inner kernel of Matrix Multipli...Ardavan Pedram
 
Arm processors' architecture
Arm processors'   architectureArm processors'   architecture
Arm processors' architectureDr.YNM
 
Module 1 - ARM 32 Bit Microcontroller
Module 1 - ARM 32 Bit Microcontroller Module 1 - ARM 32 Bit Microcontroller
Module 1 - ARM 32 Bit Microcontroller Amogha Bandrikalli
 
Unit 4 _ ARM Processors .pptx
Unit 4 _ ARM Processors .pptxUnit 4 _ ARM Processors .pptx
Unit 4 _ ARM Processors .pptxVijayKumar201823
 
EC8791 ARM Processor and Peripherals.pptx
EC8791 ARM Processor and Peripherals.pptxEC8791 ARM Processor and Peripherals.pptx
EC8791 ARM Processor and Peripherals.pptxdeviifet2015
 
LECT 1: ARM PROCESSORS
LECT 1: ARM PROCESSORSLECT 1: ARM PROCESSORS
LECT 1: ARM PROCESSORSDr.YNM
 
Lect 2 ARM processor architecture
Lect 2 ARM processor architectureLect 2 ARM processor architecture
Lect 2 ARM processor architectureDr.YNM
 

Ähnlich wie 07_ARM_Cores.pdf (20)

Arm Processors Architectures
Arm Processors ArchitecturesArm Processors Architectures
Arm Processors Architectures
 
Arm cortex-m3 by-joe_bungo_arm
Arm cortex-m3 by-joe_bungo_armArm cortex-m3 by-joe_bungo_arm
Arm cortex-m3 by-joe_bungo_arm
 
ARM AAE - Architecture
ARM AAE - ArchitectureARM AAE - Architecture
ARM AAE - Architecture
 
Archi arm2
Archi arm2Archi arm2
Archi arm2
 
ARM CPU
ARM CPUARM CPU
ARM CPU
 
ARM - Advance RISC Machine
ARM - Advance RISC MachineARM - Advance RISC Machine
ARM - Advance RISC Machine
 
arm
armarm
arm
 
Arm DynamIQ: Intelligent Solutions Using Cluster Based Multiprocessing
Arm DynamIQ: Intelligent Solutions Using Cluster Based MultiprocessingArm DynamIQ: Intelligent Solutions Using Cluster Based Multiprocessing
Arm DynamIQ: Intelligent Solutions Using Cluster Based Multiprocessing
 
unit 1ARM INTRODUCTION.pptx
unit 1ARM INTRODUCTION.pptxunit 1ARM INTRODUCTION.pptx
unit 1ARM INTRODUCTION.pptx
 
Custom Computer Engine for Optimizing for the Inner kernel of Matrix Multipli...
Custom Computer Engine for Optimizing for the Inner kernel of Matrix Multipli...Custom Computer Engine for Optimizing for the Inner kernel of Matrix Multipli...
Custom Computer Engine for Optimizing for the Inner kernel of Matrix Multipli...
 
Arm processors' architecture
Arm processors'   architectureArm processors'   architecture
Arm processors' architecture
 
Ppt
PptPpt
Ppt
 
Module 1 - ARM 32 Bit Microcontroller
Module 1 - ARM 32 Bit Microcontroller Module 1 - ARM 32 Bit Microcontroller
Module 1 - ARM 32 Bit Microcontroller
 
18CS44-MODULE1-PPT.pdf
18CS44-MODULE1-PPT.pdf18CS44-MODULE1-PPT.pdf
18CS44-MODULE1-PPT.pdf
 
Ec8791 arm 9 processor
Ec8791 arm 9 processorEc8791 arm 9 processor
Ec8791 arm 9 processor
 
Arm processor
Arm processorArm processor
Arm processor
 
Unit 4 _ ARM Processors .pptx
Unit 4 _ ARM Processors .pptxUnit 4 _ ARM Processors .pptx
Unit 4 _ ARM Processors .pptx
 
EC8791 ARM Processor and Peripherals.pptx
EC8791 ARM Processor and Peripherals.pptxEC8791 ARM Processor and Peripherals.pptx
EC8791 ARM Processor and Peripherals.pptx
 
LECT 1: ARM PROCESSORS
LECT 1: ARM PROCESSORSLECT 1: ARM PROCESSORS
LECT 1: ARM PROCESSORS
 
Lect 2 ARM processor architecture
Lect 2 ARM processor architectureLect 2 ARM processor architecture
Lect 2 ARM processor architecture
 

Kürzlich hochgeladen

定制宾州州立大学毕业证(PSU毕业证) 成绩单留信学历认证原版一比一
定制宾州州立大学毕业证(PSU毕业证) 成绩单留信学历认证原版一比一定制宾州州立大学毕业证(PSU毕业证) 成绩单留信学历认证原版一比一
定制宾州州立大学毕业证(PSU毕业证) 成绩单留信学历认证原版一比一ga6c6bdl
 
Thane Escorts, (Pooja 09892124323), Thane Call Girls
Thane Escorts, (Pooja 09892124323), Thane Call GirlsThane Escorts, (Pooja 09892124323), Thane Call Girls
Thane Escorts, (Pooja 09892124323), Thane Call GirlsPooja Nehwal
 
9892124323, Call Girl in Juhu Call Girls Services (Rate ₹8.5K) 24×7 with Hote...
9892124323, Call Girl in Juhu Call Girls Services (Rate ₹8.5K) 24×7 with Hote...9892124323, Call Girl in Juhu Call Girls Services (Rate ₹8.5K) 24×7 with Hote...
9892124323, Call Girl in Juhu Call Girls Services (Rate ₹8.5K) 24×7 with Hote...Pooja Nehwal
 
Kalyan callg Girls, { 07738631006 } || Call Girl In Kalyan Women Seeking Men ...
Kalyan callg Girls, { 07738631006 } || Call Girl In Kalyan Women Seeking Men ...Kalyan callg Girls, { 07738631006 } || Call Girl In Kalyan Women Seeking Men ...
Kalyan callg Girls, { 07738631006 } || Call Girl In Kalyan Women Seeking Men ...Pooja Nehwal
 
(=Towel) Dubai Call Girls O525547819 Call Girls In Dubai (Fav0r)
(=Towel) Dubai Call Girls O525547819 Call Girls In Dubai (Fav0r)(=Towel) Dubai Call Girls O525547819 Call Girls In Dubai (Fav0r)
(=Towel) Dubai Call Girls O525547819 Call Girls In Dubai (Fav0r)kojalkojal131
 
如何办理(NUS毕业证书)新加坡国立大学毕业证成绩单留信学历认证原版一比一
如何办理(NUS毕业证书)新加坡国立大学毕业证成绩单留信学历认证原版一比一如何办理(NUS毕业证书)新加坡国立大学毕业证成绩单留信学历认证原版一比一
如何办理(NUS毕业证书)新加坡国立大学毕业证成绩单留信学历认证原版一比一ga6c6bdl
 
Slim Call Girls Service Badshah Nagar * 9548273370 Naughty Call Girls Service...
Slim Call Girls Service Badshah Nagar * 9548273370 Naughty Call Girls Service...Slim Call Girls Service Badshah Nagar * 9548273370 Naughty Call Girls Service...
Slim Call Girls Service Badshah Nagar * 9548273370 Naughty Call Girls Service...nagunakhan
 
Call Girls Dubai Slut Wife O525547819 Call Girls Dubai Gaped
Call Girls Dubai Slut Wife O525547819 Call Girls Dubai GapedCall Girls Dubai Slut Wife O525547819 Call Girls Dubai Gaped
Call Girls Dubai Slut Wife O525547819 Call Girls Dubai Gapedkojalkojal131
 
如何办理(Adelaide毕业证)阿德莱德大学毕业证成绩单Adelaide学历认证真实可查
如何办理(Adelaide毕业证)阿德莱德大学毕业证成绩单Adelaide学历认证真实可查如何办理(Adelaide毕业证)阿德莱德大学毕业证成绩单Adelaide学历认证真实可查
如何办理(Adelaide毕业证)阿德莱德大学毕业证成绩单Adelaide学历认证真实可查awo24iot
 
Call Girls Delhi {Rohini} 9711199012 high profile service
Call Girls Delhi {Rohini} 9711199012 high profile serviceCall Girls Delhi {Rohini} 9711199012 high profile service
Call Girls Delhi {Rohini} 9711199012 high profile servicerehmti665
 
Dubai Call Girls O528786472 Call Girls In Dubai Wisteria
Dubai Call Girls O528786472 Call Girls In Dubai WisteriaDubai Call Girls O528786472 Call Girls In Dubai Wisteria
Dubai Call Girls O528786472 Call Girls In Dubai WisteriaUnited Arab Emirates
 
Pallawi 9167673311 Call Girls in Thane , Independent Escort Service Thane
Pallawi 9167673311  Call Girls in Thane , Independent Escort Service ThanePallawi 9167673311  Call Girls in Thane , Independent Escort Service Thane
Pallawi 9167673311 Call Girls in Thane , Independent Escort Service ThanePooja Nehwal
 
VIP Call Girls Kavuri Hills ( Hyderabad ) Phone 8250192130 | ₹5k To 25k With ...
VIP Call Girls Kavuri Hills ( Hyderabad ) Phone 8250192130 | ₹5k To 25k With ...VIP Call Girls Kavuri Hills ( Hyderabad ) Phone 8250192130 | ₹5k To 25k With ...
VIP Call Girls Kavuri Hills ( Hyderabad ) Phone 8250192130 | ₹5k To 25k With ...Suhani Kapoor
 
《伯明翰城市大学毕业证成绩单购买》学历证书学位证书区别《复刻原版1:1伯明翰城市大学毕业证书|修改BCU成绩单PDF版》Q微信741003700《BCU学...
《伯明翰城市大学毕业证成绩单购买》学历证书学位证书区别《复刻原版1:1伯明翰城市大学毕业证书|修改BCU成绩单PDF版》Q微信741003700《BCU学...《伯明翰城市大学毕业证成绩单购买》学历证书学位证书区别《复刻原版1:1伯明翰城市大学毕业证书|修改BCU成绩单PDF版》Q微信741003700《BCU学...
《伯明翰城市大学毕业证成绩单购买》学历证书学位证书区别《复刻原版1:1伯明翰城市大学毕业证书|修改BCU成绩单PDF版》Q微信741003700《BCU学...ur8mqw8e
 
9004554577, Get Adorable Call Girls service. Book call girls & escort service...
9004554577, Get Adorable Call Girls service. Book call girls & escort service...9004554577, Get Adorable Call Girls service. Book call girls & escort service...
9004554577, Get Adorable Call Girls service. Book call girls & escort service...Pooja Nehwal
 
Lucknow 💋 Call Girls Adil Nagar | ₹,9500 Pay Cash 8923113531 Free Home Delive...
Lucknow 💋 Call Girls Adil Nagar | ₹,9500 Pay Cash 8923113531 Free Home Delive...Lucknow 💋 Call Girls Adil Nagar | ₹,9500 Pay Cash 8923113531 Free Home Delive...
Lucknow 💋 Call Girls Adil Nagar | ₹,9500 Pay Cash 8923113531 Free Home Delive...anilsa9823
 
Gaya Call Girls #9907093804 Contact Number Escorts Service Gaya
Gaya Call Girls #9907093804 Contact Number Escorts Service GayaGaya Call Girls #9907093804 Contact Number Escorts Service Gaya
Gaya Call Girls #9907093804 Contact Number Escorts Service Gayasrsj9000
 
Call Girls in Thane 9892124323, Vashi cAll girls Serivces Juhu Escorts, powai...
Call Girls in Thane 9892124323, Vashi cAll girls Serivces Juhu Escorts, powai...Call Girls in Thane 9892124323, Vashi cAll girls Serivces Juhu Escorts, powai...
Call Girls in Thane 9892124323, Vashi cAll girls Serivces Juhu Escorts, powai...Pooja Nehwal
 
Top Rated Pune Call Girls Chakan ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...
Top Rated  Pune Call Girls Chakan ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...Top Rated  Pune Call Girls Chakan ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...
Top Rated Pune Call Girls Chakan ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...Call Girls in Nagpur High Profile
 
哪里办理美国宾夕法尼亚州立大学毕业证(本硕)psu成绩单原版一模一样
哪里办理美国宾夕法尼亚州立大学毕业证(本硕)psu成绩单原版一模一样哪里办理美国宾夕法尼亚州立大学毕业证(本硕)psu成绩单原版一模一样
哪里办理美国宾夕法尼亚州立大学毕业证(本硕)psu成绩单原版一模一样qaffana
 

Kürzlich hochgeladen (20)

定制宾州州立大学毕业证(PSU毕业证) 成绩单留信学历认证原版一比一
定制宾州州立大学毕业证(PSU毕业证) 成绩单留信学历认证原版一比一定制宾州州立大学毕业证(PSU毕业证) 成绩单留信学历认证原版一比一
定制宾州州立大学毕业证(PSU毕业证) 成绩单留信学历认证原版一比一
 
Thane Escorts, (Pooja 09892124323), Thane Call Girls
Thane Escorts, (Pooja 09892124323), Thane Call GirlsThane Escorts, (Pooja 09892124323), Thane Call Girls
Thane Escorts, (Pooja 09892124323), Thane Call Girls
 
9892124323, Call Girl in Juhu Call Girls Services (Rate ₹8.5K) 24×7 with Hote...
9892124323, Call Girl in Juhu Call Girls Services (Rate ₹8.5K) 24×7 with Hote...9892124323, Call Girl in Juhu Call Girls Services (Rate ₹8.5K) 24×7 with Hote...
9892124323, Call Girl in Juhu Call Girls Services (Rate ₹8.5K) 24×7 with Hote...
 
Kalyan callg Girls, { 07738631006 } || Call Girl In Kalyan Women Seeking Men ...
Kalyan callg Girls, { 07738631006 } || Call Girl In Kalyan Women Seeking Men ...Kalyan callg Girls, { 07738631006 } || Call Girl In Kalyan Women Seeking Men ...
Kalyan callg Girls, { 07738631006 } || Call Girl In Kalyan Women Seeking Men ...
 
(=Towel) Dubai Call Girls O525547819 Call Girls In Dubai (Fav0r)
(=Towel) Dubai Call Girls O525547819 Call Girls In Dubai (Fav0r)(=Towel) Dubai Call Girls O525547819 Call Girls In Dubai (Fav0r)
(=Towel) Dubai Call Girls O525547819 Call Girls In Dubai (Fav0r)
 
如何办理(NUS毕业证书)新加坡国立大学毕业证成绩单留信学历认证原版一比一
如何办理(NUS毕业证书)新加坡国立大学毕业证成绩单留信学历认证原版一比一如何办理(NUS毕业证书)新加坡国立大学毕业证成绩单留信学历认证原版一比一
如何办理(NUS毕业证书)新加坡国立大学毕业证成绩单留信学历认证原版一比一
 
Slim Call Girls Service Badshah Nagar * 9548273370 Naughty Call Girls Service...
Slim Call Girls Service Badshah Nagar * 9548273370 Naughty Call Girls Service...Slim Call Girls Service Badshah Nagar * 9548273370 Naughty Call Girls Service...
Slim Call Girls Service Badshah Nagar * 9548273370 Naughty Call Girls Service...
 
Call Girls Dubai Slut Wife O525547819 Call Girls Dubai Gaped
Call Girls Dubai Slut Wife O525547819 Call Girls Dubai GapedCall Girls Dubai Slut Wife O525547819 Call Girls Dubai Gaped
Call Girls Dubai Slut Wife O525547819 Call Girls Dubai Gaped
 
如何办理(Adelaide毕业证)阿德莱德大学毕业证成绩单Adelaide学历认证真实可查
如何办理(Adelaide毕业证)阿德莱德大学毕业证成绩单Adelaide学历认证真实可查如何办理(Adelaide毕业证)阿德莱德大学毕业证成绩单Adelaide学历认证真实可查
如何办理(Adelaide毕业证)阿德莱德大学毕业证成绩单Adelaide学历认证真实可查
 
Call Girls Delhi {Rohini} 9711199012 high profile service
Call Girls Delhi {Rohini} 9711199012 high profile serviceCall Girls Delhi {Rohini} 9711199012 high profile service
Call Girls Delhi {Rohini} 9711199012 high profile service
 
Dubai Call Girls O528786472 Call Girls In Dubai Wisteria
Dubai Call Girls O528786472 Call Girls In Dubai WisteriaDubai Call Girls O528786472 Call Girls In Dubai Wisteria
Dubai Call Girls O528786472 Call Girls In Dubai Wisteria
 
Pallawi 9167673311 Call Girls in Thane , Independent Escort Service Thane
Pallawi 9167673311  Call Girls in Thane , Independent Escort Service ThanePallawi 9167673311  Call Girls in Thane , Independent Escort Service Thane
Pallawi 9167673311 Call Girls in Thane , Independent Escort Service Thane
 
VIP Call Girls Kavuri Hills ( Hyderabad ) Phone 8250192130 | ₹5k To 25k With ...
VIP Call Girls Kavuri Hills ( Hyderabad ) Phone 8250192130 | ₹5k To 25k With ...VIP Call Girls Kavuri Hills ( Hyderabad ) Phone 8250192130 | ₹5k To 25k With ...
VIP Call Girls Kavuri Hills ( Hyderabad ) Phone 8250192130 | ₹5k To 25k With ...
 
《伯明翰城市大学毕业证成绩单购买》学历证书学位证书区别《复刻原版1:1伯明翰城市大学毕业证书|修改BCU成绩单PDF版》Q微信741003700《BCU学...
《伯明翰城市大学毕业证成绩单购买》学历证书学位证书区别《复刻原版1:1伯明翰城市大学毕业证书|修改BCU成绩单PDF版》Q微信741003700《BCU学...《伯明翰城市大学毕业证成绩单购买》学历证书学位证书区别《复刻原版1:1伯明翰城市大学毕业证书|修改BCU成绩单PDF版》Q微信741003700《BCU学...
《伯明翰城市大学毕业证成绩单购买》学历证书学位证书区别《复刻原版1:1伯明翰城市大学毕业证书|修改BCU成绩单PDF版》Q微信741003700《BCU学...
 
9004554577, Get Adorable Call Girls service. Book call girls & escort service...
9004554577, Get Adorable Call Girls service. Book call girls & escort service...9004554577, Get Adorable Call Girls service. Book call girls & escort service...
9004554577, Get Adorable Call Girls service. Book call girls & escort service...
 
Lucknow 💋 Call Girls Adil Nagar | ₹,9500 Pay Cash 8923113531 Free Home Delive...
Lucknow 💋 Call Girls Adil Nagar | ₹,9500 Pay Cash 8923113531 Free Home Delive...Lucknow 💋 Call Girls Adil Nagar | ₹,9500 Pay Cash 8923113531 Free Home Delive...
Lucknow 💋 Call Girls Adil Nagar | ₹,9500 Pay Cash 8923113531 Free Home Delive...
 
Gaya Call Girls #9907093804 Contact Number Escorts Service Gaya
Gaya Call Girls #9907093804 Contact Number Escorts Service GayaGaya Call Girls #9907093804 Contact Number Escorts Service Gaya
Gaya Call Girls #9907093804 Contact Number Escorts Service Gaya
 
Call Girls in Thane 9892124323, Vashi cAll girls Serivces Juhu Escorts, powai...
Call Girls in Thane 9892124323, Vashi cAll girls Serivces Juhu Escorts, powai...Call Girls in Thane 9892124323, Vashi cAll girls Serivces Juhu Escorts, powai...
Call Girls in Thane 9892124323, Vashi cAll girls Serivces Juhu Escorts, powai...
 
Top Rated Pune Call Girls Chakan ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...
Top Rated  Pune Call Girls Chakan ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...Top Rated  Pune Call Girls Chakan ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...
Top Rated Pune Call Girls Chakan ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...
 
哪里办理美国宾夕法尼亚州立大学毕业证(本硕)psu成绩单原版一模一样
哪里办理美国宾夕法尼亚州立大学毕业证(本硕)psu成绩单原版一模一样哪里办理美国宾夕法尼亚州立大学毕业证(本硕)psu成绩单原版一模一样
哪里办理美国宾夕法尼亚州立大学毕业证(本硕)psu成绩单原版一模一样
 

07_ARM_Cores.pdf

  • 1. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 1 MANCHEstER 1824 The University of Manchester ARM integer cores ❏ Outline: ❍ the ARM 3-stage pipeline ❍ the ARM7TDMI core ❍ the ARM 5-stage pipeline ❍ the ARM9TDMI core ❍ the ARM10TDMI core ❍ StrongARM & XScale ❍ the ARM11 core ❍ Cortex ☞ hands-on: system software – interrupts
  • 2. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 2 MANCHEstER 1824 The University of Manchester ARM integer cores ❏ Outline: ➜ the ARM 3-stage pipeline ❍ the ARM7TDMI core ❍ the ARM 5-stage pipeline ❍ the ARM9TDMI core ❍ the ARM10TDMI core ❍ StrongARM & XScale ❍ the ARM11 core ❍ Cortex ☞ hands-on: system software – interrupts
  • 3. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 3 MANCHEstER 1824 The University of Manchester The 3-stage ARM pipeline (The original architecture which has affected on the instruction set.) ❏ fetch ❍ the instruction is fetched from memory ❏ decode ❍ the instruction is decoded and the datapath control signals prepared for the next cycle ❏ execute ❍ the operands are read from the register bank, shifted, combined in the ALU and the result written back
  • 4. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 4 MANCHEstER 1824 The University of Manchester The 3-stage ARM pipeline ❏ Single cycle instructions ❍ complete at a rate of one per clock cycle ❍ intended to use (a single) memory efficiently fetch decode execute fetch decode execute fetch decode execute 1 2 3 instruction time
  • 5. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 5 MANCHEstER 1824 The University of Manchester The 3-stage ARM pipeline ❏ More complex instructions: ❍ ‘STR’ causes a stall while transfer occurs fetch ADD decode execute 1 2 3 instruction time 4 5 fetch STR decode calc. addr. fetch ADD decode execute fetch ADD decode execute fetch ADD decode execute data xfer
  • 6. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 6 MANCHEstER 1824 The University of Manchester The 3-stage ARM pipeline ❏ PC behaviour ❍ r15 increments twice before an instruction executes – due to pipeline operation ❍ therefore r15 = address of instruction + 8 – (+12 if used after first cycle, though this is architecturally undefined) – in Thumb code the offset is +4 ❍ normally the assembler makes the necessary adjustments, e.g. in branches This behaviour is consistent for all ARMs, although the pipeline structures may vary.
  • 7. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 7 MANCHEstER 1824 The University of Manchester 3-stage ARM organization ❏ ARM components: ❍ register bank – 2 read ports, 1 write port – plus additional read and write ports for r15 ❍ barrel shifter ❍ ALU ❍ address register and incrementer ❍ memory data registers ❍ instruction decoder and control
  • 8. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 8 MANCHEstER 1824 The University of Manchester 3-stage ARM organization ❍ Separate address incrementer ❍ Two register read ports ❍ Barrel shifter in series with ALU data out register data in register address register incrementer register bank barrel shifter ALU multiply register instruction control and decode instruction control register A[31:0] D[31:0] ALU bus A bus B bus PC PC
  • 9. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 9 MANCHEstER 1824 The University of Manchester Why does this matter? The internal structure of the processor can result in pipeline stalls (⇒ loss of performance) in some circumstances. ❏ Instruction dependencies ❍ pipeline needs flushing and refilling when a branch is taken ❍ alleviated by branch prediction ❏ Data dependencies ❍ instruction may have to wait for the result from a previous one ❍ alleviated by forwarding ❍ particular problem with loads (memory is long latency) – code reordering can help
  • 10. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 10 MANCHEstER 1824 The University of Manchester ARM integer cores ❏ Outline: ❍ the ARM 3-stage pipeline ➜ the ARM7TDMI core ❍ the ARM 5-stage pipeline ❍ the ARM9TDMI core ❍ the ARM10TDMI core ❍ StrongARM & XScale ❍ the ARM11 core ❍ Cortex ☞ hands-on: system software – interrupts
  • 11. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 11 MANCHEstER 1824 The University of Manchester The ARM7TDMI ❏ The ARM7TDMI is … ❍ an ARM7 3-stage pipeline core, with ❍ T - support for the Thumb instruction set ❍ D - support for debug – the processor can stop on a debug event ❍ M - support for long multiplies ❍ I - the EmbeddedICE macrocell – provides breakpoint and watchpoint hardware • described later
  • 12. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 12 MANCHEstER 1824 The University of Manchester ARM7TDMI organization ❍ Scan chains provide access to signals for debugging EmbeddedICE JTAG TAP controller bus splitter processor core extern0 extern1 Din[31:0] Dout[31:0] D[31:0] A[31:0] control other signals JTAG scan chain 0 scan chain 2 scan chain 1
  • 13. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 13 MANCHEstER 1824 The University of Manchester The ARM7TDMI core interface signals mclk wait eclk bigend irq fiq isync reset enin enout enouti abe ale ape dbe tbe busen highz busdis ecapclk dbgrq breakpt dbgack exec extern1 extern0 dbgen rangeout0 rangeout1 dbgrqi commrx commtx opc cpi cpa cpb A[31:0] Din[31:0] Dout[31:0] D[31:0] bl[3:0] r/w mas[1:0] mreq seq lock trans mode[4:0] abort Tbit tapsm[3:0] ir[3:0] tdoen tck1 tck0 screg[3:0] drivebs ecapclkbs icapclkbs highz pclkbs rstclkbs sdinbs sdoutbs shclkbs shclk2bs TRST TCK TMS TDI TDO Vdd Vss ARM7TDMI core boundary scan extension memory interface state TAP information JTAG controls MMU interface clock control configuration interrupts initialization bus control debug coprocessor interface power
  • 14. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 14 MANCHEstER 1824 The University of Manchester ARM7TDMI core interface signals ❏ Memory interface ❍ ~mreq - memory request ❍ seq - sequential address – together these signals indicate the sort of bus cycle which will happen next – they are pipelined ahead to aid memory design mreq seq Cycle Use 0 0 N Non-sequential memory access 0 1 S Sequential memory access 1 0 I Internal cycle bus and memory inactive 1 1 C Coprocessor register transfer memory inactive
  • 15. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 15 MANCHEstER 1824 The University of Manchester ARM7TDMI core interface signals ❏ Notes: ❍ request for memory is in clock cycle before transfer occurs ❍ wait state insertion is possible mclk Din[31:0] Dout[31:0] mreq, seq abort A[31:0] & control
  • 16. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 16 MANCHEstER 1824 The University of Manchester ARM7TDMI ❏ ARM7TDMI debug support ❍ the EmbeddedICE module ❍ supports breakpoints and watchpoints – controlled via the JTAG test access port – EmbeddedICE & JTAG are covered later ❏ ARM7TDMI characteristics: Process 0.35 µm Transistors 74,209 MIPS 60 Metal layers 3 Core area 2.1 mm2 Power 87 mW Vdd 3.3V Clock 0 to 66 MHz MIPS/W 690 A ‘modern’ ARM7 (0.18µm) is < mm2 1 2 -- -
  • 17. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 17 MANCHEstER 1824 The University of Manchester ARM integer cores ❏ Outline: ❍ the ARM 3-stage pipeline ❍ the ARM7TDMI core ➜ the ARM 5-stage pipeline ❍ the ARM9TDMI core ❍ the ARM10TDMI core ❍ StrongARM & XScale ❍ the ARM11 core ❍ Cortex ☞ hands-on: system software – interrupts
  • 18. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 18 MANCHEstER 1824 The University of Manchester Getting higher performance ❏ Increase the clock rate ❍ the clock rate is limited by the slowest pipeline stage – decrease the logic complexity per stage – increase the pipeline depth (number of stages) ❏ improve the CPI (clocks per instruction) ❍ fewer wasted cycles – better memory bandwidth
  • 19. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 19 MANCHEstER 1824 The University of Manchester The 5-stage ARM pipeline ❏ Fetch ❏ Decode ❍ instruction decode and register read ❏ Execute ❍ shift and ALU ❏ Memory ❍ data memory access ❏ Write-back
  • 20. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 20 MANCHEstER 1824 The University of Manchester The 5-stage ARM pipeline ❏ Reducing the CPI ❍ ARM7 uses the memory on nearly every clock cycle – for either instruction fetch or data transfer ❍ therefore a reduced CPI requires more than one memory access per clock cycle ❏ Possible solutions are: ❍ separate instruction and data memories ❍ double-bandwidth memory (e.g. ARM8)
  • 21. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 21 MANCHEstER 1824 The University of Manchester ARM integer cores ❏ Outline: ❍ the ARM 3-stage pipeline ❍ the ARM7TDMI core ❍ the ARM 5-stage pipeline ➜ the ARM9TDMI core ❍ the ARM10TDMI core ❍ StrongARM & XScale ❍ the ARM11 core ❍ Cortex ☞ hands-on: system software – interrupts
  • 22. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 22 MANCHEstER 1824 The University of Manchester ARM9TDMI ❏ The ARM9TDMI is … ❍ a ‘classic’ Harvard architecture 5-stage pipeline – separate instruction and data memory ports ❍ with full support for Thumb and EmbeddedICE debug ❍ aimed at significantly higher performance than the ARM7TDMI – enhanced pipeline (then) operated at 100-200 MHz – (now) up to 250MHz
  • 23. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 23 MANCHEstER 1824 The University of Manchester ARM9TDMI pipeline ❍ Thumb instructions are decoded directly instruction fetch data mem. access ARM decode read reg. write reg. shift/ALU shift/ALU read write reg. Thumb decompress instruction fetch decode Decode Execute Fetch Decode Execute Fetch Memory Write ARM7TDMI: ARM9TDMI:
  • 24. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 24 MANCHEstER 1824 The University of Manchester ARM9TDMI pipeline ❍ Pipeline introduces more dependencies ❍ alleviated with forwarding paths register write + 4 shifter ALU mul register read mux byte repl. D-cache rot/sgnex I decode I cache + 4 next PC PC+4 PC+8 (r15) LDM/STM B, BL, MOV pc, SUB pc LDR pc reg shift address post- index pre- index FETCH DECODE EXECUTE BUFFER/DATA WRITE immediate forwarding
  • 25. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 25 MANCHEstER 1824 The University of Manchester ARM9TDMI ❏ EmbeddedICE ❍ as ARM7TDMI, plus: – hardware single-stepping – breakpoints on exceptions ❏ On-chip coprocessor support ❍ for floating-point, DSP, and so on Process 0.25 µm Transistors 111,000 MIPS 220 Metal layers 3 Core area 2.1 mm2 Power 150 mW Vdd 2.5 V Clock 0-200 MHz MIPS/W 1,500
  • 26. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 26 MANCHEstER 1824 The University of Manchester ARM integer cores ❏ Outline: ❍ the ARM 3-stage pipeline ❍ the ARM7TDMI core ❍ the ARM 5-stage pipeline ❍ the ARM9TDMI core ➜ the ARM10TDMI core ❍ StrongARM & XScale ❍ the ARM11 core ❍ Cortex ☞ hands-on: system software – interrupts
  • 27. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 27 MANCHEstER 1824 The University of Manchester ARM10TDMI ❏ The ARM10TDMI is … ❍ aimed at significantly higher performance than the ARM9TDMI ❍ achieved through use of: – higher clock rate – 64-bit I- and D-memory buses – branch prediction – hit-under-miss D-memory interface
  • 28. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 28 MANCHEstER 1824 The University of Manchester ARM10TDMI pipeline ❏ 6-stage pipeline ❍ Additional time allowed for – I- and D-memory accesses – instruction decode multiply Write instruction fetch decode decode read multiplier partial add write reg. shift/ALU Memory Execute Decode Issue Fetch data mem. access write data addr. calc.
  • 29. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 29 MANCHEstER 1824 The University of Manchester ARM integer cores ❏ Outline: ❍ the ARM 3-stage pipeline ❍ the ARM7TDMI core ❍ the ARM 5-stage pipeline ❍ the ARM9TDMI core ❍ the ARM10TDMI core ➜ StrongARM & XScale ❍ the ARM11 core ❍ Cortex ☞ hands-on: system software – interrupts
  • 30. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 30 MANCHEstER 1824 The University of Manchester StrongARM & XScale ❏ Most ARM implementations are designed by ARM Ltd. ❏ Two notable exceptions: ❍ StrongARM – designed by Digital, later bought by Intel – now largely obsolete ❍ XScale – designed by Intel – current – up to 800 MHz Both of these were designed primarily for high performance Used for proprietary devices
  • 31. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 31 MANCHEstER 1824 The University of Manchester XScale™ pipeline ❍ another deep pipeline ❍ (unpredicted) branches are expensive ❍ data (memory) dependencies cause stalls BTB fetch data mem. access early termination multiply accumulate fetch2 read/ shift decode ALU exec. state exec. ALU writeback data writeback MAC writeback
  • 32. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 32 MANCHEstER 1824 The University of Manchester ARM integer cores ❏ Outline: ❍ the ARM 3-stage pipeline ❍ the ARM7TDMI core ❍ the ARM 5-stage pipeline ❍ the ARM9TDMI core ❍ the ARM10TDMI core ❍ StrongARM & XScale ➜ the ARM11 core ❍ Cortex ☞ hands-on: system software – interrupts
  • 33. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 33 MANCHEstER 1824 The University of Manchester ARM11 ❏ Most recent available ARM core (from ARM Ltd.) ❏ process portable ❏ high speed ❍ faster clock (~550MHz) ❍ deeper pipeline – also to accommodate DSP operations ❍ improved branch prediction
  • 34. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 34 MANCHEstER 1824 The University of Manchester ARM11 pipeline ❏ 8-stage pipeline ❍ Parallel ALU and data access (especially during LDM/STM) ❍ ‘Hit under Miss’ in ‘data1’ alleviates cache miss stalls instruction fetch Fetch 1 data mem. access branch prediction Fetch 2 instruction decode Decode register read Issue shift shift/addr./ ALU saturation register write Writeback multiply accumulate address generation register write MAC1 ALU/data1/ MAC2 Sat./data2/ MAC3
  • 35. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 35 MANCHEstER 1824 The University of Manchester ARM 11 branch prediction ❍ Two-level dynamic branch prediction – constant offset branches – 128 entry, direct mapped (addr[9:3]) – cost: 1 or 0 cycles if successful (taken/not taken) ❍ Static branch prediction – constant offset branches … – … that miss the dynamic predictor – cost: 4 cycles if successful ❍ Return stack – three entries – Pushes on BL or BLX (inc. BLX Rn) – Pops on: {BX lr; MOV pc, lr; LDR pc, [sp], #n; LDMIA sp!, {…,pc}} – cost: 4 cycles if successful
  • 36. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 36 MANCHEstER 1824 The University of Manchester ARM 11 dependencies ❍ Data operations forward results ❍ Load operations impose an extra two cycle penalty (cache hit) ❍ Memory operations require their address registers early Thus: ADD r2, r1, r0 ; produce R2 ADD r4, r3, r2 ; consume R2 - no stall LDR r2, [r1] ; load r2 ADD r4, r3, r2 ; lose 2 cycles waiting ADD r2, r1, r0 ; produce R2 LDR r4, [r2] ; lose one cycle LDR r2, [r1] ; LDR r4, [r2] ; lose 3 cycles in total – a few other dependencies may be seen from the pipeline figure
  • 37. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 37 MANCHEstER 1824 The University of Manchester ARM integer cores ❏ Outline: ❍ the ARM 3-stage pipeline ❍ the ARM7TDMI core ❍ the ARM 5-stage pipeline ❍ the ARM9TDMI core ❍ the ARM10TDMI core ❍ StrongARM & XScale ❍ the ARM11 core ➜ Cortex ☞ hands-on: system software – interrupts
  • 38. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 38 MANCHEstER 1824 The University of Manchester Cortex™ ❏ Three ‘flavours’: ❍ Cortex-A Series – applications processors – ARM, Thumb and Thumb-2 instruction sets ❍ Cortex-R Series – embedded processors for real-time systems. – ARM, Thumb, and Thumb-2 instruction sets ❍ Cortex-M Series – deeply embedded/cost sensitive processors – Thumb-2 instruction set only
  • 39. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 39 MANCHEstER 1824 The University of Manchester Cortex-M3 ❏ First of the line ❍ Thumb-2 only ❍ new design – 3-stage pipeline – Harvard core ❍ small core – ~ mm2 (excluding cache) ❍ performance (0.18µm): – ~100 MHz – ~8000 MIPS/W 1 2 -- -
  • 40. © 2005 PEVEIT Unit – ARM System Design ARM integer cores – v5 – 40 MANCHEstER 1824 The University of Manchester Hands-on: system software – interrupts ❏ Look further into ARM system software issues ❍ Write an interrupt handler ❍ Use it to trigger a context switch ☞ Follow the ‘Hands-on’ instructions