1. www.thalesgroup.com
Heterogeneous Manycore with Self Adaptive Capabilities
and the Corresponding Industrial Needs
RAW 2012
Fabrice Lemonnier, 22nd May, 2012
Research & Technology
2. 2 / Manycore: main issue for industry
Programmability:
Time to market
Development cost
Reuse of legacy software
Why take so many risks with manycore?
Most industrial users want to continue as in the past few years: compile without thinking (as much as possible)!
No more free lunch! In the near future, processors will all be multicore or manycore.
Nevertheless, can we provide solutions to ease programming?
The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
3. 3 / Programmability: Homogeneous manycores
Tile-Gx100 from Tilera: 100 cores
Programmability:
•SMP Linux
•Bare Metal Environment
•Standard C/C++ languages
•Standard Debugging Tools (gdb 7)
•Multicore Development Environment™ (MDE)
4. 4 / Programmability: Homogeneous manycores
Fermi from Nvidia: 512 cores organised in 16 Streaming Multiprocessors
Programmability:
CUDA parallel programming model: multi-threading
Programming languages: C/C++, OpenCL, …
5. 5 / Programmability: Homogeneous manycores
MPPA from Kalray: 256 cores organised in 16 clusters
Programmability:
•Specific data flow language: sigmaC
•Tools to automatically map the application
6. 6 / Homogeneous manycores
Parallelisation is the only way to raise computing power at low power consumption.
Homogeneity eases the programming aspects.
Maximum performance is reached only for static applications.
Moreover, tools can be used to perform automatic optimisation through data parallelism and to generate static allocation and scheduling.
7. 7 / But parallelisation is not enough
•Customisation
Customisation is necessary to raise the efficiency for a targeted application domain
Australian Desert Animal: the Thorny Devil
8. 8 / MPSoC
Heterogeneity for the best efficiency (computing power to power consumption ratio) but for a dedicated domain
OMAP: Communication market
9. 9 / Heterogeneous manycores
Heterogeneous manycore P2012 from ST
[Figure: P2012 block diagram. Labels: Fabric, Cluster (×9), core, Fabric Controller.]
10. 10 / Heterogeneous manycores
Dedicated to a specific domain of application
Only affordable for large series of products.
Industries with small and medium series of products have no way to develop their own heterogeneous manycore
An alternative is to use a combination between multicore and …
11. 11 / FPGA + multicore
ZYNQ: Xilinx FPGA with a dual core ARM A9 MPCore
12. 12 / …or the inverse: GPP + FPGA
Intel® Atom™ Processor E6x5C Series
GPP + dedicated accelerators on FPGA on a Multi-Chip Package (MCP)
13. 13 / Our proposition
A combination between the heterogeneous manycore solution like P2012 and the FPGA+multicore approach like ZYNQ
[Figure: block diagram. Labels: Fabric, Cluster (×9), core, Fabric Controller, ZYNQ.]
14. 14 / Our proposition
A 3D stacked chip based on:
•A manycore layer
•An FPGA layer
15. 15 / Most important Advantages
Increase accessibility to heterogeneous manycore technology by allowing customisation by the user
Reduction of the impact of the NRC
Allow implementation of the self adaptive capabilities necessary for future interactive applications and for the constraints of current and future technologies
16. 16 / Future applications issues
Embedded Real-Time Applications
Cognitive radio
Smart camera
UAV
low volume
low power consumption
Adapt to environment dynamicity, flexibility & dependability
17. 17 / Self adaptive capabilities, why?
•To be able to dynamically adapt the architecture to the current request of the application for the same power consumption
•Evolution of the technology: reduction of the reliability and the yield of current and future sub-micron technologies -> adaptation depending on the faulty cores.
•Increase energy efficiency
•Increase the programming efficiency by handling part of the mapping complexity at runtime
•Temperature management -> adaptation of the application mapping
18. 18 / State of the art
Projects:
•Morpheus (FP6 project): heterogeneous chip with 3 FPGA technologies managed by an ARM processor.
•FOSFOR (ANR project): distributed OS for heterogeneous multicore on FPGA
•Main drawbacks:
  •the limitation of the size of the FPGA area
  •the scalability of the solution
19. 19 /
20. 20 / Holistic Approach
Model of Computation
Optimisation tools
Model of Execution
Model of programming
Common Interfaces
Strategies of relocation
Flexible Hardware
21. 21 / Holistic Approach
Model of Computation
Optimisation tools
Model of Execution
Model of programming
Common Interfaces
Strategies of relocation
Flexible Hardware
22. 22 / Programming efficiency: common execution model
Master-slave execution model
Master Nodes: GPP
Slave Nodes: DSP nodes, eFPGA nodes
[Figure: a GPP node and an accelerator node, each attached to the NoC through a network interface (NI). Labels: Accelerator Interface (AI), DMA, data, DMA requests, acc requests, control / status.]
23. 23 / Master-slave execution model
Ensure hardware and software independency with the accelerator specificities
[Figure: sequence between the GPP (master), the DMA and the accelerator (slave) through requests FIFOs. Operations shown: send_data, receive_data, data_transfer, send_sync2acc, send_sync2gpp, send_sync2dmu, wait_sync, work, synchro.]
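The handshake named on this slide can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the class `RequestFifos`, the helper `run_on_accelerator` and the exact ordering of calls are hypothetical, the DMA is not modelled, and the slide does not define the semantics of `send_sync2acc`, `send_sync2gpp` or `wait_sync` beyond their names.

```python
import queue
import threading


class RequestFifos:
    """Hypothetical stand-in for the requests FIFOs between master and slave."""

    def __init__(self):
        self.data_to_acc = queue.Queue()   # data travelling master -> slave
        self.data_to_gpp = queue.Queue()   # data travelling slave -> master
        self.sync_to_acc = queue.Queue()   # synchronisation tokens for the slave
        self.sync_to_gpp = queue.Queue()   # synchronisation tokens for the master


def send_data(fifo, payload):
    fifo.put(payload)


def receive_data(fifo):
    return fifo.get()


def send_sync(fifo):
    fifo.put(None)   # the token carries no payload, only its arrival matters


def wait_sync(fifo):
    fifo.get()       # blocks until a token arrives


def accelerator(fifos, work):
    """Slave side: wait for the master, do the work, then signal completion."""
    wait_sync(fifos.sync_to_acc)
    block = receive_data(fifos.data_to_acc)
    send_data(fifos.data_to_gpp, work(block))
    send_sync(fifos.sync_to_gpp)          # assumed role of send_sync2gpp


def run_on_accelerator(block, work):
    """Master side: hand a block to the slave and collect the result."""
    fifos = RequestFifos()
    slave = threading.Thread(target=accelerator, args=(fifos, work))
    slave.start()
    send_data(fifos.data_to_acc, block)
    send_sync(fifos.sync_to_acc)          # assumed role of send_sync2acc
    wait_sync(fifos.sync_to_gpp)
    result = receive_data(fifos.data_to_gpp)
    slave.join()
    return result


def double_all(samples):
    return [2 * s for s in samples]


doubled = run_on_accelerator([1, 2, 3], double_all)
```

The master code only sees the FIFO operations, so `double_all` could be swapped for any other slave implementation without changing the master, which is the independence the slide asks for.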
24. 24 / Holistic Approach
Model of Computation
Optimisation tools
Model of Execution
Model of programming
Common Interfaces
Strategies of relocation
Flexible Hardware
25. 25 / Tool flow and MoC
•Optimisation and parallelisation tools can only be used on static applications.
•Necessity to identify static clusters inside the applications based on SDF/CSDF (synchronous and cyclo-static data flow) MoC
Actor: consumes and produces tokens of data with predefined and static rules
[Figure: SDF, CSDF MoC graph of actors. Legend: Act = actor; static cluster; cluster input/output; cluster group input/output; clusters group managed by one state management.]
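Static token rates are what allow a tool to schedule a cluster at compile time. In an SDF graph every edge must satisfy the balance equation firings(src) × produced = firings(dst) × consumed, and the smallest integer solution gives how often each actor fires per iteration. The sketch below solves these equations for a small graph; the function name `repetition_vector`, the edge format and the example rates are assumptions made for illustration, and the graph is assumed to be connected.

```python
from fractions import Fraction
from math import gcd


def repetition_vector(actors, edges):
    """Return the smallest integer firing counts that balance every edge.

    edges is a list of (src, produced, dst, consumed) tuples.
    The graph is assumed to be connected.
    """
    rate = {actors[0]: Fraction(1)}
    changed = True
    while changed:                       # propagate rates along the edges
        changed = False
        for src, produced, dst, consumed in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * produced / consumed
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * consumed / produced
                changed = True
    for src, produced, dst, consumed in edges:
        if rate[src] * produced != rate[dst] * consumed:
            raise ValueError("inconsistent rates: no static schedule exists")
    scale = 1
    for r in rate.values():              # least common multiple of denominators
        scale = scale * r.denominator // gcd(scale, r.denominator)
    counts = {a: int(r * scale) for a, r in rate.items()}
    common = 0
    for c in counts.values():
        common = gcd(common, c)
    return {a: c // common for a, c in counts.items()}


# Hypothetical chain: A produces 2 tokens per firing, B consumes 3 and
# produces 1, C consumes 2.
firings = repetition_vector(
    ["A", "B", "C"],
    [("A", 2, "B", 3), ("B", 1, "C", 2)],
)
```

Here `firings` says A must fire three times and B twice for every firing of C, after which all buffers return to their initial fill level.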
26. 26 / Tool flow and MoC
The Tool flow is based on 2 main tools:
•Thales tool: SpearDE
•ACE tool: Cosy
[Figure: tool flow. Boxes: Application (C code); C to SpearDE input representation conversion (Cosy); graphic architecture representation (manual); data parallelisation and mapping (SpearDE); streaming optimisation (Cosy); compilation (Cosy); library of IPs; executable code for slave cores and master cores.]
27. 27 / Tools: partitioning, parallelisation and mapping
[Figure: static cluster cluster1 (actors A1 to A5) is divided into partition1, partition2 and partition3; in cluster1p1, A1 and A2 are parallelised into A1.1 to A1.4 and A2.1 to A2.4 next to A3, A4 and A5; the target lists shown are DSP/FPGA, DSP/FPGA and DSP/GPP. Legend: Ax = actor number x; static cluster; cluster input/output; partition; partition input/output.]
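Splitting an actor such as A1 into A1.1 to A1.4 is a data-parallel transformation: the input tokens are scattered over several copies of the actor and the results gathered back in order. The following sketch shows one way to do it, assuming a round-robin distribution and a stateless actor; the helper names `scatter`, `gather` and `parallelise` are illustrative and do not come from SpearDE or Cosy.

```python
def scatter(tokens, ways):
    """Distribute tokens round-robin over `ways` copies of an actor."""
    return [tokens[i::ways] for i in range(ways)]


def gather(parts):
    """Interleave the partial results back into the original token order."""
    merged = []
    longest = max(len(part) for part in parts)
    for i in range(longest):
        for part in parts:
            if i < len(part):
                merged.append(part[i])
    return merged


def parallelise(actor, tokens, ways=4):
    """Run a stateless actor as `ways` independent copies (ways >= 1)."""
    partial = [[actor(t) for t in part] for part in scatter(tokens, ways)]
    return gather(partial)


def square(x):
    return x * x


sequential = [square(t) for t in range(10)]
parallel = parallelise(square, list(range(10)), ways=4)
```

Because the result equals the sequential one, each of the four copies can be mapped to a different node without changing the behaviour of the cluster.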
28. 28 / Holistic Approach
Model of Computation
Optimisation tools
Model of Execution
Model of programming
Common Interfaces
Strategies of relocation
Flexible Hardware
29. 29 / Modularity and scalability: common interfaces
Homogeneous GPP nodes
Heterogeneous accelerators nodes
Generic Interfaces
[Figure: GPP nodes, a DSP node and dedicated accelerator nodes attached to the NoC through network interfaces (NI), with accelerator interfaces (AI) on the accelerator side; an eFPGA domain (reconfigurable HW acc.); DDR Ctrl., I/O and Config. Ctrl. each attached through an NI.]
30. 30 / Holistic Approach
Model of Computation
Optimisation tools
Model of Execution
Model of programming
Common Interfaces
Strategies of relocation
Flexible Hardware
31. 31 / Dynamicity: the cluster group
[Figure: a cluster group whose states management receives an event; states shown: state 1, state 2, state 3, each containing actors grouped in static clusters. Legend: Act = actor; static cluster; cluster input/output; cluster group input/output; clusters group managed by one state management.]
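A cluster group can be read as a small state machine: an event reaches the states management, which selects the state and therefore the static clusters that run. The sketch below illustrates that reading. The class `ClusterGroup`, the event names and the cluster names are hypothetical; the slide only shows states 1 to 3 and an incoming event.

```python
class ClusterGroup:
    """Hypothetical states management for one cluster group."""

    def __init__(self, clusters_by_state, transitions, initial_state):
        self.clusters_by_state = clusters_by_state   # state -> active static clusters
        self.transitions = transitions               # (state, event) -> next state
        self.state = initial_state

    def active_clusters(self):
        return self.clusters_by_state[self.state]

    def on_event(self, event):
        # Events with no transition from the current state are ignored.
        self.state = self.transitions.get((self.state, event), self.state)
        return self.active_clusters()


group = ClusterGroup(
    clusters_by_state={
        "state 1": ("detect",),
        "state 2": ("detect", "track"),
        "state 3": (),
    },
    transitions={
        ("state 1", "target_found"): "state 2",
        ("state 2", "target_lost"): "state 1",
        ("state 2", "shutdown"): "state 3",
    },
    initial_state="state 1",
)
after_found = group.on_event("target_found")
after_unknown = group.on_event("unrelated_event")
```

Inside each state the clusters stay static, so the compile-time tools still apply; only the switch between states is decided at runtime.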
32. 32 / Dynamicity at cluster group level
[Figure: five cluster groups (1 to 5), each with its own states management receiving an event; sensor data inputs; labels shown include state 1, nop, state 2, state 1.1, state 1.2, scatter and gather. Legend: Act = actor; static cluster; cluster input/output; cluster group input/output; clusters group managed by one state management.]
33. 33 / Start a new part of the application
[Figure: the same five cluster groups (1 to 5) with their states management and events; sensor data inputs; labels shown include state 1, state 2, state 1.1, state 1.2, scatter and gather. Legend: Act = actor; static cluster; cluster input/output; cluster group input/output; clusters group managed by one state management.]
34. 34 / Modification of the behaviour
[Figure: the same five cluster groups (1 to 5) with their states management and events; sensor data inputs; labels shown include state 1, state 2, state 1.1, state 1.2, scatter and gather. Legend: Act = actor; static cluster; cluster input/output; cluster group input/output; clusters group managed by one state management.]
35. 35 / Modification of the parallelisation level
[Figure: the same five cluster groups (1 to 5) with their states management and events; sensor data inputs; labels shown include state 1, state 2, scatter and gather; states 1.1 and 1.2 no longer appear. Legend: Act = actor; static cluster; cluster input/output; cluster group input/output; clusters group managed by one state management.]
36. 36 / Dynamicity at cluster level
[Figure: three mappings of cluster1p1 (A1.1 to A1.4, A2.1 to A2.4, A3, A4, A5) along a time axis, each marked with a relocation; the target lists shown are FPGA/FPGA/GPP, DSP/DSP/GPP and DSP/DSP/DSP.]
37. 37 / Dynamic relocation
[Figure: at compile time, the application goes through the tools for parallelisation and mapping, producing thread1 to thread4 and accelerators Acc1, Acc3 and Acc4 behind an API; at runtime, threads run on GPPs connected by the NoC together with I/O, Acc1, Acc1, Acc3, Acc4 and the DDR ctrl, and dynamic relocation moves them.]
38. 38 / Holistic Approach
Model of Computation
Optimisation tools
Model of Execution
Model of programming
Common Interfaces
Strategies of relocation
Flexible Hardware
39. 39 / A Virtualisation Layer for self adaptive capabilities
Virtualisation services provide a high level of abstraction of the heterogeneous resources: communication and accelerators management
Self adaptive services define actions to be taken depending on events (monitoring): relocation, DVFS,…
[Figure: the application and an allocation file sit on top of the virtualisation layer (self adaptive services and virtualisation services); a MONITORING, DIAGNOSIS, ACTION loop acts on the SYSTEM, labelled O = F(L); the kernel provides task, memory, network, communication, cluster, semaphore and event management, a scheduler, monitoring and actuators.]
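The MONITORING, DIAGNOSIS, ACTION loop can be sketched for one of the cases listed earlier in the deck, temperature management leading to a new application mapping. Everything in the sketch is an assumption made for illustration: the function names, the temperature limit, the tile and thread names, and the policy of moving threads to the coolest tile. It does not model DVFS or the O = F(L) relation shown on the slide.

```python
def monitor(sensors):
    """MONITORING: snapshot the current temperature of every tile."""
    return dict(sensors)


def diagnose(temperatures, limit):
    """DIAGNOSIS: list the tiles whose temperature exceeds the limit."""
    return sorted(tile for tile, temp in temperatures.items() if temp > limit)


def act(mapping, temperatures, hot_tiles):
    """ACTION: relocate threads from hot tiles to the coolest remaining tile."""
    cool_tiles = [tile for tile in temperatures if tile not in hot_tiles]
    new_mapping = dict(mapping)
    if not cool_tiles:
        return new_mapping               # nowhere to relocate, keep the mapping
    coolest = min(cool_tiles, key=lambda tile: temperatures[tile])
    for thread, tile in mapping.items():
        if tile in hot_tiles:
            new_mapping[thread] = coolest
    return new_mapping


# Hypothetical readings in degrees Celsius and a hypothetical initial mapping.
sensors = {"tile0": 95.0, "tile1": 60.0, "tile2": 70.0}
mapping = {"thread1": "tile0", "thread2": "tile2"}

temperatures = monitor(sensors)
hot = diagnose(temperatures, limit=85.0)
relocated = act(mapping, temperatures, hot)
```

The application never sees the move: it keeps addressing `thread1`, and only the mapping held by the virtualisation layer changes.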
40. 40 / Self-adaptation
[Figure: the architecture of GPP nodes, a DSP node, dedicated accelerator nodes (AI, NI), the NoC, DDR Ctrl., Config. Ctrl., I/O and the eFPGA domain (reconfigurable HW acc.), overlaid with the MONITORING, DIAGNOSIS, ACTION loop on the SYSTEM, labelled O = F(L). Labels: Accelerator/Virtual Code Mapping; Dynamic allocation / binding.]
41. 41 / Holistic Approach
Model of Computation
Optimisation tools
Model of Execution
Model of programming
Common Interfaces
Strategies of relocation
Flexible Hardware
42. 42 / New dynamic reconfigurable technology
FlexTiles: a 3D stack chip
[Figure: a homogeneous manycore made of tiles connected by a NoC, with a 3D stacked reconfigurable layer.]
43. 43 / New dynamic reconfigurable technology
FlexTiles: a 3D stack chip
Map Accelerated functions
[Figure: a homogeneous manycore made of tiles connected by a NoC, with a 3D stacked reconfigurable layer.]