1. FlexTiles
Self-adaptive heterogeneous many-core based on Flexible Tiles
Dr. Gabriel Marchesan Almeida
gabriel.almeida@kit.edu
Institute for Information Processing Technology (ITIV)
Prof. Dr.-Ing. K. D. Müller-Glaser · Prof. Dr.-Ing. J. Becker · Prof. Dr. rer. nat. W. Stork
KIT – Universität des Landes Baden-Württemberg und
nationales Forschungszentrum in der Helmholtz-Gemeinschaft www.itiv.kit.edu
2. Motivation
architectures are designed in very customized ways to deal with a specific
problem:
well defined set of applications;
pre-defined budgets (power/energy + area + time-to-market);
several requirements must be met upon application execution:
power/energy consumption;
performance (application throughput / deadlines);
complexity of applications is increasing;
parallelization is the solution;
Source: http://baldmike2004.xanga.com/
Self-adaptive heterogeneous manycore based on Flexible Tiles Institute for Information Processing Technology (ITIV)
2
Gabriel Marchesan Almeida 11.10.2012
3. Motivation
issues for industry:
let’s be as conservative as possible and
keep everything under control!
why take so many risks with many-core
architectures?
Source: http://shop.cafepress.com/old-school-conservative
applications often exhibit time-changing
workloads (mapping decisions become sub-optimal);
[Images: Cognitive Radio, Smart Camera]
Proposition:
novel many-core architecture based on reconfigurable devices (FPGAs),
DSPs and GPPs with a clever virtualization layer;
4. Who are we?
Source: http://thefreeman.net
5. Project Consortium
Partners:
Budget: 3.67M €
Period: 15.10.2011 – 14.10.2014
Duration: 36 months
Coordinator: Fabrice Lemonnier
Project Goals:
Propose novel adaptive techniques for many-core architectures;
Autonomous decision making mechanism;
Provide an innovative virtualization layer and dedicated tool-flow to:
improve programming efficiency;
reduce the impact on time to market;
6. TILEPro64™ (Tilera)
8 x 8 grid of general-purpose processor cores (tiles);
Programmable in ANSI standard C and C++;
Up to 443 BOPS (billion operations per second);
Supports SMP Linux with a 2.6 kernel;
Targets compute-intensive applications such as advanced networking, digital multimedia,
telecom and wireless infrastructure
7. Fermi Architecture (Nvidia)
3 billion transistors;
up to 512 CUDA cores;
a CUDA core executes a floating point or
integer instruction per clock for a thread;
16 SMs (Streaming Multiprocessor) of 32
cores each;
CUDA parallel programming model;
SFU (Special Function Unit): transcendental instructions (sin, cosine, square root, etc.)
8. Homogeneous architectures
replicas of the same processing element;
intended to be more flexible;
ease of programming makes such architectures good
solutions for future scalable systems;
intended to better deal with faults that may appear in the system;
Source: http://www.starwarsreport.com/
9. … but
Source: http://knowyourmeme.com/
… this is not enough!
10. Customization is needed to raise efficiency of
applications
Source: http://saxonyfineclothing.com/
11. Qualcomm MSM7200
MPSoC + specialized ISPs + HW accelerators;
4 cores heterogeneous / shared memory design + a number of accelerators;
Dataflow: message passing;
4 differentiated CPUs:
- ARM 11 (Application proc.)
- ARM 9 (Modem)
- 2 DSPs (Audio + Modem)
2D/3D, Java accelerators
[Figure: HYBRID MODEL, four PEs, each with L1 / L2 / L3 caches, sharing the MAIN MEMORY and the I/O SYSTEM]
TASK 1:
Process1(){
    a = 1;
    send(a,task2);
    …
    a = receive(task2);
    a = a + 5;
}
TASK 2:
Process2(){
    …
    a = receive(task1);
    a = a * 2;
    …
    send(a,task1);
}
12. Heterogeneous architectures
dedicated to a specific domain of applications;
efficient architectures:
low power consumption;
high processing power;
What is the price to pay?
Source: http://www.bripblap.com/
reduced flexibility;
poor scalability;
hard programmability;
13. Challenge
How to get the best of both worlds?
14. Challenge
How to efficiently map complex applications to
many-core architectures with limited budget
(power, performance, …)?
[Figure: APPLICATIONS to be mapped onto PROCESSORS, FPGA, DSP, ??? under a LIMITED BUDGET]
Source: http://www.funtoosh.com
Source: http://www.vision.caltech.edu
Source: http://www.gamearenaph.com
Source: http://www.lnci.org.au
15. FlexTiles – Architecture Overview
[Figure: tiles of GPP nodes, DDR Ctrl. and I/O connected through NIs to the NoC; DSP and eFPGA domain (reconfigurable HW acc.) connected through NIs and AIs; Config. Ctrl.]
NI: Network Interface
Interfaces a node with the NoC
AI: Accelerator Interface
Interprets requests from the GPP
16. FlexTiles – Tool-flow
Adaptive Techniques
Source: http://www.psdgraphics.com
17. Adaptation
An adaptive system is a set of interacting entities
able to respond to environmental changes or
changes in the interacting parts.
Source: http://www.stjohns.edu
[Figure: ADAPTATION at ARCHITECTURE LEVEL and SYSTEM LEVEL; labels: DYNAMIC FREQUENCY SCALING, DYNAMIC MAPPING, TASK MIGRATION, ADAPT AS FAST AS POSSIBLE, IMPROVE OVERALL PERFORMANCE]
18. Information Management
Information Management and Decision Making Mechanisms:
Monitoring
Diagnosis
Action
[Figure: MONITORING, DIAGNOSIS, ACTION loop around the SYSTEM; O = F(L)]
Reference: Gabriel Marchesan Almeida. Adaptive Multiprocessor Systems-on-Chip Architectures: Principles, Methods and Tools, 124p. LAP LAMBERT, ISBN 978-3848424282, 2012.
19. FlexTiles – Architecture
A 3D stacked chip based on:
A many-core layer
An FPGA layer
20. FlexTiles – Techniques
Static Mapping: applications are mapped at design-time according to a given heuristic
[Figure: DESIGN-TIME TASK MAPPING ALGORITHM placing the task graphs of APP 1 and APP 2 (tasks 1 to 5 each) on a 3 x 3 grid of NPUs addressed 0000, 0100, 0200, 0001, 0101, 0201, 0002, 0102, 0202]
TASK MAPPING TABLE
APP  TASK  NPU
1    1     0x0000
1    2     0x0100
1    3     0x0001
1    4     0x0001
1    5     0x0001
2    1     0x0002
2    2     0x0102
2    3     0x0101
2    4     0x0102
2    5     0x0202
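A minimal sketch of how such a design-time mapping table could be stored and queried. The table values are copied from the slide; the type `map_entry_t`, the function `lookup_npu` and the sentinel `NPU_NOT_MAPPED` are hypothetical names introduced here for illustration and are not part of the FlexTiles tool-flow.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of the task mapping table: (application, task) -> NPU address. */
typedef struct {
    uint8_t  app;
    uint8_t  task;
    uint16_t npu;   /* address as printed on the slide, e.g. 0x0100 */
} map_entry_t;

/* Rows copied from the TASK MAPPING TABLE on the slide. */
static const map_entry_t task_map[] = {
    {1, 1, 0x0000}, {1, 2, 0x0100}, {1, 3, 0x0001},
    {1, 4, 0x0001}, {1, 5, 0x0001},
    {2, 1, 0x0002}, {2, 2, 0x0102}, {2, 3, 0x0101},
    {2, 4, 0x0102}, {2, 5, 0x0202},
};

/* Hypothetical sentinel returned when no row matches. */
#define NPU_NOT_MAPPED 0xFFFFu

/* Linear search is enough for a table that is fixed at design time. */
uint16_t lookup_npu(uint8_t app, uint8_t task)
{
    for (size_t i = 0; i < sizeof task_map / sizeof task_map[0]; i++) {
        if (task_map[i].app == app && task_map[i].task == task)
            return task_map[i].npu;
    }
    return NPU_NOT_MAPPED;
}
```

Because the table is constant, nothing can change at run-time; this is the limitation that the adaptive techniques on the following slides address.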
21. Information Management
Monitoring, Diagnosis, Action (MDA)
FIFO Filling
[Figure: FIFO filling levels monitored on the channels between tasks 1 to 5 (readings of 40%, 80%, 60%, 60% and 20%); MONITORING, DIAGNOSIS, ACTION loop around the SYSTEM; O = F(L)]
22. Information Management
Monitoring, Diagnosis, Action (MDA)
CPU Workload
[Figure: MONITORING, DIAGNOSIS, ACTION loop around the SYSTEM; O = F(L)]
23. Information Management
Monitoring, Diagnosis, Action (MDA)
Application Throughput
[Figure: throughput of 3,58 MB/s measured on the task graph (tasks 1 to 5); MONITORING, DIAGNOSIS, ACTION loop around the SYSTEM; O = F(L)]
24. Information Management
Monitoring, Diagnosis, Action (MDA)
Draw conclusions based on monitored information
CPU is getting overloaded
CPU is most of the time in idle mode
Application throughput is decreasing
[Figure: MONITORING, DIAGNOSIS, ACTION loop around the SYSTEM; O = F(L)]
25. Information Management
Monitoring, Diagnosis, Action (MDA)
Decisions are made based on both monitored information and diagnosis
1. Reduce processor frequency whenever CPU is running in idle mode or
no high-speed processing is required;
2. Increase processor frequency in order to meet application performance
requirements;
3. Migrate a task whenever CPU becomes overloaded;
[Figure: MONITORING, DIAGNOSIS, ACTION loop around the SYSTEM; O = F(L)]
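The three rules can be sketched as one pass of a monitoring, diagnosis, action loop. This is an illustration under stated assumptions, not the FlexTiles implementation: the thresholds `IDLE_LOAD_PCT` and `OVERLOAD_PCT`, the priority order of the checks, and all type and function names are hypothetical.

```c
/* Monitored values for one processing element (illustrative fields). */
typedef struct {
    unsigned cpu_load_pct;     /* CPU workload, 0..100 */
    double   throughput_mbps;  /* measured application throughput */
    double   required_mbps;    /* application performance requirement */
} monitor_t;

typedef enum { DIAG_OK, DIAG_IDLE, DIAG_TOO_SLOW, DIAG_OVERLOADED } diagnosis_t;
typedef enum { ACT_NONE, ACT_FREQ_DOWN, ACT_FREQ_UP, ACT_MIGRATE_TASK } action_t;

/* Assumed thresholds; the slides do not give numeric values. */
#define IDLE_LOAD_PCT 20u
#define OVERLOAD_PCT  90u

/* Diagnosis: draw a conclusion from the monitored values.
 * Assumed priority: overload first, then missed throughput, then idle. */
diagnosis_t diagnose(const monitor_t *m)
{
    if (m->cpu_load_pct >= OVERLOAD_PCT)       return DIAG_OVERLOADED;
    if (m->throughput_mbps < m->required_mbps) return DIAG_TOO_SLOW;
    if (m->cpu_load_pct <= IDLE_LOAD_PCT)      return DIAG_IDLE;
    return DIAG_OK;
}

/* Action: map each diagnosis to one of the three rules on the slide. */
action_t decide(diagnosis_t d)
{
    switch (d) {
    case DIAG_IDLE:       return ACT_FREQ_DOWN;     /* rule 1 */
    case DIAG_TOO_SLOW:   return ACT_FREQ_UP;       /* rule 2 */
    case DIAG_OVERLOADED: return ACT_MIGRATE_TASK;  /* rule 3 */
    default:              return ACT_NONE;
    }
}
```

A run-time manager would call `decide(diagnose(&m))` periodically for each processing element and then apply the returned action.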
26. FlexTiles – Techniques
Task Migration: tasks are migrated at run-time according to certain criteria
[Figure: tasks 1 to 5 redistributed over the 3 x 3 grid of NPUs (0000 to 0202); labels: Task is Migrated = Improved Performance; Load Balancing]
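The slide does not state which criteria select the destination of a migrated task. As one possible criterion, the sketch below picks the least-loaded NPU that is less loaded than the source. The type `npu_t` and the function `pick_migration_target` are hypothetical names used only for this illustration.

```c
#include <stddef.h>

/* Load information for one NPU (illustrative). */
typedef struct {
    unsigned short addr;      /* NPU address, e.g. 0x0100 */
    unsigned       load_pct;  /* current CPU load, 0..100 */
} npu_t;

/* Returns the index of the least-loaded NPU whose load is strictly lower
 * than that of the source NPU, or -1 if migrating would not help. */
int pick_migration_target(const npu_t *npus, size_t n, size_t src)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (i == src)
            continue;
        if (npus[i].load_pct >= npus[src].load_pct)
            continue;  /* not an improvement over the source */
        if (best < 0 || npus[i].load_pct < npus[(size_t)best].load_pct)
            best = (int)i;
    }
    return best;
}
```

A real policy would probably also weigh the communication distance on the NoC and the cost of moving the task state; both are left out here.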
27. FlexTiles – Techniques
Dynamic reconfiguration of customisation layer: migration and relocation
eFPGA – reconfigurable resources are seen as a homogeneous set of resources
(to be allocated at run-time);
this leads to better resource sharing across the many-core SoC;
enables implementation of large accelerators if required;
[Figure: tiles of GPP nodes placed over reconfigurable areas]
28. FlexTiles – Platforms
[Figure: platforms; labels: CompOSe, PCRUN Model, Low-Level Model, High-Level Model, SIMULATION, EMULATION, FPGA PROTOTYPE]
29. FlexTiles – Simplify
Simplify Framework (http://simplify.itiv.kit.edu)
(1) Architecture Modeling
• Number of processors
• Interconnection type (bus, NoC)
• Memory size
(2) Processing Element Configuration
• Processor type (microBlaze, MIPS32, ARM7, OpenRISC (OR1K) and PowerPC)
(3) Application Description
• C Programming language
(4) Application Compilation and Model Execution
• Cross-compilers
• Operating system (Windows, Linux)
• Architecture (32, 64 bits)
(5) Execution Reports
• Applications trace
• MIPS per processor and total MIPS
• Number of simulated instructions
• Simulation time
30. FlexTiles – Simplify
31. FlexTiles – Simplify
32. FlexTiles – Simplify
Version 1.0
Processors:
MIPS32, microBlaze, ARM7, openRISC, PowerPC;
Interconnect:
Bus;
Web framework:
Architecture modeling;
PE (processing element) configuration;
Application description, compilation and execution;
Execution reports;
Automatic generation of OVP platforms;
No OS support;
Mono application – 1 per core;
No API for app. communication;
Version 2.0
OS support:
Round-robin scheduler;
Semaphores;
Mutexes;
Multi-task;
Communication API for applications;
Web framework:
New design;
Improved performance;
Application profiling (instruction counter per application);
33. FlexTiles – Simplify
[Figure: bar chart of MIPS (million instructions per second, axis from 0 to 2000) per application (DHRYSTONE, LINPACK, PEAKSPEED1, SHA1, BUBBLESORT, FIBONACCI, MERGESORT, QUICKSORT, SUSAN) for MIPS32, ARM7, OR1K, POWERPC32 and MICROBLAZE]
34. FlexTiles – Simplify
35. Closing Remarks
FlexTiles is a novel architecture that provides several adaptive
techniques, mainly used for:
Improving application performance;
Reducing energy/power consumption;
Decreasing temperature hot-spots;
The tool-flow eases the programmability of heterogeneous many-core
platforms;
Application-driven frequency scaling:
Performance requirements;
Power consumption budget;
Feedback to application designers;
Source: http://www.charlesphoenix.com/
36. Karlsruhe Institute of Technology
Thank you for your attention
Dr. Gabriel Marchesan Almeida
Institute for Information Processing Technology (ITIV)
gabriel.almeida@kit.edu
37. Backup Slides
38. Programming Model
An application is a set of static clusters;
A cluster is described using Synchronous Data Flow (SDF) or
Cyclo-Static Data Flow (CSDF);
Within a data flow, each consumer/producer of tokens is called an actor;
Actors are built from nested loops implementing the operators and the
rules of token consumption/production;
Two actors communicate through FIFOs of tokens;
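A minimal sketch of the SDF idea: an actor may fire only when its input FIFO holds as many tokens as it consumes per firing and its output FIFO has room for what it produces. The FIFO depth, the rates (2 in, 1 out), the summing operator and all names are assumptions chosen for illustration, not the FlexTiles programming API.

```c
#include <stdbool.h>
#include <stddef.h>

#define FIFO_CAPACITY 8  /* assumed channel depth */

/* Bounded FIFO of integer tokens, stored as a ring buffer. */
typedef struct {
    int    tokens[FIFO_CAPACITY];
    size_t head;   /* index of the oldest token */
    size_t count;  /* number of tokens currently stored */
} fifo_t;

static bool fifo_push(fifo_t *f, int token)
{
    if (f->count == FIFO_CAPACITY)
        return false;  /* channel full */
    f->tokens[(f->head + f->count) % FIFO_CAPACITY] = token;
    f->count++;
    return true;
}

static bool fifo_pop(fifo_t *f, int *token)
{
    if (f->count == 0)
        return false;  /* channel empty */
    *token = f->tokens[f->head];
    f->head = (f->head + 1) % FIFO_CAPACITY;
    f->count--;
    return true;
}

/* Hypothetical actor: consumes 2 tokens per firing, produces their sum. */
#define CONSUME_RATE 2
#define PRODUCE_RATE 1

/* SDF firing rule: enough input tokens and enough output space. */
bool actor_can_fire(const fifo_t *in, const fifo_t *out)
{
    return in->count >= CONSUME_RATE &&
           FIFO_CAPACITY - out->count >= PRODUCE_RATE;
}

/* Fires the actor once; returns false if the firing rule is not met. */
bool actor_fire(fifo_t *in, fifo_t *out)
{
    int a, b;
    if (!actor_can_fire(in, out))
        return false;
    fifo_pop(in, &a);
    fifo_pop(in, &b);
    fifo_push(out, a + b);
    return true;
}
```

In SDF the rates are fixed, so a schedule can be computed at design time. In CSDF the rates would instead cycle through a fixed sequence from one firing to the next.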
39. Programming Model