Parallella 
Presented By: 
Somnath Mazumdar 
University of Siena, Italy
Outline 
This presentation was held on 10 December 2014 at Ericsson Research Lab, Lund, Sweden.
This work is licensed under a Creative Commons Attribution 4.0 International License.
Outline 
Introduction 
Architecture 
System View 
Programming 
Conclusion 
Genesis
• Influenced by open source hardware design projects: Arduino, BeagleBone
• Inspired by: Raspberry Pi, ZedBoard
• The board is open source hardware*
*https://github.com/parallella/parallella-hw
In the News: “Smallest Supercomputer in the World”
Adapteva A-1
• Launched at ISC'14*
• 2,112 RISC cores
• Based on 64-core Epiphany chips
• Power consumption: 200 W
• Performance: 16 GFLOPS per watt
*http://primeurmagazine.com/weekly/AE-PR-07-14-104.html
Image source: https://twitter.com/StreamComputing/media
Adapteva (Zynq + Epiphany III)
• Based on the Epiphany™ architecture (multi-core MIMD architecture)
• Host SoC: fully programmable Xilinx Zynq with a dual-core ARM Cortex-A9 CPU
• 16/64-core Epiphany coprocessor:
    No cache
    32-bit cores
    Max clock speed: 1 GHz (600 MHz)
    Peak performance: 32 GFLOPS
    Supports fused multiply-add (FMA) operations
    Superscalar floating-point (IEEE-754) RISC CPU core
    Two floating-point operations per clock cycle
• Supports static dual-issue scheduling
Adapteva (Zynq + Epiphany III)
• IALU: one 32-bit integer operation per clock cycle
• FPU: one floating-point instruction per clock cycle
• 64 general-purpose registers
• Program sequencer supports all standard program flows
• Branching costs 3 cycles
• No hardware support for:
    Integer multiply
    Floating-point divide
    Double-precision floating-point operations
eCore CPU(1)
Epiphany Architecture(1)
• Every router in the mesh is connected to its north, east, west, and south neighbors and to a mesh node.
• The router at every node contains round-robin arbiters.
• Routing latency is 1.5 clock cycles per hop.
Interconnects
• eCores are connected by a 2D low-latency NoC (eMesh):
    rMesh for reads
    xMesh for off-chip writes
    cMesh for on-chip writes
• eMesh has only nearest-neighbor direct connections.
• Each routing link can transfer up to 8 bytes of data on every clock cycle.
Network-On-Chip Overview(1)
Interconnects
Network Topology(1)
• The network completes transactions in a single clock cycle because of spatial locality and short point-to-point on-chip wires.
• Each mesh node has a globally addressable ID (6-bit row ID and 6-bit column ID).
Memory
• Shared memory (32-bit wide, flat, and unprotected)
• Primary memory: 1 GB (DDR3 SDRAM)
• Flash memory: 128 Mb (boot code)
• Little-endian memory architecture
• A single, flat address space consisting of 2^32 8-bit bytes (equivalently, 2^30 32-bit words)
• SRAM distribution:
    Chip core   Start address   End address   Size
    (0,0)       00000000        00007FFF      32 KB
Memory
• On every clock cycle, 64 bits of data or instructions can be exchanged between memory and the CPU's register file, network interface, or local DMA.
• Dual-channel DMA engine
• Memory-mapped registers
• Each eCore has 32 KB of local memory (4 sub-banks × 8 KB).
• The eCore CPU has a variable-length instruction pipeline whose length depends on the type of instruction being executed.
Memory Architecture(2)
Memory: Read-Write Transactions 
• Read transactions are non-blocking 
• Read/write transactions to local memory follow a strong memory-order model.
• Read/write transactions that access non-local memory follow a weak memory-order model.
• Solution: use run-time synchronization calls around order-dependent memory sequences.
• Less inter-node communication
Scalability
• The chip has four identical source-synchronous, bidirectional off-chip eLinks.
• eLink is non-blocking.
• Optimal bandwidth is achieved when a large number of incrementally numbered 64-bit data packets are sent consecutively.
FPGA eLink Integration(1)
360 Degree View(front) 
Image Source : http://www.parallella.org/board/
360 Degree View(back) 
Image Source : http://www.parallella.org/board/ 
PEC: Parallella Expansion Connector
How to get started
1. Create a Parallella micro-SD card*
2. Connect the wires as described in the quick-start guide**
3. Power on
4. Go...
*http://www.parallella.org/create-sdcard/
**http://www.parallella.org/quick-start/
Epiphany Host Library (eHAL)
• Encapsulates low-level Epiphany functionality (Epiphany device driver)
• The library interface is defined in “e-hal.h”.
• Steps to write a host program:
1. Prepare the system:
    e_init(NULL);                    // initialize the system
    e_reset_system();                // reset the platform
    e_get_platform_info(&platform);  // get the actual system parameters
Epiphany Host Library (eHAL)
2. Allocate memory (optional):
    e_mem_t emem;                               // object of type e_mem_t
    char emsg[Size];
    e_alloc(&emem, <BufOffset>, <BufferSize>);  // allocate a buffer in shared external memory
3. Open a workgroup:
    e_open(&dev, 0, 0, platform.rows, platform.cols);  // open all cores
    (OR)
    e_open(&dev, 0, 0, 1, 1);  // open a single core; later core coordinates are relative to the workgroup
    e_reset_group(&dev);       // soft reset
Epiphany Host Library (eHAL)
4. Load the program:
    e_load("program", &dev, 0, 0, E_TRUE);
5. Wait, then print the message from the buffer:
    usleep(time);
    e_read(&emem, 0, 0, 0x0, emsg, _BufSize);
    fprintf(stderr, "\"%s\"\n", emsg);
6. Close every connection:
    e_close(&dev);
    e_free(&emem);
    e_finalize();
Epiphany Hardware Utility Library (eLib)
• Provides functions for configuring and querying eCores.
• Also automates many common programming tasks on eCores.
• Steps to write an eCore program:
    Step 1: Declare shared memory:
        char outbuf[128] SECTION("shared_dram");
    Step 2: Query the eCore ID:
        e_coreid_t coreid;
        coreid = e_get_coreid();
    Step 3: Print “Hello World” with the core ID
    Step 4: Exit
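Hello World
Host Side
    #include <stdio.h>
    #include <unistd.h>   // usleep
    #include <e-hal.h>

    #define _BufSize   (128)         // matches outbuf[128] on the eCore side
    #define _BufOffset (0x01000000)  // assumed offset of "shared_dram" in external memory

    int main(int argc, char *argv[]) {
        e_platform_t platform;
        e_epiphany_t dev;
        e_mem_t emem;
        char emsg[_BufSize];

        e_init(NULL);                             // initialize the system
        e_reset_system();                         // reset the platform
        e_get_platform_info(&platform);
        e_alloc(&emem, _BufOffset, _BufSize);     // buffer in shared external memory
        e_open(&dev, 0, 0, 1, 1);                 // single-core workgroup
        e_load("e_core.srec", &dev, 0, 0, E_TRUE);
        usleep(10000);                            // give the eCore time to write its message
        e_read(&emem, 0, 0, 0x0, emsg, _BufSize);
        fprintf(stderr, "\"%s\"\n", emsg);
        e_close(&dev);
        fflush(stdout);
        e_free(&emem);
        e_finalize();
        return 0;
    }
eCore Side
    #include <stdio.h>    // sprintf
    #include "e-lib.h"

    char outbuf[128] SECTION("shared_dram");

    int main(void) {
        e_coreid_t coreid;
        coreid = e_get_coreid();
        sprintf(outbuf, "Hello World from core 0x%03x!", coreid);
        return 0;
    }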
Epiphany Program Build Flow(2)
Where to put the code
• Three different Linker Description Files (LDF):
• Internal.ldf: stores all data and instructions in internal SRAM (limit 32 KB).
• Fast.ldf: user code, data, and stack in internal SRAM; standard libraries in external DRAM. Good when only a few large library functions are used.
• Legacy.ldf: everything stored in external DRAM (limit 1 MB). Slower than internal and fast.
Synchronization (eCores)
http://www.linuxplanet.org/blogs/?cat=2359
Barrier for synchronizing threads that execute in parallel
1. Setup:
    e_barrier_init(bar_array, tgt_bar_array);
2. Call function
3. Wait for sync:
    e_barrier(bar_array, tgt_bar_array);
Mutex (blocking and non-blocking)
1. Setup:
    e_mutex_init(0, 0, s_mutex, mutex_attr);
2. Gain access:
    e_mutex_lock(0, 0, s_mutex);
3. Call function
4. Release access:
    e_mutex_unlock(0, 0, s_mutex);
Image Source: http://xkcd.com/1445/
My Understanding
Synchronization between the ARM host and the eCores uses a flag.
Reason: eMesh writes from an individual Epiphany core to the external shared DRAM update the DRAM in the same order as they were sent. However, if multiple cores write to external DRAM, the order in which their writes reach the DRAM is not guaranteed.
Solutions:
1. Set a flag
2. Use the software barrier function e_barrier() (time consuming)
3. Use the experimental hardware barrier opcode
Useful for Sync
eCore-side read and write:
    e_write(remote, Src, row, col, Dst, Byte_size);
    e_read(remote, Dst, row, col, Src, Byte_size);
The remote parameter must be either:
    &e_group_config if the remote is a workgroup core
    or
    &e_emem_config if the remote is an external memory buffer
Conclusion
• Fast and power efficient
• Power supply: 5 V / 2 A (0.3 A to 1.5 A)
• Fully featured ANSI-C/C++ and OpenCL programming environments
• Support for a large application domain
• But...
• Needs an improved SDK (on the way)
• A cache might improve performance (a software cache is on the way)
• Synchronization and randomness are big issues
References
1. Epiphany Architecture Reference: http://www.adapteva.com/docs/epiphany_arch_ref.pdf
2. Epiphany SDK Reference: http://adapteva.com/docs/epiphany_sdk_ref.pdf
3. eSDK GitHub: https://github.com/adapteva/epiphany-sdk
4. Further reading: http://www.adapteva.com/all-documents/

Brief Introduction to Parallella

  • 1. Parallella Presented By: Somnath Mazumdar University of Siena, Italy
  • 2. Outline This Presentation was held on 10th Dec 2014 Place: Ericsson Research Lab, Lund Sweden This work is licensed under a Creative Commons Attribution 4.0 International License.
  • 3. Outline Introduction Architecture System View Programming Conclusion Outline
  • 4. Genesis • Influenced by open source hardware design projects: Arduino, BeagleBone • Inspired by: Raspberry Pi, ZedBoard • The board is open source hardware* *https://github.com/parallella/parallella-hw
  • 5. In the News: “Smallest Supercomputer in the World” Adapteva A-1…... • Launched at ISC'14* • It has 2,112 RISC cores • Based on the 64-core Epiphany board • Power consumption: 200 W • Performance: 16 GFLOPS per watt *http://primeurmagazine.com/weekly/AE-PR-07-14-104.html Image Source: https://twitter.com/StreamComputing/media
  • 6. Adapteva (Zynq + Epiphany III) • Based on the Epiphany™ architecture (multi-core MIMD architecture) • Fully programmable Xilinx Zynq SoC with a dual-core ARM Cortex-A9 CPU • 16/64-core microprocessor/coprocessor: no cache; 32-bit cores; max clock speed 1 GHz (600 MHz); peak performance 32 GFLOPS; supports fused multiply-add (FMA) operations; superscalar floating-point (IEEE-754) RISC CPU core; two floating-point operations per clock cycle • Supports static dual-issue scheduling
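
The peak-performance figure on this slide follows from the other numbers on it: cores × floating-point operations per cycle × clock rate. The sketch below reproduces that arithmetic; the function names are hypothetical, and the 600 MHz case assumes the same two operations per cycle apply at the lower clock.

```c
#include <assert.h>

/* Peak throughput in GFLOPS = cores * FLOPs per cycle per core * clock in GHz.
   Illustrative helper, not part of any Adapteva SDK. */
double peak_gflops(int cores, int flops_per_cycle, double clock_ghz)
{
    return cores * flops_per_cycle * clock_ghz;
}

/* Self-check against the figures quoted on the slide. */
void check_peak_gflops(void)
{
    /* 16 cores, 2 FLOPs per cycle (one FMA), 1 GHz -> 32 GFLOPS. */
    assert(peak_gflops(16, 2, 1.0) == 32.0);

    /* Same chip at 600 MHz: roughly 19.2 GFLOPS (assumption, see lead-in). */
    double at_600mhz = peak_gflops(16, 2, 0.6);
    assert(at_600mhz > 19.19 && at_600mhz < 19.21);
}
```

This is a theoretical ceiling; sustained throughput depends on keeping every core's FPU fed from its local memory.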
  • 7. Adapteva (Zynq + Epiphany III) • IALU: a single 32-bit integer operation per clock cycle • FPU: a single floating-point instruction per clock cycle • 64 general-purpose registers • The program sequencer supports all standard program flows • Branching costs 3 cycles • No hardware support for: integer multiply, floating-point divide, double-precision floating-point operations. Figure: eCore CPU(1)
  • 8. Epiphany Architecture(1) • Every router in the mesh is connected to its North, East, West, and South neighbors and to a mesh node. • The router at every node contains round-robin arbiters. • Routing hop latency is 1.5 clock cycles.
  • 9. Interconnects • eCores are connected by a low-latency 2D NoC (eMesh): rMesh for reads, xMesh for off-chip writes, cMesh for on-chip writes • eMesh has only nearest-neighbor direct connections. • Each routing link can transfer up to 8 bytes of data on every clock cycle. Figure: Network-On-Chip Overview(1)
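
A rough way to use the 1.5-cycle hop latency is to multiply it by the number of router hops between two nodes. The sketch below assumes packets take a shortest path on the 2D mesh (hop count equal to the Manhattan distance) and ignores contention at the arbiters; both are simplifying assumptions, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hops between mesh nodes (r0, c0) and (r1, c1), assuming a shortest
   path on the 2D mesh, i.e. the Manhattan distance. */
int mesh_hops(int r0, int c0, int r1, int c1)
{
    return abs(r1 - r0) + abs(c1 - c0);
}

/* Latency estimate in clock cycles at 1.5 cycles per hop (slide figure).
   Contention and arbitration delays are not modeled. */
double mesh_latency_cycles(int r0, int c0, int r1, int c1)
{
    return 1.5 * mesh_hops(r0, c0, r1, c1);
}

/* Self-check: opposite corners of a 4x4 mesh are 6 hops apart. */
void check_mesh_latency(void)
{
    assert(mesh_hops(0, 0, 3, 3) == 6);
    assert(mesh_latency_cycles(0, 0, 3, 3) == 9.0);
    assert(mesh_hops(2, 1, 2, 1) == 0);
}
```

Under these assumptions, the worst case on a 4×4 array (corner to corner) comes to about 9 cycles of routing latency.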
  • 10. Interconnects: Network Topology(1) • The network completes transactions in a single clock cycle because of spatial locality and short point-to-point on-chip wires. • Each mesh node has a globally addressable ID (6-bit row ID and 6-bit column ID).
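
The 6-bit row ID and 6-bit column ID together form a 12-bit node ID. The sketch below shows one way such an ID can be packed and combined with a local offset into a 32-bit global address. It assumes the ID occupies the upper 12 bits of the address, leaving 20 bits for the offset inside a node; the macro and function names, and the coordinates in the check, are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define COL_BITS    6u   /* column ID width from the slide */
#define OFFSET_BITS 20u  /* assumption: 32-bit address minus 12 ID bits */

/* Pack a 6-bit row ID and a 6-bit column ID into a 12-bit node ID. */
uint32_t core_id(uint32_t row, uint32_t col)
{
    return ((row & 0x3Fu) << COL_BITS) | (col & 0x3Fu);
}

/* Global address = node ID in the upper 12 bits, local offset below it. */
uint32_t global_addr(uint32_t row, uint32_t col, uint32_t offset)
{
    return (core_id(row, col) << OFFSET_BITS) | (offset & 0xFFFFFu);
}

/* Self-check with hypothetical mesh coordinates (32, 8). */
void check_global_addr(void)
{
    assert(core_id(32, 8) == 0x808u);
    assert(global_addr(32, 8, 0x100u) == 0x80800100u);
    assert(global_addr(0, 0, 0x7FFFu) == 0x00007FFFu);
}
```

With this layout, a core reaches another node's memory simply by using an address whose upper bits name that node.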
  • 11. Memory • Shared memory (32-bit wide, flat, and unprotected) • Primary memory: 1 GB (DDR3 SDRAM) • Flash memory: 128 Mb (boot code) • Little-endian memory architecture • A single, flat address space consisting of 2^32 8-bit bytes (equivalently, 2^30 32-bit words) • SRAM distribution (chip core, start address, end address, size): (0,0), 00000000, 00007FFF, 32 KB
  • 12. Memory • On every clock cycle 64 bits of data / instructions can be exchanged between memory and CPU’s register file, network interface or local DMA. • Dual channel DMA engine • Memory Mapped Registers • Each eCore has 32KB of local memory(4 sub-banks * 8KB) • eCPU has a variable-length instruction pipeline that depends on the type of instruction being executed.
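
The split of each eCore's 32 KB local memory into four 8 KB sub-banks can be expressed as a simple address calculation. The sketch below assumes the banks are laid out as four contiguous 8 KB ranges starting at local offset 0; the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LOCAL_MEM_BYTES (32u * 1024u)  /* 32 KB per eCore (from the slide) */
#define BANK_BYTES      (8u * 1024u)   /* 4 sub-banks of 8 KB each */

/* Return the sub-bank (0..3) holding a local offset, or -1 if the
   offset lies outside the 32 KB local memory. Assumes contiguous banks. */
int local_bank(uint32_t offset)
{
    if (offset >= LOCAL_MEM_BYTES)
        return -1;
    return (int)(offset / BANK_BYTES);
}

/* Self-check at the bank boundaries. */
void check_local_bank(void)
{
    assert(local_bank(0x0000u) == 0);
    assert(local_bank(0x1FFFu) == 0);
    assert(local_bank(0x2000u) == 1);
    assert(local_bank(0x7FFFu) == 3);
    assert(local_bank(0x8000u) == -1);
}
```

Placing code and data in different sub-banks is one way to use this layout, since separate banks can then serve different requesters in the same cycle.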
  • 14. Memory: Read-Write Transactions • Read transactions are non-blocking. • Read/write transactions to local memory follow a strong memory-order model. • Read/write transactions that access non-local memory follow a weak memory-order model. • Solution: use run-time synchronization calls around order-dependent memory sequences. • Less inter-node communication
  • 15. Scalability • The chip has four identical source-synchronous, bidirectional off-chip eLinks. • eLink is non-blocking. • Optimal bandwidth is achieved when a large number of incrementally numbered 64-bit data packets are sent consecutively. Figure: FPGA eLink Integration(1)
  • 16. 360 Degree View(front) Image Source : http://www.parallella.org/board/
  • 17. 360 Degree View(back) Image Source : http://www.parallella.org/board/ PEC: Parallella Expansion Connector
  • 18. How to get started: 1. Create a Parallella micro-SD card (http://www.parallella.org/create-sdcard/) 2. Connect the wires as described in the quick-start guide (http://www.parallella.org/quick-start/) 3. Power on 4. Go...
  • 19. Epiphany Host Library (eHAL) • Encapsulates low-level Epiphany functionality (Epiphany device driver) • The library interface is defined in “e-hal.h”. • Steps to write a program: 1. Prepare the system: e_init(NULL); /* initialize the system */ e_reset_system(); /* reset the platform */ e_get_platform_info(&platform); /* get the actual system parameters */
  • 20. Epiphany Host Library (eHAL) 2. Allocate memory (optional): e_mem_t emem; /* object of type e_mem_t */ char emsg[Size]; e_alloc(&emem, <BufOffset>, <BufferSize>); /* allocate a buffer in shared external memory */ 3. Open a workgroup: e_open(&dev, 0, 0, platform.rows, platform.cols); /* open all cores */ or e_open(&dev, 0, 0, 1, 1); /* core coordinates relative to the workgroup */ e_reset_group(&dev); /* soft reset */
  • 21. Epiphany Host Library (eHAL) 4. Load the program: e_load("program", &dev, 0, 0, E_TRUE); 5. Wait, then print the message from the buffer: usleep(time); e_read(&emem, 0, 0, 0x0, emsg, _BufSize); fprintf(stderr, "\"%s\"\n", emsg); 6. Close every connection: e_close(&dev); e_free(&emem); e_finalize();
  • 22. Epiphany Hardware Utility Library (eLib) • Provides functions for configuring and querying eCores. • Also automates many common programming tasks in eCores • Steps to write an eCore program • Step1: Declare shared memory: char outbuf[128] SECTION("shared_dram"); • Step2: Enquire about eCore id: e_coreid_t coreid; coreid = e_get_coreid(); • Step3: Print “Hello World” with core id • Step4: Exit
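  • 23. Hello World
    Host side:
        #include <stdio.h>
        #include <unistd.h>
        #include <e-hal.h>

        /* _BufSize and _BufOffset are not defined on the slide; they must
           match the size and shared-DRAM offset of the eCore-side outbuf. */
        int main(int argc, char *argv[])
        {
            e_platform_t platform;
            e_epiphany_t dev;
            e_mem_t emem;
            char emsg[_BufSize];

            e_init(NULL);                          /* initialize the system */
            e_reset_system();                      /* reset the platform */
            e_get_platform_info(&platform);        /* query system parameters */
            e_alloc(&emem, _BufOffset, _BufSize);  /* buffer in shared DRAM */
            e_open(&dev, 0, 0, 1, 1);              /* one-core workgroup */
            e_load("e_core.srec", &dev, 0, 0, E_TRUE);
            usleep(10000);                         /* give the eCore time to run */
            e_read(&emem, 0, 0, 0x0, emsg, _BufSize);
            fprintf(stderr, "\"%s\"\n", emsg);
            e_close(&dev);
            fflush(stdout);
            e_free(&emem);
            e_finalize();
            return 0;
        }
    eCore side:
        #include <stdio.h>
        #include "e-lib.h"

        char outbuf[128] SECTION("shared_dram");   /* lives in shared DRAM */

        int main(void)
        {
            e_coreid_t coreid;
            coreid = e_get_coreid();
            sprintf(outbuf, "Hello World from core 0x%03x!", coreid);
            return 0;
        }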
  • 25. Where to put the code • Three different Linker Description Files (LDF) • Internal.ldf: stores data and instructions in internal SRAM (limit 32 KB). • Fast.ldf: user code, data, and stack in internal SRAM; standard libraries in external DRAM. Good for a few large library functions. • Legacy.ldf: everything stored in external DRAM (limit 1 MB). Slower than internal and fast.
  • 26. Synchronization (eCores) http://www.linuxplanet.org/blogs/?cat=2359 • Barrier for synchronizing threads executing in parallel: 1. Set up: e_barrier_init(bar_array[], tgt_bar_array[]) 2. Call function 3. Wait for sync: e_barrier(bar_array[], tgt_bar_array[]) • Mutex (blocking and non-blocking): 1. Set up: e_mutex_init(0, 0, s_mutex, mutex_attr) 2. Gain access: e_mutex_lock(0, 0, s_mutex) 3. Call function 4. Release access: e_mutex_unlock(0, 0, s_mutex)
  • 28. My Understanding: Synchronization between the ARM host and the eCores uses a flag, because eMesh writes from an individual Epiphany core to the external shared DRAM update the DRAM in the same order in which they were sent. However, if multiple cores are writing to external DRAM, the order in which their writes reach the DRAM may change. Solutions: 1. Set a flag 2. Use the software barrier function e_barrier() (time consuming) 3. Use the experimental hardware barrier opcode
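
The flag approach amounts to "write the payload first, then set the flag; the reader trusts the payload only once it sees the flag". The sketch below is a host-portable analogy in plain C11 atomics, not Epiphany SDK code: `mailbox_t` and its functions are hypothetical, and the release/acquire pair stands in for the in-order delivery of a single core's writes described above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

/* Hypothetical mailbox: a payload guarded by a ready flag. */
typedef struct {
    char payload[128];
    atomic_int ready;   /* 0 = empty, 1 = payload valid */
} mailbox_t;

void mailbox_init(mailbox_t *mb)
{
    mb->payload[0] = '\0';
    atomic_init(&mb->ready, 0);
}

/* Producer: write the payload, then publish it by setting the flag.
   The release store keeps the payload write ordered before the flag. */
void mailbox_publish(mailbox_t *mb, const char *msg)
{
    strncpy(mb->payload, msg, sizeof mb->payload - 1);
    mb->payload[sizeof mb->payload - 1] = '\0';
    atomic_store_explicit(&mb->ready, 1, memory_order_release);
}

/* Consumer: copy the payload only if the flag is set; returns 1 on
   success, 0 if nothing has been published yet. */
int mailbox_try_read(mailbox_t *mb, char *out, size_t out_size)
{
    if (out_size == 0 ||
        atomic_load_explicit(&mb->ready, memory_order_acquire) == 0)
        return 0;
    strncpy(out, mb->payload, out_size - 1);
    out[out_size - 1] = '\0';
    atomic_store_explicit(&mb->ready, 0, memory_order_relaxed); /* consume */
    return 1;
}
```

In the host program, the same idea would replace the fixed `usleep()` delay with polling a flag word in the shared buffer before reading the message. With several writers, one flag per writer avoids depending on the ordering between different cores.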
  • 29. Useful for Sync: eCore-side read and write: e_write(remote, Src, row, col, Dst, Byte_size); e_read(remote, Dst, row, col, Src, Byte_size); The remote parameter must be either e_group_config, if the remote is a workgroup core, or e_emem_config, if the remote is an external memory buffer.
  • 30. Conclusion • Fast and power efficient • Power needed: 5 V / 2 A (0.3 A to 1.5 A) • Fully featured ANSI-C/C++ and OpenCL programming environments • Large application domain support • But: • Needs an improved SDK (on the way) • A cache might improve performance (a software cache is on the way) • Synchronization and randomness are big issues
  • 31. Reference 1. Epiphany Architecture Reference http://www.adapteva.com/docs/epiphany_arch_ref.pdf 2. Epiphany SDK Reference: http://adapteva.com/docs/epiphany_sdk_ref.pdf 3. Esdk GitHub: https://github.com/adapteva/epiphany-sdk 4. Reading: http://www.adapteva.com/all-documents/