SlideShare ist ein Scribd-Unternehmen logo
1 von 84
Migration to  Multi-Core Zvi Avraham, CTO [email_address]
Agenda ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Multi-Core vs. Many-Core Multi-Core:  ≤8 cores/threads Many-Core: >8 cores/threads
Multi-Core x86 ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Many-Core Processors (MPU) the replacement for DSP and FPGA ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Intel Tera-scale ,[object Object],[object Object]
Cell Processor SONY Playstation 3 ,[object Object],[object Object],[object Object]
Xenon CPU Microsoft XBOX 360 ,[object Object]
NVIDIA GeForce 8800GTX ,[object Object]
Intel Larrabee GPU ,[object Object],[object Object],[object Object]
Hardware Slides ,[object Object],[object Object]
Parallel Programming Models
Concurrent/Parallel/Distributed ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Levels of HW Parallelism ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Flynn’s Taxonomy Single Instruction Multiple Instruction Single Data SISD MISD Multiple Data SIMD MIMD
Flynn’s Taxonomy (2)
Parallel Programming Models ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Shared State ,[object Object],[object Object],[object Object],[object Object]
Message Passing ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Message Passing (cont.) ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Active Object Design Pattern
Rational’s Capsule ROOM – Real-Time Object Oriented Modeling
Implicit vs. Explicit Parallelism ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Implicit vs. Explicit Parallelism ,[object Object],[object Object],[object Object],[object Object],[object Object]
Data Parallel vs. Task Parallel ,[object Object],[object Object],[object Object],[object Object],[object Object]
Data Parallel example ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Task Parallel example ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
“ Plain” vs Nested Data Parallelism ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Parallel Models ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Scatter / Gather
Fork / Join Barries – “joins” Parallel Regions Master Thread
Recursive Fork/Join ,[object Object]
Map / Reduce ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Map / Reduce
Types of Parallelism ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Embarrassingly Parallel Problems ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Extended Flynn’s Taxonomy ,[object Object],[object Object],[object Object],[object Object]
CPU Affinity Changing Process and Thread Affinity
CPU Affinity ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Why mess with CPU Affinity? Legacy Code Migration ,[object Object],[object Object],[object Object]
Why mess with CPU Affinity?  Real-Time and Determinism ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Why mess with CPU Affinity?  Memory Affinity ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Changing CPU Affinity ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
CPU Affinity – Task Manager
CPU Affinity – Task Manager
ImageCFG – Affinity Mask Tool ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Process.exe – Get Affinity Mask ,[object Object]
Process.exe – Set Affinity Mask ,[object Object],[object Object]
IntFiltr.exe   – Interrupt Affinity Filter ,[object Object],[object Object],[object Object]
Parallel Programming APIs
Parallel Programming APIs ,[object Object],[object Object],[object Object],[object Object]
Simplicity / Complexity ,[object Object],[object Object]
Capabilities Comparison Intel TBB OpenMP Threads Task level parallelism + + - Data decomposition support + + - Complex parallel patterns (non-loops) + - - Generic parallel patterns + - - Scalable nested parallelism support + - - Built-in load balancing + + - Affinity support - + + Static scheduling - + - Concurrent data structures + - - Scalable memory allocator  + - - I/O dominated tasks - - + User-level synch. primitives + + - Compiler support is not required + - + Cross OS support + + -
Native Threads ,[object Object],[object Object],[object Object],[object Object]
Win32 Threads API It’s assumed, that reader/listener is familiar with basic multithreading and corresponding Win32 API
What are Win32 Threads? ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Win32 API Hierarchy for Concurrency Windows OS Job Job Process Primary Thread Thread Process Fiber Fiber
Process ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Thread ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Job object ,[object Object],[object Object],[object Object]
Fiber / Coroutine / Microthread ,[object Object],[object Object],[object Object],[object Object],[object Object]
Win32 Threads API Example: CreateThread
Win32 Threads API: Critical Section
CCriticalSection C++ Class
CCriticalSection & CLock example
Thread Affinity ,[object Object],[object Object],[object Object],[object Object],[object Object]
Thread Ideal Processor ,[object Object],[object Object],[object Object]
Our running Example:  The PI program Numerical Integration    4.0 (1+x 2 ) dx =   0 1    F(x i )  x       i = 0 N Mathematically, we know that: We can approximate the integral as a sum of rectangles: Where each rectangle has width   x and height F(x i ) at the middle of interval i. F(x) = 4.0/(1+x 2 ) 4.0 2.0 1.0 X 0.0
PI: Matlab N=1000000;  Step = 1/N;  PI = Step*sum(4./(1+(((1:N)-0.5)*Step).^2)); ,[object Object],[object Object],[object Object]
PI Program: an example static long num_steps = 100000; double step; void main () {   int i;    double x, pi, sum = 0.0;   step = 1.0/(double) num_steps;   for (i=1;i<= num_steps; i++){   x = (i-0.5)*step;   sum = sum + 4.0/(1.0+x*x);   }   pi = step * sum; }
OpenMP PI Program :  Parallel for with a reduction #include <omp.h> static long num_steps = 100000;  double step; #define NUM_THREADS 2 void main () {   int i;    double x, pi, sum = 0.0;   step = 1.0/(double) num_steps;   omp_set_num_threads(NUM_THREADS); #pragma omp parallel for reduction(+:sum) private(x)   for (i=1;i<= num_steps; i++){   x = (i-0.5)*step;   sum = sum + 4.0/(1.0+x*x);   }   pi = step * sum; } OpenMP adds 2 to 4 lines of code
OpenMP
OpenMP Slides ,[object Object],[object Object]
OpenMP Compiler Option ,[object Object],[object Object]
OpenMP Compiler Option ,[object Object],[object Object],[object Object]
Auto-Parallelization ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Intel® TBB Thread Building Blocks C++ Library
Calculate PI using TBB
Calculate PI in MPI
MPI_Reduce ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Summary ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Commercial Multi-core ,[object Object],[object Object]
Any Questions?
Thank you!

Weitere ähnliche Inhalte

Was ist angesagt?

Thrashing allocation frames.43
Thrashing allocation frames.43Thrashing allocation frames.43
Thrashing allocation frames.43myrajendra
 
Memory management ppt
Memory management pptMemory management ppt
Memory management pptManishaJha43
 
Operating system Dead lock
Operating system Dead lockOperating system Dead lock
Operating system Dead lockKaram Munir Butt
 
Operating system memory management
Operating system memory managementOperating system memory management
Operating system memory managementrprajat007
 
Shared-Memory Multiprocessors
Shared-Memory MultiprocessorsShared-Memory Multiprocessors
Shared-Memory MultiprocessorsSalvatore La Bua
 
Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism) Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism) A B Shinde
 
Unit IV Memory and I/O Organization
Unit IV Memory and I/O OrganizationUnit IV Memory and I/O Organization
Unit IV Memory and I/O OrganizationBalaji Vignesh
 
dos mutual exclusion algos
dos mutual exclusion algosdos mutual exclusion algos
dos mutual exclusion algosAkhil Sharma
 
Physical organization of parallel platforms
Physical organization of parallel platformsPhysical organization of parallel platforms
Physical organization of parallel platformsSyed Zaid Irshad
 
Communication model of parallel platforms
Communication model of parallel platformsCommunication model of parallel platforms
Communication model of parallel platformsSyed Zaid Irshad
 

Was ist angesagt? (20)

Thrashing allocation frames.43
Thrashing allocation frames.43Thrashing allocation frames.43
Thrashing allocation frames.43
 
Memory management ppt
Memory management pptMemory management ppt
Memory management ppt
 
Multiprocessing operating systems
Multiprocessing operating systemsMultiprocessing operating systems
Multiprocessing operating systems
 
Operating system Dead lock
Operating system Dead lockOperating system Dead lock
Operating system Dead lock
 
Cache memory
Cache memoryCache memory
Cache memory
 
Virtual memory
Virtual memoryVirtual memory
Virtual memory
 
Operating system memory management
Operating system memory managementOperating system memory management
Operating system memory management
 
Shared-Memory Multiprocessors
Shared-Memory MultiprocessorsShared-Memory Multiprocessors
Shared-Memory Multiprocessors
 
Process Management-Process Migration
Process Management-Process MigrationProcess Management-Process Migration
Process Management-Process Migration
 
Thread
ThreadThread
Thread
 
Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism) Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism)
 
DMA and DMA controller
DMA and DMA controllerDMA and DMA controller
DMA and DMA controller
 
Direct manipulation - ppt
Direct manipulation - pptDirect manipulation - ppt
Direct manipulation - ppt
 
Memory management
Memory managementMemory management
Memory management
 
VIRTUAL MEMORY
VIRTUAL MEMORYVIRTUAL MEMORY
VIRTUAL MEMORY
 
Unit IV Memory and I/O Organization
Unit IV Memory and I/O OrganizationUnit IV Memory and I/O Organization
Unit IV Memory and I/O Organization
 
dos mutual exclusion algos
dos mutual exclusion algosdos mutual exclusion algos
dos mutual exclusion algos
 
Replication in Distributed Systems
Replication in Distributed SystemsReplication in Distributed Systems
Replication in Distributed Systems
 
Physical organization of parallel platforms
Physical organization of parallel platformsPhysical organization of parallel platforms
Physical organization of parallel platforms
 
Communication model of parallel platforms
Communication model of parallel platformsCommunication model of parallel platforms
Communication model of parallel platforms
 

Andere mochten auch

Семинар 7. Многопоточное программирование на OpenMP (часть 7)
Семинар 7. Многопоточное программирование на OpenMP (часть 7)Семинар 7. Многопоточное программирование на OpenMP (часть 7)
Семинар 7. Многопоточное программирование на OpenMP (часть 7)Mikhail Kurnosov
 
Consistency And Parallelism Presentation
Consistency And Parallelism PresentationConsistency And Parallelism Presentation
Consistency And Parallelism Presentationjesler
 
Intel® MPI Library e OpenMP* - Intel Software Conference 2013
Intel® MPI Library e OpenMP* - Intel Software Conference 2013Intel® MPI Library e OpenMP* - Intel Software Conference 2013
Intel® MPI Library e OpenMP* - Intel Software Conference 2013Intel Software Brasil
 
A Pipeline for Distributed Topic and Sentiment Analysis of Tweets on Pivotal ...
A Pipeline for Distributed Topic and Sentiment Analysis of Tweets on Pivotal ...A Pipeline for Distributed Topic and Sentiment Analysis of Tweets on Pivotal ...
A Pipeline for Distributed Topic and Sentiment Analysis of Tweets on Pivotal ...Srivatsan Ramanujam
 
Lecture 3
Lecture 3Lecture 3
Lecture 3Mr SMAK
 
Multiprocessor architecture and programming
Multiprocessor architecture and programmingMultiprocessor architecture and programming
Multiprocessor architecture and programmingRaul Goycoolea Seoane
 
Computer architecture
Computer architecture Computer architecture
Computer architecture Ashish Kumar
 
Presentation on flynn’s classification
Presentation on flynn’s classificationPresentation on flynn’s classification
Presentation on flynn’s classificationvani gupta
 
Flynns classification
Flynns classificationFlynns classification
Flynns classificationYasir Khan
 

Andere mochten auch (16)

Scheduling for Parallel and Multi-Core Systems
Scheduling for Parallel and Multi-Core SystemsScheduling for Parallel and Multi-Core Systems
Scheduling for Parallel and Multi-Core Systems
 
Pipeline parallelism
Pipeline parallelismPipeline parallelism
Pipeline parallelism
 
How to Use OpenMP on Native Activity
How to Use OpenMP on Native ActivityHow to Use OpenMP on Native Activity
How to Use OpenMP on Native Activity
 
Семинар 7. Многопоточное программирование на OpenMP (часть 7)
Семинар 7. Многопоточное программирование на OpenMP (часть 7)Семинар 7. Многопоточное программирование на OpenMP (часть 7)
Семинар 7. Многопоточное программирование на OpenMP (часть 7)
 
Introduccion a MPI
Introduccion a MPIIntroduccion a MPI
Introduccion a MPI
 
Consistency And Parallelism Presentation
Consistency And Parallelism PresentationConsistency And Parallelism Presentation
Consistency And Parallelism Presentation
 
Intel® MPI Library e OpenMP* - Intel Software Conference 2013
Intel® MPI Library e OpenMP* - Intel Software Conference 2013Intel® MPI Library e OpenMP* - Intel Software Conference 2013
Intel® MPI Library e OpenMP* - Intel Software Conference 2013
 
A Pipeline for Distributed Topic and Sentiment Analysis of Tweets on Pivotal ...
A Pipeline for Distributed Topic and Sentiment Analysis of Tweets on Pivotal ...A Pipeline for Distributed Topic and Sentiment Analysis of Tweets on Pivotal ...
A Pipeline for Distributed Topic and Sentiment Analysis of Tweets on Pivotal ...
 
Lecture 3
Lecture 3Lecture 3
Lecture 3
 
Multiprocessor architecture and programming
Multiprocessor architecture and programmingMultiprocessor architecture and programming
Multiprocessor architecture and programming
 
Parallel Computing
Parallel ComputingParallel Computing
Parallel Computing
 
Types of parallelism
Types of parallelismTypes of parallelism
Types of parallelism
 
Computer architecture
Computer architecture Computer architecture
Computer architecture
 
What is Parallelism?
What is Parallelism?What is Parallelism?
What is Parallelism?
 
Presentation on flynn’s classification
Presentation on flynn’s classificationPresentation on flynn’s classification
Presentation on flynn’s classification
 
Flynns classification
Flynns classificationFlynns classification
Flynns classification
 

Ähnlich wie Migration To Multi Core - Parallel Programming Models

Parallel Programming Primer
Parallel Programming PrimerParallel Programming Primer
Parallel Programming PrimerSri Prasanna
 
Parallel Programming Primer 1
Parallel Programming Primer 1Parallel Programming Primer 1
Parallel Programming Primer 1mobius.cn
 
Unmanaged Parallelization via P/Invoke
Unmanaged Parallelization via P/InvokeUnmanaged Parallelization via P/Invoke
Unmanaged Parallelization via P/InvokeDmitri Nesteruk
 
Parallel Computing-Part-1.pptx
Parallel Computing-Part-1.pptxParallel Computing-Part-1.pptx
Parallel Computing-Part-1.pptxkrnaween
 
Parallel Programming on the ANDC cluster
Parallel Programming on the ANDC clusterParallel Programming on the ANDC cluster
Parallel Programming on the ANDC clusterSudhang Shankar
 
Moving Towards a Streaming Architecture
Moving Towards a Streaming ArchitectureMoving Towards a Streaming Architecture
Moving Towards a Streaming ArchitectureGabriele Modena
 
parellel computing
parellel computingparellel computing
parellel computingkatakdound
 
Parallel and Distributed Computing Chapter 8
Parallel and Distributed Computing Chapter 8Parallel and Distributed Computing Chapter 8
Parallel and Distributed Computing Chapter 8AbdullahMunir32
 
Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterog...
Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterog...Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterog...
Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterog...Sara Alvarez
 
Distributed Computing & MapReduce
Distributed Computing & MapReduceDistributed Computing & MapReduce
Distributed Computing & MapReducecoolmirza143
 
Parallel processing (simd and mimd)
Parallel processing (simd and mimd)Parallel processing (simd and mimd)
Parallel processing (simd and mimd)Bhavik Vashi
 
Software architecture for data applications
Software architecture for data applicationsSoftware architecture for data applications
Software architecture for data applicationsDing Li
 
C# Parallel programming
C# Parallel programmingC# Parallel programming
C# Parallel programmingUmeshwaran V
 
5.7 Parallel Processing - Reactive Programming.pdf.pptx
5.7 Parallel Processing - Reactive Programming.pdf.pptx5.7 Parallel Processing - Reactive Programming.pdf.pptx
5.7 Parallel Processing - Reactive Programming.pdf.pptxMohamedBilal73
 

Ähnlich wie Migration To Multi Core - Parallel Programming Models (20)

Parallel Programming Primer
Parallel Programming PrimerParallel Programming Primer
Parallel Programming Primer
 
Parallel Programming Primer 1
Parallel Programming Primer 1Parallel Programming Primer 1
Parallel Programming Primer 1
 
Parallel computation
Parallel computationParallel computation
Parallel computation
 
parallel-computation.pdf
parallel-computation.pdfparallel-computation.pdf
parallel-computation.pdf
 
Unmanaged Parallelization via P/Invoke
Unmanaged Parallelization via P/InvokeUnmanaged Parallelization via P/Invoke
Unmanaged Parallelization via P/Invoke
 
Interpreting the Data:Parallel Analysis with Sawzall
Interpreting the Data:Parallel Analysis with SawzallInterpreting the Data:Parallel Analysis with Sawzall
Interpreting the Data:Parallel Analysis with Sawzall
 
Parallel Computing-Part-1.pptx
Parallel Computing-Part-1.pptxParallel Computing-Part-1.pptx
Parallel Computing-Part-1.pptx
 
Parallel Programming on the ANDC cluster
Parallel Programming on the ANDC clusterParallel Programming on the ANDC cluster
Parallel Programming on the ANDC cluster
 
Moving Towards a Streaming Architecture
Moving Towards a Streaming ArchitectureMoving Towards a Streaming Architecture
Moving Towards a Streaming Architecture
 
parellel computing
parellel computingparellel computing
parellel computing
 
Lecture1
Lecture1Lecture1
Lecture1
 
Chap 1(one) general introduction
Chap 1(one)  general introductionChap 1(one)  general introduction
Chap 1(one) general introduction
 
Parallel and Distributed Computing Chapter 8
Parallel and Distributed Computing Chapter 8Parallel and Distributed Computing Chapter 8
Parallel and Distributed Computing Chapter 8
 
Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterog...
Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterog...Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterog...
Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterog...
 
Distributed Computing & MapReduce
Distributed Computing & MapReduceDistributed Computing & MapReduce
Distributed Computing & MapReduce
 
Parallel processing (simd and mimd)
Parallel processing (simd and mimd)Parallel processing (simd and mimd)
Parallel processing (simd and mimd)
 
Software architecture for data applications
Software architecture for data applicationsSoftware architecture for data applications
Software architecture for data applications
 
C# Parallel programming
C# Parallel programmingC# Parallel programming
C# Parallel programming
 
5.7 Parallel Processing - Reactive Programming.pdf.pptx
5.7 Parallel Processing - Reactive Programming.pdf.pptx5.7 Parallel Processing - Reactive Programming.pdf.pptx
5.7 Parallel Processing - Reactive Programming.pdf.pptx
 
Multicore
MulticoreMulticore
Multicore
 

Mehr von Zvi Avraham

Data isn't the new Oil - it's a new Asset Class!
Data isn't the new Oil - it's a new Asset Class!Data isn't the new Oil - it's a new Asset Class!
Data isn't the new Oil - it's a new Asset Class!Zvi Avraham
 
Functional APIs with Absinthe GraphQL
Functional APIs with Absinthe GraphQLFunctional APIs with Absinthe GraphQL
Functional APIs with Absinthe GraphQLZvi Avraham
 
Ethereum VM and DSLs for Smart Contracts (updated on May 12th 2015)
Ethereum VM and DSLs for Smart Contracts (updated on May 12th 2015)Ethereum VM and DSLs for Smart Contracts (updated on May 12th 2015)
Ethereum VM and DSLs for Smart Contracts (updated on May 12th 2015)Zvi Avraham
 
[http://1PU.SH] Building Wireless Sensor Networks with MQTT-SN, RaspberryPi a...
[http://1PU.SH] Building Wireless Sensor Networks with MQTT-SN, RaspberryPi a...[http://1PU.SH] Building Wireless Sensor Networks with MQTT-SN, RaspberryPi a...
[http://1PU.SH] Building Wireless Sensor Networks with MQTT-SN, RaspberryPi a...Zvi Avraham
 
Erlang - Concurrent Language for Concurrent World
Erlang - Concurrent Language for Concurrent WorldErlang - Concurrent Language for Concurrent World
Erlang - Concurrent Language for Concurrent WorldZvi Avraham
 
Cloud Computing: AWS for Lean Startups
Cloud Computing: AWS for Lean StartupsCloud Computing: AWS for Lean Startups
Cloud Computing: AWS for Lean StartupsZvi Avraham
 

Mehr von Zvi Avraham (10)

Data isn't the new Oil - it's a new Asset Class!
Data isn't the new Oil - it's a new Asset Class!Data isn't the new Oil - it's a new Asset Class!
Data isn't the new Oil - it's a new Asset Class!
 
Functional APIs with Absinthe GraphQL
Functional APIs with Absinthe GraphQLFunctional APIs with Absinthe GraphQL
Functional APIs with Absinthe GraphQL
 
Limited supply
Limited supplyLimited supply
Limited supply
 
TimeSpaceDB
TimeSpaceDBTimeSpaceDB
TimeSpaceDB
 
Erlang on OSv
Erlang on OSvErlang on OSv
Erlang on OSv
 
Ethereum VM and DSLs for Smart Contracts (updated on May 12th 2015)
Ethereum VM and DSLs for Smart Contracts (updated on May 12th 2015)Ethereum VM and DSLs for Smart Contracts (updated on May 12th 2015)
Ethereum VM and DSLs for Smart Contracts (updated on May 12th 2015)
 
[http://1PU.SH] Building Wireless Sensor Networks with MQTT-SN, RaspberryPi a...
[http://1PU.SH] Building Wireless Sensor Networks with MQTT-SN, RaspberryPi a...[http://1PU.SH] Building Wireless Sensor Networks with MQTT-SN, RaspberryPi a...
[http://1PU.SH] Building Wireless Sensor Networks with MQTT-SN, RaspberryPi a...
 
Erlang - Concurrent Language for Concurrent World
Erlang - Concurrent Language for Concurrent WorldErlang - Concurrent Language for Concurrent World
Erlang - Concurrent Language for Concurrent World
 
Cloud Computing: AWS for Lean Startups
Cloud Computing: AWS for Lean StartupsCloud Computing: AWS for Lean Startups
Cloud Computing: AWS for Lean Startups
 
Erlang OTP
Erlang OTPErlang OTP
Erlang OTP
 

Kürzlich hochgeladen

Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Fact vs. Fiction: Autodetecting Hallucinations in LLMs
Fact vs. Fiction: Autodetecting Hallucinations in LLMsFact vs. Fiction: Autodetecting Hallucinations in LLMs
Fact vs. Fiction: Autodetecting Hallucinations in LLMsZilliz
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
Exploring ChatGPT Prompt Hacks To Maximally Optimise Your Queries
Exploring ChatGPT Prompt Hacks To Maximally Optimise Your QueriesExploring ChatGPT Prompt Hacks To Maximally Optimise Your Queries
Exploring ChatGPT Prompt Hacks To Maximally Optimise Your QueriesSanjay Willie
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 

Kürzlich hochgeladen (20)

Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Fact vs. Fiction: Autodetecting Hallucinations in LLMs
Fact vs. Fiction: Autodetecting Hallucinations in LLMsFact vs. Fiction: Autodetecting Hallucinations in LLMs
Fact vs. Fiction: Autodetecting Hallucinations in LLMs
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Exploring ChatGPT Prompt Hacks To Maximally Optimise Your Queries
Exploring ChatGPT Prompt Hacks To Maximally Optimise Your QueriesExploring ChatGPT Prompt Hacks To Maximally Optimise Your Queries
Exploring ChatGPT Prompt Hacks To Maximally Optimise Your Queries
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 

Migration To Multi Core - Parallel Programming Models

  • 1. Migration to Multi-Core Zvi Avraham, CTO [email_address]
  • 2.
  • 3. Multi-Core vs. Many-Core Multi-Core: ≤8 cores/threads Many-Core: >8 cores/threads
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 13.
  • 14.
  • 15. Flynn’s Taxonomy Single Instruction Multiple Instruction Single Data SISD MISD Multiple Data SIMD MIMD
  • 17.
  • 18.
  • 19.
  • 20.
  • 22. Rational’s Capsule ROOM – Real-Time Object Oriented Modeling
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 29.
  • 31. Fork / Join Barries – “joins” Parallel Regions Master Thread
  • 32.
  • 33.
  • 35.
  • 36.
  • 37.
  • 38. CPU Affinity Changing Process and Thread Affinity
  • 39.
  • 40.
  • 41.
  • 42.
  • 43.
  • 44. CPU Affinity – Task Manager
  • 45. CPU Affinity – Task Manager
  • 46.
  • 47.
  • 48.
  • 49.
  • 51.
  • 52.
  • 53. Capabilities Comparison Intel TBB OpenMP Threads Task level parallelism + + - Data decomposition support + + - Complex parallel patterns (non-loops) + - - Generic parallel patterns + - - Scalable nested parallelism support + - - Built-in load balancing + + - Affinity support - + + Static scheduling - + - Concurrent data structures + - - Scalable memory allocator + - - I/O dominated tasks - - + User-level synch. primitives + + - Compiler support is not required + - + Cross OS support + + -
  • 54.
  • 55. Win32 Threads API It’s assumed, that reader/listener is familiar with basic multithreading and corresponding Win32 API
  • 56.
  • 57. Win32 API Hierarchy for Concurrency Windows OS Job Job Process Primary Thread Thread Process Fiber Fiber
  • 58.
  • 59.
  • 60.
  • 61.
  • 62. Win32 Threads API Example: CreateThread
  • 63. Win32 Threads API: Critical Section
  • 66.
  • 67.
  • 68. Our running Example: The PI program Numerical Integration  4.0 (1+x 2 ) dx =  0 1  F(x i )  x   i = 0 N Mathematically, we know that: We can approximate the integral as a sum of rectangles: Where each rectangle has width  x and height F(x i ) at the middle of interval i. F(x) = 4.0/(1+x 2 ) 4.0 2.0 1.0 X 0.0
  • 69.
  • 70. PI Program: an example static long num_steps = 100000; double step; void main () { int i; double x, pi, sum = 0.0; step = 1.0/(double) num_steps; for (i=1;i<= num_steps; i++){ x = (i-0.5)*step; sum = sum + 4.0/(1.0+x*x); } pi = step * sum; }
  • 71. OpenMP PI Program : Parallel for with a reduction #include <omp.h> static long num_steps = 100000; double step; #define NUM_THREADS 2 void main () { int i; double x, pi, sum = 0.0; step = 1.0/(double) num_steps; omp_set_num_threads(NUM_THREADS); #pragma omp parallel for reduction(+:sum) private(x) for (i=1;i<= num_steps; i++){ x = (i-0.5)*step; sum = sum + 4.0/(1.0+x*x); } pi = step * sum; } OpenMP adds 2 to 4 lines of code
  • 73.
  • 74.
  • 75.
  • 76.
  • 77. Intel® TBB Thread Building Blocks C++ Library
  • 80.
  • 81.
  • 82.