Introduction to Parallel
Processor
Chinmay Terse
Rahul Agarwal
Vivek Ashokan
Rahul Nair
What is Parallelism?
• Parallel processing denotes simultaneous computation in the CPU for the purpose of increasing its computation speed
• Parallel processing was introduced because executing instructions strictly in sequence took too long
Classification of Parallel Processor Architectures
SIMD | MIMD
Also called an array processor | Also called a multiprocessor
A single instruction stream is fetched | Multiple instruction streams are fetched
A single control unit fetches the instruction stream from memory | Each processor's control unit fetches its own instruction stream from memory
The instruction is broadcast to multiple processing elements (PEs) | The instruction streams are decoded separately, giving multiple decoded instruction streams (IS)
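The SIMD/MIMD contrast above can be sketched in a few lines of Python. This is only an illustrative model, not real hardware behaviour: the function names `simd_step` and `mimd_step` are made up for this sketch, list elements stand in for processing elements, and threads stand in for independent processors.

```python
from concurrent.futures import ThreadPoolExecutor

def simd_step(instruction, data):
    """SIMD model: ONE instruction is broadcast to every processing element."""
    # Each list element plays the role of one PE's local operand.
    # (Python runs this loop sequentially; real PEs would work in lockstep.)
    return [instruction(x) for x in data]

def mimd_step(programs, data):
    """MIMD model: each processor runs its OWN instruction stream on its own data."""
    with ThreadPoolExecutor(max_workers=len(programs)) as pool:
        futures = [pool.submit(prog, x) for prog, x in zip(programs, data)]
        return [f.result() for f in futures]

data = [1, 2, 3, 4]
simd_result = simd_step(lambda x: x * 2, data)          # same operation everywhere
mimd_result = mimd_step(
    [lambda x: x + 1, lambda x: x * x, abs, lambda x: -x],  # four different programs
    data,
)
print(simd_result)  # [2, 4, 6, 8]
print(mimd_result)  # [2, 4, 3, -4]
```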
Block Diagram of Tightly Coupled Multiprocessor
• Processors share memory
• Communicate via that shared memory
Block Diagram of Loosely Coupled Multiprocessor
Cluster Computer Architecture
Cluster vs. SMP
• Both provide multiprocessor support for high-demand applications
• Both are available commercially
• SMP:
– Easier to manage and control
– Closer to single-processor systems (scheduling is the main difference)
– Less physical space
– Lower power consumption
• Clustering:
– Superior incremental and absolute scalability
– Lower cost
– Superior availability (through redundancy)
Loosely Coupled Multiprocessor | Tightly Coupled Multiprocessor
Communication takes place through a message transfer system | Communication takes place through a processor-memory interconnection network (PMIN)
The processors do not share memory | The processors share memory
Each processor has its own local memory | Each processor also has its own local memory
Referred to as distributed systems | Referred to as shared systems
Interaction between the processors and I/O modules is very low | Interaction between the processors and I/O modules is very high
Each processor is directly connected to its I/O devices | Each processor is connected to I/O devices through an I/O-processor interconnection network (IOPIN)
There is no unmapped local memory | Every processor has unmapped local memory
Communication Models
• Shared Memory
– Processors communicate through a shared address space
– Easy on small-scale machines
– Advantages:
• Model of choice for uniprocessors and small-scale multiprocessors (MPs)
• Ease of programming
• Lower latency
• Easier to use hardware controlled caching
• Message passing
– Processors have private memories,
communicate via messages
– Advantages:
• Less hardware, easier to design
• Focuses attention on costly non-local operations
• Can support either SW model on either HW base
Shared Address/Memory Model
• Communicate via Load and Store
– Oldest and most popular model
• Based on timesharing: processes run on multiple processors instead of sharing a single processor
• Single virtual and physical address space
– The address spaces of multiple processes can overlap (share), but ALL threads of a process share its address space
• Writes to shared address space by one thread are visible to reads of
other threads
– Usual model: share code, private stack, some shared heap, some
private heap
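A minimal sketch of the "usual model" above, using Python threads: each worker keeps a private variable on its own stack and then publishes it with an ordinary store into data that every thread can read. The names `shared_heap` and `worker` are hypothetical, and the lock is an addition of this sketch to keep the read-modify-write on the shared counter safe.

```python
import threading

shared_heap = {"counter": 0}   # shared heap data, visible to every thread
lock = threading.Lock()        # protects the read-modify-write below

def worker(n_increments):
    local_total = 0            # private: lives on this thread's own stack
    for _ in range(n_increments):
        local_total += 1
    with lock:
        # An ordinary store to shared memory; other threads see it on their next load.
        shared_heap["counter"] += local_total

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_count = shared_heap["counter"]
print(final_count)  # 4000
```

No explicit send or receive appears anywhere: communication is implicit in the loads and stores, which is the point of the shared address model.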
Advantages of the Shared-Memory Communication Model
• Compatibility with SMP hardware
• Ease of programming when communication patterns are complex or vary dynamically during execution
• Ability to develop applications using the familiar SMP model, paying attention only to performance-critical accesses
• Lower communication overhead and better use of bandwidth for small items, because communication is implicit and protection is implemented in hardware through memory mapping rather than through the I/O system
• Hardware-controlled caching reduces remote communication by caching all data, both shared and private
Shared Address Model Summary
• Each process can name all data it shares with other processes
• Data transfer via load and store
• Data size: byte, word, ... or cache blocks
• Uses virtual memory to map virtual addresses to local or remote physical memory
• Memory hierarchy model applies: communication now moves data into the local processor's cache (just as a load moves data from memory to cache)
– Open questions: latency, bandwidth (per cache block?), and scalability when communicating
Message Passing Model
• Whole computers (CPU, memory, I/O devices) communicate as explicit I/O
operations
• Send specifies local buffer + receiving process on remote computer
• Receive specifies sending process on remote computer + local buffer to place data
– Usually the send includes a process tag and the receive has a rule on the tag: match one, match any
– Synchronization options: when the send completes, when the buffer is free, when the request is accepted; the receive waits for the send
• Send + receive => memory-to-memory copy, where each side supplies a local address, AND pair-wise synchronization!
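The send/receive pairing with tags described above might look roughly like this. It is a sketch under loud assumptions: the `Mailbox` class and the `ANY_TAG` wildcard are invented for illustration, and a thread with a copied buffer stands in for a remote computer with private memory.

```python
import queue
import threading

ANY_TAG = None  # hypothetical wildcard meaning "match any"

class Mailbox:
    """Incoming-message store owned by one receiving process (assumed design)."""

    def __init__(self):
        self._incoming = queue.Queue()
        self._unmatched = []   # messages that arrived but matched no receive yet

    def send(self, tag, local_buffer):
        # Copy the sender's buffer so the two sides never share memory.
        self._incoming.put((tag, list(local_buffer)))

    def receive(self, tag, local_buffer):
        # Blocks until a message satisfying the tag rule arrives (receive waits for send).
        while True:
            for i, (msg_tag, payload) in enumerate(self._unmatched):
                if tag is ANY_TAG or msg_tag == tag:
                    del self._unmatched[i]
                    local_buffer[:] = payload   # memory-to-memory copy
                    return msg_tag
            self._unmatched.append(self._incoming.get())

mailbox = Mailbox()

def producer():
    mailbox.send("halo", [1, 2, 3])
    mailbox.send("result", [42])

t = threading.Thread(target=producer)
t.start()
result_buf, halo_buf = [], []
first_tag = mailbox.receive("result", result_buf)   # rule "match one": skips "halo"
second_tag = mailbox.receive(ANY_TAG, halo_buf)     # rule "match any"
t.join()
print(first_tag, result_buf)   # result [42]
print(second_tag, halo_buf)    # halo [1, 2, 3]
```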
Message Passing Model (continued)
• Send + receive => memory-to-memory copy, with synchronization through the OS, even on one processor
• History of message passing:
– Network topology was important because a node could only send to its immediate neighbour
– Typically synchronous, blocking send and receive
– Later, DMA with non-blocking sends; DMA for receive into a buffer until the processor does the receive, and then the data is transferred to local memory
– Later, software libraries to allow arbitrary communication
• Example: IBM SP-2, RS/6000 workstations in racks.
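The difference between the early synchronous, blocking send and the later buffered, non-blocking send can be modelled as follows. This is an assumption-laden sketch: the `Channel` class is hypothetical, a `queue.Queue` plays the role of the DMA receive buffer, and threads stand in for separate machines.

```python
import queue
import threading

class Channel:
    """One-way link from a sender to a receiver (modelling assumption)."""

    def __init__(self):
        self._buffer = queue.Queue()   # stands in for the DMA receive buffer

    def send_nonblocking(self, data):
        # Returns at once; the data waits in the buffer until a receive happens.
        self._buffer.put(list(data))

    def send_blocking(self, data):
        # Synchronous send: returns only after the receiver has taken the data.
        self._buffer.put(list(data))
        self._buffer.join()

    def receive(self):
        data = self._buffer.get()      # blocks until a message is available
        self._buffer.task_done()       # releases a sender waiting in send_blocking
        return data

channel = Channel()
events = []
nonblocking_returned = threading.Event()

def sender():
    channel.send_nonblocking([1, 2])
    events.append("non-blocking send returned")
    nonblocking_returned.set()
    channel.send_blocking([3, 4])
    events.append("blocking send returned")

t = threading.Thread(target=sender)
t.start()
nonblocking_returned.wait()   # the first send finished before any receive was posted
first = channel.receive()
second = channel.receive()
t.join()
print(first, second, events)
```

The blocking send cannot return until the main thread has performed both receives, whereas the non-blocking send completes with no receiver present.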
Advantages of the Message-Passing Communication Model
• The hardware can be simpler
• Communication explicit => simpler to understand; in shared memory it can be hard
to know when communicating and when not, and how costly it is
• Explicit communication focuses attention on costly aspect of parallel computation,
sometimes leading to improved structure in multiprocessor program
• Synchronization is naturally associated with sending messages, reducing the
possibility for errors introduced by incorrect synchronization
• Easier to use sender-initiated communication, which may have some advantages in
performance
Applications of Parallel Processing
Summary
■ Serial computers / microprocessors will probably not get much faster - parallelization unavoidable
❑ Pipelining, cache and other optimization strategies for serial computers reaching a plateau
❑ the heat wall has also been reached
■ Application examples
■ Data and functional parallelism
■ Flynn’s taxonomy: SIMD, MISD, MIMD/SPMD
■ Parallel Architectures Intro
❑ Distributed Memory message-passing
❑ Shared Memory
■ Uniform Memory Access
■ Non Uniform Memory Access (distributed shared memory)
■ Parallel program/algorithm examples
