Algorithms
Parallel Algorithms
Page 2
An overview
• A simple parallel algorithm for computing
parallel prefix.
• A parallel merging algorithm
Page 3
• We are given an ordered set $A = \{a_0, a_1, a_2, \ldots, a_{n-1}\}$ of n
elements and a binary associative operator $\oplus$.
• We have to compute the ordered set
$\{a_0,\ a_0 \oplus a_1,\ \ldots,\ a_0 \oplus a_1 \oplus \cdots \oplus a_{n-1}\}$.
Definition of prefix computation
Page 4
• For example, if the operator is + and the input is the
ordered set
{5, 3, -6, 2, 7, 10, -2, 8}
then the output is
{5, 8, 2, 4, 11, 21, 19, 27}
• Prefix sums can be computed in O(n) time
sequentially.
An example of prefix computation
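As a sequential baseline, the example above can be reproduced with a short Python sketch. `itertools.accumulate` is a standard-library function; the name `prefix_sums` is only illustrative and does not come from the slides.

```python
from itertools import accumulate
import operator

def prefix_sums(values, op=operator.add):
    """Sequential inclusive prefix computation using len(values) - 1 calls to op."""
    return list(accumulate(values, op))

example = [5, 3, -6, 2, 7, 10, -2, 8]
print(prefix_sums(example))  # [5, 8, 2, 4, 11, 21, 19, 27]
```

Any binary associative operator can be passed as `op`, for instance the built-in `max`.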
Page 5
First Pass
• For every internal node of the tree, compute
the sum of all the leaves in its subtree in a
bottom-up fashion.
sum[v] := sum[L[v]] + sum[R[v]]
Using a binary tree
Page 6
for d = 0 to log n – 1 do
    for i = 0 to n – 1 by 2^(d+1) do in parallel
        a[i + 2^(d+1) - 1] := a[i + 2^d - 1] + a[i + 2^(d+1) - 1]
• In our example, n = 8, hence the outer loop
iterates 3 times, d = 0, 1, 2.
Parallel prefix computation
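The first pass can be simulated in Python as a rough sketch. It assumes the array length is a power of two, and it runs the inner loop sequentially where a PRAM would execute the iterations in parallel; the function name `up_sweep` is an illustrative choice.

```python
def up_sweep(a):
    """First pass (bottom-up) over a copy of a; len(a) must be a power of two."""
    n = len(a)
    a = list(a)
    levels = n.bit_length() - 1  # equals log2(n) when n is a power of two
    for d in range(levels):
        step = 2 ** (d + 1)
        # The iterations of this loop touch disjoint cells,
        # so they correspond to the "do in parallel" of the pseudocode.
        for i in range(0, n, step):
            a[i + step - 1] = a[i + step // 2 - 1] + a[i + step - 1]
    return a

print(up_sweep([5, 3, -6, 2, 7, 10, -2, 8]))
# [5, 8, -6, 4, 7, 17, -2, 27]
```

After the last iteration the final cell holds the sum of all elements, which is the value stored at the root of the tree.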
Page 7
• d = 0: In this case, i advances in steps of 2^(d+1) = 2
elements.
• for i = 0,
a[0 + 2^(0+1) - 1] := a[0 + 2^0 - 1] + a[0 + 2^(0+1) - 1]
or, a[1] := a[0] + a[1]
When d= 0
Page 8
• d = 1: In this case, i advances in steps of 2^(d+1) = 4
elements.
• for i = 0,
a[0 + 2^(1+1) - 1] := a[0 + 2^1 - 1] + a[0 + 2^(1+1) - 1]
or, a[3] := a[1] + a[3]
• for i = 4,
a[4 + 2^(1+1) - 1] := a[4 + 2^1 - 1] + a[4 + 2^(1+1) - 1]
or, a[7] := a[5] + a[7]
When d = 1
Page 9
[Figure legend: blue marks entries unchanged since the last iteration; magenta marks entries changed in the current iteration.]
The First Pass
Page 10
Second Pass
• The idea in the second pass is to do a
topdown computation to generate all the prefix
sums.
• We use the notation pre[v] to denote the prefix
sum at every node.
The Second Pass
Page 11
• pre[root] := 0, the identity element for the
operation, since we are considering the + operation.
• If the operation is max, the identity element will
be $-\infty$.
Computation in the second phase
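The role of the identity element can be illustrated with a small sequential sketch. The function `exclusive_scan` and its parameters are hypothetical names introduced here; the point is only that the starting value must be the identity of the chosen operator.

```python
import operator

def exclusive_scan(values, op, identity):
    """Exclusive prefix computation: out[i] combines values[0] .. values[i-1]."""
    out = []
    running = identity  # start from the identity, as pre[root] does
    for x in values:
        out.append(running)
        running = op(running, x)
    return out

data = [5, 3, -6, 2, 7, 10, -2, 8]
print(exclusive_scan(data, operator.add, 0))
# [0, 5, 8, 2, 4, 11, 21, 19]
print(exclusive_scan(data, max, float("-inf")))
# [-inf, 5, 5, 5, 5, 7, 10, 10]
```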
Page 12
pre[L[v]] := pre[v]
pre[R[v]] := sum[L[v]] + pre[v]
Second phase (continued)
Page 13
Example of second phase
pre[L[v]] := pre[v]
pre[R[v]] := sum[L[v]] + pre[v]
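Both passes can be sketched on an explicit tree. The version below assumes a complete binary tree stored in heap order (node v has children 2v and 2v + 1, leaves at positions n to 2n - 1) and a power-of-two number of leaves; this layout and the names `tree_prefix` and `total` are assumptions of the sketch, not taken from the slides.

```python
def tree_prefix(leaves):
    """Two-pass prefix sums on an implicit complete binary tree.

    Returns (pre values at the leaves, sum stored at the root).
    """
    n = len(leaves)
    total = [0] * (2 * n)  # total[v] plays the role of sum[v]
    pre = [0] * (2 * n)
    total[n:] = leaves
    # First pass, bottom-up: sum[v] := sum[L[v]] + sum[R[v]]
    for v in range(n - 1, 0, -1):
        total[v] = total[2 * v] + total[2 * v + 1]
    # Second pass, top-down, starting from pre[root] := 0
    pre[1] = 0
    for v in range(1, n):
        pre[2 * v] = pre[v]                       # pre[L[v]] := pre[v]
        pre[2 * v + 1] = total[2 * v] + pre[v]    # pre[R[v]] := sum[L[v]] + pre[v]
    return pre[n:], total[1]

print(tree_prefix([5, 3, -6, 2, 7, 10, -2, 8]))
# ([0, 5, 8, 2, 4, 11, 21, 19], 27)
```

The leaf values of `pre` are the prefix sums that exclude the leaf's own element; the root sum is returned separately because it is the one prefix sum not present at any leaf.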
Page 14
a[n – 1] := 0 (in our example, a[7] is set to 0 before the loop starts)
for d = (log n – 1) downto 0 do
    for i = 0 to n – 1 by 2^(d+1) do in parallel
        temp := a[i + 2^d - 1]
        a[i + 2^d - 1] := a[i + 2^(d+1) - 1] (left child)
        a[i + 2^(d+1) - 1] := temp + a[i + 2^(d+1) - 1] (right child)
Parallel prefix computation
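A sequential simulation of the second pass, again only as a sketch under the assumption that the array length is a power of two. It starts from the array left by the first pass, which for the running example is [5, 8, -6, 4, 7, 17, -2, 27]; the name `down_sweep` is illustrative.

```python
def down_sweep(a):
    """Second pass (top-down) on a copy of the array left by the first pass."""
    n = len(a)
    a = list(a)
    a[n - 1] = 0  # pre[root] := 0, done once before the loop
    levels = n.bit_length() - 1
    for d in range(levels - 1, -1, -1):
        step = 2 ** (d + 1)
        # Iterations are independent: "do in parallel" on a PRAM.
        for i in range(0, n, step):
            left = i + step // 2 - 1
            right = i + step - 1
            temp = a[left]
            a[left] = a[right]           # left child receives pre[v]
            a[right] = temp + a[right]   # right child: sum[L[v]] + pre[v]
    return a

print(down_sweep([5, 8, -6, 4, 7, 17, -2, 27]))
# [0, 5, 8, 2, 4, 11, 21, 19]
```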
Page 15
• We consider the case d = 2 and i = 0
temp := a[0 + 2^2 - 1], i.e., temp := a[3]
a[0 + 2^2 - 1] := a[0 + 2^(2+1) - 1] or, a[3] := a[7]
a[0 + 2^(2+1) - 1] := temp + a[0 + 2^(2+1) - 1] or,
a[7] := temp + a[7], where temp holds the old value of a[3]
Parallel prefix computation
Page 16
[Figure legend: blue marks entries unchanged since the last iteration; magenta marks a left child; brown marks a right child.]
Parallel prefix computation
Page 17
• All the prefix sums except the last one are now
in the leaves of the tree from left to right.
• The prefix sums have to be shifted one position
to the left. Also, the last prefix sum (the sum of
all the elements) should be inserted at the last
leaf.
• The algorithm takes O(log n) time using O(n)
processors.
Exercise: Reduce the processor complexity to
O(n / log n).
Parallel prefix computation
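The final shift can be written in one line. This sketch assumes the overall sum was saved before the last cell was overwritten with 0 at the start of the second pass; `finish_scan` is a hypothetical helper name.

```python
def finish_scan(exclusive, total):
    """Shift the leaf values left by one and put the overall sum in the last leaf."""
    return exclusive[1:] + [total]

print(finish_scan([0, 5, 8, 2, 4, 11, 21, 19], 27))
# [5, 8, 2, 4, 11, 21, 19, 27]
```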
Page 18
Parallel merging through
partitioning
The partitioning strategy consists of:
• Breaking up the given problem into many
independent subproblems of equal size
• Solving the subproblems in parallel
This is similar to the divide-and-conquer
strategy in sequential computing.
Page 19
Partitioning and Merging
Given a set S with a relation $\le$, S is linearly
ordered if for every pair $a, b \in S$,
• either $a \le b$ or $b \le a$.
The merging problem is the following:
Page 20
Partitioning and Merging
Input: Two sorted arrays $A = (a_1, a_2, \ldots, a_m)$ and
$B = (b_1, b_2, \ldots, b_n)$ whose elements are drawn
from a linearly ordered set.
Output: A merged sorted sequence
$C = (c_1, c_2, \ldots, c_{m+n})$.
Page 21
Merging
For example, if A = (2,8,11,13,17,20) and B =
(3,6,10,15,16,73), the merged sequence
C = (2,3,6,8,10,11,13,15,16,17,20,73).
Page 22
Merging
A sequential algorithm
• Simultaneously move two pointers along the
two arrays
• Write the items in sorted order in another
array
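A minimal Python sketch of the two-pointer procedure; the function name `merge` is illustrative.

```python
def merge(a, b):
    """Two-pointer merge of two sorted lists in O(m + n) time."""
    i = j = 0
    c = []
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            c.append(a[i])
            i += 1
        else:
            c.append(b[j])
            j += 1
    # At most one of the two lists still has elements left.
    c.extend(a[i:])
    c.extend(b[j:])
    return c

print(merge([2, 8, 11, 13, 17, 20], [3, 6, 10, 15, 16, 73]))
# [2, 3, 6, 8, 10, 11, 13, 15, 16, 17, 20, 73]
```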
Page 23
Partitioning and Merging
• The complexity of the sequential algorithm is
O(m + n).
• We will use the partitioning strategy for
solving this problem in parallel.
Page 24
Partitioning and Merging
Definitions:
rank($a_i$ : A) is the number of elements in A less
than or equal to $a_i \in A$.
rank($b_i$ : A) is the number of elements in A less
than or equal to $b_i \in B$.
Page 25
Merging
For example, consider the arrays:
A = (2,8,11,13,17,20)
B = (3,6,10,15,16,73)
rank(11 : A) = 3 and rank(11 : B) = 3.
Page 26
Merging
• The position of an element $a_i \in A$ in the sorted
array C is:
rank($a_i$ : A) + rank($a_i$ : B).
For example, the position of 11 in the sorted
array C is:
rank(11 : A) + rank(11 : B) = 3 + 3 = 6.
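The rank-based placement can be tried out with the standard `bisect` module. This is a sketch that assumes all elements of A and B are distinct, since equal elements would be assigned the same position; `rank` and `merge_by_ranking` are illustrative names.

```python
from bisect import bisect_right

def rank(x, seq):
    """Number of elements of the sorted list seq that are <= x."""
    return bisect_right(seq, x)

def merge_by_ranking(a, b):
    """Place every element at position rank(x : A) + rank(x : B).

    Positions are 1-based as in the slides, hence the "- 1" when indexing c.
    """
    c = [None] * (len(a) + len(b))
    for x in a + b:  # each placement is independent of the others
        c[rank(x, a) + rank(x, b) - 1] = x
    return c

A = [2, 8, 11, 13, 17, 20]
B = [3, 6, 10, 15, 16, 73]
print(rank(11, A), rank(11, B))  # 3 3
print(merge_by_ranking(A, B))
# [2, 3, 6, 8, 10, 11, 13, 15, 16, 17, 20, 73]
```

Because every element finds its own position without looking at the others, the placements are natural candidates for parallel execution.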
Page 27
Parallel Merging
• The idea is to decompose the overall merging
problem into many smaller merging
problems.
• When the problem size is sufficiently small,
we will use the sequential algorithm.
Page 28
Merging
• The main task is to generate smaller merging
problems such that:
• Each sequence in such a smaller problem has
O(log m) or O(log n) elements.
• Then we can use the sequential algorithm since
the time complexity will be O(log m + log n).
Page 29
Parallel Merging
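Step 1. Divide the array B into blocks such that each
block has log m elements. Hence there are m/log m
blocks (this count takes B to have m elements; it matches the
input definition when m = n, as in the final theorem).
The last element of the i-th block is $b_{i \log m}$, for
$1 \le i \le m/\log m$.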
Page 30
Parallel Merging
Step 2. We allocate one processor for each last
element in B.
• For a last element $b_{i \log m}$, this processor does
a binary search in the array A to determine two
elements $a_k$, $a_{k+1}$ such that $a_k \le b_{i \log m} \le a_{k+1}$.
• All the m/log m binary searches are done in
parallel and take O(log m) time each.
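Steps 1 and 2 can be sketched sequentially in Python. The block size is taken as the ceiling of the base-2 logarithm of the length of B, an assumption made so that the sketch also works when the logarithm is not an integer; the function name `rank_block_ends` is hypothetical and B is assumed to be non-empty.

```python
from bisect import bisect_right
from math import ceil, log2

def rank_block_ends(a, b):
    """Rank the last element of each block of b in a.

    Returns a list of (last element, number of elements of a that are <= it).
    """
    size = max(1, ceil(log2(len(b))))
    # 0-based indices of the last element of each block of b
    ends = list(range(size - 1, len(b), size))
    if not ends or ends[-1] != len(b) - 1:
        ends.append(len(b) - 1)  # a shorter final block
    # One binary search per block end; the searches are independent.
    return [(b[e], bisect_right(a, b[e])) for e in ends]

A = [2, 8, 11, 13, 17, 20]
B = [3, 6, 10, 15, 16, 73]
print(rank_block_ends(A, B))  # [(10, 2), (73, 6)]
```

In the example the ranks 2 and 6 cut A into the pieces (2, 8) and (11, 13, 17, 20).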
Page 31
Parallel Merging
• After the binary searches are over, the array
A is divided into m/log m blocks.
• There is a one-to-one correspondence
between the blocks in A and B. We call such a pair
of blocks matching blocks.
Page 32
Parallel Merging
• Each block in A is determined in the following
way.
• Consider the two elements $b_{i \log m}$ and $b_{(i+1) \log m}$.
These are the last elements of the i-th and (i + 1)-th
blocks of B, so they bound the (i + 1)-th block.
• The two elements of A that determine rank($b_{i \log m}$
: A) and rank($b_{(i+1) \log m}$ : A) define the
matching block in A.
Page 33
Parallel Merging
• These two matching blocks determine a smaller
merging problem.
• Every element inside a matching block has to be
ranked inside the other matching block.
• Hence, the problem of merging a pair of matching
blocks is an independent subproblem which does
not affect any other block.
Page 34
Parallel Merging
• If the size of each block in A is O(log m), we can
directly run the sequential algorithm on every pair of
matching blocks from A and B.
• Some blocks in A may be larger than O(log m) and
hence we have to do some more work to break
them into smaller blocks.
Page 35
Parallel Merging
If a block $A_i$ is larger than O(log m) and the
matching block of $A_i$ is $B_j$, we do the following:
• We divide $A_i$ into blocks of size O(log m).
• Then we apply the same algorithm to rank the
boundary elements of each block of $A_i$ in $B_j$.
• Now each block in A is of size O(log m).
• This takes O(log log m) time, since each binary search runs over a block $B_j$ of only log m elements.
Page 36
Parallel Merging
Step 3.
• We now take every pair of matching blocks from A
and B and run the sequential merging algorithm.
• One processor is allocated for every matching pair
and this processor merges the pair in O(log m)
time.
We have to analyse the time and processor
complexities of each of the steps to get the overall
complexities.
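The three steps can be put together in a sequential simulation. This is a sketch only: it assumes B is non-empty, takes the block size as the ceiling of the base-2 logarithm of the length of B, and does not subdivide oversized blocks of A, so it illustrates the partitioning idea rather than the stated time bound. `heapq.merge` from the standard library stands in for the sequential merging algorithm, and `partition_merge` is a hypothetical name.

```python
from bisect import bisect_right
from heapq import merge as sequential_merge
from math import ceil, log2

def partition_merge(a, b):
    """Merge sorted lists a and b by merging pairs of matching blocks."""
    size = max(1, ceil(log2(len(b))))
    c = []
    a_start = 0
    for b_start in range(0, len(b), size):
        b_block = b[b_start:b_start + size]
        # The rank of the block's last element in a closes the matching block.
        a_end = bisect_right(a, b_block[-1])
        a_block = a[a_start:a_end]
        # Step 3: each matching pair is an independent small merge.
        c.extend(sequential_merge(a_block, b_block))
        a_start = a_end
    c.extend(a[a_start:])  # elements of a larger than everything in b
    return c

A = [2, 8, 11, 13, 17, 20]
B = [3, 6, 10, 15, 16, 73]
print(partition_merge(A, B))
# [2, 3, 6, 8, 10, 11, 13, 15, 16, 17, 20, 73]
```

In a parallel setting each iteration of the loop would be handled by its own processor, since the output range of every matching pair is known once the ranks are available.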
Page 37
Parallel Merging
Complexity of Step 1
• The task in Step 1 is to partition B into
blocks of size log m.
• We allocate m/log m processors.
• Since B is an array, processor $P_i$, $1 \le i \le m/\log m$,
can find the element $b_{i \log m}$ in O(1) time.
Page 38
Parallel Merging
Complexity of Step 2
• In Step 2, m/log m processors do binary
search in array A in O(log n) time each.
• Hence the time complexity is O(log n) and
the work done is
$\frac{m \log n}{\log m} \le \frac{m \log(m + n)}{\log m} \le m + n$
for $n, m \ge 4$. Hence the total work is O(m + n).
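The chain of inequalities can be sanity-checked numerically. The sketch below uses base-2 logarithms and a small grid of values, both of which are choices made here for illustration; it is a spot check, not a proof.

```python
from math import log2

def step2_work(m, n):
    """Work of Step 2: m / log m binary searches costing log n each."""
    return m * log2(n) / log2(m)

# Check  work <= m log(m + n) / log m <= m + n  for 4 <= m, n <= 64.
for m in range(4, 65):
    for n in range(4, 65):
        middle = m * log2(m + n) / log2(m)
        assert step2_work(m, n) <= middle <= m + n

print(step2_work(16, 256))  # 32.0
```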
Page 39
Parallel Merging
Complexity of Step 3
• In Step 3, we use m/log m processors.
• Each processor merges a pair $A_i$, $B_i$ in O(log m)
time. Hence the total work done is O(m).
Theorem
Let A and B be two sorted sequences each of
length n. A and B can be merged in O(log n) time
using O(n) operations in the CREW PRAM.
