Graph Algorithms
Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar
To accompany the text “Introduction to Parallel Computing”,
Addison Wesley, 2003.
Topic Overview
• Definitions and Representation
• Minimum Spanning Tree: Prim’s Algorithm
• Single-Source Shortest Paths: Dijkstra’s Algorithm
• All-Pairs Shortest Paths
• Transitive Closure
• Connected Components
• Algorithms for Sparse Graphs
Definitions and Representation
• An undirected graph G is a pair (V, E), where V is a finite set of
points called vertices and E is a finite set of edges.
• An edge e ∈ E is an unordered pair (u, v), where u, v ∈ V .
• In a directed graph, the edge e is an ordered pair (u, v). An
edge (u, v) is incident from vertex u and is incident to vertex v.
• A path from a vertex v to a vertex u is a sequence
v0, v1, v2, . . . , vk of vertices where v0 = v, vk = u, and (vi, vi+1) ∈
E for i = 0, 1, . . . , k − 1.
• The length of a path is defined as the number of edges in the
path.
Definitions and Representation
(a) An undirected graph and (b) a directed graph.
Definitions and Representation
• An undirected graph is connected if every pair of vertices is
connected by a path.
• A forest is an acyclic graph, and a tree is a connected acyclic
graph.
• A graph that has weights associated with each edge is called
a weighted graph.
Definitions and Representation
• Graphs can be represented by their adjacency matrix or an
edge (or vertex) list.
• Adjacency matrices have a value ai,j = 1 if nodes i and j share
an edge; 0 otherwise. In case of a weighted graph, ai,j = wi,j,
the weight of the edge.
• The adjacency list representation of a graph G = (V, E) consists
of an array Adj[1..|V |] of lists. Each list Adj[v] is a list of all vertices
adjacent to v.
• For a graph with n nodes, adjacency matrices take Θ(n²) space and adjacency lists take Θ(|E|) space.
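To make the two representations concrete, here is a minimal Python sketch (an illustration, not part of the text; the five-vertex edge list is an assumption chosen to match the figures that follow):

# Build both representations of a small undirected graph.
n = 5
edges = [(1, 2), (2, 3), (2, 5), (3, 5), (4, 5)]   # 1-indexed vertex pairs

# Adjacency matrix: Theta(n^2) space regardless of |E|.
adj_matrix = [[0] * n for _ in range(n)]
for u, v in edges:
    adj_matrix[u - 1][v - 1] = 1
    adj_matrix[v - 1][u - 1] = 1    # undirected, so the matrix is symmetric

# Adjacency list: Theta(|E|) space.
adj_list = {v: [] for v in range(1, n + 1)}
for u, v in edges:
    adj_list[u].append(v)
    adj_list[v].append(u)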
Definitions and Representation
A =
    0 1 0 0 0
    1 0 1 0 1
    0 1 0 0 1
    0 0 0 0 1
    0 1 1 1 0
An undirected graph and its adjacency matrix representation.
Adj[1] = (2)
Adj[2] = (1, 3, 5)
Adj[3] = (2, 5)
Adj[4] = (5)
Adj[5] = (2, 3, 4)
An undirected graph and its adjacency list representation.
Minimum Spanning Tree
• A spanning tree of an undirected graph G is a subgraph of G
that is a tree containing all the vertices of G.
• In a weighted graph, the weight of a subgraph is the sum of
the weights of the edges in the subgraph.
• A minimum spanning tree (MST) for a weighted undirected
graph is a spanning tree with minimum weight.
Minimum Spanning Tree
An undirected graph and its minimum spanning tree.
Minimum Spanning Tree: Prim’s Algorithm
• Prim’s algorithm for finding an MST is a greedy algorithm.
• Start by selecting an arbitrary vertex, include it into the current
MST.
• Grow the current MST by inserting into it the vertex closest to
one of the vertices already in current MST.
Minimum Spanning Tree: Prim’s Algorithm
(a) Original graph; (b) after the first edge has been selected; (c) after the second edge has been selected; (d) the final minimum spanning tree. Each panel also shows the array d[] of distances from the current tree.
Prim’s minimum spanning tree algorithm.
Minimum Spanning Tree: Prim’s Algorithm
1. procedure PRIM MST(V, E, w, r)
2. begin
3. VT := {r};
4. d[r] := 0;
5. for all v ∈ (V − VT ) do
6. if edge (r, v) exists set d[v] := w(r, v);
7. else set d[v] := ∞;
8. while VT ≠ V do
9. begin
10. find a vertex u such that d[u] := min{d[v]|v ∈ (V − VT )};
11. VT := VT ∪ {u};
12. for all v ∈ (V − VT ) do
13. d[v] := min{d[v], w(u, v)};
14. endwhile
15. end PRIM MST
Prim’s sequential minimum spanning tree algorithm.
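The following Python sketch is a direct transcription of this pseudocode (an illustration, not the text's code; it assumes the graph is given as a dict of dicts w, with w[u][v] defined exactly when edge (u, v) exists):

import math

def prim_mst(V, w, r):
    # V: iterable of vertices; w: dict of dicts with w[u][v] = edge weight;
    # r: arbitrary start vertex. Returns the MST as (parent, child, weight) edges.
    d = {v: w[r].get(v, math.inf) for v in V if v != r}   # d[] of the pseudocode
    nearest = {v: r for v in d}        # tree vertex realizing d[v]
    mst_edges = []
    while d:                           # equivalent to "while VT != V"
        u = min(d, key=d.get)          # line 10: cheapest vertex outside VT
        mst_edges.append((nearest[u], u, d[u]))
        del d[u]                       # line 11: u joins VT
        for v in d:                    # lines 12-13: one-edge relaxation
            if w[u].get(v, math.inf) < d[v]:
                d[v], nearest[v] = w[u][v], u
    return mst_edges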
Prim’s Algorithm: Parallel Formulation
• The algorithm works in n outer iterations – it is hard to execute
these iterations concurrently.
• The inner loop is relatively easy to parallelize. Let p be the
number of processes, and let n be the number of vertices.
• The adjacency matrix is partitioned in a 1-D block fashion, with
distance vector d partitioned accordingly.
• In each step, a processor selects the locally closest node, followed by a global reduction to select the globally closest node.
• This node is inserted into MST, and the choice broadcast to all
processors.
• Each processor updates its part of the d vector locally.
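The sketch below simulates one selection step of this formulation in plain Python (an illustration of the 1-D block partitioning and the reduction only; a real implementation would use message passing for the reduction and broadcast rather than a loop over simulated processes):

import math

def select_closest(d, in_tree, p):
    # d: distance array of length n; in_tree: flags marking vertices in VT;
    # p: number of simulated processes, each owning a contiguous block of
    # roughly n/p entries of d (1-D block mapping).
    n = len(d)
    block = (n + p - 1) // p
    local_mins = []
    for pid in range(p):               # each process scans only its block
        lo, hi = pid * block, min((pid + 1) * block, n)
        best = min(((d[v], v) for v in range(lo, hi) if not in_tree[v]),
                   default=(math.inf, -1))
        local_mins.append(best)
    return min(local_mins)             # the global reduction over p local minima

The winning vertex would then be broadcast, and each process would relax only its own block of d, giving the per-iteration cost analyzed below.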
Prim’s Algorithm: Parallel Formulation
The partitioning of the distance array d and the adjacency
matrix A among p processes.
Prim’s Algorithm: Parallel Formulation
• The cost to select the minimum entry is O(n/p + log p).
• The cost of a broadcast is O(log p).
• The cost of the local update of the d vector is O(n/p).
• The parallel time per iteration is O(n/p + log p).
• The total parallel time is given by O(n²/p + n log p).
• The corresponding isoefficiency is O(p² log² p).
Single-Source Shortest Paths
• For a weighted graph G = (V, E, w), the single-source shortest
paths problem is to find the shortest paths from a vertex v ∈ V
to all other vertices in V .
• Dijkstra’s algorithm is similar to Prim’s algorithm. It maintains a
set of nodes for which the shortest paths are known.
• It grows this set by selecting, at each step, the node closest to the source that is reachable through one of the nodes already in the shortest path set.
Single-Source Shortest Paths: Dijkstra’s Algorithm
1. procedure DIJKSTRA SINGLE SOURCE SP(V, E, w, s)
2. begin
3. VT := {s};
4. for all v ∈ (V − VT ) do
5. if (s, v) exists set l[v] := w(s, v);
6. else set l[v] := ∞;
7. while VT ≠ V do
8. begin
9. find a vertex u such that l[u] := min{l[v]|v ∈ (V − VT )};
10. VT := VT ∪ {u};
11. for all v ∈ (V − VT ) do
12. l[v] := min{l[v], l[u] + w(u, v)};
13. endwhile
14. end DIJKSTRA SINGLE SOURCE SP
Dijkstra’s sequential single-source shortest paths algorithm.
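A minimal Python sketch of this pseudocode (an illustration under the same dict-of-dicts convention as the Prim sketch above) makes the similarity explicit: only the update rule of line 12 differs.

import math

def dijkstra(V, w, s):
    # V: iterable of vertices; w: dict of dicts of nonnegative edge weights;
    # s: source. Returns the shortest path distances l[] from s.
    l = {v: w[s].get(v, math.inf) for v in V if v != s}
    dist = {s: 0}
    while l:                           # equivalent to "while VT != V"
        u = min(l, key=l.get)          # line 9: closest vertex outside VT
        dist[u] = l.pop(u)             # line 10: u joins VT
        for v in l:                    # lines 11-12: relax paths through u
            wuv = w[u].get(v, math.inf)
            if dist[u] + wuv < l[v]:
                l[v] = dist[u] + wuv
    return dist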
Dijkstra’s Algorithm: Parallel Formulation
• Very similar to the parallel formulation of Prim’s algorithm for
minimum spanning trees.
• The weighted adjacency matrix is partitioned using the 1-D
block mapping.
• Each process selects, locally, the node closest to the source,
followed by a global reduction to select next node.
• The node is broadcast to all processors and the l-vector
updated.
• The parallel performance of Dijkstra’s algorithm is identical to
that of Prim’s algorithm.
All-Pairs Shortest Paths
• Given a weighted graph G(V, E, w), the all-pairs shortest paths
problem is to find the shortest paths between all pairs of
vertices vi, vj ∈ V .
• A number of algorithms are known for solving this problem.
All-Pairs Shortest Paths: Matrix-Multiplication Based
Algorithm
• Consider the multiplication of the weighted adjacency matrix
with itself – except, in this case, we replace the multiplication
operation in matrix multiplication by addition, and the addition
operation by minimization.
• Notice that the product of weighted adjacency matrix with
itself returns a matrix that contains shortest paths of length 2
between any pair of nodes.
• It follows from this argument that Aⁿ contains all shortest paths.
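A minimal Python sketch of this idea (an illustration, not the text's code) replaces multiplication by addition and addition by minimization, then squares repeatedly; it assumes the matrix stores 0 on the diagonal, w(i, j) on edges, and ∞ elsewhere:

import math

def min_plus_product(A, B):
    # Matrix "multiplication" with + replaced by min and * replaced by +.
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def all_pairs_by_squaring(A):
    # Squaring ceil(log2(n-1)) times yields A^n under (min, +), which
    # holds all shortest path lengths.
    n, power = len(A), 1
    while power < n - 1:               # paths have at most n - 1 edges
        A = min_plus_product(A, A)
        power *= 2
    return A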
Matrix-Multiplication Based Algorithm
A nine-vertex weighted directed graph on the vertices A through I, and its powers under the modified matrix product:

A¹ =
    0  2  3  ∞  ∞  ∞  ∞  ∞  ∞
    ∞  0  ∞  ∞  ∞  1  ∞  ∞  ∞
    ∞  ∞  0  1  2  ∞  ∞  ∞  ∞
    ∞  ∞  ∞  0  ∞  ∞  2  ∞  ∞
    ∞  ∞  ∞  ∞  0  ∞  ∞  ∞  ∞
    ∞  ∞  ∞  ∞  ∞  0  2  3  2
    ∞  ∞  ∞  ∞  1  ∞  0  1  ∞
    ∞  ∞  ∞  ∞  ∞  ∞  ∞  0  ∞
    ∞  ∞  ∞  ∞  ∞  ∞  ∞  1  0

A² =
    0  2  3  4  5  3  ∞  ∞  ∞
    ∞  0  ∞  ∞  ∞  1  3  4  3
    ∞  ∞  0  1  2  ∞  3  ∞  ∞
    ∞  ∞  ∞  0  3  ∞  2  3  ∞
    ∞  ∞  ∞  ∞  0  ∞  ∞  ∞  ∞
    ∞  ∞  ∞  ∞  3  0  2  3  2
    ∞  ∞  ∞  ∞  1  ∞  0  1  ∞
    ∞  ∞  ∞  ∞  ∞  ∞  ∞  0  ∞
    ∞  ∞  ∞  ∞  ∞  ∞  ∞  1  0

A⁴ =
    0  2  3  4  5  3  5  6  5
    ∞  0  ∞  ∞  4  1  3  4  3
    ∞  ∞  0  1  2  ∞  3  4  ∞
    ∞  ∞  ∞  0  3  ∞  2  3  ∞
    ∞  ∞  ∞  ∞  0  ∞  ∞  ∞  ∞
    ∞  ∞  ∞  ∞  3  0  2  3  2
    ∞  ∞  ∞  ∞  1  ∞  0  1  ∞
    ∞  ∞  ∞  ∞  ∞  ∞  ∞  0  ∞
    ∞  ∞  ∞  ∞  ∞  ∞  ∞  1  0

A⁸ = A⁴ (no entry changes further: all shortest paths have been found).
Matrix-Multiplication Based Algorithm
• Aⁿ is computed by doubling powers – i.e., as A, A², A⁴, A⁸, and so on.
• We need log n matrix multiplications, each taking time O(n³).
• The serial complexity of this procedure is O(n³ log n).
• This algorithm is not optimal, since the best known algorithms have complexity O(n³).
Matrix-Multiplication Based Algorithm: Parallel
Formulation
• Each of the log n matrix multiplications can be performed in
parallel.
• We can use n³/log n processors to compute each matrix-matrix product in time log n.
• The entire process takes O(log² n) time.
Dijkstra’s Algorithm
• Execute n instances of the single-source shortest path problem,
one for each of the n source vertices.
• Complexity is O(n³).
Dijkstra’s Algorithm: Parallel Formulation
• Two parallelization strategies – execute each of the n shortest
path problems on a different processor (source partitioned),
or use a parallel formulation of the shortest path problem to
increase concurrency (source parallel).
Dijkstra’s Algorithm: Source Partitioned Formulation
• Use n processors, each processor Pi finds the shortest paths
from vertex vi to all other vertices by executing Dijkstra’s
sequential single-source shortest paths algorithm.
• It requires no interprocess communication (provided that the
adjacency matrix is replicated at all processes).
• The parallel run time of this formulation is Θ(n²).
• While the algorithm is cost optimal, it can only use n processors. Therefore, the isoefficiency due to concurrency is p³.
Dijkstra’s Algorithm: Source Parallel Formulation
• In this case, each of the shortest path problems is further executed in parallel. We can therefore use up to n² processors.
• Given p processors (p > n), each single source shortest path problem is executed by p/n processors.
• Using previous results, this takes time:
      TP = Θ(n³/p) + Θ(n log p),    (1)
  where the first term is computation and the second is communication.
• For cost optimality, we have p = O(n²/log n) and the isoefficiency is Θ((p log p)^1.5).
Floyd’s Algorithm
• For any pair of vertices vi, vj ∈ V, consider all paths from vi to vj whose intermediate vertices belong to the set {v1, v2, . . . , vk}. Let p^(k)_{i,j} (of weight d^(k)_{i,j}) be the minimum-weight path among them.
• If vertex vk is not in the shortest path from vi to vj, then p^(k)_{i,j} is the same as p^(k−1)_{i,j}.
• If vk is in p^(k)_{i,j}, then we can break p^(k)_{i,j} into two paths – one from vi to vk and one from vk to vj. Each of these paths uses vertices from {v1, v2, . . . , vk−1}.
Floyd’s Algorithm
From our observations, the following recurrence relation follows:

    d^(k)_{i,j} = w(vi, vj)                                             if k = 0
    d^(k)_{i,j} = min{ d^(k−1)_{i,j}, d^(k−1)_{i,k} + d^(k−1)_{k,j} }   if k ≥ 1    (2)

This equation must be computed for each pair of nodes and for k = 1, 2, . . . , n. The serial complexity is O(n³).
Floyd’s Algorithm
1. procedure FLOYD ALL PAIRS SP(A)
2. begin
3.    D^(0) := A;
4.    for k := 1 to n do
5.        for i := 1 to n do
6.            for j := 1 to n do
7.                d^(k)_{i,j} := min( d^(k−1)_{i,j}, d^(k−1)_{i,k} + d^(k−1)_{k,j} );
8. end FLOYD ALL PAIRS SP
Floyd’s all-pairs shortest paths algorithm. This program computes the all-pairs
shortest paths of the graph G = (V, E) with adjacency matrix A.
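A minimal Python sketch of this pseudocode (an illustration; it uses the standard observation that iteration k leaves row k and column k unchanged, so D^(k−1) and D^(k) can share storage):

def floyd_all_pairs_sp(A):
    # A: n x n adjacency matrix with 0 on the diagonal, w(i, j) on edges,
    # and float('inf') where no edge exists. Returns D^(n).
    n = len(A)
    D = [row[:] for row in A]          # D^(0) := A
    for k in range(n):                 # iteration k leaves row k and column k
        for i in range(n):             # unchanged, so in-place update is safe
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D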
Floyd’s Algorithm: Parallel Formulation Using 2-D Block
Mapping
• Matrix D^(k) is divided into p blocks of size (n/√p) × (n/√p).
• Each processor updates its part of the matrix during each iteration.
• To compute d^(k)_{l,r}, processor P_{i,j} must get d^(k−1)_{l,k} and d^(k−1)_{k,r}.
• In general, during the kth iteration, each of the √p processes containing part of the kth row sends it to the √p − 1 processes in the same column.
• Similarly, each of the √p processes containing part of the kth column sends it to the √p − 1 processes in the same row.
Floyd’s Algorithm: Parallel Formulation Using 2-D Block
Mapping
(a) Matrix D^(k) distributed by 2-D block mapping into √p × √p subblocks, and (b) the subblock of D^(k) assigned to process P_{i,j}.
Floyd’s Algorithm: Parallel Formulation Using 2-D Block
Mapping
(a) Communication patterns used in the 2-D block mapping. When computing d^(k)_{i,j}, information must be sent to the highlighted process from two other processes along the same row and column. (b) The row and column of √p processes that contain the kth row and column send them along process columns and rows.
Floyd’s Algorithm: Parallel Formulation Using 2-D Block
Mapping
1. procedure FLOYD 2DBLOCK(D^(0))
2. begin
3.    for k := 1 to n do
4.    begin
5.        each process P_{i,j} that has a segment of the kth row of D^(k−1) broadcasts it to the P_{*,j} processes;
6.        each process P_{i,j} that has a segment of the kth column of D^(k−1) broadcasts it to the P_{i,*} processes;
7.        each process waits to receive the needed segments;
8.        each process P_{i,j} computes its part of the D^(k) matrix;
9.    end
10. end FLOYD 2DBLOCK
Floyd's parallel formulation using the 2-D block mapping. P_{*,j} denotes all the processes in the jth column, and P_{i,*} denotes all the processes in the ith row. The matrix D^(0) is the adjacency matrix.
Floyd’s Algorithm: Parallel Formulation Using 2-D Block
Mapping
• During each iteration of the algorithm, the kth row and kth column of processors perform a one-to-all broadcast along their rows/columns.
• The size of this broadcast is n/√p elements, taking time Θ((n log p)/√p).
• The synchronization step takes time Θ(log p).
• The computation time is Θ(n²/p).
• The parallel run time of the 2-D block mapping formulation of Floyd's algorithm is
      TP = Θ(n³/p) + Θ((n²/√p) log p),
  where the first term is computation and the second is communication.
Floyd’s Algorithm: Parallel Formulation Using 2-D Block
Mapping
• The above formulation can use O(n²/log² n) processors cost-optimally.
• The isoefficiency of this formulation is Θ(p^1.5 log³ p).
• This algorithm can be further improved by relaxing the strict
synchronization after each iteration.
Floyd’s Algorithm: Speeding Things Up by Pipelining
• The synchronization step in parallel Floyd's algorithm can be removed without affecting the correctness of the algorithm.
• A process starts working on the kth iteration as soon as it has computed the (k − 1)th iteration and has the relevant parts of the D^(k−1) matrix.
Floyd’s Algorithm: Speeding Things Up by Pipelining
Communication protocol followed in the pipelined 2-D block
mapping formulation of Floyd’s algorithm. Assume that process 4
at time t has just computed a segment of the kth
column of the
D(k−1)
matrix. It sends the segment to processes 3 and 5. These
processes receive the segment at time t + 1 (where the time unit
is the time it takes for a matrix segment to travel over the
communication link between adjacent processes). Similarly,
processes farther away from process 4 receive the segment later.
Process 1 (at the boundary) does not forward the segment after
receiving it.
Floyd’s Algorithm: Speeding Things Up by Pipelining
• In each step, n/√p elements of the first row are sent from process P_{i,j} to P_{i+1,j}.
• Similarly, elements of the first column are sent from process P_{i,j} to process P_{i,j+1}.
• Each such step takes time Θ(n/√p).
• After Θ(√p) steps, process P_{√p,√p} gets the relevant elements of the first row and first column in time Θ(n).
• The values of successive rows and columns follow after time Θ(n²/p) in a pipelined mode.
• Process P_{√p,√p} finishes its share of the shortest path computation in time Θ(n³/p) + Θ(n).
• When process P_{√p,√p} has finished the (n − 1)th iteration, it sends the relevant values of the nth row and column to the other processes.
Floyd’s Algorithm: Speeding Things Up by Pipelining
• The overall parallel run time of this formulation is
      TP = Θ(n³/p) + Θ(n),
  where the first term is computation and the second is communication.
• The pipelined formulation of Floyd's algorithm uses up to O(n²) processes efficiently.
• The corresponding isoefficiency is Θ(p^1.5).
All-pairs Shortest Path: Comparison
The performance and scalability of the all-pairs shortest paths algorithms on various architectures with O(p) bisection bandwidth. Similar run times apply to all k-d cube architectures, provided that processes are properly mapped to the underlying processors.

                              Maximum Number of        Corresponding        Isoefficiency
                              Processes for E = Θ(1)   Parallel Run Time    Function
Dijkstra source-partitioned   Θ(n)                     Θ(n²)                Θ(p³)
Dijkstra source-parallel      Θ(n²/log n)              Θ(n log n)           Θ((p log p)^1.5)
Floyd 1-D block               Θ(n/log n)               Θ(n² log n)          Θ((p log p)³)
Floyd 2-D block               Θ(n²/log² n)             Θ(n log² n)          Θ(p^1.5 log³ p)
Floyd pipelined 2-D block     Θ(n²)                    Θ(n)                 Θ(p^1.5)
Transitive Closure
• If G = (V, E) is a graph, then the transitive closure of G is defined as the graph G* = (V, E*), where E* = {(vi, vj) | there is a path from vi to vj in G}.
• The connectivity matrix of G is a matrix A* = (a*_{i,j}) such that a*_{i,j} = 1 if there is a path from vi to vj or i = j, and a*_{i,j} = ∞ otherwise.
• To compute A*, we assign a weight of 1 to each edge of E and use any of the all-pairs shortest paths algorithms on this weighted graph.
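A minimal Python sketch of this construction (an illustration that uses Floyd's algorithm as the all-pairs solver):

import math

def transitive_closure(adj):
    # adj: n x n 0/1 adjacency matrix. Assign weight 1 to every edge and
    # run Floyd's algorithm; a finite distance then means "path exists".
    n = len(adj)
    D = [[0 if i == j else (1 if adj[i][j] else math.inf) for j in range(n)]
         for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                D[i][j] = min(D[i][j], D[i][k] + D[k][j])
    # Connectivity matrix: a*_{i,j} = 1 if reachable or i = j, inf otherwise.
    return [[1 if D[i][j] != math.inf else math.inf for j in range(n)]
            for i in range(n)]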
Connected Components
The connected components of an undirected graph are the
equivalence classes of vertices under the “is reachable from”
relation.
A graph with three connected components: {1, 2, 3, 4}, {5, 6, 7},
and {8, 9}.
Connected Components: Depth-First Search Based
Algorithm
Perform DFS on the graph to get a forest – each tree in the forest corresponds to a separate connected component.
Part (b) is a depth-first forest obtained from depth-first traversal of
the graph in part (a). Each of these trees is a connected
component of the graph in part (a).
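A minimal Python sketch of the DFS-based algorithm (an illustration; it uses an explicit stack instead of recursion):

def connected_components(adj_list):
    # adj_list: dict mapping each vertex to a list of its neighbors.
    # Each DFS tree of the resulting forest is one connected component.
    visited, components = set(), []
    for start in adj_list:
        if start in visited:
            continue
        visited.add(start)
        stack, comp = [start], []
        while stack:                   # iterative DFS from 'start'
            u = stack.pop()
            comp.append(u)
            for v in adj_list[u]:
                if v not in visited:
                    visited.add(v)
                    stack.append(v)
        components.append(comp)
    return components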
Connected Components: Parallel Formulation
• Partition the graph across processors and run independent
connected component algorithms on each processor. At this
point, we have p spanning forests.
• In the second step, spanning forests are merged pairwise until
only one spanning forest remains.
Connected Components: Parallel Formulation
Computing connected components in parallel. The adjacency
matrix of the graph G in (a) is partitioned into two parts (b). Each
process gets a subgraph of G ((c) and (e)). Each process then
computes the spanning forest of the subgraph ((d) and (f)).
Finally, the two spanning trees are merged to form the solution.
Connected Components: Parallel Formulation
• To merge pairs of spanning forests efficiently, the algorithm uses
disjoint sets of edges.
• We define the following operations on the disjoint sets:
find(x) returns a pointer to the representative element of the
set containing x. Each set has its own unique representative.
union(x, y) unites the sets containing the elements x and y. The
two sets are assumed to be disjoint prior to the operation.
Connected Components: Parallel Formulation
• For merging forest A into forest B, for each edge (u, v) of A, a
find operation is performed to determine if the vertices are in
the same tree of B.
• If not, then the two trees (sets) of B containing u and v are
united by a union operation.
• Otherwise, no union operation is necessary.
• Hence, merging A and B requires at most 2(n − 1) find
operations and (n − 1) union operations.
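A minimal Python sketch of the disjoint-set operations and the merge loop (an illustration; path compression and union by rank are standard refinements assumed here, not requirements of the formulation):

class DisjointSets:
    # Union-find over a fixed vertex set.
    def __init__(self, vertices):
        self.parent = {v: v for v in vertices}
        self.rank = {v: 0 for v in vertices}

    def find(self, x):
        # Returns the unique representative of the set containing x.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        # Unites the (disjoint) sets containing x and y, by rank.
        rx, ry = self.find(x), self.find(y)
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1

def merge_forest(edges_of_A, sets_of_B):
    # Fold forest A into the disjoint sets representing forest B:
    # at most 2(n - 1) find operations and (n - 1) union operations.
    for u, v in edges_of_A:
        if sets_of_B.find(u) != sets_of_B.find(v):
            sets_of_B.union(u, v)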
Connected Components: Parallel 1-D Block Mapping
• The n × n adjacency matrix is partitioned into p blocks.
• Each processor can compute its local spanning forest in time Θ(n²/p).
• Merging is done by embedding a logical tree into the topology.
There are log p merging stages, and each takes time Θ(n). Thus,
the cost due to merging is Θ(n log p).
• During each merging stage, spanning forests are sent between
nearest neighbors. Recall that Θ(n) edges of the spanning
forest are transmitted.
Connected Components: Parallel 1-D Block Mapping
• The parallel run time of the connected-component algorithm
is
      TP = Θ(n²/p) + Θ(n log p),
  where the first term is local computation and the second is forest merging.
• For a cost-optimal formulation, p = O(n/log n). The corresponding isoefficiency is Θ(p² log² p).
Algorithms for Sparse Graphs
A graph G = (V, E) is sparse if |E| is much smaller than |V|².
Examples of sparse graphs: (a) a linear graph, in which each
vertex has two incident edges; (b) a grid graph, in which each
vertex has four incident vertices; and (c) a random sparse graph.
Algorithms for Sparse Graphs
• Dense algorithms can be improved significantly if we make use of the sparseness. For example, the run time of Prim's minimum spanning tree algorithm can be reduced from Θ(n²) to Θ(|E| log n).
• Sparse algorithms use an adjacency list instead of an adjacency matrix.
• Partitioning adjacency lists is more difficult for sparse graphs –
do we balance number of vertices or edges?
• Parallel algorithms typically make use of graph structure or
degree information for performance.
Algorithms for Sparse Graphs
(a) (b)
A street map (a) can be represented by a graph (b). In the
graph shown in (b), each street intersection is a vertex and each
edge is a street segment. The vertices of (b) are the intersections
of (a) marked by dots.
Finding a Maximal Independent Set
A set of vertices I ⊂ V is called independent if no pair of vertices in I is connected via an edge in G. An independent set is called maximal if by including any other vertex not in I, the independence property is violated.
{a, d, i, h} is an independent set
{a, c, j, f, g} is a maximal independent set
{a, d, h, f} is a maximal independent set
Examples of independent and maximal independent sets.
Finding a Maximal Independent Set (MIS)
• Simple algorithms start by initializing the MIS I to be empty and assigning all vertices to a candidate set C.
• A vertex v from C is moved into I, and all vertices adjacent to v are removed from C.
• This process is repeated until C is empty.
• This process is inherently serial!
Finding a Maximal Independent Set (MIS)
• Parallel MIS algorithms use randomization to gain concurrency (Luby's algorithm for graph coloring).
• Initially, each node is in the candidate set C. Each node generates a (unique) random number and communicates it to its neighbors.
• If a node's number exceeds that of all its neighbors, it joins set I. All of its neighbors are removed from C.
• This process continues until C is empty.
• On average, this algorithm converges after O(log |V |) such
steps.
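A sequential Python sketch that simulates Luby's rounds (an illustration; a parallel implementation would generate and exchange the random numbers concurrently, as described above):

import random

def luby_mis(adj, seed=0):
    # adj: dict mapping each vertex to the set of its neighbors.
    rng = random.Random(seed)
    C, I = set(adj), set()
    while C:
        r = {v: rng.random() for v in C}      # fresh random numbers per round
        # A candidate whose number beats all candidate neighbors joins I.
        winners = {v for v in C
                   if all(r[v] > r[u] for u in adj[v] if u in C)}
        I |= winners
        C -= winners                           # winners leave C ...
        for v in winners:
            C -= set(adj[v])                   # ... along with their neighbors
    return I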
Finding a Maximal Independent Set (MIS)
(a) After the 1st random number assignment; (b) after the 2nd random number assignment; (c) the final maximal independent set. The legend distinguishes vertices in the independent set from vertices adjacent to a vertex in the independent set.
The different augmentation steps of Luby’s randomized maximal
independent set algorithm. The numbers inside each vertex
correspond to the random number assigned to the vertex.
Finding a Maximal Independent Set (MIS): Parallel
Formulation
• We use three arrays, each of length n – I, which stores nodes
in MIS, C, which stores the candidate set, and R, the random
numbers.
• Partition C across p processors. Each processor generates the
corresponding values in the R array, and from this, computes
which candidate vertices can enter MIS.
• The C array is updated by deleting all the neighbors of vertices
that entered MIS.
• The performance of this algorithm is dependent on the
structure of the graph.
Single-Source Shortest Paths
• Dijkstra's algorithm, modified to handle sparse graphs, is called Johnson's algorithm.
• The modification accounts for the fact that the minimization
step in Dijkstra’s algorithm needs to be performed only for those
nodes adjacent to the previously selected nodes.
• Johnson’s algorithm uses a priority queue Q to store the value
l[v] for each vertex v ∈ (V − VT ).
Single-Source Shortest Paths: Johnson’s Algorithm
1. procedure JOHNSON SINGLE SOURCE SP(V, E, s)
2. begin
3. Q := V ;
4. for all v ∈ Q do
5. l[v] := ∞;
6. l[s] := 0;
7. while Q ≠ ∅ do
8. begin
9. u := extract min(Q);
10. for each v ∈ Adj[u] do
11. if v ∈ Q and l[u] + w(u, v) < l[v] then
12. l[v] := l[u] + w(u, v);
13. endwhile
14. end JOHNSON SINGLE SOURCE SP
Johnson’s sequential single-source shortest paths algorithm.
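A minimal Python sketch of this pseudocode (an illustration; Python's heapq offers no decrease-key, so the standard lazy-deletion workaround replaces the in-place update of line 12):

import heapq
import math

def johnson_sssp(adj, s):
    # adj: dict mapping u to a list of (v, w(u, v)) pairs; s: the source.
    l = {v: math.inf for v in adj}
    l[s] = 0
    Q, finished = [(0, s)], set()
    while Q:
        lu, u = heapq.heappop(Q)        # line 9: extract_min(Q)
        if u in finished:
            continue                    # stale queue entry, skip it
        finished.add(u)
        for v, wuv in adj[u]:           # lines 10-12: scan only Adj[u]
            if v not in finished and lu + wuv < l[v]:
                l[v] = lu + wuv
                heapq.heappush(Q, (l[v], v))
    return l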
Single-Source Shortest Paths: Parallel Johnson’s
Algorithm
• Maintaining strict order of Johnson’s algorithm generally leads
to a very restrictive class of parallel algorithms.
• We need to allow exploration of multiple nodes concurrently.
This is done by simultaneously extracting p nodes from the
priority queue, updating the neighbors’ cost, and augmenting
the shortest path.
• If an error is made, it can be discovered (as a shorter path) and
the node can be reinserted with this shorter path.
Single-Source Shortest Paths: Parallel Johnson’s
Algorithm
A nine-vertex weighted graph (vertices a through i) processed from source a. Several vertices are extracted from the priority queue and processed in each step; the successive priority queue contents are:

(1) b:1, d:7, c:∞, e:∞, f:∞, g:∞, h:∞, i:∞
(2) e:3, c:4, g:10, f:∞, h:∞, i:∞
(3) h:4, f:6, i:∞
(4) g:5, i:6

and the array l[] converges to the final shortest path distances.
An example of the modified Johnson’s algorithm for processing
unsafe vertices concurrently.
Single-Source Shortest Paths: Parallel Johnson’s
Algorithm
• Even if we can extract and process multiple nodes from the
queue, the queue itself is a major bottleneck.
• For this reason, we use multiple queues, one for each processor.
Each processor builds its priority queue only using its own
vertices.
• When process Pi extracts the vertex u ∈ Vi, it sends a message
to processes that store vertices adjacent to u.
• Process Pj, upon receiving this message, sets the value of l[v]
stored in its priority queue to min{l[v], l[u] + w(u, v)}.
Single-Source Shortest Paths: Parallel Johnson’s
Algorithm
• If a shorter path has been discovered to node v, it is reinserted
back into the local priority queue.
• The algorithm terminates only when all the queues become
empty.
• A number of node partitioning schemes can be used to exploit graph structure for performance.
Weitere ähnliche Inhalte

Was ist angesagt?

Unit26 shortest pathalgorithm
Unit26 shortest pathalgorithmUnit26 shortest pathalgorithm
Unit26 shortest pathalgorithmmeisamstar
 
Floyd warshall algo {dynamic approach}
Floyd warshall algo {dynamic approach}Floyd warshall algo {dynamic approach}
Floyd warshall algo {dynamic approach}Shubham Shukla
 
Single source stortest path bellman ford and dijkstra
Single source stortest path bellman ford and dijkstraSingle source stortest path bellman ford and dijkstra
Single source stortest path bellman ford and dijkstraRoshan Tailor
 
04 greedyalgorithmsii 2x2
04 greedyalgorithmsii 2x204 greedyalgorithmsii 2x2
04 greedyalgorithmsii 2x2MuradAmn
 
Shortest path search for real road networks and dynamic costs with pgRouting
Shortest path search for real road networks and dynamic costs with pgRoutingShortest path search for real road networks and dynamic costs with pgRouting
Shortest path search for real road networks and dynamic costs with pgRoutingantonpa
 
Dijkstra's algorithm presentation
Dijkstra's algorithm presentationDijkstra's algorithm presentation
Dijkstra's algorithm presentationSubid Biswas
 
Shortest path problem
Shortest path problemShortest path problem
Shortest path problemIfra Ilyas
 
Algorithm Design and Complexity - Course 8
Algorithm Design and Complexity - Course 8Algorithm Design and Complexity - Course 8
Algorithm Design and Complexity - Course 8Traian Rebedea
 
Algorithm Design and Complexity - Course 7
Algorithm Design and Complexity - Course 7Algorithm Design and Complexity - Course 7
Algorithm Design and Complexity - Course 7Traian Rebedea
 
Algorithms explained
Algorithms explainedAlgorithms explained
Algorithms explainedPIYUSH Dubey
 
Parallel Coordinate Descent Algorithms
Parallel Coordinate Descent AlgorithmsParallel Coordinate Descent Algorithms
Parallel Coordinate Descent AlgorithmsShaleen Kumar Gupta
 
Lecture note4coordinatedescent
Lecture note4coordinatedescentLecture note4coordinatedescent
Lecture note4coordinatedescentXudong Sun
 
Ram minimum spanning tree
Ram   minimum spanning treeRam   minimum spanning tree
Ram minimum spanning treeRama Prasath A
 
Answers withexplanations
Answers withexplanationsAnswers withexplanations
Answers withexplanationsGopi Saiteja
 
Ee693 questionshomework
Ee693 questionshomeworkEe693 questionshomework
Ee693 questionshomeworkGopi Saiteja
 
minimum spanning trees Algorithm
minimum spanning trees Algorithm minimum spanning trees Algorithm
minimum spanning trees Algorithm sachin varun
 

Was ist angesagt? (20)

Unit26 shortest pathalgorithm
Unit26 shortest pathalgorithmUnit26 shortest pathalgorithm
Unit26 shortest pathalgorithm
 
Floyd warshall algo {dynamic approach}
Floyd warshall algo {dynamic approach}Floyd warshall algo {dynamic approach}
Floyd warshall algo {dynamic approach}
 
Single source stortest path bellman ford and dijkstra
Single source stortest path bellman ford and dijkstraSingle source stortest path bellman ford and dijkstra
Single source stortest path bellman ford and dijkstra
 
04 greedyalgorithmsii 2x2
04 greedyalgorithmsii 2x204 greedyalgorithmsii 2x2
04 greedyalgorithmsii 2x2
 
Shortest path search for real road networks and dynamic costs with pgRouting
Shortest path search for real road networks and dynamic costs with pgRoutingShortest path search for real road networks and dynamic costs with pgRouting
Shortest path search for real road networks and dynamic costs with pgRouting
 
Dijkstra's algorithm presentation
Dijkstra's algorithm presentationDijkstra's algorithm presentation
Dijkstra's algorithm presentation
 
Shortest path problem
Shortest path problemShortest path problem
Shortest path problem
 
Algorithm Design and Complexity - Course 8
Algorithm Design and Complexity - Course 8Algorithm Design and Complexity - Course 8
Algorithm Design and Complexity - Course 8
 
Algorithm Design and Complexity - Course 7
Algorithm Design and Complexity - Course 7Algorithm Design and Complexity - Course 7
Algorithm Design and Complexity - Course 7
 
Cpsc125 ch6sec3
Cpsc125 ch6sec3Cpsc125 ch6sec3
Cpsc125 ch6sec3
 
Algorithms explained
Algorithms explainedAlgorithms explained
Algorithms explained
 
Parallel Coordinate Descent Algorithms
Parallel Coordinate Descent AlgorithmsParallel Coordinate Descent Algorithms
Parallel Coordinate Descent Algorithms
 
Subquad multi ff
Subquad multi ffSubquad multi ff
Subquad multi ff
 
19 Minimum Spanning Trees
19 Minimum Spanning Trees19 Minimum Spanning Trees
19 Minimum Spanning Trees
 
Lecture note4coordinatedescent
Lecture note4coordinatedescentLecture note4coordinatedescent
Lecture note4coordinatedescent
 
Ram minimum spanning tree
Ram   minimum spanning treeRam   minimum spanning tree
Ram minimum spanning tree
 
Answers withexplanations
Answers withexplanationsAnswers withexplanations
Answers withexplanations
 
Ee693 questionshomework
Ee693 questionshomeworkEe693 questionshomework
Ee693 questionshomework
 
04 greedyalgorithmsii
04 greedyalgorithmsii04 greedyalgorithmsii
04 greedyalgorithmsii
 
minimum spanning trees Algorithm
minimum spanning trees Algorithm minimum spanning trees Algorithm
minimum spanning trees Algorithm
 

Andere mochten auch

Andere mochten auch (6)

Part4 graph algorithms
Part4 graph algorithmsPart4 graph algorithms
Part4 graph algorithms
 
XXL Graph Algorithms__HadoopSummit2010
XXL Graph Algorithms__HadoopSummit2010XXL Graph Algorithms__HadoopSummit2010
XXL Graph Algorithms__HadoopSummit2010
 
Algorithms of graph
Algorithms of graphAlgorithms of graph
Algorithms of graph
 
1535 graph algorithms
1535 graph algorithms1535 graph algorithms
1535 graph algorithms
 
Ch18
Ch18Ch18
Ch18
 
Graph algorithm
Graph algorithmGraph algorithm
Graph algorithm
 

Ähnlich wie Graph Algorithms Overview

Ähnlich wie Graph Algorithms Overview (20)

Chap10 slides
Chap10 slidesChap10 slides
Chap10 slides
 
Lecture_10_Parallel_Algorithms_Part_II.ppt
Lecture_10_Parallel_Algorithms_Part_II.pptLecture_10_Parallel_Algorithms_Part_II.ppt
Lecture_10_Parallel_Algorithms_Part_II.ppt
 
12_Graph.pptx
12_Graph.pptx12_Graph.pptx
12_Graph.pptx
 
APznzaZLM_MVouyxM4cxHPJR5BC-TAxTWqhQJ2EywQQuXStxJTDoGkHdsKEQGd4Vo7BS3Q1npCOMV...
APznzaZLM_MVouyxM4cxHPJR5BC-TAxTWqhQJ2EywQQuXStxJTDoGkHdsKEQGd4Vo7BS3Q1npCOMV...APznzaZLM_MVouyxM4cxHPJR5BC-TAxTWqhQJ2EywQQuXStxJTDoGkHdsKEQGd4Vo7BS3Q1npCOMV...
APznzaZLM_MVouyxM4cxHPJR5BC-TAxTWqhQJ2EywQQuXStxJTDoGkHdsKEQGd4Vo7BS3Q1npCOMV...
 
Dijkstra
DijkstraDijkstra
Dijkstra
 
Advances in Directed Spanners
Advances in Directed SpannersAdvances in Directed Spanners
Advances in Directed Spanners
 
Unit ix graph
Unit   ix    graph Unit   ix    graph
Unit ix graph
 
Ppt 1
Ppt 1Ppt 1
Ppt 1
 
Topological Sort
Topological SortTopological Sort
Topological Sort
 
Unit 9 graph
Unit   9 graphUnit   9 graph
Unit 9 graph
 
Unit II_Graph.pptxkgjrekjgiojtoiejhgnltegjte
Unit II_Graph.pptxkgjrekjgiojtoiejhgnltegjteUnit II_Graph.pptxkgjrekjgiojtoiejhgnltegjte
Unit II_Graph.pptxkgjrekjgiojtoiejhgnltegjte
 
Data structure and algorithm
Data structure and algorithmData structure and algorithm
Data structure and algorithm
 
Graphs
GraphsGraphs
Graphs
 
Graph
GraphGraph
Graph
 
ADA - Minimum Spanning Tree Prim Kruskal and Dijkstra
ADA - Minimum Spanning Tree Prim Kruskal and Dijkstra ADA - Minimum Spanning Tree Prim Kruskal and Dijkstra
ADA - Minimum Spanning Tree Prim Kruskal and Dijkstra
 
Metric dimesion of circulsnt graphs
Metric dimesion of circulsnt graphsMetric dimesion of circulsnt graphs
Metric dimesion of circulsnt graphs
 
DIJKSTRA_123.pptx
DIJKSTRA_123.pptxDIJKSTRA_123.pptx
DIJKSTRA_123.pptx
 
Finding Dense Subgraphs
Finding Dense SubgraphsFinding Dense Subgraphs
Finding Dense Subgraphs
 
Analysis and design of a half hypercube interconnection network topology
Analysis and design of a half hypercube interconnection network topologyAnalysis and design of a half hypercube interconnection network topology
Analysis and design of a half hypercube interconnection network topology
 
DAA_Presentation - Copy.pptx
DAA_Presentation - Copy.pptxDAA_Presentation - Copy.pptx
DAA_Presentation - Copy.pptx
 

Kürzlich hochgeladen

Rights of under-trial Prisoners in India
Rights of under-trial Prisoners in IndiaRights of under-trial Prisoners in India
Rights of under-trial Prisoners in IndiaAbheet Mangleek
 
Comparison of GenAI benchmarking models for legal use cases
Comparison of GenAI benchmarking models for legal use casesComparison of GenAI benchmarking models for legal use cases
Comparison of GenAI benchmarking models for legal use casesritwikv20
 
Attestation presentation under Transfer of property Act
Attestation presentation under Transfer of property ActAttestation presentation under Transfer of property Act
Attestation presentation under Transfer of property Act2020000445musaib
 
Special Accounting Areas - Hire purchase agreement
Special Accounting Areas - Hire purchase agreementSpecial Accounting Areas - Hire purchase agreement
Special Accounting Areas - Hire purchase agreementShubhiSharma858417
 
Illinois Department Of Corrections reentry guide
Illinois Department Of Corrections reentry guideIllinois Department Of Corrections reentry guide
Illinois Department Of Corrections reentry guideillinoisworknet11
 
The Patents Act 1970 Notes For College .pptx
The Patents Act 1970 Notes For College .pptxThe Patents Act 1970 Notes For College .pptx
The Patents Act 1970 Notes For College .pptxAdityasinhRana4
 
Analysis on Law of Domicile under Private International laws.
Analysis on Law of Domicile under Private International laws.Analysis on Law of Domicile under Private International laws.
Analysis on Law of Domicile under Private International laws.2020000445musaib
 
Presentation1.pptx on sedition is a good legal point
Presentation1.pptx on sedition is a good legal pointPresentation1.pptx on sedition is a good legal point
Presentation1.pptx on sedition is a good legal pointMohdYousuf40
 
THE INDIAN CONTRACT ACT 1872 NOTES FOR STUDENTS
THE INDIAN CONTRACT ACT 1872 NOTES FOR STUDENTSTHE INDIAN CONTRACT ACT 1872 NOTES FOR STUDENTS
THE INDIAN CONTRACT ACT 1872 NOTES FOR STUDENTSRoshniSingh312153
 
Grey Area of the Information Technology Act, 2000.pptx
Grey Area of the Information Technology Act, 2000.pptxGrey Area of the Information Technology Act, 2000.pptx
Grey Area of the Information Technology Act, 2000.pptxBharatMunjal4
 
Conditions Restricting Transfer Under TPA,1882
Conditions Restricting Transfer Under TPA,1882Conditions Restricting Transfer Under TPA,1882
Conditions Restricting Transfer Under TPA,18822020000445musaib
 
Understanding Cyber Crime Litigation: Key Concepts and Legal Frameworks
Understanding Cyber Crime Litigation: Key Concepts and Legal FrameworksUnderstanding Cyber Crime Litigation: Key Concepts and Legal Frameworks
Understanding Cyber Crime Litigation: Key Concepts and Legal FrameworksFinlaw Associates
 
Alexis OConnell mugshot Lexileeyogi 512-840-8791
Alexis OConnell mugshot Lexileeyogi 512-840-8791Alexis OConnell mugshot Lexileeyogi 512-840-8791
Alexis OConnell mugshot Lexileeyogi 512-840-8791BlayneRush1
 
昆士兰科技大学毕业证学位证成绩单-补办步骤澳洲毕业证书
昆士兰科技大学毕业证学位证成绩单-补办步骤澳洲毕业证书昆士兰科技大学毕业证学位证成绩单-补办步骤澳洲毕业证书
昆士兰科技大学毕业证学位证成绩单-补办步骤澳洲毕业证书1k98h0e1
 
Vanderburgh County Sheriff says he will Not Raid Delta 8 Shops
Vanderburgh County Sheriff says he will Not Raid Delta 8 ShopsVanderburgh County Sheriff says he will Not Raid Delta 8 Shops
Vanderburgh County Sheriff says he will Not Raid Delta 8 ShopsAbdul-Hakim Shabazz
 
Guide for Drug Education and Vice Control.docx
Guide for Drug Education and Vice Control.docxGuide for Drug Education and Vice Control.docx
Guide for Drug Education and Vice Control.docxjennysansano2
 
Hungarian legislation made by Robert Miklos
Hungarian legislation made by Robert MiklosHungarian legislation made by Robert Miklos
Hungarian legislation made by Robert Miklosbeduinpower135
 
Alexis O'Connell lexileeyogi Bond revocation for drug arrest Alexis Lee
Alexis O'Connell lexileeyogi Bond revocation for drug arrest Alexis LeeAlexis O'Connell lexileeyogi Bond revocation for drug arrest Alexis Lee
Alexis O'Connell lexileeyogi Bond revocation for drug arrest Alexis LeeBlayneRush1
 
Sarvesh Raj IPS - A Journey of Dedication and Leadership.pptx
Sarvesh Raj IPS - A Journey of Dedication and Leadership.pptxSarvesh Raj IPS - A Journey of Dedication and Leadership.pptx
Sarvesh Raj IPS - A Journey of Dedication and Leadership.pptxAnto Jebin
 
Alexis O'Connell Lexileeyogi 512-840-8791
Alexis O'Connell Lexileeyogi 512-840-8791Alexis O'Connell Lexileeyogi 512-840-8791
Alexis O'Connell Lexileeyogi 512-840-8791BlayneRush1
 

Kürzlich hochgeladen (20)

Rights of under-trial Prisoners in India
Rights of under-trial Prisoners in IndiaRights of under-trial Prisoners in India
Rights of under-trial Prisoners in India
 
Comparison of GenAI benchmarking models for legal use cases
Comparison of GenAI benchmarking models for legal use casesComparison of GenAI benchmarking models for legal use cases
Comparison of GenAI benchmarking models for legal use cases
 
Attestation presentation under Transfer of property Act
Attestation presentation under Transfer of property ActAttestation presentation under Transfer of property Act
Attestation presentation under Transfer of property Act
 
Special Accounting Areas - Hire purchase agreement
Special Accounting Areas - Hire purchase agreementSpecial Accounting Areas - Hire purchase agreement
Special Accounting Areas - Hire purchase agreement
 
Illinois Department Of Corrections reentry guide
Illinois Department Of Corrections reentry guideIllinois Department Of Corrections reentry guide
Illinois Department Of Corrections reentry guide
 
The Patents Act 1970 Notes For College .pptx
The Patents Act 1970 Notes For College .pptxThe Patents Act 1970 Notes For College .pptx
The Patents Act 1970 Notes For College .pptx
 
Analysis on Law of Domicile under Private International laws.
Analysis on Law of Domicile under Private International laws.Analysis on Law of Domicile under Private International laws.
Analysis on Law of Domicile under Private International laws.
 
Presentation1.pptx on sedition is a good legal point
Presentation1.pptx on sedition is a good legal pointPresentation1.pptx on sedition is a good legal point
Presentation1.pptx on sedition is a good legal point
 
THE INDIAN CONTRACT ACT 1872 NOTES FOR STUDENTS
THE INDIAN CONTRACT ACT 1872 NOTES FOR STUDENTSTHE INDIAN CONTRACT ACT 1872 NOTES FOR STUDENTS
THE INDIAN CONTRACT ACT 1872 NOTES FOR STUDENTS
 
Grey Area of the Information Technology Act, 2000.pptx
Grey Area of the Information Technology Act, 2000.pptxGrey Area of the Information Technology Act, 2000.pptx
Grey Area of the Information Technology Act, 2000.pptx
 
Conditions Restricting Transfer Under TPA,1882
Conditions Restricting Transfer Under TPA,1882Conditions Restricting Transfer Under TPA,1882
Conditions Restricting Transfer Under TPA,1882
 
Understanding Cyber Crime Litigation: Key Concepts and Legal Frameworks
Understanding Cyber Crime Litigation: Key Concepts and Legal FrameworksUnderstanding Cyber Crime Litigation: Key Concepts and Legal Frameworks
Understanding Cyber Crime Litigation: Key Concepts and Legal Frameworks
 
Alexis OConnell mugshot Lexileeyogi 512-840-8791
Alexis OConnell mugshot Lexileeyogi 512-840-8791Alexis OConnell mugshot Lexileeyogi 512-840-8791
Alexis OConnell mugshot Lexileeyogi 512-840-8791
 
昆士兰科技大学毕业证学位证成绩单-补办步骤澳洲毕业证书
昆士兰科技大学毕业证学位证成绩单-补办步骤澳洲毕业证书昆士兰科技大学毕业证学位证成绩单-补办步骤澳洲毕业证书
昆士兰科技大学毕业证学位证成绩单-补办步骤澳洲毕业证书
 
Vanderburgh County Sheriff says he will Not Raid Delta 8 Shops
Vanderburgh County Sheriff says he will Not Raid Delta 8 ShopsVanderburgh County Sheriff says he will Not Raid Delta 8 Shops
Vanderburgh County Sheriff says he will Not Raid Delta 8 Shops
 
Guide for Drug Education and Vice Control.docx
Guide for Drug Education and Vice Control.docxGuide for Drug Education and Vice Control.docx
Guide for Drug Education and Vice Control.docx
 
Hungarian legislation made by Robert Miklos
Hungarian legislation made by Robert MiklosHungarian legislation made by Robert Miklos
Hungarian legislation made by Robert Miklos
 
Alexis O'Connell lexileeyogi Bond revocation for drug arrest Alexis Lee
Alexis O'Connell lexileeyogi Bond revocation for drug arrest Alexis LeeAlexis O'Connell lexileeyogi Bond revocation for drug arrest Alexis Lee
Alexis O'Connell lexileeyogi Bond revocation for drug arrest Alexis Lee
 
Sarvesh Raj IPS - A Journey of Dedication and Leadership.pptx
Sarvesh Raj IPS - A Journey of Dedication and Leadership.pptxSarvesh Raj IPS - A Journey of Dedication and Leadership.pptx
Sarvesh Raj IPS - A Journey of Dedication and Leadership.pptx
 
Alexis O'Connell Lexileeyogi 512-840-8791
Alexis O'Connell Lexileeyogi 512-840-8791Alexis O'Connell Lexileeyogi 512-840-8791
Alexis O'Connell Lexileeyogi 512-840-8791
 

Graph Algorithms Overview

  • 1. Graph Algorithms Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar To accompany the text “Introduction to Parallel Computing”, Addison Wesley, 2003.
  • 2. Topic Overview • Definitions and Representation • Minimum Spanning Tree: Prim’s Algorithm • Single-Source Shortest Paths: Dijkstra’s Algorithm • All-Pairs Shortest Paths • Transitive Closure • Connected Components • Algorithms for Sparse Graphs
  • 3. Definitions and Representation • An undirected graph G is a pair (V, E), where V is a finite set of points called vertices and E is a finite set of edges. • An edge e ∈ E is an unordered pair (u, v), where u, v ∈ V . • In a directed graph, the edge e is an ordered pair (u, v). An edge (u, v) is incident from vertex u and is incident to vertex v. • A path from a vertex v to a vertex u is a sequence v0, v1, v2, . . . , vk of vertices where v0 = v, vk = u, and (vi, vi+1) ∈ E for i = 0, 1, . . . , k − 1. • The length of a path is defined as the number of edges in the path.
  • 4. Definitions and Representation 1 2 3 45 6 1 2 3 4 56 f e (a) (b) (a) An undirected graph and (b) a directed graph.
  • 5. Definitions and Representation • An undirected graph is connected if every pair of vertices is connected by a path. • A forest is an acyclic graph, and a tree is a connected acyclic graph. • A graph that has weights associated with each edge is called a weighted graph.
  • 6. Definitions and Representation • Graphs can be represented by their adjacency matrix or an edge (or vertex) list. • Adjacency matrices have a value ai,j = 1 if nodes i and j share an edge; 0 otherwise. In case of a weighted graph, ai,j = wi,j, the weight of the edge. • The adjacency list representation of a graph G = (V, E) consists of an array Adj[1..|V |] of lists. Each list Adj[v] is a list of all vertices adjacent to v. • For a grapn with n nodes, adjacency matrices take Theta(n2 ) space and adjacency list takes Θ(|E|) space.
  • 7. Definitions and Representation 1 32 4 5 0 1 0 0 0 1 0 1 0 1 0 1 0 0 1 0 0 0 0 1 0 1 1 1 0 A = An undirected graph and its adjacency matrix representation. 3 1 2 4 5 1 2 3 4 5 2 2 5 5 2 3 4 1 53 An undirected graph and its adjacency list representation.
  • 8. Minimum Spanning Tree • A spanning tree of an undirected graph G is a subgraph of G that is a tree containing all the vertices of G. • In a weighted graph, the weight of a subgraph is the sum of the weights of the edges in the subgraph. • A minimum spanning tree (MST) for a weighted undirected graph is a spanning tree with minimum weight.
  • 9. Minimum Spanning Tree 2 4 1 2 3 5 28 4 3 2 1 3 2 An undirected graph and its minimum spanning tree.
  • 10. Minimum Spanning Tree: Prim’s Algorithm • Prim’s algorithm for finding an MST is a greedy algorithm. • Start by selecting an arbitrary vertex, include it into the current MST. • Grow the current MST by inserting into it the vertex closest to one of the vertices already in current MST.
  • 11. Minimum Spanning Tree: Prim’s Algorithm 11 5 0 d[] d[] 1 12 4 3 3 1 1 2 5 1 4 b a f d c e 3 3 1 1 2 5 1 4 b a f d c e 3 (d) Final minimum spanning tree (c) has been selected After the second edge After the first edge has been selected (b) 3 1 1 2 5 1 4 b a f d c e 3 3 1 1 2 5 1 4 b a f d c e 3 5 5 5 5 0 3 1 0 0 2 1 0 1 4 0 5 0 3a b c d e f 0 1 3 1 5 5 2 1 1 4 5 2 0 3 1 0 0 2 1 0 1 4 0 5 0 3a b c d e f d[] 3 1 5 5 2 1 1 4 5 2 0 3 1 0 0 2 1 0 1 4 0 5 0 3a b c d e f d[] 3 1 5 5 2 1 1 4 5 2 2 4 ca b d e ca b d e ca b d e 0 3 1 0 0 2 1 0 1 4 0 5 0 3a b c d e f 0 3 1 5 5 2 1 1 4 5 2 ca b d e f f f f (a) Original graph 0 11 2 1 3 1 PSfrag replacements ∞∞ ∞ ∞ ∞ ∞∞ ∞ ∞ ∞ ∞ ∞∞ ∞ ∞ ∞ ∞ ∞∞ ∞ ∞ ∞ ∞ ∞∞ ∞ ∞ ∞ ∞ ∞∞ ∞ ∞ ∞ ∞ ∞∞ ∞ ∞ ∞ ∞ ∞ ∞∞ ∞ ∞ ∞ ∞ ∞∞ ∞
  • 12. Prim’s minimum spanning tree algorithm.
  • 13. Minimum Spanning Tree: Prim’s Algorithm 1. procedure PRIM MST(V, E, w, r) 2. begin 3. VT := {r}; 4. d[r] := 0; 5. for all v ∈ (V − VT ) do 6. if edge (r, v) exists set d[v] := w(r, v); 7. else set d[v] := ∞; 8. while VT = V do 9. begin 10. find a vertex u such that d[u] := min{d[v]|v ∈ (V − VT )}; 11. VT := VT ∪ {u}; 12. for all v ∈ (V − VT ) do 13. d[v] := min{d[v], w(u, v)}; 14. endwhile 15. end PRIM MST Prim’s sequential minimum spanning tree algorithm.
  • 14. Prim’s Algorithm: Parallel Formulation • The algorithm works in n outer iterations – it is hard to execute these iterations concurrently. • The inner loop is relatively easy to parallelize. Let p be the number of processes, and let n be the number of vertices. • The adjacency matrix is partitioned in a 1-D block fashion, with distance vector d partitioned accordingly. • In each step, a processor selects the locally closest node. followed by a global reduction to select globally closest node. • This node is inserted into MST, and the choice broadcast to all processors. • Each processor updates its part of the d vector locally.
  • 15. Prim’s Algorithm: Parallel Formulation iProcessors 0 1 p-1 (b) (a) PSfrag replacements n p n d[1..n] A The partitioning of the distance array d and the adjacency matrix A among p processes.
  • 16. Prim’s Algorithm: Parallel Formulation • The cost to select the minimum entry is O(n/p + log p). • The cost of a broadcast is O(log p). • The cost of local updation of the d vector is O(n/p). • The parallel time per iteration is O(n/p + log p). • The total parallel time is given by O(n2 /p + n log p). • The corresponding isoefficiency is O(p2 log2 p).
  • 17. Single-Source Shortest Paths • For a weighted graph G = (V, E, w), the single-source shortest paths problem is to find the shortest paths from a vertex v ∈ V to all other vertices in V . • Dijkstra’s algorithm is similar to Prim’s algorithm. It maintains a set of nodes for which the shortest paths are known. • It grows this set based on the node closest to source using one of the nodes in the current shortest path set.
  • 18. Single-Source Shortest Paths: Dijkstra’s Algorithm 1. procedure DIJKSTRA SINGLE SOURCE SP(V, E, w, s) 2. begin 3. VT := {s}; 4. for all v ∈ (V − VT ) do 5. if (s, v) exists set l[v] := w(s, v); 6. else set l[v] := ∞; 7. while VT = V do 8. begin 9. find a vertex u such that l[u] := min{l[v]|v ∈ (V − VT )}; 10. VT := VT ∪ {u}; 11. for all v ∈ (V − VT ) do 12. l[v] := min{l[v], l[u] + w(u, v)}; 13. endwhile 14. end DIJKSTRA SINGLE SOURCE SP Dijkstra’s sequential single-source shortest paths algorithm.
  • 19. Dijkstra’s Algorithm: Parallel Formulation • Very similar to the parallel formulation of Prim’s algorithm for minimum spanning trees. • The weighted adjacency matrix is partitioned using the 1-D block mapping. • Each process selects, locally, the node closest to the source, followed by a global reduction to select next node. • The node is broadcast to all processors and the l-vector updated. • The parallel performance of Dijkstra’s algorithm is identical to that of Prim’s algorithm.
  • 20. All-Pairs Shortest Paths • Given a weighted graph G(V, E, w), the all-pairs shortest paths problem is to find the shortest paths between all pairs of vertices vi, vj ∈ V . • A number of algorithms are known for solving this problem.
  • 21. All-Pairs Shortest Paths: Matrix-Multiplication Based Algorithm • Consider the multiplication of the weighted adjacency matrix with itself – except, in this case, we replace the multiplication operation in matrix multiplication by addition, and the addition operation by minimization. • Notice that the product of weighted adjacency matrix with itself returns a matrix that contains shortest paths of length 2 between any pair of nodes. • It follows from this argument that An contains all shortest paths.
  • 22. Matrix-Multiplication Based Algorithm 1 2 21 2 2 3 2 3 1 1 1 G FB E H D I C A A1 = 0 B B B B B B B B B B B B B @ 0 2 3 ∞ ∞ ∞ ∞ ∞ ∞ ∞ 0 ∞ ∞ ∞ 1 ∞ ∞ ∞ ∞ ∞ 0 1 2 ∞ ∞ ∞ ∞ ∞ ∞ ∞ 0 ∞ ∞ 2 ∞ ∞ ∞ ∞ ∞ ∞ 0 ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞ 0 2 3 2 ∞ ∞ ∞ ∞ 1 ∞ 0 1 ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞ 0 ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞ 1 0 1 C C C C C C C C C C C C C A A2 = 0 B B B B B B B B B B B B B @ 0 2 3 4 5 3 ∞ ∞ ∞ ∞ 0 ∞ ∞ ∞ 1 3 4 3 ∞ ∞ 0 1 2 ∞ 3 ∞ ∞ ∞ ∞ ∞ 0 3 ∞ 2 3 ∞ ∞ ∞ ∞ ∞ 0 ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞ 3 0 2 3 2 ∞ ∞ ∞ ∞ 1 ∞ 0 1 ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞ 0 ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞ 1 0 1 C C C C C C C C C C C C C A A4 = 0 B B B B B B B B B B B B B @ 0 2 3 4 5 3 5 6 5 ∞ 0 ∞ ∞ 4 1 3 4 3 ∞ ∞ 0 1 2 ∞ 3 4 ∞ ∞ ∞ ∞ 0 3 ∞ 2 3 ∞ ∞ ∞ ∞ ∞ 0 ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞ 3 0 2 3 2 ∞ ∞ ∞ ∞ 1 ∞ 0 1 ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞ 0 ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞ 1 0 1 C C C C C C C C C C C C C A A8 = 0 B B B B B B B B B B B B B @ 0 2 3 4 5 3 5 6 5 ∞ 0 ∞ ∞ 4 1 3 4 3 ∞ ∞ 0 1 2 ∞ 3 4 ∞ ∞ ∞ ∞ 0 3 ∞ 2 3 ∞ ∞ ∞ ∞ ∞ 0 ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞ 3 0 2 3 2 ∞ ∞ ∞ ∞ 1 ∞ 0 1 ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞ 0 ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞ 1 0 1 C C C C C C C C C C C C C A
  • 23. Matrix-Multiplication Based Algorithm • An is computed by doubling powers – i.e., as A, A2 , A4 , A8 , and so on. • We need log n matrix multiplications, each taking time O(n3 ). • The serial complexity of this procedure is O(n3 log n). • This algorithm is not optimal, since the best known algorithms have complexity O(n3 ).
  • 24. Matrix-Multiplication Based Algorithm: Parallel Formulation • Each of the log n matrix multiplications can be performed in parallel. • We can use n3 / log n processors to compute each matrix-matrix product in time log n. • The entire process takes O(log2 n) time.
  • 25. Dijkstra’s Algorithm • Execute n instances of the single-source shortest path problem, one for each of the n source vertices. • Complexity is O(n3 ).
  • 26. Dijkstra’s Algorithm: Parallel Formulation • Two parallelization strategies – execute each of the n shortest path problems on a different processor (source partitioned), or use a parallel formulation of the shortest path problem to increase concurrency (source parallel).
  • 27. Dijkstra’s Algorithm: Source Partitioned Formulation • Use n processors, each processor Pi finds the shortest paths from vertex vi to all other vertices by executing Dijkstra’s sequential single-source shortest paths algorithm. • It requires no interprocess communication (provided that the adjacency matrix is replicated at all processes). • The parallel run time of this formulation is: Θ(n2 ). • While the algorithm is cost optimal, it can only use n processors. Therefore, the isoefficiency due to concurrency is p3 .
  • 28. Dijkstra’s Algorithm: Source Parallel Formulation • In this case, each of the shortest path problems is further executed in parallel. We can therefore use up to n2 processors. • Given p processors (p > n), each single source shortest path problem is executed by p/n processors. • Using previous results, this takes time: TP = computation Θ n3 p + communication Θ(n log p). (1) • For cost optimality, we have p = O(n2 / log n) and the isoefficiency is Θ((p log p)1.5 ).
  • 29. Floyd’s Algorithm • For any pair of vertices vi, vj ∈ V , consider all paths from vi to vj whose intermediate vertices belong to the set {v1, v2, . . . , vk}. Let p (k) i,j (of weight d (k) i,j be the minimum-weight path among them. • If vertex vk is not in the shortest path from vi to vj, then p (k) i,j is the same as p (k−1) i,j . • If f vk is in p (k) i,j , then we can break p (k) i,j into two paths – one from vi to vk and one from vk to vj. Each of these paths uses vertices from {v1, v2, . . . , vk−1}.
  • 30. Floyd’s Algorithm From our observations, the following recurrence relation follows: d (k) i,j = w(vi, vj) if k = 0 min d (k−1) i,j , d (k−1) i,k + d (k−1) k,j if k ≥ 1 (2) This equation must be computed for each pair of nodes and for k = 1, n. The serial complexity is O(n3 ).
  • 31. Floyd’s Algorithm 1. procedure FLOYD ALL PAIRS SP(A) 2. begin 3. D(0) = A; 4. for k := 1 to n do 5. for i := 1 to n do 6. for j := 1 to n do 7. d (k) i,j := min “ d (k−1) i,j , d (k−1) i,k + d (k−1) k,j ” ; 8. end FLOYD ALL PAIRS SP Floyd’s all-pairs shortest paths algorithm. This program computes the all-pairs shortest paths of the graph G = (V, E) with adjacency matrix A.
  • 32. Floyd’s Algorithm: Parallel Formulation Using 2-D Block Mapping • Matrix D(k) is divided into p blocks of size (n/ √ p) × (n/ √ p). • Each processor updates its part of the matrix during each iteration. • To compute d (k) l,r processor Pi,j must get d (k−1) l,k and d (k−1) k,r . • In general, during the kth iteration, each of the √ p processes containing part of the kth row send it to the √ p − 1 processes in the same column. • Similarly, each of the √ p processes containing part of the kth column sends it to the √ p − 1 processes in the same row.
  • 33. Floyd’s Algorithm: Parallel Formulation Using 2-D Block Mapping (a) (2,1) (i,j) (b) (1,1) (1,2) PSfrag replacements n√ p n√ p (i − 1) n√ p + 1, (j − 1) n√ p + 1 i n√ p , j n√ p (a) Matrix D(k) distributed by 2-D block mapping into √ p × √ p subblocks, and (b) the subblock of D(k) assigned to process Pi,j.
  • 34. Floyd’s Algorithm: Parallel Formulation Using 2-D Block Mapping (a) (b) k k column k column row PSfrag replacements d (k−1) l,k d (k−1) k,r d (k) l,r (a) Communication patterns used in the 2-D block mapping. When computing d (k) i,j , information must be sent to the highlighted process from two other processes along the same row and column. (b) The row and column of √ p processes that contain the kth row and column send them along process columns and rows.
  • 35. Floyd’s Algorithm: Parallel Formulation Using 2-D Block Mapping 1. procedure FLOYD 2DBLOCK(D(0) ) 2. begin 3. for k := 1 to n do 4. begin 5. each process Pi,j that has a segment of the kth row of D(k−1) ; broadcasts it to the P∗,j processes; 6. each process Pi,j that has a segment of the kth column of D(k−1) ; broadcasts it to the Pi,∗ processes; 7. each process waits to receive the needed segments; 8. each process Pi,j computes its part of the D(k) matrix; 9. end 10. end FLOYD 2DBLOCK Floyd’s parallel formulation using the 2-D block mapping. P∗,j denotes all the processes in the jth column, and Pi,∗ denotes all the processes in the ith row. The matrix D(0) is the adjacency matrix.
• 36. Floyd's Algorithm: Parallel Formulation Using 2-D Block Mapping • During each iteration of the algorithm, the kth row and kth column of processes perform a one-to-all broadcast along their rows/columns. • The size of this broadcast is n/√p elements, taking time Θ((n log p)/√p). • The synchronization step takes time Θ(log p). • The computation time is Θ(n^2/p). • The parallel run time of the 2-D block mapping formulation of Floyd's algorithm is T_P = Θ(n^3/p) (computation) + Θ((n^2/√p) log p) (communication).
• 37. Floyd's Algorithm: Parallel Formulation Using 2-D Block Mapping • The above formulation can use O(n^2/log^2 n) processors cost-optimally. • The isoefficiency of this formulation is Θ(p^1.5 log^3 p). • This algorithm can be further improved by relaxing the strict synchronization after each iteration.
• 38. Floyd's Algorithm: Speeding Things Up by Pipelining • The synchronization step in parallel Floyd's algorithm can be removed without affecting the correctness of the algorithm. • A process starts working on the kth iteration as soon as it has computed the (k − 1)th iteration and has the relevant parts of the D^(k−1) matrix.
• 39. Floyd's Algorithm: Speeding Things Up by Pipelining [Figure: processes 1–10 on one axis, time t, t + 1, . . . , t + 5 on the other] Communication protocol followed in the pipelined 2-D block mapping formulation of Floyd's algorithm. Assume that process 4 at time t has just computed a segment of the kth column of the D^(k−1) matrix. It sends the segment to processes 3 and 5. These processes receive the segment at time t + 1 (where the time unit is the time it takes for a matrix segment to travel over the communication link between adjacent processes). Similarly, processes farther away from process 4 receive the segment later. Process 1 (at the boundary) does not forward the segment after receiving it.
• 40. Floyd's Algorithm: Speeding Things Up by Pipelining • In each step, n/√p elements of the first row are sent from process P_{i,j} to P_{i+1,j}. • Similarly, elements of the first column are sent from process P_{i,j} to process P_{i,j+1}. • Each such step takes time Θ(n/√p). • After Θ(√p) steps, process P_{√p,√p} gets the relevant elements of the first row and first column in time Θ(n). • The values of successive rows and columns follow after time Θ(n^2/p) in a pipelined mode. • Process P_{√p,√p} finishes its share of the shortest path computation in time Θ(n^3/p) + Θ(n). • When process P_{√p,√p} has finished the (n − 1)th iteration, it sends the relevant values of the nth row and column to the other processes.
• 41. Floyd's Algorithm: Speeding Things Up by Pipelining • The overall parallel run time of this formulation is T_P = Θ(n^3/p) (computation) + Θ(n) (communication). • The pipelined formulation of Floyd's algorithm uses up to O(n^2) processes efficiently. • The corresponding isoefficiency is Θ(p^1.5).
• 42. All-pairs Shortest Path: Comparison The performance and scalability of the all-pairs shortest paths algorithms on various architectures with O(p) bisection bandwidth. Similar run times apply to all k-d cube architectures, provided that processes are properly mapped to the underlying processors.

Algorithm                     Max. Processes for E = Θ(1)   Parallel Run Time   Isoefficiency Function
Dijkstra source-partitioned   Θ(n)                          Θ(n^2)              Θ(p^3)
Dijkstra source-parallel      Θ(n^2/log n)                  Θ(n log n)          Θ((p log p)^1.5)
Floyd 1-D block               Θ(n/log n)                    Θ(n^2 log n)        Θ((p log p)^3)
Floyd 2-D block               Θ(n^2/log^2 n)                Θ(n log^2 n)        Θ(p^1.5 log^3 p)
Floyd pipelined 2-D block     Θ(n^2)                        Θ(n)                Θ(p^1.5)
• 43. Transitive Closure • If G = (V, E) is a graph, then the transitive closure of G is defined as the graph G∗ = (V, E∗), where E∗ = {(v_i, v_j) | there is a path from v_i to v_j in G}. • The connectivity matrix of G is a matrix A∗ = (a∗_{i,j}) such that a∗_{i,j} = 1 if there is a path from v_i to v_j or i = j, and a∗_{i,j} = ∞ otherwise. • To compute A∗ we assign a weight of 1 to each edge of E and use any of the all-pairs shortest paths algorithms on this weighted graph.
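A short sketch of this reduction, reusing the hypothetical floyd_all_pairs_sp function from the earlier Floyd sketch: assign weight 1 to every edge, run all-pairs shortest paths, and mark a∗_{i,j} = 1 wherever a finite-length path exists.

import math

def transitive_closure(adj01):
    """adj01: 0/1 adjacency matrix. Returns the connectivity matrix A*,
    with 1 where a path exists (or i = j) and math.inf otherwise."""
    n = len(adj01)
    W = [[0 if i == j else (1 if adj01[i][j] else math.inf)
          for j in range(n)] for i in range(n)]
    D = floyd_all_pairs_sp(W)          # any all-pairs algorithm works here
    return [[1 if D[i][j] < math.inf else math.inf for j in range(n)]
            for i in range(n)]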
• 44. Connected Components The connected components of an undirected graph are the equivalence classes of vertices under the “is reachable from” relation. [Figure] A graph with three connected components: {1, 2, 3, 4}, {5, 6, 7}, and {8, 9}.
• 45. Connected Components: Depth-First Search Based Algorithm Perform DFS on the graph to get a forest – each tree in the forest corresponds to a separate connected component. [Figure] Part (b) is a depth-first forest obtained from depth-first traversal of the graph in part (a). Each of these trees is a connected component of the graph in part (a).
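A minimal sketch of the DFS-based approach in Python, using an explicit stack to avoid recursion limits; the adjacency-list encoding (a dict of lists) is our assumption.

def connected_components(adj, n):
    """adj: dict v -> list of neighbours, vertices numbered 0..n-1.
    Returns the components as lists of vertices."""
    seen = [False] * n
    components = []
    for root in range(n):
        if seen[root]:
            continue
        seen[root] = True
        stack, comp = [root], []
        while stack:                   # one DFS tree = one component
            u = stack.pop()
            comp.append(u)
            for v in adj.get(u, []):
                if not seen[v]:
                    seen[v] = True
                    stack.append(v)
        components.append(comp)
    return components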
  • 46. Connected Components: Parallel Formulation • Partition the graph across processors and run independent connected component algorithms on each processor. At this point, we have p spanning forests. • In the second step, spanning forests are merged pairwise until only one spanning forest remains.
• 47. Connected Components: Parallel Formulation [Figure] Computing connected components in parallel. The adjacency matrix of the graph G in (a) is partitioned into two parts (b). Each process gets a subgraph of G ((c) and (e)). Each process then computes the spanning forest of the subgraph ((d) and (f)). Finally, the two spanning trees are merged to form the solution.
  • 48. Connected Components: Parallel Formulation • To merge pairs of spanning forests efficiently, the algorithm uses disjoint sets of edges. • We define the following operations on the disjoint sets: find(x) returns a pointer to the representative element of the set containing x. Each set has its own unique representative. union(x, y) unites the sets containing the elements x and y. The two sets are assumed to be disjoint prior to the operation.
  • 49. Connected Components: Parallel Formulation • For merging forest A into forest B, for each edge (u, v) of A, a find operation is performed to determine if the vertices are in the same tree of B. • If not, then the two trees (sets) of B containing u and v are united by a union operation. • Otherwise, no union operation is necessary. • Hence, merging A and B requires at most 2(n − 1) find operations and (n − 1) union operations.
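A compact sketch of this merge step using a disjoint-set structure. The path-halving find is one standard implementation choice, not something the text prescribes, and merge_forests is a hypothetical helper showing the at most 2(n − 1) find and (n − 1) union operations.

class DisjointSets:
    """Disjoint sets over vertices 0..n-1 with path-halving find."""
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x
    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx != ry:
            self.parent[rx] = ry

def merge_forests(edges_of_A, ds_of_B):
    """Fold each edge (u, v) of forest A into the disjoint sets of B."""
    for u, v in edges_of_A:
        if ds_of_B.find(u) != ds_of_B.find(v):   # u, v in different trees of B
            ds_of_B.union(u, v)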
  • 50. Connected Components: Parallel 1-D Block Mapping • The n × n adjacency matrix is partitioned into p blocks. • Each processor can compute its local spanning forest in time Θ(n2 /p). • Merging is done by embedding a logical tree into the topology. There are log p merging stages, and each takes time Θ(n). Thus, the cost due to merging is Θ(n log p). • During each merging stage, spanning forests are sent between nearest neighbors. Recall that Θ(n) edges of the spanning forest are transmitted.
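Putting the pieces together, a sequential sketch of the 1-D block scheme: each "process" computes a spanning forest from its strip of rows, and the forests are then folded together with the disjoint-set operations above. Folding them one after another (rather than pairwise up a logical tree) is a simplification for illustration; cc_1d_block is a hypothetical name.

def cc_1d_block(A, n, p):
    """A: n x n 0/1 adjacency matrix, split into p strips of rows."""
    forests = []
    for r in range(p):                 # local spanning forest per strip
        ds = DisjointSets(n)
        forest = []
        for i in range(r * n // p, (r + 1) * n // p):
            for j in range(n):
                if A[i][j] and ds.find(i) != ds.find(j):
                    ds.union(i, j)
                    forest.append((i, j))
        forests.append(forest)
    final = DisjointSets(n)            # merge the p spanning forests
    for forest in forests:
        merge_forests(forest, final)
    return final                       # its sets are the components of G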
• 51. Connected Components: Parallel 1-D Block Mapping • The parallel run time of the connected-component algorithm is T_P = Θ(n^2/p) (local computation) + Θ(n log p) (forest merging). • For a cost-optimal formulation p = O(n/log n). The corresponding isoefficiency is Θ(p^2 log^2 p).
• 52. Algorithms for Sparse Graphs A graph G = (V, E) is sparse if |E| is much smaller than |V|^2. [Figure] Examples of sparse graphs: (a) a linear graph, in which each vertex has two incident edges; (b) a grid graph, in which each vertex has four incident edges; and (c) a random sparse graph.
• 53. Algorithms for Sparse Graphs • Dense algorithms can be improved significantly if we make use of the sparseness. For example, the run time of Prim's minimum spanning tree algorithm can be reduced from Θ(n^2) to Θ(|E| log n), as the heap-based sketch below illustrates. • Sparse algorithms use an adjacency list instead of an adjacency matrix. • Partitioning adjacency lists is more difficult for sparse graphs – do we balance the number of vertices or the number of edges? • Parallel algorithms typically make use of graph structure or degree information for performance.
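As one illustration of the Θ(|E| log n) bound, here is a heap-based ("lazy") Prim sketch on an adjacency list. The lazy-deletion trick stands in for a decrease-key operation and is our implementation choice, not something the text specifies.

import heapq

def prim_sparse(adj, n):
    """adj: dict u -> list of (v, w); assumes G is connected.
    Returns the MST weight in O(|E| log n) time."""
    in_tree = [False] * n
    heap, total = [(0, 0)], 0          # (key, vertex); grow from vertex 0
    while heap:
        key, u = heapq.heappop(heap)
        if in_tree[u]:
            continue                   # stale entry: u joined the tree earlier
        in_tree[u] = True
        total += key
        for v, w in adj.get(u, []):
            if not in_tree[v]:
                heapq.heappush(heap, (w, v))
    return total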
  • 54. Algorithms for Sparse Graphs (a) (b) A street map (a) can be represented by a graph (b). In the graph shown in (b), each street intersection is a vertex and each edge is a street segment. The vertices of (b) are the intersections of (a) marked by dots.
• 55. Finding a Maximal Independent Set A set of vertices I ⊂ V is called independent if no pair of vertices in I is connected via an edge in G. An independent set is called maximal if by including any other vertex not in I, the independence property is violated. [Figure] In the example graph on vertices {a, b, c, d, e, f, g, h, i, j}: {a, d, i, h} is an independent set; {a, c, j, f, g} is a maximal independent set; {a, d, h, f} is a maximal independent set. Examples of independent and maximal independent sets.
• 56. Finding a Maximal Independent Set (MIS) • Simple algorithms start by setting the MIS I to be empty and assigning all vertices to a candidate set C. • A vertex v from C is moved into I and all vertices adjacent to v are removed from C. • This process is repeated until C is empty. • This process is inherently serial!
• 57. Finding a Maximal Independent Set (MIS) • Parallel MIS algorithms use randomization to gain concurrency (Luby's algorithm for graph coloring). • Initially, each node is in the candidate set C. Each node generates a (unique) random number and communicates it to its neighbors. • If a node's number exceeds that of all its neighbors, it joins set I. All of its neighbors are removed from C. • This process continues until C is empty. • On average, this algorithm converges after O(log |V|) such steps.
• 58. Finding a Maximal Independent Set (MIS) [Figure: vertices are shaded as either "vertex in the independent set" or "vertex adjacent to a vertex in the independent set"; panels show (a) after the 1st random number assignment, (b) after the 2nd random number assignment, and (c) the final maximal independent set] The different augmentation steps of Luby's randomized maximal independent set algorithm. The numbers inside each vertex correspond to the random number assigned to the vertex.
• 59. Finding a Maximal Independent Set (MIS): Parallel Formulation • We use three arrays, each of length n: I, which stores nodes in the MIS; C, which stores the candidate set; and R, the random numbers. • Partition C across p processors. Each processor generates the corresponding values in the R array, and from this, computes which candidate vertices can enter the MIS. • The C array is updated by deleting all the neighbors of vertices that entered the MIS. • The performance of this algorithm is dependent on the structure of the graph.
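A sequential sketch of one way to realize this scheme; the set-based I and C and the per-round dict R mirror the arrays above, and the adjacency encoding (a dict of sets) is our assumption.

import random

def luby_mis(adj, n):
    """adj: dict v -> set of neighbours. Returns a maximal independent set."""
    I, C = set(), set(range(n))
    while C:
        R = {v: random.random() for v in C}   # fresh random numbers each round
        # A candidate joins I if its number beats every candidate neighbour.
        winners = {v for v in C
                   if all(R[v] > R[u] for u in adj.get(v, set()) if u in C)}
        I |= winners
        for v in winners:              # remove winners and their neighbours
            C.discard(v)
            C -= adj.get(v, set())
    return I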
• 60. Single-Source Shortest Paths • Dijkstra's algorithm, modified to handle sparse graphs, is called Johnson's algorithm. • The modification accounts for the fact that the minimization step in Dijkstra's algorithm needs to be performed only for those nodes adjacent to the previously selected nodes. • Johnson's algorithm uses a priority queue Q to store the value l[v] for each vertex v ∈ (V − V_T).
• 61. Single-Source Shortest Paths: Johnson's Algorithm
1. procedure JOHNSON SINGLE SOURCE SP(V, E, s)
2. begin
3.    Q := V ;
4.    for all v ∈ Q do
5.        l[v] := ∞;
6.    l[s] := 0;
7.    while Q ≠ ∅ do
8.    begin
9.        u := extract min(Q);
10.       for each v ∈ Adj[u] do
11.           if v ∈ Q and l[u] + w(u, v) < l[v] then
12.               l[v] := l[u] + w(u, v);
13.   endwhile
14. end JOHNSON SINGLE SOURCE SP
Johnson's sequential single-source shortest paths algorithm.
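A runnable Python rendering of this procedure. Since Python's heapq has no decrease-key, stale queue entries are skipped on extraction; that substitution, and the dict-of-lists adjacency encoding, are our choices rather than the text's priority queue.

import heapq, math

def johnson_single_source_sp(adj, n, s):
    """adj: dict u -> list of (v, w). Returns the array l of path lengths."""
    l = [math.inf] * n
    l[s] = 0
    Q = [(0, s)]
    done = [False] * n                 # marks vertices moved into V_T
    while Q:
        d, u = heapq.heappop(Q)
        if done[u]:
            continue                   # stale entry left by a later update
        done[u] = True
        for v, w in adj.get(u, []):
            if not done[v] and d + w < l[v]:
                l[v] = d + w
                heapq.heappush(Q, (l[v], v))
    return l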
  • 62. Single-Source Shortest Paths: Parallel Johnson’s Algorithm • Maintaining strict order of Johnson’s algorithm generally leads to a very restrictive class of parallel algorithms. • We need to allow exploration of multiple nodes concurrently. This is done by simultaneously extracting p nodes from the priority queue, updating the neighbors’ cost, and augmenting the shortest path. • If an error is made, it can be discovered (as a shorter path) and the node can be reinserted with this shorter path.
• 63. Single-Source Shortest Paths: Parallel Johnson's Algorithm [Figure: a nine-vertex weighted graph on a–i, with the priority queue contents and the array l[] shown over four steps: (1) b:1, d:7, all other vertices ∞; (2) e:3, c:4, g:10; (3) h:4, f:6; (4) g:5, i:6] An example of the modified Johnson's algorithm for processing unsafe vertices concurrently.
• 64. Single-Source Shortest Paths: Parallel Johnson's Algorithm • Even if we can extract and process multiple nodes from the queue, the queue itself is a major bottleneck. • For this reason, we use multiple queues, one for each processor. Each processor builds its priority queue using only its own vertices. • When process P_i extracts the vertex u ∈ V_i, it sends a message to processes that store vertices adjacent to u. • Process P_j, upon receiving this message, sets the value of l[v] stored in its priority queue to min{l[v], l[u] + w(u, v)}.
• 65. Single-Source Shortest Paths: Parallel Johnson's Algorithm • If a shorter path has been discovered to node v, it is reinserted back into the local priority queue. • The algorithm terminates only when all the queues become empty. • A number of node partitioning schemes can be used to exploit graph structure for performance.
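A sequential, round-based simulation of the multiple-queue scheme. The round-robin owner function (vertex v lives on process v mod p) and the synchronous rounds standing in for asynchronous messages are our simplifications; extractions may be "unsafe", but any resulting error is repaired by reinsertion with the shorter label.

import heapq

def parallel_johnson_sim(adj, n, s, p):
    """adj: dict u -> list of (v, w). Simulates p per-process queues."""
    l = [float('inf')] * n
    l[s] = 0
    queues = [[] for _ in range(p)]
    heapq.heappush(queues[s % p], (0, s))
    while any(queues):
        extracted = []
        for q in queues:               # each process extracts its minimum
            while q:
                d, u = heapq.heappop(q)
                if d == l[u]:          # skip entries made stale by reinsertion
                    extracted.append((d, u))
                    break
        for d, u in extracted:         # "messages" to the owners of neighbours
            for v, w in adj.get(u, []):
                if d + w < l[v]:
                    l[v] = d + w       # shorter path found: reinsert at owner
                    heapq.heappush(queues[v % p], (l[v], v))
    return l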