3. Applications for DFS
Directed graphs
We have seen topological sorting
Strongly connected components
Undirected graphs
Articulation points
Bridges
Biconnected components
4. Strongly Connected Components
Applications:
Definitions:
Given a directed graph G(V, E), G is strongly connected if ∀ u,
v ∈ V, there exists a path u..v and a path v..u
Compiler analysis, data mining, scientific computing
u∈R(v) and v∈R(u) ∀ u, v ∈ V
Not all directed graphs are strongly connected
Given a directed graph G(V, E), a strongly connected
component (SCC) is a maximal subset of vertices C⊆V such
that ∀ u, v ∈ C, there exists a path u..v and a path v..u
5. SCC - Example
For any graph, we have a single way to partition it into
SCCs
I
I
A
A
J
G
G
B
B
K
C
H
D
E
J
K
C
L
H
D
E
F
F
L
6. SCCs - Properties
Let C⊆V – a SCC of G and ∀ u, v ∈ C. Any vertex z on a path
u..z..v is also in C.
Let x be any vertex in C
=> x∈R(v), but v∈R(z) => x∈R(z)
=> u∈R(x), but z∈R(u) => z∈R(x)
Therefore z ∈ C
Let C⊆V – a SCC of G. In any DFS traversal, all the vertices in C
are always in the same DFS tree!
Let u be the first vertex in C that is discovered by the DFS
At d[u], all the other vertices in C are WHITE
There exists a path from u to all the other vertices in C that consists of
intermediate vertices that are all in C (def. of a SCC)
Therefore, there exists a white path from u to all the other vertices in
C => all the others vertices in C become descendants of u in the same
DFS tree
7. Run DFS on the Example Graph
I
A
J
G
B
K
C
H
L
D
E
F
8. Transposing a Graph
Given G(V, E)
GT – transpose of G: a graph that has the same vertices,
but the direction of the edges is reversed
GT (V, ET)
(u, v) ∈ ET <=> (v, u) ∈ E
We can compute the transpose of a graph in:
Θ(n+m) for adjacency lists (need to recompute the lists)
Θ(1) for adjacency matrix (just use A[j, i] = AT[i, j])
9. SCCs – Property of G and GT
G and GT have the same SCCs
Let C be a SCC in G and ∀ u, v ∈ C:
u∈R(v) in G => u∈R(v) in GT
v∈R(u) in G => v∈R(u) in GT
=> C is also a SCC in GT
10. The SCCs Graph
GSCC = (VSCC, ESCC)
VSCC contains a vertex for each SCC in G
ESCC contains an edge between two different SCC C1, C2 if there
exists in G at least an edge between two vertices u ∈ C1 and v ∈
C2 ((u, v) ∈ E[G])
Property: GSCC is a DAG
Let C1, C2 be two distinct SCC in G. Let u, v ∈ C1 and u’, v’ ∈ C2.
If there exists a path u .. u’ in G, then there cannot be any path v’ ..
v in G
Why? If it was such a path, then there is also a path u .. u’ .. v’ .. v ..
u , therefore all the vertices would be in the same SCC
11. SCC – Algorithm
We can find the SCCs of a graph using two DFS traversals:
Complexity:
A DFS for G, regardless of the order for considering the roots of
the DFS trees
GT = Transpose G
A second DFS for GT, taking the roots of the DFS trees in an order
provided by the first DFS for G
Θ(n+m) – adjacency lists
Θ(n2) – adjacency matrix
Name: Kosaraju’s algorithm
12. SCC – Algorithm
SCC(G)
f[1..n] = DFS(G)
// call DFS for G in order to compute the finish times
// f[u] for all u
Gt = transpose(G)
DFS_modified(Gt, f[1..n])
// call the modified DFS for G that picks the
// roots of the DFS trees in decreasing
// order of f[u] (plus u being WHITE as
// usual)
PRINT the vertices in each separate DFS tree computed by the call to
DFS_modified. Each DFS tree contains the nodes of a SCC of G
15. Idea for the Algorithm
If GSCC is a DAG => (GSCC )T is also a DAG
How many DFS trees do we find when running DFS for
GSCC?
It depends on how we pick the roots of the trees:
A single tree that contains all SCCs
|VSCC| trees, each containing a single node
We want to find how to walk GSCC in order to have a tree
for each node => an order for choosing the roots of the
DFS trees
16. Idea for the Algorithm (2)
Knowing this order, we can also walk the original graph and choose
each time as a new root any node in the SCC that is next in order.
This order is the topological sorting order for GSCC
We can compute a topological sorting of GSCC regardless of how we
choose the roots of the DFS trees
This is done by running the first DFS on G
We run DFS on (GSCC )T considering the vertices in the topological
sorting order we have computed for GSCC
This is done by running the first DFS on G T by choosing the root
vertices ordered descending on the finish time of the first DFS
17. Topological Sorting of GSCC
By choosing the vertices in the topological ordering of
GSCC:
The first node is the one with the highest finish time
The last node is the one with the lowest finish time
We want to show that for any edge of GSCC: the source
vertex has a higher finish time than the one of the
destination vertex
Therefore, for any edge of (GSCC )T: the source vertex has a
lower finish time than the one of the destination vertex
18. Demonstration
Consider the results of the first DFS (for G):
d[u] and f[u]
Extend the notation for d and f to sets of vertices
Let C⊆V:
d(C) = min(d[u]) for all u∈C
Earliest discovery time of any vertex in C
f(C) = max(f[u]) for all u∈C
Latest finish time of any vertex in C
19. Lemma
Let C and C’ be 2 distinct SCCs in G. Suppose there exists an edge
(u, v)∈E[G], u∈C, v∈C’ (then there also exists an edge in GSCC
between the vertices corresponding to C and C’) => f(C) > f(C’)
Two possibilities:
d(C) < d(C’)
Let x∈C be the first vertex discovered in C and C’
White path theorem: all nodes in C and C’ are reachable from x using only
intermediate vertices that are WHITE
f(C) = f[x] > f(C’)
d(C) > d(C’)
Let y∈C’ be the first vertex discovered in C and C’
f(C’) = f[y]
At f[y], all nodes in C are WHITE => f(C) > d(C) > f(C’)
21. Corollary
Let C and C’ be 2 distinct SCCs in G. Suppose there
exists an edge (u, v)∈ET, u∈C, v∈C’ (then there also
exists an edge in (GSCC)T between the vertices
corresponding to C and C’) => f(C) < f(C’)
(u, v)∈ET => (v, u)∈E, u∈C, v∈C’ =>f(C’) > f(C)
Using previous lemma
22. Corollary – Conclusion
Let C and C’ be 2 distinct SCCs in G.
If f(C) > f(C’) there cannot be an edge from C to C’ in
(GSCC)T
If f(C) > f(C’) there cannot be an edge from C to C’ in GT
23. SCC Algorithm – Conclusion
When running the second DFS on GT , we pick the vertex with the
highest finish time, x∈V such that f(x)=maximum
Therefore, if C is the SCC that contains x, f(C)>f(C’) for any other
C’ a SCC of G.
Thus, by choosing x as the root of the first DFS tree we can only
explore the nodes that are in C
And we explore all of them => we color them in BLACK
The second vertex that we pick as a root is x’ that has the
maximum finish time out of the remaining white vertices. Let C’ be
the SCC that x’ is part of. Then:
There can be an edge only from C’ to C, but not to any other SCC
And we repeat this process until no vertices are left unvisited!
Thus, we are visiting (GSCC)T in reverse topological sorting order!
24. Articulation Points
Given a connected undirected graph G(V, E)
An articulation point (or cut vertex) is any u∈V such that
by removing it from the graph (and all the adjacent
edges), the remaining graph becomes disconnected
The definition is similar for graphs that are not
connected. An articulation point is any vertex that after
removal increases the number of connected components
of the resulting graph
25. Articulation Points (2)
Formal definition:
Given G(V, E), u is an articulation point if there exist two
vertices x, y ∈ V, x != y != u such that any path from x..y
contains u
u – articulation point
u – not an articulation point
26. Articulation Points (3)
Lots of applications for detecting critical points in any
kind of networks for communication, airline traffic,
electric circuits, etc.
27. Simple Algorithm
Given G(V, E) that is connected
Foreach (u∈V)
G’ = remove u from G
if (G’ is not connected)
// can use BFS or DFS
u is an articulation point
Complexity: θ(n * (n+m)) = θ(n2 + n*m)
We want a better algorithm!
28. Theorem
Let G(V, E) undirected graph
Let any DFS of G
u∈V is an articulation point of G if and only if
p[u] == null and u has at least two children
p[u] != null and u has at least a child v such that no vertex in
Subtree(v) is connected to an ancestor of u using a back edge
u is a root of a DFS tree
u is not a root of a DFS tree
Demonstration: on the whiteboard
29. Implementation
The first case of the previous theorem is very simple to
detect
Count the number of children for each node
children[u]
The second case is trickier
We need to remember for each vertex u the earliest node
that can be reached from Subtree(u)
Earliest means minimum discovery time, therefore we want to
discover if we can reach the ancestors of a node u from
Subtree(u)
low[u]
30. low[u]
Can be defined recursively:
low[u] =
min(
d[u],
d[x], for all x that can be reached from
any node in Subtree(x) by a back edge
)
We can improve the definition in order to integrate it
efficiently into a DFS
31. low[u] Revisited
low[u] =
min(
d[u],
d[w], for all back edges (u, w) visited from u
low[v], for all v children of u in the DFS tree
)
Thus, we can take advantage that low[v] has been computed and
will never change when we compute low[u], for all v – children of u
in the DFS tree
Because color[v] is BLACK and we have already explored everything
reachable from v !
32. Articulation Points - Algorithm
We can use a simple DFS algorithm and make small
changes
We do not need to compute f[u]
We need color[u] to detect back edges, d[u], children[u]
and low[u]
This algorithm is called Tarjan’s algorithm
33. Tarjan’s Algorithm
DFS(G)
FOREACH (u ∈ V)
color[u] = WHITE; p[u] = NULL; articulation[u] = FALSE ;
time = 0
FOREACH (u ∈ V)
IF (color[u] == WHITE)
DFS_Visit (u)
IF (children[u] > 1)
articulation[u] = TRUE
// the first case of the theorem
34. Tarjan’s Algorithm
DFS_Visit(u)
d[u] = ++time
low[u] = d[u]
// low[u] = min(d[u], …)
color[u] = GREY
children[u] = 0
FOREACH (v ∈ Adj(u))
// look for undiscovered neighbors
IF (color[v] == WHITE)
children[u]++
p[v] = u
DFS_Visit(v)
low[u] = min(low[v], low[u])
// finished a child… look for new minimum
IF (p[u] != NULL AND low[v] >= d[u])
articulation[u] = TRUE
// the second case of the theorem
ELSE IF (color[v] == GREY AND p[u] != v)
// back edge… look for new minimum
low[u] = min(low[u], d[v])
color[u] = BLACK
Complexity: θ(n + m) if using adjacency lists, just like normal
DFS
37. Bridges
Given a connected undirected graph G(V, E)
A bridge (or cut edge) is any edge (u, v)∈E such that by
removing it from the graph, the remaining graph becomes
disconnected
The definition is similar for graphs that are not
connected. A brdige is any edge that after removal
increases the number of connected components of the
resulting graph
38. Simple Algorithm
Given G(V, E) that is connected
Foreach ((u, v)∈E)
G’ = remove (u, v) from G
if (G’ is not connected)
// can use BFS or DFS
(u, v) is a bridge
Complexity: θ(m * (n+m)) = θ(m2 + n*m)
We want a better algorithm!
Use an adapted version of Tarjan’s algorithm for
articulation points
39. Tarjan’s Algorithm for Finding Bridges
For any edge in the DFS search (v, u = p[v])
It is a bridge if and only if low[v] > d[u]
Only tree edges can be bridges!
There does not exist any other alternative to reach u or an
ancestor of u from Subtree(v)!
Why?
bridge[u] = TRUE if (u, p[u]) is a bridge
40. Tarjan’s Algorithm for Finding Bridges
DFS(G)
FOREACH (u ∈ V)
color[u] = WHITE; p[u] = NULL; bridge[u] = FALSE ;
time = 0
FOREACH (u ∈ V)
IF (color[u] == WHITE)
DFS_Visit (u)
41. Tarjan’s Algorithm for Finding Bridges
DFS_Visit(u)
d[u] = ++time
low[u] = d[u]
color[u] = GREY
FOREACH (v ∈ Adj(u))
IF (color[v] == WHITE)
children[u]++
p[v] = u
DFS_Visit(v)
low[u] = min(low[v], low[u])
IF (low[v] > d[u])
bridge[v] = TRUE
ELSE IF (color[v] == GREY AND p[u] != v)
low[u] = min(low[u], d[v])
color[u] = BLACK
// look for undiscovered neighbors
// finished a child… look for new minimum
// check if the explored edge is a bridge
// back edge… look for new minimum
Complexity: θ(n + m) if using adjacency lists, just like normal
DFS
42. Biconnected Graph
A biconnected graph is a graph that has no articulation
points
Between any two vertices in the graph, there are at least
two different paths
The graph is also called 2-connected!
43. Biconnected components
As not all the graphs are biconnected, an interesting
problem is to find the biconnected components of a
graph
Biconnected component = maximal subgraph that is
biconnected
Any two edges in the same biconnected component lie on a
common simple cycle
Can also use Tarjan’s algorithm for computing the
biconnected components of a graph
45. Tarjan’s Algorithm for SCCs
Tarjan’s algorithm can also be adapted for computing the
SCCs of a directed graph
Use only a single DFS walk instead of two walks that
Kosaraju’s algorithm uses
However, the two algorithms have the same order of
growth
How would you adapt Tarjan’s algortihm for computing
the SCCs ?