Find Strongly Connected Components in a Directed Graph: Difference between revisions

From NovaOrdis Knowledge Base
Jump to navigation Jump to search
Line 16: Line 16:
1. Reverse all of the arcs of the given graph. Let G<sup>rev</sup> = G with all arcs revered.
1. Reverse all of the arcs of the given graph. Let G<sup>rev</sup> = G with all arcs revered.


2. Do the first [[Graph_Search#DFS_Recursive_Algorithm_Pseudocode|depth-first search]] pass on the reversed graph G<sup>rev</sup>. The practical implementation should run DFS on the original graph G but going in the opposite direction of the arcs, backwards, instead of creating a copy of the graph where the arcs are effectively reversed. The [[Graph_Search#DFS_Recursive_Algorithm_Pseudocode|recursive depth-first search implementation]] can be used as is, with just a minimal modification: when looking for edges to explore, use edges '''incident to''' the current node instead of edges '''incident from''' the current node. This pass computes an ordering of nodes that allows the second pass to discover the strongly connected components, if loops over nodes in the given order. Let f(v) = "finishing time" of each v ∈ V. Once f(v) for each node is computed at the end of the first pass, the nodes are relabeled with their finishing time.
2. Do the first [[Graph_Search#DFS_Recursive_Algorithm_Pseudocode|depth-first search]] pass on the reversed graph G<sup>rev</sup>. The practical implementation should run DFS on the original graph G but going in the opposite direction of the arcs, backwards, instead of creating a copy of the graph where the arcs are effectively reversed. The [[Graph_Search#DFS_Recursive_Algorithm_Pseudocode|recursive depth-first search implementation]] can be used as is, with just a minimal modification: when looking for edges to explore, use edges '''incident to''' the current node instead of edges '''incident from''' the current node. This pass computes an ordering of nodes that allows the second pass to discover the strongly connected components. Let f(v) = "finishing time" of each v ∈ V. Once f(v) for each node is computed at the end of the first pass, the nodes are relabeled with their finishing time.


3. Do the second [[Graph_Search#DFS_Recursive_Algorithm_Pseudocode|depth-first search]] pass again, on the original graph G. The "right" order is to go through the vertices in the decreasing order of the "finishing time" computed during the first pass. In this pass we use the order computed by the first pass - remember that the nodes have been relabeled according to their computed finishing time - and we discover the strongly connected components one by one. During this second pass we will label each node with its "leader". The idea is all nodes in the same strongly connected component will be labeled with the same leader node.
3. Do the second [[Graph_Search#DFS_Recursive_Algorithm_Pseudocode|depth-first search]] pass again, on the original graph G. The "right" order is to go through the vertices in the decreasing order of the "finishing time" computed during the first pass. In this pass we use the order computed by the first pass - remember that the nodes have been relabeled according to their computed finishing time - and we discover the strongly connected components one by one. During this second pass we will label each node with its "leader". The idea is all nodes in the same strongly connected component will be labeled with the same leader node.

Revision as of 04:41, 5 October 2021

External

Internal

Overview

Finding strongly connected components in a directed graph is a form of clustering heuristics: strongly connected components represent clusters where the objects represented by the vertices are clustered in some way.

Strongly connected components of a directed graph can be computed with Kosaraju's Two-Pass algorithm, which consists of two passes of depth-first search. The key idea of the algorithm is that for directed graphs, starting a depth-first search from the "right" nodes discovers strongly connected components, while starting from the "wrong" nodes may discover the entire graph, which is not useful. The first pass of the algorithm computes the right order in which to use the nodes as start nodes in the second depth-first search pass.

Kosaraju's Two-Pass Algorithm

The algorithm has three steps:

1. Reverse all of the arcs of the given graph. Let Grev = G with all arcs revered.

2. Do the first depth-first search pass on the reversed graph Grev. The practical implementation should run DFS on the original graph G but going in the opposite direction of the arcs, backwards, instead of creating a copy of the graph where the arcs are effectively reversed. The recursive depth-first search implementation can be used as is, with just a minimal modification: when looking for edges to explore, use edges incident to the current node instead of edges incident from the current node. This pass computes an ordering of nodes that allows the second pass to discover the strongly connected components. Let f(v) = "finishing time" of each v ∈ V. Once f(v) for each node is computed at the end of the first pass, the nodes are relabeled with their finishing time.

3. Do the second depth-first search pass again, on the original graph G. The "right" order is to go through the vertices in the decreasing order of the "finishing time" computed during the first pass. In this pass we use the order computed by the first pass - remember that the nodes have been relabeled according to their computed finishing time - and we discover the strongly connected components one by one. During this second pass we will label each node with its "leader". The idea is all nodes in the same strongly connected component will be labeled with the same leader node.

DFS_Loop(graph G)
  global variable t = 0 # Counts the total number of nodes processed so far. 
                        # Used in the first DFS pass to compute the "finishing time".
  global variable s = NULL # Keeps track of the most recent vertex from which the DFS
                           # was initiated (the leader). Used in the second pass.
  Assume nodes labelled 1 to n.
  for i = n down to 1
    if i not yet explored
      s = i
      DFS(G, i)

DFS(graph G, node i)
  mark i as explored # Once the node is marked explored, it is explored for the rest of DFS_Loop
  set leader(i) = s # Mark the node with its "leader" in the second pass
  for ech arc (i, j) ∈ G 
    if j not yet explored
      DFS(G, j)
  t ++         # Increment the finishing time, in the first pass
  set f(i) = t # Finishing time of i, in the first pass

Proof

https://www.coursera.org/learn/algorithms-graphs-data-structures/lecture/QFOFt/computing-strong-components-the-analysis