Find Connected Components in an Undirected Graph: Difference between revisions
(13 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
=External= | |||
* https://www.coursera.org/learn/algorithms-graphs-data-structures/lecture/BTVWn/bfs-and-undirected-connectivity | |||
=Internal= | =Internal= | ||
* [[Graphs#Subjects|Graphs]] | * [[Graphs#Subjects|Graphs]] | ||
* [[Graph_Concepts#Undirected_Graph_Connectivity|Undirected Graph Connectivity]] | * [[Graph_Concepts#Undirected_Graph_Connectivity|Undirected Graph Connectivity]] | ||
* [[Graph_Search#Breadth-First_Search_.28BFS.29|Graph Search | Breadth-First Search]] | |||
=Overview= | =Overview= | ||
Finding connected components in an undirected graph is a form of clustering heuristics: connected vertices represent clusters where the objects represented by the vertices are related in some way. Connected components in an undirected graph are computed with a variation of the [[Graph_Search#Breadth-First_Search_.28BFS.29|breadth-first search (BFS)]] algorithm. | Finding connected components in an undirected graph is a form of clustering heuristics: connected vertices represent clusters where the objects represented by the vertices are related in some way. Connected components in an undirected graph are computed with a variation of the [[Graph_Search#Breadth-First_Search_.28BFS.29|breadth-first search (BFS)]] algorithm. This BFS-based algorithm computes undirected connectivity in a graph in O(n + m) time. For a more formal definition of a connected component of a graph see: {{Internal|Graph_Concepts#Undirected_Graph_Connectivity|Undirected Graph Connectivity}} | ||
Determining connectivity is useful in a variety of cases: | |||
* Check if a physical network is broken into pieces. | |||
* If you want to visualize network data, you may want to know if there are multiple pieces and display the pieces differently. | |||
* Clustering and similarity data. Determine "clusters" of highly related objects. | |||
⚠️ [[Graph_Concepts#Directed_Graph_Connectivity|Directed connectivity]] is given by a different, DSF-based algorithm named [[Find_Strongly_Connected_Components_in_a_Directed_Graph|Kosaraju's two-pass algorithm]]. | |||
=Algorithm= | |||
The algorithm, which is linear in n + m iterates over all nodes in the graph and for unexplored nodes only, invokes BFS. The BFS invocation will explore the entire connected component the source node is part of. The essential part of the algorithm is that when we mark a node as explored, that is also visible to the outer loop. | |||
<font size=-1> | |||
Compute_Connected_Components_of_Undirected_Graph(graph G) | |||
mark all nodes as unexplored | |||
<font color=teal># Assume nodes are labeled from 1 to n</font> | |||
for i = 1 to n | |||
if i not yet explored | |||
BFS(G, i) | |||
</font> | |||
Each BFS invocation we identify a connected component and mark all its nodes as "explored". <font color=darkkhaki>The above algorithm is underspecified, how do we actually get the list of components, and how do we do the bookkeeping?</font> |
Latest revision as of 21:51, 20 October 2021
External
Internal
Overview
Finding connected components in an undirected graph is a form of clustering heuristics: connected vertices represent clusters where the objects represented by the vertices are related in some way. Connected components in an undirected graph are computed with a variation of the breadth-first search (BFS) algorithm. This BFS-based algorithm computes undirected connectivity in a graph in O(n + m) time. For a more formal definition of a connected component of a graph see:
Determining connectivity is useful in a variety of cases:
- Check if a physical network is broken into pieces.
- If you want to visualize network data, you may want to know if there are multiple pieces and display the pieces differently.
- Clustering and similarity data. Determine "clusters" of highly related objects.
⚠️ Directed connectivity is given by a different, DSF-based algorithm named Kosaraju's two-pass algorithm.
Algorithm
The algorithm, which is linear in n + m iterates over all nodes in the graph and for unexplored nodes only, invokes BFS. The BFS invocation will explore the entire connected component the source node is part of. The essential part of the algorithm is that when we mark a node as explored, that is also visible to the outer loop.
Compute_Connected_Components_of_Undirected_Graph(graph G) mark all nodes as unexplored # Assume nodes are labeled from 1 to n for i = 1 to n if i not yet explored BFS(G, i)
Each BFS invocation we identify a connected component and mark all its nodes as "explored". The above algorithm is underspecified, how do we actually get the list of components, and how do we do the bookkeeping?