Disjoint Sets (Union-Find)
Disjoint Sets, also known as Union-Find, are a powerful data structure for efficiently managing a collection of non-overlapping sets. They are widely used in algorithms that require grouping elements and quickly checking if two elements belong to the same group.
What Are Disjoint Sets?
A Disjoint Set structure maintains a partition of a set into non-overlapping subsets. It is especially useful in graph algorithms, such as Kruskal's Minimum Spanning Tree, and in applications like network connectivity and image segmentation.
Equivalence Relations and Classes
Let \(S\) be a set containing elements and a relation \(R\) is defined on it. That means for every pair \(a,~b\in S\), \(a~R~b\) is true then we say \(a\) is related to \(b\), otherwise \(a\) is not related to \(b\).
A relation is called equivalence relation if it satisfies the three properties:
- Reflexive: $ \forall a \in S$, \(a~R~a\) is true
- Symmetric: \(a, b \in S\), if \(a~R~b\) is true then \(b~R~a\) is true
- Transitive: \(a, b, c \in S\), if \(a~R~b\) and \(b~R~c\) is true then \(a~R~c\) is true
The equivalence class of an element \(a ~\in S\) is a subset of \(S\) that contains all elements that are related to \(a\). Equivalence classes create a partition of \(S\). Since intersection of any two equivalence classes in empty they are also called disjoint sets.
Key Operations
A Disjoint Set data structure supports three primary operations:
- MakeSet(x): Create a new set containing the single element
x. - Find(x): Determine which subset a particular element
xis in. This can be used for checking if two elements are in the same subset. - Union(x, y): Merge the subsets containing elements
xandyinto a single subset.
Data Structure Implementation
The most common implementation uses two techniques for efficiency:
- Parent Array: Each element points to a parent, and the root of each set is its representative.
- Union by Rank/Size: Always attach the smaller tree (size or height) to the root of the larger tree to keep the tree shallow.
- Path Compression: During
Find, make each node on the path point directly to the root, flattening the structure.
Implementation
class DisjointSets(n: Int) {
val parent = IntArray(n) { it }
val rank = IntArray(n) { 0 }
fun find(x: Int): Int {
if(parent[x] != x)
parent[x] = find(parent[x])
return parent[x]
}
fun union(x: Int, y: Int) {
val xRoot = find(x)
val yRoot = find(y)
if(xRoot == yRoot) return
if(rank[xRoot] < rank[yRoot]) {
parent[xRoot] = yRoot
} else if(rank[xRoot] > rank[yRoot]) {
parent[yRoot] = xRoot
} else {
parent[yRoot] = xRoot
rank[xRoot] += 1
}
}
}
Applications
- Kruskal's Algorithm: Used to detect cycles and build Minimum Spanning Trees efficiently.
- Connected Components: Quickly determine if two nodes are in the same connected component in a graph.
- Network Connectivity: Model dynamic connectivity in networks.
- Image Processing: Group pixels into connected regions.
- Percolation Theory: Model the flow of fluids through porous materials.