12 Jul 2020

Disjoint-Set Data Structures

The disjoint-set data structure is a data structure that represents set partitions.

A set partition of a set S is a collection of disjoint subsets of S whose union is S.

Sets are disjoint if and only if they have no elements in common.

The union of a collection of sets is the set obtained by combining the elements of each set.

So, when we partition a set, we put all of its elements into a collection of subsets, making sure that none of the subsets share any elements. For example, the set partitions of {1, 2, 3} are:

{{1, 2, 3}}
{{1}, {2}, {3}}
{{1}, {2, 3}}
{{1, 2}, {3}}
{{1, 3}, {2}}.

A disjoint-set data structure lets us efficiently model a set partition on a computer. It has two operations defined on it:

1. Union: Replaces two subsets with their union.
2. Find: Locates an element's subset.

For this reason, disjoint-set data structures are also called "union-find data structures".

To see my Python implementation of a disjoint-set data structure, visit my GitHub.

Kruskal's algorithm uses disjoint-set data structures to find minimum spanning trees.