indices_union_find

Crates.ioindices_union_find
lib.rsindices_union_find
version0.1.2
created_at2025-07-18 07:57:17.180494+00
updated_at2025-11-28 08:17:26.309742+00
descriptionAn efficient and minimal union-find (disjoint-set) implementation over generic unsigned index types.
homepage
repositoryhttps://github.com/enjuichang123/indices_union_find.git
max_upload_size
id1758715
size91,971
En-Jui Chang (enjuichang123)

documentation

https://docs.rs/indices_union_find/latest/indices_union_find/index.html

README

indices_union_find

An efficient and minimal union-find (disjoint-set) implementation over generic unsigned index types.

This crate provides:

  • UnionFind — a classic union–find with union by size and path compression, achieving amortized near-constant-time operations.

  • Partition — a compact interval representation of sets of indices, supporting fast membership queries, set algebra, and normalized interval operations.

Both structures operate over any unsigned integer type implementing the UnsignedIndex trait (u8, u16, u32, …).

Union–Find

The UnionFind structure maintains a dynamic partition of elements indexed by T. Each element belongs to exactly one disjoint set, encoded as a forest of rooted trees.

Internally, the structure stores:

  • parent: Vec — parent pointers for each node; roots store their own index.

  • size: Vec — the size of each tree, used to attach the smaller tree during union.

Two standard optimizations ensure efficiency:

  • Path compression — flattens trees during find, reducing lookup cost.

  • Union by size — keeps trees shallow by attaching the smaller subtree to the larger.

Together, these yield effectively constant-time find and union, even for large index sets.

Interval Partitions

The companion type, Partition, represents a subset of indices as a sequence of disjoint, half-open filled intervals. Rather than storing every element, a partition is encoded by its boundary points:

/// An ordered set of boundary nodes defining alternating interval membership. /// /// Each interval [aᵢ, aᵢ₊₁) is considered filled if i is even, /// and empty if i is odd. /// /// The boundaries must be strictly ordered and deduplicated; any /// overlapping or unordered input must be resolved during construction. pub boundaries: BTreeSet,

This representation enables:

  • efficient membership tests,

  • fast iteration over contiguous ranges,

  • compact storage for sparse or clustered sets,

  • linear-time set algebra (union, intersection, difference, symmetric difference).

The structure can be constructed from common unordered index collections such as:

  • BTreeSet

  • HashSet

  • key sets of BTreeMap<T, _> or HashMap<T, _>

The constructor automatically merges consecutive indices into minimal contiguous intervals. Even in worst-case alternating patterns, the number of stored boundaries is at most one greater than the number of filled indices.

This interval-based design is well-suited for applications where many indices share the same value or state. Grouping indices into contiguous classes prevents redundant work and lets algorithms operate once per interval instead of once per element.

License

This project is licensed under either of

at your option.

Contributions

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this project by you shall be licensed as above, without any additional terms or conditions.

Commit count: 0

cargo fmt