indices_union_find

Crates.ioindices_union_find
lib.rsindices_union_find
version0.1.1
created_at2025-07-18 07:57:17.180494+00
updated_at2025-07-29 06:19:33.123738+00
descriptionAn efficient and minimal union-find (disjoint-set) implementation over generic unsigned index types.
homepage
repositoryhttps://github.com/enjuichang123/indices_union_find.git
max_upload_size
id1758715
size50,168
En-Jui Chang (enjuichang123)

documentation

https://docs.rs/indices_union_find/latest/indices_union_find/index.html

README

indices_union_find

An efficient and minimal union-find (disjoint-set) implementation over generic unsigned index types.

This crate provides a UnionFind<T> structure for managing partitions (disjoint sets) of indices. It supports union by size and path compression, enabling nearly-constant-time operations. The index type T can be any unsigned integer that implements the UnsignedIndex trait (e.g., u8, u16, u32, etc.).

Design

The core data structure, UnionFind, maintains a dynamic partition of elements indexed by T. Each element belongs to exactly one disjoint set, represented internally as a tree with a designated root. The structure includes:

  • parent: Vec — stores the parent of each element, enabling efficient path compression during find operations.

  • size: Vec — tracks the size of each tree, used during union operations to attach smaller trees to larger ones.

These optimizations ensure that find and union operations remain efficient, even for large sets of indices.

The companion structure, Partition, represents disjoint sets via ordered boundaries separating filled and unfilled index intervals. This supports efficient set membership checks and range-based queries.

/// An ordered set of boundary nodes defining alternating interval membership. /// /// Each interval [aᵢ, aᵢ₊₁) is considered filled if i is even, /// and empty if i is odd. /// /// The boundaries must be strictly ordered and deduplicated; any /// overlapping or unordered input must be resolved during construction. pub boundaries: BTreeSet,

Functions are provided to construct a Partition from sets of unsigned integers, such as:

  • BTreeSet

  • HashSet

  • keys of BTreeMap<T, _> or HashMap<T, _>

The conversion process groups neighboring indices into compact intervals, reducing memory usage and avoiding redundant storage. Even in the worst case, where filled intervals alternate with empty ones, the number of stored boundaries is at most the number of input indices plus one.

This design is motivated by the common degeneracy in index-value mappings, where many indices may share the same value. In such cases, grouping indices into equivalence classes allows logical operations to be applied once per class rather than redundantly for each index.

License

This project is licensed under either of

at your option.

Contributions

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this project by you shall be licensed as above, without any additional terms or conditions.

Commit count: 0

cargo fmt