Crates.io | reunion |
lib.rs | reunion |
version | 0.1.14 |
source | src |
created_at | 2021-09-15 03:57:30.70637 |
updated_at | 2021-09-19 05:29:46.175589 |
description | A generic implementation of the Union-Find w/ Rank data structure. |
homepage | |
repository | https://www.github.com/aalekhpatel07/reunion |
max_upload_size | |
id | 451664 |
size | 17,041 |
Suppose you have a collection S of elements e1, e2, ..., en, and wish to group them into different collections using two operations:
- "put ei and ej into the same group" (union),
- "find a representative of the group that ei belongs to" (find).
Then a Union-Find data structure stores the underlying groups very efficiently and implements this API.
Note: The variant implemented uses Path Compression to further improve the performance.
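To make the two operations concrete, here is a minimal sketch of a union-find over the integers 0..n with union by rank and path compression. It is an illustration only: the `DisjointSet` type and its methods are hypothetical names chosen for this example and are not the `reunion` API or its actual implementation.

```rust
use std::cmp::Ordering;

/// Illustrative disjoint-set forest over the integers 0..n.
/// (Hypothetical type for this sketch; not the `reunion` API.)
struct DisjointSet {
    parent: Vec<usize>,
    rank: Vec<u8>,
}

impl DisjointSet {
    /// Start with every element in its own singleton group.
    fn new(n: usize) -> Self {
        DisjointSet {
            parent: (0..n).collect(),
            rank: vec![0; n],
        }
    }

    /// Return the representative of the group containing `x`.
    fn find(&mut self, x: usize) -> usize {
        // First pass: walk up to the root of the tree.
        let mut root = x;
        while self.parent[root] != root {
            root = self.parent[root];
        }
        // Second pass (path compression): point every node on the path at the root.
        let mut node = x;
        while self.parent[node] != root {
            let next = self.parent[node];
            self.parent[node] = root;
            node = next;
        }
        root
    }

    /// Merge the groups containing `a` and `b`.
    /// Returns false if they were already in the same group.
    fn union(&mut self, a: usize, b: usize) -> bool {
        let (ra, rb) = (self.find(a), self.find(b));
        if ra == rb {
            return false;
        }
        // Union by rank: attach the lower-ranked root under the higher-ranked one.
        match self.rank[ra].cmp(&self.rank[rb]) {
            Ordering::Less => self.parent[ra] = rb,
            Ordering::Greater => self.parent[rb] = ra,
            Ordering::Equal => {
                self.parent[rb] = ra;
                self.rank[ra] += 1;
            }
        }
        true
    }
}
```

In this sketch, two elements are in the same group exactly when `find` returns the same representative for both, and `union` reports whether it actually merged two distinct groups.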
Detect Cycles in Graph: Given a graph G, we can put the endpoints of each edge into the same group (same connected component) unless there is a pair of endpoints (ei, ej) that already share a group representative. If that happens, a path already existed between them, and adding this edge would create a second path, which cannot happen in an acyclic graph.
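The cycle check described above can be sketched as follows for an undirected graph on vertices 0..n. For brevity this sketch uses a bare parent array with path halving and no rank; `find_rep` and `has_cycle` are hypothetical helper names for this example, not part of `reunion`.

```rust
/// Find the representative of `x`, shortening the path as we go (path halving).
fn find_rep(parent: &mut [usize], mut x: usize) -> usize {
    while parent[x] != x {
        parent[x] = parent[parent[x]]; // point x at its grandparent
        x = parent[x];
    }
    x
}

/// Returns true if the undirected graph on vertices 0..n contains a cycle.
/// Self-loops and repeated edges are treated as cycles.
fn has_cycle(n: usize, edges: &[(usize, usize)]) -> bool {
    // Every vertex starts as its own group.
    let mut parent: Vec<usize> = (0..n).collect();
    for &(u, v) in edges {
        let (ru, rv) = (find_rep(&mut parent, u), find_rep(&mut parent, v));
        if ru == rv {
            // u and v were already connected, so this edge closes a cycle.
            return true;
        }
        parent[ru] = rv; // merge the two groups
    }
    false
}
```

For example, the path `[(0, 1), (1, 2), (2, 3)]` on four vertices has no cycle, while the triangle `[(0, 1), (1, 2), (2, 0)]` does.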
Number of connected components in Graph: Given a graph G, put the endpoints of each edge into the same group (same connected component). Once all edges are processed, the number of groups formed is the number of connected components in G.
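A minimal sketch of the component count, again on vertices 0..n and using a plain parent array rather than the crate itself (`find_root` and `count_components` are hypothetical names for this example). It starts with n singleton groups and subtracts one each time an edge merges two distinct groups, so isolated vertices are counted as their own components.

```rust
/// Find the representative of `x`, shortening the path as we go (path halving).
fn find_root(parent: &mut [usize], mut x: usize) -> usize {
    while parent[x] != x {
        parent[x] = parent[parent[x]]; // point x at its grandparent
        x = parent[x];
    }
    x
}

/// Count the connected components of an undirected graph on vertices 0..n.
fn count_components(n: usize, edges: &[(usize, usize)]) -> usize {
    let mut parent: Vec<usize> = (0..n).collect();
    let mut components = n; // every vertex starts as its own component
    for &(u, v) in edges {
        let (ru, rv) = (find_root(&mut parent, u), find_root(&mut parent, v));
        if ru != rv {
            parent[ru] = rv; // merging two groups removes one component
            components -= 1;
        }
    }
    components
}
```

For example, five vertices with edges `[(0, 1), (1, 2), (3, 4)]` form two components: {0, 1, 2} and {3, 4}.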
Some interesting lecture notes regarding Union-Find.
In Cargo.toml, add this crate as a dependency.
[dependencies]
reunion = { version = "0.1" }
Task: Create a UnionFind data structure of arbitrary size that contains usize as its elements. Then, union a few elements and capture the state of the data structure after that.
Solution:
use reunion::{UnionFind, UnionFindTrait};
use std::collections::HashSet;

fn main() {
    // Create a UnionFind data structure of arbitrary size that contains subsets of usizes.
    let mut uf = UnionFind::<usize>::new();

    println!("Initial state: {}", &uf);
    println!("All elements form their own group (singletons).");
    println!("{:?}", uf.subsets());

    uf.union(2, 1);
    println!("After combining the groups that contain 2 and 1: {}", &uf);

    uf.union(4, 3);
    println!("After combining the groups that contain 4 and 3: {}", &uf);

    uf.union(6, 5);
    println!("After combining the groups that contain 6 and 5: {}", &uf);

    // Build the three groups we expect after the unions above.
    let mut hs1 = HashSet::new();
    hs1.insert(1);
    hs1.insert(2);

    let mut hs2 = HashSet::new();
    hs2.insert(3);
    hs2.insert(4);

    let mut hs3 = HashSet::new();
    hs3.insert(5);
    hs3.insert(6);

    let mut subsets = uf.subsets();

    assert_eq!(subsets.len(), 3);
    assert!(subsets.contains(&hs1));
    assert!(subsets.contains(&hs2));
    assert!(subsets.contains(&hs3));

    uf.union(1, 5);
    println!("After combining the groups that contain 1 and 5: {}", &uf);

    subsets = uf.subsets();
    assert_eq!(subsets.len(), 2);

    // {5, 6} and {1, 2} are now a single group.
    hs3.extend(&hs1);
    assert!(subsets.contains(&hs3));
    assert!(subsets.contains(&hs2));

    let mut uf_clone = uf.clone();
    uf_clone.find(2);
    assert_eq!(&uf, &uf_clone);

    println!("{}", &uf);

    // It is possible to iterate over the subsets.
    for partition in uf {
        println!("{:?}", partition);
    }
}
Task: Create a UnionFind data structure of size at least 10 that contains u16 as its elements.
Note: The capacity only reduces the number of memory reallocations required, so it is entirely optional.
Solution:
// Create a UnionFind data structure with an initial capacity of 10 that contains subsets of u16.
let mut uf2 = UnionFind::<u16>::with_capacity(10);
println!("{}", uf2);
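The effect of a capacity hint can be seen with the standard library's `Vec::with_capacity`, used here purely as an analogy. This sketch does not assume anything about how `reunion` stores its elements internally, and `grew_while_filling` is a hypothetical helper name.

```rust
/// Push `count` values into a vector created with `Vec::with_capacity(hint)`
/// and report whether its buffer had to grow along the way.
fn grew_while_filling(hint: usize, count: u16) -> bool {
    let mut items: Vec<u16> = Vec::with_capacity(hint);
    let initial = items.capacity(); // guaranteed to be at least `hint`
    for value in 0..count {
        items.push(value);
    }
    // A changed capacity means at least one reallocation happened.
    items.capacity() != initial
}
```

Filling the vector up to the hinted size never reallocates, while exceeding it still works and simply triggers growth, which is why the hint is an optimization rather than a limit.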
To benchmark on your machine:
cargo bench
You should see some output like this:
#Find: 2497150, #Union: 1048575, #Total: 3545725, Time: 1.013497126s, Time per operation: 285ns
#Find: 2497150, #Union: 1048575, #Total: 3545725, Time: 1.013323348s, Time per operation: 285ns
#Find: 2497150, #Union: 1048575, #Total: 3545725, Time: 1.012333206s, Time per operation: 285ns
...
Big Merge (20, 10000) time: [1.0175 s 1.0190 s 1.0205 s]
change: [-0.4773% -0.2721% -0.0647%] (p = 0.01 < 0.05)
Change within noise threshold.
Found 13 outliers among 100 measurements (13.00%)
10 (10.00%) high mild
3 (3.00%) high severe
...
On an AMD Ryzen 9 3900X 12-Core Processor (with lots of other processes running), working with a UnionFind of size 2 ** 20, a total of 3,545,725 operations takes roughly 1 second. This is expected because the time complexity of these operations is effectively O(1) (in truth it is O(alpha(n)) amortized, where alpha(n) is the inverse Ackermann function, but it grows so slowly that we can hand-wave it as a constant).