thinset
A data structure optimized for sparse sets of unsigned integers.
[![crates.io][crates.io shield]][crates.io link]
[![Documentation][docs.rs badge]][docs.rs link]
![Rust CI][github ci badge]
[![rustc 1.0+]][Rust 1.0]
[![Dependency Status][deps.rs status]][deps.rs link]
[![Download Status][shields.io download count]][crates.io link]
[crates.io shield]: https://img.shields.io/crates/v/thinset?label=latest
[crates.io link]: https://crates.io/crates/thinset
[docs.rs badge]: https://docs.rs/thinset/badge.svg?version=0.4.0
[docs.rs link]: https://docs.rs/thinset/0.4.0/thinset/
[github ci badge]: https://github.com/Chriscbr/thinset/actions/workflows/rust.yml/badge.svg
[rustc 1.0+]: https://img.shields.io/badge/rustc-1.0%2B-blue.svg
[Rust 1.0]: https://blog.rust-lang.org/2015/05/15/Rust-1.0.html
[deps.rs status]: https://deps.rs/repo/github/Chriscbr/thinset/status.svg
[deps.rs link]: https://deps.rs/crate/thinset/0.4.0
[shields.io download count]: https://img.shields.io/crates/d/thinset.svg
## Usage
Add this to your Cargo.toml:
```toml
[dependencies]
thinset = "0.1"
```
### Description
An implementation of a set and a map using a pair of sparse and dense arrays as backing stores,
based on the paper "An efficient representation for sparse sets" (1993) by Briggs and Torczon.
This type of set is useful when you need to efficiently track set membership for integers
from a large universe, but the values are relatively spread apart.
Compared to the standard library's `HashSet`, clearing a set is O(1) instead of O(n).
Compared to a bitmap-based set, iteration over the set is
proportional to the cardinality of the set (how many elements you have) instead of proportional to the maximum size of the set.
The main downside is that the set uses more memory than other set implementations.
The map behaves identically to the set with the exception that it tracks data alongside
the values that are stored in the set. Under the hood, `SparseSet` is a `SparseMap` of keys to `()`.
The table below compares the asymptotic complexities of several set operations for the sparse set when compared a bit set.
`n` is the number of elements in the set and `u` is the size of the set's universe.
| Operation | Sparse Set | Bit Set |
| --------- | ---------- | ------- |
| Insertion | O(1) | O(1) |
| Removal | O(1) | O(1) |
| Lookup | O(1) | O(1) |
| Clear | O(1) | O(u) |
| Count | O(1) | O(u) |
| Iteration | O(n) | O(u) |
| Union | O(n) | O(u) |
| Intersection | O(n) | O(u) |
| Difference | O(n) | O(u) |
| Complement | O(n) | O(u) |
#### Benchmarks
The following benchmarks were run on a 2020 MacBook Pro with a 2 GHz Intel Core i5 processor.
The benchmark compares `SparseSet` to the standard library's `HashSet` and the `bit-set` crate's `BitSet`.
When inserting 1000 random elements into the set from a universe of [0, 2^16) and then iterating over the set,
the sparse set is **4.1x** faster than the `HashSet` and **1.7x** faster than the `BitSet`:
- `SparseSet`: 160,329 ns/iter (+/- 55,664)
- `BitSet`: 278,428 ns/iter (+/- 42,477)
- `HashSet`: 662,964 ns/iter (+/- 56,851)
Benchmarks are available in examples/bench.rs and can be run with the following command:
```bash
cargo run --example bench
```
#### Examples
```rust
use thinset::SparseSet;
let mut s: SparseSet