Crates.io | refset |
lib.rs | refset |
version | 0.2.0 |
source | src |
created_at | 2020-10-19 16:30:44.129041 |
updated_at | 2020-10-22 01:54:32.255609 |
description | A non-owning HashSet |
homepage | |
repository | https://git.flanchan.moe/flanchan/refset |
max_upload_size | |
id | 302965 |
size | 25,434 |
HashSet
A hash-set analogue that does not own its data.
It can be used to "mark" items without the need to transfer ownership to the map
/// Process arguments while ignoring duplicates
fn process_args(args: impl IntoIterator<Item=String>) {
let mut same= HashRefSet::new();
for argument in args.into_iter()
{
if !same.insert(argument.as_str()) {
// Already processed this input, ignore
continue;
}
//do work...
}
}
serde
crateHashRefSet
and HashType
both implement Serialize
and Deserialize
from the serde
crate if the serde
feature is enabled. By default it is not.
We use the SHA512 hashing algorithm for the implementation at present. I may implement the ability to choose different types, but as of now I think it is sufficient.
Since the item is not inserted itself, we cannot use Eq
to double check there was not a hash collision.
While the hashing algorithm used (Sha512) is extremely unlikely to produce collisions, especially for small data types, keep in mind that it is not infallible.
HashRefSet
is significantly slower than HashSet
, so HashSet
should be preferred in most cases.
Even when Clone
is required to insert into HashSet
, it can be ~10x faster for trivial data structures.
HashRefSet
should be used if Clone
is either not an option, or Clone
is a significantly heavy operation on the type you're inserting.
Benchmark | Tests | Result |
---|---|---|
owning_strings | Inserts String into HashSet by cloning |
~4,538 ns/iter |
non_owning_strings | Inserts str into HashRefSet by reference |
~48,271 ns/iter |
owning_ints | Inserts u32 into HashSet by copy |
~937 ns/iter |
non_owning_ints | Inserts &u32 into HashRefSet by reference |
~31,089 ns/iter |
HashSet
Clone
to insert a copy of the item into a HashSet
is not possible (non-Clone
type) or is a significantly heavy operation. (see benchmarks)HashSet
With the smallmap
feature enabled, the small
module also provides the same API as HashRefSet
via SmallRefMap
.
It is backed by smallmap::Map
instead of HashSet
, which could potentially have some performance or memory usage impacts, or not.
The hashing algorithm and usage is otherwise identical for now, but this may change.
SmallRefMap
Comparing with cloning or copying into smallmap::Map
.
Largely there are the same performance penalties as the above table, with very minor differences.
Benchmark | Tests | Result |
---|---|---|
owning_strings | Inserts String into SmallMap by cloning |
~3,096 ns/iter |
non_owning_strings | Inserts str into SmallRefMap by reference |
~47,302 ns/iter |
owning_ints | Inserts u32 into SmallMap by copy |
~316 ns/iter |
non_owning_ints | Inserts &u32 into SmallRefMap by reference |
~30,046 ns/iter |
Each page of the SmallRefMap
will consume at least 16kb of memory however.
This may not be very desireable, but is still an available feature.
MIT