| Crates.io | small-map |
| lib.rs | small-map |
| version | 0.1.5 |
| created_at | 2023-11-02 06:51:32.722293+00 |
| updated_at | 2026-01-07 23:04:47.693469+00 |
| description | An inline SIMD accelerated hashmap designed for small amount of data. |
| homepage | |
| repository | https://github.com/ihciah/small-map |
| max_upload_size | |
| id | 1022342 |
| size | 180,852 |
An inline SIMD accelerated hashmap designed for small amount of data.
```rust
use small_map::FxSmallMap;

// Don't worry about the 32 here; it's the inline capacity, not a hard limit.
// If the size grows beyond 32, the map automatically switches to a heap-backed impl.
type MySmallMap<K, V> = FxSmallMap<32, K, V>;

let mut map = MySmallMap::new();
map.insert(1_u8, 2_u8);
assert_eq!(*map.get(&1).unwrap(), 2);
```
This map is best suited for short-lived data with small keys and values, for example HTTP or RPC headers. You can use it like any other hashmap without worrying about its size.
The choice of N involves a trade-off between inline storage benefits and struct size overhead:
`Struct size = N * (1 + sizeof(K) + sizeof(V)) + 2 * sizeof(usize) + sizeof(Hasher)`
Tip: Choose N as a multiple of SIMD_WIDTH (16 on x86_64, 8 on aarch64) for optimal SIMD utilization. When uncertain, start with N = 16 or N = 32.
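As a rough illustration, the formula above can be evaluated directly. The helper below is hypothetical (not part of the crate's API), and `hasher_size` stands in for whatever the chosen hasher's state occupies; the `1` byte per slot is the h2 control byte.

```rust
use std::mem::size_of;

// A rough estimate of the inline struct size following the formula above.
// `hasher_size` is a hypothetical parameter for the hasher's state size;
// the `1` byte per slot is the h2 control byte. This is an illustration,
// not the crate's exact layout.
const fn estimated_size<K, V>(n: usize, hasher_size: usize) -> usize {
    n * (1 + size_of::<K>() + size_of::<V>()) + 2 * size_of::<usize>() + hasher_size
}

// e.g. N = 32 with u8 keys/values and an 8-byte hasher state:
// estimated_size::<u8, u8>(32, 8) == 32 * 3 + 16 + 8 == 120 bytes
```

Plugging in your own `K`/`V` this way makes it easy to check whether a given `N` keeps the struct at a size you are comfortable passing around.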
The performance of SmallMap depends on the capacity, the key/value sizes, and the operation pattern. Setting the size to 16 (or 32) is recommended, because with SSE2 a single instruction can probe 16 control bytes at once. SmallMap is only recommended for small keys and values, such as numbers and strings.
> [!NOTE]
> The lower the time (ns), the better the performance. You can find the benchmark source code in the `benches` folder. Benchmarks use u8 keys/values on an AMD Ryzen 9 7950X.
Like SmallVec, for a HashMap holding a small amount of data, inlining the data avoids heap allocation overhead and improves performance.
SmallMap stores key-value pairs in a contiguous inline array, with an additional control byte array storing the hash fingerprint for SIMD-accelerated lookup.
The lookup strategy is determined at compile time based on N and at runtime based on the current length:

- **Linear search** (no hash computation): when `N < SIMD_WIDTH` or `len < SIMD_WIDTH`
- **SIMD search** (with the h2 hash): when `N >= SIMD_WIDTH` and `len >= SIMD_WIDTH`
- **Heap fallback**: when `len > N`
```
┌─────────────────────────────────────┐
│ AlignedGroups (N bytes)             │ ← h2 control bytes for SIMD
├─────────────────────────────────────┤
│ len: usize                          │
├─────────────────────────────────────┤
│ data: [(K, V); N]                   │ ← contiguous key-value pairs
└─────────────────────────────────────┘
```
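The diagram above roughly corresponds to a struct like the one below. This is an illustrative sketch only: the real crate uses uninitialized storage and alignment wrappers for the control bytes, and the name `SmallMapInline` is hypothetical.

```rust
// Hypothetical struct mirroring the layout diagram above; the real crate
// wraps the control bytes for SIMD alignment and keeps slots uninitialized
// until used.
#[repr(C)]
struct SmallMapInline<const N: usize, K, V> {
    groups: [u8; N],   // h2 control bytes, probed SIMD_WIDTH at a time
    len: usize,        // number of live entries
    data: [(K, V); N], // contiguous key-value slots
}
```

Keeping the control bytes in their own contiguous array is what lets a single SIMD load compare many fingerprints at once before any full key comparison happens.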
Let W = SIMD_WIDTH (16 on x86_64, 8 on aarch64)
| Operation | len ≤ W | W < len ≤ N | len > N |
|---|---|---|---|
| `get` | O(len) | O(len/W) | O(1)* |
| `insert` | O(len) | O(len/W) | O(1)* |
| `insert_unique_unchecked` | O(1) | O(1) | O(1)* |
| `remove` | O(len) | O(len/W) | O(1)* |
| `iter` | O(len) | O(len) | O(capacity) |

\* amortized
This project heavily references and borrows from hashbrown, a very elegant and efficient implementation.