cacheline-ef

Crates.iocacheline-ef
lib.rscacheline-ef
version1.1.0
created_at2025-03-28 01:49:16.951699+00
updated_at2025-03-28 23:20:48.002886+00
descriptionPer-cacheline encoding of sorted integer sequences
homepage
repositoryhttps://github.com/RagnarGrootKoerkamp/cacheline-ef
max_upload_size
id1609075
size28,888
Ragnar Groot Koerkamp (RagnarGrootKoerkamp)

documentation

README

Cacheline Elias-Fano

crates.io docs.rs

CachelineEf is an integer encoding that packs chunks of 44 sorted 40-bit values into a single cacheline, using 64/44*8 = 11.6 bits per value. Each chunk can hold increasing values in a range of length 256*84=21504.

CachelineEfVec stores a vector of CachelineEf and provides CachelineEfVec::index and CachelineEfVec::prefetch.

This encoding is efficient when consecutive values differ by roughly 100, where using Elias-Fano directly on the full list would use around 9 bits/value.

The main benefit is that CachelineEf only requires reading a single cacheline per query, where Elias-Fano encoding usually needs 3 reads.

Epserde is supported when the epserde feature flag is enabled.

Layout

The layout is described in detail in the PtrHash paper (arXiv, blog version).

In summary:

  • First store a 4 byte offset, corresponding to the 32 high bits of the smallest value.
  • Then store for each of 44 values the 8 low bits.
  • Lastly we have 16 bytes (128 bits) to encode the high parts. For the i'th value x[i], we set the bit at position i+(x[i]/256 - x[0]/256) to 1.
Commit count: 14

cargo fmt