| Crates.io | spann |
| lib.rs | spann |
| version | 0.2.0 |
| created_at | 2025-12-25 12:44:40.002489+00 |
| updated_at | 2025-12-25 13:32:34.567183+00 |
| description | Proof-of-concept SPANN-style approximate nearest neighbor index in Rust. |
| homepage | |
| repository | https://github.com/RustedBytes/spann |
| max_upload_size | |
| id | 2004518 |
| size | 60,900 |
A proof-of-concept implementation of a SPANN-style approximate nearest neighbor index in Rust. This project focuses on a readable, hackable baseline that still exercises core SPANN ideas: coarse centroids, soft assignment (epsilon closure), and a simple on-disk vector store.
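To make the soft-assignment idea concrete, here is a minimal sketch of an epsilon closure. It assumes the common rule that a vector joins every centroid whose distance is within a factor of (1 + epsilon) of its nearest centroid's distance; the crate may apply the bound differently (for example to squared distances). It uses plain slices instead of the crate's ndarray types, and `squared_l2` and `epsilon_closure` are hypothetical names, not part of the spann API.

```rust
/// Squared Euclidean distance between two equal-length vectors.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Returns the indices of every centroid whose distance to `v` is within
/// a factor of (1 + epsilon) of the nearest centroid's distance.
/// Assumption: the bound is applied to true (not squared) distances.
fn epsilon_closure(v: &[f32], centroids: &[Vec<f32>], epsilon: f32) -> Vec<usize> {
    let dists: Vec<f32> = centroids
        .iter()
        .map(|c| squared_l2(v, c).sqrt())
        .collect();
    let nearest = dists.iter().copied().fold(f32::INFINITY, f32::min);
    let bound = (1.0 + epsilon) * nearest;
    dists
        .iter()
        .enumerate()
        .filter_map(|(i, &d)| (d <= bound).then_some(i))
        .collect()
}
```

With epsilon set to 0 this reduces to ordinary nearest-centroid assignment; larger values duplicate boundary vectors into more posting lists, trading storage for recall.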
f16 and f32 vector support through a shared Scalar trait.
f32 on x86/x86_64.
Create an index file and its on-disk vector store:
cargo run --release -- create /path/to/index.bin --store-dir /path/to/store
Optionally pass --vectors-per-file to control the batch size:
cargo run --release -- create /path/to/index.bin --store-dir /path/to/store --vectors-per-file 2048
Query an existing index file:
cargo run --release -- query /path/to/index.bin --k 3 --rng-factor 0.1
The demo in src/main.rs generates 1,000,000 f16 vectors of dimension 128 (two clusters),
builds the index with k=2 centroids and epsilon=0.15, writes the index metadata to the
requested index file, and serves queries by loading that file.
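The build notes in this README describe taking k centroids from the first vectors and refining them with a small, fixed number of k-means iterations. The following is a rough sketch of that idea using Lloyd's algorithm on plain `Vec<f32>` data. The function names, the empty-cluster handling, and the use of slices rather than ndarray are assumptions made for illustration, not the crate's actual code.

```rust
/// Squared Euclidean distance between two equal-length vectors.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Index of the centroid closest to `v`.
fn nearest_centroid(v: &[f32], centroids: &[Vec<f32>]) -> usize {
    let mut best = 0;
    let mut best_dist = f32::INFINITY;
    for (i, c) in centroids.iter().enumerate() {
        let d = squared_l2(v, c);
        if d < best_dist {
            best = i;
            best_dist = d;
        }
    }
    best
}

/// Seeds `k` centroids from the first `k` vectors, then runs `iterations`
/// rounds of Lloyd's k-means.
/// Assumes `data.len() >= k`, `k >= 1`, and equal dimensions throughout.
fn kmeans_first_k(data: &[Vec<f32>], k: usize, iterations: usize) -> Vec<Vec<f32>> {
    let dim = data[0].len();
    let mut centroids: Vec<Vec<f32>> = data[..k].to_vec();
    for _ in 0..iterations {
        // Accumulate per-cluster sums and counts for this round.
        let mut sums = vec![vec![0.0f32; dim]; k];
        let mut counts = vec![0usize; k];
        for v in data {
            let c = nearest_centroid(v, &centroids);
            counts[c] += 1;
            for (s, x) in sums[c].iter_mut().zip(v) {
                *s += *x;
            }
        }
        for c in 0..k {
            // An empty cluster keeps its previous centroid (an assumption).
            if counts[c] > 0 {
                for (dst, s) in centroids[c].iter_mut().zip(&sums[c]) {
                    *dst = *s / counts[c] as f32;
                }
            }
        }
    }
    centroids
}
```

Seeding from the first vectors is deterministic and cheap, but it depends on input order: if the first k vectors all come from one cluster, a fixed iteration budget may not be enough to separate them.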
Create an index and query it from your own code:
use ndarray::Array1;
use spann::SpannIndex;
let dimension = 128;
let data: Vec<Array1<f32>> = /* your vectors */;
let k_centroids = 64;
let epsilon_closure = 0.2;
let index = SpannIndex::build(dimension, data, k_centroids, epsilon_closure)?;
let query: Array1<f32> = /* query vector */;
let k = 10;
let rng_factor = 0.1;
let results = index.search(&query, k, rng_factor);
results contains (vector_id, squared_distance) pairs ordered from nearest to farthest.
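For intuition about how rng_factor and the result pairs could fit together, here is a hedged sketch of a search routine. This README's search note mentions a (1 + rng_factor) bound relative to the nearest; one plausible reading, assumed here, is that only centroids whose squared distance to the query is within that factor of the nearest centroid's are scanned. `search_sketch`, `postings`, and `vectors` are hypothetical names for illustration and do not reflect the crate's internals.

```rust
use std::collections::HashSet;

/// Squared Euclidean distance between two equal-length vectors.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Scans only the posting lists of centroids whose squared distance to the
/// query is at most (1 + rng_factor) times the nearest centroid's, then
/// returns the `k` closest (vector_id, squared_distance) pairs, nearest first.
fn search_sketch(
    query: &[f32],
    centroids: &[Vec<f32>],
    postings: &[Vec<usize>], // postings[c] = ids assigned to centroid c
    vectors: &[Vec<f32>],    // vectors[id] = stored vector
    k: usize,
    rng_factor: f32,
) -> Vec<(usize, f32)> {
    let centroid_dists: Vec<f32> = centroids
        .iter()
        .map(|c| squared_l2(query, c))
        .collect();
    let nearest = centroid_dists
        .iter()
        .copied()
        .fold(f32::INFINITY, f32::min);
    let bound = (1.0 + rng_factor) * nearest;

    let mut seen = HashSet::new();
    let mut results = Vec::new();
    for (c, &d) in centroid_dists.iter().enumerate() {
        if d > bound {
            continue; // Centroid is too far from the query; skip its list.
        }
        for &id in &postings[c] {
            // Soft assignment can place one id in several posting lists.
            if seen.insert(id) {
                results.push((id, squared_l2(query, &vectors[id])));
            }
        }
    }
    results.sort_by(|a, b| a.1.total_cmp(&b.1));
    results.truncate(k);
    results
}
```

Because the distances are squared, take the square root of the second tuple element if you need true Euclidean distances; the ordering is the same either way.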
Attach metadata to each vector and retrieve it alongside query results:
use ndarray::Array1;
use spann::SpannIndex;
let dimension = 128;
let data: Vec<Array1<f32>> = /* your vectors */;
let metadata: Vec<String> = /* same length as data */;
let k_centroids = 64;
let epsilon_closure = 0.2;
let index = SpannIndex::<f32, String>::build_with_metadata(
    dimension,
    data,
    metadata,
    k_centroids,
    epsilon_closure,
)?;
let query: Array1<f32> = /* query vector */;
let k = 10;
let rng_factor = 0.1;
let results = index.search_with_metadata(&query, k, rng_factor);
for (id, dist, meta) in results {
    println!("{id} {dist} {meta}");
}
Metadata is kept aligned with vector IDs; you can also call index.metadata(id).
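One simple way to keep metadata aligned with vector IDs is to store it in a second vector indexed by the same ID. The sketch below illustrates that layout; `AlignedStore` and its methods are hypothetical and are not the crate's types, and vector IDs are assumed to be `usize` positions.

```rust
/// Hypothetical container: `metadata[id]` describes `vectors[id]`.
struct AlignedStore<M> {
    vectors: Vec<Vec<f32>>,
    metadata: Vec<M>,
}

impl<M> AlignedStore<M> {
    /// Rejects inputs whose lengths differ, since alignment would break.
    fn new(vectors: Vec<Vec<f32>>, metadata: Vec<M>) -> Result<Self, String> {
        if vectors.len() != metadata.len() {
            return Err(format!(
                "got {} vectors but {} metadata entries",
                vectors.len(),
                metadata.len()
            ));
        }
        Ok(Self { vectors, metadata })
    }

    /// Number of stored vectors (and metadata entries).
    fn len(&self) -> usize {
        self.vectors.len()
    }

    /// Looks up metadata by vector id; `None` if the id is out of range.
    fn metadata(&self, id: usize) -> Option<&M> {
        self.metadata.get(id)
    }

    /// Joins (id, distance) hits with their metadata, skipping unknown ids.
    fn with_metadata<'a>(&'a self, hits: &[(usize, f32)]) -> Vec<(usize, f32, &'a M)> {
        hits.iter()
            .filter_map(|&(id, dist)| self.metadata(id).map(|m| (id, dist, m)))
            .collect()
    }
}
```

Checking the lengths at construction time means every later lookup by ID can rely on the two vectors staying in step.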
Build
k centroids from the first vectors and run a small, fixed number of k-means
iterations for refinement.
Search
(1 + rng_factor) of the nearest.
src/lib.rs - SPANN index, data store, and distance kernels.
src/main.rs - CLI demo that builds and queries the index.
Cargo.toml - crate metadata and dependencies.
MIT.
Yehor Smoliakov