| Crates.io | embedvec |
| lib.rs | embedvec |
| version | 0.5.1 |
| created_at | 2026-01-22 00:01:01.946231+00 |
| updated_at | 2026-01-22 01:08:21.156515+00 |
| description | Fast, lightweight, in-process vector database with HNSW indexing, metadata filtering, E8 quantization, and PyO3 bindings |
| homepage | |
| repository | https://github.com/WeaveITMeta/embedvec |
| max_upload_size | |
| id | 2060474 |
| size | 193,173 |
Fast, lightweight, in-process vector database with HNSW indexing, SIMD-accelerated distance calculations, metadata filtering, E8 lattice quantization, and optional persistence via Sled or RocksDB.
Benchmarks with 768-dimensional vectors on a 10k dataset, AVX2 enabled:
| Operation | Time | Throughput |
|---|---|---|
| Search (ef=32) | 3.0 ms | 3,300 queries/sec |
| Search (ef=64) | 4.9 ms | 2,000 queries/sec |
| Search (ef=128) | 16.1 ms | 620 queries/sec |
| Search (ef=256) | 23.2 ms | 430 queries/sec |
| Insert (768-dim) | 25.5 ms/100 | 3,900 vectors/sec |
| Distance (cosine) | 122 ns/pair | 8.2M ops/sec |
| Distance (euclidean) | 108 ns/pair | 9.3M ops/sec |
| Distance (dot product) | 91 ns/pair | 11M ops/sec |
Run `cargo bench` to reproduce on your hardware.
| Feature | Description |
|---|---|
| HNSW Indexing | Hierarchical Navigable Small World graph for O(log n) ANN search |
| SIMD Distance | AVX2/FMA accelerated cosine, euclidean, dot product |
| E8 Quantization | Lattice-based compression (4-6× memory reduction) |
| Metadata Filtering | Composable filters: eq, gt, lt, contains, AND/OR/NOT |
| Dual Persistence | Sled (pure Rust) or RocksDB (high performance) |
| Async API | Tokio-compatible async operations |
| PyO3 Bindings | First-class Python support with numpy interop |
| WASM Support | Feature-gated for browser/edge deployment |
Add to your `Cargo.toml`:

```toml
[dependencies]
embedvec = "0.5"
tokio = { version = "1.0", features = ["rt-multi-thread", "macros"] }
serde_json = "1.0"
```
```rust
use embedvec::{Distance, EmbedVec, FilterExpr, Quantization};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create in-memory index with E8 quantization
    let mut db = EmbedVec::builder()
        .dimension(768)
        .metric(Distance::Cosine)
        .m(32)                                    // HNSW connections per layer
        .ef_construction(200)                     // Build-time beam width
        .quantization(Quantization::e8_default()) // 4-6× memory savings
        .build()
        .await?;

    // Add vectors with metadata
    let vectors = vec![
        vec![0.1; 768],
        vec![0.2; 768],
    ];
    let payloads = vec![
        serde_json::json!({"doc_id": "123", "category": "finance", "timestamp": 1737400000}),
        serde_json::json!({"doc_id": "456", "category": "tech", "timestamp": 1737500000}),
    ];
    db.add_many(&vectors, payloads).await?;

    // Search with metadata filter
    let filter = FilterExpr::eq("category", "finance")
        .and(FilterExpr::gt("timestamp", 1730000000));
    let results = db.search(
        &vec![0.15; 768], // query vector
        10,               // k
        128,              // ef_search
        Some(filter),
    ).await?;

    for hit in results {
        println!("id: {}, score: {:.4}, payload: {:?}", hit.id, hit.score, hit.payload);
    }
    Ok(())
}
```
Python bindings are installed from PyPI:

```bash
pip install embedvec-py
```
```python
import embedvec_py
import numpy as np

# Create database with E8 quantization
db = embedvec_py.EmbedVec(
    dim=768,
    metric="cosine",
    m=32,
    ef_construction=200,
    quantization="e8-10bit",  # or None, "e8-8bit", "e8-12bit"
    persist_path=None,        # or "/tmp/embedvec.db"
)

# Add vectors (numpy array or list-of-lists)
vectors = np.random.randn(50000, 768).tolist()
payloads = [{"doc_id": str(i), "tag": "news" if i % 3 == 0 else "blog"}
            for i in range(50000)]
db.add_many(vectors, payloads)

# Search with filter
query = np.random.randn(768).tolist()
hits = db.search(
    query_vector=query,
    k=10,
    ef_search=128,
    filter={"tag": "news"},  # simple exact-match shorthand
)
for hit in hits:
    print(f"score: {hit['score']:.4f} id: {hit['id']} {hit['payload']}")
```
Builder options:

```rust
EmbedVec::builder()
    .dimension(768)                   // Vector dimension (required)
    .metric(Distance::Cosine)         // Distance metric
    .m(32)                            // HNSW M parameter
    .ef_construction(200)             // HNSW build parameter
    .quantization(Quantization::None) // Or E8 for compression
    .persistence("path/to/db")        // Optional disk persistence
    .build()
    .await?;
```
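Graph quality is fixed by `m` and `ef_construction` when the index is built, while `ef_search` can vary per query to trade latency for recall. A minimal sketch of that trade-off, assuming `search` takes `&self` and a `&[f32]` query as in the quick start; the values are illustrative, not tuned recommendations:

```rust
use embedvec::EmbedVec;

// Illustrative only: the same index serves cheap and thorough queries by
// varying ef_search (the per-query beam width); k stays at 10 in both.
async fn query_at_two_settings(
    db: &EmbedVec,
    query: &[f32],
) -> Result<(), Box<dyn std::error::Error>> {
    let fast = db.search(query, 10, 32, None).await?;      // low ef: fastest, lower recall
    let thorough = db.search(query, 10, 256, None).await?; // high ef: slower, higher recall
    println!("fast: {} hits, thorough: {} hits", fast.len(), thorough.len());
    Ok(())
}
```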
| Method | Description |
|---|---|
| `add(vector, payload)` | Add single vector with metadata |
| `add_many(vectors, payloads)` | Batch add vectors |
| `search(query, k, ef_search, filter)` | Find k nearest neighbors |
| `len()` | Number of vectors |
| `clear()` | Remove all vectors |
| `flush()` | Persist to disk (if enabled) |
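A minimal sketch exercising the maintenance methods above on an in-memory index, assuming `add`, `flush`, and `clear` are async like `add_many` and that `len` is a synchronous count:

```rust
use embedvec::{Distance, EmbedVec};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut db = EmbedVec::builder()
        .dimension(4)
        .metric(Distance::Cosine)
        .build()
        .await?;

    // Single insert: one vector plus its JSON payload.
    db.add(vec![0.1, 0.2, 0.3, 0.4], serde_json::json!({"doc_id": "1"})).await?;
    assert_eq!(db.len(), 1);

    // A no-op for a pure in-memory index; persists pending writes
    // when a persistence backend is configured.
    db.flush().await?;

    // Drop every vector and payload.
    db.clear().await?;
    assert_eq!(db.len(), 0);
    Ok(())
}
```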
The full set of filter constructors:

```rust
// Equality
FilterExpr::eq("category", "finance")

// Comparisons
FilterExpr::gt("timestamp", 1730000000)
FilterExpr::gte("score", 0.5)
FilterExpr::lt("price", 100)
FilterExpr::lte("count", 10)

// String operations
FilterExpr::contains("name", "test")
FilterExpr::starts_with("path", "/api")

// Membership
FilterExpr::in_values("status", vec!["active", "pending"])

// Existence
FilterExpr::exists("optional_field")

// Boolean composition
FilterExpr::eq("a", 1)
    .and(FilterExpr::eq("b", 2))
    .or(FilterExpr::not(FilterExpr::eq("c", 3)))
```
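These constructors compose into a single `FilterExpr` value. A sketch (the field names are illustrative) combining membership, existence, and negation:

```rust
use embedvec::FilterExpr;

// Keep documents whose status is "active" or "pending", that carry a
// "reviewed" field, and that are not in the internal category.
let filter = FilterExpr::in_values("status", vec!["active", "pending"])
    .and(FilterExpr::exists("reviewed"))
    .and(FilterExpr::not(FilterExpr::eq("category", "internal")));

// Passed to search exactly as in the quick start:
// let hits = db.search(&query, 10, 128, Some(filter)).await?;
```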
| Mode | Bits/Dim | Memory/Vector (768d) | Recall@10 |
|---|---|---|---|
| None | 32 | ~3.1 KB | 100% |
| E8 8-bit | ~1.0 | ~170 B | 92–97% |
| E8 10-bit | ~1.25 | ~220 B | 96–99% |
| E8 12-bit | ~1.5 | ~280 B | 98–99% |
```rust
// No quantization (full f32)
Quantization::None

// E8 with Hadamard preprocessing (recommended)
Quantization::E8 {
    bits_per_block: 10,
    use_hadamard: true,
    random_seed: 0xcafef00d,
}

// Convenience constructor
Quantization::e8_default() // 10-bit with Hadamard
```
embedvec implements E8 lattice quantization based on the QuIP#/NestQuant/QTIP line of research (2024-2025). The E8 lattice achieves the densest sphere packing in eight dimensions, giving better rate-distortion than scalar or product quantization for the normalized embeddings typical of LLM/RAG applications.
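As a rough sanity check on the quantization table, here is the storage arithmetic for one 768-dim vector, assuming 8-dimensional E8 blocks; the per-vector overhead figure is inferred from the table, not a documented layout:

```rust
// Back-of-envelope storage math for E8 quantization of a 768-dim vector.
// The 8-dim block size follows from the E8 lattice itself; the overhead
// estimate is inferred from the table above, not a documented format.
const DIMS: usize = 768;
const BLOCK_DIMS: usize = 8;      // E8 codes 8 dimensions per block
const BITS_PER_BLOCK: usize = 10; // Quantization::E8 { bits_per_block: 10, .. }

fn main() {
    let blocks = DIMS / BLOCK_DIMS;          // 96 blocks
    let code_bits = blocks * BITS_PER_BLOCK; // 960 bits = 1.25 bits/dim
    let code_bytes = code_bits / 8;          // 120 B of lattice codes
    let f32_bytes = DIMS * 4;                // 3072 B ≈ 3.1 KB at full precision
    println!("codes: {code_bytes} B vs f32: {f32_bytes} B");
    // The table's ~220 B/vector implies roughly 100 B of per-vector
    // metadata (scales/norms) on top of the 120 B of codes.
}
```

Multiplying this per-vector figure out is roughly what the scaling estimates below reflect.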
| Operation | ~1M vectors | ~10M vectors | Notes |
|---|---|---|---|
| Query (k=10, ef=128) | 0.4–1.2 ms | 1–4 ms | Cosine, no filter |
| Query + filter | 0.6–2.5 ms | 2–8 ms | Depends on selectivity |
| Memory (FP32) | ~3.1 GB | ~31 GB | Full precision |
| Memory (E8-10bit) | ~0.5 GB | ~5 GB | 4-6× reduction |
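The gap between ~220 B of quantized data per vector and the ~0.5 GB-per-million figure suggests the HNSW graph links account for most of the remaining footprint. A back-of-envelope sketch; the link counts and 4-byte link width are assumptions, not measurements:

```rust
// Rough capacity planning for 1M E8-10bit vectors. Link counts follow the
// usual HNSW layout (about 2*m links at layer 0 plus a few on upper
// layers); the 4-byte neighbor id width is an assumption.
fn main() {
    let n: u64 = 1_000_000;
    let quantized: u64 = 220;       // ~220 B/vector (quantization table)
    let links: u64 = 2 * 32 + 16;   // ~2*m layer-0 links + upper layers, m = 32
    let graph: u64 = links * 4;     // ~320 B/vector of neighbor ids
    let total = n * (quantized + graph);
    println!("~{:.2} GB for 1M vectors", total as f64 / 1e9);
    // ≈ 0.54 GB, in line with the ~0.5 GB estimate in the table.
}
```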
Enable optional features in `Cargo.toml`:

```toml
[dependencies]
embedvec = { version = "0.5", features = ["persistence-sled", "async"] }
```
| Feature | Description | Default |
|---|---|---|
| `persistence-sled` | On-disk storage via Sled (pure Rust) | ✓ |
| `persistence-rocksdb` | On-disk storage via RocksDB (higher performance) | ✗ |
| `async` | Tokio async API | ✓ |
| `python` | PyO3 bindings | ✗ |
| `simd` | SIMD distance optimizations | ✗ |
| `wasm` | WebAssembly support | ✗ |
embedvec supports two persistence backends. Sled is a pure-Rust embedded database and a good default for most use cases:
```rust
use embedvec::{Distance, EmbedVec};

// Simple path-based persistence (uses Sled)
let db = EmbedVec::with_persistence("/path/to/db", 768, Distance::Cosine, 32, 200).await?;

// Or via the builder
let db = EmbedVec::builder()
    .dimension(768)
    .persistence("/path/to/db")
    .build()
    .await?;
```
RocksDB is a higher-performance LSM-tree database, better suited to write-heavy workloads and large datasets:
```toml
[dependencies]
embedvec = { version = "0.5", features = ["persistence-rocksdb", "async"] }
```
```rust
use embedvec::{BackendConfig, BackendType, Distance, EmbedVec};

// Configure RocksDB backend
let config = BackendConfig::new("/path/to/db")
    .backend(BackendType::RocksDb)
    .cache_size(256 * 1024 * 1024); // 256 MB cache

let db = EmbedVec::with_backend(config, 768, Distance::Cosine, 32, 200).await?;
```
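A sketch of a persist-and-reopen round trip with the RocksDB backend, reusing the single-vector `add` from the API table. It assumes, without the crate confirming it, that dropping the handle closes the backend and that reopening the same path reloads the index:

```rust
use embedvec::{BackendConfig, BackendType, Distance, EmbedVec};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let path = "/tmp/embedvec-rocks";

    {
        let config = BackendConfig::new(path).backend(BackendType::RocksDb);
        let mut db = EmbedVec::with_backend(config, 768, Distance::Cosine, 32, 200).await?;
        db.add(vec![0.1; 768], serde_json::json!({"doc_id": "1"})).await?;
        db.flush().await?; // make sure pending writes reach disk
    } // handle dropped: backend assumed closed here

    // Reopen the same path; the persisted vector should still be there.
    let config = BackendConfig::new(path).backend(BackendType::RocksDb);
    let db = EmbedVec::with_backend(config, 768, Distance::Cosine, 32, 200).await?;
    assert_eq!(db.len(), 1);
    Ok(())
}
```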
```bash
# Run all tests
cargo test

# Run with specific features
cargo test --features "persistence-sled"

# Run benchmarks
cargo bench

# Install criterion, then run benchmarks through it
cargo install cargo-criterion
cargo criterion

# Memory profiling (requires jemalloc)
cargo bench --features "jemalloc"
```
Licensed under MIT OR Apache-2.0.
Contributions welcome! Please read CONTRIBUTING.md before submitting PRs.
embedvec — The "SQLite of vector search" for Rust-first teams in 2026.