| Crates.io | ruvector-router-core |
| lib.rs | ruvector-router-core |
| version | 0.1.30 |
| created_at | 2025-11-21 15:07:32.815511+00 |
| updated_at | 2026-01-04 19:43:24.603995+00 |
| description | Core vector database and neural routing inference engine |
| homepage | |
| repository | https://github.com/ruvnet/ruvector |
| max_upload_size | |
| id | 1943701 |
| size | 114,922 |
High-performance vector database and neural routing inference engine built in Rust.
Core engine powering Ruvector's intelligent request distribution, model selection, and sub-millisecond vector similarity search. Combines advanced indexing algorithms with SIMD-optimized distance calculations for maximum performance.
Router Core is the foundation of Ruvector's vector database capabilities, providing:
Add to your Cargo.toml:
[dependencies]
ruvector-router-core = "0.1.30"
Or use the full ruvector package:
[dependencies]
ruvector-core = "0.1.0"
use router_core::{VectorDB, VectorEntry, SearchQuery, DistanceMetric};
use std::collections::HashMap;
// Create database with builder pattern
let db = VectorDB::builder()
.dimensions(384) // Vector dimensions
.distance_metric(DistanceMetric::Cosine)
.hnsw_m(32) // HNSW connections per node
.hnsw_ef_construction(200) // Construction accuracy
.storage_path("./vectors.db")
.build()?;
// Insert vectors
let entry = VectorEntry {
id: "doc1".to_string(),
vector: vec![0.1; 384],
metadata: HashMap::new(),
timestamp: chrono::Utc::now().timestamp(),
};
db.insert(entry)?;
// Search for similar vectors
let query = SearchQuery {
vector: vec![0.1; 384],
k: 10, // Top 10 results
filters: None,
threshold: Some(0.8), // Minimum similarity
ef_search: Some(100), // Search accuracy
};
let results = db.search(query)?;
for result in results {
println!("{}: {}", result.id, result.score);
}
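Under the hood, the HNSW index approximates exact nearest-neighbour search. As a point of reference, here is a stdlib-only sketch of the exact (brute-force) top-k cosine search that the index approximates; `cosine_similarity` and `top_k` are illustrative helpers, not part of the router-core API.

```rust
// Exact top-k cosine search: the baseline that HNSW approximates.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Return the ids of the `k` entries most similar to `query`, best first.
fn top_k<'a>(entries: &'a [(String, Vec<f32>)], query: &[f32], k: usize) -> Vec<&'a str> {
    let mut scored: Vec<(&str, f32)> = entries
        .iter()
        .map(|(id, v)| (id.as_str(), cosine_similarity(v, query)))
        .collect();
    // Sort by descending similarity, then keep the first k ids.
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.into_iter().take(k).map(|(id, _)| id).collect()
}

fn main() {
    let entries = vec![
        ("doc1".to_string(), vec![1.0, 0.0]),
        ("doc2".to_string(), vec![0.0, 1.0]),
        ("doc3".to_string(), vec![0.9, 0.1]),
    ];
    println!("{:?}", top_k(&entries, &[1.0, 0.0], 2)); // most similar first
}
```

This scans every vector (O(n) per query), which is exactly the cost HNSW's layered graph avoids at scale.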
use router_core::{VectorDB, VectorEntry};
// Insert multiple vectors efficiently
let entries: Vec<VectorEntry> = (0..1000)
.map(|i| VectorEntry {
id: format!("doc{}", i),
vector: vec![0.1; 384],
metadata: HashMap::new(),
timestamp: chrono::Utc::now().timestamp(),
})
.collect();
// Batch insert (much faster than individual inserts)
db.insert_batch(entries)?;
// Check statistics
let stats = db.stats();
println!("Total vectors: {}", stats.total_vectors);
println!("Avg latency: {:.2}µs", stats.avg_query_latency_us);
use router_core::{VectorDB, DistanceMetric, QuantizationType};
let db = VectorDB::builder()
.dimensions(768) // Larger embeddings
.max_elements(10_000_000) // 10M vectors
.distance_metric(DistanceMetric::Cosine) // Cosine similarity
.hnsw_m(64) // More connections = higher recall
.hnsw_ef_construction(400) // Higher accuracy during build
.hnsw_ef_search(200) // Search-time accuracy
.quantization(QuantizationType::Scalar) // 4x memory compression
.mmap_vectors(true) // Memory-mapped storage
.storage_path("./large_db.redb")
.build()?;
Router Core supports multiple routing strategies for intelligent request distribution:
Simple load balancing across endpoints:
use router_core::routing::{Router, RoundRobinStrategy};
let router = Router::new(RoundRobinStrategy::new(vec![
"http://model1:8080",
"http://model2:8080",
"http://model3:8080",
]));
let endpoint = router.select_endpoint(&query)?;
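The idea behind round-robin is a rotating index over the endpoint list. A minimal, thread-safe sketch of that mechanism (the `RoundRobin` type here is illustrative, not the crate's `RoundRobinStrategy`):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Minimal round-robin endpoint selector.
struct RoundRobin {
    endpoints: Vec<String>,
    next: AtomicUsize,
}

impl RoundRobin {
    fn new(endpoints: Vec<String>) -> Self {
        Self { endpoints, next: AtomicUsize::new(0) }
    }

    /// Hand out endpoints in rotation; the atomic counter keeps it thread-safe.
    fn select(&self) -> &str {
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.endpoints.len();
        &self.endpoints[i]
    }
}

fn main() {
    let rr = RoundRobin::new(vec![
        "http://model1:8080".into(),
        "http://model2:8080".into(),
        "http://model3:8080".into(),
    ]);
    for _ in 0..4 {
        println!("{}", rr.select()); // cycles model1, model2, model3, model1
    }
}
```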
Route to fastest available endpoint:
use router_core::routing::{Router, LatencyBasedStrategy};
let router = Router::new(LatencyBasedStrategy::new(vec![
("http://model1:8080", 50), // 50ms avg latency
("http://model2:8080", 30), // 30ms avg latency (preferred)
("http://model3:8080", 100), // 100ms avg latency
]));
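In the simplest case, latency-based selection reduces to "pick the endpoint with the lowest observed average latency". A sketch of that decision (the `fastest` helper is hypothetical, not the crate's API):

```rust
// Pick the endpoint with the lowest average latency (in ms).
fn fastest<'a>(endpoints: &'a [(&'a str, u32)]) -> &'a str {
    endpoints
        .iter()
        .min_by_key(|(_, latency_ms)| *latency_ms)
        .map(|(ep, _)| *ep)
        .expect("at least one endpoint")
}

fn main() {
    let endpoints = [
        ("http://model1:8080", 50),
        ("http://model2:8080", 30),
        ("http://model3:8080", 100),
    ];
    println!("{}", fastest(&endpoints)); // lowest latency wins: model2
}
```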
Route based on query similarity to model specializations:
use router_core::routing::{Router, SemanticStrategy};
// Define model specializations with example vectors
let models = vec![
("general-model", vec![0.1; 384]), // General queries
("code-model", vec![0.8, 0.2, ...]), // Code-related queries
("math-model", vec![0.3, 0.9, ...]), // Math queries
];
let router = Router::new(SemanticStrategy::new(models));
// Routes to most appropriate model based on query vector
let endpoint = router.select_endpoint(&query_vector)?;
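Conceptually, semantic routing scores the query embedding against each model's specialization vector and picks the best match. A self-contained sketch using cosine similarity (the `route` helper and 2-dimensional vectors are illustrative, not the crate's internals):

```rust
// Cosine similarity between two vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm(a) * norm(b))
}

/// Pick the model whose specialization vector best matches the query.
fn route<'a>(models: &'a [(&'a str, Vec<f32>)], query: &[f32]) -> &'a str {
    models
        .iter()
        .max_by(|(_, a), (_, b)| cosine(a, query).partial_cmp(&cosine(b, query)).unwrap())
        .map(|(name, _)| *name)
        .expect("at least one model")
}

fn main() {
    let models = vec![
        ("code-model", vec![0.9, 0.1]),
        ("math-model", vec![0.1, 0.9]),
    ];
    // A query embedding that leans toward the code axis.
    println!("{}", route(&models, &[0.8, 0.2])); // code-model
}
```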
Learn optimal routing decisions over time:
use router_core::routing::{Router, AdaptiveStrategy};
let mut router = Router::new(AdaptiveStrategy::new());
// Router learns from feedback
router.record_request(&query, &endpoint, latency, success)?;
// Routing improves with more data
let best_endpoint = router.select_endpoint(&query)?;
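One simple way such feedback-driven routing can work is an exponential moving average of latency per endpoint, preferring the current best. This is a sketch of the technique only; the real `AdaptiveStrategy` may use a different learning rule, and the names here are hypothetical:

```rust
use std::collections::HashMap;

// Feedback-driven selector: EMA of latency per endpoint.
struct Adaptive {
    avg_latency_ms: HashMap<String, f64>,
    alpha: f64, // EMA smoothing factor
}

impl Adaptive {
    fn new() -> Self {
        Self { avg_latency_ms: HashMap::new(), alpha: 0.2 }
    }

    /// Fold one observed request latency into the endpoint's running average.
    fn record(&mut self, endpoint: &str, latency_ms: f64) {
        let entry = self.avg_latency_ms.entry(endpoint.to_string()).or_insert(latency_ms);
        *entry = (1.0 - self.alpha) * *entry + self.alpha * latency_ms;
    }

    /// Choose the endpoint with the lowest learned average latency.
    fn best(&self) -> Option<&str> {
        self.avg_latency_ms
            .iter()
            .min_by(|a, b| a.1.partial_cmp(b.1).unwrap())
            .map(|(ep, _)| ep.as_str())
    }
}

fn main() {
    let mut router = Adaptive::new();
    router.record("http://model1:8080", 80.0);
    router.record("http://model2:8080", 30.0);
    router.record("http://model2:8080", 35.0);
    println!("{:?}", router.best()); // model2 has the lower average
}
```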
Router Core supports multiple distance metrics with SIMD optimization:
Best for normalized embeddings (recommended for most AI applications):
use router_core::{DistanceMetric, distance::calculate_distance};
let a = vec![1.0, 0.0, 0.0];
let b = vec![0.9, 0.1, 0.0];
let dist = calculate_distance(&a, &b, DistanceMetric::Cosine)?;
// Returns 1 - cosine_similarity (0 = identical, 2 = opposite)
Measures absolute geometric distance:
let dist = calculate_distance(&a, &b, DistanceMetric::Euclidean)?;
// Returns sqrt(sum((a[i] - b[i])^2))
Fast similarity for pre-normalized vectors:
let dist = calculate_distance(&a, &b, DistanceMetric::DotProduct)?;
// Returns -sum(a[i] * b[i]) (negated for distance)
Sum of absolute differences:
let dist = calculate_distance(&a, &b, DistanceMetric::Manhattan)?;
// Returns sum(|a[i] - b[i]|)
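For reference, the four metrics can be written in plain Rust directly from the formulas in the comments above. This is a scalar sketch, not the crate's SIMD-optimized implementation:

```rust
// Cosine distance: 1 - cosine_similarity (0 = identical, 2 = opposite).
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    1.0 - dot / (norm(a) * norm(b))
}

// Euclidean distance: sqrt(sum((a[i] - b[i])^2)).
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

// Dot product, negated so that "smaller = more similar" holds.
fn dot_product_distance(a: &[f32], b: &[f32]) -> f32 {
    -a.iter().zip(b).map(|(x, y)| x * y).sum::<f32>()
}

// Manhattan distance: sum(|a[i] - b[i]|).
fn manhattan(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}

fn main() {
    let a = [1.0, 0.0, 0.0];
    let b = [0.0, 1.0, 0.0];
    println!("cosine:    {}", cosine_distance(&a, &b)); // 1.0 (orthogonal)
    println!("euclidean: {}", euclidean(&a, &b));
    println!("dot:       {}", dot_product_distance(&a, &b));
    println!("manhattan: {}", manhattan(&a, &b));
}
```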
Reduce memory usage with minimal accuracy loss:
Compress float32 to int8:
use router_core::{QuantizationType, VectorDB};
let db = VectorDB::builder()
.dimensions(384)
.quantization(QuantizationType::Scalar)
.build()?;
// Automatic quantization on insert
// 384 dims × 4 bytes = 1536 bytes → 384 bytes + overhead
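A minimal sketch of the float32-to-int8 idea: scale each vector so its largest component maps to 127, round, and reconstruct. The crate's actual codec may differ; the helper names are illustrative.

```rust
// Scalar quantization: f32 -> i8 with a per-vector scale factor.
fn quantize_scalar(v: &[f32]) -> (Vec<i8>, f32) {
    let max_abs = v.iter().fold(0.0f32, |m, x| m.max(x.abs()));
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let codes = v.iter().map(|x| (x / scale).round() as i8).collect();
    (codes, scale)
}

// Reconstruct approximate f32 values from the int8 codes.
fn dequantize_scalar(codes: &[i8], scale: f32) -> Vec<f32> {
    codes.iter().map(|c| *c as f32 * scale).collect()
}

fn main() {
    let v = vec![0.5, -1.0, 0.25, 0.0];
    let (codes, scale) = quantize_scalar(&v);
    let back = dequantize_scalar(&codes, scale);
    // 4 bytes/dim shrinks to 1 byte/dim, at a small reconstruction error.
    for (orig, rec) in v.iter().zip(&back) {
        assert!((orig - rec).abs() < 0.01);
    }
    println!("codes = {:?}, scale = {}", codes, scale);
}
```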
Divide vector into subspaces and quantize independently:
let db = VectorDB::builder()
.dimensions(384)
.quantization(QuantizationType::Product {
subspaces: 8, // Divide into 8 subspaces
k: 256, // 256 centroids per subspace
})
.build()?;
// 384 dims × 4 bytes = 1536 bytes → 8 bytes + overhead
Compress to 1 bit per dimension:
let db = VectorDB::builder()
.dimensions(384)
.quantization(QuantizationType::Binary)
.build()?;
// 384 dims × 4 bytes = 1536 bytes → 48 bytes + overhead
// Fast Hamming distance for similarity
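The 1-bit scheme can be sketched with a sign-bit pack into `u64` words, compared via `count_ones()`. This illustrates the technique only, not the crate's implementation:

```rust
// Binary quantization: keep 1 sign bit per dimension, packed into u64 words.
fn binarize(v: &[f32]) -> Vec<u64> {
    let mut words = vec![0u64; (v.len() + 63) / 64];
    for (i, x) in v.iter().enumerate() {
        if *x > 0.0 {
            words[i / 64] |= 1 << (i % 64);
        }
    }
    words
}

// Hamming distance: number of differing bits across the packed words.
fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

fn main() {
    let a = binarize(&[0.3, -0.5, 0.8, -0.1]);
    let b = binarize(&[0.4, -0.2, -0.6, -0.3]);
    // Only dimension 2 flips sign between the two vectors.
    println!("hamming = {}", hamming(&a, &b)); // 1
}
```

The XOR-plus-popcount comparison is why binary quantization trades recall for very fast distance computation.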
use router_core::quantization::calculate_compression_ratio;
let dims = 384;
let none_ratio = calculate_compression_ratio(dims, QuantizationType::None);
// 1x - no compression
let scalar_ratio = calculate_compression_ratio(dims, QuantizationType::Scalar);
// ~4x compression
let product_ratio = calculate_compression_ratio(
dims,
QuantizationType::Product { subspaces: 8, k: 256 }
);
// ~8-16x compression
let binary_ratio = calculate_compression_ratio(dims, QuantizationType::Binary);
// ~32x compression
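The per-vector byte counts behind these ratios follow directly from the storage formats quoted above (codebook and index overhead excluded). A sketch calculation; `bytes_per_vector` is a hypothetical helper, not the crate's API:

```rust
// Bytes needed to store one 384-dim vector under each scheme.
fn bytes_per_vector(dims: usize, scheme: &str) -> usize {
    match scheme {
        "none" => dims * 4,         // f32: 4 bytes per dimension
        "scalar" => dims,           // i8: 1 byte per dimension
        "product8" => 8,            // 8 subspaces x 1-byte code each
        "binary" => (dims + 7) / 8, // 1 bit per dimension
        _ => unreachable!("unknown scheme"),
    }
}

fn main() {
    for scheme in ["none", "scalar", "product8", "binary"] {
        println!("{:8} {:>5} bytes", scheme, bytes_per_vector(384, scheme));
    }
    // none = 1536, scalar = 384, product8 = 8, binary = 48
}
```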
Tune the HNSW index for your performance/accuracy requirements:
Controls graph connectivity and search accuracy:
// Low M = faster build, less memory, lower recall
let db_fast = VectorDB::builder()
.hnsw_m(16) // Minimal connections
.build()?;
// Medium M = balanced (default)
let db_balanced = VectorDB::builder()
.hnsw_m(32) // Default setting
.build()?;
// High M = slower build, more memory, higher recall
let db_accurate = VectorDB::builder()
.hnsw_m(64) // Maximum accuracy
.build()?;
Controls accuracy during index construction:
// Fast build, lower recall
let db_fast = VectorDB::builder()
.hnsw_ef_construction(100)
.build()?;
// Balanced (default)
let db_balanced = VectorDB::builder()
.hnsw_ef_construction(200)
.build()?;
// Slow build, maximum recall
let db_accurate = VectorDB::builder()
.hnsw_ef_construction(400)
.build()?;
Can be adjusted per query for dynamic performance/accuracy tradeoff:
// Fast search, lower recall
let query_fast = SearchQuery {
vector: query_vec,
k: 10,
ef_search: Some(50), // Override default
..Default::default()
};
// Accurate search
let query_accurate = SearchQuery {
vector: query_vec,
k: 10,
ef_search: Some(200), // Higher accuracy
..Default::default()
};
Route queries to specialized models based on content:
// Route code questions to code model, math to math model, etc.
let router = SemanticRouter::new(vec![
("gpt-4-code", code_specialization_vector),
("gpt-4-math", math_specialization_vector),
("gpt-4-general", general_specialization_vector),
]);
let best_model = router.route(&user_query_embedding)?;
Distribute inference load across multiple servers:
// Balance load across 10 GPU servers
let router = LoadBalancer::new(vec![
"gpu-0.internal:8080",
"gpu-1.internal:8080",
// ... gpu-9
]);
let endpoint = router.next_endpoint()?;
Fast context retrieval for LLMs:
// Store document embeddings
for doc in documents {
let embedding = embed_model.encode(&doc.text)?;
db.insert(VectorEntry {
id: doc.id,
vector: embedding,
metadata: doc.metadata,
timestamp: now(),
})?;
}
// Retrieve relevant context for query
let query_embedding = embed_model.encode(&user_query)?;
let context_docs = db.search(SearchQuery {
vector: query_embedding,
k: 5, // Top 5 most relevant
threshold: Some(0.7),
..Default::default()
})?;
Build intelligent search engines:
// Index product catalog
for product in catalog {
let embedding = encode_product(&product)?;
db.insert(VectorEntry {
id: product.sku,
vector: embedding,
metadata: product.to_metadata(),
timestamp: now(),
})?;
}
// Search by natural language
let search_embedding = encode_query("comfortable running shoes")?;
let results = db.search(SearchQuery {
vector: search_embedding,
k: 20,
filters: Some(HashMap::from([
("category", "footwear"),
("in_stock", true),
])),
..Default::default()
})?;
Store and retrieve agent experiences:
// Store agent observations
struct AgentMemory {
db: VectorDB,
}
impl AgentMemory {
pub fn remember(&self, observation: &str, context: Vec<f32>) -> Result<()> {
self.db.insert(VectorEntry {
id: uuid::Uuid::new_v4().to_string(),
vector: context,
metadata: HashMap::from([
("observation", observation.into()),
("timestamp", now().into()),
]),
timestamp: now(),
})
}
pub fn recall(&self, query_context: Vec<f32>, k: usize) -> Result<Vec<String>> {
let results = self.db.search(SearchQuery {
vector: query_context,
k,
..Default::default()
})?;
Ok(results.iter()
.filter_map(|r| r.metadata.get("observation"))
.map(|v| v.as_str().unwrap().to_string())
.collect())
}
}
let db = VectorDB::builder()
.dimensions(384)
.hnsw_m(16) // Lower M for faster queries
.hnsw_ef_construction(100) // Faster build
.hnsw_ef_search(50) // Lower default search accuracy
.quantization(QuantizationType::Scalar) // Compress for speed
.mmap_vectors(true) // Reduce memory pressure
.build()?;
let db = VectorDB::builder()
.dimensions(768)
.hnsw_m(64) // Maximum connections
.hnsw_ef_construction(400) // High build accuracy
.hnsw_ef_search(200) // High search accuracy
.quantization(QuantizationType::None) // No compression
.build()?;
let db = VectorDB::builder()
.dimensions(256) // Smaller embeddings
.max_elements(100_000) // Limit dataset size
.hnsw_m(16) // Fewer connections
.quantization(QuantizationType::Binary) // 32x compression
.mmap_vectors(true) // Use disk instead of RAM
.build()?;
let db = VectorDB::builder()
.dimensions(384)
.hnsw_m(32)
.hnsw_ef_construction(200)
.hnsw_ef_search(100)
.quantization(QuantizationType::Scalar)
.mmap_vectors(true)
.build()?;
Configuration        Query Latency (p50)   Recall@10
────────────────────────────────────────────────────
Uncompressed, M=64   0.3ms                 98.5%
Scalar Quant, M=32   0.4ms                 96.2%
Product Quant, M=32  0.5ms                 94.8%
Binary Quant, M=16   0.6ms                 91.3%
Quantization            Memory Usage   Compression Ratio
────────────────────────────────────────────────────────
None (float32)          1536 MB        1x
Scalar (int8)           392 MB         3.9x
Product (8 subspaces)   120 MB         12.8x
Binary (1 bit/dim)      52 MB          29.5x
Operation       Throughput   Notes
──────────────────────────────────────
Single Insert   ~100K/sec    Sequential
Batch Insert    ~500K/sec    Parallel (rayon)
Query (k=10)    ~50K QPS     ef_search=100
Query (k=100)   ~20K QPS     ef_search=100
Router Core integrates seamlessly with the main Ruvector database:
use ruvector_core::VectorDB as MainDB;
use router_core::VectorDB as RouterDB;
// Use router-core for specialized routing logic
let router_db = RouterDB::builder()
.dimensions(384)
.build()?;
// Or use main ruvector-core for full features
let main_db = MainDB::builder()
.dimensions(384)
.build()?;
// Both share the same API!
# Build library
cargo build --release -p router-core
# Build with all features
cargo build --release -p router-core --all-features
# Build static library
cargo build --release -p router-core --lib
# Run all tests
cargo test -p router-core
# Run specific test
cargo test -p router-core test_hnsw_insert_and_search
# Run with logging
RUST_LOG=debug cargo test -p router-core
# Run benchmarks
cargo bench -p router-core
# Run specific benchmark
cargo bench -p router-core --bench vector_search
# With criterion output
cargo bench -p router-core -- --output-format verbose
VectorDB: Main database interface
VectorEntry: Vector with ID, data, and metadata
SearchQuery: Query parameters for similarity search
SearchResult: Search result with ID, score, and metadata
DistanceMetric: Enum for distance calculation methods
QuantizationType: Enum for compression methods
// VectorDB
pub fn new(config: VectorDbConfig) -> Result<Self>
pub fn builder() -> VectorDbBuilder
pub fn insert(&self, entry: VectorEntry) -> Result<String>
pub fn insert_batch(&self, entries: Vec<VectorEntry>) -> Result<Vec<String>>
pub fn search(&self, query: SearchQuery) -> Result<Vec<SearchResult>>
pub fn delete(&self, id: &str) -> Result<bool>
pub fn get(&self, id: &str) -> Result<Option<VectorEntry>>
pub fn stats(&self) -> VectorDbStats
pub fn count(&self) -> Result<usize>
// Distance calculations
pub fn calculate_distance(a: &[f32], b: &[f32], metric: DistanceMetric) -> Result<f32>
pub fn batch_distance(query: &[f32], vectors: &[Vec<f32>], metric: DistanceMetric) -> Result<Vec<f32>>
// Quantization
pub fn quantize(vector: &[f32], qtype: QuantizationType) -> Result<QuantizedVector>
pub fn dequantize(quantized: &QuantizedVector) -> Vec<f32>
pub fn calculate_compression_ratio(original_dims: usize, qtype: QuantizationType) -> f32
ruvector-core: Full-featured vector database (superset of router-core)
ruvector-node: Node.js bindings via NAPI-RS
ruvector-wasm: WebAssembly bindings for browsers
router-cli: Command-line interface for router operations
router-ffi: Foreign function interface for C/C++
router-wasm: WebAssembly bindings for router
Contributions are welcome! Please see:
MIT License - see LICENSE for details.
Built with battle-tested technologies:
Part of the Ruvector ecosystem
Built by rUv • Production Ready • MIT Licensed
Documentation • API Reference • Examples • Benchmarks