| Crates.io | oxirs-vec |
| lib.rs | oxirs-vec |
| version | 0.1.0 |
| created_at | 2025-09-30 08:29:43.33543+00 |
| updated_at | 2026-01-20 21:15:26.209532+00 |
| description | Vector index abstractions for semantic similarity and AI-augmented querying |
| homepage | https://github.com/cool-japan/oxirs |
| repository | https://github.com/cool-japan/oxirs |
| max_upload_size | |
| id | 1860777 |
| size | 3,936,949 |
Status: Production Release (v0.1.0), released January 7, 2026
✨ Production-ready with API stability guarantees and comprehensive test coverage.
High-performance vector search infrastructure for semantic similarity search in RDF knowledge graphs.
Add to your Cargo.toml:
```toml
[dependencies]
oxirs-vec = "0.1.0"
```
```rust
use oxirs_vec::{VectorStore, IndexType, DistanceMetric};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a vector store with an HNSW index
    let mut store = VectorStore::builder()
        .index_type(IndexType::HNSW)
        .dimension(768) // embedding dimension
        .distance_metric(DistanceMetric::Cosine)
        .build()?;

    // Placeholder embeddings; in practice these come from an embedding model
    // such as oxirs-embed (see below)
    let embedding1 = vec![0.1_f32; 768];
    let embedding2 = vec![0.2_f32; 768];
    let query_vector = vec![0.15_f32; 768];

    // Add vectors
    store.add_vector("entity1", &embedding1)?;
    store.add_vector("entity2", &embedding2)?;

    // Build the index
    store.build_index()?;

    // Search: top 10 nearest neighbours with a minimum similarity of 0.8
    let results = store.search(&query_vector, 10, 0.8)?;
    for result in results {
        println!("ID: {}, Score: {}", result.id, result.score);
    }

    Ok(())
}
```
```rust
use oxirs_vec::sparql::VectorFunctions;

let sparql = r#"
    PREFIX vec:  <http://oxirs.org/vec/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>

    SELECT ?entity ?score WHERE {
        ?entity a foaf:Person .
        # Vector similarity search
        ?entity vec:similarTo "machine learning researcher" .
        ?entity vec:similarity ?score .
        FILTER (?score > 0.8)
    }
    ORDER BY DESC(?score)
    LIMIT 10
"#;
```
```rust
pub enum DistanceMetric {
    Cosine,     // For normalized embeddings
    Euclidean,  // For absolute distances
    Manhattan,  // For high-dimensional spaces
    DotProduct, // For similarity scores
}
```
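For intuition, each metric reduces to simple arithmetic over the raw vectors. The standalone sketch below (plain Rust, independent of the oxirs-vec API) shows how the scores are computed; note that for unit-normalized vectors, cosine similarity and dot product coincide, which is why `Cosine` is the usual choice for normalized embeddings.

```rust
/// Reference implementations of the four distance/similarity measures.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    // Dot product normalized by both vector lengths
    dot(a, b) / (dot(a, a).sqrt() * dot(b, b).sqrt())
}

fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

fn manhattan(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}
```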
Combine vector similarity with RDF constraints:
```rust
use oxirs_vec::FilteredSearch;

let filters = FilteredSearch::builder()
    .add_constraint("rdf:type", "foaf:Person")
    .add_constraint("foaf:age", |age: i32| age > 18)
    .build();

let results = store.filtered_search(&query_vector, filters, 10)?;
```
Efficient bulk indexing:
```rust
let batch = vec![
    ("entity1", embedding1),
    ("entity2", embedding2),
    ("entity3", embedding3),
];
store.add_batch(batch)?;
store.build_index()?;

// Add without a full rebuild
store.add_incremental("new_entity", &embedding)?;

// Periodic optimization
store.optimize_index()?;
```
| Dataset Size | Index Type | Build Time | Query Time (10-NN) |
|---|---|---|---|
| 10K vectors | HNSW | 2.5s | 0.5ms |
| 100K vectors | HNSW | 28s | 1.2ms |
| 1M vectors | HNSW | 320s | 2.8ms |
| 10K vectors | Flat | 0.1s | 12ms |
| 100K vectors | IVF | 15s | 3.5ms |
Benchmarked on an M1 Mac with 768-dimensional vectors.
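The query times above are for single 10-NN lookups. As a rough sketch of how such measurements can be reproduced with the API shown earlier (random placeholder vectors, illustrative parameters; exact argument types may differ, and this is not the project's official benchmark harness):

```rust
use std::time::Instant;

// Time index construction and a single 10-NN query over placeholder data.
let dim = 768;
let batch: Vec<(String, Vec<f32>)> = (0..10_000)
    .map(|i| (format!("entity{i}"), vec![(i % 97) as f32 / 97.0; dim]))
    .collect();

let t_build = Instant::now();
store.add_batch(batch)?;
store.build_index()?;
println!("build time: {:?}", t_build.elapsed());

let query = vec![0.5_f32; dim];
let t_query = Instant::now();
let _hits = store.search(&query, 10, 0.0)?;
println!("10-NN query time: {:?}", t_query.elapsed());
```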
```rust
let config = VectorStoreConfig {
    index_type: IndexType::HNSW,
    dimension: 768,
    distance_metric: DistanceMetric::Cosine,

    // HNSW-specific parameters
    hnsw_m: 16,                // number of connections per node
    hnsw_ef_construction: 200, // construction-time accuracy
    hnsw_ef_search: 100,       // search-time accuracy

    // Storage options
    persist_path: Some("./vector_index".into()),
    cache_size: 1000,
};
```
```rust
use oxirs_embed::EmbeddingModel;
use oxirs_vec::{IndexType, VectorStore};

// Generate embeddings
let model = EmbeddingModel::load("sentence-transformers/all-mpnet-base-v2")?;
let embedding = model.encode("Machine learning research")?;

// Index and search
let mut store = VectorStore::new(IndexType::HNSW, 768)?;
store.add_vector("doc1", &embedding)?;
```
```rust
use oxirs_core::Dataset;
use oxirs_vec::RdfVectorIndex;

let dataset = Dataset::from_file("knowledge_graph.ttl")?;
let mut index = RdfVectorIndex::new(&dataset)?;

// Index entities by their descriptions
for entity in dataset.subjects() {
    if let Some(description) = dataset.get_description(&entity) {
        let embedding = model.encode(&description)?;
        index.add_entity(&entity, &embedding)?;
    }
}
```
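Once the graph's entities are indexed, a similarity lookup might look like the sketch below; the `search` method name, its arguments, and the result fields are assumptions made for illustration rather than the confirmed `RdfVectorIndex` API.

```rust
// Hypothetical lookup (method name and arguments are assumed, not confirmed):
// find the 10 indexed entities most similar to a natural-language query.
let query = model.encode("researchers working on knowledge graphs")?;
let similar = index.search(&query, 10, 0.75)?;
for entity in similar {
    println!("{} ({:.2})", entity.id, entity.score);
}
```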
Feedback and contributions are welcome!
MIT OR Apache-2.0