ruvector-router-cli

Crates.io: ruvector-router-cli
lib.rs: ruvector-router-cli
Version: 0.1.29
Created at: 2025-11-21 15:11:08.415724+00
Updated at: 2025-12-29 19:17:33.550528+00
Description: CLI for testing and benchmarking ruvector-router-core
Repository: https://github.com/ruvnet/ruvector
ID: 1943705
Size: 60,923
Owner: rUv (ruvnet)


Router CLI (ruvector)

License: MIT

High-performance command-line interface for the Ruvector vector database.

The ruvector CLI creates, populates, queries, and benchmarks Ruvector vector databases from the command line, with typical operations completing in under a millisecond. It is intended for development, testing, and operational workflows.

🌟 Features

  • ⚡ Fast Operations: Sub-millisecond vector operations with HNSW indexing
  • 🔧 Database Management: Create, configure, and manage vector databases
  • 📊 Performance Benchmarking: Built-in benchmarks for insert and search operations
  • 📈 Real-time Statistics: Monitor database metrics and performance
  • 🎯 Production Ready: Battle-tested CLI for operational workflows
  • 🛠️ Developer Friendly: Intuitive commands with helpful output formatting

📦 Installation

From Crates.io (Recommended)

cargo install ruvector-router-cli

From Source

# Clone the repository
git clone https://github.com/ruvnet/ruvector.git
cd ruvector

# Build and install from workspace
cargo install --path crates/router-cli

Verify Installation

ruvector --help

⚡ Quick Start

Create a Database

# Create a database with default settings (384 dimensions, cosine similarity)
ruvector create

# Create with custom configuration
ruvector create \
  --path ./my_vectors.db \
  --dimensions 768 \
  --metric cosine

Insert Vectors

# Insert a single vector
ruvector insert \
  --path ./vectors.db \
  --id "doc1" \
  --vector "0.1,0.2,0.3,0.4"

Search Similar Vectors

# Search for top 10 similar vectors
ruvector search \
  --path ./vectors.db \
  --vector "0.1,0.2,0.3,0.4" \
  --k 10

View Statistics

# Get database statistics and metrics
ruvector stats --path ./vectors.db

Run Benchmarks

# Benchmark with 1000 vectors of 384 dimensions
ruvector benchmark \
  --path ./vectors.db \
  --num-vectors 1000 \
  --dimensions 384

📚 Command Reference

create - Create Vector Database

Create a new vector database with specified configuration.

Usage:

ruvector create [OPTIONS]

Options:

  • -p, --path <PATH> - Database file path (default: ./vectors.db)
  • -d, --dimensions <DIMS> - Vector dimensions (default: 384)
  • -m, --metric <METRIC> - Distance metric (default: cosine)

Distance Metrics:

  • cosine - Cosine similarity (best for normalized vectors)
  • euclidean, l2 - Euclidean distance
  • dot, dotproduct - Dot product similarity
  • manhattan, l1 - Manhattan distance
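For reference, the four metrics can be written out in a few lines of plain Rust. This is an illustrative sketch only, not router-core's implementation (which the Core Features list describes as SIMD-optimized); the function names are hypothetical, and both slices are assumed to have the same length.

```rust
/// Dot product of two equal-length vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Cosine similarity: dot product divided by the product of the L2 norms.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let norm = |v: &[f32]| dot(v, v).sqrt();
    dot(a, b) / (norm(a) * norm(b))
}

/// Euclidean (L2) distance.
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

/// Manhattan (L1) distance.
fn manhattan(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}

fn main() {
    let a = [0.0, 0.0];
    let b = [3.0, 4.0];
    println!("euclidean = {}", euclidean(&a, &b));
    println!("manhattan = {}", manhattan(&a, &b));
    println!("dot       = {}", dot(&b, &b));
    println!("cosine    = {}", cosine_similarity(&b, &b));
}
```

Cosine and dot product are similarities (higher means closer), while Euclidean and Manhattan are distances (lower means closer).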

Examples:

# Create database for sentence embeddings (384D)
ruvector create --dimensions 384 --metric cosine

# Create database for image embeddings (512D, L2 distance)
ruvector create --dimensions 512 --metric euclidean --path ./images.db

# Create database for large language model embeddings (1536D)
ruvector create --dimensions 1536 --metric cosine --path ./llm_embeddings.db

insert - Insert Vector

Insert a single vector into the database.

Usage:

ruvector insert [OPTIONS] --id <ID> --vector <VECTOR>

Options:

  • -p, --path <PATH> - Database file path (default: ./vectors.db)
  • -i, --id <ID> - Unique vector identifier (required)
  • -v, --vector <VECTOR> - Comma-separated vector values (required)

Examples:

# Insert a document embedding
ruvector insert \
  --id "doc_001" \
  --vector "0.23,0.45,0.67,0.12"

# Insert into specific database
ruvector insert \
  --path ./embeddings.db \
  --id "user_profile_42" \
  --vector "0.1,0.2,0.3,0.4,0.5"

Performance:

  • Typical insert latency: <1ms
  • Includes HNSW index update
  • Thread-safe for concurrent inserts

search - Search Similar Vectors

Search for the most similar vectors in the database.

Usage:

ruvector search [OPTIONS] --vector <VECTOR>

Options:

  • -p, --path <PATH> - Database file path (default: ./vectors.db)
  • -v, --vector <VECTOR> - Query vector (comma-separated values, required)
  • -k, --k <K> - Number of results to return (default: 10)

Examples:

# Find 10 most similar vectors
ruvector search --vector "0.1,0.2,0.3,0.4" --k 10

# Find top 100 matches with specific database
ruvector search \
  --path ./my_vectors.db \
  --vector "0.5,0.3,0.1,0.7" \
  --k 100

Output Format:

✓ Found 10 results
  Query time: 423µs

1. doc_001 (score: 0.9823)
2. doc_045 (score: 0.9456)
3. doc_123 (score: 0.9234)
...

Performance:

  • Typical query latency: <0.5ms (p50)
  • HNSW-based approximate nearest neighbor search
  • 95%+ recall accuracy
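Recall here means the fraction of the true nearest neighbors that the approximate search actually returns. A minimal sketch of how one might measure it, assuming you have the ids from an exact (brute-force) search to compare against; recall_at_k and the ids below are hypothetical and not part of the CLI:

```rust
use std::collections::HashSet;

/// Fraction of the exact top-k ids that also appear in the approximate top-k.
/// Returns 1.0 when there is nothing to find.
fn recall_at_k(approx: &[&str], exact: &[&str]) -> f64 {
    if exact.is_empty() {
        return 1.0;
    }
    let found: HashSet<&str> = approx.iter().copied().collect();
    let hits = exact.iter().filter(|id| found.contains(*id)).count();
    hits as f64 / exact.len() as f64
}

fn main() {
    // Ids returned by the approximate (HNSW) search, and by an exact scan.
    let approx = ["doc_001", "doc_045", "doc_999"];
    let exact = ["doc_001", "doc_045", "doc_123"];
    println!("recall@3 = {:.2}", recall_at_k(&approx, &exact));
}
```

Averaging this value over many queries gives a recall figure comparable to the one quoted above.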

stats - Database Statistics

Display comprehensive database statistics and performance metrics.

Usage:

ruvector stats [OPTIONS]

Options:

  • -p, --path <PATH> - Database file path (default: ./vectors.db)

Example:

ruvector stats --path ./vectors.db

Output:

✓ Database Statistics

  Total vectors: 50,000
  Average query latency: 423.45 μs
  QPS: 2,361.23
  Index size: 12,345,678 bytes

Metrics Explained:

  • Total vectors: Number of vectors stored
  • Average query latency: Mean search time in microseconds
  • QPS: Queries per second (throughput)
  • Index size: HNSW index size in bytes

benchmark - Performance Benchmarking

Run comprehensive performance benchmarks for insert and search operations.

Usage:

ruvector benchmark [OPTIONS]

Options:

  • -p, --path <PATH> - Database file path (default: ./vectors.db)
  • -n, --num-vectors <N> - Number of vectors to test (default: 1000)
  • -d, --dimensions <DIMS> - Vector dimensions (default: 384)

Examples:

# Standard benchmark (1K vectors, 384D)
ruvector benchmark

# Large-scale benchmark (100K vectors, 768D)
ruvector benchmark \
  --num-vectors 100000 \
  --dimensions 768

# Quick test (100 vectors)
ruvector benchmark --num-vectors 100

Output:

→ Running benchmark...
  Vectors: 1000
  Dimensions: 384

→ Generating vectors...
→ Inserting vectors...
✓ Inserted 1000 vectors in 1.234s
  Throughput: 810 inserts/sec

→ Running search benchmark...
✓ Completed 100 queries in 42.3ms
  Average latency: 423µs
  QPS: 2,364

Benchmark Process:

  1. Generates random vectors with specified dimensions
  2. Measures batch insert performance
  3. Runs 100 search queries
  4. Reports throughput and latency metrics
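The reported figures follow directly from the operation counts and elapsed times. The sketch below shows that arithmetic using the numbers from the sample output; throughput and average_latency are hypothetical helper names, not functions exported by the CLI or router-core.

```rust
use std::time::Duration;

/// Operations per second for `count` operations completed in `elapsed`.
fn throughput(count: u32, elapsed: Duration) -> f64 {
    f64::from(count) / elapsed.as_secs_f64()
}

/// Mean time per operation. Panics if `count` is zero.
fn average_latency(count: u32, elapsed: Duration) -> Duration {
    elapsed / count
}

fn main() {
    // Insert phase: 1000 vectors in 1.234 s.
    let insert_time = Duration::from_millis(1234);
    println!("{:.0} inserts/sec", throughput(1000, insert_time));

    // Search phase: 100 queries in 42.3 ms.
    let search_time = Duration::from_micros(42_300);
    println!("{} µs average latency", average_latency(100, search_time).as_micros());
    println!("{:.0} QPS", throughput(100, search_time));
}
```

In a real measurement the durations would come from std::time::Instant::now() and elapsed() wrapped around each phase.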

🎯 Use Cases

Development Workflows

# 1. Create database for development
ruvector create --dimensions 384 --path ./dev.db

# 2. Insert test vectors
ruvector insert --id "test1" --vector "0.1,0.2,0.3,..." --path ./dev.db

# 3. Test search functionality
ruvector search --vector "0.1,0.2,0.3,..." --k 5 --path ./dev.db

# 4. Monitor performance
ruvector stats --path ./dev.db

Performance Testing

# Test different vector sizes
for dims in 128 384 768 1536; do
  echo "Testing ${dims} dimensions..."
  ruvector benchmark --dimensions $dims --num-vectors 10000
done

# Compare distance metrics
for metric in cosine euclidean dot manhattan; do
  ruvector create --metric $metric --path ./test_${metric}.db
  ruvector benchmark --path ./test_${metric}.db
done

Production Operations

# Check production database health
ruvector stats --path /var/lib/vectors/prod.db

# Benchmark at production scale
# (point --path at a scratch database: the benchmark inserts generated vectors)
ruvector benchmark \
  --path /var/lib/vectors/bench.db \
  --num-vectors 1000000 \
  --dimensions 1536

# Verify search performance
ruvector search \
  --path /var/lib/vectors/prod.db \
  --vector "$(cat query_vector.txt)" \
  --k 100

🔧 Configuration

Database Configuration

The CLI uses the router-core configuration system with the following defaults:

VectorDbConfig {
    dimensions: 384,              // Vector dimensions
    max_elements: 1_000_000,      // Maximum vectors
    distance_metric: Cosine,      // Distance metric
    hnsw_m: 32,                   // HNSW connections per node
    hnsw_ef_construction: 200,    // HNSW build-time parameter
    hnsw_ef_search: 100,          // HNSW search-time parameter
    quantization: None,           // Quantization type
    storage_path: "./vectors.db", // Database path
    mmap_vectors: true,           // Enable memory mapping
}

HNSW Parameters

M (connections per node):

  • Lower values (16): Less memory and faster index builds, but lower recall
  • Higher values (64): More memory and slower index builds, but higher recall
  • Default: 32 (balanced)

ef_construction:

  • Build-time quality parameter
  • Higher = better index quality, slower construction
  • Default: 200

ef_search:

  • Search-time accuracy parameter
  • Higher = better recall, slower search
  • Default: 100
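To get a feel for how M and the vector dimensions drive memory use, the sketch below applies a common rule of thumb for HNSW over raw f32 vectors (the one given in the Faiss index guidelines): roughly d * 4 + M * 2 * 4 bytes per vector. This is an assumption borrowed from another library, not a measurement of router-core, whose actual layout and overhead may differ; the function names are hypothetical.

```rust
/// Rough bytes per stored vector: `dimensions` f32 components
/// plus 2 * M neighbor links of 4 bytes each.
fn bytes_per_vector(dimensions: u64, m: u64) -> u64 {
    dimensions * 4 + m * 2 * 4
}

/// Rough total size for `num_vectors` vectors.
fn estimated_index_bytes(num_vectors: u64, dimensions: u64, m: u64) -> u64 {
    num_vectors * bytes_per_vector(dimensions, m)
}

fn main() {
    // Defaults from the configuration above: 384 dimensions, M = 32.
    println!("{} bytes per vector", bytes_per_vector(384, 32));
    println!(
        "{} bytes for 1,000,000 vectors",
        estimated_index_bytes(1_000_000, 384, 32)
    );
}
```

Under this estimate the defaults come to 1,792 bytes per vector, so a full database of 1,000,000 vectors would need about 1.8 GB before any quantization.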

Distance Metrics

Choose based on your data characteristics:

Metric      | Best For                         | Normalization Required
Cosine      | Text embeddings, semantic search | Yes (recommended)
Euclidean   | Image embeddings, spatial data   | No
Dot Product | Pre-normalized vectors           | Yes (required)
Manhattan   | High-dimensional sparse data     | No
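Normalization here means L2 normalization: scaling each vector to unit length, after which the dot product of two vectors equals their cosine similarity. A minimal sketch of doing this in application code before inserting; l2_normalize and dot are hypothetical helpers, and whether router-core normalizes internally is not stated here.

```rust
/// Scale `v` to unit L2 length in place. Zero vectors are left unchanged.
fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// Dot product of two equal-length vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn main() {
    let mut a = [3.0_f32, 4.0];
    let mut b = [1.0_f32, 1.0];
    l2_normalize(&mut a);
    l2_normalize(&mut b);
    // After normalization, the dot product is the cosine of the angle
    // between the original vectors.
    println!("normalized a = {:?}", a);
    println!("dot(a, b)    = {}", dot(&a, &b));
}
```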

📊 Performance Tuning

Optimize for Speed

# Use dot product for pre-normalized vectors (fastest)
ruvector create --metric dot --dimensions 384

# Reduce ef_search for faster queries (lower recall)
# Note: Currently requires code modification

Optimize for Accuracy

# Use a higher-dimensional embedding model if it separates your data better
# (--dimensions must match the model's output size)
ruvector create --dimensions 1536

# Use cosine similarity for normalized embeddings
ruvector create --metric cosine

Optimize for Memory

# Use lower dimensions
ruvector create --dimensions 128

# Consider quantization (requires code configuration)
# Product quantization: 4-8x memory reduction
# Scalar quantization: 4x memory reduction
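The 4x figure for scalar quantization comes from storing each component as one byte instead of a four-byte f32. The sketch below illustrates the general technique with a per-vector min/max range; it is not router-core's quantizer, and the Quantized type and both functions are hypothetical.

```rust
/// Scalar-quantized vector: one byte per component,
/// plus the offset and step size needed to decode it.
struct Quantized {
    min: f32,
    scale: f32,
    codes: Vec<u8>,
}

/// Map each component linearly from [min, max] onto 0..=255.
fn quantize(v: &[f32]) -> Quantized {
    let min = v.iter().copied().fold(f32::INFINITY, f32::min);
    let max = v.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    // Guard against a zero range (constant or empty vector).
    let scale = if max > min { (max - min) / 255.0 } else { 1.0 };
    let codes = v.iter().map(|x| ((x - min) / scale).round() as u8).collect();
    Quantized { min, scale, codes }
}

/// Approximate reconstruction of the original components.
fn dequantize(q: &Quantized) -> Vec<f32> {
    q.codes.iter().map(|&c| q.min + f32::from(c) * q.scale).collect()
}

fn main() {
    let v = [0.0_f32, 0.5, 1.0];
    let q = quantize(&v);
    println!("codes    = {:?}", q.codes);
    println!("restored = {:?}", dequantize(&q));
    println!(
        "bytes: {} -> {}",
        v.len() * std::mem::size_of::<f32>(),
        q.codes.len()
    );
}
```

The reconstruction error per component is at most half a quantization step, which is the accuracy traded for the smaller footprint.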

🔗 Integration with Router Core

The CLI is built on router-core and provides access to its features:

Core Features

  • HNSW Indexing: Fast approximate nearest neighbor search
  • Multiple Distance Metrics: Cosine, Euclidean, Dot Product, Manhattan
  • SIMD Optimization: Hardware-accelerated vector operations
  • Memory Mapping: Efficient large-scale data handling
  • Thread Safety: Concurrent operations support
  • Persistent Storage: Durable vector storage with redb

API Compatibility

The CLI uses the same VectorDB API available in Rust applications:

use router_core::{VectorDB, VectorEntry, SearchQuery};

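// Wrapped in main so that `?` compiles; this assumes the builder's error
// type implements std::error::Error.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Same underlying implementation as CLI
    let _db = VectorDB::builder()
        .dimensions(384)
        .storage_path("./vectors.db")
        .build()?;
    Ok(())
}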

🐛 Troubleshooting

Common Issues

Database not found:

# Ensure you've created the database first
ruvector create --path ./vectors.db

# Or specify the correct path
ruvector search --path ./correct/path/vectors.db --vector "..."

Dimension mismatch:

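# Error: Expected 384 dimensions, got 768

# Solution: Create the database with the same dimensions as your vectors
ruvector create --dimensions 768

# Then insert vectors with exactly 768 values (insert has no --dimensions option)
ruvector insert --id "doc1" --vector "..."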

Parse errors:

# Ensure vector values are comma-separated floats
ruvector insert --vector "0.1,0.2,0.3" --id "test"

# Not: "0.1 0.2 0.3" or "[0.1,0.2,0.3]"
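To make the accepted format concrete, here is a minimal sketch of a parser for comma-separated floats. It is not the CLI's actual parsing code; parse_vector is a hypothetical name, and trimming whitespace around each value is an assumption that the real CLI may not share.

```rust
/// Parse a comma-separated list of floats such as "0.1,0.2,0.3".
/// Returns an error message naming the first value that fails to parse.
fn parse_vector(input: &str) -> Result<Vec<f32>, String> {
    input
        .split(',')
        .map(|token| {
            token
                .trim()
                .parse::<f32>()
                .map_err(|e| format!("invalid value {:?}: {}", token, e))
        })
        .collect()
}

fn main() {
    for input in ["0.1,0.2,0.3", "0.1 0.2 0.3", "[0.1,0.2,0.3]"] {
        match parse_vector(input) {
            Ok(v) => println!("{:?} -> {} values", input, v.len()),
            Err(e) => println!("{:?} -> error: {}", input, e),
        }
    }
}
```

The space-separated and bracketed inputs fail because "0.1 0.2 0.3" and "[0.1" are not valid float literals.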

Performance Issues

Slow inserts:

  • Use batch insert operations in your application code
  • Reduce hnsw_ef_construction for faster builds
  • Consider quantization for very large datasets

Slow searches:

  • Reduce k (number of results)
  • Reduce ef_search parameter (requires code modification)
  • Ensure proper distance metric for your data

📖 Examples

RAG System Vector Database

# Create database for document embeddings
ruvector create \
  --dimensions 384 \
  --metric cosine \
  --path ./documents.db

# Insert document embeddings (from your application)
# Typically done via Rust/Node.js API, not CLI

# Search for relevant documents
ruvector search \
  --path ./documents.db \
  --vector "$(cat query_embedding.txt)" \
  --k 5
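The query_embedding.txt file used above has to contain the comma-separated format that --vector expects. A minimal sketch of producing that string from an embedding in application code; format_vector is a hypothetical helper, not part of router-core, and the embedding values are made up.

```rust
/// Join vector components with commas, e.g. [0.1, 0.25] -> "0.1,0.25".
fn format_vector(v: &[f32]) -> String {
    v.iter()
        .map(|x| x.to_string())
        .collect::<Vec<_>>()
        .join(",")
}

fn main() {
    // In practice this would come from your embedding model.
    let embedding = [0.1_f32, 0.25, 1.0];
    // Print without a trailing newline so "$(cat query_embedding.txt)"
    // yields exactly the vector string.
    print!("{}", format_vector(&embedding));
}
```

Redirecting the program's output to query_embedding.txt gives a file that can be passed to the search command shown above.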

Semantic Search Testing

# Create test database
ruvector create --dimensions 768 --path ./semantic.db

# Run benchmark to establish baseline
ruvector benchmark \
  --path ./semantic.db \
  --num-vectors 10000 \
  --dimensions 768

# Test search with different query vectors
for query in query_*.txt; do
  echo "Testing $query..."
  ruvector search \
    --path ./semantic.db \
    --vector "$(cat "$query")" \
    --k 10
done

Performance Comparison

# Compare metrics
metrics=("cosine" "euclidean" "dot" "manhattan")

for metric in "${metrics[@]}"; do
  echo "=== Testing $metric ==="
  ruvector create --metric $metric --path ./test_$metric.db
  ruvector benchmark --path ./test_$metric.db --num-vectors 5000
  echo ""
done

🔗 Related Documentation

Ruvector Core Documentation

Getting Started

Advanced Topics

🤝 Contributing

We welcome contributions! Here's how to contribute to the router-cli:

  1. Fork the repository at github.com/ruvnet/ruvector
  2. Create a feature branch (git checkout -b feature/cli-improvement)
  3. Make your changes to crates/router-cli/
  4. Test thoroughly:
    cd crates/router-cli
    cargo test
    cargo clippy -- -D warnings
    cargo fmt --check
    
  5. Build and test the binary:
    cargo build --release
    ./target/release/ruvector --help
    
  6. Commit your changes (git commit -m 'Add amazing CLI feature')
  7. Push to the branch (git push origin feature/cli-improvement)
  8. Open a Pull Request

Development Setup

# Clone and navigate to CLI crate
git clone https://github.com/ruvnet/ruvector.git
cd ruvector/crates/router-cli

# Build in development mode
cargo build

# Run with cargo
cargo run -- create --dimensions 384

# Run tests
cargo test

# Run with detailed logging
RUST_LOG=debug cargo run -- benchmark

📜 License

MIT License - see LICENSE for details.

Part of the Ruvector project by rUv.

🙏 Acknowledgments

Built with:

  • clap - Command-line argument parsing
  • colored - Terminal color output
  • router-core - Vector database engine
  • chrono - Timestamp handling

Built by rUv • Part of Ruvector


Commit count: 729
