ruvector-hyperbolic-hnsw

Crates.ioruvector-hyperbolic-hnsw
lib.rsruvector-hyperbolic-hnsw
version0.1.0
created_at2026-01-23 02:19:38.8515+00
updated_at2026-01-23 02:19:38.8515+00
descriptionHyperbolic (Poincare ball) embeddings with HNSW integration for hierarchy-aware vector search, enabling efficient similarity search in non-Euclidean spaces for taxonomies, ontologies, and hierarchical data
homepagehttps://ruv.io/ruvector
repositoryhttps://github.com/ruvnet/ruvector
max_upload_size
id2063181
size134,818
rUv (ruvnet)

documentation

https://docs.rs/ruvector-hyperbolic-hnsw

README

ruvector-hyperbolic-hnsw

Hyperbolic (Poincaré ball) embeddings with HNSW integration for hierarchy-aware vector search.

Why Hyperbolic Space?

Hierarchies compress naturally in hyperbolic space. Taxonomies, catalogs, ICD trees, product facets, org charts, and long-tail tags all fit better than in Euclidean space, which means higher recall on deep leaves without blowing up memory or latency.

Key Features

  • Poincaré Ball Model: Store vectors in the Poincaré ball with clamp r < 1 − eps
  • HNSW Speed Trick: Prune with cheap tangent-space proxy, rank with true hyperbolic distance
  • Per-Shard Curvature: Different parts of the hierarchy can have different optimal curvatures
  • Dual-Space Index: Keep a synchronized Euclidean ANN for fallback and mutual-ranking fusion
  • Production Guardrails: Numerical stability, canary testing, hot curvature reload

Installation

Rust

[dependencies]
ruvector-hyperbolic-hnsw = "0.1.0"

WebAssembly

cd crates/ruvector-hyperbolic-hnsw-wasm
wasm-pack build --target web --release

TypeScript/JavaScript

import init, {
  HyperbolicIndex,
  poincareDistance,
  mobiusAdd,
  expMap,
  logMap
} from 'ruvector-hyperbolic-hnsw-wasm';

await init();

const index = new HyperbolicIndex(16, 1.0);
index.insert(new Float32Array([0.1, 0.2, 0.3]));
const results = index.search(new Float32Array([0.15, 0.1, 0.2]), 5);

Quick Start

use ruvector_hyperbolic_hnsw::{HyperbolicHnsw, HyperbolicHnswConfig};

// Create index with default settings
let mut index = HyperbolicHnsw::default_config();

// Insert vectors (automatically projected to Poincaré ball)
index.insert(vec![0.1, 0.2, 0.3]).unwrap();
index.insert(vec![-0.1, 0.15, 0.25]).unwrap();
index.insert(vec![0.2, -0.1, 0.1]).unwrap();

// Search for nearest neighbors
let results = index.search(&[0.15, 0.1, 0.2], 2).unwrap();
for r in results {
    println!("ID: {}, Distance: {:.4}", r.id, r.distance);
}

HNSW Speed Trick

The core optimization:

  1. Precompute u = log_c(x) at a shard centroid c
  2. During neighbor selection, use Euclidean ||u_q - u_p|| to prune
  3. Run exact Poincaré distance only on top N candidates before final ranking
use ruvector_hyperbolic_hnsw::{HyperbolicHnsw, HyperbolicHnswConfig};

let mut config = HyperbolicHnswConfig::default();
config.use_tangent_pruning = true;
config.prune_factor = 10; // Consider 10x candidates in tangent space

let mut index = HyperbolicHnsw::new(config);

// ... insert vectors ...

// Build tangent cache for pruning optimization
index.build_tangent_cache().unwrap();

// Search with pruning (faster!)
let results = index.search_with_pruning(&[0.1, 0.15], 5).unwrap();

Core Mathematical Operations

use ruvector_hyperbolic_hnsw::poincare::{
    mobius_add, exp_map, log_map, poincare_distance, project_to_ball
};

let x = vec![0.3, 0.2];
let y = vec![-0.1, 0.4];
let c = 1.0; // Curvature

// Möbius addition (hyperbolic vector addition)
let z = mobius_add(&x, &y, c);

// Geodesic distance in hyperbolic space
let d = poincare_distance(&x, &y, c);

// Map to tangent space at x
let v = log_map(&y, &x, c);

// Map back to manifold
let y_recovered = exp_map(&v, &x, c);

Sharded Index with Per-Shard Curvature

use ruvector_hyperbolic_hnsw::{ShardedHyperbolicHnsw, ShardStrategy};

let mut manager = ShardedHyperbolicHnsw::new(1.0);

// Insert with hierarchy depth information
manager.insert(vec![0.1, 0.2], Some(0)).unwrap(); // Root level
manager.insert(vec![0.3, 0.1], Some(3)).unwrap(); // Deeper level

// Update curvature for specific shard
manager.update_curvature("radius_1", 0.5).unwrap();

// Canary testing for new curvature
manager.registry.set_canary("radius_1", 0.3, 10); // 10% traffic

// Search across all shards
let results = manager.search(&[0.2, 0.15], 5).unwrap();

Numerical Stability

All operations include numerical safeguards:

  • Norm clamping: Points projected with eps = 1e-5
  • Projection after updates: All operations keep points inside the ball
  • Stable acosh: Uses log1p expansions for safety
  • Clamp arguments: arctanh and atanh arguments bounded away from ±1

Evaluation Protocol

Datasets

  • WordNet
  • DBpedia slices
  • Synthetic scale-free tree
  • Domain taxonomy

Primary Metrics

  • recall@k (1, 5, 10)
  • Mean rank
  • NDCG

Hierarchy Metrics

  • Radius vs depth Spearman correlation
  • Distance distortion
  • Ancestor AUPRC

Baselines

  • Euclidean HNSW
  • OPQ/PQ compressed
  • Simple mutual-ranking fusion

Ablations

  • Tangent proxy vs full hyperbolic
  • Fixed vs learnable curvature c
  • Global vs shard centroids

Production Integration

Reflex Loop (on writes)

Small Möbius deltas and tangent-space micro updates that never push points outside the ball.

use ruvector_hyperbolic_hnsw::tangent_micro_update;

let updated = tangent_micro_update(
    &point,
    &delta,
    &centroid,
    curvature,
    0.1  // max step size
);

Habit (nightly)

Riemannian SGD passes to clean neighborhoods and optionally relearn per-shard curvature. Run canary first.

Structural (periodic)

Rebuild of HNSW with true hyperbolic metric, curvature retune, and shard reshuffle if hierarchy preservation drops below SLO.

Dependencies (Exact Versions)

nalgebra = "0.34.1"
ndarray = "0.17.1"
wasm-bindgen = "0.2.106"

Benchmarks

cd crates/ruvector-hyperbolic-hnsw
cargo bench

Benchmark suite includes:

  • Poincaré distance computation
  • Möbius addition
  • exp/log map operations
  • HNSW insert and search
  • Tangent cache building
  • Search with vs without pruning

License

MIT

Related

Commit count: 729

cargo fmt