diskann-vector

Crates.io	diskann-vector
lib.rs	diskann-vector
version	0.1.0
created_at	2025-08-03 07:16:58.810087+00
updated_at	2025-08-03 07:16:58.810087+00
description	Vector operations and distance metrics for DiskANN, supporting f32, f16, and various distance functions
homepage
repository	https://github.com/infinilabs/diskann
max_upload_size
id	1779429
size	48,204

Medcl (medcl)

Vector Library

A high-performance vector operations library providing distance metrics, data types, and utilities for vector computations in Rust.

Overview

The Vector library supplies the distance metrics, data types, and utilities that the DiskANN project uses for vector computation. It supports several numeric types (f32, f16, f64) and several distance functions (L2, cosine, inner product).

Features

Multiple distance metrics - L2, cosine, inner product, and more
Flexible data types - Support for f32, f16, and other numeric types
High performance - Optimized vector operations
Type safety - Strong typing for vector dimensions
SIMD support - Vectorized operations where available
Cross-platform - Works on multiple architectures

Quick Start

use vector::{Metric, FullPrecisionDistance};

// Define a vector type with specific dimension
type Vector128 = [f32; 128];

// Create vectors
let v1: Vector128 = [1.0; 128];
let v2: Vector128 = [2.0; 128];

// Calculate distance using different metrics
let l2_distance = v1.l2_distance(&v2);
let cosine_distance = v1.cosine_distance(&v2);
let inner_product = v1.inner_product(&v2);

println!("L2 distance: {}", l2_distance);
println!("Cosine distance: {}", cosine_distance);
println!("Inner product: {}", inner_product);

Distance Metrics

L2 Distance (Euclidean)

use vector::FullPrecisionDistance;

let v1: [f32; 3] = [1.0, 2.0, 3.0];
let v2: [f32; 3] = [4.0, 5.0, 6.0];

let distance = v1.l2_distance(&v2);
// Calculates: sqrt((4-1)² + (5-2)² + (6-3)²) = sqrt(27) ≈ 5.196
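
To make the formula concrete, here is a minimal standalone sketch of the same computation using only the standard library. The free function `l2_distance` is a hypothetical helper written for illustration; it is not taken from the crate and says nothing about how the crate implements the metric.

```rust
/// Euclidean (L2) distance between two equal-length slices.
/// Hypothetical helper for illustration, not the crate's implementation.
fn l2_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same length");
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y)) // squared difference per component
        .sum::<f32>()
        .sqrt()
}

fn main() {
    let v1 = [1.0_f32, 2.0, 3.0];
    let v2 = [4.0_f32, 5.0, 6.0];
    // (4-1)² + (5-2)² + (6-3)² = 27, so the distance is sqrt(27)
    println!("L2 distance: {:.3}", l2_distance(&v1, &v2));
}
```

When distances are only compared against each other, the final square root can be skipped, because it does not change the ordering.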

Cosine Distance

let v1: [f32; 3] = [1.0, 2.0, 3.0];
let v2: [f32; 3] = [4.0, 5.0, 6.0];

let distance = v1.cosine_distance(&v2);
// Calculates: 1 - (v1·v2) / (||v1|| * ||v2||)
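
The cosine formula divides by both norms, so a zero vector needs a decision. The sketch below is a hypothetical standalone helper, not the crate's code. It assumes, purely as a convention for this example, that a zero vector is treated as orthogonal to everything (distance 1.0).

```rust
/// Cosine distance: 1 - (a·b) / (||a|| * ||b||).
/// Hypothetical helper for illustration, not the crate's implementation.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same length");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        // Assumption for this sketch: a zero vector has no direction,
        // so report it as orthogonal instead of dividing by zero.
        return 1.0;
    }
    1.0 - dot / (norm_a * norm_b)
}

fn main() {
    let v1 = [1.0_f32, 2.0, 3.0];
    let v2 = [4.0_f32, 5.0, 6.0];
    // dot = 32, ||v1|| = sqrt(14), ||v2|| = sqrt(77), so distance ≈ 0.025
    println!("Cosine distance: {:.3}", cosine_distance(&v1, &v2));
}
```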

Inner Product

let v1: [f32; 3] = [1.0, 2.0, 3.0];
let v2: [f32; 3] = [4.0, 5.0, 6.0];

let similarity = v1.inner_product(&v2);
// Calculates: v1·v2 = Σ(v1[i] * v2[i])
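
The inner product and the cosine metric are closely related: once both vectors are scaled to unit length, their inner product equals their cosine similarity. The sketch below illustrates this with two hypothetical standalone helpers, `inner_product` and `normalize`. Their signatures are assumptions made for the example and may not match the crate's functions of the same name.

```rust
/// Inner (dot) product of two equal-length slices.
/// Hypothetical helper for illustration.
fn inner_product(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same length");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Scale a vector to unit length in place; a zero vector is left unchanged.
/// Hypothetical helper for illustration.
fn normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

fn main() {
    let mut v1 = [1.0_f32, 2.0, 3.0];
    let mut v2 = [4.0_f32, 5.0, 6.0];
    // Raw inner product: 1*4 + 2*5 + 3*6 = 32
    println!("inner product: {}", inner_product(&v1, &v2));

    normalize(&mut v1);
    normalize(&mut v2);
    // On unit vectors the inner product is the cosine similarity
    println!("cosine similarity: {:.4}", inner_product(&v1, &v2));
}
```

This is why pre-normalizing a dataset lets a search use the cheaper inner product in place of the cosine metric.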

Data Types

Supported Types

f32 - 32-bit floating point (most common)
f16 - 16-bit floating point (memory efficient)
f64 - 64-bit floating point (high precision)

Type Conversion

use vector::Half;

// Convert between types
let f32_vector: [f32; 128] = [1.0; 128];
let f16_vector: [Half; 128] = f32_vector.map(|x| Half::from_f32(x));

// Convert back
let back_to_f32: [f32; 128] = f16_vector.map(|x| x.to_f32());
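
The memory saving from 16-bit storage is easy to quantify. The sketch below uses `u16` as a stand-in for a 16-bit float, assuming the half-precision type occupies two bytes per component; `vector_bytes` is a hypothetical helper. The saving comes with a trade-off: IEEE 754 half precision stores only 10 mantissa bits and its largest finite value is 65504, so a round trip through f16 is lossy.

```rust
use std::mem::size_of;

/// Bytes occupied by one N-dimensional vector with component type T.
/// Hypothetical helper for illustration.
fn vector_bytes<T, const N: usize>() -> usize {
    size_of::<[T; N]>()
}

fn main() {
    // u16 stands in for a 16-bit float (assumption: 2 bytes per component)
    let f32_bytes = vector_bytes::<f32, 128>();
    let f16_bytes = vector_bytes::<u16, 128>();
    println!("128-dim f32 vector: {f32_bytes} bytes");
    println!("128-dim f16 vector: {f16_bytes} bytes");

    // Footprint of one million vectors in each representation
    let n = 1_000_000_usize;
    println!("1M vectors as f32: {} MB", n * f32_bytes / 1_000_000);
    println!("1M vectors as f16: {} MB", n * f16_bytes / 1_000_000);
}
```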

Dimension Support

The library supports fixed-size arrays for different dimensions:

// Common dimensions
type Vector64 = [f32; 64];
type Vector128 = [f32; 128];
type Vector256 = [f32; 256];
type Vector512 = [f32; 512];

// Custom dimensions
type CustomVector = [f32; 1024];
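
Fixed-size arrays let the compiler check dimensions. As a sketch of how this works in general Rust (not a description of the crate's internals), a const-generic function accepts any dimension but rejects mismatched ones at compile time. The function name here is hypothetical.

```rust
/// L2 distance for fixed-size arrays of any dimension N.
/// Both arguments must share the same N, which the compiler enforces.
/// Hypothetical helper for illustration.
fn l2_distance<const N: usize>(a: &[f32; N], b: &[f32; N]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

fn main() {
    let a = [0.0_f32; 64];
    let b = [1.0_f32; 64];
    // 64 components, each differing by 1: sqrt(64) = 8
    println!("distance: {}", l2_distance(&a, &b));

    // Mixing dimensions does not compile:
    // let c = [1.0_f32; 128];
    // l2_distance(&a, &c); // error: expected an array of 64 elements
}
```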

Performance Optimizations

SIMD Operations

The library automatically uses SIMD instructions when available:

// These operations are automatically vectorized
let v1: [f32; 128] = [1.0; 128];
let v2: [f32; 128] = [2.0; 128];

let distance = v1.l2_distance(&v2); // Uses SIMD if available
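
How the crate vectorizes its kernels is not described here. As a general illustration, the sketch below shows a loop shape that compilers can commonly auto-vectorize: a floating-point sum is not reordered by default, so keeping one independent accumulator per lane makes the vectorized form legal. The function name and the lane count of 8 (one 256-bit register of f32 values) are assumptions for this example.

```rust
/// Squared L2 distance with one accumulator per lane.
/// Hypothetical sketch of a SIMD-friendly loop, not the crate's kernel.
fn l2_squared_chunked(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same length");
    const LANES: usize = 8; // assumption: 8 f32 values per 256-bit register

    let a_chunks = a.chunks_exact(LANES);
    let b_chunks = b.chunks_exact(LANES);

    // Elements that do not fill a whole chunk are handled separately
    let tail: f32 = a_chunks
        .remainder()
        .iter()
        .zip(b_chunks.remainder())
        .map(|(x, y)| (x - y) * (x - y))
        .sum();

    // Each lane accumulates independently, so no cross-lane ordering is needed
    let mut acc = [0.0_f32; LANES];
    for (ca, cb) in a_chunks.zip(b_chunks) {
        for i in 0..LANES {
            let d = ca[i] - cb[i];
            acc[i] += d * d;
        }
    }
    acc.iter().sum::<f32>() + tail
}

fn main() {
    let v1 = [1.0_f32; 128];
    let v2 = [2.0_f32; 128];
    // 128 components, each with squared difference 1
    println!("squared L2: {}", l2_squared_chunked(&v1, &v2));
}
```

Because the additions happen in a different order than in a simple sequential loop, results can differ from it in the last bits for general inputs.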

Memory Alignment

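For best SIMD performance, align vector storage to the register width. The example below allocates 1024 bytes aligned to 32 bytes, the width of an AVX2 register:

use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

// Allocate 1024 bytes (256 f32 values) aligned to 32 bytes
let layout = Layout::from_size_align(1024, 32).unwrap();
let ptr = unsafe { alloc(layout) };
if ptr.is_null() { handle_alloc_error(layout); }
// ... use the buffer, then free it with the same layout
unsafe { dealloc(ptr, layout) };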
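
Manual allocation is not the only way to get aligned storage. A safe alternative, sketched here with a hypothetical wrapper type, is to let the compiler enforce alignment through `#[repr(align(N))]`. The 32-byte figure is carried over from the example above.

```rust
use std::mem::align_of;

/// A 128-dimensional vector whose storage is always 32-byte aligned.
/// Hypothetical wrapper type for illustration.
#[repr(align(32))]
struct AlignedVector([f32; 128]);

fn main() {
    // Alignment holds on the stack...
    let on_stack = AlignedVector([1.0; 128]);
    let stack_addr = on_stack.0.as_ptr() as usize;
    assert_eq!(stack_addr % 32, 0);

    // ...and on the heap, with no unsafe code and no manual deallocation
    let on_heap = Box::new(AlignedVector([2.0; 128]));
    let heap_addr = on_heap.0.as_ptr() as usize;
    assert_eq!(heap_addr % 32, 0);

    println!("type alignment: {} bytes", align_of::<AlignedVector>());
}
```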

Advanced Usage

Custom Distance Metrics

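use vector::FullPrecisionDistance;

// Wrap the array in a local newtype: Rust's orphan rule forbids implementing
// a trait from another crate directly on [f32; 128].
struct MyVector([f32; 128]);

// Implement custom distance for your type
impl FullPrecisionDistance<f32, 128> for MyVector {
    fn l2_distance(&self, other: &[f32; 128]) -> f32 {
        self.0
            .iter()
            .zip(other.iter())
            .map(|(a, b)| (a - b).powi(2))
            .sum::<f32>()
            .sqrt()
    }
}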
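
A metric that is not in the feature list can be added the same way, through a trait. The sketch below defines a hypothetical local trait, `CustomDistance`, with a Manhattan (L1) distance. The trait and its method are invented for this example and are not part of the crate.

```rust
/// Hypothetical trait for this sketch; the crate's own traits may differ.
trait CustomDistance<const N: usize> {
    /// Manhattan (L1) distance: the sum of absolute component differences.
    fn l1_distance(&self, other: &[f32; N]) -> f32;
}

// The trait is defined in this crate, so implementing it for arrays is allowed
impl<const N: usize> CustomDistance<N> for [f32; N] {
    fn l1_distance(&self, other: &[f32; N]) -> f32 {
        self.iter().zip(other).map(|(a, b)| (a - b).abs()).sum()
    }
}

fn main() {
    let v1 = [1.0_f32, 2.0, 3.0];
    let v2 = [4.0_f32, 5.0, 6.0];
    // |4-1| + |5-2| + |6-3| = 9
    println!("L1 distance: {}", v1.l1_distance(&v2));
}
```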

Batch Operations

use rayon::prelude::*;
use vector::FullPrecisionDistance;

let vectors: Vec<[f32; 128]> = vec![/* your vectors */];
let query: [f32; 128] = [0.0; 128]; // replace with your query vector

// Parallel distance calculation
let distances: Vec<f32> = vectors
    .par_iter()
    .map(|v| v.l2_distance(&query))
    .collect();
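
A batch of distances is usually an intermediate step toward finding the nearest vectors. The sketch below is a sequential, standard-library-only illustration of a brute-force top-k search; `l2_distance` and `top_k` are hypothetical helpers, and a parallel version could replace the iterator with the rayon one shown above.

```rust
/// Hypothetical helper: Euclidean distance between two slices.
fn l2_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same length");
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>().sqrt()
}

/// Return the indices and distances of the k vectors closest to `query`,
/// nearest first. Hypothetical helper for illustration.
fn top_k(vectors: &[Vec<f32>], query: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = vectors
        .iter()
        .enumerate()
        .map(|(i, v)| (i, l2_distance(v, query)))
        .collect();
    // total_cmp gives a total order on f32, so NaN cannot break the sort
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.truncate(k);
    scored
}

fn main() {
    let vectors = vec![vec![0.0, 0.0], vec![3.0, 4.0], vec![1.0, 0.0]];
    let query = [0.0, 0.0];
    for (index, distance) in top_k(&vectors, &query, 2) {
        println!("vector {index} at distance {distance}");
    }
}
```

Sorting every score costs O(n log n); for large collections a bounded heap of size k is the usual refinement.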

Integration with DiskANN

The Vector library is designed to work seamlessly with DiskANN:

use diskann::{IndexBuilder, Metric};
use vector::FullPrecisionDistance;

// Create index with vector types
let mut index = IndexBuilder::new()
    .with_dimension(128)
    .with_metric(Metric::L2)
    .build_in_memory::<f32>()?;

// Insert vectors
let vectors: Vec<[f32; 128]> = vec![/* your vectors */];
index.insert_batch(&vectors)?;

Benchmarks

Performance comparison of different distance metrics (Intel i7-8700K):

Metric	128-dim	256-dim	512-dim	1024-dim
L2	0.8μs	1.2μs	2.1μs	4.3μs
Cosine	1.1μs	1.8μs	3.2μs	6.1μs
Inner Product	0.6μs	1.0μs	1.8μs	3.5μs

Times are per vector-pair comparison.
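
Per-comparison times like these depend heavily on hardware and compiler flags. For a rough check on another machine, a minimal timing harness can be sketched with the standard library alone. The helper names are hypothetical, the measured function is a plain scalar loop and not the crate's kernel, and the output should not be expected to match the table.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical scalar L2 distance used as the workload.
fn l2_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>().sqrt()
}

/// Average time of one comparison at the given dimension.
/// `iters` must be greater than zero.
fn time_per_call(dim: usize, iters: u32) -> Duration {
    let a = vec![1.0_f32; dim];
    let b = vec![2.0_f32; dim];
    let start = Instant::now();
    for _ in 0..iters {
        // black_box keeps the optimizer from removing the computation
        black_box(l2_distance(black_box(a.as_slice()), black_box(b.as_slice())));
    }
    start.elapsed() / iters
}

fn main() {
    for dim in [128, 256, 512, 1024] {
        println!("{dim}-dim: {:?} per comparison", time_per_call(dim, 100_000));
    }
}
```

Build with `cargo build --release` before timing; debug builds are far slower. For publishable numbers, a statistical harness run through `cargo bench` is the better tool.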

Development

Building

cargo build --release

Testing

cargo test
cargo test --benches

Running Benchmarks

cargo bench

API Reference

Core Traits and Types

FullPrecisionDistance<T, DIM> - Distance calculation trait
Metric - Distance metric enumeration
Half - 16-bit floating point type

Main Functions

l2_distance() - Calculate L2 distance
cosine_distance() - Calculate cosine distance
inner_product() - Calculate inner product
normalize() - Normalize vector to unit length

Utility Functions

round_up() - Round up to nearest multiple
is_floating_point() - Check if type is floating point
get_distance_function() - Get distance function for metric
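
The signatures of these utilities are not given here, so the following is only a sketch of what two of them might look like. The `Metric` variants other than `L2`, the `DistanceFn` alias, and both function signatures are assumptions made for illustration.

```rust
/// Hypothetical stand-in for the crate's Metric enum.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Metric {
    L2,
    Cosine,
    InnerProduct,
}

/// Hypothetical alias for a distance function over slices.
type DistanceFn = fn(&[f32], &[f32]) -> f32;

/// Round `value` up to the nearest multiple of `multiple` (assumed signature).
fn round_up(value: usize, multiple: usize) -> usize {
    assert!(multiple > 0, "multiple must be non-zero");
    (value + multiple - 1) / multiple * multiple
}

fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>().sqrt()
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    // No zero-vector handling here; a zero norm yields NaN
    1.0 - dot(a, b) / (dot(a, a).sqrt() * dot(b, b).sqrt())
}

/// Select the function for a metric once, outside the hot loop (assumed signature).
fn get_distance_function(metric: Metric) -> DistanceFn {
    match metric {
        Metric::L2 => l2,
        Metric::Cosine => cosine,
        Metric::InnerProduct => dot,
    }
}

fn main() {
    // Pad a 100-dimensional vector to a multiple of 32 components
    println!("padded dimension: {}", round_up(100, 32));

    let distance = get_distance_function(Metric::InnerProduct);
    println!("inner product: {}", distance(&[1.0, 2.0, 3.0], &[4.0, 5.0, 6.0]));
}
```

Resolving the metric to a function pointer once avoids a `match` on every comparison inside a search loop.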

Dependencies

rayon - Parallel processing
half - 16-bit floating point support
bytemuck - Memory operations
serde - Serialization (optional)

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

We welcome contributions! Please see the main README for contribution guidelines.
