embeddenator-testkit

Crates.io: embeddenator-testkit
lib.rs: embeddenator-testkit
version: 0.20.0
created_at: 2026-01-09 23:04:21.329074+00
updated_at: 2026-01-25 18:29:00.153693+00
description: Comprehensive testing utilities and performance benchmarking for embeddenator VSA operations
repository: https://github.com/tzervas/embeddenator-testkit
id: 2033121
size: 600,926
Tyler Zervas (tzervas)

documentation

https://docs.rs/embeddenator-testkit

README

Embeddenator TestKit

Comprehensive testing utilities and performance benchmarking for embeddenator VSA operations.

This crate is an independent component extracted from the Embeddenator monolithic repository. It remains part of the Embeddenator workspace.

Repository: https://github.com/tzervas/embeddenator-testkit

Features

  • Test Data Generators: Create random and deterministic sparse vectors for reproducible testing
  • Performance Metrics: Granular timing, memory tracking, and throughput measurements
  • Integrity Validation: Verify VSA operation properties and detect data corruption
  • Chaos Injection: Test resilience with bitflip injection, erasures, and noise
  • Test Fixtures: Generate synthetic datasets with various patterns and sizes
  • Test Harness: Manage temporary directories and coordinate complex test scenarios

Installation

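Add to your Cargo.toml. The path dependency below assumes the testkit is checked out in a directory next to your own crate: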

[dev-dependencies]
embeddenator-testkit = { path = "../embeddenator-testkit" }

Usage Examples

Generate Test Vectors

use embeddenator_testkit::*;
use rand::thread_rng;

// Generate random sparse vector
let mut rng = thread_rng();
let vec = random_sparse_vec(&mut rng, 10000, 200);

// Generate deterministic vector for reproducible tests
let vec = deterministic_sparse_vec(10000, 200, 42);
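
To illustrate why a seeded generator makes tests reproducible, here is a minimal, standalone sketch. It does not use the crate's types: `SketchSparseVec`, `SplitMix64`, and `sketch_deterministic_sparse_vec` are hypothetical names, and reading the three arguments as dimension, nonzero count, and seed is an assumption about the real `deterministic_sparse_vec`.

```rust
use std::collections::BTreeSet;

/// Hypothetical sparse ternary vector: sorted indices holding +1 and -1.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SketchSparseVec {
    pub dim: usize,
    pub pos: Vec<usize>,
    pub neg: Vec<usize>,
}

/// SplitMix64, a small PRNG: the same seed always yields the same stream.
pub struct SplitMix64(u64);

impl SplitMix64 {
    pub fn new(seed: u64) -> Self {
        SplitMix64(seed)
    }

    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Picks `nonzeros` distinct indices below `dim` and assigns each a sign.
/// The modulo step has a slight bias, which is acceptable for test data.
pub fn sketch_deterministic_sparse_vec(dim: usize, nonzeros: usize, seed: u64) -> SketchSparseVec {
    assert!(nonzeros <= dim, "cannot place more nonzeros than dimensions");
    let mut rng = SplitMix64::new(seed);

    // A BTreeSet rejects duplicates and iterates in sorted order.
    let mut chosen = BTreeSet::new();
    while chosen.len() < nonzeros {
        chosen.insert((rng.next_u64() % dim as u64) as usize);
    }

    let mut pos = Vec::new();
    let mut neg = Vec::new();
    for idx in chosen {
        if rng.next_u64() & 1 == 0 {
            pos.push(idx);
        } else {
            neg.push(idx);
        }
    }
    SketchSparseVec { dim, pos, neg }
}
```

Calling the function twice with the same seed returns equal vectors, so a failing test can be replayed exactly by reusing its seed.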

Performance Measurement

use embeddenator_testkit::*;

let mut metrics = TestMetrics::new("bind_operation");

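// `a` and `b` are VSA vectors created beforehand

// Time an operation with explicit start/stop calls
metrics.start_timing();
let result = a.bind(&b);
metrics.stop_timing();

// Or pass the operation as a closure
let result = metrics.time_operation(|| a.bind(&b));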

println!("{}", metrics.summary());
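
The closure form can be pictured with a small standalone sketch built on `std::time::Instant`. `SketchMetrics` is a hypothetical stand-in, not the crate's `TestMetrics`; in particular, returning the closure's result from `time_operation` is an assumption suggested by the usage above.

```rust
use std::time::{Duration, Instant};

/// Hypothetical metrics collector that stores one duration per timed call.
pub struct SketchMetrics {
    name: String,
    samples: Vec<Duration>,
}

impl SketchMetrics {
    pub fn new(name: &str) -> Self {
        SketchMetrics {
            name: name.to_string(),
            samples: Vec::new(),
        }
    }

    /// Runs `op`, records its wall-clock duration, and returns its result.
    pub fn time_operation<T>(&mut self, op: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let result = op();
        self.samples.push(start.elapsed());
        result
    }

    pub fn sample_count(&self) -> usize {
        self.samples.len()
    }

    /// Sum of all recorded durations.
    pub fn total(&self) -> Duration {
        self.samples.iter().sum()
    }

    pub fn summary(&self) -> String {
        format!(
            "{}: {} samples, total {:?}",
            self.name,
            self.samples.len(),
            self.total()
        )
    }
}
```

Taking the operation as a closure keeps the start and stop calls paired, which avoids the mistake of forgetting to stop a timer on an early return.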

Integrity Validation

use embeddenator_testkit::*;

let validator = IntegrityValidator::new();

// Validate bitsliced vector invariants (`vec` is a vector built earlier)
let report = validator.validate_bitsliced(&vec);
assert!(report.is_ok());

// Validate bind invariants such as commutativity (`a` and `b` are built earlier)
let report = validator.validate_bind_invariants(&a, &b);
println!("{}", report.summary());
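
As a rough picture of what a bind invariant check involves, the sketch below models bind as the element-wise product of dense ternary vectors. That model is a common VSA convention and an assumption here, not a statement about how embeddenator implements bind; `sketch_bind` and `check_bind_invariants` are hypothetical names.

```rust
/// True if every entry is -1, 0, or +1.
pub fn is_ternary(v: &[i8]) -> bool {
    v.iter().all(|x| matches!(*x, -1 | 0 | 1))
}

/// Element-wise product of two ternary vectors of equal length.
pub fn sketch_bind(a: &[i8], b: &[i8]) -> Vec<i8> {
    assert_eq!(a.len(), b.len(), "dimension mismatch");
    a.iter().zip(b).map(|(x, y)| x * y).collect()
}

/// Returns a list of violated invariants; an empty list means all checks passed.
pub fn check_bind_invariants(a: &[i8], b: &[i8]) -> Vec<String> {
    let mut failures = Vec::new();
    if a.len() != b.len() {
        failures.push("dimension mismatch".to_string());
        return failures;
    }
    if !is_ternary(a) || !is_ternary(b) {
        failures.push("input has entries outside -1, 0, +1".to_string());
        return failures;
    }

    let ab = sketch_bind(a, b);
    if ab != sketch_bind(b, a) {
        failures.push("bind(a, b) != bind(b, a)".to_string());
    }
    if !is_ternary(&ab) {
        failures.push("bind result is not ternary".to_string());
    }

    // Binding with the same key twice should restore `a` wherever the key is nonzero.
    let restored = sketch_bind(&ab, b);
    let self_inverse = a
        .iter()
        .zip(b)
        .zip(&restored)
        .all(|((x, k), r)| *k == 0 || x == r);
    if !self_inverse {
        failures.push("bind is not self-inverse on the key's support".to_string());
    }
    failures
}
```

Collecting failures in a list, rather than panicking on the first one, mirrors the report-style API shown above where a summary is printed after validation.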

Chaos Testing

use embeddenator_testkit::*;

let injector = ChaosInjector::new(42);

// Inject bitflips into a clone so the original `vec` stays intact
let mut damaged = vec.clone();
let flipped = injector.inject_bitflips(&mut damaged, 10);

// Create corrupted copy of the original
let corrupted = injector.corrupt_copy(&vec, 0.05); // 5% error rate
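
The idea behind seeded bitflip injection can be sketched on a plain byte buffer. Everything here is hypothetical (`XorShift64`, `sketch_inject_bitflips`, `hamming_distance`); the crate's `ChaosInjector` works on its own vector types, and treating the second argument as a flip count is an assumption.

```rust
/// Xorshift64 PRNG, seeded so that a failing run can be replayed.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // Xorshift must not start from zero, so substitute a fixed constant.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Flips `count` distinct bits in `data` and returns the flipped bit positions.
/// `count` is capped at the number of bits available.
pub fn sketch_inject_bitflips(data: &mut [u8], count: usize, seed: u64) -> Vec<usize> {
    let total_bits = data.len() * 8;
    let count = count.min(total_bits);
    let mut rng = XorShift64::new(seed);
    let mut flipped = Vec::with_capacity(count);
    while flipped.len() < count {
        let bit = (rng.next_u64() % total_bits as u64) as usize;
        // Skip positions already flipped so each flip changes the data.
        if !flipped.contains(&bit) {
            data[bit / 8] ^= 1 << (bit % 8);
            flipped.push(bit);
        }
    }
    flipped
}

/// Number of differing bits between two buffers (compares the common prefix).
pub fn hamming_distance(a: &[u8], b: &[u8]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}
```

Because the flipped positions are distinct, the Hamming distance between the original and the damaged buffer equals the requested count, which gives a resilience test an exact error budget to assert against.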

Test Dataset Generation

use embeddenator_testkit::*;

// Create test data with specific pattern
let data = create_test_data(100, TestDataPattern::Random); // 100MB

// Create test harness with automatic cleanup
let harness = TestHarness::new();
let dataset_dir = harness.create_dataset(500); // 500MB dataset

// Create specific files
let file = harness.create_file("test.bin", b"test data");

Module Overview

generators

  • random_sparse_vec() - Generate random sparse vectors
  • deterministic_sparse_vec() - Reproducible vector generation
  • sparse_dot() - Reference dot product implementation
  • generate_noise_pattern() - Synthetic noise data
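
A reference dot product for sparse ternary vectors can be written as a handful of sorted-list intersections. The sketch below assumes each vector is stored as two sorted, duplicate-free index lists (positive and negative entries); that layout and the name `sketch_sparse_dot` are assumptions for illustration, not the signature of the crate's `sparse_dot()`.

```rust
use std::cmp::Ordering;

/// Counts indices present in both sorted, duplicate-free slices (two-pointer merge).
fn intersection_count(a: &[usize], b: &[usize]) -> i64 {
    let (mut i, mut j, mut n) = (0, 0, 0);
    while i < a.len() && j < b.len() {
        match a[i].cmp(&b[j]) {
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            Ordering::Equal => {
                n += 1;
                i += 1;
                j += 1;
            }
        }
    }
    n
}

/// Dot product of two ternary vectors given as (+1 indices, -1 indices).
/// Matching signs contribute +1 per shared index, opposite signs contribute -1.
pub fn sketch_sparse_dot(
    a_pos: &[usize],
    a_neg: &[usize],
    b_pos: &[usize],
    b_neg: &[usize],
) -> i64 {
    let same = intersection_count(a_pos, b_pos) + intersection_count(a_neg, b_neg);
    let opposite = intersection_count(a_pos, b_neg) + intersection_count(a_neg, b_pos);
    same - opposite
}
```

A slow but obviously correct routine like this is useful as an oracle: optimized implementations can be compared against it on generated vectors.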

metrics

  • TestMetrics - Performance measurement and statistics
  • TimingStats - Timing analysis (mean, median, percentiles)
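
For readers who want to see how mean, median, and percentiles fall out of raw samples, here is a standalone sketch. `SketchTimingStats` is a hypothetical type, and the nearest-rank percentile definition is one common choice, not necessarily the one `TimingStats` uses.

```rust
use std::time::Duration;

/// Hypothetical summary of a set of timing samples.
#[derive(Debug, PartialEq)]
pub struct SketchTimingStats {
    pub mean: Duration,
    pub median: Duration,
    pub p95: Duration,
}

/// Nearest-rank percentile on a non-empty, sorted slice.
fn percentile(sorted: &[Duration], pct: f64) -> Duration {
    let rank = ((pct / 100.0) * sorted.len() as f64).ceil() as usize;
    sorted[rank.clamp(1, sorted.len()) - 1]
}

/// Returns `None` when there are no samples to summarize.
pub fn sketch_timing_stats(samples: &[Duration]) -> Option<SketchTimingStats> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort();

    let total: Duration = sorted.iter().sum();
    let mean = total / sorted.len() as u32;

    // For an even count, the median is the average of the two middle samples.
    let mid = sorted.len() / 2;
    let median = if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2
    } else {
        sorted[mid]
    };

    Some(SketchTimingStats {
        mean,
        median,
        p95: percentile(&sorted, 95.0),
    })
}
```

Reporting a high percentile next to the mean helps spot occasional slow runs that an average alone would hide.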

integrity

  • IntegrityValidator - Verify VSA operation properties
  • IntegrityReport - Validation results and diagnostics

chaos

  • ChaosInjector - Inject errors for resilience testing
  • Bitflip, erasure, and corruption utilities

fixtures

  • TestDataPattern - Data pattern types
  • create_test_data() - Generate test data
  • create_test_dataset() - Multi-file test datasets
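
Pattern-driven data generation can be sketched in a few lines. Only the `Random` pattern is named in this README; the other variants, the `SketchPattern` enum, and `sketch_test_data` are hypothetical. The sketch also sizes its output in bytes to stay lightweight, whereas the usage example above sizes in megabytes.

```rust
/// Hypothetical data patterns; only a random pattern is confirmed by the README.
pub enum SketchPattern {
    Zeros,
    Sequential,
    Random { seed: u64 },
}

/// Produces `len_bytes` bytes following the requested pattern.
pub fn sketch_test_data(len_bytes: usize, pattern: SketchPattern) -> Vec<u8> {
    match pattern {
        // Highly compressible input.
        SketchPattern::Zeros => vec![0u8; len_bytes],
        // Repeating 0..=255 ramp, handy for spotting offset errors.
        SketchPattern::Sequential => (0..len_bytes).map(|i| (i % 256) as u8).collect(),
        // Seeded pseudo-random bytes from a 64-bit linear congruential generator
        // (multiplier and increment from Knuth's MMIX); the top byte is emitted.
        SketchPattern::Random { seed } => {
            let mut state = seed;
            (0..len_bytes)
                .map(|_| {
                    state = state
                        .wrapping_mul(6364136223846793005)
                        .wrapping_add(1442695040888963407);
                    (state >> 56) as u8
                })
                .collect()
        }
    }
}
```

Different patterns stress different code paths: constant data exercises best-case compression and deduplication, while seeded random data approximates incompressible input and remains reproducible.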

harness

  • TestHarness - Unified test management
  • Temporary directory handling
  • Performance metric collection
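
Automatic cleanup of temporary directories is typically tied to `Drop`. The sketch below shows that pattern with the standard library only; `SketchHarness` is a hypothetical type, and its fallible constructor and naming scheme are assumptions rather than the behavior of the crate's `TestHarness`.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicU64, Ordering};

/// Counter that keeps directory names unique within one process.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

/// Hypothetical harness owning a temporary directory for the duration of a test.
pub struct SketchHarness {
    root: PathBuf,
}

impl SketchHarness {
    /// Creates a fresh directory under the system temp dir.
    pub fn new() -> io::Result<Self> {
        let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
        let name = format!("testkit-sketch-{}-{}", std::process::id(), id);
        let root = std::env::temp_dir().join(name);
        fs::create_dir_all(&root)?;
        Ok(SketchHarness { root })
    }

    pub fn path(&self) -> &Path {
        &self.root
    }

    /// Writes `contents` to `name` inside the harness directory.
    pub fn create_file(&self, name: &str, contents: &[u8]) -> io::Result<PathBuf> {
        let path = self.root.join(name);
        fs::write(&path, contents)?;
        Ok(path)
    }
}

impl Drop for SketchHarness {
    fn drop(&mut self) {
        // Best-effort cleanup; errors are ignored so that drop never panics.
        let _ = fs::remove_dir_all(&self.root);
    }
}
```

Because cleanup runs in `drop`, the directory is removed when the harness goes out of scope, including when a test assertion panics and the stack unwinds.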

Migrated from Monolithic Repo

This testkit extracts and consolidates test utilities from:

  • embeddenator/src/testing/mod.rs - Performance metrics and integrity validation
  • embeddenator/tests/common/bt_migration.rs - Vector generators and helpers
  • embeddenator/tests/qa_comprehensive.rs - Test harness and dataset generation
  • Various test modules - Fixture patterns and utilities

Performance Baselines

These are example measurements taken with v0.20.0-alpha.1 on an Intel i7-14700K with 46 GB RAM. Results vary with hardware, data patterns, and system load.

  • Bundle: ~43ns (sparse), ~32µs (dense packed)
  • Bind: ~11ns (sparse), ~20µs (dense packed)
  • Cosine: ~7ns (sparse), ~14µs (dense packed)
  • Ingestion: ~15 MB/s (2GB dataset)
  • Extraction: ~41 MB/s (2GB dataset)

Run cargo bench to establish baselines for your system.
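
When comparing your own runs against the ingestion and extraction figures, throughput is simply bytes processed divided by elapsed time. The helper below is a hypothetical sketch and assumes decimal megabytes (1 MB = 1,000,000 bytes); the README does not say whether its MB/s figures are decimal or binary.

```rust
use std::time::Duration;

/// Throughput in decimal megabytes per second, or `None` for a zero duration.
pub fn throughput_mb_per_s(bytes: u64, elapsed: Duration) -> Option<f64> {
    let secs = elapsed.as_secs_f64();
    if secs == 0.0 {
        return None;
    }
    Some(bytes as f64 / 1_000_000.0 / secs)
}
```

Under the same decimal assumption, a 2 GB dataset at roughly 15 MB/s takes about 133 seconds to ingest, and at roughly 41 MB/s about 49 seconds to extract.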

Testing

Run the testkit's own tests from the directory that contains the embeddenator-testkit checkout:

cargo test --manifest-path embeddenator-testkit/Cargo.toml

Run with specific features:

cargo test --manifest-path embeddenator-testkit/Cargo.toml --features large-scale

License

MIT
