ruvector-attention-wasm

Crates.io: ruvector-attention-wasm
lib.rs: ruvector-attention-wasm
version: 0.1.32
created_at: 2025-12-09 01:22:41.070087+00
updated_at: 2026-01-23 01:20:55.906004+00
description: High-performance WebAssembly attention mechanisms: Multi-Head, Flash, Hyperbolic, MoE, CGT Sheaf Attention with GPU acceleration for transformers and LLMs
homepage: https://ruv.io/ruvector
repository: https://github.com/ruvnet/ruvector
id: 1974669
size: 64,971
owner: rUv (ruvnet)
documentation: https://docs.rs/ruvector-attention-wasm

README

ruvector-attention-wasm

WebAssembly bindings for the ruvector-attention package, providing high-performance attention mechanisms for browser and Node.js environments.

Features

  • Multiple Attention Mechanisms:
    • Scaled Dot-Product Attention
    • Multi-Head Attention
    • Hyperbolic Attention (for hierarchical data)
    • Linear Attention (Performer-style)
    • Flash Attention (memory-efficient)
    • Local-Global Attention
    • Mixture of Experts (MoE) Attention
    • CGT Sheaf Attention (coherence-gated via Prime-Radiant)
  • Training Utilities:
    • InfoNCE contrastive loss
    • Adam optimizer
    • AdamW optimizer (with decoupled weight decay)
    • Learning rate scheduler (warmup + cosine decay)
  • TypeScript Support: Full type definitions and modern API

Installation

npm install ruvector-attention-wasm

Usage

TypeScript/JavaScript

import { initialize, MultiHeadAttention, utils } from 'ruvector-attention-wasm';

// Initialize WASM module
await initialize();

// Create multi-head attention
const attention = new MultiHeadAttention({ dim: 64, numHeads: 8 });

// Prepare inputs
const query = new Float32Array(64);
const keys = [new Float32Array(64), new Float32Array(64)];
const values = [new Float32Array(64), new Float32Array(64)];

// Compute attention
const output = attention.compute(query, keys, values);

// Use utilities
const similarity = utils.cosineSimilarity(query, keys[0]);

Advanced Examples

Hyperbolic Attention

import { HyperbolicAttention } from 'ruvector-attention-wasm';

const hyperbolic = new HyperbolicAttention({
  dim: 128,
  curvature: 1.0
});

const output = hyperbolic.compute(query, keys, values);

MoE Attention with Expert Stats

import { MoEAttention } from 'ruvector-attention-wasm';

const moe = new MoEAttention({
  dim: 64,
  numExperts: 4,
  topK: 2
});

const output = moe.compute(query, keys, values);

// Get expert utilization
const stats = moe.getExpertStats();
console.log('Load balance:', stats.loadBalance);

Training with InfoNCE Loss

import { InfoNCELoss, Adam } from 'ruvector-attention-wasm';

const loss = new InfoNCELoss(0.07);  // 0.07 = temperature
const optimizer = new Adam(paramCount, {
  learningRate: 0.001,
  beta1: 0.9,
  beta2: 0.999,
});

// One training step (run inside your training loop)
const lossValue = loss.compute(anchor, positive, negatives);
optimizer.step(params, gradients);

Learning Rate Scheduling

import { LRScheduler, AdamW } from 'ruvector-attention-wasm';

const scheduler = new LRScheduler({
  initialLR: 0.001,
  warmupSteps: 1000,
  totalSteps: 10000,
});

const optimizer = new AdamW(paramCount, {
  learningRate: scheduler.getLR(),
  weightDecay: 0.01,
});

// Training loop
for (let step = 0; step < 10000; step++) {
  optimizer.learningRate = scheduler.getLR();
  optimizer.step(params, gradients);
  scheduler.step();
}

Building from Source

Prerequisites

  • Rust 1.70+
  • wasm-pack

Build Commands

# Build for web (ES modules)
wasm-pack build --target web --out-dir pkg

# Build for Node.js
wasm-pack build --target nodejs --out-dir pkg-node

# Build for bundlers (webpack, vite, etc.)
wasm-pack build --target bundler --out-dir pkg-bundler

# Run tests
wasm-pack test --headless --firefox
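
The Node.js build in pkg-node can then be consumed as a CommonJS package. A minimal sketch, assuming wasm-pack's default underscore naming for the generated module:

// Load the output of `wasm-pack build --target nodejs --out-dir pkg-node`.
// Directory and module name follow wasm-pack defaults and may differ locally.
const wasm = require('./pkg-node/ruvector_attention_wasm');

// The nodejs target instantiates the WASM module at require time,
// so the exported classes are ready to use immediately.
const attention = new wasm.MultiHeadAttention({ dim: 64, numHeads: 8 });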

API Reference

Attention Mechanisms

  • MultiHeadAttention - Standard multi-head attention
  • HyperbolicAttention - Attention in hyperbolic space
  • LinearAttention - Linear complexity attention (Performer)
  • FlashAttention - Memory-efficient attention
  • LocalGlobalAttention - Combined local and global attention
  • MoEAttention - Mixture of Experts attention
  • CGTSheafAttention - Coherence-gated via Prime-Radiant energy
  • scaledDotAttention() - Functional API for basic attention
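
The functional scaledDotAttention() entry point is not demonstrated above. A minimal sketch, assuming it accepts the same (query, keys, values) arguments as the compute() methods (check the generated TypeScript definitions):

import { initialize, scaledDotAttention } from 'ruvector-attention-wasm';

await initialize();

// Basic scaled dot-product attention over the inputs from the Usage section.
// Signature assumed to mirror compute(query, keys, values); not verified.
const output = scaledDotAttention(query, keys, values);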

CGT Sheaf Attention (Prime-Radiant Integration)

The CGT (Coherence-Gated Transformer) Sheaf Attention mechanism uses Prime-Radiant's sheaf Laplacian energy to gate attention based on mathematical consistency:

import { CGTSheafAttention } from 'ruvector-attention-wasm';

const cgtAttention = new CGTSheafAttention({
  dim: 128,
  numHeads: 8,
  coherenceThreshold: 0.3,  // Block if energy > threshold
});

// Attention is gated by coherence energy
const result = cgtAttention.compute(query, keys, values);
console.log('Coherence energy:', result.energy);
console.log('Is coherent:', result.isCoherent);

Key features:

  • Energy-weighted attention: Lower coherence energy → higher attention
  • Automatic hallucination detection via residual analysis
  • GPU-accelerated with wgpu WGSL shaders (vec4 optimized)
  • SIMD fallback (AVX-512/AVX2/NEON)
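
For example, the energy and isCoherent fields shown above can drive a simple runtime gate. A minimal sketch using only those fields:

// Gate downstream use of the attention output on the coherence signal.
const result = cgtAttention.compute(query, keys, values);
if (!result.isCoherent) {
  // Energy exceeded coherenceThreshold: treat this output as unreliable.
  console.warn('Low-coherence attention, energy =', result.energy);
}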

Training

  • InfoNCELoss - Contrastive loss function
  • Adam - Adam optimizer
  • AdamW - AdamW optimizer with weight decay
  • LRScheduler - Learning rate scheduler

Utilities

  • utils.cosineSimilarity() - Cosine similarity between vectors
  • utils.l2Norm() - L2 norm of a vector
  • utils.normalize() - Normalize vector to unit length
  • utils.softmax() - Apply softmax transformation
  • utils.attentionWeights() - Compute attention weights from scores
  • utils.batchNormalize() - Batch normalization
  • utils.randomOrthogonalMatrix() - Generate random orthogonal matrix
  • utils.pairwiseDistances() - Compute pairwise distances
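
A brief sketch combining a few of these helpers (return shapes assumed; check the generated TypeScript definitions):

import { initialize, utils } from 'ruvector-attention-wasm';

await initialize();

// Normalize a vector to unit length, then turn raw scores into weights.
const unit = utils.normalize(new Float32Array([3, 0, 4]));
const norm = utils.l2Norm(unit);        // expected to be ~1.0
const weights = utils.softmax(new Float32Array([1.0, 2.0, 0.5]));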

Performance

The WASM bindings provide near-native performance for attention computations:

  • Optimized with opt-level = "s" and LTO
  • SIMD acceleration where available
  • Efficient memory management
  • Zero-copy data transfer where possible

License

MIT OR Apache-2.0
