oxirag

Crates.io: oxirag
lib.rs: oxirag
version: 0.1.0
created_at: 2026-01-24 07:08:56.524872+00
updated_at: 2026-01-24 07:08:56.524872+00
description: A four-layer RAG engine with SMT-based logic verification and knowledge graph support
homepage: https://github.com/cool-japan/oxirag
repository: https://github.com/cool-japan/oxirag
max_upload_size:
id: 2066327
size: 2,273,269
KitaSan (cool-japan)

documentation

https://docs.rs/oxirag

OxiRAG

A four-layer Retrieval-Augmented Generation (RAG) engine in Pure Rust with SMT-based logic verification and knowledge graph support.

Key Innovations:

  • Speculative RAG: Treat cached results as "drafts", not "answers", and verify them with a small language model (SLM) before finalizing
  • Context-Aware Prefix Caching: Efficiently manage KV Cache for "premise knowledge"
  • On-the-fly Distillation: Automatically generate specialized lightweight models for frequent queries
  • Hidden States Manipulation: Direct manipulation of transformer hidden states for verification

Overview

OxiRAG provides a robust RAG pipeline with four specialized layers:

Query
  │
  ├───────────────────────────────┐
  ▼                               ▼
┌─────────────────┐       ┌─────────────────┐
│  Layer 1: Echo  │       │ Layer 4: Graph  │
│  (Vector Store) │       │   (GraphRAG)    │
└────────┬────────┘       └────────┬────────┘
         │                         │
         └───────────┬─────────────┘
                     ▼
         ┌─────────────────────┐
         │ Layer 2: Speculator │  ← Draft verification with SLM
         │  (Draft Checker)    │
         └─────────┬───────────┘
                   │
                   ▼
         ┌─────────────────┐
         │  Layer 3: Judge │  ← Logic verification with SMT solver
         │  (SMT Verifier) │
         └────────┬────────┘
                  │
                  ▼
             Verified Answer
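
The diagram shows Layer 1 and Layer 4 results converging before Layer 2, but it does not say how the two result lists are merged. One common option is reciprocal rank fusion (RRF), sketched below with the standard library only. The function name `rrf`, the document ids, and the constant `k` are illustrative assumptions, not OxiRAG's API, and OxiRAG may use a different fusion strategy.

```rust
use std::collections::HashMap;

/// Reciprocal rank fusion: each list contributes 1 / (k + rank) per item,
/// where rank starts at 1. Hypothetical helper, not part of oxirag.
pub fn rrf(lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<&str, f64> = HashMap::new();
    for list in lists {
        for (index, id) in list.iter().enumerate() {
            // `index` is zero-based, so the rank is index + 1.
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + index as f64 + 1.0);
        }
    }
    let mut fused: Vec<(String, f64)> = scores
        .into_iter()
        .map(|(id, score)| (id.to_string(), score))
        .collect();
    // Highest fused score first; ties are broken by id for a stable order.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Calling `rrf(&[vector_hits, graph_hits], 60.0)` ranks documents that appear in both lists above documents that appear in only one, which is the behaviour a hybrid search usually wants.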

Features

  • Layer 1 (Echo): Semantic search using vector embeddings

    • Cosine, Euclidean, and Dot Product similarity metrics
    • In-memory vector store with configurable capacity
    • Pluggable embedding providers (Candle BERT, mock for testing)
  • Layer 2 (Speculator): Draft verification

    • Rule-based speculator for quick verification
    • Candle-based SLM for advanced verification (optional)
    • Accept/Revise/Reject decision pipeline
  • Layer 3 (Judge): SMT-based logic verification

    • Claim extraction from natural language
    • SMT-LIB encoding of logical claims
    • OxiZ SMT solver integration for formal verification
    • Temporal, causal, and modal claim structures
  • Layer 4 (Graph): Knowledge graph-based retrieval (GraphRAG)

    • Entity and relationship extraction from documents
    • In-memory graph store with traversal algorithms
    • BFS traversal, shortest path, and N-hop queries
    • Hybrid search combining vector and graph results
  • Prefix Caching: Context-aware KV cache management

    • Efficient caching of processed document contexts
    • Context fingerprinting for cache key generation
    • LRU eviction with TTL support
    • Prefix matching for partial cache hits
  • Distillation: On-the-fly model distillation support

    • Query pattern frequency tracking
    • Automatic Q&A pair collection
    • Distillation candidate detection
    • Training example export for LoRA fine-tuning
    • Feature-based distillation (FitNet, attention transfer)
    • Teacher-student architecture with progressive learning
    • Distillation loss functions (KL, MSE, cosine)
  • Cross-Platform: Native and WASM support

    • Full async/await support with Tokio (native)
    • WASM bindings for browser/edge deployment
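
The Distillation feature above names three loss functions (KL, MSE, cosine). The sketch below shows what each one computes on plain `f32` slices. It is a standalone illustration under assumed names (`softmax`, `kl_divergence`, `mse`, `cosine_loss`), not the crate's implementation, and it omits the temperature-squared scaling that some distillation recipes apply to the KL term.

```rust
/// Temperature-scaled softmax, shifted by the max logit for numerical stability.
pub fn softmax(logits: &[f32], temperature: f32) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|l| ((l - max) / temperature).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// KL(teacher || student) over the softened output distributions.
pub fn kl_divergence(teacher_logits: &[f32], student_logits: &[f32], temperature: f32) -> f32 {
    let p = softmax(teacher_logits, temperature);
    let q = softmax(student_logits, temperature);
    p.iter()
        .zip(&q)
        .filter(|(p, _)| **p > 0.0) // terms with p = 0 contribute nothing
        .map(|(p, q)| p * (p / q).ln())
        .sum()
}

/// Mean squared error between two feature vectors of equal length.
pub fn mse(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>() / a.len() as f32
}

/// Cosine loss: 1 - cosine similarity, so aligned vectors give 0.
pub fn cosine_loss(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 1.0; // treat a zero vector as maximally dissimilar
    }
    1.0 - dot / (norm_a * norm_b)
}
```

KL compares output distributions, while MSE and cosine loss are the usual choices for matching intermediate features between teacher and student.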

Installation

Add to your Cargo.toml:

[dependencies]
oxirag = "0.1"

Feature Flags

Feature        Description                                        Default
native         Enable native runtime with Tokio                   Yes
wasm           Enable WASM bindings                               No
echo           Enable Layer 1 (semantic search)                   Yes
speculator     Enable Layer 2 with Candle SLM                     No
judge          Enable Layer 3 with OxiZ SMT solver                No
graphrag       Enable Layer 4 (knowledge graph)                   No
prefix-cache   Enable prefix caching for KV cache management      No
distillation   Enable distillation tracking and Q&A collection    No
full           Enable all features (native only)                  No
cuda           Enable CUDA acceleration for Candle                No
metal          Enable Metal acceleration for Candle               No
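
Optional features are switched on through the dependency entry in Cargo.toml. The fragment below is a sketch that uses only feature names from the table above. The particular combination is an arbitrary example, and the two entries are alternatives, so keep only one of them.

```toml
[dependencies]
# Default features (native, echo) plus Layer 3 and Layer 4.
oxirag = { version = "0.1", features = ["judge", "graphrag"] }

# Alternative: everything at once (native only, per the table).
# oxirag = { version = "0.1", features = ["full"] }
```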

Quick Start

use oxirag::prelude::*;

#[tokio::main]
async fn main() -> Result<(), OxiRagError> {
    // Create the Echo layer
    let echo = EchoLayer::new(
        MockEmbeddingProvider::new(384),
        InMemoryVectorStore::new(384),
    );

    // Create the Speculator layer
    let speculator = RuleBasedSpeculator::default();

    // Create the Judge layer
    let judge = JudgeImpl::new(
        AdvancedClaimExtractor::new(),
        MockSmtVerifier::default(),
        JudgeConfig::default(),
    );

    // Build the pipeline
    let mut pipeline = PipelineBuilder::new()
        .with_echo(echo)
        .with_speculator(speculator)
        .with_judge(judge)
        .build()?;

    // Index documents
    pipeline.index(Document::new("The capital of France is Paris.")).await?;
    pipeline.index(Document::new("Paris is known for the Eiffel Tower.")).await?;

    // Query the pipeline
    let query = Query::new("What is the capital of France?");
    let result = pipeline.process(query).await?;

    println!("Answer: {}", result.final_answer);
    println!("Confidence: {:.2}", result.confidence);
    println!("Layers used: {:?}", result.layers_used);

    Ok(())
}

Architecture

Layer 1: Echo

The Echo layer handles semantic search:

// Configure embedding provider
let provider = MockEmbeddingProvider::new(384);

// Configure vector store
let store = InMemoryVectorStore::new(384)
    .with_metric(SimilarityMetric::Cosine)
    .with_max_capacity(10000);

// Create Echo layer
let echo = EchoLayer::new(provider, store);
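
To make the three similarity metrics concrete, here is a minimal standalone sketch of how each one scores a pair of vectors and how a top-k search can be built on top. The `Metric` enum and the `score` and `top_k` functions are hypothetical names for illustration. They are not the crate's `SimilarityMetric` or its store, and whether OxiRAG negates Euclidean distance the same way is an assumption.

```rust
/// Illustrative stand-in for the metrics listed in the features (not oxirag's type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Metric {
    Cosine,
    Euclidean,
    DotProduct,
}

fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Higher score means more similar for every metric.
/// Euclidean distance is negated so it can be ranked the same way.
pub fn score(metric: Metric, a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must share a dimension");
    match metric {
        Metric::DotProduct => dot(a, b),
        Metric::Cosine => {
            let norm = dot(a, a).sqrt() * dot(b, b).sqrt();
            if norm == 0.0 { 0.0 } else { dot(a, b) / norm }
        }
        Metric::Euclidean => {
            let squared: f32 = a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum();
            -squared.sqrt()
        }
    }
}

/// Indices of the `k` stored vectors that score best against `query`.
pub fn top_k(metric: Metric, query: &[f32], store: &[Vec<f32>], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = store
        .iter()
        .enumerate()
        .map(|(i, v)| (i, score(metric, query, v)))
        .collect();
    scored.sort_by(|x, y| y.1.total_cmp(&x.1)); // best score first
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}
```

Cosine ignores vector length, dot product rewards it, and Euclidean measures absolute distance, so the right choice depends on whether the embedding provider normalizes its output.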

Layer 2: Speculator

The Speculator layer verifies draft answers:

// Rule-based (default)
let speculator = RuleBasedSpeculator::default();

// Or with custom config
let config = SpeculatorConfig {
    accept_threshold: 0.9,
    reject_threshold: 0.3,
    ..Default::default()
};
let speculator = RuleBasedSpeculator::new(config);
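
The two thresholds suggest a three-way split of the confidence range. The sketch below shows one plausible reading: at or above `accept_threshold` the draft is accepted, below `reject_threshold` it is rejected, and anything in between is sent back for revision. The `Decision` enum, the `decide` function, and the exact boundary handling are assumptions made for illustration, not the crate's documented behaviour.

```rust
/// Hypothetical outcome type mirroring the Accept/Revise/Reject pipeline.
#[derive(Debug, PartialEq)]
pub enum Decision {
    Accept,
    Revise,
    Reject,
}

/// Map a draft's confidence onto a decision using the two thresholds.
/// Assumes reject_threshold <= accept_threshold.
pub fn decide(confidence: f32, accept_threshold: f32, reject_threshold: f32) -> Decision {
    if confidence >= accept_threshold {
        Decision::Accept
    } else if confidence < reject_threshold {
        Decision::Reject
    } else {
        Decision::Revise
    }
}
```

With the values from the config above (0.9 and 0.3), a draft scored 0.5 would fall into the revise band under this reading.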

Layer 3: Judge

The Judge layer performs formal verification:

let judge = JudgeImpl::new(
    AdvancedClaimExtractor::new(),
    MockSmtVerifier::default(),
    JudgeConfig {
        max_claims: 10,
        check_consistency: true,
        ..Default::default()
    },
);
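
For readers unfamiliar with SMT-LIB encoding, the sketch below shows the general idea on the simplest possible case: each extracted claim becomes a Boolean constant, asserted either positively or negated, and the solver is asked whether all assertions can hold together. The `Claim` struct and `to_smtlib` function are hypothetical. OxiRAG's real encoding (including temporal, causal, and modal claims) is richer and is not described in this README.

```rust
use std::collections::BTreeSet;

/// A propositional claim: an atom name plus a polarity (hypothetical type).
/// Atom names must already be valid SMT-LIB symbols (no spaces or parentheses).
pub struct Claim {
    pub atom: String,
    pub negated: bool,
}

/// Emit an SMT-LIB 2 script that checks whether the claims are jointly satisfiable.
/// A solver answering `unsat` means the claims contradict each other.
pub fn to_smtlib(claims: &[Claim]) -> String {
    let mut script = String::from("(set-logic QF_UF)\n");
    // Declare each distinct atom once, in sorted order for reproducible output.
    let atoms: BTreeSet<&str> = claims.iter().map(|c| c.atom.as_str()).collect();
    for atom in &atoms {
        script.push_str(&format!("(declare-const {} Bool)\n", atom));
    }
    for claim in claims {
        if claim.negated {
            script.push_str(&format!("(assert (not {}))\n", claim.atom));
        } else {
            script.push_str(&format!("(assert {})\n", claim.atom));
        }
    }
    script.push_str("(check-sat)\n");
    script
}
```

Feeding the script for a claim and its negation to any SMT-LIB 2 compliant solver should yield `unsat`, which is the signal a consistency check looks for.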

Layer 4: Graph (GraphRAG)

The Graph layer enables knowledge graph-based retrieval:

use oxirag::layer4_graph::*;

// Create extractors
let entity_extractor = PatternEntityExtractor::new();
let relationship_extractor = PatternRelationshipExtractor::new();

// Create graph store
let graph_store = InMemoryGraphStore::new();

// Build the GraphLayer
let mut graph_layer = GraphLayerBuilder::new()
    .with_entity_extractor(entity_extractor)
    .with_relationship_extractor(relationship_extractor)
    .with_graph_store(graph_store)
    .build()?;

// Index documents (extracts entities and relationships)
graph_layer.index_document(&Document::new("Rust was created by Mozilla.")).await?;

// Query the graph
let query = GraphQuery::new(vec!["Rust".to_string()])
    .with_max_hops(2);
let paths = graph_layer.query(&query).await?;

// Find related entities
let related = graph_layer.find_related("Rust", 2).await?;
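
An N-hop query such as the one above is typically a breadth-first search that stops expanding once the hop limit is reached. The sketch below implements that idea on a small adjacency-list graph. The `Graph` type, its undirected edges, and the `within_hops` method are illustrative assumptions and do not reflect how `InMemoryGraphStore` is built.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Minimal undirected entity graph (hypothetical; not oxirag's graph store).
#[derive(Default)]
pub struct Graph {
    edges: HashMap<String, Vec<String>>,
}

impl Graph {
    pub fn add_edge(&mut self, from: &str, to: &str) {
        self.edges.entry(from.to_string()).or_default().push(to.to_string());
        self.edges.entry(to.to_string()).or_default().push(from.to_string());
    }

    /// Entities reachable from `start` in at most `max_hops` edges,
    /// paired with their hop distance, in breadth-first order.
    pub fn within_hops(&self, start: &str, max_hops: usize) -> Vec<(String, usize)> {
        let mut seen: HashSet<&str> = HashSet::from([start]);
        let mut queue: VecDeque<(&str, usize)> = VecDeque::from([(start, 0)]);
        let mut found = Vec::new();
        while let Some((node, depth)) = queue.pop_front() {
            if depth == max_hops {
                continue; // hop limit reached: do not expand this node
            }
            for next in self.edges.get(node).into_iter().flatten() {
                // `insert` returns false for already-visited nodes.
                if seen.insert(next.as_str()) {
                    found.push((next.clone(), depth + 1));
                    queue.push_back((next.as_str(), depth + 1));
                }
            }
        }
        found
    }
}
```

Because BFS visits nodes in order of distance, the first time an entity is reached is also its shortest hop count, which is why the same traversal underlies shortest-path queries on unweighted graphs.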

Prefix Caching

The Prefix Cache module enables efficient caching of processed document contexts:

use oxirag::prefix_cache::*;

// Create a prefix cache
let config = PrefixCacheConfig::default()
    .with_max_entries(1000)
    .with_default_ttl_secs(3600);
let mut cache = InMemoryPrefixCache::new(config);

// Generate a fingerprint for context
let generator = ContextFingerprintGenerator::new();
let fingerprint = generator.generate("This is a large document context...");

// Create and store a KV cache entry
let entry = KVCacheEntry::new(fingerprint.clone(), vec![0.1, 0.2, 0.3], 512);
cache.put(entry).await?;

// Retrieve cached entry
if let Some(cached) = cache.get(&fingerprint).await {
    println!("Cache hit! Sequence length: {}", cached.sequence_length);
}

// Check cache statistics
let stats = cache.stats();
println!("Hits: {}, Misses: {}", stats.hits, stats.misses);
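
The interaction between LRU eviction and TTL expiry can be sketched in a few lines of standard-library Rust. The `LruTtlCache` type below is a hypothetical illustration, not `InMemoryPrefixCache`: it takes the current time as an explicit `now` argument (in seconds) so the behaviour is deterministic, and it finds the eviction victim with a linear scan, which a production cache would replace with a linked list or similar structure.

```rust
use std::collections::HashMap;

struct Slot<V> {
    value: V,
    expires_at: u64, // absolute expiry time in seconds
    last_used: u64,  // logical clock value of the most recent access
}

/// Tiny LRU cache with a per-entry TTL (illustrative only).
pub struct LruTtlCache<V> {
    max_entries: usize,
    ttl_secs: u64,
    tick: u64,
    slots: HashMap<String, Slot<V>>,
}

impl<V> LruTtlCache<V> {
    pub fn new(max_entries: usize, ttl_secs: u64) -> Self {
        assert!(max_entries > 0, "cache needs room for at least one entry");
        Self { max_entries, ttl_secs, tick: 0, slots: HashMap::new() }
    }

    pub fn put(&mut self, key: &str, value: V, now: u64) {
        self.tick += 1;
        // Evict the least recently used entry when inserting a new key into a full cache.
        if !self.slots.contains_key(key) && self.slots.len() >= self.max_entries {
            let victim = self
                .slots
                .iter()
                .min_by_key(|(_, slot)| slot.last_used)
                .map(|(k, _)| k.clone());
            if let Some(victim) = victim {
                self.slots.remove(&victim);
            }
        }
        let slot = Slot { value, expires_at: now + self.ttl_secs, last_used: self.tick };
        self.slots.insert(key.to_string(), slot);
    }

    pub fn get(&mut self, key: &str, now: u64) -> Option<&V> {
        self.tick += 1;
        // Expired entries are dropped lazily on lookup.
        if self.slots.get(key)?.expires_at <= now {
            self.slots.remove(key);
            return None;
        }
        let slot = self.slots.get_mut(key)?;
        slot.last_used = self.tick;
        Some(&slot.value)
    }
}
```

A successful `get` refreshes recency but not the expiry time in this sketch, so a hot entry still ages out once its TTL passes.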

Distillation

The Distillation module tracks query patterns for model fine-tuning:

use oxirag::distillation::*;

// Create a distillation tracker
let config = DistillationConfig::default()
    .with_min_frequency_threshold(10)
    .with_max_qa_pairs_per_pattern(100);
let mut tracker = InMemoryDistillationTracker::new(config);

// Track queries with their answers
tracker.track_query("What is Rust?", Some("Rust is a systems programming language."), 0.95).await?;
tracker.track_query("What is Rust?", Some("Rust focuses on safety and performance."), 0.92).await?;

// Get candidates ready for distillation
let candidates = tracker.get_candidates().await;
for candidate in candidates {
    if candidate.ready_for_distillation {
        println!("Pattern '{}' has {} Q&A pairs",
            candidate.pattern.normalized_text,
            candidate.qa_pairs.len());
    }
}

// Export training examples for LoRA fine-tuning
let examples = tracker.export_training_examples().await;
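
Conceptually, pattern tracking boils down to normalizing each query, counting how often the normalized form occurs, and flagging patterns that cross a frequency threshold. The sketch below shows that flow with hypothetical names (`normalize`, `PatternTracker`). The normalization rule (lowercase, strip punctuation, collapse whitespace) is an assumption, and the real tracker also records confidence scores, which are left out here.

```rust
use std::collections::HashMap;

/// Lowercase, drop punctuation, and collapse whitespace (assumed normalization).
pub fn normalize(query: &str) -> String {
    query
        .to_lowercase()
        .chars()
        .filter(|c| c.is_alphanumeric() || c.is_whitespace())
        .collect::<String>()
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
}

#[derive(Default)]
struct Pattern {
    frequency: usize,
    qa_pairs: Vec<(String, String)>,
}

/// Counts normalized queries and keeps a bounded list of Q&A pairs per pattern.
pub struct PatternTracker {
    min_frequency: usize,
    max_pairs: usize,
    patterns: HashMap<String, Pattern>,
}

impl PatternTracker {
    pub fn new(min_frequency: usize, max_pairs: usize) -> Self {
        Self { min_frequency, max_pairs, patterns: HashMap::new() }
    }

    pub fn track(&mut self, query: &str, answer: Option<&str>) {
        let pattern = self.patterns.entry(normalize(query)).or_default();
        pattern.frequency += 1;
        if let Some(answer) = answer {
            if pattern.qa_pairs.len() < self.max_pairs {
                pattern.qa_pairs.push((query.to_string(), answer.to_string()));
            }
        }
    }

    /// Patterns seen at least `min_frequency` times, as (pattern, Q&A pair count),
    /// sorted by pattern text.
    pub fn candidates(&self) -> Vec<(&str, usize)> {
        let mut ready: Vec<(&str, usize)> = self
            .patterns
            .iter()
            .filter(|(_, p)| p.frequency >= self.min_frequency)
            .map(|(text, p)| (text.as_str(), p.qa_pairs.len()))
            .collect();
        ready.sort();
        ready
    }
}
```

Normalizing before counting is what lets "What is Rust?" and "what is rust" accumulate toward the same pattern.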

WASM Usage

Build for WASM:

wasm-pack build --target web --features wasm

Use in JavaScript:

import init, { WasmRagEngine } from './pkg/oxirag.js';

await init();

const engine = new WasmRagEngine(384);
await engine.index("doc-1", "The capital of France is Paris.");
const results = await engine.query("What is the capital of France?", 5);

Configuration

Pipeline Configuration

let config = PipelineConfig {
    fast_path_threshold: 0.95,      // Skip layers if confidence is high
    skip_verification_threshold: 0.9,
    enable_fast_path: true,
    max_retries: 3,
    parallel_execution: false,
    max_search_results: 5,
};
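
The README does not spell out how the two thresholds interact, so the sketch below is only one plausible interpretation: a very confident retrieval takes the fast path and skips both verification layers, a fairly confident one skips only the SMT check, and everything else runs the full pipeline. The `Layer` enum, the `Routing` struct, and `layers_to_run` are hypothetical names, and the real pipeline may route differently.

```rust
/// Hypothetical layer labels for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Layer {
    Echo,
    Speculator,
    Judge,
}

/// The subset of pipeline settings that affects routing in this sketch.
pub struct Routing {
    pub enable_fast_path: bool,
    pub fast_path_threshold: f32,
    pub skip_verification_threshold: f32,
}

/// Decide which layers to run for a given retrieval confidence (assumed semantics).
pub fn layers_to_run(cfg: &Routing, retrieval_confidence: f32) -> Vec<Layer> {
    if cfg.enable_fast_path && retrieval_confidence >= cfg.fast_path_threshold {
        // Fast path: trust the retrieval result as-is.
        vec![Layer::Echo]
    } else if retrieval_confidence >= cfg.skip_verification_threshold {
        // Confident enough to skip formal verification, but still check the draft.
        vec![Layer::Echo, Layer::Speculator]
    } else {
        vec![Layer::Echo, Layer::Speculator, Layer::Judge]
    }
}
```

Under this reading, the values shown above (0.95 and 0.9) leave a narrow band in which only the Speculator runs.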

Loading from JSON

let config = OxiRagConfig::from_json(r#"{
    "echo": { "dimension": 384 },
    "pipeline": { "enable_fast_path": true }
}"#)?;

Testing

# Run all tests (1,500 tests)
cargo nextest run --all-features

# Run clippy (no warnings)
cargo clippy --all-features -- -D warnings

# Run rustdoc validation (no warnings)
RUSTDOCFLAGS="-D warnings" cargo doc --all-features --no-deps

# Run doc tests
cargo test --doc --all-features

Benchmarks

cargo bench

Project Statistics

Metric             Value
Source Files       91 Rust files
Lines of Code      48,463
Tests              1,500
Clippy Warnings    0
Rustdoc Warnings   0

License

Licensed under either of:

at your option.

Roadmap

See TODO.md for detailed progress on each layer:

Feature                        Progress
Speculative RAG                99%
Context-Aware Prefix Caching   95%
On-the-fly Distillation        85%
Hidden States Manipulation     90%

COOLJAPAN Ecosystem

OxiRAG is part of the COOLJAPAN Pure Rust ecosystem:

  • OxiZ: SMT solver for logic verification
  • SciRS2: Scientific computing library
  • NumRS2: Numerical computing primitives
  • OxiBLAS: BLAS implementation in Pure Rust

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

COOLJAPAN OU (Team Kitasan)
