batuta

Crates.io	batuta
lib.rs	batuta
version	0.5.0
created_at	2025-11-21 08:04:54.931227+00
updated_at	2026-01-14 14:31:36.661029+00
description	Orchestration framework for converting ANY project (Python, C/C++, Shell) to modern Rust
repository	https://github.com/paiml/Batuta
id	1943226
size	4,464,908

Noah Gift (noahgift)

batuta

Orchestration framework for the Sovereign AI Stack — privacy-preserving ML infrastructure in pure Rust

Overview
Installation
Quick Start
Stack Components
Architecture
CLI Reference
License

Overview

Batuta coordinates the Sovereign AI Stack, a comprehensive pure-Rust ecosystem for organizations requiring complete control over their ML infrastructure. The stack enables privacy-preserving inference, model management, and data processing without external cloud dependencies.

Key Capabilities

Privacy Tiers: Sovereign (local-only), Private (VPC), Standard (cloud-enabled)
Model Security: Ed25519 signatures, ChaCha20-Poly1305 encryption, BLAKE3 content addressing
API Compatibility: OpenAI-compatible endpoints for drop-in replacement
Observability: Prometheus metrics, distributed tracing, A/B testing
Cost Control: Circuit breakers with configurable daily budgets
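
To make the cost-control capability concrete, here is a minimal, std-only sketch of a circuit breaker with a daily budget. It is not batuta's API: the `CostBreaker` type, the `Decision` enum, the `try_spend` method, and the choice to track cost in integer cents are all assumptions made for illustration.

```rust
/// Outcome of asking the breaker whether a request may proceed.
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    /// The request fits within today's remaining budget.
    Allowed,
    /// The request would exceed the budget by this many cents.
    Tripped { over_by_cents: u64 },
}

/// Hypothetical daily-budget circuit breaker (illustration only).
#[derive(Debug)]
pub struct CostBreaker {
    daily_budget_cents: u64,
    spent_cents: u64,
    day: u64,
}

impl CostBreaker {
    pub fn new(daily_budget_cents: u64) -> Self {
        Self { daily_budget_cents, spent_cents: 0, day: 0 }
    }

    /// Tries to record a request costing `cost_cents` on `day`
    /// (a day counter chosen by the caller, e.g. days since an epoch).
    /// Spending resets whenever the day changes.
    pub fn try_spend(&mut self, day: u64, cost_cents: u64) -> Decision {
        if day != self.day {
            self.day = day;
            self.spent_cents = 0;
        }
        let projected = self.spent_cents.saturating_add(cost_cents);
        if projected > self.daily_budget_cents {
            // Rejected requests do not count against the budget.
            Decision::Tripped { over_by_cents: projected - self.daily_budget_cents }
        } else {
            self.spent_cents = projected;
            Decision::Allowed
        }
    }
}
```

With a budget of 100 cents, a 60-cent request is allowed, a following 50-cent request trips the breaker, and a 40-cent request on the same day still fits. Integer cents are used here only to avoid floating-point rounding in the comparison.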

Installation

cargo install batuta

Or add to your Cargo.toml:

[dependencies]
batuta = "0.5"

Quick Start

# Analyze project structure and dependencies
batuta analyze --languages --dependencies --tdg

# Query the Sovereign AI Stack
batuta oracle "How do I serve a Llama model locally?"

# Model registry operations
batuta pacha pull llama3-8b-q4
batuta pacha sign model.gguf --identity alice@example.com
batuta pacha verify model.gguf

# Encrypt models for distribution
batuta pacha encrypt model.gguf --password-env MODEL_KEY
batuta pacha decrypt model.gguf.enc --password-env MODEL_KEY

Demo

Live Demo: paiml.github.io/batuta | API Docs

Example Output (batuta analyze --tdg):

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  📊 Technical Debt Gradient Analysis
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  Project: my-project
  Language: Rust (confidence: 98%)

  Metrics:
    Cyclomatic Complexity:  4.2 avg (good)
    Test Coverage:          87% (A-)
    Documentation:          92% (A)
    Dependency Health:      95% (A+)

  TDG Score: 91.5/100 (A)

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Stack Components

Batuta orchestrates a layered architecture of pure-Rust components:

┌─────────────────────────────────────────────────────────────┐
│                    batuta v0.4.8                            │
│                 (Orchestration Layer)                       │
├─────────────────────────────────────────────────────────────┤
│     realizar v0.5        │         pacha v0.2               │
│   (Inference Engine)     │      (Model Registry)            │
├──────────────────────────┴──────────────────────────────────┤
│   aprender v0.24   │  entrenar v0.5  │  alimentar v0.2      │
│    (ML Algorithms) │    (Training)   │   (Data Loading)     │
├─────────────────────────────────────────────────────────────┤
│   trueno v0.11     │  repartir v2.0  │   renacer v0.9       │
│ (SIMD/GPU Compute) │  (Distributed)  │  (Syscall Tracing)   │
└─────────────────────────────────────────────────────────────┘

Core Components

Component	Version	Description
trueno	0.11	SIMD/GPU compute primitives (AVX2/AVX-512/NEON, wgpu)
aprender	0.24	ML algorithms: regression, trees, clustering, NAS
entrenar	0.5	Training: autograd, LoRA/QLoRA, quantization
realizar	0.5	Inference engine for GGUF/SafeTensors models
pacha	0.2	Model registry with signatures, encryption, lineage
repartir	2.0	Distributed compute (CPU/GPU/Remote executors)
renacer	0.9	Syscall tracing with semantic validation
batuta	0.4	Stack orchestration, drift detection, CLI

Extended Ecosystem

Component	Version	Description
trueno-db	0.3	GPU-accelerated analytics database
trueno-graph	0.1	Graph database for code analysis
trueno-rag	0.1	RAG pipeline (chunking, BM25+vector, RRF)
trueno-viz	0.1	Terminal/PNG visualization
alimentar	0.2	Zero-copy Parquet/Arrow data loading
whisper-apr	0.1	Pure Rust Whisper ASR (WASM-first)
jugar	0.1	Game engine (ECS, physics, AI, WASM)
simular	0.3	Simulation engine (Monte Carlo, physics)
bashrs	6.53	Shell-to-Rust transpiler and linter
presentar	0.3	Terminal presentation framework
pmat	2.213	Project quality analysis toolkit

Commands

`batuta analyze`

Analyze project structure, languages, and dependencies:

batuta analyze --languages --dependencies --tdg

# Output:
# Primary language: Python
# Dependencies: pip (42 packages), ML frameworks detected
# TDG Score: 73.2/100 (B)
# Recommended: Use Aprender for ML, Realizar for inference

`batuta oracle`

Query the stack for component recommendations:

# Natural language queries
batuta oracle "Train random forest on 1M samples"

# List all components
batuta oracle --list

# Component details
batuta oracle --show realizar

# Interactive mode
batuta oracle --interactive

`batuta pacha`

Model registry operations:

# Pull models from registry
batuta pacha pull llama3-8b-q4

# Generate signing keys
batuta pacha keygen --identity alice@example.com

# Sign models for distribution
batuta pacha sign model.gguf --identity alice@example.com

# Verify model signatures
batuta pacha verify model.gguf

# Encrypt models at rest
batuta pacha encrypt model.gguf --password-env MODEL_KEY

# Decrypt for inference
batuta pacha decrypt model.gguf.enc --password-env MODEL_KEY

`batuta content`

Generate structured content with quality constraints:

# Available content types
batuta content types

# Generate book chapter prompt
batuta content emit --type bch --title "Error Handling" --audience "developers"

# Validate content quality
batuta content validate --type bch chapter.md

`batuta stack`

Manage the Sovereign AI Stack ecosystem:

# Check stack component versions
batuta stack versions

# Detect version drift across published crates
batuta stack drift

# Generate fix commands for drift issues
batuta stack drift --fix --workspace ~/src

# Check which crates need publishing
batuta stack publish-status

# Quality gate for CI/pre-commit
batuta stack gate

Automatic Drift Detection: Batuta blocks all commands if any published stack crate depends on an outdated version of another stack crate. To bypass the check in an emergency, pass --unsafe-skip-drift-check.
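
The drift rule can be sketched in a few lines of std-only Rust. This is an illustration under stated assumptions, not how batuta implements the check: the `Drift` struct, `parse_version`, and `detect_drift` are hypothetical names, and versions are compared only on their (major, minor) pair.

```rust
use std::collections::BTreeMap;

/// A (major, minor) version pair; patch levels are ignored in this sketch.
pub type Version = (u32, u32);

/// One stack crate that depends on an older release of another stack crate.
#[derive(Debug, PartialEq, Eq)]
pub struct Drift {
    pub krate: String,
    pub dependency: String,
    pub declared: Version,
    pub latest: Version,
}

/// Parses "0.11", "v0.11", or "0.11.3" into (0, 11).
/// Returns None when the major or minor part is missing or not a number.
pub fn parse_version(s: &str) -> Option<Version> {
    let mut parts = s.trim_start_matches('v').split('.');
    let major = parts.next()?.parse::<u32>().ok()?;
    let minor = parts.next()?.parse::<u32>().ok()?;
    Some((major, minor))
}

/// Compares each crate's declared stack dependencies against the latest
/// published versions and reports every dependency that lags behind.
pub fn detect_drift(
    latest: &BTreeMap<String, Version>,
    declared: &BTreeMap<String, BTreeMap<String, Version>>,
) -> Vec<Drift> {
    let mut drifts = Vec::new();
    for (krate, deps) in declared {
        for (dep, &want) in deps {
            // Dependencies outside the stack have no entry and are skipped.
            if let Some(&have) = latest.get(dep) {
                if want < have {
                    drifts.push(Drift {
                        krate: krate.clone(),
                        dependency: dep.clone(),
                        declared: want,
                        latest: have,
                    });
                }
            }
        }
    }
    drifts
}
```

An empty result corresponds to a clean stack; a non-empty result is the situation in which, per the note above, commands would be blocked.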

Privacy Tiers

The stack enforces data sovereignty through configurable privacy tiers:

Tier	Behavior	Use Case
Sovereign	Blocks ALL external API calls	Healthcare, Government
Private	VPC/dedicated endpoints only	Financial services
Standard	Public APIs allowed	General deployment

use batuta::serve::{BackendSelector, PrivacyTier};

let selector = BackendSelector::new()
    .with_privacy(PrivacyTier::Sovereign);

// Returns only local backends: Realizar, Ollama, LlamaCpp
let backends = selector.recommend();
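
For readers who want to see the tier rules in isolation, the following std-only sketch models the table above. It does not use or reproduce the `batuta::serve` API: the `Locality` enum, the `Backend` struct, and the `permits` and `allowed` functions are hypothetical, and the sketch assumes that local backends are also acceptable in the Private tier.

```rust
/// Privacy tiers as listed in the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PrivacyTier {
    Sovereign,
    Private,
    Standard,
}

/// Where a backend runs (hypothetical classification for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Locality {
    Local,
    Vpc,
    PublicCloud,
}

/// A candidate inference backend (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Backend {
    pub name: &'static str,
    pub locality: Locality,
}

/// Returns true if `tier` permits traffic to a backend at `locality`.
pub fn permits(tier: PrivacyTier, locality: Locality) -> bool {
    match tier {
        // Sovereign: nothing leaves the machine.
        PrivacyTier::Sovereign => locality == Locality::Local,
        // Private: local or VPC endpoints, never public APIs (assumption).
        PrivacyTier::Private => locality != Locality::PublicCloud,
        // Standard: everything is allowed.
        PrivacyTier::Standard => true,
    }
}

/// Names of the backends that `tier` allows, in their original order.
pub fn allowed(tier: PrivacyTier, backends: &[Backend]) -> Vec<&'static str> {
    backends
        .iter()
        .filter(|b| permits(tier, b.locality))
        .map(|b| b.name)
        .collect()
}
```

Encoding the rule as an exhaustive `match` means that adding a new tier or locality forces the policy to be revisited at compile time.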

Model Security

Digital Signatures (Ed25519)

Verify model integrity before loading:

use pacha::signing::{SigningKey, sign_model, verify_model};

let signing_key = SigningKey::generate();
let signature = sign_model(&model_data, &signing_key)?;

// Verification fails if model tampered
verify_model(&model_data, &signature)?;

Encryption at Rest (ChaCha20-Poly1305)

Protect models during distribution:

use pacha::crypto::{encrypt_model, decrypt_model};

let encrypted = encrypt_model(&model_data, "password")?;
let decrypted = decrypt_model(&encrypted, "password")?;

Documentation

The Batuta Book — Comprehensive guide
Sovereign AI Stack Book — Complete stack tutorial with 22 chapters
API Documentation — Rust API reference
Specifications — Technical specifications

Design Principles

Batuta applies Toyota Production System principles:

Principle	Application
Jidoka	Automatic failover with context preservation
Poka-Yoke	Privacy tiers prevent data leakage
Heijunka	Spillover routing for load leveling
Muda	Cost circuit breakers prevent waste
Kaizen	Continuous metrics and optimization
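
The Jidoka and Heijunka rows can be illustrated together with a small std-only sketch of priority-ordered routing that spills over to the next endpoint when the preferred one is full or unhealthy. The `Endpoint` struct and `route` function are hypothetical and are not taken from batuta's codebase.

```rust
/// A serving endpoint with a fixed concurrency limit (hypothetical shape).
#[derive(Debug)]
pub struct Endpoint {
    pub name: &'static str,
    pub capacity: usize,
    pub in_flight: usize,
    pub healthy: bool,
}

/// Picks the first healthy endpoint with spare capacity, in priority order,
/// and reserves one slot on it. Returns None when every endpoint is either
/// unhealthy or already at capacity.
pub fn route(endpoints: &mut [Endpoint]) -> Option<&'static str> {
    let chosen = endpoints
        .iter_mut()
        .find(|e| e.healthy && e.in_flight < e.capacity)?;
    chosen.in_flight += 1;
    Some(chosen.name)
}
```

With two endpoints of capacity 1, the first request goes to the preferred endpoint, the second spills over to the next one, and a third is refused until a slot is released. Marking the preferred endpoint unhealthy sends all traffic to the second, which is the failover case.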

Development

# Clone repository
git clone https://github.com/paiml/batuta.git
cd batuta

# Build
cargo build --release

# Run tests
cargo test

# Build documentation
mdbook build book

Contributing

Contributions are welcome! Please follow these guidelines:

Fork the repository and create your branch from main
Run tests before submitting: cargo test --all-features
Run lints: cargo clippy --all-targets --all-features -- -D warnings
Format code: cargo fmt --all
Update documentation for any API changes
Submit a pull request with a clear description

See our CI workflow for the full test suite.

License

MIT License — see LICENSE for details.

Links

Batuta — Orchestrating sovereign AI infrastructure.

Commit count: 223
