| Field | Value |
|---|---|
| Crates.io | repartir |
| lib.rs | repartir |
| version | 2.0.1 |
| created_at | 2025-11-22 19:01:42.594998+00 |
| updated_at | 2026-01-07 10:33:31.710825+00 |
| description | Sovereign AI-grade distributed computing primitives for Rust (CPU, GPU, HPC) |
| homepage | |
| repository | https://github.com/paiml/repartir |
| max_upload_size | |
| id | 1945587 |
| size | 975,965 |
Repartir is a pure Rust library for distributed execution across CPUs, GPUs, and remote machines. Built on the Iron Lotus Framework (Toyota Way principles for systems programming) and validated by the certeza testing methodology.
```bash
# From crates.io
cargo add repartir

# From source
git clone https://github.com/paiml/repartir
cd repartir
cargo install --path .
```
Add to your `Cargo.toml` (the crate is at 2.0.1, so depend on the 2.x line):

```toml
[dependencies]
repartir = "2.0"
tokio = { version = "1.35", features = ["rt-multi-thread", "macros"] }
```
```rust
use repartir::{Pool, task::{Task, Backend}};

#[tokio::main]
async fn main() -> repartir::error::Result<()> {
    // Create a pool with 4 CPU workers
    let pool = Pool::builder()
        .cpu_workers(4)
        .build()?;

    // Submit a task
    let task = Task::builder()
        .binary("/bin/echo")
        .arg("Hello from Repartir!")
        .backend(Backend::Cpu)
        .build()?;

    let result = pool.submit(task).await?;
    if result.is_success() {
        println!("Output: {}", result.stdout_str()?.trim());
    }

    pool.shutdown().await;
    Ok(())
}
```
```bash
cargo run --example hello_repartir
```

See all v1.1 features in action:

```bash
# Generate TLS certificates first
./scripts/generate-test-certs.sh ./certs

# Run comprehensive showcase
cargo run --example v1_1_showcase --features full
```
The showcase demonstrates the v1.1 feature set: GPU detection, TLS-encrypted remote execution, and the PUB/SUB and PUSH/PULL messaging patterns.
Repartir v2.0 introduces Parquet checkpoint storage and data-locality aware scheduling for enterprise-grade distributed computing.
Enable persistent checkpointing with Apache Parquet format (5-10x compression vs JSON):
```rust
use repartir::checkpoint::{CheckpointManager, CheckpointId};
use repartir::task::TaskState;

#[tokio::main]
async fn main() -> repartir::error::Result<()> {
    // Create checkpoint manager with Parquet backend
    let manager = CheckpointManager::new("./checkpoints")?;

    // Save checkpoint (automatically uses Parquet format)
    let checkpoint_id = CheckpointId::new();
    let state = TaskState::new(vec![1, 2, 3, 4]);
    manager.save(checkpoint_id, &state).await?;

    // Restore checkpoint (supports both Parquet and legacy JSON)
    let restored = manager.restore(checkpoint_id).await?;
    println!("Restored iteration: {}", restored.iteration);
    Ok(())
}
```
Storage efficiency:

- 5-10x smaller than JSON checkpoints, thanks to Parquet's columnar compression
- Backward compatible: restores from both `.parquet` and legacy `.json` formats

Run the example:

```bash
cargo run --example checkpoint_example --features checkpoint
```
Minimize network transfers by scheduling tasks on workers that already have required data:
```rust
use repartir::{Pool, task::Task, scheduler::Scheduler};

#[tokio::main]
async fn main() -> repartir::error::Result<()> {
    let scheduler = Scheduler::with_capacity(100);

    // Track data locations
    let worker_id = uuid::Uuid::new_v4();
    scheduler.data_tracker()
        .track_data("dataset_A", worker_id).await;
    scheduler.data_tracker()
        .track_data("dataset_B", worker_id).await;

    // Submit task with affinity (automatically prefers worker_id)
    let task = Task::builder()
        .binary("/usr/bin/process")
        .build()?;
    scheduler.submit_with_data_locality(
        task,
        &["dataset_A".to_string(), "dataset_B".to_string()],
    ).await?;

    // Check locality metrics
    let metrics = scheduler.locality_metrics().await;
    println!("Locality hit rate: {:.1}%", metrics.hit_rate() * 100.0);
    Ok(())
}
```
Scheduling intelligence: each candidate worker is scored by data affinity, computed as `data_items_present / total_data_items`, and the task is routed to the highest-scoring worker.

Performance benefits: fewer network transfers and better cache utilization, because tasks run where their data already lives. The affinity formula is sketched below.
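As a minimal illustration of that formula (a hypothetical helper, not repartir's internal scoring code), the score is simply the fraction of required data items already resident on a worker:

```rust
use std::collections::HashSet;

// Hypothetical sketch of the affinity formula above:
// score = data_items_present / total_data_items.
fn affinity_score(on_worker: &HashSet<String>, required: &[String]) -> f64 {
    if required.is_empty() {
        return 0.0; // no data requirements, so no locality preference
    }
    let present = required
        .iter()
        .filter(|item| on_worker.contains(item.as_str()))
        .count();
    present as f64 / required.len() as f64
}

fn main() {
    let on_worker: HashSet<String> =
        ["dataset_A".to_string(), "dataset_B".to_string()].into();
    let required = ["dataset_A".to_string(), "dataset_C".to_string()];
    // One of the two required items is local, so the score is 0.5.
    println!("affinity = {}", affinity_score(&on_worker, &required));
}
```

A worker scoring 1.0 already holds everything the task needs; 0.0 means every input would have to cross the network.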
High-performance tensor operations with automatic SIMD optimization:
```rust
use repartir::tensor::{TensorExecutor, Tensor};
use repartir::task::Backend;

#[tokio::main]
async fn main() -> repartir::error::Result<()> {
    // Create tensor executor with CPU backend
    let executor = TensorExecutor::builder()
        .backend(Backend::Cpu)
        .build()?;

    // SIMD-accelerated operations
    let a = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0]);
    let b = Tensor::from_slice(&[5.0, 6.0, 7.0, 8.0]);

    // Element-wise operations
    let sum = executor.add(&a, &b).await?;
    let product = executor.mul(&a, &b).await?;

    // Dot product
    let dot = executor.dot(&a, &b).await?;
    println!("Dot product: {}", dot);

    // Scalar operations
    let scaled = executor.scalar_mul(&a, 2.5).await?;
    Ok(())
}
```
SIMD acceleration: tensor operations are vectorized automatically using whatever SIMD instruction sets the CPU provides (see the SIMD executor section below).

Operations:

- Element-wise: `add()`, `sub()`, `mul()`, `div()`
- Reduction: `dot()`
- Scalar: `scalar_mul()`

Run the example:

```bash
cargo run --example tensor_example --features tensor
```
Repartir supports multiple execution backends via feature flags:
```toml
[dependencies]
# CPU only (default)
repartir = "2.0"

# With GPU support (v1.1+)
repartir = { version = "2.0", features = ["gpu"] }

# With remote execution (v1.1+)
repartir = { version = "2.0", features = ["remote"] }

# With TLS encryption (v1.1+)
repartir = { version = "2.0", features = ["remote-tls"] }

# With Parquet checkpointing (v2.0+)
repartir = { version = "2.0", features = ["checkpoint"] }

# With SIMD tensor operations (v2.0+)
repartir = { version = "2.0", features = ["tensor"] }

# All features
repartir = { version = "2.0", features = ["full"] }
```
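If your own crate exposes these capabilities optionally, a common Cargo pattern is to forward a local feature to repartir's (e.g. `gpu = ["repartir/gpu"]` under `[features]`) and gate code on it. A minimal sketch, assuming a `Backend::Gpu` variant that does not appear in the examples above:

```rust
use repartir::task::Backend;

// Choose a backend based on the features this crate was compiled with.
// `Backend::Gpu` is an assumption; only `Backend::Cpu` is shown in the
// quick-start example.
fn preferred_backend() -> Backend {
    #[cfg(feature = "gpu")]
    return Backend::Gpu;

    #[cfg(not(feature = "gpu"))]
    Backend::Cpu
}
```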
The GPU executor uses wgpu for cross-platform GPU compute:
```rust
use repartir::executor::gpu::GpuExecutor;
use repartir::executor::Executor;

#[tokio::main]
async fn main() -> repartir::error::Result<()> {
    let executor = GpuExecutor::new().await?;
    println!("GPU: {}", executor.device_name());
    println!("Compute units: {}", executor.capacity());
    Ok(())
}
```
Supported backends: Vulkan, Metal, and DX12, selected automatically by wgpu for the platform.
Note (v1.1): GPU detection and initialization only. Binary task execution on GPU requires compute shader compilation (v1.2+ with rust-gpu).
```bash
cargo run --example gpu_detect --features gpu
```
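For a sense of what detection involves, here is a standalone sketch using wgpu directly (wgpu 0.20-era API plus the pollster crate; how `GpuExecutor::new()` does this internally is an assumption on our part):

```rust
// Ask the platform for a GPU adapter (Vulkan, Metal, or DX12) and
// print what was found, mirroring a typical detection step.
fn main() {
    let instance = wgpu::Instance::default();
    let adapter = pollster::block_on(
        instance.request_adapter(&wgpu::RequestAdapterOptions::default()),
    );
    match adapter {
        Some(adapter) => {
            let info = adapter.get_info();
            println!("GPU: {} ({:?})", info.name, info.backend);
        }
        None => println!("no compatible GPU adapter found"),
    }
}
```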
Secure remote execution with TLS/SSL encryption using rustls:
```rust
use repartir::executor::tls::TlsConfig;

#[tokio::main]
async fn main() -> repartir::error::Result<()> {
    // Generate test certificates:
    //   ./scripts/generate-test-certs.sh ./certs
    let tls_config = TlsConfig::builder()
        .client_cert("./certs/client.pem")
        .client_key("./certs/client.key")
        .server_cert("./certs/server.pem")
        .server_key("./certs/server.key")
        .ca_cert("./certs/ca.pem")
        .build()?;
    println!("TLS enabled!");
    Ok(())
}
```
Security features: TLS 1.3 via rustls, with mutual certificate authentication (both client and server present certificates validated against the configured CA).
Generate test certificates, then run the example:

```bash
./scripts/generate-test-certs.sh ./certs
cargo run --example tls_example --features remote-tls
```
⚠️ WARNING: The included certificate generator creates self-signed certificates for TESTING ONLY. For production, use certificates from a trusted CA (Let's Encrypt, DigiCert, etc.).
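For production, the same builder shown above can point at CA-issued material. A sketch with purely illustrative paths, assuming `build()` returns `repartir::error::Result<TlsConfig>` as the `?` in the example suggests:

```rust
use repartir::executor::tls::TlsConfig;

fn load_production_tls() -> repartir::error::Result<TlsConfig> {
    // Same builder API as the test example above; only the paths differ.
    // These filesystem locations are illustrative, not a repartir convention.
    TlsConfig::builder()
        .client_cert("/etc/repartir/tls/client.pem")
        .client_key("/etc/repartir/tls/client.key")
        .server_cert("/etc/repartir/tls/server.pem")
        .server_key("/etc/repartir/tls/server.key")
        .ca_cert("/etc/repartir/tls/ca.pem")
        .build()
}
```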
Advanced messaging for distributed coordination with PUB/SUB and PUSH/PULL patterns:
One publisher broadcasts to multiple subscribers:
```rust
use repartir::messaging::{PubSubChannel, Message};

#[tokio::main]
async fn main() -> repartir::error::Result<()> {
    let channel = PubSubChannel::new();

    // Subscribe to topics
    let mut events = channel.subscribe("events").await;
    let mut alerts = channel.subscribe("alerts").await;

    // Publish messages
    channel.publish("events", Message::text("Task completed")).await?;

    // All subscribers receive broadcast
    if let Some(msg) = events.recv().await {
        println!("Event: {}", msg.as_text()?);
    }
    Ok(())
}
```
Use cases: Event notifications, logging, monitoring, real-time updates
Work distribution with automatic load balancing:
```rust
use repartir::messaging::{PushPullChannel, Message};

#[tokio::main]
async fn main() -> repartir::error::Result<()> {
    let channel = PushPullChannel::new(100);

    // Producers push work
    channel.push(Message::text("Work item 1")).await?;
    channel.push(Message::text("Work item 2")).await?;

    // Consumers pull work (load balanced)
    let work = channel.pull().await;
    Ok(())
}
```
Use cases: Work queues, job scheduling, pipeline processing, task distribution
Run the examples:

```bash
cargo run --example pubsub_example
cargo run --example pushpull_example
```
Repartir follows a clean, layered architecture with pepita providing low-level primitives:
```text
┌─────────────────────────────────────────────────────────────────┐
│                            repartir                             │
│                 (High-level Distributed API)                    │
├─────────────────────────────────────────────────────────────────┤
│   Pool   │  Scheduler  │  Serverless  │  Checkpoint │ Messaging │
├─────────────────────────────────────────────────────────────────┤
│                       Executor Backends                         │
│  ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐         │
│  │  CPU   │ │  GPU   │ │MicroVM │ │  SIMD  │ │ Remote │         │
│  │(v1.0)  │ │(v1.1+) │ │(v2.0)  │ │(v2.0)  │ │(v1.1+) │         │
│  └────────┘ └────────┘ └───┬────┘ └───┬────┘ └────────┘         │
└──────────────────────────────┼───────────┼──────────────────────┘
                               │           │
┌──────────────────────────────▼───────────▼──────────────────────┐
│                             pepita                               │
│                (Sovereign AI Kernel Interfaces)                  │
├─────────────┬─────────────┬─────────────┬───────────────────────┤
│   vmm.rs    │  virtio.rs  │   simd.rs   │        zram.rs        │
│ (MicroVMs)  │  (devices)  │  (vectors)  │     (compression)     │
├─────────────┼─────────────┼─────────────┼───────────────────────┤
│   gpu.rs    │  scheduler  │  executor   │        pool.rs        │
│ (compute)   │(work-steal) │ (backends)  │     (high-level)      │
├─────────────┴─────────────┴─────────────┴───────────────────────┤
│   io_uring  │  ublk  │  blk_mq  │  memory      (Kernel ABI)     │
└─────────────────────────────────────────────────────────────────┘
```
| Module | Purpose | Backend |
|---|---|---|
| `executor::cpu` | Execute binaries on CPU threads with work-stealing | Native threads |
| `executor::gpu` | GPU compute detection and shader execution | wgpu (Vulkan/Metal/DX12) |
| `executor::microvm` | Hardware-isolated execution in KVM MicroVMs | pepita::vmm |
| `executor::simd` | SIMD-accelerated vector operations | pepita::simd (AVX-512/NEON) |
| `executor::remote` | Distributed execution over TCP | rustls TLS 1.3 |
| `serverless` | Function-as-a-Service with warm pools | pepita::vmm + pepita::virtio |
| `scheduler` | Priority-based work-stealing scheduler | Blumofe-Leiserson algorithm |
| `checkpoint` | Parquet-based checkpoint storage | Apache Parquet |
| `messaging` | PUB/SUB and PUSH/PULL patterns | tokio channels |
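The `scheduler` row above cites the Blumofe-Leiserson work-stealing algorithm. As a rough standalone illustration of the idea (not repartir's internals), the crossbeam-deque crate exposes the same owner-LIFO / thief-FIFO discipline:

```rust
// Each worker owns a deque: it pushes and pops at one end (LIFO, for
// cache locality), while idle workers steal from the other end (FIFO).
use crossbeam_deque::{Steal, Worker};

fn main() {
    let local = Worker::new_lifo();
    let stealer = local.stealer();

    for task in 0..4 {
        local.push(task);
    }

    // The owner pops its newest task...
    assert_eq!(local.pop(), Some(3));

    // ...while a thief steals the oldest task from the far end.
    match stealer.steal() {
        Steal::Success(task) => println!("stole task {task}"),
        Steal::Empty | Steal::Retry => println!("nothing to steal"),
    }
}
```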
Repartir v2.0 integrates with pepita for hardware-accelerated execution:
Execute vectorized operations using AVX-512/AVX2/SSE/NEON:
```rust
use repartir::executor::simd::{SimdExecutor, SimdTask};

#[tokio::main]
async fn main() -> repartir::error::Result<()> {
    let executor = SimdExecutor::new();

    // Check capabilities
    println!("SIMD: {}-bit vectors", executor.vector_width());

    // Vector addition (SIMD accelerated)
    let a = vec![1.0f32; 10000];
    let b = vec![2.0f32; 10000];
    let task = SimdTask::vadd_f32(a, b);
    let result = executor.execute_simd(task).await?;
    println!("Throughput: {:.2}M elem/s", result.throughput() / 1_000_000.0);
    Ok(())
}
```
Supported operations: `vadd_f32`, `vadd_f64`, `vmul_f32`, `dot_f32`, `matmul_f32`
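By analogy with `vadd_f32` in the example above, the other listed constructors can be used the same way; a hedged sketch, since only the operation names (not their exact signatures) appear in this README:

```rust
use repartir::executor::simd::{SimdExecutor, SimdTask};

#[tokio::main]
async fn main() -> repartir::error::Result<()> {
    let executor = SimdExecutor::new();

    // Dot product of two vectors; signature assumed analogous to vadd_f32.
    let a = vec![1.0f32; 1024];
    let b = vec![0.5f32; 1024];
    let task = SimdTask::dot_f32(a, b);
    let result = executor.execute_simd(task).await?;
    println!("Throughput: {:.2}M elem/s", result.throughput() / 1_000_000.0);
    Ok(())
}
```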
Hardware-isolated execution with sub-100ms cold start:
```rust
use repartir::executor::microvm::{MicroVmExecutor, MicroVmExecutorConfig};

#[tokio::main]
async fn main() -> repartir::error::Result<()> {
    let config = MicroVmExecutorConfig::builder()
        .memory_mib(256)
        .vcpus(2)
        .warm_pool(true, 3) // Keep 3 VMs warm
        .build()?;
    let executor = MicroVmExecutor::new(config)?;

    // VMs from warm pool start in <5ms
    // Cold start is <100ms
    Ok(())
}
```
Features: KVM isolation, warm pools, jailer security, virtio devices
Function-as-a-Service with automatic warm pool management:
```rust
use repartir::serverless::{
    Function, FunctionService, HttpMethod, InvocationRequest, Runtime, Trigger,
};
use std::path::PathBuf;

#[tokio::main]
async fn main() -> repartir::error::Result<()> {
    let mut service = FunctionService::new();

    let function = Function::builder()
        .name("process-data")
        .runtime(Runtime::RustNative { binary: PathBuf::from("./target/release/worker") })
        .memory_mib(256)
        .trigger(Trigger::Http {
            path: "/api/process".to_string(),
            methods: vec![HttpMethod::Post],
        })
        .build()?;
    service.register(function)?;

    // Invoke function
    let request = InvocationRequest::new("process-data", b"input data");
    let response = service.invoke(request).await?;
    Ok(())
}
```
Run the examples:

```bash
cargo run --example simd_example --features simd
cargo run --example microvm_example --features microvm
cargo run --example serverless_example --features serverless
```
Repartir embodies Toyota Production System principles: quality is built in rather than inspected in, and the tiered testing workflow below keeps feedback loops short.
Repartir uses a three-tiered testing approach.

Tier 1: fast feedback for flow state:

```bash
make tier1
```

Runs `cargo check`, `cargo clippy`, and `cargo fmt`. Target: under 3 seconds.

Tier 2: comprehensive pre-commit gate:

```bash
make tier2
```

Target: 1-5 minutes.

Tier 3: exhaustive validation:

```bash
make tier3
```

Target: 1-6 hours (run overnight or in CI).
Test results:

```text
✓ 190 unit tests (0.12s) - lib tests
✓ 32 integration tests (0.10s) - pepita integration
✓ 4 property-based tests (1.53s) - proptest
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
226 tests PASSED

pepita (dependency):
✓ 417 unit tests (0.71s)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
643 total tests across pepita + repartir
```
Supply chain hygiene:

- `Cargo.lock` committed and reviewed
- `cargo tree` logged in CI

Per NSA/CISA joint guidance on memory-safe languages:

- `#![deny(unsafe_code)]` in v1.0 (no unsafe code)

Contributions welcome! Please ensure:

- `make tier2` passes
- `make coverage` passes (when configured)
- `cargo clippy -- -D warnings` is clean
- Code is formatted with `cargo fmt`
- API docs build (`cargo doc --open`)

See the Iron Lotus Code Review Framework for detailed guidelines.

Repartir is grounded in peer-reviewed research; see the specification for complete citations.
| Feature | Repartir (v1.1) | Ray | Dask |
|---|---|---|---|
| Language | Rust | Python | Python |
| C Dependencies | Zero* | Many | Some |
| GPU Support | Yes (wgpu) | Limited | No |
| Work Stealing | Yes | No | Yes |
| Fault Tolerance | Yes | Yes | Limited |
| Memory Safety | Guaranteed | Runtime | Runtime |
| Binary Execution | Yes | No | No |
| Remote Execution | Yes (TCP) | Yes | Yes |
*Note: rustls (used for TLS) currently depends on aws-lc-rs (C). Pure Rust alternatives under evaluation for v1.2+.
MIT License - see LICENSE file for details.
Built with the Iron Lotus Framework. *Quality is not inspected in; it is built in.*