| Crates.io | rs3gw |
| lib.rs | rs3gw |
| version | 0.1.0 |
| created_at | 2025-12-06 04:31:52.558778+00 |
| updated_at | 2026-01-04 04:11:31.437435+00 |
| description | High-Performance AI/HPC Object Storage Gateway powered by scirs2-io |
| homepage | |
| repository | https://github.com/cool-japan/rs3gw |
| max_upload_size | |
| id | 1969615 |
| size | 3,220,568 |
High-Performance Enterprise Object Storage Gateway
rs3gw (Rust S3 Gateway) is an ultra-high-performance, enterprise-grade object storage gateway designed for AI/ML workloads, scientific computing (HPC), and large-scale data management. Built on Rust's zero-cost abstractions and powered by scirs2-io, it delivers S3-compatible access with predictable low latency, comprehensive observability, and advanced enterprise features.
Clients: PyTorch/TensorFlow | boto3 | aws-cli | gRPC | GraphQL
    |  HTTP/REST, gRPC, GraphQL, WebSocket
    v
rs3gw Gateway
  +- REST API (100+ ops) | gRPC API (40+ ops) | GraphQL + WebSocket (realtime events)
  +- S3 Select Query Engine: SQL on CSV/JSON/Parquet/Avro/ORC with query optimization
  +- Advanced Features Layer: Dedup, Zero-copy, ML Cache, ABAC, Encryption/Compression, Audit/Compliance
  +- Multi-Backend Storage Abstraction: Local | MinIO | AWS S3 | GCS | Azure | Ceph
    |
    v
scirs2-io High-Performance Storage Engine
  +- Compression (Zstd/LZ4)
  +- Format I/O (Parquet)
  +- Async Buffer Management (Direct I/O)
# Clone the repository
git clone https://github.com/cool-japan/rs3gw.git
cd rs3gw
# Build release binary (optimized)
cargo build --release
# Run the server
./target/release/rs3gw
We provide a comprehensive development stack with monitoring:
# Start the full stack (rs3gw + Prometheus + Grafana + Jaeger + MinIO)
docker-compose -f docker-compose.dev.yml up -d
# Access services:
# - rs3gw S3 API: http://localhost:9000
# - Grafana Dashboard: http://localhost:3000 (admin/admin)
# - Prometheus: http://localhost:9091
# - Jaeger UI: http://localhost:16686
# - MinIO Console: http://localhost:9002 (minioadmin/minioadmin)
rs3gw supports both TOML configuration files and environment variables:
Copy rs3gw.toml.example to rs3gw.toml and customize, or copy .env.example to .env and customize.
Essential Configuration:
export RS3GW_BIND_ADDR="0.0.0.0:9000"
export RS3GW_STORAGE_ROOT="./data"
export RS3GW_ACCESS_KEY="minioadmin"
export RS3GW_SECRET_KEY="minioadmin"
export RS3GW_COMPRESSION="zstd:3"
export RS3GW_CACHE_ENABLED="true"
export RS3GW_DEDUP_ENABLED="true"
# Configure endpoint
aws configure set default.s3.endpoint_url http://localhost:9000
# Create bucket and upload
aws s3 mb s3://my-bucket
aws s3 cp myfile.txt s3://my-bucket/
# S3 Select query (SQL on CSV/JSON/Parquet)
aws s3api select-object-content \
--bucket my-bucket \
--key data.csv \
--expression "SELECT * FROM S3Object WHERE age > 30" \
--expression-type SQL \
--input-serialization '{"CSV": {"FileHeaderInfo": "USE"}}' \
--output-serialization '{"CSV": {}}' \
output.csv
import boto3
s3 = boto3.client(
's3',
endpoint_url='http://localhost:9000',
aws_access_key_id='minioadmin',
aws_secret_access_key='minioadmin',
region_name='us-east-1'
)
# Basic operations
s3.create_bucket(Bucket='my-bucket')
s3.upload_file('local.txt', 'my-bucket', 'remote.txt')
# S3 Select
response = s3.select_object_content(
Bucket='my-bucket',
Key='data.csv',
ExpressionType='SQL',
Expression='SELECT name, age FROM S3Object WHERE age > 25',
InputSerialization={'CSV': {'FileHeaderInfo': 'USE'}},
OutputSerialization={'CSV': {}}
)
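# Reading the Select results: the response 'Payload' is an event stream.
# This is the standard boto3 S3 Select interface; rs3gw is expected to
# emit the same Records/Stats events as AWS S3.
for event in response['Payload']:
    if 'Records' in event:
        print(event['Records']['Payload'].decode('utf-8'), end='')
    elif 'Stats' in event:
        details = event['Stats']['Details']
        print(f"\nScanned {details['BytesScanned']} bytes, returned {details['BytesReturned']} bytes")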
# Multipart upload for large files (each part must be at least 5 MiB, except the last)
def read_chunks(path, chunk_size):
    """Yield fixed-size chunks from a local file."""
    with open(path, 'rb') as f:
        while chunk := f.read(chunk_size):
            yield chunk

mpu = s3.create_multipart_upload(Bucket='my-bucket', Key='large.dat')
parts = []
for i, chunk in enumerate(read_chunks('large.dat', 5*1024*1024), 1):
part = s3.upload_part(
Bucket='my-bucket', Key='large.dat',
PartNumber=i, UploadId=mpu['UploadId'],
Body=chunk
)
parts.append({'PartNumber': i, 'ETag': part['ETag']})
s3.complete_multipart_upload(
Bucket='my-bucket', Key='large.dat',
UploadId=mpu['UploadId'],
MultipartUpload={'Parts': parts}
)
use rs3gw_proto::{s3_service_client::S3ServiceClient, ListBucketsRequest};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut client = S3ServiceClient::connect("http://localhost:9000").await?;
let request = tonic::Request::new(ListBucketsRequest {});
let response = client.list_buckets(request).await?;
for bucket in response.into_inner().buckets {
println!("Bucket: {}", bucket.name);
}
Ok(())
}
query {
buckets {
name
creationDate
objectCount
totalSize
}
searchObjects(query: "*.parquet", bucket: "my-bucket") {
key
size
lastModified
}
}
const ws = new WebSocket('ws://localhost:9000/events/stream?bucket=my-bucket');
ws.onmessage = (event) => {
const data = JSON.parse(event.data);
console.log('Event:', data.event_type, data.object_key);
};
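For script-based consumers, here is a minimal Python sketch subscribing to the same endpoint, assuming the third-party websockets package and the JSON event shape shown in the JavaScript snippet above:
import asyncio
import json
import websockets  # third-party package: pip install websockets

async def watch_bucket():
    url = 'ws://localhost:9000/events/stream?bucket=my-bucket'
    async with websockets.connect(url) as ws:
        async for message in ws:
            event = json.loads(message)
            print('Event:', event['event_type'], event['object_key'])

asyncio.run(watch_bucket())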
Manage machine learning training experiments, checkpoints, and hyperparameter searches:
# Create a training experiment
curl -X POST http://localhost:9000/api/training/experiments \
-H "Content-Type: application/json" \
-d '{
"name": "my-model-training",
"description": "Training ResNet-50 on ImageNet",
"tags": ["resnet", "imagenet"],
"hyperparameters": {
"learning_rate": 0.001,
"batch_size": 32,
"epochs": 100
}
}'
# Save a checkpoint
curl -X POST http://localhost:9000/api/training/experiments/{experiment_id}/checkpoints \
-H "Content-Type: application/json" \
-d '{
"epoch": 10,
"model_state": "base64_encoded_model_data",
"optimizer_state": "base64_encoded_optimizer_data",
"metrics": {
"loss": 0.234,
"accuracy": 0.892
}
}'
# Load a checkpoint
curl http://localhost:9000/api/training/checkpoints/{checkpoint_id}
# Log training metrics
curl -X POST http://localhost:9000/api/training/experiments/{experiment_id}/metrics \
-H "Content-Type: application/json" \
-d '{
"step": 1000,
"metrics": {
"loss": 0.234,
"accuracy": 0.892,
"val_loss": 0.256,
"val_accuracy": 0.875
}
}'
# Get experiment metrics
curl http://localhost:9000/api/training/experiments/{experiment_id}/metrics
# List checkpoints
curl http://localhost:9000/api/training/experiments/{experiment_id}/checkpoints
# Update experiment status
curl -X PUT http://localhost:9000/api/training/experiments/{experiment_id}/status \
-H "Content-Type: application/json" \
-d '{"status": "completed"}'
# Create hyperparameter search
curl -X POST http://localhost:9000/api/training/searches \
-H "Content-Type: application/json" \
-d '{
"search_space": {
"learning_rate": [0.0001, 0.001, 0.01],
"batch_size": [16, 32, 64]
},
"optimization_metric": "val_accuracy"
}'
# Add trial result to hyperparameter search
curl -X POST http://localhost:9000/api/training/searches/{search_id}/trials \
-H "Content-Type: application/json" \
-d '{
"parameters": {
"learning_rate": 0.001,
"batch_size": 32
},
"metrics": {
"val_accuracy": 0.892
},
"status": "completed"
}'
Python example with requests:
import requests
import base64
import json
# Create experiment
response = requests.post('http://localhost:9000/api/training/experiments', json={
'name': 'pytorch-training',
'description': 'Training with PyTorch',
'tags': ['pytorch', 'cnn'],
'hyperparameters': {
'lr': 0.001,
'batch_size': 32
}
})
experiment = response.json()['experiment']
exp_id = experiment['id']
# Save checkpoint during training
import io
import torch

buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)  # serialize your PyTorch model's weights
model_b64 = base64.b64encode(buffer.getvalue()).decode('utf-8')
requests.post(f'http://localhost:9000/api/training/experiments/{exp_id}/checkpoints', json={
'epoch': 10,
'model_state': model_b64,
'metrics': {
'loss': 0.234,
'accuracy': 0.892
}
})
# Log metrics every N steps
for step in range(1000):
# ... training code ...
if step % 100 == 0:
requests.post(f'http://localhost:9000/api/training/experiments/{exp_id}/metrics', json={
'step': step,
'metrics': {
'loss': current_loss,
'accuracy': current_acc
}
})
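To resume training, fetch a checkpoint back and restore the weights. This is a sketch assuming the checkpoint endpoint returns JSON containing the same base64-encoded model_state field that was uploaded above; the exact response shape may differ:
import io

checkpoint_id = '...'  # ID returned when the checkpoint was saved
resp = requests.get(f'http://localhost:9000/api/training/checkpoints/{checkpoint_id}')
checkpoint = resp.json()

state_bytes = base64.b64decode(checkpoint['model_state'])
model.load_state_dict(torch.load(io.BytesIO(state_bytes)))  # restore weights into your model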
Generate test datasets for benchmarking and testing:
# Generate a medium-sized mixed dataset
cargo run --bin testdata-generator -- dataset \
--output ./testdata \
--size medium
# Generate specific file types
cargo run --bin testdata-generator -- parquet \
--output ./parquet-data \
--count 10 \
--rows 100000
Migrate data between S3-compatible systems:
# Copy all objects from MinIO to rs3gw
cargo run --bin s3-migrate -- copy \
--source-endpoint http://minio:9000 \
--source-access-key minioadmin \
--source-secret-key minioadmin \
--source-bucket source-bucket \
--dest-endpoint http://localhost:9000 \
--dest-access-key minioadmin \
--dest-secret-key minioadmin \
--dest-bucket dest-bucket \
--concurrency 20
# Incremental sync with verification
cargo run --bin s3-migrate -- sync \
--source-endpoint http://minio:9000 \
--source-access-key minioadmin \
--source-secret-key minioadmin \
--source-bucket source-bucket \
--dest-endpoint http://localhost:9000 \
--dest-access-key minioadmin \
--dest-secret-key minioadmin \
--dest-bucket dest-bucket \
--delete
# Verify data integrity
cargo run --bin s3-migrate -- verify \
--source-endpoint http://minio:9000 \
--source-access-key minioadmin \
--source-secret-key minioadmin \
--source-bucket source-bucket \
--dest-endpoint http://localhost:9000 \
--dest-access-key minioadmin \
--dest-secret-key minioadmin \
--dest-bucket dest-bucket
# Data Deduplication (30-70% storage savings)
export RS3GW_DEDUP_ENABLED=true
export RS3GW_DEDUP_BLOCK_SIZE=65536
export RS3GW_DEDUP_ALGORITHM=content-defined
# Zero-Copy Optimizations
export RS3GW_ZEROCOPY_DIRECT_IO=true
export RS3GW_ZEROCOPY_SPLICE=true
export RS3GW_ZEROCOPY_MMAP=true
# Smart ML-based Caching
export RS3GW_CACHE_ENABLED=true
export RS3GW_CACHE_MAX_SIZE_MB=512
export RS3GW_CACHE_TTL=300
# Encryption
export RS3GW_ENCRYPTION_ENABLED=true
export RS3GW_ENCRYPTION_ALGORITHM=aes256gcm
# Audit Logging
export RS3GW_AUDIT_ENABLED=true
export RS3GW_AUDIT_LOG_PATH=/var/log/rs3gw/audit.log
# ABAC (Attribute-Based Access Control)
export RS3GW_ABAC_ENABLED=true
# Multi-node cluster with replication
export RS3GW_CLUSTER_ENABLED=true
export RS3GW_CLUSTER_NODE_ID=node1
export RS3GW_CLUSTER_ADVERTISE_ADDR=10.0.0.1:9001
export RS3GW_CLUSTER_SEED_NODES=10.0.0.2:9001,10.0.0.3:9001
export RS3GW_REPLICATION_MODE=quorum
export RS3GW_REPLICATION_FACTOR=3
# OpenTelemetry distributed tracing
export OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:4317
export OTEL_TRACES_SAMPLER=traceidratio
export OTEL_TRACES_SAMPLER_ARG=0.1
# Profiling
export RS3GW_PROFILING_ENABLED=true
export RS3GW_PROFILING_INTERVAL_SECS=60
rs3gw provides powerful server-side object transformation capabilities with extensible plugin support.
| Type | Feature Flag | Status | Use Cases |
|---|---|---|---|
| Image Processing | default | ✅ Production | Resize, crop, format conversion |
| Compression | default | ✅ Production | Zstd, Gzip, LZ4 |
| Video Transcoding | video-transcoding | ✅ Production | Multi-codec video conversion |
| WASM Plugins | wasm-plugins | ✅ Production | Custom extensible transformations |
// Resize and convert to WebP
use rs3gw::storage::transformations::{ImageFormat, ImageTransformParams, TransformationType};
let transform = TransformationType::Image {
params: ImageTransformParams {
width: Some(800),
height: None, // Maintains aspect ratio
format: Some(ImageFormat::Webp),
quality: Some(85),
maintain_aspect_ratio: true,
crop_mode: None,
}
};
Features:
Requires: video-transcoding feature flag
# Build with video transcoding support
cargo build --features video-transcoding
// Transcode to H.264
use rs3gw::storage::transformations::{TransformationType, VideoCodec, VideoTransformParams};
let transform = TransformationType::Video {
params: VideoTransformParams {
codec: VideoCodec::H264,
bitrate: Some(2000), // 2000 kbps
fps: Some(30),
width: Some(1920),
height: Some(1080),
audio_codec: Some("aac".to_string()),
audio_bitrate: Some(128),
}
};
Supported Codecs: H.264, H.265/HEVC, VP8, VP9, AV1
Requires: wasm-plugins feature flag
# Build with WASM plugin support
cargo build --features wasm-plugins
Create custom transformations in WebAssembly:
// Register and use a custom plugin (call from an async context that returns Result)
use std::collections::HashMap;
let transformer = WasmPluginTransformer::new();
let wasm_binary = std::fs::read("plugins/my-plugin.wasm")?;
transformer.register_plugin("my-plugin".to_string(), wasm_binary).await?;
let transform = TransformationType::WasmPlugin {
plugin_name: "my-plugin".to_string(),
params: HashMap::new(),
};
Documentation:
# Build with all optional features enabled
cargo build --all-features --release
# Available features:
# - io_uring: Linux io_uring support (Linux only)
# - video-transcoding: FFmpeg-based video transcoding (requires FFmpeg)
# - wasm-plugins: WebAssembly plugin system (Pure Rust)
rs3gw delivers exceptional performance through Rust's zero-cost abstractions.
Run comprehensive benchmarks:
# Storage operations
cargo bench --bench storage_benchmarks
# S3 API operations
cargo bench --bench s3_api_benchmarks
# Load testing
cargo bench --bench load_testing_benchmarks
# Compression
cargo bench --bench compression_benchmarks
# Run the full test suite (unit + integration)
cargo nextest run --all-features
# Run integration tests only
cargo test --test '*'
# Run with code coverage
cargo tarpaulin --all-features --out Html
# Run specific test suite
cargo test --test grpc_tests
# Run benchmarks
cargo bench
rs3gw.toml.example - TOML configuration template
.env.example - Environment variable template
See the Production Deployment Guide for comprehensive deployment instructions.
# Deploy with Kustomize
kubectl apply -k k8s/overlays/production/
# Or with Helm
helm install rs3gw k8s/helm/rs3gw/ \
--set replicaCount=3 \
--set persistence.size=500Gi
Access the Grafana dashboard (included in docker-compose.dev.yml) at http://localhost:3000.
rs3gw is fully compliant with the SCIRS2 (Scientific Rust) ecosystem policies. This ensures high-quality, reproducible, and scientifically sound code.
rs3gw uses scirs2-core::random instead of the standard rand crate.
Verify policy compliance:
# Run all policy checks
./scripts/verify_policies.sh
# Individual checks
cargo build --all-features # No warnings
cargo clippy --all-targets # No clippy warnings
cargo nextest run # All tests pass (550/550)
For detailed policy information, see SCIRS2_POLICY.md.
We welcome contributions! Please see our development process:
cargo nextest run --all-features
cargo clippy --all-features
This project is dual-licensed under:
Choose the license that best fits your use case.
Built with ❤️ in Rust for performance-critical workloads