| Crates.io | tonbo |
| lib.rs | tonbo |
| version | 0.4.0-a0 |
| created_at | 2024-08-01 08:23:28.901491+00 |
| updated_at | 2025-12-10 07:40:08.932881+00 |
| description | Embedded database for serverless and edge runtimes, storing data as Parquet on S3 |
| homepage | |
| repository | https://github.com/tonbo-io/tonbo |
| max_upload_size | |
| id | 1321766 |
| size | 1,504,046 |
Website | Rust Doc | Blog | Community
Tonbo is an embedded database for serverless and edge runtimes. Your data is stored as Parquet on S3, coordination happens through a manifest, and compute stays fully stateless.
Serverless compute is stateless, but your data isn't. Tonbo bridges this gap:
```rust
use tonbo::{db::{AwsCreds, ObjectSpec, S3Spec}, prelude::*};

#[derive(Record)]
struct User {
    #[metadata(k = "tonbo.key", v = "true")]
    id: String,
    name: String,
    score: Option<i64>,
}

// Open on S3
let s3 = S3Spec::new("my-bucket", "data/users", AwsCreds::from_env()?);
let db = DbBuilder::from_schema(User::schema())?
    .object_store(ObjectSpec::s3(s3))?
    .open()
    .await?;

// Insert
let users = vec![User { id: "u1".into(), name: "Alice".into(), score: Some(100) }];
let mut builders = User::new_builders(users.len());
builders.append_rows(users);
db.ingest(builders.finish().into_record_batch()).await?;

// Query
let filter = Predicate::gt(ColumnRef::new("score"), ScalarValue::from(80_i64));
let results = db.scan().filter(filter).collect().await?;
```
For local development, use .on_disk("/tmp/users")? instead. See examples/ for more.
cargo add tonbo@0.4.0-a0 tokio
Or add to Cargo.toml:
```toml
[dependencies]
tonbo = "0.4.0-a0"
tokio = { version = "1", features = ["rt-multi-thread", "macros"] }
```
Run with cargo run --example <name>:
01_basic: Define schema, insert, and query in 30 lines
02_transaction: MVCC transactions with upsert, delete, and read-your-writes
02b_snapshot: Consistent point-in-time reads while writes continue
03_filter: Predicates: eq, gt, in, is_null, and, or, not
04_s3: Store Parquet files on S3/R2/MinIO with zero server config
05_scan_options: Projection pushdown reads only the columns you need
06_composite_key: Multi-column keys for time-series and partitioned data
07_streaming: Process millions of rows without loading into memory
08_nested_types: Deep struct nesting + Lists stored as Arrow StructArray
09_time_travel: Query historical snapshots via MVCC timestamps

Tonbo implements a merge-tree optimized for object storage: writes go to WAL → MemTable → Parquet SSTables, with MVCC for snapshot isolation and a manifest for coordination via compare-and-swap:
Stateless compute: A worker only needs to read and update the manifest; no long-lived coordinator is required.
Object storage CAS: The manifest is committed using compare-and-swap on S3, so any function can safely participate in commits.
Immutable data: Data files are write-once Parquet SSTables, which matches the strengths of S3 and other object stores.
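The WAL → MemTable → SSTable write path can be sketched with a toy merge-tree. This is an illustration only, not Tonbo's implementation: real SSTables are Parquet files on object storage, and `ToyTree`, its fields, and the flush threshold are all invented for the sketch.

```rust
use std::collections::BTreeMap;

// Toy merge-tree write path: log first, buffer in a sorted memtable,
// then spill to an immutable sorted run once the buffer fills up.
struct ToyTree {
    wal: Vec<(String, i64)>,           // append-only log for durability
    memtable: BTreeMap<String, i64>,   // sorted in-memory buffer
    sstables: Vec<Vec<(String, i64)>>, // flushed, write-once sorted runs
    flush_threshold: usize,
}

impl ToyTree {
    fn new(flush_threshold: usize) -> Self {
        Self {
            wal: Vec::new(),
            memtable: BTreeMap::new(),
            sstables: Vec::new(),
            flush_threshold,
        }
    }

    fn put(&mut self, key: &str, value: i64) {
        self.wal.push((key.to_string(), value));      // 1. log the write
        self.memtable.insert(key.to_string(), value); // 2. apply to memtable
        if self.memtable.len() >= self.flush_threshold {
            self.flush();                             // 3. spill to a sorted run
        }
    }

    fn flush(&mut self) {
        // BTreeMap iterates in key order, so the run is already sorted.
        let run: Vec<_> = std::mem::take(&mut self.memtable).into_iter().collect();
        self.sstables.push(run);
        self.wal.clear(); // flushed entries no longer need the log
    }

    // Reads check newest data first: memtable, then runs newest-to-oldest.
    fn get(&self, key: &str) -> Option<i64> {
        if let Some(v) = self.memtable.get(key) {
            return Some(*v);
        }
        for run in self.sstables.iter().rev() {
            if let Ok(i) = run.binary_search_by(|(k, _)| k.as_str().cmp(key)) {
                return Some(run[i].1);
            }
        }
        None
    }
}
```

Because runs are write-once and reads go newest-first, an update never modifies an existing file; it simply shadows the old value, which is what makes the pattern a good fit for object stores.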
See docs/overview.md for the full design.
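The compare-and-swap commit protocol can be sketched with a toy versioned manifest store. This mimics a conditional put on object storage; the `ManifestStore` type and its version counter are invented for the sketch and are not Tonbo's manifest format.

```rust
// Toy manifest with compare-and-swap commits: a write lands only if the
// writer saw the latest version, so concurrent stateless workers can
// commit safely without a coordinator.
struct ManifestStore {
    version: u64,
    body: String,
}

impl ManifestStore {
    // Readers take a snapshot of the version along with the contents.
    fn read(&self) -> (u64, String) {
        (self.version, self.body.clone())
    }

    // Commit succeeds only against the version the caller read; a loser
    // gets the current version back and must re-read, rebase, and retry.
    fn commit(&mut self, expected_version: u64, new_body: String) -> Result<u64, u64> {
        if self.version != expected_version {
            return Err(self.version); // lost the race to another writer
        }
        self.version += 1;
        self.body = new_body;
        Ok(self.version)
    }
}
```

The retry loop on failure is what replaces a long-lived coordinator: every worker rebases its change on the manifest it lost to, so commits serialize through the object store itself.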
Tonbo is currently in alpha. APIs may change, and we're actively iterating based on feedback. We recommend starting with development and non-critical workloads before moving to production.
Storage
Schema & Query
#[derive(Record)] or dynamic Schema
Backends
Runtime
Integrations
Apache License 2.0. See LICENSE for details.