| Crates.io | bazof |
| lib.rs | bazof |
| version | 0.1.0 |
| created_at | 2025-03-17 08:08:22.170428+00 |
| updated_at | 2025-03-17 08:08:22.170428+00 |
| description | Lakehouse format with event time travel |
| homepage | |
| repository | https://github.com/MaciekLesiczka/bazof |
| max_upload_size | |
| id | 1595163 |
| size | 81,369 |
Query tables in object storage as of event time.
Bazof is a lakehouse format with time-travel capabilities.
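To make "as of event time" concrete, here is a minimal in-memory sketch of the idea. It is not Bazof's implementation or API: `Row`, `scan_as_of`, and the integer timestamps are hypothetical names and simplifications chosen for illustration. It assumes each record carries the event time at which its value became valid, and that a query as of time T sees, for every key, the newest record stamped at or before T.

```rust
use std::collections::BTreeMap;

/// One versioned record (hypothetical shape): a key, its value, and the
/// event time at which that value became valid, as a plain integer.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    key: String,
    value: String,
    event_time: i64,
}

/// Return the latest value per key among rows whose event time is at or
/// before `as_of`. Rows stamped after `as_of` are ignored.
fn scan_as_of(rows: &[Row], as_of: i64) -> BTreeMap<String, String> {
    // Track (event_time, value) of the newest visible row for each key.
    let mut latest: BTreeMap<String, (i64, String)> = BTreeMap::new();
    for row in rows.iter().filter(|r| r.event_time <= as_of) {
        // Skip this row only if a strictly newer one was already recorded.
        let newer_seen = latest
            .get(&row.key)
            .map_or(false, |entry| entry.0 > row.event_time);
        if !newer_seen {
            latest.insert(row.key.clone(), (row.event_time, row.value.clone()));
        }
    }
    latest.into_iter().map(|(k, (_, v))| (k, v)).collect()
}

fn main() {
    let rows = vec![
        Row { key: "AAPL".into(), value: "100".into(), event_time: 10 },
        Row { key: "AAPL".into(), value: "120".into(), event_time: 20 },
        Row { key: "GOOG".into(), value: "80".into(), event_time: 15 },
    ];
    // As of t=12 only the first AAPL row is visible, so this prints "AAPL = 100".
    for (key, value) in scan_as_of(&rows, 12) {
        println!("{key} = {value}");
    }
}
```

Querying the same rows as of t=20 would return both keys, with AAPL at its newer value. A real lakehouse format would apply this rule to files in object storage and not to a slice in memory.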
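The Bazof project is organized as a Rust workspace with multiple crates, including the bazof library, the bazof-cli command-line tool, and the bazof-datafusion integration.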
To build all projects in the workspace:
cargo build --workspace
The bazof-cli provides a command-line interface for interacting with bazof:
# Scan a table (current version)
cargo run -p bazof-cli -- scan --path ./test-data --table table0
# Scan a table as of a specific event time
cargo run -p bazof-cli -- scan --path ./test-data --table table0 --as-of "2024-03-15T14:30:00"
The bazof-datafusion crate provides integration with Apache DataFusion, allowing you to register a Bazof table with a DataFusion session and query it with SQL as of a given event time:
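use std::sync::Arc;

use bazof_datafusion::BazofTableProvider;
use chrono::{TimeZone, Utc};
use datafusion::prelude::*;

async fn query_bazof() -> Result<(), Box<dyn std::error::Error>> {
    let ctx = SessionContext::new();

    // Event time to travel back to: 2019-01-17 00:00:00 UTC.
    let event_time = Utc.with_ymd_and_hms(2019, 1, 17, 0, 0, 0).unwrap();

    // `store_path` and `local_store` are not defined in this snippet; they are
    // assumed to be created beforehand (see query_example in the repository).
    let provider = BazofTableProvider::as_of(
        store_path.clone(),
        local_store.clone(),
        "ltm_revenue".to_string(),
        event_time,
    )?;

    // Register the as-of view under its own name so SQL can refer to it.
    ctx.register_table("ltm_revenue_jan17", Arc::new(provider))?;
    let df = ctx.sql("SELECT key as symbol, value as revenue FROM ltm_revenue_jan17 WHERE key IN ('AAPL', 'GOOG') ORDER BY key").await?;
    df.show().await?;
    Ok(())
}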
Run the example:
cargo run --example query_example -p bazof-datafusion
If you install the CLI with cargo install --path crates/bazof-cli, you can run it directly with:
bazof-cli scan --path ./test-data --table table0
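If you want to drive the installed CLI from a Rust program, the invocation above could be assembled as follows. This is a sketch under stated assumptions: `scan_args` is a hypothetical helper, not part of Bazof, and it uses only the `scan` subcommand and the `--path`, `--table`, and `--as-of` flags shown in this README. It also assumes `bazof-cli` is on your PATH.

```rust
use std::process::Command;

/// Build the argument list for `bazof-cli scan` (hypothetical helper).
/// Only flags shown in this README are used; `--as-of` is optional.
fn scan_args(path: &str, table: &str, as_of: Option<&str>) -> Vec<String> {
    let mut args = vec![
        "scan".to_string(),
        "--path".to_string(),
        path.to_string(),
        "--table".to_string(),
        table.to_string(),
    ];
    // Append the time-travel flag only when an event time was requested.
    if let Some(ts) = as_of {
        args.push("--as-of".to_string());
        args.push(ts.to_string());
    }
    args
}

fn main() {
    let args = scan_args("./test-data", "table0", Some("2024-03-15T14:30:00"));
    // Assumes `bazof-cli` is installed and on PATH; report an error
    // instead of panicking if the binary cannot be started.
    match Command::new("bazof-cli").args(&args).status() {
        Ok(status) => println!("bazof-cli exited with {status}"),
        Err(err) => eprintln!("could not start bazof-cli: {err}"),
    }
}
```

Passing `None` as the last argument omits `--as-of`, which corresponds to scanning the current version of the table.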
Bazof is under development. The goal is to implement a data lakehouse with the following capabilities: