| Crates.io | typed-arrow-dyn |
| lib.rs | typed-arrow-dyn |
| version | 0.0.1 |
| created_at | 2025-08-21 07:45:58.093288+00 |
| updated_at | 2025-09-18 15:24:26.919828+00 |
| description | Dynamic Arrow facade for typed-arrow (runtime schema/builders). |
| repository | https://github.com/tonbo-io/typed-arrow |
| id | 1804411 |
| size | 141,168 |
typed-arrow provides a strongly typed, fully compile-time way to declare Arrow schemas in Rust. It maps Rust types directly to arrow-rs typed builders/arrays and arrow_schema::DataType, with no runtime DataType switching. This enables monomorphized column construction at zero runtime cost and ergonomic ORM-like APIs.
- Avoids runtime DataType matching.
- Uses arrow-array/arrow-schema types directly; no bespoke runtime layer to learn.

use typed_arrow::{prelude::*, schema::SchemaMeta};
use typed_arrow::{Dictionary, TimestampTz, Millisecond, Utc, List};

#[derive(typed_arrow::Record)]
struct Address { city: String, zip: Option<i32> }

#[derive(typed_arrow::Record)]
struct Person {
    id: i64,
    address: Option<Address>,
    tags: Option<List<Option<i32>>>,     // List column with nullable items
    code: Option<Dictionary<i32, String>>, // Dictionary<i32, Utf8>
    joined: TimestampTz<Millisecond, Utc>, // Timestamp(ms) with timezone (UTC)
}

fn main() {
    // Build from owned rows
    let rows = vec![
        Person {
            id: 1,
            address: Some(Address { city: "NYC".into(), zip: None }),
            tags: Some(List::new(vec![Some(1), None, Some(3)])),
            code: Some(Dictionary::new("gold".into())),
            joined: TimestampTz::<Millisecond, Utc>::new(1_700_000_000_000),
        },
        Person {
            id: 2,
            address: None,
            tags: None,
            code: None,
            joined: TimestampTz::<Millisecond, Utc>::new(1_700_000_100_000),
        },
    ];

    let mut b = <Person as BuildRows>::new_builders(rows.len());
    b.append_rows(rows);
    let arrays = b.finish();

    // Compile-time schema + RecordBatch
    let batch = arrays.into_record_batch();
    assert_eq!(batch.schema().fields().len(), <Person as Record>::LEN);
    println!("rows={}, field0={}", batch.num_rows(), batch.schema().field(0).name());
}

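
To make the "no runtime DataType switching" idea concrete, here is a minimal std-only sketch of how a trait can bind a Rust type to a logical type and a nullability flag at compile time. `SketchType`, `SketchBinding`, `describe` and `address_fields` are hypothetical names invented for this illustration; they are not typed-arrow's actual traits, and the real crate binds to arrow-rs builders and arrays rather than to a toy enum.

```rust
// Std-only illustration; none of these names come from typed-arrow itself.

/// A tiny stand-in for an Arrow logical type.
#[derive(Debug, Clone, PartialEq)]
enum SketchType {
    Int32,
    Int64,
    Utf8,
}

/// Hypothetical compile-time binding from a Rust type to a logical type.
trait SketchBinding {
    /// Non-nullable unless an impl overrides it.
    const NULLABLE: bool = false;
    fn data_type() -> SketchType;
}

impl SketchBinding for i32 {
    fn data_type() -> SketchType { SketchType::Int32 }
}

impl SketchBinding for i64 {
    fn data_type() -> SketchType { SketchType::Int64 }
}

impl SketchBinding for String {
    fn data_type() -> SketchType { SketchType::Utf8 }
}

/// Option<T> keeps the inner logical type and only flips nullability.
impl<T: SketchBinding> SketchBinding for Option<T> {
    const NULLABLE: bool = true;
    fn data_type() -> SketchType { T::data_type() }
}

/// Describe one field. The function is monomorphized per T, so the type is
/// resolved by the compiler rather than by matching on a runtime value.
fn describe<T: SketchBinding>(name: &str) -> (String, SketchType, bool) {
    (name.to_string(), T::data_type(), T::NULLABLE)
}

/// Field descriptions for a struct shaped like `Address` in the example above.
fn address_fields() -> Vec<(String, SketchType, bool)> {
    vec![describe::<String>("city"), describe::<Option<i32>>("zip")]
}

fn main() {
    for (name, ty, nullable) in address_fields() {
        println!("{name}: {ty:?}, nullable={nullable}");
    }
    // An i64 column such as `Person::id` resolves the same way.
    println!("{:?}", describe::<i64>("id"));
}
```

Because each field's type is a generic parameter, a mismatch between a field and its expected column type surfaces as a compile error instead of a runtime branch.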
Add to your Cargo.toml (derives enabled by default):
[dependencies]
typed-arrow = { version = "0.x" }
When working in this repository/workspace:
[dependencies]
typed-arrow = { path = "." }
Run the included examples to see end-to-end usage:
- 01_primitives: derive Record, inspect DataType, build primitives
- 02_lists: List<T> and List<Option<T>>
- 03_dictionary: Dictionary<K, String>
- 04_timestamps: Timestamp<U> units
- 04b_timestamps_tz: TimestampTz<U, Z> with Utc and custom markers
- 05_structs: nested structs → StructArray
- 06_rows_flat: row-based building for flat records
- 07_rows_nested: row-based building with nested struct fields
- 08_record_batch: compile-time schema + RecordBatch
- 09_duration_interval: Duration and Interval types
- 10_union: Dense Union as a Record column (with attributes)
- 11_map: Map (incl. Option<V> values) + as a Record column
- 12_ext_hooks: extend #[derive(Record)] with visitor injection and macro callbacks

Run:
cargo run --example 08_record_batch
- Record: implemented by the derive macro for structs with named fields.
- ColAt<I>: per-column associated items Rust, ColumnBuilder, ColumnArray, NULLABLE, NAME, and data_type().
- ArrowBinding: compile-time mapping from a Rust value type to its Arrow builder, array, and DataType.
- BuildRows: the derive generates <Type>Builders and <Type>Arrays with append_row(s) and finish.
- SchemaMeta: the derive provides fields() and schema(); the generated arrays structs provide into_record_batch().
- AppendStruct and StructMeta: enable nested struct fields and StructArray building.

- Schema-level metadata: #[schema_metadata(k = "owner", v = "data")].
- Field-level metadata: #[metadata(k = "pii", v = "email")].

- Nested struct fields map to Struct columns by default. Make the parent field nullable with Option<Nested>; child nullability is independent.
- List<T> (items non-null) and List<Option<T>> (items nullable). Use Option<List<_>> for list-level nulls.
- LargeList<T> and LargeList<Option<T>> for 64-bit offsets; wrap with Option<_> for column nulls.
- FixedSizeList<T, N> (items non-null) and FixedSizeListNullable<T, N> (items nullable). Wrap with Option<_> for list-level nulls.
- Map<K, V, const SORTED: bool = false>, where keys are non-null; use Map<K, Option<V>> to allow nullable values. Column nullability via Option<Map<...>>. SORTED sets keys_sorted in the Arrow DataType.
- OrderedMap<K, V> uses BTreeMap<K, V> and declares keys_sorted = true.
- Dictionary<K, V> with integral keys K ∈ { i8, i16, i32, i64, u8, u16, u32, u64 } and values:
  - String/LargeUtf8 (Utf8/LargeUtf8)
  - Vec<u8>/LargeBinary (Binary/LargeBinary)
  - [u8; N] (FixedSizeBinary)
  - i*, u*, f32, f64

  Column nullability via Option<Dictionary<..>>.
- Timestamp<U> (unit-only) and TimestampTz<U, Z> (unit + timezone). Units: Second, Millisecond, Microsecond, Nanosecond. Use Utc or define your own Z: TimeZoneSpec.
- Decimal128<P, S> and Decimal256<P, S> (precision P, scale S as const generics).
- #[derive(Union)] for enums with #[union(mode = "dense"|"sparse")], per-variant #[union(tag = N)], #[union(field = "name")], and optional null carrier #[union(null)] or container-level null_variant = "Var".

Supported (arrow-rs v56):
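- FixedSizeBinary ([u8; N])
- Map (use Option<V> for nullable values), OrderedMap (BTreeMap<K,V>) with keys_sorted = true
- Union: dense and sparse (#[derive(Union)] on enums)
- Dictionary: integral keys; values Utf8 (String), LargeUtf8, Binary (Vec<u8>), LargeBinary, FixedSizeBinary ([u8; N]), primitives (i*, u*, f32, f64)

Missing: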
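
#[derive(Record)] can be extended with hooks:
- Visitor injection: #[record(visit(MyVisitor))]
- Macro callbacks per field and per record: #[record(field_macro = my_ext::per_field, record_macro = my_ext::per_record)]
- #[record(ext(key))]
- See docs/extensibility.md and the runnable example examples/12_ext_hooks.rs.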
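
As a rough picture of what "visitor injection" means, the std-only sketch below walks per-column metadata for a record shaped like `Person` and lets a visitor collect the nullable column names. Everything here (`ColumnInfo`, `FieldVisitor`, `PERSON_COLUMNS`, `walk_person`, `NullableNames`) is a hypothetical stand-in written for this illustration. It uses a plain runtime loop over a static table, whereas the crate's hook is driven by the derive macro, so consult docs/extensibility.md for the actual visitor trait and its signature.

```rust
// Std-only illustration; these types are assumptions, not typed-arrow's API.

/// Metadata a derive could expose for one column.
struct ColumnInfo {
    index: usize,
    name: &'static str,
    nullable: bool,
}

/// Hypothetical visitor interface: called once per column.
trait FieldVisitor {
    fn visit(&mut self, col: &ColumnInfo);
}

/// Hand-written table mirroring the `Person` struct from the example:
/// `id` and `joined` are required, the three Option<_> fields are nullable.
static PERSON_COLUMNS: [ColumnInfo; 5] = [
    ColumnInfo { index: 0, name: "id", nullable: false },
    ColumnInfo { index: 1, name: "address", nullable: true },
    ColumnInfo { index: 2, name: "tags", nullable: true },
    ColumnInfo { index: 3, name: "code", nullable: true },
    ColumnInfo { index: 4, name: "joined", nullable: false },
];

/// Drive any visitor over the columns in declaration order.
fn walk_person<V: FieldVisitor>(visitor: &mut V) {
    for col in PERSON_COLUMNS.iter() {
        visitor.visit(col);
    }
}

/// Example visitor: remembers the names of nullable columns.
struct NullableNames(Vec<&'static str>);

impl FieldVisitor for NullableNames {
    fn visit(&mut self, col: &ColumnInfo) {
        if col.nullable {
            self.0.push(col.name);
        }
    }
}

fn nullable_person_columns() -> Vec<&'static str> {
    let mut visitor = NullableNames(Vec::new());
    walk_person(&mut visitor);
    visitor.0
}

fn main() {
    for col in PERSON_COLUMNS.iter() {
        println!("#{} {} nullable={}", col.index, col.name, col.nullable);
    }
    println!("nullable columns: {:?}", nullable_person_columns());
}
```

The appeal of the pattern is that user code such as `NullableNames` lives outside the derive, so new behavior can be added without changing the macro itself.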