Crates.io | df-derive |
lib.rs | df-derive |
version | 0.1.1 |
created_at | 2025-09-15 19:54:58.464067+00 |
updated_at | 2025-09-25 15:15:06.669908+00 |
description | Procedural derive macro for efficiently converting Rust structs into Polars DataFrames. |
homepage | https://github.com/gramistella/df-derive |
repository | https://github.com/gramistella/df-derive |
max_upload_size | |
id | 1840513 |
size | 354,983 |
Procedural derive macros for converting your Rust types into Polars DataFrames.
Deriving ToDataFrame on your structs and tuple structs generates fast, allocation-conscious code to:
- convert a value into a polars::prelude::DataFrame
- expose the schema (column names and DataTypes) at compile time via a generated method
It supports nested structs (flattened with dot notation), Option<T>, Vec<T>, tuple structs, and key domain types like chrono::DateTime<Utc> and rust_decimal::Decimal.
Add the macro crate and Polars. You will also need a trait defining the to_dataframe behavior (you can use your own runtime crate/traits; see the override section below). For a minimal inline trait you can copy, see the Quick start example.
[dependencies]
df-derive = "0.1"
polars = { version = "0.50", features = ["timezones", "dtype-decimal"] }
# If you use these types in your models
chrono = { version = "0.4", features = ["serde"] }
rust_decimal = { version = "1.36", features = ["serde"] }
Copy-paste runnable example without any external runtime traits. This is a complete working example that you can run with cargo run --example quickstart.
Cargo.toml:
[package]
name = "quickstart"
version = "0.1.1"
edition = "2024"
[dependencies]
df-derive = "0.1"
polars = { version = "0.50", features = ["timezones", "dtype-decimal"] }
src/main.rs:
use df_derive::ToDataFrame;

mod dataframe {
    use polars::prelude::{DataFrame, DataType, PolarsResult};

    pub trait ToDataFrame {
        fn to_dataframe(&self) -> PolarsResult<DataFrame>;
        fn empty_dataframe() -> PolarsResult<DataFrame>;
        fn schema() -> PolarsResult<Vec<(&'static str, DataType)>>;
    }

    pub trait Columnar: Sized {
        fn columnar_to_dataframe(items: &[Self]) -> PolarsResult<DataFrame>;
    }

    pub trait ToDataFrameVec {
        fn to_dataframe(&self) -> PolarsResult<DataFrame>;
    }

    impl<T> ToDataFrameVec for [T]
    where
        T: Columnar + ToDataFrame,
    {
        fn to_dataframe(&self) -> PolarsResult<DataFrame> {
            if self.is_empty() {
                return <T as ToDataFrame>::empty_dataframe();
            }
            <T as Columnar>::columnar_to_dataframe(self)
        }
    }
}

#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")] // Columnar path auto-infers to crate::dataframe::Columnar
struct Trade {
    symbol: String,
    price: f64,
    size: u64,
}

fn main() -> polars::prelude::PolarsResult<()> {
    let t = Trade { symbol: "AAPL".into(), price: 187.23, size: 100 };
    let df_single = <Trade as crate::dataframe::ToDataFrame>::to_dataframe(&t)?;
    println!("{}", df_single);

    let trades = vec![
        Trade { symbol: "AAPL".into(), price: 187.23, size: 100 },
        Trade { symbol: "MSFT".into(), price: 411.61, size: 200 },
    ];
    use crate::dataframe::ToDataFrameVec;
    let df_batch = trades.as_slice().to_dataframe()?;
    println!("{}", df_batch);
    Ok(())
}
Run it:
cargo run
- Nested structs are flattened recursively into outer.inner columns
- Vec fields become List columns; Vec<Nested> becomes multiple outer.subfield list columns
- Option<T>: null-aware materialization for both scalars and lists
- Tuple structs: columns are named field_0, field_1, ...
- Types with no fields: shape (1, 0) for instances and (0, 0) for empty frames
- Schema discovery: T::schema() -> PolarsResult<Vec<(&'static str, DataType)>>
- Batch conversion: [T]::to_dataframe() via the Columnar implementation
Use #[df_derive(as_string)] to stringify values during conversion. This is particularly useful for enums:
#[derive(Clone, Debug, PartialEq)]
enum Status { Active, Inactive }

// Required: implement Display for the enum
impl std::fmt::Display for Status {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Status::Active => write!(f, "Active"),
            Status::Inactive => write!(f, "Inactive"),
        }
    }
}

#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct WithEnums {
    #[df_derive(as_string)]
    status: Status,
    #[df_derive(as_string)]
    opt_status: Option<Status>,
    #[df_derive(as_string)]
    statuses: Vec<Status>,
}
Columns will use DataType::String (or List<String> for Vec<_>), and values are produced via ToString. See the complete working example with cargo run --example as_string.
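The value mapping described above can be sketched with the standard library alone. The helper names below (scalar_cell, optional_cell, list_cell) are hypothetical and only illustrate how a scalar, an Option<T>, and a Vec<T> could each be stringified through ToString; this is not the code the macro generates.

```rust
use std::fmt;

#[derive(Clone, Debug, PartialEq)]
enum Status {
    Active,
    Inactive,
}

impl fmt::Display for Status {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Status::Active => write!(f, "Active"),
            Status::Inactive => write!(f, "Inactive"),
        }
    }
}

/// Hypothetical helper: a scalar field becomes one string cell.
fn scalar_cell<T: ToString>(value: &T) -> String {
    value.to_string()
}

/// Hypothetical helper: `None` stays a null cell, `Some` is stringified.
fn optional_cell<T: ToString>(value: &Option<T>) -> Option<String> {
    value.as_ref().map(|v| v.to_string())
}

/// Hypothetical helper: a `Vec<T>` field becomes one list-of-strings cell.
fn list_cell<T: ToString>(values: &[T]) -> Vec<String> {
    values.iter().map(|v| v.to_string()).collect()
}
```

Because the helpers only require ToString, any type that implements Display qualifies, which is why the enum above needs a Display implementation.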
Supported types:
- Primitives: String, bool, integer types (i8/i16/i32/i64/isize, u8/u16/u32/u64/usize), f32, f64
- chrono::DateTime<Utc> → materialized as Datetime(Milliseconds, None)
- rust_decimal::Decimal → Decimal(38, 10)
- Wrappers: Option<T>, Vec<T> in any nesting order
- Custom structs that derive ToDataFrame (supports nesting and Vec<Nested>)
- Tuple structs: fields are named field_{index}
Column naming:
- Named struct fields: field_name
- Nested structs: outer.inner (recursively)
- Vec of custom structs: vec_field.subfield (list dtype)
- Tuple structs: field_0, field_1, ...
For every #[derive(ToDataFrame)] type T the macro generates implementations of two traits (paths configurable via #[df_derive(...)]):
- ToDataFrame for T:
  fn to_dataframe(&self) -> PolarsResult<DataFrame>
  fn empty_dataframe() -> PolarsResult<DataFrame>
  fn schema() -> PolarsResult<Vec<(&'static str, DataType)>>
- Columnar for T:
  fn columnar_to_dataframe(items: &[Self]) -> PolarsResult<DataFrame>
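To make the idea behind a columnar batch conversion concrete, here is a minimal sketch that uses plain vectors in place of a DataFrame. TradeColumns and trade_columns are hypothetical names invented for this illustration; the sketch assumes the goal is one pass over the slice with each column buffer allocated once, and it does not reproduce the macro's output.

```rust
/// Example row type (mirrors the `Trade` struct from the Quick start).
struct Trade {
    symbol: String,
    price: f64,
    size: u64,
}

/// Hypothetical column-oriented container standing in for a DataFrame.
#[derive(Debug, PartialEq)]
struct TradeColumns {
    symbol: Vec<String>,
    price: Vec<f64>,
    size: Vec<u64>,
}

/// Build every column in a single pass over the slice, allocating each
/// buffer once up front instead of building one frame per row.
fn trade_columns(items: &[Trade]) -> TradeColumns {
    let mut cols = TradeColumns {
        symbol: Vec::with_capacity(items.len()),
        price: Vec::with_capacity(items.len()),
        size: Vec::with_capacity(items.len()),
    };
    for item in items {
        cols.symbol.push(item.symbol.clone());
        cols.price.push(item.price);
        cols.size.push(item.size);
    }
    cols
}
```

The alternative, converting each row to its own single-row frame and concatenating, would repeat the per-frame setup cost for every item.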
This crate includes several runnable examples in the examples/ directory. You can run any example with:
cargo run --example <example_name>
Or run all examples to see the full feature set:
cargo run --example quickstart && \
cargo run --example nested && \
cargo run --example vec_custom && \
cargo run --example tuple && \
cargo run --example datetime_decimal && \
cargo run --example as_string
- quickstart - Basic usage with single and batch DataFrame conversion
- nested - Nested structs with dot notation column naming
- vec_custom - Vec of custom structs creating List columns
- tuple - Tuple structs with field_0, field_1 naming
- datetime_decimal - DateTime and Decimal type support
- as_string - #[df_derive(as_string)] attribute for enum conversion
#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct Address { street: String, city: String, zip: u32 }
#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct Person { name: String, age: u32, address: Address }
// Columns: name, age, address.street, address.city, address.zip
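The dot-notation naming shown in the comment above can be reproduced with a small recursive walk. The Field enum and column_names function are hypothetical, written only to illustrate the naming rule; the macro itself works on the struct definition at compile time.

```rust
/// Hypothetical description of a struct's fields, for illustration only.
enum Field {
    /// A leaf column such as `name: String`.
    Leaf(&'static str),
    /// A nested struct field and the fields it contains.
    Nested(&'static str, Vec<Field>),
}

/// Collect column names depth-first, joining nesting levels with `.`.
fn column_names(prefix: &str, fields: &[Field]) -> Vec<String> {
    let mut names = Vec::new();
    for field in fields {
        match field {
            Field::Leaf(name) => names.push(format!("{prefix}{name}")),
            Field::Nested(name, inner) => {
                let nested_prefix = format!("{prefix}{name}.");
                names.extend(column_names(&nested_prefix, inner));
            }
        }
    }
    names
}

/// The `Person` / `Address` layout from the example above.
fn person_fields() -> Vec<Field> {
    vec![
        Field::Leaf("name"),
        Field::Leaf("age"),
        Field::Nested(
            "address",
            vec![Field::Leaf("street"), Field::Leaf("city"), Field::Leaf("zip")],
        ),
    ]
}
```

Calling column_names("", &person_fields()) yields name, age, address.street, address.city, address.zip, in that order.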
Note: the runnable examples define a small dataframe module with the traits used by the macro. Some helper trait items are not used in every snippet (for example empty_dataframe or Columnar). To avoid noise during cargo run --example …, the examples annotate that module with #[allow(dead_code)].
#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct Quote { ts: i64, open: f64, high: f64, low: f64, close: f64, volume: u64 }
#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct MarketData { symbol: String, quotes: Vec<Quote> }
// Columns include: symbol, quotes.ts, quotes.open, quotes.high, ... (each a List)
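As a rough picture of what "each a List" means here, the sketch below transposes a slice of rows into columns where every quotes.* column holds one list per row. MarketColumns and market_columns are hypothetical names, Quote is trimmed to two fields for brevity, and plain nested vectors stand in for Polars list columns.

```rust
struct Quote {
    ts: i64,
    close: f64,
}

struct MarketData {
    symbol: String,
    quotes: Vec<Quote>,
}

/// Hypothetical stand-in for the resulting frame: one entry per row in each
/// column, where the `quotes.*` columns hold one list per row.
#[derive(Debug, PartialEq)]
struct MarketColumns {
    symbol: Vec<String>,
    quotes_ts: Vec<Vec<i64>>,    // column "quotes.ts"
    quotes_close: Vec<Vec<f64>>, // column "quotes.close"
}

/// Split each row's `Vec<Quote>` into one list per subfield.
fn market_columns(rows: &[MarketData]) -> MarketColumns {
    MarketColumns {
        symbol: rows.iter().map(|r| r.symbol.clone()).collect(),
        quotes_ts: rows
            .iter()
            .map(|r| r.quotes.iter().map(|q| q.ts).collect())
            .collect(),
        quotes_close: rows
            .iter()
            .map(|r| r.quotes.iter().map(|q| q.close).collect())
            .collect(),
    }
}
```

A row with no quotes contributes an empty list to each quotes.* column, so all columns keep the same number of rows.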
#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct SimpleTuple(i32, String, f64);
// Columns: field_0 (Int32), field_1 (String), field_2 (Float64)
DateTime<Utc> and Decimal
#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct TxRecord { amount: rust_decimal::Decimal, ts: chrono::DateTime<chrono::Utc> }
// Schema dtypes: amount = Decimal(38, 10), ts = Datetime(Milliseconds, None)
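In Decimal(38, 10), 38 is the precision and 10 is the scale (digits after the decimal point). The sketch below shows the arithmetic of re-expressing a value at a fixed scale, assuming the value is available as an integer mantissa plus a scale (rust_decimal exposes both). The rescale function is hypothetical and is not taken from the macro.

```rust
/// Re-express `mantissa * 10^-scale` at `target_scale`, returning `None`
/// on overflow or when the target scale would drop digits.
fn rescale(mantissa: i128, scale: u32, target_scale: u32) -> Option<i128> {
    // How many extra decimal digits the target scale adds.
    let shift = target_scale.checked_sub(scale)?;
    let factor = 10_i128.checked_pow(shift)?;
    mantissa.checked_mul(factor)
}
```

For example, 187.23 has mantissa 18723 at scale 2; at scale 10 the mantissa becomes 18723 × 10^8 = 1_872_300_000_000.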
Why #[allow(dead_code)] in examples? The examples include a minimal dataframe module to provide the traits that the macro implements. Not every example calls every method (e.g., empty_dataframe, schema), and compile-time warnings would otherwise distract from the output. Adding #[allow(dead_code)] to that module keeps the examples clean while remaining fully correct.
#[derive(Clone, Debug, PartialEq)]
enum Status { Active, Inactive }

impl std::fmt::Display for Status {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Status::Active => write!(f, "Active"),
            Status::Inactive => write!(f, "Inactive"),
        }
    }
}

#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct WithEnums {
    #[df_derive(as_string)]
    status: Status,
    #[df_derive(as_string)]
    opt_status: Option<Status>,
    #[df_derive(as_string)]
    statuses: Vec<Status>,
}
// Columns use DataType::String or List<String>
Note: All examples require the trait definitions shown in the Quick start section. See the complete working examples in the examples/ directory.
- Map types such as HashMap<_, _> are not supported.
- For enums, use #[df_derive(as_string)] on enum fields.
- Nested custom types must themselves derive ToDataFrame.
- The derive generates a Columnar path used by the runtime to convert slices efficiently, avoiding per-row DataFrame builds.
- The benchmarks in benches/ exercise wide, deep, and nested-Vec shapes (100k+ rows), demonstrating consistent performance across shapes.
Performance is continuously monitored and tracked using Bencher.
Enable the Polars features timezones and dtype-decimal if you use DateTime<Utc> or Decimal.
License: MIT. See LICENSE.
This crate currently resolves default trait paths to a dataframe module under the paft ecosystem. Concretely, it attempts to implement paft::dataframe::ToDataFrame and paft::dataframe::Columnar (or paft-core::dataframe::...) if those crates are present.
You can override these paths for any runtime by annotating your type with #[df_derive(...)]:
#[derive(df_derive::ToDataFrame)]
#[df_derive(trait = "my_runtime::dataframe::ToDataFrame")] // Columnar will be inferred as my_runtime::dataframe::Columnar
struct MyType { /* fields */ }
If you need to override both explicitly:
#[derive(df_derive::ToDataFrame)]
#[df_derive(
trait = "my_runtime::dataframe::ToDataFrame",
columnar = "my_runtime::dataframe::Columnar",
)]
struct MyType { /* fields */ }
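The inference mentioned in the comments above (a trait path ending in ToDataFrame implies a sibling Columnar in the same module) can be written out as a small string transformation. The function infer_columnar_path is hypothetical, and the exact rule the macro applies is an assumption based on the two examples shown in this document.

```rust
/// Hypothetical helper mirroring the inference rule described above:
/// swap the final path segment of the trait path for `Columnar`.
fn infer_columnar_path(trait_path: &str) -> String {
    match trait_path.rsplit_once("::") {
        // Keep the module part and replace only the trait name.
        Some((module, _trait_name)) => format!("{module}::Columnar"),
        // A bare trait name has no module part to keep.
        None => "Columnar".to_string(),
    }
}
```

If your Columnar trait lives in a different module than ToDataFrame, an inference like this cannot find it, which is the case the explicit columnar = "..." override covers.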