df-derive

Crates.iodf-derive
lib.rsdf-derive
version0.1.1
created_at2025-09-15 19:54:58.464067+00
updated_at2025-09-25 15:15:06.669908+00
descriptionProcedural derive macro for efficiently converting Rust structs into Polars DataFrames.
homepagehttps://github.com/gramistella/df-derive
repositoryhttps://github.com/gramistella/df-derive
max_upload_size
id1840513
size354,983
G. Ramistella (gramistella)

documentation

https://docs.rs/df-derive

README

df-derive

Crates.io Docs.rs CI Downloads License

Procedural derive macros for converting your Rust types into Polars DataFrames.

What this crate does

Deriving ToDataFrame on your structs and tuple structs generates fast, allocation-conscious code to:

  • Convert a single value to a polars::prelude::DataFrame
  • Convert a slice of values via a columnar path (efficient batch conversion)
  • Inspect the schema (column names and DataTypes) at compile time via a generated method

It supports nested structs (flattened with dot notation), Option<T>, Vec<T>, tuple structs, and key domain types like chrono::DateTime<Utc> and rust_decimal::Decimal.

Installation

Add the macro crate and Polars. You will also need a trait defining the to_dataframe behavior (you can use your own runtime crate/traits; see the override section below). For a minimal inline trait you can copy, see the Quick start example.

[dependencies]
df-derive = "0.1"
polars = { version = "0.50", features = ["timezones", "dtype-decimal"] }

# If you use these types in your models
chrono = { version = "0.4", features = ["serde"] }
rust_decimal = { version = "1.36", features = ["serde"] }

Quick start

Copy-paste runnable example without any external runtime traits. This is a complete working example that you can run with cargo run --example quickstart.

Cargo.toml:

[package]
name = "quickstart"
version = "0.1.1"
edition = "2024"

[dependencies]
df-derive = "0.1"
polars = { version = "0.50", features = ["timezones", "dtype-decimal"] }

src/main.rs:

use df_derive::ToDataFrame;

mod dataframe {
    use polars::prelude::{DataFrame, DataType, PolarsResult};

    pub trait ToDataFrame {
        fn to_dataframe(&self) -> PolarsResult<DataFrame>;
        fn empty_dataframe() -> PolarsResult<DataFrame>;
        fn schema() -> PolarsResult<Vec<(&'static str, DataType)>>;
    }

    pub trait Columnar: Sized {
        fn columnar_to_dataframe(items: &[Self]) -> PolarsResult<DataFrame>;
    }

    pub trait ToDataFrameVec {
        fn to_dataframe(&self) -> PolarsResult<DataFrame>;
    }

    impl<T> ToDataFrameVec for [T]
    where
        T: Columnar + ToDataFrame,
    {
        fn to_dataframe(&self) -> PolarsResult<DataFrame> {
            if self.is_empty() {
                return <T as ToDataFrame>::empty_dataframe();
            }
            <T as Columnar>::columnar_to_dataframe(self)
        }
    }
}

#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")] // Columnar path auto-infers to crate::dataframe::Columnar
struct Trade {
    symbol: String,
    price: f64,
    size: u64,
}

fn main() -> polars::prelude::PolarsResult<()> {
    let t = Trade { symbol: "AAPL".into(), price: 187.23, size: 100 };
    let df_single = <Trade as crate::dataframe::ToDataFrame>::to_dataframe(&t)?;
    println!("{}", df_single);

    let trades = vec![
        Trade { symbol: "AAPL".into(), price: 187.23, size: 100 },
        Trade { symbol: "MSFT".into(), price: 411.61, size: 200 },
    ];
    use crate::dataframe::ToDataFrameVec;
    let df_batch = trades.as_slice().to_dataframe()?;
    println!("{}", df_batch);

    Ok(())
}

Run it:

cargo run

Features

  • Nested structs (flattening): fields of nested structs appear as outer.inner columns
  • Vec of primitives and structs: becomes Polars List columns; Vec<Nested> becomes multiple outer.subfield list columns
  • Option<T>: null-aware materialization for both scalars and lists
  • Tuple structs: supported; columns are named field_0, field_1, ...
  • Empty structs: produce (1, 0) for instances and (0, 0) for empty frames
  • Schema discovery: T::schema() -> Vec<(&'static str, DataType)>
  • Columnar batch conversion: [T]::to_dataframe() via the Columnar implementation

Attribute helpers

Use #[df_derive(as_string)] to stringify values during conversion. This is particularly useful for enums:

#[derive(Clone, Debug, PartialEq)]
enum Status { Active, Inactive }

// Required: implement Display for the enum
impl std::fmt::Display for Status {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Status::Active => write!(f, "Active"),
            Status::Inactive => write!(f, "Inactive"),
        }
    }
}

#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct WithEnums {
    #[df_derive(as_string)]
    status: Status,
    #[df_derive(as_string)]
    opt_status: Option<Status>,
    #[df_derive(as_string)]
    statuses: Vec<Status>,
}

Columns will use DataType::String (or List<String> for Vec<_>), and values are produced via ToString. See the complete working example with cargo run --example as_string.

Supported types

  • Primitives: String, bool, integer types (i8/i16/i32/i64/isize, u8/u16/u32/u64/usize), f32, f64
  • Time: chrono::DateTime<Utc> → materialized as Datetime(Milliseconds, None)
  • Decimal: rust_decimal::DecimalDecimal(38, 10)
  • Wrappers: Option<T>, Vec<T> in any nesting order
  • Custom structs: any other struct deriving ToDataFrame (supports nesting and Vec<Nested>)
  • Tuple structs: unnamed fields are emitted as field_{index}

Column naming

  • Named struct fields: field_name
  • Nested structs: outer.inner (recursively)
  • Vec of custom structs: vec_field.subfield (list dtype)
  • Tuple structs: field_0, field_1, ...

Generated API

For every #[derive(ToDataFrame)] type T the macro generates implementations of two traits (paths configurable via #[df_derive(...)]):

  • ToDataFrame for T:
    • fn to_dataframe(&self) -> PolarsResult<DataFrame>
    • fn empty_dataframe() -> PolarsResult<DataFrame>
    • fn schema() -> PolarsResult<Vec<(&'static str, DataType)>>
  • Columnar for T:
    • fn columnar_to_dataframe(items: &[Self]) -> PolarsResult<DataFrame>

Examples

This crate includes several runnable examples in the examples/ directory. You can run any example with:

cargo run --example <example_name>

Or run all examples to see the full feature set:

cargo run --example quickstart && \
cargo run --example nested && \
cargo run --example vec_custom && \
cargo run --example tuple && \
cargo run --example datetime_decimal && \
cargo run --example as_string

Available Examples

  • quickstart - Basic usage with single and batch DataFrame conversion
  • nested - Nested structs with dot notation column naming
  • vec_custom - Vec of custom structs creating List columns
  • tuple - Tuple structs with field_0, field_1 naming
  • datetime_decimal - DateTime and Decimal type support
  • as_string - #[df_derive(as_string)] attribute for enum conversion

Example Code Snippets

Nested structs

#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct Address { street: String, city: String, zip: u32 }

#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct Person { name: String, age: u32, address: Address }

// Columns: name, age, address.street, address.city, address.zip

Note: the runnable examples define a small dataframe module with the traits used by the macro. Some helper trait items are not used in every snippet (for example empty_dataframe or Columnar). To avoid noise during cargo run --example …, the examples annotate that module with #[allow(dead_code)].

Vec of custom structs

#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct Quote { ts: i64, open: f64, high: f64, low: f64, close: f64, volume: u64 }

#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct MarketData { symbol: String, quotes: Vec<Quote> }

// Columns include: symbol, quotes.ts, quotes.open, quotes.high, ... (each a List)

Tuple structs

#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct SimpleTuple(i32, String, f64);
// Columns: field_0 (Int32), field_1 (String), field_2 (Float64)

DateTime<Utc> and Decimal

#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct TxRecord { amount: rust_decimal::Decimal, ts: chrono::DateTime<chrono::Utc> }
// Schema dtypes: amount = Decimal(38, 10), ts = Datetime(Milliseconds, None)

Why #[allow(dead_code)] in examples? The examples include a minimal dataframe module to provide the traits that the macro implements. Not every example calls every method (e.g., empty_dataframe, schema), and compile-time warnings would otherwise distract from the output. Adding #[allow(dead_code)] to that module keeps the examples clean while remaining fully correct.

As string attribute

#[derive(Clone, Debug, PartialEq)]
enum Status { Active, Inactive }

impl std::fmt::Display for Status {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Status::Active => write!(f, "Active"),
            Status::Inactive => write!(f, "Inactive"),
        }
    }
}

#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct WithEnums {
    #[df_derive(as_string)]
    status: Status,
    #[df_derive(as_string)]
    opt_status: Option<Status>,
    #[df_derive(as_string)]
    statuses: Vec<Status>,
}
// Columns use DataType::String or List<String>

Note: All examples require the trait definitions shown in the Quick start section. See the complete working examples in the examples/ directory.

Limitations and guidance

  • Unsupported container types: maps/sets like HashMap<_, _> are not supported.
  • Enums: derive on enums is not supported; use #[df_derive(as_string)] on enum fields.
  • Generics: generic structs are not supported by the derive (see tests/fail for examples).
  • All nested types must also derive: if you nest a struct, it must also derive ToDataFrame.

Performance notes

  • The derive implements an internal Columnar path used by the runtime to convert slices efficiently, avoiding per-row DataFrame builds.
  • Criterion benches in benches/ exercise wide, deep, and nested-Vec shapes (100k+ rows), demonstrating consistent performance across shapes.

Performance tracking

Performance is continuously monitored and tracked using Bencher:

df-derive - Bencher

Compatibility

  • Rust edition: 2024
  • Polars: 0.50 (tested)
  • Enable Polars features timezones and dtype-decimal if you use DateTime<Utc> or Decimal.

License

MIT. See LICENSE.

Crate path override (about paft)

This crate currently resolves default trait paths to a dataframe module under the paft ecosystem. Concretely, it attempts to implement:

  • paft::dataframe::ToDataFrame and paft::dataframe::Columnar (or paft-core::dataframe::...) if those crates are present.

You can override these paths for any runtime by annotating your type with #[df_derive(...)]:

#[derive(df_derive::ToDataFrame)]
#[df_derive(trait = "my_runtime::dataframe::ToDataFrame")] // Columnar will be inferred as my_runtime::dataframe::Columnar
struct MyType { /* fields */ }

If you need to override both explicitly:

#[derive(df_derive::ToDataFrame)]
#[df_derive(
    trait = "my_runtime::dataframe::ToDataFrame",
    columnar = "my_runtime::dataframe::Columnar",
)]
struct MyType { /* fields */ }
Commit count: 17

cargo fmt