| Crates.io | polars_protobuf |
| lib.rs | polars_protobuf |
| version | 0.2.1 |
| created_at | 2025-12-20 18:35:31.836136+00 |
| updated_at | 2025-12-20 23:27:35.774136+00 |
| description | Automatic polars_structpath implementations for Protocol Buffer messages |
| homepage | |
| repository | https://github.com/jmunar/polarspath |
| max_upload_size | |
| id | 1996834 |
| size | 109,526 |
A Rust library that automatically generates polars_structpath implementations for Protocol Buffer messages, enabling type-safe field access and Polars integration.
polars_protobuf provides seamless integration between Protocol Buffers and the polars_structpath ecosystem. It enables:
- Adding StructPath and EnumPath derives to protobuf messages and enums during build time
- Integration with Polars Series and AnyValue types

This crate is used by:

- Build scripts (build.rs) in projects that use protobuf messages with Polars

To create a new project using polars_protobuf, you can download and run the project generator script:
curl -sSL https://raw.githubusercontent.com/jmunar/polarspath/main/crates/polars_protobuf/create_polars_protobuf_project.sh | bash -s -- --project-name my_project -p -t
The script will:
- Set up a build script (build.rs)
- Create pyproject.toml for Python packaging
- Create a Makefile for building the project
- Optionally add the extras enabled by the -p flag, and tests (-t)

After the script finishes creating all the necessary files, the Python environment (including the Polars extension) can be created by going into the project folder and running make build.
An example of how to use the package in Python can be found in this notebook.
Add to your Cargo.toml:
[build-dependencies]
polars_protobuf = { version = "*", features = ["build"] }
Then in your build.rs:
fn main() -> Result<(), Box<dyn std::error::Error>> {
    polars_protobuf::build::build_protobuf(polars_protobuf::build::BuildConfig {
        proto_dir: "protobuf/sample".to_string(),
        include_paths: vec!["protobuf/sample".to_string()],
        generate_extensions: Some(polars_protobuf::build::ExtensionConfig {
            python_package_dir: "example_protobuf/example_protobuf".to_string(),
            python_package_name: "example_protobuf".to_string(),
        }),
    })?;
    Ok(())
}
Once your protobuf messages are generated with StructPath support, you can extract fields:
use polars_core::prelude::{BinaryType, ChunkedArray};
use polars_protobuf::get_value;
use prost::Message;
#[derive(polars_structpath::StructPath, Clone, Message)]
struct Person {
    #[prost(string, tag = "1")]
    name: String,
    #[prost(int64, tag = "2")]
    age: i64,
}
// Assuming you have a ChunkedArray<BinaryType> containing encoded protobuf messages
let binary_column: ChunkedArray<BinaryType> = /* ... */;
// Extract the "name" field from all messages
let name_series = get_value::<Person>(&binary_column, "name", true)?;
// Extract nested fields using path notation
// let parent_name = get_value::<Person>(&binary_column, "parent.name", true)?;
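To make the path notation concrete, the following is a minimal, std-only sketch of how a dotted path such as "parent.name" can be resolved one segment at a time against a nested value. It is not the polars_structpath implementation: the `Value` enum, the `resolve` function, the `person` helper, and the `parent` field are all hypothetical names introduced only for this illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a decoded message (not a polars_structpath type).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Int(i64),
    Message(BTreeMap<String, Value>),
}

/// Walk a dotted path such as "parent.name" one segment at a time.
/// Returns None if a segment is missing or a scalar is traversed.
fn resolve<'a>(root: &'a Value, path: &str) -> Option<&'a Value> {
    let mut current = root;
    for segment in path.split('.') {
        match current {
            Value::Message(fields) => current = fields.get(segment)?,
            _ => return None,
        }
    }
    Some(current)
}

/// Hypothetical helper that builds a Person-like message with an optional parent.
fn person(name: &str, age: i64, parent: Option<Value>) -> Value {
    let mut fields = BTreeMap::new();
    fields.insert("name".to_string(), Value::Str(name.to_string()));
    fields.insert("age".to_string(), Value::Int(age));
    if let Some(p) = parent {
        fields.insert("parent".to_string(), p);
    }
    Value::Message(fields)
}

fn main() {
    let child = person("Ada", 36, Some(person("Anne", 60, None)));
    println!("{:?}", resolve(&child, "name"));
    println!("{:?}", resolve(&child, "parent.name"));
    println!("{:?}", resolve(&child, "parent.parent.name"));
}
```

A missing segment yields `None` in this sketch, which is one possible way to represent an absent optional submessage; how the real crate reports missing or invalid paths is not covered here.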
You can also get the Polars data type for a field path:
use polars_core::prelude::Field;
use polars_protobuf::get_type;
let field = get_type::<Person>(&[], "name")?;
// field.dtype() will be DataType::String
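As a rough illustration of how a type lookup by path can be resolved statically from the Rust type, here is a std-only sketch in which a trait is implemented by hand for a `Person` struct, in the way a derive macro might generate it. The `PathType` trait, the simplified `Dtype` enum, and `get_type_sketch` are assumptions made for this example; they are not the polars_structpath or polars_protobuf API, and Polars' real `DataType` is much richer.

```rust
/// Hypothetical, simplified data-type tag.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Dtype {
    String,
    Int64,
}

/// Hypothetical trait standing in for what a derive macro could generate.
trait PathType {
    /// Resolve the data type found at `path` (already split into segments).
    fn type_at(path: &[&str]) -> Option<Dtype>;
}

// Leaf types: only the empty path is valid, since scalars have no fields.
impl PathType for String {
    fn type_at(path: &[&str]) -> Option<Dtype> {
        if path.is_empty() { Some(Dtype::String) } else { None }
    }
}

impl PathType for i64 {
    fn type_at(path: &[&str]) -> Option<Dtype> {
        if path.is_empty() { Some(Dtype::Int64) } else { None }
    }
}

#[allow(dead_code)]
struct Person {
    name: String,
    age: i64,
}

// Hand-written equivalent of a derived implementation: dispatch on the
// first segment and delegate the rest of the path to the field's type.
impl PathType for Person {
    fn type_at(path: &[&str]) -> Option<Dtype> {
        let (first, rest) = path.split_first()?;
        match *first {
            "name" => <String as PathType>::type_at(rest),
            "age" => <i64 as PathType>::type_at(rest),
            _ => None,
        }
    }
}

/// Split a dotted path and resolve it against the type `T`.
fn get_type_sketch<T: PathType>(path: &str) -> Option<Dtype> {
    let segments: Vec<&str> = path.split('.').collect();
    T::type_at(&segments)
}

fn main() {
    println!("{:?}", get_type_sketch::<Person>("name"));
    println!("{:?}", get_type_sketch::<Person>("age"));
    println!("{:?}", get_type_sketch::<Person>("name.first"));
}
```

Because the lookup is driven by the type parameter rather than by data, no message has to be decoded in order to learn the type of a column in this sketch.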
Cargo features:

- build: Enables the build-time code generation functionality (requires prost-build and prost-types)

Main functions and types:

- get_type<T>: Get the Polars Field type for a given path in a protobuf message type
- get_value<T>: Extract a field from a ChunkedArray<BinaryType> containing encoded protobuf messages
- build_protobuf: Main function to build protobuf files with polars_structpath support
- BuildConfig: Configuration for the build process
- ExtensionConfig: Configuration for generating Python extension modules

In order to compare the performance of using the polars_structpath backend versus using prost directly, we have built a benchmark in the examples folder. You can run it using:
cargo run --example benchmark --release
On an Apple M1 laptop, the results are:
Prost decode time
Time taken: 0.0513 s
Extracting f_string
Time taken (direct): 0.0524 s
Time taken (structpath single-threaded): 0.0800 s
Time taken (structpath multi-threaded): 0.0154 s
Extracting f_integer
Time taken (direct): 0.0481 s
Time taken (structpath single-threaded): 0.0750 s
Time taken (structpath multi-threaded): 0.0154 s
Extracting f_double
Time taken (direct): 0.0474 s
Time taken (structpath single-threaded): 0.0739 s
Time taken (structpath multi-threaded): 0.0146 s
Extracting f_boolean
Time taken (direct): 0.0491 s
Time taken (structpath single-threaded): 0.0741 s
Time taken (structpath multi-threaded): 0.0155 s
Extracting f_integer_optional
Time taken (direct): 0.0474 s
Time taken (any value): 0.0493 s
Time taken (structpath single-threaded): 0.0838 s
Time taken (structpath multi-threaded): 0.0164 s
Extracting f_string_optional
Time taken (direct): 0.0498 s
Time taken (any value): 0.0484 s
Time taken (structpath single-threaded): 0.0850 s
Time taken (structpath multi-threaded): 0.0168 s
Extracting f_integer_repeated
Time taken (direct): 0.0905 s
Time taken (any value): 0.0718 s
Time taken (structpath single-threaded): 0.1382 s
Time taken (structpath multi-threaded): 0.0466 s
Extracting f_string_repeated
Time taken (direct): 0.0857 s
Time taken (any value): 0.0671 s
Time taken (structpath single-threaded): 0.1562 s
Time taken (structpath multi-threaded): 0.0513 s
Extracting f_submessage
Time taken (any value): 0.0684 s
Time taken (structpath single-threaded): 0.1625 s
Time taken (structpath multi-threaded): 0.0431 s
Extracting f_submessage_repeated
Time taken (any value): 0.4206 s
Time taken (structpath single-threaded): 0.6474 s
Time taken (structpath multi-threaded): 0.3386 s
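To read these timings as relative speedups, the small sketch below divides one timing by another for a few of the rows above. The numbers are copied from the listed results; the `Row` struct and the `speedup` helper are hypothetical names used only for this calculation.

```rust
/// One benchmark entry; timings in seconds, copied from the results above.
struct Row {
    field: &'static str,
    direct: Option<f64>, // not reported for every field
    single: f64,
    multi: f64,
}

/// How many times faster `candidate` is than `baseline` (values below 1 mean slower).
fn speedup(baseline: f64, candidate: f64) -> f64 {
    baseline / candidate
}

fn main() {
    let rows = [
        Row { field: "f_string", direct: Some(0.0524), single: 0.0800, multi: 0.0154 },
        Row { field: "f_integer_repeated", direct: Some(0.0905), single: 0.1382, multi: 0.0466 },
        Row { field: "f_submessage_repeated", direct: None, single: 0.6474, multi: 0.3386 },
    ];

    for row in &rows {
        println!(
            "{}: multi-threaded is {:.2}x faster than single-threaded",
            row.field,
            speedup(row.single, row.multi)
        );
        if let Some(direct) = row.direct {
            println!(
                "  and {:.2}x faster than direct prost decoding",
                speedup(direct, row.multi)
            );
        }
    }
}
```

For f_string, for example, the multi-threaded structpath path comes out at roughly 3.4 times faster than direct decoding, while the single-threaded path is about 1.5 times slower than direct. For f_submessage_repeated the gain from multi-threading drops to roughly 1.9 times.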