Crates.io | serde_avro_fast |
lib.rs | serde_avro_fast |
version | 2.0.0 |
source | src |
created_at | 2023-03-03 15:04:16.605725 |
updated_at | 2024-10-21 18:51:13.278805 |
description | An idiomatic implementation of serde/avro (de)serialization |
repository | https://github.com/Ten0/serde_avro_fast |
id | 799809 |
size | 353,617 |
An idiomatic implementation of serde/avro (de)serialization
let schema: serde_avro_fast::Schema = r#"
{
    "namespace": "test",
    "type": "record",
    "name": "Test",
    "fields": [
        {
            "type": {
                "type": "string"
            },
            "name": "field"
        }
    ]
}
"#
.parse()
.expect("Failed to parse schema");

// The target struct borrows its string field from the input bytes.
#[derive(serde_derive::Deserialize, Debug, PartialEq)]
struct Test<'a> {
    field: &'a str,
}

// Avro binary encoding of the record: the string length 3, zigzag-encoded
// as the byte 6, followed by the UTF-8 bytes of "foo".
let avro_datum = &[6, 102, 111, 111];
assert_eq!(
    serde_avro_fast::from_datum_slice::<Test>(avro_datum, &schema)
        .expect("Failed to deserialize"),
    Test { field: "foo" }
);
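To make the four bytes of the datum above concrete, here is a minimal, std-only sketch of how an Avro string is laid out and why the result can borrow from the input. It is a hand-written illustration of the Avro binary encoding, not serde_avro_fast's actual implementation; read_long and read_str are hypothetical helper names.

```rust
/// Decodes one Avro `long` (zigzag + variable-length encoding) from the
/// front of `input`, returning the value and the remaining bytes.
/// Hypothetical helper, for illustration only.
fn read_long(input: &[u8]) -> Option<(i64, &[u8])> {
    let mut raw: u64 = 0;
    // A 64-bit value needs at most 10 bytes of 7 payload bits each.
    for (i, &byte) in input.iter().enumerate().take(10) {
        // Each byte carries 7 payload bits, least significant group first.
        raw |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            // Undo zigzag: 0 -> 0, 1 -> -1, 2 -> 1, 3 -> -2, ...
            let value = ((raw >> 1) as i64) ^ -((raw & 1) as i64);
            return Some((value, &input[i + 1..]));
        }
    }
    None // truncated or overlong varint
}

/// Reads an Avro `string`: a `long` byte length followed by UTF-8 bytes.
/// The returned `&str` borrows from `input`; no bytes are copied.
fn read_str(input: &[u8]) -> Option<(&str, &[u8])> {
    let (len, rest) = read_long(input)?;
    if len < 0 || len as u64 > rest.len() as u64 {
        return None; // negative or out-of-bounds length
    }
    let (bytes, rest) = rest.split_at(len as usize);
    let text = std::str::from_utf8(bytes).ok()?;
    Some((text, rest))
}
```

With the datum above, read_str(&[6, 102, 111, 111]) returns the string foo and an empty remainder, and the returned &str points into the input slice rather than into a fresh allocation.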
At the time of writing, the other existing libraries for Avro (de)serialization perform many unnecessary allocations, HashMap lookups, etc. for every record they encounter.
This version is a more idiomatic implementation, both with regard to Rust and to serde.
It is consequently more than 10x faster (cf. benchmarks):
apache_avro/small time: [386.57 ns 387.04 ns 387.52 ns]
serde_avro_fast/small time: [19.367 ns 19.388 ns 19.413 ns] <- x20 improvement
apache_avro/big time: [1.8618 µs 1.8652 µs 1.8701 µs]
serde_avro_fast/big time: [165.87 ns 166.92 ns 168.09 ns] <- x11 improvement
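The x20 and x11 annotations can be checked by dividing the middle value of each reported interval. The sketch below only redoes that arithmetic on the numbers quoted above; the constant and function names are hypothetical, and no benchmark is re-run.

```rust
/// Middle value of each reported interval, in nanoseconds:
/// (benchmark name, apache_avro, serde_avro_fast).
const TIMINGS_NS: [(&str, f64, f64); 2] = [
    ("small", 387.04, 19.388),
    ("big", 1865.2, 166.92), // 1.8652 µs = 1865.2 ns
];

/// How many times faster the improved timing is than the baseline.
fn speedup(baseline_ns: f64, improved_ns: f64) -> f64 {
    baseline_ns / improved_ns
}

/// Speedup of serde_avro_fast over apache_avro for each benchmark.
fn speedups() -> Vec<(&'static str, f64)> {
    TIMINGS_NS
        .iter()
        .map(|&(name, baseline, improved)| (name, speedup(baseline, improved)))
        .collect()
}
```

Rounded to the nearest integer, the two ratios come out at 20 and 11, matching the annotations.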
It supports any schema/deserialization target type combination that came to mind, including advanced union usage with (or without) enums, as well as proper Option support. If you have an exotic use-case in mind, it typically should be supported (if it isn't, feel free to open an issue).
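As an illustration of what Option support has to handle on the wire, Avro encodes a union as the zero-based index of the chosen branch (a zigzag varint) followed by that branch's value. The std-only sketch below decodes a datum written with the union schema ["null", "string"] into an Option<&str>. It is a hand-rolled decoder under those assumptions, not the crate's API or implementation; read_long and read_optional_str are hypothetical names.

```rust
/// Decodes one Avro `long` (zigzag + variable-length encoding).
/// Hypothetical helper, for illustration only.
fn read_long(input: &[u8]) -> Option<(i64, &[u8])> {
    let mut raw: u64 = 0;
    for (i, &byte) in input.iter().enumerate().take(10) {
        // 7 payload bits per byte, least significant group first.
        raw |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            // Undo zigzag: 0 -> 0, 1 -> -1, 2 -> 1, ...
            let value = ((raw >> 1) as i64) ^ -((raw & 1) as i64);
            return Some((value, &input[i + 1..]));
        }
    }
    None
}

/// Decodes a datum of schema `["null", "string"]`.
/// Branch 0 ("null") has no payload; branch 1 is a length-prefixed string.
fn read_optional_str(input: &[u8]) -> Result<Option<&str>, &'static str> {
    let (branch, rest) = read_long(input).ok_or("truncated union index")?;
    match branch {
        0 => Ok(None),
        1 => {
            let (len, rest) = read_long(rest).ok_or("truncated string length")?;
            if len < 0 || len as u64 > rest.len() as u64 {
                return Err("string length out of bounds");
            }
            // The returned &str borrows from `input`; nothing is copied.
            std::str::from_utf8(&rest[..len as usize])
                .map(Some)
                .map_err(|_| "invalid UTF-8")
        }
        _ => Err("union index out of range"),
    }
}
```

Under these assumptions, the single byte [0] decodes to None, and [2, 6, 102, 111, 111] decodes to Some("foo"): the byte 2 is the zigzag encoding of branch index 1, and the rest is the same string encoding as in the first example.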
Aside from the already mentioned performance improvements, there are a couple of major design differences:
Value is removed. Deserialization is a one-step process, which is fully serde-integrated and leverages its zero-copy features. The output struct can now borrow from the source slice.
A Value representation appears to be unnecessary in Rust, as the two use-cases for Value would seem to be: [...] serde_transcode.
The Value representation hurts performance compared to deserializing right away to the correct struct (especially when said representation involves as many allocations as that of apache-avro does).
[...] serde_transcode, as such a serializer would combine the types provided from the original struct (or in this case, deserializer) with the schema variant to remap the values in a correct way, while preserving zero-alloc.