serde-encoded-bytes

Crates.io	serde-encoded-bytes
lib.rs	serde-encoded-bytes
version	0.2.1
created_at	2024-10-16 20:19:02.207838+00
updated_at	2025-05-26 18:52:47.691849+00
description	Efficient bytestring serialization for Serde-supporting types
repository	https://github.com/fjarri/serde-encoded-bytes
id	1412294
size	73,835

Bogdan Opanchuk (fjarri)

Efficient bytestring serialization for `serde`-supporting types

What it does

Byte arrays ([u8; N], Vec<u8>, Box<[u8]> and so on) are treated by serde as arrays of integers, which leads to inefficient representations in various formats. For example, this is how serialization works by default for binary and human-readable formats:

use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize)]
struct Array([u8; 16]);

let array = Array([
    0, 1, 0xf2, 3, 0xf4, 5, 0xf6, 7, 0xf8, 9, 0xfa, 11, 0xfc, 13, 14, 0xff
]);

// Serializing as MessagePack
assert_eq!(
    rmp_serde::encode::to_vec(&array).unwrap(),
    [
        0xdc, 0, 0x10,
        0, 1, 0xcc, 0xf2, 3, 0xcc, 0xf4, 5, 0xcc, 0xf6, 7,
        0xcc, 0xf8, 9, 0xcc, 0xfa, 11, 0xcc, 0xfc, 13, 14, 0xcc, 0xff,
    ]
);

// Serializing as JSON
assert_eq!(
    serde_json::to_string(&array).unwrap(),
    "[0,1,242,3,244,5,246,7,248,9,250,11,252,13,14,255]"
);

Note that in MessagePack every byte with a value above 0x7f (that is, with the MSB set) is prefixed by 0xcc, so the bytestring takes more space than it should. In JSON, the bytestring is serialized as an array of decimal integers, which is also inefficient.
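To make the overhead concrete, here is a minimal sketch that estimates the MessagePack size of a bytestring under both representations. It is a simplified model of the MessagePack format rules, not code from this crate or from rmp_serde, and the function names `msgpack_int_array_len` and `msgpack_bin_len` are hypothetical helpers introduced only for this illustration.

```rust
/// Estimated size of a MessagePack array whose elements are the given bytes,
/// each encoded as an unsigned integer (the default serde representation).
/// Hypothetical helper, not part of any crate.
fn msgpack_int_array_len(bytes: &[u8]) -> usize {
    // Array header: fixarray (1 byte) for up to 15 elements,
    // array16 (3 bytes) for up to 65535, array32 (5 bytes) beyond that.
    let header = match bytes.len() {
        0..=15 => 1,
        16..=65_535 => 3,
        _ => 5,
    };
    // Values up to 0x7f fit in a one-byte positive fixint; larger values
    // need the 0xcc marker followed by the value itself.
    let body: usize = bytes
        .iter()
        .map(|&b| if b <= 0x7f { 1usize } else { 2usize })
        .sum();
    header + body
}

/// Estimated size of the same bytes stored verbatim in a MessagePack
/// `bin` value. Hypothetical helper, not part of any crate.
fn msgpack_bin_len(bytes: &[u8]) -> usize {
    // bin8 (2-byte header) up to 255 bytes, bin16 (3 bytes) up to 65535,
    // bin32 (5 bytes) beyond that.
    let header = match bytes.len() {
        0..=255 => 2,
        256..=65_535 => 3,
        _ => 5,
    };
    header + bytes.len()
}

fn main() {
    let bytes: [u8; 16] = [
        0, 1, 0xf2, 3, 0xf4, 5, 0xf6, 7, 0xf8, 9, 0xfa, 11, 0xfc, 13, 14, 0xff,
    ];
    println!("as integer array: {} bytes", msgpack_int_array_len(&bytes));
    println!("as bin value:     {} bytes", msgpack_bin_len(&bytes));
}
```

For the 16-byte array used in the example above, the model gives 26 bytes for the integer-array form, which matches the length of the MessagePack output shown, against 18 bytes for a verbatim `bin` value.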

This crate provides helpers for the serde(with) field attribute that change this behavior and serialize bytestrings efficiently: verbatim in binary formats, and using the selected encoding in human-readable formats.

Usage

To use the crate, add a serde(with) annotation whose argument combines a container type (array-like, slice-like and so on) with the desired encoding:

use serde::{Deserialize, Serialize};
use serde_encoded_bytes::{ArrayLike, Hex};

#[derive(Debug, PartialEq, Eq, Serialize, Deserialize)]
struct Array(#[serde(with = "ArrayLike::<Hex>")] [u8; 16]);

let array = Array([
    0, 1, 0xf2, 3, 0xf4, 5, 0xf6, 7, 0xf8, 9, 0xfa, 11, 0xfc, 13, 14, 0xff,
]);

// Serializing as MessagePack
assert_eq!(
    rmp_serde::encode::to_vec(&array).unwrap(),
    [0xc4, 0x10, 0, 1, 0xf2, 3, 0xf4, 5, 0xf6, 7, 0xf8, 9, 0xfa, 11, 0xfc, 13, 14, 0xff]
);

// Serializing as JSON
assert_eq!(
    serde_json::to_string(&array).unwrap(),
    "\"0x0001f203f405f607f809fa0bfc0d0eff\""
);

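As you can see, the serialization of the example above is now more compact in both formats: 18 bytes instead of 26 in MessagePack, and a 36-character string instead of a 50-character array in JSON.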
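The JSON string in the example is the byte array written as lowercase hex with a 0x prefix. The following standard-library-only sketch reproduces that text format by hand. It only illustrates the encoding and makes no claim about how the crate implements it; `to_prefixed_hex` and `from_prefixed_hex` are hypothetical names, and requiring the 0x prefix when decoding is an assumption of this sketch.

```rust
/// Encode bytes as lowercase hex with a "0x" prefix.
/// Hypothetical helper written for illustration only.
fn to_prefixed_hex(bytes: &[u8]) -> String {
    const DIGITS: &[u8; 16] = b"0123456789abcdef";
    let mut out = String::with_capacity(2 + bytes.len() * 2);
    out.push_str("0x");
    for &b in bytes {
        // High nibble first, then low nibble.
        out.push(DIGITS[(b >> 4) as usize] as char);
        out.push(DIGITS[(b & 0x0f) as usize] as char);
    }
    out
}

/// Decode a "0x"-prefixed hex string back into bytes.
/// Returns `None` on a missing prefix, odd length, or non-hex character.
/// Hypothetical helper written for illustration only.
fn from_prefixed_hex(s: &str) -> Option<Vec<u8>> {
    let digits = s.strip_prefix("0x")?;
    if digits.len() % 2 != 0 {
        return None;
    }
    digits
        .as_bytes()
        .chunks(2)
        .map(|pair| {
            let hi = (pair[0] as char).to_digit(16)?;
            let lo = (pair[1] as char).to_digit(16)?;
            Some((hi * 16 + lo) as u8)
        })
        .collect()
}

fn main() {
    let bytes: [u8; 16] = [
        0, 1, 0xf2, 3, 0xf4, 5, 0xf6, 7, 0xf8, 9, 0xfa, 11, 0xfc, 13, 14, 0xff,
    ];
    let encoded = to_prefixed_hex(&bytes);
    println!("{encoded}");
    // Decoding the encoded text gives back the original bytes.
    assert_eq!(from_prefixed_hex(&encoded).as_deref(), Some(&bytes[..]));
}
```

Each byte becomes exactly two characters, so the encoded text grows linearly with the input and, unlike the decimal array form, its length does not depend on the byte values.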

Note that, due to serde limitations (see https://github.com/serde-rs/serde/issues/2120), fixed-size arrays are still serialized with their length included in binary formats.

Features

hex: hex encoding support (enabled by default);
base64: base64 encoding support.

Tested formats

While this crate is supposed to work for any format that supports serde, it is specifically tested on:

Prior art

More established crates with intersecting functionality include:

serde-bytes-repr - similar capabilities, but a different approach to the API, which can be inconvenient in a number of cases;
serde_bytes - provides efficient serialization in binary formats, but not in human-readable formats;
serdect - focused specifically on constant-time serialization;
hex-buffer-serde - fixed encoding;
base64-serde, serde-hex, stremio-serde-hex - fixed encoding, and no support for binary formats.

Commit count: 21