| Crates.io | protocrap |
| lib.rs | protocrap |
| version | 0.3.2 |
| created_at | 2025-12-29 02:35:52.067953+00 |
| updated_at | 2026-01-06 06:31:07.665932+00 |
| description | A small, efficient, and flexible protobuf implementation |
| homepage | |
| repository | https://github.com/gerben-stavenga/protocrap |
| max_upload_size | |
| id | 2009860 |
| size | 5,774,786 |
A small, efficient, and flexible protobuf implementation for Rust.
Unlike other protobuf implementations that focus on feature completeness, Protocrap prioritizes four key design goals:
These goals drive the following architectural decisions:
Generics create significant code bloat and compilation slowdowns—Serde exemplifies this problem perfectly. We achieve the same flexibility through type erasure using dyn traits and type punning. By carefully placing this type erasure at interface boundaries, we eliminate performance overhead while often improving instruction cache utilization through code deduplication. Our Arena allocation and parsing APIs demonstrate this approach effectively.
The few generic functions in our API are small, inline-only helpers that encapsulate dispatch logic to non-generic library functions.
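As a minimal sketch of this pattern (the Write trait and the function names here are ours for illustration, not protocrap's API): all the logic lives in one non-generic function, and the only generic code is an inline shim that erases the concrete type at the boundary.
use std::io::{self, Write};
// The real logic is compiled exactly once, no matter how many writer types callers use.
fn write_message_erased(out: &mut dyn Write, payload: &[u8]) -> io::Result<()> {
    out.write_all(payload)
}
// The only generic function is a tiny inline-only shim that performs the type erasure.
#[inline]
fn write_message<W: Write>(out: &mut W, payload: &[u8]) -> io::Result<()> {
    write_message_erased(out, payload)
}
fn main() -> io::Result<()> {
    let mut buf: Vec<u8> = Vec::new();
    write_message(&mut buf, b"hello")?; // only the shim is monomorphized per writer type
    write_message(&mut io::sink(), b"hello")?;
    Ok(())
}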
Rather than generating parsing and serialization code, we handle these operations through non-generic library functions. Code generation from proto IDL files produces only Rust struct definitions with their accessors, plus extremely efficient static lookup tables that drive the parsing and serialization logic.
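Conceptually, the generated output is shaped like the sketch below; every name and the table layout are invented for illustration, so consult a real generated .pc.rs file for the actual shape.
// Invented illustration: a plain struct plus a static per-field table that a
// shared, non-generic parser/serializer loop walks. Not protocrap's real output.
struct MyMessage {
    id: u32,
    name: Option<String>,
}
struct FieldEntry {
    field_number: u32, // proto field number
    wire_type: u8,     // protobuf wire type
    offset: usize,     // byte offset of the field inside the struct
}
static MY_MESSAGE_FIELDS: &[FieldEntry] = &[
    FieldEntry { field_number: 1, wire_type: 0, offset: std::mem::offset_of!(MyMessage, id) },
    FieldEntry { field_number: 2, wire_type: 2, offset: std::mem::offset_of!(MyMessage, name) },
];
fn main() {
    for f in MY_MESSAGE_FIELDS {
        println!("field {} (wire type {}) at byte offset {}", f.field_number, f.wire_type, f.offset);
    }
}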
Arena allocation serves two critical purposes in this library:
Efficiency: While modern allocators are extremely fast, arena allocation reduces allocation to a simple pointer increment in the common case—and this gets inlined at the call site. More importantly, it eliminates the need to drop individual nodes in object trees. Traditional dropping requires a costly post-order traversal of often cold memory, whereas arenas allow us to deallocate entire object trees by dropping just a few large blocks.
Flexibility: Since actual allocator calls only happen when requesting new blocks, we can use &dyn Allocator without significant performance impact. This gives users complete control over memory placement while serving as a highly efficient type eraser.
Fallible Allocation: All arena operations return Result, allowing custom allocators to enforce memory limits and reject allocations. This gives you explicit control over memory budgets—your allocator decides how much memory to admit, and protocrap properly propagates those decisions as errors rather than panicking.
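The mechanism is easiest to see in a simplified sketch; this Bump type is ours (not protocrap's Arena) and ignores alignment and multi-block growth, but it shows the pointer-increment fast path and the fallible slow path.
// Simplified illustration of fallible bump allocation. A real arena would
// request a fresh block from its &dyn Allocator where this returns Err.
struct Bump {
    block: Vec<u8>,
    used: usize,
}
impl Bump {
    fn with_capacity(cap: usize) -> Self {
        Bump { block: vec![0; cap], used: 0 }
    }
    // Common case: a bounds check plus a pointer increment.
    fn alloc(&mut self, size: usize) -> Result<&mut [u8], ()> {
        let start = self.used;
        let end = start.checked_add(size).ok_or(())?;
        if end > self.block.len() {
            return Err(()); // memory budget exceeded: an error, not a panic
        }
        self.used = end;
        Ok(&mut self.block[start..end])
    }
}
fn main() {
    let mut arena = Bump::with_capacity(64);
    assert!(arena.alloc(48).is_ok());
    assert!(arena.alloc(48).is_err()); // rejected rather than aborting
}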
Most serialization frameworks limit themselves to flat buffers as sources and sinks, which is restrictive. The Rust standard library's Read/Write and BufRead traits provide better abstractions, but code written against them generically gets duplicated for each implementation, while dynamic dispatch on individual operations touching just a few bytes has unacceptable overhead.
The standard solution is some &dyn BufferStream that streams buffer chunks, so the cost of dynamic dispatch is amortized over many operations. This is a pull (callback) API where the protobuf library requests buffers as needed; it is the approach used by Google's protobuf implementation.
However, pull APIs lack the flexibility our design goals demand. When parsing, if the next buffer isn't ready from disk or network, the buffer stream must block, which breaks async task systems. We could define an async stream, but then the parser must .await buffers, making the entire parser async and unusable outside async contexts. Creating separate sync and async parsers contradicts our design principles (see code bloat above).
Our solution: a push API. Both parser and serializer become pure functions with the signature (internal_state, buffer) -> updated_state. Instead of the protobuf pulling buffers in its inner loops, users push chunks in an outer loop by calling the parse function. This approach supports both synchronous and asynchronous streams without requiring separate implementations. It also eliminates trait compatibility issues—no more situations where third-party crate streams lack the necessary traits and Rust's orphan rules prevent you from implementing them.
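To make the shape of the push API concrete, here is a sketch with invented names (see the quick start below for the real entry points); because the caller owns the chunk loop, the same state machine can be driven from blocking or async code.
// Invented illustration of the push model: the parser is a state machine the
// caller feeds, not a loop that pulls buffers.
struct ParseState {
    bytes_seen: usize, // stand-in for real parser state
}
impl ParseState {
    fn new() -> Self {
        ParseState { bytes_seen: 0 }
    }
    // (internal_state, buffer) -> updated_state: consume one chunk, return
    // the state to resume with the next chunk.
    fn push(mut self, chunk: &[u8]) -> ParseState {
        self.bytes_seen += chunk.len();
        self
    }
}
fn main() {
    let chunks: [&[u8]; 3] = [b"he", b"ll", b"o!"];
    let mut state = ParseState::new();
    for chunk in chunks {
        state = state.push(chunk); // an async driver would .await the next chunk here
    }
    assert_eq!(state.bytes_seen, 6);
}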
Every framework comes with limits (often unspecified) on how far you can push things; consider, for instance, the template-instantiation recursion limit of a C++ compiler. For a serialization framework, those limits concern how many fields a schema can have, how large field numbers may be, and so on. We are very principled here: we support only sane schemas, so no thousands of fields with arbitrary field numbers in the tens of millions.
Field numbers should be assigned consecutively from 1 up to the maximum field number. We tolerate holes, but we do not compress field numbers, so assigning a single-field struct a field number of 2000 will lead to a very big table. These restrictions simplify the code and allow compressing a field's has-bit index and offset into a single 16-bit integer, which makes for compact tables.
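As a worked example of that packing, once field numbers are small and dense, a has-bit index and a field offset both fit in one 16-bit word; the 6-bit/10-bit split below is an assumption for illustration, not protocrap's actual layout.
// Illustrative only: pack a has-bit index (6 bits) and a field offset (10 bits)
// into a single u16. Protocrap's real bit layout may differ.
fn pack(hasbit_index: u16, offset: u16) -> u16 {
    assert!(hasbit_index < (1 << 6));
    assert!(offset < (1 << 10));
    (hasbit_index << 10) | offset
}
fn unpack(packed: u16) -> (u16, u16) {
    (packed >> 10, packed & 0x3FF)
}
fn main() {
    let packed = pack(3, 40);
    assert_eq!(unpack(packed), (3, 40));
    // A single wild field number like 2000 would blow up a dense table,
    // which is why consecutive numbering is required.
}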
See SPECS.md for detailed specifications.
use protocrap::*;
// Create a message
let mut arena = arena::Arena::new(&std::alloc::Global);
let mut msg = MyMessage::default();
msg.set_name("Hello", &mut arena);
msg.set_id(42);
// Encode to bytes
let bytes = msg.encode_vec::<32>()?;
// Decode from bytes
let mut decoded = MyMessage::default();
decoded.decode_flat::<32>(&mut arena, &bytes);
// Async decoding
let mut async_msg = MyMessage::default();
async_msg.decode_from_async_bufread::<32>(&mut arena, &mut async_reader).await?;
See pc-example for a complete example project using protocrap.
Generate Rust code from protobuf definitions:
# Install the code generator
cargo install protocrap --features codegen
# Generate descriptor set from .proto file
protoc --include_imports --descriptor_set_out=descriptor.bin my_types.proto
# Generate Rust code
protocrap descriptor.bin my_types.pc.rs
Or build and run from source:
cargo run --features codegen -- descriptor.bin my_types.pc.rs
Include the generated code in your crate:
use protocrap;
include!("my_types.pc.rs");
Embed protobuf data as compile-time constants - no lazy init, no mutex, just a const:
# From binary .pb file
protocrap descriptor.bin --embed config.pb:my.package.Config -o config.pc.rs
# From JSON (easier to maintain in version control)
protocrap descriptor.bin --embed config.json:my.package.Config -o config.pc.rs
Use in your code:
const CONFIG: my::package::Config::ProtoType = include!("config.pc.rs");
The data goes straight into .rodata, with zero runtime overhead.
For working on protocrap itself (regenerating descriptor.pc.rs):
# Regenerate using bootstrap codegen (protocrap from crates.io)
bazel run //:regen_descriptor
Protocrap includes a powerful reflection API for dynamic message inspection:
use protocrap::reflection::{DescriptorPool, DynamicMessage};
// Create a descriptor pool and load file descriptors
let mut pool = DescriptorPool::new(&std::alloc::Global);
pool.add_file(&file_descriptor);
// Decode a message dynamically
let dynamic_msg = pool.decode_message("package.MessageType", &bytes)?;
// Inspect fields at runtime
for field in dynamic_msg.descriptor().field() {
    if let Some(value) = dynamic_msg.get_field(field) {
        println!("{}: {:?}", field.name(), value);
    }
}
Protocrap is validated through multiple testing approaches:
cargo-fuzz to find edge cases in the decoder
# Run conformance tests via Bazel
~/bin/bazelisk test //conformance:conformance_test
~/bin/bazelisk test //conformance:conformance_test_dynamic
# Run fuzz tests (requires nightly)
cargo +nightly fuzz run decode_fuzz
no_std support (no_std feature)
🚧 Beta - This library is not yet fully matured. Expect:
Use in production at your own risk. Bug reports and contributions welcome!