Crates.io | binary-layout |
lib.rs | binary-layout |
version | 4.0.2 |
source | src |
created_at | 2021-05-01 07:17:16.636951 |
updated_at | 2024-04-14 00:27:51.531478 |
description | The binary-layout library allows type-safe, inplace, zero-copy access to structured binary data. You define a custom data layout and give it a slice of binary data, and it will allow you to read and write the fields defined in the layout from the binary data without having to copy any of the data. It's similar to transmuting to/from a `#[repr(packed)]` struct, but much safer. |
homepage | https://github.com/smessmer/binary-layout |
repository | https://github.com/smessmer/binary-layout |
max_upload_size | |
id | 391813 |
size | 252,301 |
The binary-layout library allows type-safe, inplace, zero-copy access to structured binary data.
You define a custom data layout and give it a slice of binary data, and it will allow you to read and
write the fields defined in the layout from the binary data without having to copy any of the data.
It's similar to transmuting to/from a #[repr(packed)] struct, but much safer.
Note that the data does not go through serialization/deserialization or a parsing step. All accessors access the underlying packet data directly.
This crate is #[no_std] compatible.
use binary_layout::prelude::*;
// See https://en.wikipedia.org/wiki/Internet_Control_Message_Protocol for ICMP packet layout
binary_layout!(icmp_packet, BigEndian, {
    packet_type: u8,
    code: u8,
    checksum: u16,
    rest_of_header: [u8; 4],
    data_section: [u8], // open ended byte array, matches until the end of the packet
});
fn func(packet_data: &mut [u8]) {
    let mut view = icmp_packet::View::new(packet_data);
    // read some data
    let code: u8 = view.code().read();
    // equivalent: let code: u8 = packet_data[1];
    // write some data
    view.checksum_mut().write(10);
    // equivalent: packet_data[2..4].copy_from_slice(&10u16.to_be_bytes());
    // access an open ended byte array
    let data_section: &[u8] = view.data_section();
    // equivalent: let data_section: &[u8] = &packet_data[8..];
    // and modify it
    view.data_section_mut()[..5].copy_from_slice(&[1, 2, 3, 4, 5]);
    // equivalent: packet_data[8..13].copy_from_slice(&[1, 2, 3, 4, 5]);
}
See the icmp_packet module for what this binary_layout! macro generates for you.
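The "equivalent" comments in the example boil down to slice operations at fixed offsets. The following is a minimal std-only sketch of those manual accesses, written out as functions. The constant and function names are hypothetical, and this is not the code that the binary_layout! macro generates; it only illustrates the byte ranges that the example's comments refer to.

```rust
// Byte offsets implied by the icmp_packet layout in the example
// (hypothetical names, chosen for this sketch only).
const CODE_OFFSET: usize = 1;
const CHECKSUM_OFFSET: usize = 2;
const DATA_SECTION_OFFSET: usize = 8;

/// Reads the `code` field: a single byte at offset 1.
fn read_code(packet_data: &[u8]) -> u8 {
    packet_data[CODE_OFFSET]
}

/// Writes the `checksum` field as a big-endian u16 into bytes 2..4.
fn write_checksum(packet_data: &mut [u8], value: u16) {
    packet_data[CHECKSUM_OFFSET..CHECKSUM_OFFSET + 2].copy_from_slice(&value.to_be_bytes());
}

/// Borrows the open-ended data section: everything from byte 8 onward.
fn data_section(packet_data: &[u8]) -> &[u8] {
    &packet_data[DATA_SECTION_OFFSET..]
}
```

Hand-written accessors like these repeat the offsets at every call site, which is the duplication a central layout definition is meant to avoid.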
Use this library for anything that needs inplace, zero-copy access to structured binary data.
Why not #[repr(packed)]?
Annotating structs with #[repr(packed)] gives some of the features of this crate, namely it lays out the data fields exactly in the order they're specified without padding. But it has serious shortcomings that this library solves.
- #[repr(packed)] uses the system byte order, which differs depending on whether you're running on a little endian or big endian system. #[repr(packed)] is not cross-platform compatible. This library is.
- #[repr(packed)] can cause undefined behavior on some CPUs when taking references to unaligned data. This library avoids that by not offering any API that takes references to unaligned data. Primitive integer types are allowed to be unaligned, but they're copied and you can't get references to them. The only data type you can get a reference to is byte arrays, and they only require an alignment of 1, which is trivially always fulfilled.
To the best of my knowledge, there is no other library offering inplace, zero-copy and type-safe access to structured binary data. But if you don't need direct access to your data and are ok with a serialization/deserialization step, then there are a number of amazing libraries out there.
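To make the byte-order and alignment points about #[repr(packed)] concrete, here is a small std-only sketch. It decodes a u16 from an odd offset by copying the bytes out, once with an explicit big-endian order and once with the byte order of the machine running the code. The function names are hypothetical and this is not how the library is implemented; it only shows why a fixed byte order gives the same result everywhere while the native order does not.

```rust
/// Decodes a big-endian u16 starting at `offset`, which may be odd (unaligned).
/// The two bytes are copied out, so no reference to unaligned data is created.
fn read_u16_be(storage: &[u8], offset: usize) -> u16 {
    u16::from_be_bytes([storage[offset], storage[offset + 1]])
}

/// Decodes the same two bytes using the byte order of the machine running
/// the code. The result therefore depends on the target platform.
fn read_u16_native(storage: &[u8], offset: usize) -> u16 {
    u16::from_ne_bytes([storage[offset], storage[offset + 1]])
}
```

For the bytes `[0xAA, 0x12, 0x34]` and offset 1, `read_u16_be` yields 0x1234 on every platform, whereas `read_u16_native` yields 0x3412 on a little endian machine and 0x1234 on a big endian one.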
Layouts are defined using the binary_layout! macro. Based on such a layout, this library offers two alternative APIs for data access:
1. The Field API, which reads and writes fields based on an underlying slice of storage (packet_data in the example above) holding the packet data. This API does not wrap the underlying slice of storage data, which means you have to pass it in to each accessor. This is not the API used in the example above, see Field for an API example.
2. The FieldView API, which wraps the storage data in a View object, allowing access to the fields without having to pass in the packet data slice each time. This is the API used in the example above. See FieldView for another example.
For primitive integer fields such as u8 and u16, the Field API offers FieldReadExt::read, FieldWriteExt::write, FieldCopyAccess::try_read and FieldCopyAccess::try_write, and the FieldView API offers FieldView::read and FieldView::write.
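The difference between the two API styles can be sketched without the crate. The first style uses free functions that take the storage on every call; the second wraps the storage once and remembers it. All names below (`code_field`, `PacketView`, `set_code`) are hypothetical and only illustrate the two shapes; they are not the items that binary_layout! generates.

```rust
// Style 1: free functions. The caller passes the storage to every accessor.
mod code_field {
    pub const OFFSET: usize = 1;

    pub fn read(storage: &[u8]) -> u8 {
        storage[OFFSET]
    }

    pub fn write(storage: &mut [u8], value: u8) {
        storage[OFFSET] = value;
    }
}

// Style 2: a wrapper that remembers the storage, so accessors need no
// storage argument.
struct PacketView<'a> {
    storage: &'a mut [u8],
}

impl<'a> PacketView<'a> {
    fn new(storage: &'a mut [u8]) -> Self {
        Self { storage }
    }

    fn code(&self) -> u8 {
        code_field::read(&*self.storage)
    }

    fn set_code(&mut self, value: u8) {
        code_field::write(&mut *self.storage, value)
    }
}
```

The wrapper is more convenient when many fields of the same buffer are accessed, while free functions avoid holding a borrow of the buffer between calls.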
For non-zero integer types, reading a zero value returns an error. Because of this, FieldReadExt::read and FieldView::read are not available for those types and you need to use FieldCopyAccess::try_read and FieldView::try_read.
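Why such a read has to be fallible can be shown with the standard library alone. The sketch below assumes the affected fields are non-zero integers such as std::num::NonZeroU8; `ZeroValueError` and `try_read_nonzero` are hypothetical names for this illustration and are not part of the crate's API.

```rust
use std::num::NonZeroU8;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
struct ZeroValueError;

/// Reads one byte and converts it to a non-zero integer.
/// A stored zero cannot be represented, so the read must be able to fail.
fn try_read_nonzero(storage: &[u8], offset: usize) -> Result<NonZeroU8, ZeroValueError> {
    NonZeroU8::new(storage[offset]).ok_or(ZeroValueError)
}
```

Since the storage may contain any byte value, an infallible `read` would have nothing sensible to return for a zero byte.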
bool and char are supported using the bool as u8 and char as u32 data type notation.
Note that only 0u8 and 1u8 are valid boolean values, and not all u32 values are valid unicode code points.
Reading invalid values returns an error. Because of this, FieldReadExt::read and FieldView::read are not available for those types and you need to use FieldCopyAccess::try_read and FieldView::try_read.
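The validity rules can be illustrated with plain std conversions. This sketch assumes that 0 and 1 are the only accepted boolean encodings; `InvalidValue`, `bool_from_u8` and `char_from_u32` are hypothetical names and not the crate's own conversion code.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
struct InvalidValue;

/// Accepts only 0 and 1 as boolean encodings (an assumption of this sketch).
fn bool_from_u8(raw: u8) -> Result<bool, InvalidValue> {
    match raw {
        0 => Ok(false),
        1 => Ok(true),
        _ => Err(InvalidValue),
    }
}

/// Accepts only u32 values that are valid Unicode scalar values.
/// Surrogates (0xD800..=0xDFFF) and values above 0x10FFFF are rejected.
fn char_from_u32(raw: u32) -> Result<char, InvalidValue> {
    std::char::from_u32(raw).ok_or(InvalidValue)
}
```

Both conversions can fail for some inputs, which is why only the fallible read variants make sense for these types.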
Zero-sized types (ZSTs) neither read nor write to the underlying storage, but the appropriate traits are implemented for them to support derive macros, which may require all members of a struct or enum to implement the various traits.
- (), also known as the unit type.
Fixed-size byte arrays: [u8; N]. For these fields, the Field API offers FieldSliceAccess::data and FieldSliceAccess::data_mut, and the FieldView API returns a slice.
Open-ended byte arrays: [u8]. This field type can only occur as the last field of a layout and will match the remaining data until the end of the storage. This field has a dynamic size, depending on how large the packet data is. For these fields, the Field API offers FieldSliceAccess::data and FieldSliceAccess::data_mut, and the FieldView API returns a slice.
You can define your own custom types as long as they implement the LayoutAs trait to define how to convert them from/to a primitive type.
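The idea of converting a custom type from and to a primitive can be sketched with the standard TryFrom and From traits. This is only an analogy: it does not use the crate's LayoutAs trait, whose exact signature is not shown in this document. The `PacketKind` enum is a hypothetical example type; the numeric values 0 and 8 are the ICMP echo reply and echo request type codes.

```rust
use std::convert::TryFrom;

/// Hypothetical custom type stored as a single u8.
#[derive(Debug, PartialEq, Clone, Copy)]
enum PacketKind {
    EchoReply,
    EchoRequest,
}

/// Primitive -> custom type. This direction can fail for unknown values.
impl TryFrom<u8> for PacketKind {
    type Error = u8;

    fn try_from(raw: u8) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(PacketKind::EchoReply),
            8 => Ok(PacketKind::EchoRequest),
            other => Err(other),
        }
    }
}

/// Custom type -> primitive. This direction always succeeds here.
impl From<PacketKind> for u8 {
    fn from(kind: PacketKind) -> u8 {
        match kind {
            PacketKind::EchoReply => 0,
            PacketKind::EchoRequest => 8,
        }
    }
}
```

A custom field type needs both directions: one to interpret the stored primitive when reading and one to produce the primitive when writing.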
These data types aren't supported yet, but they could be added in theory and might be added in future versions.
This crate relies on a static layout, so it cannot support data types with dynamic length. In theory, types with dynamic length could be supported if they either
Both of these, however, would be some effort to implement and it is unclear if that will ever happen (unless somebody opens a PR for it).
For strings, note that even UTF-8 strings with a fixed number of characters take a variable number of bytes because of the UTF-8 encoding, and that brings with it all the issues of data types with dynamic length. This is why strings aren't supported yet.
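The variable byte length of UTF-8 is easy to check with the standard library. The helper below is a hypothetical name for this sketch; it returns the number of characters and the number of encoded bytes of a string.

```rust
/// Returns (number of characters, number of UTF-8 bytes) for a string.
fn char_and_byte_len(s: &str) -> (usize, usize) {
    (s.chars().count(), s.len())
}
```

Three ASCII characters such as "abc" occupy 3 bytes, three characters in the range U+0080 to U+07FF occupy 6 bytes, and three CJK characters occupy 9 bytes, so a fixed character count does not imply a fixed field size.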
Fixed-size arrays other than [u8; N]
Say we wanted to have a [u32; N] field. The API couldn't just return a zero-copy &[u32; N] to the caller because that would use the system byte order (i.e. endianness) which might be different from the byte order defined in the packet layout.
To make this cross-platform compatible, we'd have to wrap these slices into our own slice type that enforces the correct byte order and return that from the API.
This complexity is why it wasn't implemented yet, but feel free to open a PR if you need this.
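To give an idea of what such a wrapper slice type might look like, here is a minimal std-only sketch of a read-only view over bytes holding big-endian u32 elements. `BigEndianU32Slice` and its methods are hypothetical; nothing like this exists in the crate, and a real implementation would also need mutable access, iteration and other byte orders.

```rust
/// Hypothetical read-only view over bytes that hold big-endian u32 elements.
struct BigEndianU32Slice<'a> {
    bytes: &'a [u8],
}

impl<'a> BigEndianU32Slice<'a> {
    /// Wraps `bytes`. Returns None unless the length is a multiple of 4.
    fn new(bytes: &'a [u8]) -> Option<Self> {
        if bytes.len() % 4 == 0 {
            Some(Self { bytes })
        } else {
            None
        }
    }

    /// Number of u32 elements in the view.
    fn len(&self) -> usize {
        self.bytes.len() / 4
    }

    /// Decodes element `index` by copying its four bytes, so the result
    /// does not depend on the byte order or alignment of the host.
    fn get(&self, index: usize) -> Option<u32> {
        let start = index.checked_mul(4)?;
        let end = start.checked_add(4)?;
        let chunk = self.bytes.get(start..end)?;
        Some(u32::from_be_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]))
    }
}
```

Note that `get` returns a copied u32 rather than a reference, which is the price of enforcing a byte order that may differ from the host's.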
Layouts can be nested within each other by using the NestedView type created by the binary_layout! macro for one layout as a field type in another layout.
Example:
use binary_layout::prelude::*;
binary_layout!(icmp_header, BigEndian, {
    packet_type: u8,
    code: u8,
    checksum: u16,
    rest_of_header: [u8; 4],
});
binary_layout!(icmp_packet, BigEndian, {
    header: icmp_header::NestedView,
    data_section: [u8], // open ended byte array, matches until the end of the packet
});
Nested layouts do not need to have the same endianness. The following, which is copied from the complete example at tests/nested.rs in this repository, shows how you can mix different endian layouts together:
use binary_layout::prelude::*;
use core::convert::TryInto;
binary_layout!(deep_nesting, LittleEndian, {
    field1: u16,
});
binary_layout!(header, BigEndian, {
    field1: i16,
});
binary_layout!(middle, NativeEndian, {
    deep: deep_nesting::NestedView,
    field1: u16,
});
binary_layout!(footer, BigEndian, {
    field1: u32,
    deep: deep_nesting::NestedView,
    tail: [u8],
});
binary_layout!(whole, LittleEndian, {
    head: header::NestedView,
    field1: u64,
    mid: middle::NestedView,
    field2: u128,
    foot: footer::NestedView,
});
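What mixing byte orders means for the raw bytes can be sketched by decoding a few fields of the `whole` layout by hand. This sketch assumes that fields, including nested layouts, are placed one after another without padding, so that `head.field1` occupies bytes 0..2, `field1` bytes 2..10 and `mid.deep.field1` bytes 10..12. Those offsets and the function names are assumptions of this illustration, not output of the macro.

```rust
use std::convert::TryInto;

/// whole.head.field1: i16, big endian (from the `header` layout), bytes 0..2.
fn head_field1(storage: &[u8]) -> i16 {
    i16::from_be_bytes(storage[0..2].try_into().unwrap())
}

/// whole.field1: u64, little endian (from the `whole` layout), bytes 2..10.
fn whole_field1(storage: &[u8]) -> u64 {
    u64::from_le_bytes(storage[2..10].try_into().unwrap())
}

/// whole.mid.deep.field1: u16, little endian (from the `deep_nesting` layout),
/// bytes 10..12. The byte order of the enclosing `middle` layout does not
/// apply here, because each layout decodes its own fields.
fn mid_deep_field1(storage: &[u8]) -> u16 {
    u16::from_le_bytes(storage[10..12].try_into().unwrap())
}
```

Each field is decoded with the byte order of the layout that declares it, regardless of where that layout is nested.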
License: MIT OR Apache-2.0