sbe_gen

Crates.io: sbe_gen
lib.rs: sbe_gen
version: 0.3.2
created_at: 2026-01-07 09:25:44.955952+00
updated_at: 2026-01-09 18:02:10.596823+00
description: Simple Binary Encoding (SBE) code generator for Rust using zerocopy
id: 2027870
size: 179,403
(dkumsh)

SBE Generator for Rust

This crate provides a small, pragmatic compiler for the Simple Binary Encoding (SBE) protocol. It reads SBE XML schemas and produces zero‑copy Rust message types that are ready for use in performance‑sensitive applications. Generated structures rely on the zerocopy crate to provide safe, alignment‑aware views over raw byte buffers.

Unlike the reference SBE toolchain, this project focuses solely on Rust and keeps the API minimal and easy to use. It does not support code generation for other languages.

Features

  • Zero‑copy decoding: Generated message structs derive FromBytes, IntoBytes, KnownLayout, Immutable and Unaligned so they can be safely cast from network buffers without copying.
  • Spec‑aware encoding builders: Each message gets a FooBuilder companion that writes the fixed block, groups and variable data with the correct offsets, byte order and length prefixes. Builders can also emit the standard SBE message header.
  • Byte‑order aware fields: Multi‑byte integer and floating‑point fields use the zerocopy::byteorder types (e.g. little_endian::U32, little_endian::F64) so that endianness is explicit and efficient.
  • Field offsets and padding: Explicit offset attributes are honoured, with padding inserted to keep layout in sync with the SBE block length.
  • Acting-version aware decoding: Helpers accept the standard SBE message header, apply the advertised block length/version at runtime, and expose presence checks so older payloads still parse safely.
  • Declarative parsing helpers: Each generated message implements a parse_prefix helper that leverages zerocopy::Ref to split a slice into a typed prefix and a remainder.
  • Groups and variable data included: Nested repeating groups are emitted with iterable views and entry structs, and data fields become VarData slices with an ergonomic as_str() helper.
  • Optional fields: presence="optional" fields stay zero‑copy but gain <field>_opt() accessors that return Option based on the SBE null value for that primitive.
  • Schema reflection: Generated code surfaces SINCE_VERSION, SEMANTIC_TYPE, field offsets and constraint constants so you can reason about compatibility at the call site.
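
The optional-field behaviour can be pictured with a small std-only sketch. It assumes a little-endian uint32 field and the SBE default null value for uint32 (the largest representable value, 4294967295). `OptionalU32` and `value_opt` are hypothetical names used for illustration, not the generated API.

```rust
/// Hypothetical stand-in for an optional little-endian `uint32` field.
/// The bytes are stored exactly as they appear on the wire.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct OptionalU32(pub [u8; 4]);

impl OptionalU32 {
    /// SBE's default null value for `uint32` is the largest representable value.
    pub const NULL: u32 = u32::MAX;

    /// Encodes `None` as the null sentinel and `Some(v)` as `v`.
    pub fn new(value: Option<u32>) -> Self {
        OptionalU32(value.unwrap_or(Self::NULL).to_le_bytes())
    }

    /// Raw wire value, including the null sentinel.
    pub fn raw(&self) -> u32 {
        u32::from_le_bytes(self.0)
    }

    /// Returns `None` when the field holds the null sentinel.
    pub fn value_opt(&self) -> Option<u32> {
        let raw = self.raw();
        if raw == Self::NULL {
            None
        } else {
            Some(raw)
        }
    }
}
```

Because the sentinel lives in the same four bytes as a real value, the field stays fixed-size and can still be viewed in place; only the accessor decides whether a value is present.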

Usage

Add the sbe_gen crate to your Cargo.toml and build a small driver program which reads your XML schema and writes the generated code to disk:

use std::{fs, path::Path};
use sbe_gen::{generate_to, GeneratorOptions};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let schema_xml = fs::read_to_string("./my-schema.xml")?;
    let out_dir = Path::new("./src/sbe");
    generate_to(&schema_xml, out_dir, &GeneratorOptions::default())?;
    Ok(())
}

Alternatively, run the CLI binary from a checkout of this repository, or install it with cargo install --path . and invoke sbe_gen with the same arguments:

cargo run --bin sbe_gen -- -i path/to/my-schema.xml -o src/sbe

For a more complete guide with end-to-end examples (fixed-size and variable-size messages), see docs/USAGE.md.

Either approach creates a module in src/sbe containing one Rust file per message defined in the schema. Each file starts with common imports and defines a parse_prefix helper on its message type. For example, given a message header like:

<message name="PacketHdr" id="0" blockLength="12">
  <field name="seq" id="1" type="uint32" />
  <field name="sending_time" id="2" type="uint64" />
</message>

the generator produces the following Rust code:

use zerocopy::{Ref, FromBytes, IntoBytes, KnownLayout, Immutable, Unaligned};
use zerocopy::byteorder::little_endian::{U32, U64};

#[repr(C)]
#[derive(Debug, FromBytes, IntoBytes, KnownLayout, Immutable, Unaligned, Clone, Copy)]
pub struct PacketHdr {
    pub seq: U32,
    pub sending_time: U64,
}

impl PacketHdr {
    #[inline]
    pub fn parse_prefix(body: &[u8]) -> Option<(&Self, &[u8])> {
        Ref::<_, Self>::from_prefix(body)
            .ok()
            .map(|(r, b)| (Ref::into_ref(r), b))
    }
}
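
For readers new to zerocopy, the effect of parse_prefix can be approximated without the crate. The sketch below is not the generated code: it copies the two fields out of the slice instead of casting in place, and `PacketHdrValues` and `parse_prefix_by_hand` are hypothetical names. It assumes the 12-byte little-endian layout of the PacketHdr message.

```rust
/// Decoded copy of the `PacketHdr` fixed block (hypothetical helper type).
#[derive(Debug, PartialEq, Eq)]
pub struct PacketHdrValues {
    pub seq: u32,
    pub sending_time: u64,
}

/// Size of the fixed block: 4 bytes for `seq` plus 8 bytes for `sending_time`.
pub const PACKET_HDR_LEN: usize = 12;

/// Splits `body` into the decoded header and the remaining bytes.
/// Returns `None` when fewer than 12 bytes are available.
pub fn parse_prefix_by_hand(body: &[u8]) -> Option<(PacketHdrValues, &[u8])> {
    if body.len() < PACKET_HDR_LEN {
        return None;
    }
    let (head, rest) = body.split_at(PACKET_HDR_LEN);

    // Copy each field into a fixed-size array so it can be decoded as little-endian.
    let mut seq = [0u8; 4];
    seq.copy_from_slice(&head[0..4]);
    let mut sending_time = [0u8; 8];
    sending_time.copy_from_slice(&head[4..12]);

    Some((
        PacketHdrValues {
            seq: u32::from_le_bytes(seq),
            sending_time: u64::from_le_bytes(sending_time),
        },
        rest,
    ))
}
```

The generated helper performs the same length check and split, but it hands back a reference into the original buffer rather than a decoded copy.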

Build-time integration (build.rs)

If you want generated modules to live inside your crate (like examples/cme_mdp3_pcap_dump), use a build script that runs the generator at compile time.

  1. Add the build dependency:
[build-dependencies]
sbe_gen = "0.3"
  2. Organize schemas under schemas/<schema_name>/:
schemas/
  my_schema/
    templates_FixBinary.xml   # preferred name

If templates_FixBinary.xml is not present, the build script below expects exactly one .xml file in that directory.

  3. Add build.rs that emits code into src/generated/<schema_name> and writes src/generated/mod.rs:
use std::env;
use std::fs;
use std::path::{Path, PathBuf};

use sbe_gen::{generate_to, GeneratorOptions};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let manifest_dir = PathBuf::from(env::var("CARGO_MANIFEST_DIR")?);
    let schemas_dir = manifest_dir.join("schemas");
    let generated_root = manifest_dir.join("src/generated");

    println!("cargo:rerun-if-changed=build.rs");
    println!("cargo:rerun-if-changed={}", schemas_dir.display());

    fs::create_dir_all(&generated_root)?;

    let mut schema_dirs: Vec<PathBuf> = fs::read_dir(&schemas_dir)?
        .filter_map(|entry| entry.ok())
        .map(|entry| entry.path())
        .filter(|path| path.is_dir())
        .collect();
    schema_dirs.sort();

    let opts = GeneratorOptions::default();
    let mut modules = Vec::new();
    for schema_dir in schema_dirs {
        let schema_name = schema_dir
            .file_name()
            .and_then(|s| s.to_str())
            .ok_or("invalid schema directory name")?
            .to_string();

        let schema_xml = pick_schema_xml(&schema_dir)?;
        println!("cargo:rerun-if-changed={}", schema_xml.display());
        let schema_contents = fs::read_to_string(&schema_xml)?;

        let output_dir = generated_root.join(&schema_name);
        fs::create_dir_all(&output_dir)?;
        generate_to(&schema_contents, &output_dir, &opts)?;

        modules.push(schema_name);
    }

    write_generated_mod(&generated_root, &modules)?;
    Ok(())
}

fn pick_schema_xml(dir: &Path) -> Result<PathBuf, Box<dyn std::error::Error>> {
    let preferred = dir.join("templates_FixBinary.xml");
    if preferred.is_file() {
        return Ok(preferred);
    }

    let mut xml_files: Vec<PathBuf> = fs::read_dir(dir)?
        .filter_map(|entry| entry.ok())
        .map(|entry| entry.path())
        .filter(|path| path.extension().map(|ext| ext == "xml").unwrap_or(false))
        .collect();
    xml_files.sort();

    match xml_files.len() {
        0 => Err(format!("no XML schema file found in {}", dir.display()).into()),
        1 => Ok(xml_files.remove(0)),
        _ => Err(format!(
            "multiple XML files found in {}, pick one via templates_FixBinary.xml",
            dir.display()
        )
        .into()),
    }
}

fn write_generated_mod(root: &Path, modules: &[String]) -> Result<(), Box<dyn std::error::Error>> {
    let mut buf = String::from("// @generated by build.rs; do not edit\n");
    for module in modules {
        buf.push_str(&format!("pub mod {};\n", module));
    }
    fs::write(root.join("mod.rs"), buf)?;
    Ok(())
}
  4. Expose and use the generated modules:
pub mod generated;
use crate::generated::my_schema::packet_hdr::PacketHdr;

If you only have one schema, you can simplify the build script to read a single XML file instead of scanning subdirectories.
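
Note that the build script writes each schema directory name verbatim into a `pub mod <name>;` line, so a directory such as cme-mdp3 would yield a mod.rs that does not compile. One possible guard is to normalise the name before using it for both the output directory and the module list. The `module_name` helper below is a hypothetical sketch, not part of sbe_gen, and it does not check for reserved keywords such as `type`.

```rust
/// Turns a schema directory name into a candidate Rust module name.
/// Hypothetical build-script helper; not provided by sbe_gen.
/// Returns `None` for names that cannot be mapped predictably.
pub fn module_name(dir_name: &str) -> Option<String> {
    let mut out = String::with_capacity(dir_name.len() + 1);
    for ch in dir_name.chars() {
        if ch.is_ascii_alphanumeric() {
            out.push(ch.to_ascii_lowercase());
        } else if ch == '_' || ch == '-' || ch == '.' {
            // Common separators become underscores.
            out.push('_');
        } else {
            // Spaces, non-ASCII characters and so on are rejected outright.
            return None;
        }
    }
    // Empty names and names made only of underscores are not usable.
    if out.chars().all(|c| c == '_') {
        return None;
    }
    // Identifiers may not start with a digit, so prefix an underscore.
    if out.starts_with(|c: char| c.is_ascii_digit()) {
        out.insert(0, '_');
    }
    Some(out)
}
```

In the build script this would replace the plain `.to_string()` on the directory name, with a `None` result turned into an error so the build fails with a clear message instead of a syntax error in generated code.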

Groups and variable data

Groups are emitted as iterators layered on top of the raw buffer, and variable‑length data fields come back as lightweight VarData<'a> wrappers you can inspect as bytes or as UTF‑8 strings.

<message name="Book" id="1" blockLength="4">
  <field name="seq" id="1" type="uint32" />
  <group name="Levels" id="2" blockLength="16" dimensionType="groupSize">
    <field name="price" id="1" type="int64" />
    <field name="qty" id="2" type="int64" />
    <data name="note" id="3" type="varStringEncoding" />
  </group>
  <data name="raw" id="4" type="varStringEncoding" />
</message>

The generated module exposes clear, chainable helpers:

use sbe::book::*;

let (book, rest) = Book::parse_prefix(bytes).expect("prefix");

// Parse the Levels group
let levels = parse_levels(rest).expect("levels header");
for level in levels.iter() {
    let price = level.body.price;
    let qty = level.body.qty;
    let note = level.note.as_str();
}
let after_levels = levels.iter().remainder();

// Parse trailing variable data
let (raw, tail) = book.parse_raw(after_levels).expect("raw data");
let raw_str = raw.as_str();
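
Behind these helpers sits the usual SBE wire layout: a group starts with a dimension header giving the per-entry block length and the entry count, and each variable-length field is a length prefix followed by that many bytes. The std-only sketch below walks that layout by hand. The header widths are assumptions (two little-endian uint16 values for the group, a uint16 length prefix for data); the real widths come from the groupSize and varStringEncoding composites in your schema, and all function names are hypothetical.

```rust
/// Reads a little-endian u16 from the front of `buf`.
fn take_u16(buf: &[u8]) -> Option<(u16, &[u8])> {
    if buf.len() < 2 {
        return None;
    }
    Some((u16::from_le_bytes([buf[0], buf[1]]), &buf[2..]))
}

/// Group dimension header: (block_length, num_in_group), then the rest.
pub fn read_group_header(buf: &[u8]) -> Option<((u16, u16), &[u8])> {
    let (block_length, rest) = take_u16(buf)?;
    let (num_in_group, rest) = take_u16(rest)?;
    Some(((block_length, num_in_group), rest))
}

/// Variable-length field: a length prefix followed by the payload.
pub fn read_var_data(buf: &[u8]) -> Option<(&[u8], &[u8])> {
    let (len, rest) = take_u16(buf)?;
    let len = len as usize;
    if rest.len() < len {
        return None;
    }
    Some(rest.split_at(len))
}

/// Walks a group whose entries are a fixed block followed by one
/// var-data field, returning every var-data payload and the remainder.
pub fn read_notes(buf: &[u8]) -> Option<(Vec<&[u8]>, &[u8])> {
    let ((block_length, count), mut rest) = read_group_header(buf)?;
    let block_length = block_length as usize;
    let mut notes = Vec::with_capacity(count as usize);
    for _ in 0..count {
        if rest.len() < block_length {
            return None;
        }
        // Skip the fixed block (price and qty in the Levels example).
        rest = &rest[block_length..];
        let (note, after) = read_var_data(rest)?;
        notes.push(note);
        rest = after;
    }
    Some((notes, rest))
}
```

Using the block length from the header rather than a compile-time size is what lets a decoder skip fields appended by a newer schema version.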

Encoding with builders

Every message module includes a builder that writes the fixed block, groups and variable data with the correct padding, offsets and length prefixes. Builders accept native Rust numeric types and take care of the endianness for you.

use sbe::book::*;

let mut builder = BookBuilder::new();
builder.seq(123);
builder.levels(|levels| {
    levels.entry(|entry| {
        entry.price(101_500);
        entry.qty(10);
        entry.note(b"resting");
    });
});
builder.raw(b"payload");

// Emit the message framed with the standard SBE header
let framed = builder.finish_with_header();
// or if you only need the body:
// let body = builder.finish();
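
For reference, the standard SBE message header consists of four uint16 fields: blockLength, templateId, schemaId and version. What framing amounts to on the wire can be sketched without the builder. Little-endian byte order is assumed here, and `MessageHeader` and `frame` are hypothetical names rather than the generated API.

```rust
/// Hand-written model of the standard SBE message header (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MessageHeader {
    pub block_length: u16,
    pub template_id: u16,
    pub schema_id: u16,
    pub version: u16,
}

impl MessageHeader {
    /// Four uint16 fields, eight bytes in total.
    pub const ENCODED_LEN: usize = 8;

    /// Serialises the header as little-endian bytes.
    pub fn encode(&self) -> [u8; 8] {
        let mut out = [0u8; 8];
        out[0..2].copy_from_slice(&self.block_length.to_le_bytes());
        out[2..4].copy_from_slice(&self.template_id.to_le_bytes());
        out[4..6].copy_from_slice(&self.schema_id.to_le_bytes());
        out[6..8].copy_from_slice(&self.version.to_le_bytes());
        out
    }

    /// Reads a header from the front of `buf` and returns the message body.
    pub fn decode(buf: &[u8]) -> Option<(MessageHeader, &[u8])> {
        if buf.len() < Self::ENCODED_LEN {
            return None;
        }
        let u16_at = |i: usize| u16::from_le_bytes([buf[i], buf[i + 1]]);
        let header = MessageHeader {
            block_length: u16_at(0),
            template_id: u16_at(2),
            schema_id: u16_at(4),
            version: u16_at(6),
        };
        Some((header, &buf[Self::ENCODED_LEN..]))
    }
}

/// Prepends the encoded header to an already encoded message body.
pub fn frame(header: &MessageHeader, body: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(MessageHeader::ENCODED_LEN + body.len());
    out.extend_from_slice(&header.encode());
    out.extend_from_slice(body);
    out
}
```

A receiver reads templateId to pick the message type, then uses blockLength and version from the header, not its own compiled-in values, to decide how many bytes of fixed block to expect.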

CME MDP3 pcap dump example

This repository includes a standalone example crate that generates CME MDP3 decoders from examples/cme_mdp3_pcap_dump/schemas/cme_mdp3/templates_FixBinary.xml and dumps Market-by-Order packets from a pcap file:

cargo run --manifest-path examples/cme_mdp3_pcap_dump/Cargo.toml -- \
  --pcap /path/to/file.pcap

The example supports filters (--src-port, --dst-port, --udp-port, --src, --dst) and --limit. The pcap crate requires libpcap headers to be installed on your system.

Status and limitations

This project is a work‑in‑progress. The generator covers the core SBE types (primitives, enums, sets, composites, groups and variable data) along with the standard message header and byte‑order rules. Notable spec features that are still missing:

  • Optional composites are treated as required; optional handling is only emitted for primitives, enums and sets.
  • Acting version awareness is not implemented; parsing assumes the current schema version and declared block lengths.
  • Value constraints (min/max/null/initial) are emitted as constants but are not enforced at runtime.
  • Constant fields remain const definitions rather than struct members.
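
Since value constraints are only emitted as constants, callers who need them enforced can check at the call site. A minimal sketch, assuming the generated module exposes minimum and maximum constants for a field: `QTY_MIN_VALUE`, `QTY_MAX_VALUE`, their values and `check_qty` are all hypothetical, so consult your generated code for the real constant names.

```rust
/// Stand-ins for constants the generator would emit (names and values assumed).
pub const QTY_MIN_VALUE: i64 = 0;
pub const QTY_MAX_VALUE: i64 = 1_000_000;

/// Describes which bound a value violated.
#[derive(Debug, PartialEq, Eq)]
pub enum ConstraintError {
    BelowMin { value: i64, min: i64 },
    AboveMax { value: i64, max: i64 },
}

/// Returns the value unchanged when it lies within the inclusive range.
pub fn check_qty(value: i64) -> Result<i64, ConstraintError> {
    if value < QTY_MIN_VALUE {
        Err(ConstraintError::BelowMin { value, min: QTY_MIN_VALUE })
    } else if value > QTY_MAX_VALUE {
        Err(ConstraintError::AboveMax { value, max: QTY_MAX_VALUE })
    } else {
        Ok(value)
    }
}
```

The same check works on both sides: before passing a value to a builder, and after reading a field from a decoded message.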

Contributions to extend the generator are welcome. See the src/parser.rs and src/codegen.rs modules for the implementation.
