| Crates.io | hedl-core |
| lib.rs | hedl-core |
| version | 1.2.0 |
| created_at | 2026-01-08 11:00:21.660793+00 |
| updated_at | 2026-01-21 02:57:11.205193+00 |
| description | Core parser and data model for HEDL (Hierarchical Entity Data Language) |
| homepage | https://dweve.com |
| repository | https://github.com/dweve/hedl |
| max_upload_size | |
| id | 2029986 |
| size | 1,323,106 |
The parsing engine that makes HEDL work.
Every format needs a foundation. JSON has parsers that handle {} and []. YAML wrestles with indentation. XML navigates angle brackets. HEDL's foundation is hedl-core - a parser designed from scratch for AI-era data: typed matrices, entity references, and tensor literals built into the format itself.
When you're processing 100GB datasets for ML training, or serving thousands of API requests per second, parser performance isn't academic—it's operational. hedl-core was built for this: deterministic parsing with fail-fast error handling, zero-copy preprocessing for minimal allocations, and a data model that maps directly to how AI systems think about data.
Schema-Aware Matrices: Most formats see tabular data as "arrays of objects." HEDL sees them as typed matrices with column definitions—the difference between [{}, {}, {}] and @User[id,name,email]. The parser understands this natively.
Entity References: When you write @User:alice, the parser knows it's a reference, not a string. Graph relationships are first-class
citizens, not workarounds.
Zero-Copy Where It Counts: Line offset tables and careful memory layout mean the parser allocates only what it needs. Less allocation pressure means more predictable performance.
Fail-Fast Philosophy: Invalid syntax? You get an error at parse time with the exact line number. No silent failures, no "undefined" cascading through your system.
[dependencies]
hedl-core = "1.2"
Parse HEDL documents into a structured data model:
use hedl_core::{parse, Document, Value};
let hedl = r#"
%VERSION: 1.0
%STRUCT: User: [id, name, email, created]
---
users: @User
| alice, Alice Smith, alice@example.com, 2024-01-15
| bob, Bob Jones, bob@example.com, 2024-02-20
"#;
let doc = parse(hedl.as_bytes())?;
// The data model preserves structure
if let Some(users) = doc.get("users") {
if let Some(matrix) = users.as_list() {
println!("Schema: {:?}", matrix.schema);
println!("Rows: {}", matrix.rows.len());
}
}
Work with references and relationships:
let hedl = r#"
%VERSION: 1.0
%STRUCT: Post: [id, author, title]
%STRUCT: Comment: [id, post, author, text]
---
posts: @Post
| p1, @User:alice, First Post
| p2, @User:bob, Second Post
comments: @Comment
| c1, @Post:p1, @User:bob, Great post!
| c2, @Post:p1, @User:alice, Thanks!
"#;
let doc = parse(hedl.as_bytes())?;
// References are parsed as Reference values, not strings
// Traverse the graph structure naturally
Handle tensors for ML workflows:
let hedl = r#"
%VERSION: 1.0
---
metrics: @Metric[id, values, percentiles]
| m1, 1250, [850, 1100, 1400, 2100, 3500]
| m2, 890, [620, 780, 920, 1200, 1800]
"#;
let doc = parse(hedl.as_bytes())?;
// Tensor literals are native Value::Tensor types
// No string parsing needed
The parsed document exposes a clean, typed API:
Document - Root container with version header and content mapItem - Body entries: Scalar (Value), Object (nested map), or List (MatrixList)Node - A row in a matrix list with typed fields and optional childrenValue - Tagged enum for scalar values: Null, Bool, Int, Float, String, Tensor, Reference, ExpressionMatrixList - Schema-defined tabular data with typed columnsReference - Typed entity pointers (@Type:id syntax)Deterministic Parsing: Same input always produces the same AST. No locale-dependent behavior, no timestamp drift, no platform quirks.
Zero-Copy Preprocessing: Line offset tables let the parser jump to any line without scanning the entire file. Parse the schema header, skip to the data section—no wasted work.
Type Safety: The data model uses Rust enums, not stringly-typed polymorphism. Value::Reference is distinct from Value::Scalar. Pattern matching gives you compile-time guarantees.
Error Messages That Help: Parse errors include line numbers, column positions, and context. No cryptic "unexpected token" messages.
Extensible Design: The core parser handles syntax. Higher-level crates add semantic validation, reference resolution, and format conversion.
It's a parser, not a Swiss Army knife:
hedl-linthedl-core's validation layerhedl-json, hedl-yaml, etc.hedl-streamThis separation means you can parse a 10GB file without loading validation rules, or validate a document without network-dependent resolution.
Most developers use higher-level crates (hedl facade crate, hedl-cli tool, hedl-lsp editor support). You'd use hedl-core directly when:
If you're just converting formats or validating files, use hedl-cli. If you need autocomplete in your editor, use hedl-lsp. If you're building the next HEDL tool, start here.
hedl-core uses minimal unsafe code only in performance-critical paths (string interning and arena allocation). The parser provides:
Unsafe blocks appear only in two places, both extensively audited:
lex/arena/interner.rs): Uses unsafe to create interned string references with manual lifetime managementlex/arena/vec.rs): Uses unsafe to create slice views over arena-allocated memoryThese are local to the arena module, not exposed in the public API. The safe public API wraps all arena operations.
Parse errors never panic. Invalid input returns Result<Document, HedlError> with detailed diagnostic information:
use hedl_core::{parse, HedlError, HedlErrorKind};
match parse(data) {
Ok(doc) => { /* process doc */ },
Err(e) => {
println!("Parse error at line {}: {}", e.line, e.message);
// Error kinds: Syntax, Version, Schema, Alias, Shape, Semantic,
// OrphanRow, Collision, Reference, Security, Conversion, IO
match e.kind {
HedlErrorKind::Reference => { /* handle unresolved reference */ },
HedlErrorKind::Schema => { /* handle schema mismatch */ },
_ => { /* other errors */ }
}
}
}
If you discover a security vulnerability, please email [security contact] with:
We take security seriously and will respond within 48 hours.
Apache-2.0 — Use it, fork it, build on it.