| Crates.io | whitehole |
| lib.rs | whitehole |
| version | 0.8.0 |
| created_at | 2024-11-24 13:13:22.606592+00 |
| updated_at | 2025-04-05 08:04:00.965849+00 |
| description | A simple, fast, intuitive parser combinator framework for Rust. |
| homepage | |
| repository | https://github.com/DiscreteTom/whitehole |
| max_upload_size | |
| id | 1459245 |
| size | 301,351 |
A simple, fast, intuitive parser combinator framework for Rust.
- Combinators: eat, till, next, take, wrap, recur.
- Use +, |, ! to compose combinators, use * to repeat a combinator.
- recur uses some pointers for recursion.
- unsafe variants are available for performance.
- String (&str) and bytes (&[u8]) support.
Install with:
cargo add whitehole
See the examples directory for more examples.
Here is a simple example to parse hexadecimal color codes:
use whitehole::{
    combinator::{eat, next},
    parser::Parser,
};

let double_hex = || {
    // Repeat a combinator with `*`.
    (next(|c| c.is_ascii_hexdigit()) * 2)
        // Convert the matched content to `u8`.
        .select(|accepted| u8::from_str_radix(accepted.content(), 16).unwrap())
        // Wrap `u8` to `(u8,)`, this is required by `+` below.
        .tuple()
};

// Concat multiple combinators with `+`.
// Tuple values will be concatenated into a single tuple.
// Here `() + (u8,) + (u8,) + (u8,)` will be `(u8, u8, u8)`.
let entry = eat('#') + double_hex() + double_hex() + double_hex();

let mut parser = Parser::builder().entry(entry).build("#FFA500");
let output = parser.next().unwrap();
assert_eq!(output.digested, 7);
assert_eq!(output.value, (0xFF, 0xA5, 0x00));
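To make the tuple concatenation behind `+` more concrete, here is a minimal standard-library-only sketch of how operator overloading can chain parsers and merge their values. It is not whitehole's implementation: `Toy`, `Concat`, `Seq`, `Eat` and `DoubleHex` are hypothetical names invented for this illustration, and only tuples of up to three elements are handled.

```rust
use std::ops::Add;

/// Hypothetical toy parser trait (NOT whitehole's `Action`).
/// On success, returns the parsed value and the number of bytes digested.
trait Toy {
    type Value;
    fn exec(&self, input: &str) -> Option<(Self::Value, usize)>;
}

/// Hypothetical helper trait: append a 1-tuple to a tuple.
trait Concat<Rhs> {
    type Output;
    fn concat(self, rhs: Rhs) -> Self::Output;
}
impl<A> Concat<(A,)> for () {
    type Output = (A,);
    fn concat(self, rhs: (A,)) -> (A,) { rhs }
}
impl<A, B> Concat<(B,)> for (A,) {
    type Output = (A, B);
    fn concat(self, rhs: (B,)) -> (A, B) { (self.0, rhs.0) }
}
impl<A, B, C> Concat<(C,)> for (A, B) {
    type Output = (A, B, C);
    fn concat(self, rhs: (C,)) -> (A, B, C) { (self.0, self.1, rhs.0) }
}

/// Matches one exact character and yields `()`.
struct Eat(char);
impl Toy for Eat {
    type Value = ();
    fn exec(&self, input: &str) -> Option<((), usize)> {
        input.starts_with(self.0).then(|| ((), self.0.len_utf8()))
    }
}

/// Matches two ASCII hex digits and yields `(u8,)`.
struct DoubleHex;
impl Toy for DoubleHex {
    type Value = (u8,);
    fn exec(&self, input: &str) -> Option<((u8,), usize)> {
        let digits = input.get(..2)?;
        if !digits.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        Some(((u8::from_str_radix(digits, 16).ok()?,), 2))
    }
}

/// Runs `L`, then `R` on the rest of the input, and concatenates both values.
struct Seq<L, R>(L, R);
impl<L, R> Toy for Seq<L, R>
where
    L: Toy,
    R: Toy,
    L::Value: Concat<R::Value>,
{
    type Value = <L::Value as Concat<R::Value>>::Output;
    fn exec(&self, input: &str) -> Option<(Self::Value, usize)> {
        let (left, left_len) = self.0.exec(input)?;
        let (right, right_len) = self.1.exec(&input[left_len..])?;
        Some((left.concat(right), left_len + right_len))
    }
}

// `a + b` only builds a `Seq`; parsing happens later in `exec`.
impl<R> Add<R> for Eat {
    type Output = Seq<Eat, R>;
    fn add(self, rhs: R) -> Self::Output { Seq(self, rhs) }
}
impl<L, R, Rhs> Add<Rhs> for Seq<L, R> {
    type Output = Seq<Seq<L, R>, Rhs>;
    fn add(self, rhs: Rhs) -> Self::Output { Seq(self, rhs) }
}
```

With these toy types, `Eat('#') + DoubleHex + DoubleHex + DoubleHex` builds a nested `Seq` whose value type works out to `(u8, u8, u8)`, mirroring the `() + (u8,) + (u8,) + (u8,)` comment in the example above.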
The easiest way to debug a parser is to apply .log(name) to any combinator you need to inspect.
use whitehole::{
    combinator::{eat, next},
    parser::Parser,
};

let double_hex = || {
    (next(|c| c.is_ascii_hexdigit()).log("hex") * 2)
        .log("double_hex")
        .select(|accepted| u8::from_str_radix(accepted.content(), 16).unwrap())
        .tuple()
};

let entry =
    (eat('#').log("hash") + double_hex().log("R") + double_hex().log("G") + double_hex().log("B"))
        .log("entry");

let mut parser = Parser::builder().entry(entry).build("#FFA500");
parser.next().unwrap();
Output:
(entry) input: "#FFA500"
| (hash) input: "#FFA500"
| (hash) output: Some("#")
| (R) input: "FFA500"
| | (double_hex) input: "FFA500"
| | | (hex) input: "FFA500"
| | | (hex) output: Some("F")
| | | (hex) input: "FA500"
| | | (hex) output: Some("F")
| | (double_hex) output: Some("FF")
| (R) output: Some("FF")
| (G) input: "A500"
| | (double_hex) input: "A500"
| | | (hex) input: "A500"
| | | (hex) output: Some("A")
| | | (hex) input: "500"
| | | (hex) output: Some("5")
| | (double_hex) output: Some("A5")
| (G) output: Some("A5")
| (B) input: "00"
| | (double_hex) input: "00"
| | | (hex) input: "00"
| | | (hex) output: Some("0")
| | | (hex) input: "0"
| | | (hex) output: Some("0")
| | (double_hex) output: Some("00")
| (B) output: Some("00")
(entry) output: Some("#FFA500")
If you need to inspect your custom state and heap, you can use combinator decorators or write your own combinator extensions to achieve this.
Because of the high level of abstraction, it's hard to set breakpoints on combinators.
One workaround is to use wrap to wrap your combinator in a closure or function and manually call Action::exec.
use whitehole::{
    combinator::{eat, next},
    parser::Parser,
};

let double_hex = || {
    (next(|c| c.is_ascii_hexdigit()) * 2)
        .select(|accepted| u8::from_str_radix(accepted.content(), 16).unwrap())
        .tuple()
};

// wrap the original combinator
let double_hex = || {
    use whitehole::{action::Action, combinator::wrap};

    let c = double_hex();
    wrap(move |input| {
        // set a breakpoint here
        c.exec(input)
    })
};

let entry = eat('#') + double_hex() + double_hex() + double_hex();

let mut parser = Parser::builder().entry(entry).build("#FFA500");
parser.next().unwrap();
in_str: a procedural macro to generate a closure that checks if a character is in the provided literal string.
This project is inspired by: