Crates.io | whitehole |
lib.rs | whitehole |
version | |
source | src |
created_at | 2024-11-24 13:13:22.606592+00 |
updated_at | 2025-02-16 13:17:29.083244+00 |
description | A simple, fast, intuitive parser combinator framework for Rust. |
homepage | |
repository | https://github.com/DiscreteTom/whitehole |
max_upload_size | |
id | 1459245 |
size | 0 |
A simple, fast, intuitive parser combinator framework for Rust.
Combinators: eat, take, next, till, wrap, recur.
Use + and | to compose combinators, use * to repeat a combinator.
recur uses some pointers for recursion.
unsafe variants are provided for performance.
Both string (&str) and bytes (&[u8]) input are supported.
Install with: cargo add whitehole
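To picture how composing with +, | and * can work, here is a minimal standard-library sketch. The type P and the toy eat and next functions below are hypothetical stand-ins written for this illustration; they are not whitehole's types, and the real crate is implemented differently (this toy boxes closures on the heap and only reports how many bytes were digested).

```rust
use std::ops::{Add, BitOr, Mul};

// Hypothetical toy combinator (NOT whitehole's API): given the remaining
// input, it returns how many bytes it digests, or None if it rejects.
struct P(Box<dyn Fn(&str) -> Option<usize>>);

impl P {
    fn exec(&self, input: &str) -> Option<usize> {
        (self.0)(input)
    }
}

// Toy `eat`: accept exactly one given character.
fn eat(c: char) -> P {
    P(Box::new(move |s: &str| s.starts_with(c).then(|| c.len_utf8())))
}

// Toy `next`: accept one character that satisfies the predicate.
fn next(f: fn(char) -> bool) -> P {
    P(Box::new(move |s: &str| {
        s.chars().next().filter(|&c| f(c)).map(|c| c.len_utf8())
    }))
}

// `a + b`: run `a`, then run `b` on what is left; both must accept.
impl Add for P {
    type Output = P;
    fn add(self, rhs: P) -> P {
        P(Box::new(move |s: &str| {
            let a = self.exec(s)?;
            let b = rhs.exec(&s[a..])?;
            Some(a + b)
        }))
    }
}

// `a | b`: try `a`; if it rejects, try `b` on the same input.
impl BitOr for P {
    type Output = P;
    fn bitor(self, rhs: P) -> P {
        P(Box::new(move |s: &str| self.exec(s).or_else(|| rhs.exec(s))))
    }
}

// `a * n`: run `a` exactly `n` times in a row.
impl Mul<usize> for P {
    type Output = P;
    fn mul(self, n: usize) -> P {
        P(Box::new(move |s: &str| {
            let mut total = 0;
            for _ in 0..n {
                total += self.exec(&s[total..])?;
            }
            Some(total)
        }))
    }
}
```

With these toy definitions, eat('#') + next(|c: char| c.is_ascii_hexdigit()) * 6 reads much like the grammar it describes, because * binds tighter than +.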
See the examples directory for more examples.
Here is a simple example to parse hexadecimal color codes:
use whitehole::{
    combinator::{eat, next},
    parser::Parser,
};

let double_hex = || {
    // Repeat a combinator with `*`.
    (next(|c| c.is_ascii_hexdigit()) * 2)
        // Convert the matched content to `u8`.
        .select(|accept, _| u8::from_str_radix(accept.content(), 16).unwrap())
        // Wrap `u8` to `(u8,)`, this is required by `+` below.
        .tuple()
};

// Concat multiple combinators with `+`.
// Tuple values will be concatenated into a single tuple.
// Here `() + (u8,) + (u8,) + (u8,)` will be `(u8, u8, u8)`.
let entry = eat('#') + double_hex() + double_hex() + double_hex();

let mut parser = Parser::builder().entry(entry).build("#FFA500");
let output = parser.next().unwrap();
assert_eq!(output.digested, 7);
assert_eq!(output.value, (0xFF, 0xA5, 0x00));
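For comparison, the same result can be computed by hand with only the standard library. This is a sketch for illustration, not part of whitehole; the function name parse_hex_color is made up here. It returns the number of bytes digested together with the three channel values, mirroring output.digested and output.value above.

```rust
// Hypothetical hand-written equivalent of the combinator example above.
// Returns (bytes digested, (r, g, b)), or None if the input does not
// start with '#' followed by six hexadecimal digits.
fn parse_hex_color(input: &str) -> Option<(usize, (u8, u8, u8))> {
    // Corresponds to `eat('#')`.
    let rest = input.strip_prefix('#')?;
    // Corresponds to three pairs of hex digits: six ASCII hex digits in total.
    let hex = rest.get(..6)?;
    if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // Convert each two-digit slice to `u8`, like the `select` step.
    let channel = |i: usize| u8::from_str_radix(&hex[i..i + 2], 16).ok();
    Some((1 + 6, (channel(0)?, channel(2)?, channel(4)?)))
}
```

The combinator version expresses the same steps declaratively, and the repeated channel logic lives in one reusable double_hex value instead of manual slicing.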
The easiest way to debug is to apply .log(name) to any combinator you need to inspect.
use whitehole::{
    combinator::{eat, next},
    parser::Parser,
};

let double_hex = || {
    (next(|c| c.is_ascii_hexdigit()).log("hex") * 2)
        .log("double_hex")
        .select(|accept, _| u8::from_str_radix(accept.content(), 16).unwrap())
        .tuple()
};

let entry =
    (eat('#').log("hash") + double_hex().log("R") + double_hex().log("G") + double_hex().log("B"))
        .log("entry");

let mut parser = Parser::builder().entry(entry).build("#FFA500");
parser.next().unwrap();
Output:
(entry) input: "#FFA500"
| (hash) input: "#FFA500"
| (hash) output: Some("#")
| (R) input: "FFA500"
| | (double_hex) input: "FFA500"
| | | (hex) input: "FFA500"
| | | (hex) output: Some("F")
| | | (hex) input: "FA500"
| | | (hex) output: Some("F")
| | (double_hex) output: Some("FF")
| (R) output: Some("FF")
| (G) input: "A500"
| | (double_hex) input: "A500"
| | | (hex) input: "A500"
| | | (hex) output: Some("A")
| | | (hex) input: "500"
| | | (hex) output: Some("5")
| | (double_hex) output: Some("A5")
| (G) output: Some("A5")
| (B) input: "00"
| | (double_hex) input: "00"
| | | (hex) input: "00"
| | | (hex) output: Some("0")
| | | (hex) input: "0"
| | | (hex) output: Some("0")
| | (double_hex) output: Some("00")
| (B) output: Some("00")
(entry) output: Some("#FFA500")
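The nesting in this output follows from each logged combinator printing its input, running its inner combinator one level deeper, and then printing what was matched. A minimal standard-library sketch of that idea follows. The names Parser, Trace and log below are hypothetical and written for this illustration; whitehole's own .log is a method on its combinators and is implemented differently.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical toy parser type (NOT whitehole's API): returns the number
// of bytes digested, or None on rejection.
type Parser = Box<dyn Fn(&str) -> Option<usize>>;

// Shared trace: current nesting depth plus the lines recorded so far.
#[derive(Default)]
struct Trace {
    depth: usize,
    lines: Vec<String>,
}

// Wrap `inner` so that its input and output are recorded with indentation.
fn log(name: &'static str, trace: Rc<RefCell<Trace>>, inner: Parser) -> Parser {
    Box::new(move |input: &str| {
        let indent = {
            let mut t = trace.borrow_mut();
            let indent = "| ".repeat(t.depth);
            t.lines.push(format!("{indent}({name}) input: {input:?}"));
            t.depth += 1;
            indent
        };
        // The borrow is released here, so nested loggers can record too.
        let digested = inner(input);
        let mut t = trace.borrow_mut();
        t.depth -= 1;
        let matched = digested.map(|n| &input[..n]);
        t.lines.push(format!("{indent}({name}) output: {matched:?}"));
        digested
    })
}
```

Recording into a shared Trace instead of printing directly is a choice made for this sketch so the result can be inspected; printing each line with println! would produce output in the same shape as shown above.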
If you need to inspect your custom state and heap, you can use combinator decorators or write your own combinator extensions to achieve this.
Because of the high level of abstraction, it is hard to set breakpoints on combinators.
One workaround is to use wrap to wrap your combinator in a closure or function and manually call Action::exec.
use whitehole::{
    combinator::{eat, next},
    parser::Parser,
};

let double_hex = || {
    (next(|c| c.is_ascii_hexdigit()) * 2)
        .select(|accept, _| u8::from_str_radix(accept.content(), 16).unwrap())
        .tuple()
};

// wrap the original combinator
let double_hex = || {
    use whitehole::{action::Action, combinator::wrap};
    let c = double_hex();
    wrap(move |instant, ctx| {
        // set a breakpoint here
        c.exec(instant, ctx)
    })
};

let entry = eat('#') + double_hex() + double_hex() + double_hex();

let mut parser = Parser::builder().entry(entry).build("#FFA500");
parser.next().unwrap();
in_str: a procedural macro to generate a closure that checks if a character is in the provided literal string.
This project is inspired by: