Crates.io | parser-compose |
lib.rs | parser-compose |
version | 0.19.0 |
source | src |
created_at | 2023-09-25 17:28:39.747919 |
updated_at | 2023-12-17 03:57:04.505569 |
description | A library for writing and composing parsers for arbitrary file or data formats |
repository | https://gitlab.com/wake-sleeper/parser-compose |
id | 982908 |
size | 48,279 |
DEPRECATED. See bparse
⚠️ Warning ☣️
Homemade, hand-rolled code ahead. Experimental. May not function as advertised.
parser-compose is a crate for writing parsers for arbitrary data formats. You
can use it to parse string slices and slices whose elements implement Clone.
How is this different?
parser-compose does nothing for error handling or reporting. If the parser
succeeds, you get a Some(..) with the value. If the parser fails, you get a
None. Many parsing tasks do not require knowing why parsing failed, only
that it did. For example, when parsing an HTTP message header, you just want to
know whether it is valid. This crate focuses on this use case, and in so
doing sheds all the ceremony involved in supporting parser error handling.
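To make the Some/None contract concrete, here is a minimal sketch in plain Rust with no dependencies. It does not use parser-compose and is not how the crate is implemented; the helper names `literal`, `two_digits` and `hour_minute` are hypothetical. It only mirrors the shape used throughout this README: on success a parser returns the value together with the rest of the input, and on failure it returns `None` with no further detail.

```rust
/// Matches `expected` at the start of `input`.
/// Hypothetical helper for illustration; not part of parser-compose.
fn literal<'a>(expected: &str, input: &'a str) -> Option<(&'a str, &'a str)> {
    // `strip_prefix` returns None when `input` does not start with `expected`.
    let rest = input.strip_prefix(expected)?;
    Some((&input[..expected.len()], rest))
}

/// Parses exactly two ASCII digits into a `u8`.
/// Hypothetical helper for illustration; not part of parser-compose.
fn two_digits(input: &str) -> Option<(u8, &str)> {
    let bytes = input.as_bytes();
    if bytes.len() < 2 || !bytes[0].is_ascii_digit() || !bytes[1].is_ascii_digit() {
        return None;
    }
    // Both leading bytes are ASCII, so index 2 is a valid char boundary.
    let (digits, rest) = input.split_at(2);
    Some((digits.parse::<u8>().ok()?, rest))
}

/// Parses `HH:MM` by chaining the helpers with `?`.
/// Any failing step makes the whole parser return None.
fn hour_minute(input: &str) -> Option<((u8, u8), &str)> {
    let (hour, rest) = two_digits(input)?;
    let (_, rest) = literal(":", rest)?;
    let (minute, rest) = two_digits(rest)?;
    Some(((hour, minute), rest))
}
```

With these definitions, `hour_minute("08:49:37")` yields the pair `(8, 49)` plus the unparsed `":37"`, while `hour_minute("8:49")` is simply `None`. The caller learns that the input was rejected, not where or why, which is the trade-off described above.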
Parsing an IMF-fixdate HTTP date:
use parser_compose::{Parser, utf8_scalar};

struct ImfDate {
    day_name: String,
    day: u8,
    month: String,
    year: u16,
    hour: u8,
    minute: u8,
    second: u8,
}

fn imfdate(input: &str) -> Option<(ImfDate, &str)> {
    let digit = utf8_scalar(0x30..=0x39);
    let twodigits = digit
        .repeats(2)
        .input()
        .map(|s| u8::from_str_radix(s, 10).unwrap());

    // day-name = %s"Mon" / %s"Tue" / %s"Wed"
    //          / %s"Thu" / %s"Fri" / %s"Sat" / %s"Sun"
    let day_name = "Mon".or("Tue").or("Wed")
        .or("Thu").or("Fri").or("Sat").or("Sun")
        .map(str::to_string);

    let day = twodigits;

    // month = %s"Jan" / %s"Feb" / %s"Mar" / %s"Apr"
    //       / %s"May" / %s"Jun" / %s"Jul" / %s"Aug"
    //       / %s"Sep" / %s"Oct" / %s"Nov" / %s"Dec"
    let month = "Jan".or("Feb").or("Mar").or("Apr")
        .or("May").or("Jun").or("Jul").or("Aug")
        .or("Sep").or("Oct").or("Nov").or("Dec")
        .map(str::to_string);

    let year = digit
        .repeats(4)
        .input()
        .map(|s| u16::from_str_radix(s, 10).unwrap());

    // date1 = day SP month SP year
    let date1 = (day, " ", month, " ", year)
        .map(|(d, _, m, _, y)| (d, m, y));

    let gmt = "GMT";
    let hour = twodigits;
    let minute = twodigits;
    let second = twodigits;

    // time-of-day = hour ":" minute ":" second
    let time_of_day = (hour, ":", minute, ":", second)
        .map(|(h, _, m, _, s)| (h, m, s));

    // IMF-fixdate = day-name "," SP date1 SP time-of-day SP GMT
    (day_name, ", ", date1, " ", time_of_day, " ", gmt)
        .map(|res| ImfDate {
            day_name: res.0,
            day: res.2.0,
            month: res.2.1,
            year: res.2.2,
            hour: res.4.0,
            minute: res.4.1,
            second: res.4.2,
        })
        .try_parse(input)
}

let input = "Sun, 06 Nov 1994 08:49:37 GMT";
let (date, rest) = imfdate(input).unwrap();

assert_eq!(date.day_name, "Sun");
assert_eq!(date.day, 6);
assert_eq!(date.month, "Nov");
assert_eq!(date.year, 1994);
assert_eq!(date.hour, 8);
assert_eq!(date.minute, 49);
assert_eq!(date.second, 37);

// missing comma after day name
assert!(
    imfdate("Sun 06 Nov 1994 08:49:37 GMT").is_none(),
);
Validating an HTTP quoted string:
use parser_compose::{Parser, byte};

/// Tries to parse a quoted string out of `input`.
/// If it succeeds, the returned tuple contains the quoted string (without
/// quotes) along with the rest of the input
fn quoted_string(input: &[u8]) -> Option<(&[u8], &[u8])> {
    let htab = byte(b'\t');
    let sp = byte(b' ');
    let dquote = byte(b'"');
    let obs_text = byte(0x80..=0xFF);
    let vchar = byte(0x21..=0x7E);

    // qdtext = HTAB / SP / %x21 / %x23-5B / %x5D-7E / obs-text
    let qdtext = htab
        .or(sp)
        .or(byte(0x21))
        .or(byte(0x23..=0x5B))
        .or(byte(0x5D..=0x7E))
        .or(obs_text);

    // quoted-pair = "\" ( HTAB / SP / VCHAR / obs-text )
    let quoted_pair = (
        byte(b'\\'),
        htab.or(sp).or(vchar).or(obs_text)
    ).input();

    // quoted-string = DQUOTE *( qdtext / quoted-pair ) DQUOTE
    (dquote, qdtext.or(quoted_pair).repeats(0..).input(), dquote)
        .map(|(_, r, _)| r)
        .try_parse(input)
}

// 3 quoted strings one after the other "1", "a" and "\""
let input = r##""1""a""\"""##.as_bytes();
let (first, rest) = quoted_string(input).unwrap();

assert_eq!(
    b"1",
    first
);

// Notice that because of its signature, `quoted_string` can be treated as a parser
let (remaining, _) = quoted_string.accumulate(2).try_parse(rest).unwrap();

assert_eq!(remaining.len(), 2);
assert_eq!(remaining[0], b"a");
assert_eq!(remaining[1], b"\\\"");
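The last comment above notes that a plain function can be used as a parser because of its signature. One common way to get that behaviour in Rust is a blanket trait implementation over function types. The sketch below shows the idea in plain Rust with no dependencies. It is an assumption about the general technique, not parser-compose's actual code: the trait `MiniParser`, its method `accumulate_n` and the function `digit` are all hypothetical names.

```rust
/// A minimal parser trait: consume some input, return a value and the rest.
/// Hypothetical stand-in; not parser-compose's actual `Parser` trait.
trait MiniParser<'a, T> {
    fn try_parse(&self, input: &'a str) -> Option<(T, &'a str)>;

    /// Runs the parser exactly `n` times in a row and collects the values.
    /// Fails with None as soon as one run fails.
    fn accumulate_n(&self, n: usize, input: &'a str) -> Option<(Vec<T>, &'a str)> {
        let mut values = Vec::with_capacity(n);
        let mut rest = input;
        for _ in 0..n {
            let (value, next) = self.try_parse(rest)?;
            values.push(value);
            rest = next;
        }
        Some((values, rest))
    }
}

/// Blanket impl: any function or closure with the right signature is a parser.
impl<'a, T, F> MiniParser<'a, T> for F
where
    F: Fn(&'a str) -> Option<(T, &'a str)>,
{
    fn try_parse(&self, input: &'a str) -> Option<(T, &'a str)> {
        self(input)
    }
}

/// An ordinary function that parses one decimal digit.
/// It gains the trait's methods purely through its signature.
fn digit(input: &str) -> Option<(u32, &str)> {
    let first = input.chars().next()?;
    let value = first.to_digit(10)?;
    Some((value, &input[first.len_utf8()..]))
}
```

With the trait in scope, `digit.accumulate_n(3, "123abc")` returns the three digits together with the remaining `"abc"`, and `digit.accumulate_n(2, "1x")` returns `None`. The function itself needs no wrapper type, which is what lets hand-written functions and library combinators compose freely.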
This crate would not have been possible without pom, which lays out the
various approaches to writing parser combinators in Rust.