| Crates.io | tokit |
| lib.rs | tokit |
| version | 0.0.0 |
| created_at | 2025-12-13 10:32:27.870009+00 |
| updated_at | 2025-12-13 10:32:27.870009+00 |
| description | Blazing fast parser combinators: parse-while-lexing (zero-copy), deterministic LALR-style parsing, no backtracking. Flexible emitters for fail-fast runtime or greedy compiler diagnostics |
| homepage | https://github.com/al8n/tokit |
| repository | https://github.com/al8n/tokit |
| max_upload_size | |
| id | 1982796 |
| size | 1,485,183 |
WIP: This project is still under active development and not ready for use.
Blazing fast parser combinators with parse-while-lexing architecture (zero-copy), deterministic LALR-style parsing, and no hidden backtracking.
Tokit is a blazing fast parser combinator library for Rust that uniquely combines:

- **Parse-while-lexing** - zero-copy token streaming straight from the lexer
- **Deterministic, LALR-style parsing** - explicit lookahead, no hidden backtracking
- **Flexible error handling** - the `Emitter` type - same parser, different behavior (`Fatal`, `Silent`, `Ignored`)
- **Pluggable lexing** - the `Lexer<'inp>` trait, plus a `LogosLexer` adapter for seamless Logos integration
- **Multiple token sources** - `str`, `[u8]`, `Bytes`, `BStr`, `HipStr`

Unlike traditional parser combinators that buffer tokens and rely on implicit backtracking, Tokit streams tokens on-demand with predictable, deterministic decisions. This makes it ideal for building high-performance language tooling, DSL parsers, compilers, and REPLs that need both speed and comprehensive error reporting.

## Installation

Add this to your `Cargo.toml`:
```toml
[dependencies]
tokit = "0.0.0"
```
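Optional integrations are gated behind Cargo features (listed below). For example, enabling the Logos adapter together with rowan CST support might look like this (an illustrative snippet, not a required configuration):

```toml
[dependencies]
tokit = { version = "0.0.0", features = ["logos", "rowan"] }
```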
## Feature Flags

- `std` (default) - Enable standard library support
- `alloc` - Enable allocator support for no-std environments
- `logos` - Enable the `LogosLexer` adapter for Logos integration
- `rowan` - Enable CST (Concrete Syntax Tree) support with rowan integration
- `bytes` - Support for `bytes::Bytes` as a token source
- `bstr` - Support for `bstr::BStr` as a token source
- `hipstr` - Support for `hipstr::HipStr` as a token source
- `among` - Enable `Among<L, M, R>` parseable support
- `smallvec` - Enable small vector optimization utilities

### `Lexer<'inp>` Trait
Core trait for lexers that produce token streams. Implement this to use any lexer with Tokit.
### `Token<'a>` Trait
Defines token types with:
- `Kind`: Token kind discriminator
- `Error`: Associated error type

### `LogosLexer<'inp, T, L>` (feature: `logos`)
Ready-to-use adapter for integrating Logos lexers.
Tokit's flexible Emitter system allows the same parser to adapt to different use cases by simply changing the error handling strategy:
- `Fatal` - Fail-fast parsing: stop on the first error (default) - perfect for runtime parsing and REPLs
- `Silent` - Silently ignore errors
- `Ignored` - Ignore errors completely

**Key Design**: Change the `Emitter` type to switch between fail-fast runtime parsing and greedy compiler diagnostics - same parser code, different behavior. This makes Tokit suitable for both:
- **Runtime/REPL**: Fast feedback with the `Fatal` emitter
- **Compiler/IDE**: Comprehensive diagnostics with a greedy emitter (coming soon)
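The strategy-swapping idea can be sketched in plain Rust. This is a concept illustration only: the `Emitter` trait, `Fatal` type, and `Collecting` type below are simplified stand-ins, not Tokit's actual API.

```rust
// Concept sketch of emitter-style error handling. The names `Emitter`,
// `Fatal`, and `Collecting` are hypothetical, not tokit's real API.

#[derive(Debug)]
struct ParseError(String);

trait Emitter {
    // Return Err to abort parsing, Ok(()) to keep going.
    fn emit(&mut self, err: ParseError) -> Result<(), ParseError>;
}

// Fail-fast: surface the first error immediately (REPL-style).
struct Fatal;
impl Emitter for Fatal {
    fn emit(&mut self, err: ParseError) -> Result<(), ParseError> {
        Err(err)
    }
}

// Greedy: record every error and keep parsing (compiler-diagnostics-style).
#[derive(Default)]
struct Collecting(Vec<ParseError>);
impl Emitter for Collecting {
    fn emit(&mut self, err: ParseError) -> Result<(), ParseError> {
        self.0.push(err);
        Ok(())
    }
}

// The same "parser" works unchanged with either strategy.
fn parse_digits<E: Emitter>(input: &str, emitter: &mut E) -> Result<Vec<u32>, ParseError> {
    let mut out = Vec::new();
    for tok in input.split_whitespace() {
        match tok.parse::<u32>() {
            Ok(n) => out.push(n),
            Err(_) => emitter.emit(ParseError(format!("not a number: {tok}")))?,
        }
    }
    Ok(out)
}

fn main() {
    // Fail-fast stops at "x"; the collecting emitter parses through it.
    assert!(parse_digits("1 x 3", &mut Fatal).is_err());

    let mut diags = Collecting::default();
    assert_eq!(parse_digits("1 x 3", &mut diags).unwrap(), vec![1, 3]);
    assert_eq!(diags.0.len(), 1);
}
```

Only the emitter value passed in changes; the parsing function itself is identical in both runs.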
### Rich Error Types (in the `error/` module)
- `UnexpectedToken`, `MissingToken`, `UnexpectedEot`
- `Unclosed`, `Unterminated`, `Malformed`, `InvalidHexEscape`, `UnicodeEscape`

### Span Tracking
- `Span` - Lightweight span representation
- `Spanned<T>` - Wrap a value with its span
- `Located<T>` - Wrap a value with its span and source slice
- `Sliced<T>` - Wrap a value with its source slice

### Parser Configuration
- `Parser<F, L, O, Error, Context>` - Configurable parser
- `ParseContext` - Context for the emitter and cache
- `Window` - Type-level peek buffer capacity for deterministic lookahead (`typenum::{U1..U32}`)

Here's a simple example parsing JSON tokens:
```rust
// `Display` derive is assumed to come from the `derive_more` crate.
use derive_more::Display;
use logos::Logos;
use tokit::{Any, Parse, Token as TokenT};

#[derive(Debug, Logos, Clone)]
#[logos(skip r"[ \t\r\n\f]+")]
enum Token {
    #[token("true", |_| true)]
    #[token("false", |_| false)]
    Bool(bool),
    #[token("null")]
    Null,
    #[regex(r"-?(?:0|[1-9]\d*)(?:\.\d+)?", |lex| lex.slice().parse::<f64>().unwrap())]
    Number(f64),
}

#[derive(Debug, Display, Clone, Copy)]
enum TokenKind {
    Bool,
    Null,
    Number,
}

impl TokenT<'_> for Token {
    type Kind = TokenKind;
    type Error = ();

    fn kind(&self) -> Self::Kind {
        match self {
            Token::Bool(_) => TokenKind::Bool,
            Token::Null => TokenKind::Null,
            Token::Number(_) => TokenKind::Number,
        }
    }
}

type MyLexer<'a> = tokit::LogosLexer<'a, Token, Token>;

fn main() {
    // Parse any token and extract its value
    let parser = Any::parser::<'_, MyLexer<'_>, ()>()
        .map(|tok: Token| match tok {
            Token::Number(n) => Some(n),
            _ => None,
        });

    let result = parser.parse("42.5");
    println!("{:?}", result); // Ok(Some(42.5))
}
```
Check out the examples directory:

```bash
# JSON token parsing with map combinators
cargo run --example json
```

> Note: The calculator examples are being updated for the v0.3.0 API.
Tokit's architecture follows a layered design that separates the lexer, the parser, and the error-handling emitter, so each piece can be swapped independently.

Tokit uses a parse-while-lexing architecture where parsers consume tokens directly from the lexer as needed, without intermediate buffering:
**Traditional Approach (Two-Phase):**

```text
Source → Lexer → [Token Buffer] → Parser
                       ↓
              Allocate Vec<Token> ← Extra allocation!
```

**Tokit Approach (Streaming):**

```text
Source → Lexer ←→ Parser
           ↑________↓
  Zero-copy streaming, no buffer
```
**Benefits**: zero-copy streaming with no intermediate token buffer and no extra allocation.
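The parse-while-lexing loop can be mimicked in plain Rust with a lazy iterator standing in for the lexer. This is a concept sketch only, not Tokit's real `Lexer<'inp>` API: the parser pulls each token on demand, so no `Vec<Token>` is ever built.

```rust
// Concept sketch: parsing interleaves with lexing; tokens are produced
// lazily and consumed immediately, never buffered.

#[derive(Debug, PartialEq)]
enum Token<'a> {
    Number(&'a str), // zero-copy: borrows from the source
    Plus,
}

// A lazy, zero-copy lexer over "1+2+3"-style input.
fn lex(src: &str) -> impl Iterator<Item = Token<'_>> + '_ {
    src.split_inclusive('+').flat_map(|chunk| {
        let (num, rest) = match chunk.strip_suffix('+') {
            Some(num) => (num, Some(Token::Plus)),
            None => (chunk, None),
        };
        std::iter::once(Token::Number(num)).chain(rest)
    })
}

// The "parser" consumes tokens straight from the lexer as it decides.
fn sum(src: &str) -> Option<i64> {
    let mut total = 0;
    let mut tokens = lex(src);
    loop {
        match tokens.next()? {
            Token::Number(n) => total += n.parse::<i64>().ok()?,
            Token::Plus => return None, // unexpected leading '+'
        }
        match tokens.next() {
            Some(Token::Plus) => continue,
            Some(Token::Number(_)) => return None, // two numbers in a row
            None => return Some(total),
        }
    }
}

fn main() {
    assert_eq!(sum("1+2+3"), Some(6));
    assert_eq!(sum("1++2"), None);
}
```

The `Token::Number` variant borrows `&str` slices from the source, which is the same zero-copy idea the diagram above describes.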
Unlike traditional parser combinators that rely on implicit backtracking (trying alternatives until one succeeds), Tokit uses explicit lookahead-based decisions. This design choice provides:
- Explicit lookahead via `peek_then()` and `peek_then_choice()`
- Bounded, type-level lookahead windows (the `Window` trait)

```rust
// Traditional parser combinator (hidden backtracking):
// try_parser1.or(try_parser2).or(try_parser3) // May backtrack!

// Tokit approach (explicit lookahead, no backtracking):
let parser = any()
    .peek_then::<_, typenum::U2>(|peeked, _| {
        match peeked.get(0) {
            Some(Token::If) => Ok(Action::Continue), // Deterministic decision
            _ => Ok(Action::Stop),
        }
    });
```
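The same decide-by-peeking discipline can be shown with the standard library's `Peekable`, independently of Tokit: one bounded peek chooses the branch, and input is never rewound. The token and statement types here are hypothetical illustrations.

```rust
// Plain-Rust illustration of lookahead-based (non-backtracking) choice.
// A single peek decides the branch; tokens are never un-consumed.

#[derive(Debug, PartialEq, Clone, Copy)]
enum Tok {
    If,
    Ident,
    Number,
}

#[derive(Debug, PartialEq)]
enum Stmt {
    Conditional, // began with `if`
    Expression,  // anything else
}

fn parse_stmt<I: Iterator<Item = Tok>>(tokens: &mut std::iter::Peekable<I>) -> Option<Stmt> {
    // Deterministic decision from one peek - no trial-and-error.
    match tokens.peek()? {
        Tok::If => {
            tokens.next(); // commit: consume `if`
            tokens.next()?; // condition token (simplified grammar)
            Some(Stmt::Conditional)
        }
        _ => {
            tokens.next()?; // consume the expression token
            Some(Stmt::Expression)
        }
    }
}

fn main() {
    let mut toks = [Tok::If, Tok::Ident].into_iter().peekable();
    assert_eq!(parse_stmt(&mut toks), Some(Stmt::Conditional));

    let mut toks = [Tok::Number].into_iter().peekable();
    assert_eq!(parse_stmt(&mut toks), Some(Stmt::Expression));
}
```

Because every alternative is selected up front from the peeked tokens, failure cases are reported where they occur instead of being silently retried.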
Tokit uniquely combines combinator-style composition with deterministic, bounded lookahead (`Window::CAPACITY`). This hybrid approach gives you composable abstractions without sacrificing performance or predictability.
Tokit's architecture decouples parsing logic from error handling strategy through the Emitter trait. This means:
**Same Parser, Different Contexts:**
- `Fatal` emitter → stop on the first error for immediate feedback
- `Ignored` emitter → parse through all errors for robustness testing
Tokit takes inspiration from:
- `smear`: Blazing fast, fully spec-compliant, reusable parser combinators for standard GraphQL and GraphQL-like DSLs

## License

`tokit` is dual-licensed under:
You may choose either license for your purposes.
Copyright (c) 2025 Al Liu.