| Crates.io | lex_sleuther_multiplexer |
| lib.rs | lex_sleuther_multiplexer |
| version | 1.0.0 |
| created_at | 2025-04-26 17:33:57.491893+00 |
| updated_at | 2025-04-26 17:33:57.491893+00 |
| description | a multiplexer over multiple lexers |
| homepage | |
| repository | |
| max_upload_size | |
| id | 1650450 |
| size | 102,742 |
A multiplexer over multiple lexers.
This crate is responsible only for lexing source code and producing a feature vector: essentially a count of how often each token occurs.
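To make the "count of each token occurrence" idea concrete, here is a minimal sketch using only the standard library. The `Token` enum, `lex`, and `feature_vector` are hypothetical names invented for illustration. They are not this crate's API, and the real lexers are generated with lexgen rather than written by hand.

```rust
/// Hypothetical token kinds for a toy language (not this crate's real token set).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Token {
    Ident = 0,
    Number = 1,
    Punct = 2,
}

/// Number of distinct token kinds, i.e. the length of the feature vector.
const TOKEN_KINDS: usize = 3;

/// Very rough tokenizer: splits on whitespace and classifies each chunk by
/// its first character. Deliberately imprecise, in the spirit of
/// "good enough and fast enough".
fn lex(source: &str) -> impl Iterator<Item = Token> + '_ {
    source.split_whitespace().map(|chunk| {
        // `split_whitespace` never yields an empty chunk, so this cannot panic.
        let first = chunk.chars().next().unwrap();
        if first.is_ascii_digit() {
            Token::Number
        } else if first.is_alphabetic() || first == '_' {
            Token::Ident
        } else {
            Token::Punct
        }
    })
}

/// Counts how often each token kind occurs in `source`.
fn feature_vector(source: &str) -> [u32; TOKEN_KINDS] {
    let mut counts = [0u32; TOKEN_KINDS];
    for token in lex(source) {
        counts[token as usize] += 1;
    }
    counts
}
```

Calling `feature_vector("let x = 42 ;")` on this toy lexer yields two identifiers, one number, and two punctuation tokens, in that order.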
We leverage the lexgen lexer library to build optimized lexer state machines quickly. In many cases, the resulting tokenization is not totally correct, but it is good enough and fast enough to provide a meaningful guess.
There are three general steps to adding new lexers:
- … lexgen::lexer! macro to generate a lexer based on a PEG-like syntax. Use existing lexers as a guide.
- … lexers in src/lib.rs.
Note that adding a lexer here does not add a new classification set to lex_sleuther upstream.
The set of classification categories is completely decoupled from the set of lexers this library uses internally.
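The decoupling described above can be sketched as follows, again with the standard library only. Everything here is an assumption made for illustration: `LexerFn`, `LEXERS`, `CATEGORIES`, `multiplex`, and the two toy counting functions are hypothetical and do not reflect the actual contents of src/lib.rs or of lex_sleuther.

```rust
/// Hypothetical lexer signature: source text in, token counts out.
type LexerFn = fn(&str) -> Vec<u32>;

/// Toy "lexer" that counts opening and closing braces (illustrative only).
fn count_braces(src: &str) -> Vec<u32> {
    vec![
        src.matches('{').count() as u32,
        src.matches('}').count() as u32,
    ]
}

/// Toy "lexer" that counts semicolons (illustrative only).
fn count_semicolons(src: &str) -> Vec<u32> {
    vec![src.matches(';').count() as u32]
}

/// The set of lexers the multiplexer runs. Adding an entry here widens the
/// feature vector but does not touch the category list below.
const LEXERS: &[LexerFn] = &[count_braces as LexerFn, count_semicolons as LexerFn];

/// Hypothetical category labels, defined independently of `LEXERS`.
const CATEGORIES: &[&str] = &["c", "rust", "python", "plain text"];

/// Runs every lexer over the same input and concatenates their counts
/// into a single feature vector.
fn multiplex(src: &str) -> Vec<u32> {
    LEXERS.iter().flat_map(|lexer| lexer(src)).collect()
}
```

In this sketch the feature vector has three entries while there are four categories, which shows that the two sets need not line up one to one.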