lex_sleuther_multiplexer

version: 1.0.0
created_at: 2025-04-26 17:33:57.491893+00
updated_at: 2025-04-26 17:33:57.491893+00
description: a multiplexer over multiple lexers
id: 1650450
size: 102,742
(wtollef-cs)

lex_sleuther_multiplexer

A multiplexer over multiple lexers.

This crate is responsible only for lexing source code and producing a feature vector: essentially a count of how often each kind of token occurs.
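As a rough illustration of what "feature vector" means here, the sketch below counts token kinds with a toy hand-written tokenizer. Everything in it (the `Token` enum, `tokenize`, `feature_vector`) is a hypothetical stand-in using only the standard library; it is not this crate's API, and the real lexers are generated with lexgen rather than written by hand.

```rust
/// Hypothetical token kinds; the crate's real lexers define their own.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Token {
    Ident = 0,
    Number = 1,
    Punct = 2,
}

/// Number of distinct token kinds, i.e. the length of the feature vector.
const TOKEN_KINDS: usize = 3;

/// Toy tokenizer standing in for a generated lexer (an assumption for
/// illustration only). Whitespace is skipped; any other character that is
/// not part of an identifier or number becomes a single `Punct` token.
fn tokenize(src: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c.is_ascii_alphabetic() || c == '_' {
            // Consume the whole identifier.
            while chars
                .peek()
                .map_or(false, |c| c.is_ascii_alphanumeric() || *c == '_')
            {
                chars.next();
            }
            tokens.push(Token::Ident);
        } else if c.is_ascii_digit() {
            // Consume the whole run of digits.
            while chars.peek().map_or(false, |c| c.is_ascii_digit()) {
                chars.next();
            }
            tokens.push(Token::Number);
        } else {
            chars.next();
            tokens.push(Token::Punct);
        }
    }
    tokens
}

/// Builds the feature vector: one counter per token kind.
fn feature_vector(src: &str) -> [u32; TOKEN_KINDS] {
    let mut counts = [0u32; TOKEN_KINDS];
    for token in tokenize(src) {
        counts[token as usize] += 1;
    }
    counts
}
```

For the input `let x = 42;` this toy version sees two identifiers, one number, and two punctuation tokens. Note that it treats the keyword `let` as an identifier, which is the kind of imprecision that can be acceptable when only the counts matter.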

We use the lexgen lexer library to build optimized lexer state machines quickly. The resulting tokenization is often not fully correct, but it is accurate enough and fast enough to support a meaningful guess.

adding new lexers

There are three general steps to adding new lexers:

  1. Create a new module using the lexgen::lexer! macro to generate a lexer based on a PEG-like syntax. Use existing lexers as a guide.
  2. Add your lexer to the array lexers in src/lib.rs.
  3. Create a simple test to sanity check the stability of your lexer. Use existing tests as an example.
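The registration and sanity-check steps above can be sketched roughly as follows. This is a standard-library-only illustration under stated assumptions: the `LexerFn` signature, the `LEXERS` registry, the two toy "lexers", and `multiplex` are all hypothetical names, and the actual `lexers` array in src/lib.rs may have a different shape. The lexgen-generated lexer from step 1 is replaced here by plain functions.

```rust
/// Hypothetical lexer signature: source text in, token count out.
type LexerFn = fn(&str) -> usize;

/// Toy "lexer" that treats every whitespace-separated word as a token.
fn count_words(src: &str) -> usize {
    src.split_whitespace().count()
}

/// Toy "lexer" that only recognizes curly braces.
fn count_braces(src: &str) -> usize {
    src.chars().filter(|c| matches!(c, '{' | '}')).count()
}

/// Hypothetical registry, similar in spirit to a `lexers` array:
/// adding a lexer means adding one entry here.
static LEXERS: [(&str, LexerFn); 2] = [
    ("words", count_words as LexerFn),
    ("braces", count_braces as LexerFn),
];

/// Runs every registered lexer over the same input and collects the results
/// in registry order.
fn multiplex(src: &str) -> Vec<(&'static str, usize)> {
    LEXERS
        .iter()
        .map(|(name, lex)| (*name, (*lex)(src)))
        .collect()
}
```

A sanity test in this style would feed a small fixed snippet to `multiplex` and assert that the counts match known values and do not change between runs, so that an accidental change to a lexer's rules shows up as a failing test.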

Note that adding a lexer here does not add a new classification set to lex_sleuther upstream. The set of classification categories is completely decoupled from the set of lexers this library uses internally.
