token_processor

Crates.io	token_processor
lib.rs	token_processor
version	0.1.6
created_at	2025-04-21 23:54:10.117904+00
updated_at	2025-04-22 05:08:41.397494+00
description	A fast, streaming‑first Rust library for processing LLM outputs by attaching callbacks to XML‑style tags—supporting both streaming and buffered handlers—and using aho‑corasick for ultra‑efficient, cross‑chunk pattern matching on decoded text tokens.
repository	https://github.com/ljt019/token_processor/
id	1643376
size	53,509

Lucien Thomas (ljt019)

token_processor

A fast, streaming‐oriented token processor for Large Language Model output in Rust.

It is meant to be used with text tokens or chunks that have already been decoded.

Features

Streaming Handlers: Callbacks on tag open, data chunks, and close events in real time.
Buffered Handlers: Collect full payload between tags and invoke an async callback on close.
High Performance: Uses aho-corasick for efficient multi-pattern scanning, including cross‐chunk matches.
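
As a rough illustration of what a cross‐chunk match involves, the sketch below finds a single tag even when it is split across two chunks. It uses only the standard library and a naive substring search, so it is not how token_processor or aho-corasick work internally; `ChunkScanner`, `feed`, and the byte offsets it returns are hypothetical names chosen for this example.

```rust
/// Illustrative only: finds one pattern in a stream of text chunks,
/// including matches that straddle a chunk boundary.
/// (Hypothetical type, not part of token_processor.)
struct ChunkScanner {
    pattern: String,
    /// Unscanned tail carried over from earlier chunks.
    carry: String,
    /// Number of stream bytes already dropped from the front of `carry`.
    consumed: usize,
}

impl ChunkScanner {
    fn new(pattern: &str) -> Self {
        assert!(!pattern.is_empty(), "pattern must not be empty");
        ChunkScanner {
            pattern: pattern.to_string(),
            carry: String::new(),
            consumed: 0,
        }
    }

    /// Feeds one chunk and returns the stream-wide byte offset of every
    /// (non-overlapping) match that this chunk completed.
    fn feed(&mut self, chunk: &str) -> Vec<usize> {
        self.carry.push_str(chunk);
        let mut hits = Vec::new();
        let mut from = 0;
        while let Some(pos) = self.carry[from..].find(self.pattern.as_str()) {
            hits.push(self.consumed + from + pos);
            from += pos + self.pattern.len();
        }
        // A future match can begin at most `pattern.len() - 1` bytes before
        // the end of the data seen so far, so only that tail is kept.
        let keep = self.pattern.len() - 1;
        let mut cut = self.carry.len().saturating_sub(keep).max(from);
        // Back up to a UTF-8 character boundary; `from` is always one,
        // so this never moves before the end of the last match.
        while !self.carry.is_char_boundary(cut) {
            cut -= 1;
        }
        self.consumed += cut;
        self.carry.drain(..cut);
        hits
    }
}
```

Feeding "Hello <thi" and then "nk>world</think>" to a scanner built for "<think>" reports nothing for the first chunk and a single match at byte offset 6 for the second, because the scanner keeps the tail of each chunk that could still begin a match.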

Installation

Add this to your Cargo.toml:

[dependencies]
token_processor = { git = "https://github.com/ljt019/token_processor" }

Or add the published crate with cargo:

cargo add token_processor

Quickstart

use token_processor::{Tag, TokenProcessorBuilder};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut processor = TokenProcessorBuilder::new(1024)
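        // Streaming tag: callbacks fire on open, on each data chunk, and on close.
        .streaming_tag(
            Tag::new("<think>"),
            || print!("[open] "),
            |chunk: &str| print!("{}", chunk),
            || print!(" [close]"),
        )
        // Buffered tag: the full payload between the tags is collected and
        // passed to the async callback when the tag closes.
        .buffered_tag(Tag::new("<tool>"), |payload: String| async move {
            println!("[tool payload] {}", payload);
        })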
        .raw_tokens(|chunk: &str| print!("{}", chunk))
        .build()?;

    processor.process("Hello <think>world</think> <tool>data</tool>!").await?;
    processor.flush().await?;
    Ok(())
}

Examples

Explore the examples/ folder for more usage scenarios:

simple.rs – raw tokens only
streaming_tags.rs – streaming‐mode tag handling
buffered_tags.rs – buffered‐mode tag handling

Testing

Run the full test suite:

cargo test

License

Licensed under MIT OR Apache‐2.0. See LICENSE for details.

Commit count: 34
