token_trie

Crates.iotoken_trie
lib.rstoken_trie
version0.1.0
created_at2025-09-22 08:28:01.395271+00
updated_at2025-09-22 08:28:01.395271+00
descriptionA high-performance Radix Trie implementation with sorted children for efficient binary search operations
homepagehttps://github.com/reyoung/token_trie
repositoryhttps://github.com/reyoung/token_trie
max_upload_size
id1849663
size68,883
Yang Yu (reyoung)

documentation

https://docs.rs/token_trie

README

Token Trie

A high-performance Radix Trie implementation in Rust with sorted children for efficient binary search operations.

Features

  • Radix Compression: Efficient space usage through path compression
  • Sorted Children: All child nodes are kept sorted for O(log k) lookups where k is the branching factor
  • Binary Search: Fast child node lookup using binary search
  • Generic Keys: Works with any key type that implements Clone + Ord + PartialEq
  • Prefix Operations: Efficient prefix search and common prefix length calculation
  • Memory Safe: Written in safe Rust with comprehensive test coverage

Installation

Add this to your Cargo.toml:

[dependencies]
token_trie = "0.1.0"

Quick Start

use token_trie::Trie;

// Create a new trie
let mut trie = Trie::new();

// Insert key-value pairs
trie.insert(&['h', 'e', 'l', 'l', 'o'], "greeting");
trie.insert(&['h', 'e', 'l', 'p'], "assistance");
trie.insert(&['w', 'o', 'r', 'l', 'd'], "earth");

// Look up values
assert_eq!(trie.get(&['h', 'e', 'l', 'l', 'o']), Some(&"greeting"));
assert_eq!(trie.get(&['f', 'o', 'o']), None);

// Prefix search
let he_words = trie.keys_with_prefix(&['h', 'e']);
println!("Words starting with 'he': {:?}", he_words);

// Common prefix length
let prefix_len = trie.common_prefix_length(&['h', 'e', 'l', 'p', 'e', 'r']);
assert_eq!(prefix_len, 4); // Matches "help"

Usage with Integer Keys (Token IDs)

The trie works excellently with integer sequences, making it perfect for token-based applications:

use token_trie::Trie;

let mut token_trie = Trie::new();

// Insert token sequences
token_trie.insert(&[1, 2, 3], "sequence_a");
token_trie.insert(&[1, 2, 4], "sequence_b");
token_trie.insert(&[1, 5], "sequence_c");

// Find all sequences starting with [1, 2]
let sequences = token_trie.keys_with_prefix(&[1, 2]);
// Returns: [[1, 2, 3], [1, 2, 4]]

API Reference

Core Operations

  • new() - Create a new empty trie
  • insert(&[Key], Value) - Insert a key-value pair
  • get(&[Key]) -> Option<&Value> - Get value for a key
  • remove(&[Key]) -> Option<Value> - Remove and return value for a key
  • contains_key(&[Key]) -> bool - Check if key exists

Advanced Operations

  • keys_with_prefix(&[Key]) -> Vec<Vec<Key>> - Get all keys with given prefix
  • common_prefix_length(&[Key]) -> usize - Find longest common prefix length
  • iter() -> Vec<(Vec<Key>, &Value)> - Get all key-value pairs
  • keys() -> Vec<Vec<Key>> - Get all keys
  • len() -> usize - Number of key-value pairs
  • is_empty() -> bool - Check if trie is empty

Performance Characteristics

  • Insert: O(m) where m is the key length
  • Lookup: O(m + log k) where k is the average branching factor
  • Delete: O(m + log k)
  • Prefix Search: O(p + n) where p is prefix length and n is number of matching keys
  • Space: O(total_characters) with radix compression

Algorithm Details

This implementation uses a Radix Trie (compressed trie) with the following optimizations:

  1. Path Compression: Nodes with single children are compressed into their parent
  2. Sorted Children: Child nodes are maintained in sorted order
  3. Binary Search: Child lookup uses binary search for O(log k) performance
  4. Root Constraint: The root node always has an empty key

Examples

Run the comprehensive demo:

cargo run --example demo

Testing

The implementation includes extensive test coverage:

cargo test

Test categories:

  • Basic operations (insert, get, remove)
  • Radix compression behavior
  • Children ordering after all operations
  • Binary search efficiency
  • Edge cases with various key types
  • Random operations stress testing
  • Common prefix length accuracy

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please ensure all tests pass and add tests for new features.

Commit count: 2

cargo fmt