weighted_trie

Crates.io	weighted_trie
lib.rs	weighted_trie
version	0.2.0
created_at	2023-02-11 16:43:58.325696+00
updated_at	2026-01-08 18:26:55.576259+00
description	Weighted prefix trie for autosuggestions
homepage	https://github.com/subpath/weighted_trie
repository	https://github.com/subpath/weighted_trie
max_upload_size
id	782612
size	75,103

Alexander Osipenko (subpath)

documentation

README

weighted_trie

🦀 Rust crate that allows creating weighted prefix trees that can be used in autocomplete

Released API Docs

Features

Speed-optimized: Single insert in 272ns, lookup in 244ns, bulk build (100K) in 33.7ms
Memory-efficient: Only 243 bytes per word at 1M scale
In-memory: Pure memory-based data structure, no disk I/O
Dynamic: Supports incremental inserts on-the-fly
Prefix-based: Returns all matches for a given prefix, sorted by weight
Weight-sorted: Results pre-sorted by descending weight
Configurable limits:
- Max suggestions per query (default: 10)
- Max word length (default: 100 characters)

Quickstart

To use weigthed-trie, add the following to your Cargo.toml file:

[dependencies]
weighted_trie = "0.1.0"  # NOTE: Replace to latest minor version.

Usage overview

use weighted_trie::WeightedTrie;

let mut trie = WeightedTrie::new();
// build trie with words and associated weights
trie.insert("pie", 5);
trie.insert("pita", 2);
trie.insert("pi", 1);
trie.insert("pizza", 10);

// get prefix based suggestions sorted by weight
let suggestions = trie.search("pi");
assert_eq!(suggestions, vec!["pizza", "pie", "pita", "pi"]);

let suggestions = trie.search("piz");
assert_eq!(suggestions, vec!["pizza"]);

// out of vocabulary
let suggestions = trie.search("apple");
assert_eq!(suggestions.len(), 0);

Alternatively you can use .build method

use weighted_trie::{WeightedString, WeightedTrie};

let weighted_strings = vec![
    WeightedString::new("pie", 5),
    WeightedString::new("pita", 2),
    WeightedString::new("pi", 1),
    WeightedString::new("pizza", 10),
];

let trie = WeightedTrie::build(weighted_strings);

Custom Configuration

use weighted_trie::WeightedTrie;

// Create trie with custom max word length of 50 characters
let mut trie = WeightedTrie::with_max_word_length(50);

// This succeeds
assert!(trie.insert("short", 10));

// This fails - word too long
let very_long_word = "a".repeat(51);
assert!(!trie.insert(very_long_word, 5));

// Create trie with custom max suggestions limit
let mut trie = WeightedTrie::with_max_suggestions(5);
for i in 0..20 {
    trie.insert(format!("word{}", i), 20 - i as u32);
}
assert_eq!(trie.search("word").len(), 5); // Only top 5 returned

// Configure both word length and suggestions limit
let mut trie = WeightedTrie::with_config(100, 5);

Benchmarks

Performance

Single insert:           272 ns
Lookup (per query):      244 ns
Build (100K words):      33.7 ms
Insert 100K (incremental): 32.6 ms

Memory Footprint

Dataset      Memory      Bytes/Word
------------------------------------
10K          2.4 MB      254
50K          13.0 MB     273
100K         29.5 MB     309
500K         130.8 MB    274
1M           231.4 MB    243

Run detailed memory analysis:

cargo bench --bench memory_bench

Guidelines

README.md is generated from cargo readme command. Do not manually update README.md instead edit src/lib.rs and then run cargo readme > README.md.

Optimizations

String interning: Each word stored once, nodes use indices
Packed suggestions: Weight+index packed into u64 (bit manipulation)
compact_str: 12-byte strings vs 24-byte std String
SmallVec: Stack allocation for small collections (≤2 suggestions, ≤4 children)
Arena allocation: All nodes in Vec, use indices instead of Box pointers
hashbrown HashMap: Faster than std HashMap
Pre-allocation: Vec capacity = words × 2 (avoids reallocations)
shrink_to_fit: Removes over-allocation after build
Top-K limiting: 10 suggestions per node max
Deduplication: Automatic on insert

License

License: Apache-2.0

Commit count: 36

weighted_trie

documentation

README

weighted_trie

Features

Quickstart

Usage overview

Custom Configuration

Benchmarks

Performance

Memory Footprint

Guidelines

Optimizations

License

cargo fmt