[](https://github.com/tpisto/fuzzy_prefix_search)
[](https://crates.io/crates/fuzzy_prefix_search)
[](https://docs.rs/fuzzy_prefix_search)
[](https://github.com/tpisto/fuzzy_prefix_search/actions?query=branch%3Amain)
# Fuzzy Prefix Search
Flexible Trie implementation in Rust for fuzzy prefix string searching and auto-completion.
Documentation:
- [API reference (docs.rs)](https://docs.rs/fuzzy_prefix_search)
## Features
- Prefix-based fuzzy searching (Levenshtein distance)
- Fuzzy search with customizable edit distance
- Multiple data associations per word
- Jaro-Winkler similarity scoring for search results
- Thread safe
- No dependencies
## Installation
Add this to your `Cargo.toml`:
```toml
[dependencies]
fuzzy_prefix_search = "0.2"
```
## Usage
Here's a quick example of how to use the Fuzzy Prefix Search:
```rust
use fuzzy_prefix_search::Trie;
fn main() {
let mut trie = Trie::new();
// Insert words with associated data
trie.insert("apple", 1);
trie.insert("application", 2);
trie.insert("appetite", 3);
// Perform a fuzzy search
let results = trie.search_within_distance("appl", 1);
for result in results {
println!("Word: {}, Data: {:?}", result.word, result.data);
}
// Search with similarity scoring
let scored_results = trie.search_within_distance_scored("aple", 2);
for result in scored_results {
println!("Word: {}, Data: {:?}, Score: {}", result.word, result.data, result.score);
}
}
```
## Advanced Usage
### Custom Data Types
The Trie supports any data type that implements `Clone`, `PartialEq`, `Eq`, and `Hash`:
```rust
#[derive(Clone, PartialEq, Eq, Hash)]
struct CustomData {
id: u32,
value: String,
}
let mut trie = Trie::new();
trie.insert("example", CustomData { id: 1, value: "Test".to_string() });
```
### Removing Data
You can remove all occurrences of a specific data value:
```rust
trie.remove_all(&2);
```
## Performance
- O(k) time complexity for insertion, where k is the length of the word
- Space-efficient storage using a tree structure with shared prefixes
- TODO: Benchmarks, optimizations, algorithm selection...
Caveat Emptor: we use *unsafe* in deletes for 2x read performance compared to Rc/RefCell approach.
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## Resources
- https://phiresky.github.io/levenshtein-demo/
- https://www.geeksforgeeks.org/trie-insert-and-search/
- http://stevehanov.ca/blog/?id=114
- https://murilo.wordpress.com/2011/02/01/fast-and-easy-levenshtein-distance-using-a-trie-in-c/
- https://blog.vjeux.com/2011/c/c-fuzzy-search-with-trie.html
## License
This project is licensed under the MIT or Apache 2.0 License - see the [LICENSE](LICENSE) files for details.