Crates.io | rust-fuzzy-search |
lib.rs | rust-fuzzy-search |
version | 0.1.1 |
source | src |
created_at | 2021-07-14 12:37:53.029613 |
updated_at | 2021-07-14 13:04:48.936373 |
description | Fuzzy Search with trigrams implemented in Rust |
homepage | https://gitlab.com/EnricoCh/rust-fuzzy-search |
repository | https://gitlab.com/EnricoCh/rust-fuzzy-search |
id | 422697 |
size | 1,427,037 |
This crate implements fuzzy searching with trigrams.
Fuzzy searching compares strings by similarity rather than by equality: similar strings get a high score (close to 1.0f32), while dissimilar strings get a lower score (closer to 0.0f32).
Fuzzy searching tolerates changes in word order: for example, "John Dep" and "Dep John" will get a high score.
The algorithm is taken from: https://dev.to/kaleman15/fuzzy-searching-with-postgresql-97o
Basic idea:
First, extract all groups of 3 adjacent letters (trigrams) from both strings. For example, "House" becomes ['  H', ' Ho', 'Hou', 'ous', 'use', 'se '].
Note the 2 spaces added to the head of the string and the 1 added to the tail; they make the algorithm work on zero-length words.
Then count the trigrams of the first word that are also present in the second word, and divide by the number of trigrams of the first word.
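The steps above can be sketched in plain Rust using only the standard library. This is a minimal illustration of the description, not the crate's actual source: the helper names `trigrams` and `similarity` are hypothetical, and the choice to pad the whole string (rather than each word) and to count repeated trigrams individually is an assumption.

```rust
use std::collections::HashSet;

/// Hypothetical helper: pads `s` with two leading spaces and one trailing
/// space, then returns every window of 3 adjacent characters.
fn trigrams(s: &str) -> Vec<String> {
    // Work on chars rather than bytes so non-ASCII letters stay intact.
    let padded: Vec<char> = format!("  {} ", s).chars().collect();
    padded.windows(3).map(|w| w.iter().collect()).collect()
}

/// Hypothetical helper: fraction of the trigrams of `a` that also occur in `b`.
/// Thanks to the padding, `a` always has at least one trigram, even when empty.
fn similarity(a: &str, b: &str) -> f32 {
    let ta = trigrams(a);
    let tb: HashSet<String> = trigrams(b).into_iter().collect();
    let shared = ta.iter().filter(|t| tb.contains(*t)).count();
    shared as f32 / ta.len() as f32
}

fn main() {
    println!("{:?}", trigrams("House"));
    // 7 of the 9 trigrams of the first string occur in the second.
    println!("{}", similarity("John Dep", "Dep John"));
    // The score divides by the trigram count of the first argument only,
    // so swapping the arguments can change the result.
    println!("{}", similarity("bulko", "kolbasobulko"));
    println!("{}", similarity("kolbasobulko", "bulko"));
}
```

Under these assumptions the measure is asymmetric: a short query found inside a long string scores higher than the long string compared against the short query.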
Example: Comparing 2 strings
fn test() {
    use rust_fuzzy_search::fuzzy_compare;
    // Compare a string with itself.
    let score: f32 = fuzzy_compare("kolbasobulko", "kolbasobulko");
    println!("score = {:?}", score);
}
Example: Comparing a string with a list of strings and retrieving only the best matches
fn test() {
    use rust_fuzzy_search::fuzzy_search_best_n;
    // The string to search for.
    let s = "bulko";
    // The candidates to compare against.
    let list: Vec<&str> = vec![
        "kolbasobulko",
        "sandviĉo",
        "ŝatas",
        "domo",
        "emuo",
        "fabo",
        "fazano",
    ];
    // Keep only the 3 best matches.
    let n: usize = 3;
    let res: Vec<(&str, f32)> = fuzzy_search_best_n(s, &list, n);
    for (_word, score) in res {
        println!("{:?}", score);
    }
}
Example: if you have a Vec of Strings, you need to convert it to a list of &str
fn works_with_strings() {
    use rust_fuzzy_search::fuzzy_search;
    let s = String::from("varma");
    let list: Vec<String> = vec![String::from("varma vetero"), String::from("varma ĉokolado")];
    // Borrow each String as a &str before passing the list to fuzzy_search.
    let refs: Vec<&str> = list.iter().map(String::as_str).collect();
    fuzzy_search(&s, &refs);
}
The crate exposes 5 main functions:
fuzzy_compare will take 2 strings and return a score representing how similar those strings are.
fuzzy_search applies fuzzy_compare to a list of strings and returns a list of tuples: (word, score).
fuzzy_search_sorted is similar to fuzzy_search but orders the output in descending order.
fuzzy_search_threshold will take an additional f32 as input and returns only tuples with a score greater than the threshold.
fuzzy_search_best_n will take an additional usize argument n and returns the first n tuples.
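How the four search functions relate to each other can be sketched with the standard library alone. Everything below is a hypothetical illustration rather than the crate's code: `toy_score` is a deliberately simple stand-in for fuzzy_compare (it is not a trigram score), the function names are invented, and the assumption that the best-n variant sorts before truncating is inferred from the "best matches" example above.

```rust
// Toy stand-in for a similarity function (NOT the crate's trigram score):
// the fraction of characters of `a` that also occur somewhere in `b`.
fn toy_score(a: &str, b: &str) -> f32 {
    let total = a.chars().count();
    if total == 0 {
        return 0.0;
    }
    let hits = a.chars().filter(|c| b.contains(*c)).count();
    hits as f32 / total as f32
}

// Score every candidate, keeping the input order.
fn search<'a>(s: &str, list: &[&'a str]) -> Vec<(&'a str, f32)> {
    list.iter().map(|&w| (w, toy_score(s, w))).collect()
}

// Same as `search`, but ordered by score, highest first.
fn search_sorted<'a>(s: &str, list: &[&'a str]) -> Vec<(&'a str, f32)> {
    let mut res = search(s, list);
    res.sort_by(|a, b| b.1.total_cmp(&a.1));
    res
}

// Keep only the tuples whose score is strictly greater than `threshold`.
fn search_threshold<'a>(s: &str, list: &[&'a str], threshold: f32) -> Vec<(&'a str, f32)> {
    search(s, list)
        .into_iter()
        .filter(|&(_, score)| score > threshold)
        .collect()
}

// Assumed behavior: sort by score first, then keep the first `n` tuples.
fn search_best_n<'a>(s: &str, list: &[&'a str], n: usize) -> Vec<(&'a str, f32)> {
    search_sorted(s, list).into_iter().take(n).collect()
}

fn main() {
    let list = ["kolbasobulko", "domo", "fabo"];
    println!("{:?}", search("bulko", &list));
    println!("{:?}", search_sorted("bulko", &list));
    println!("{:?}", search_threshold("bulko", &list, 0.3));
    println!("{:?}", search_best_n("bulko", &list, 1));
}
```

Each variant is a thin layer over the unsorted search, so swapping in a different scoring function would leave the sorting, filtering, and truncation logic unchanged.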
To use this crate, first add this to your Cargo.toml:
[dependencies]
rust-fuzzy-search = { git = "https://gitlab.com/EnricoCh/rust-fuzzy-search" }
Next, add this to your crate:
// `extern crate` is only required on the 2015 edition; later editions can omit it.
extern crate rust_fuzzy_search;
use rust_fuzzy_search::*;

fn main() {
    // ...
}
To build the library, run cargo build
To test the library, run cargo test
Licensed under either of
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.