Crates.io | common_substrings |
lib.rs | common_substrings |
version | 1.0.0 |
source | src |
created_at | 2020-04-11 23:29:22.271208 |
updated_at | 2020-04-11 23:29:22.271208 |
description | Finding all common strings |
homepage | https://github.com/hanwencheng/common_substrings_rust |
repository | https://github.com/hanwencheng/common_substrings_rust |
id | 228791 |
size | 24,056 |
A method for finding all common substrings, particularly quick for large string samples. It uses only the Rust std
library.
The algorithm uses a two-dimensional trie to collect all the fragments. The vertical dimension is a standard suffix trie; in addition, the nodes of the last word in each suffix are linked to one another, which I call virtual horizontal links.
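As a rough illustration of the vertical dimension only, here is a minimal sketch of a plain suffix trie that records which inputs pass through each node. It is not the crate's implementation: the horizontal links are left out, and the names `Node`, `build_suffix_trie`, and `sources_of` are hypothetical.

```rust
use std::collections::{BTreeSet, HashMap};

/// One node of a plain suffix trie (hypothetical type, not the crate's).
/// `sources` holds the indices of the inputs that contain the substring
/// spelled by the path from the root to this node.
#[derive(Default)]
struct Node {
    children: HashMap<char, Node>,
    sources: BTreeSet<usize>,
}

/// Insert every suffix of every input string into one shared trie.
fn build_suffix_trie(inputs: &[&str]) -> Node {
    let mut root = Node::default();
    for (idx, word) in inputs.iter().enumerate() {
        let chars: Vec<char> = word.chars().collect();
        for start in 0..chars.len() {
            let mut node = &mut root;
            for &c in &chars[start..] {
                node = node.children.entry(c).or_default();
                node.sources.insert(idx);
            }
        }
    }
    root
}

/// Walk the trie along `needle` and return the inputs that contain it.
fn sources_of(root: &Node, needle: &str) -> BTreeSet<usize> {
    let mut node = root;
    for c in needle.chars() {
        match node.children.get(&c) {
            Some(next) => node = next,
            None => return BTreeSet::new(),
        }
    }
    node.sources.clone()
}

fn main() {
    let inputs = ["java", "javascript", "typescript", "coffeescript", "coffee"];
    let trie = build_suffix_trie(&inputs);
    println!("{:?}", sources_of(&trie, "java"));   // {0, 1}
    println!("{:?}", sources_of(&trie, "script")); // {1, 2, 3}
}
```

Because every suffix is inserted from the root, any path in the trie spells a substring of at least one input, and the size of `sources` at a node tells how many inputs share that substring.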
Use the function get_substrings to get all the common substrings in a list of strings:
use common_substrings::get_substrings;
let input_strings = vec!["java", "javascript", "typescript", "coffeescript", "coffee"];
let result_substrings = get_substrings(input_strings, 2, 3);
This gives the following result list:
Substring(sources: {2, 3}, name: escript, weight: 14)
Substring(sources: {1, 0}, name: java, weight: 8)
Substring(sources: {4, 3}, name: coffee, weight: 12)
input
- The target input string vector.
min_occurrences
- The minimal number of occurrences of the captured common substrings.
min_length
- The minimal length of the captured common substrings.
Explanation here
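To make the two thresholds concrete, the following brute-force sketch enumerates every substring and keeps those that satisfy both. It assumes that an occurrence means a distinct input string containing the substring, which matches the `sources` sets in the sample output but is not confirmed by the crate. The function `naive_common_substrings` is hypothetical and is not part of the crate's API.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical helper, not part of the crate: maps every substring of at
/// least `min_length` characters to the set of input indices containing it,
/// keeping only substrings found in at least `min_occurrences` inputs.
fn naive_common_substrings(
    inputs: &[&str],
    min_occurrences: usize,
    min_length: usize,
) -> BTreeMap<String, BTreeSet<usize>> {
    let mut seen: BTreeMap<String, BTreeSet<usize>> = BTreeMap::new();
    for (idx, word) in inputs.iter().enumerate() {
        let chars: Vec<char> = word.chars().collect();
        for start in 0..chars.len() {
            // The range is empty when fewer than `min_length` characters remain.
            for end in (start + min_length.max(1))..=chars.len() {
                let sub: String = chars[start..end].iter().collect();
                seen.entry(sub).or_default().insert(idx);
            }
        }
    }
    // Apply the occurrence threshold (assumed: counted per distinct input).
    seen.retain(|_, sources| sources.len() >= min_occurrences);
    seen
}

fn main() {
    let inputs = ["java", "javascript", "typescript", "coffeescript", "coffee"];
    let found = naive_common_substrings(&inputs, 2, 3);
    for (name, sources) in &found {
        println!("{}: {:?}", name, sources);
    }
}
```

This sketch returns every qualifying fragment, including overlapping ones such as "script" and "cript". The sample output above lists only three entries, so the crate evidently narrows overlapping candidates further; that selection step is not reproduced here.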
Apache-2.0