Crates.io | untanglr |
lib.rs | untanglr |
version | 1.1.0 |
source | src |
created_at | 2021-07-22 20:22:53.611173 |
updated_at | 2022-09-02 21:28:50.292209 |
description | Probabilistically split concatenated words using NLP based on English Wikipedia unigram frequencies. |
homepage | |
repository | https://github.com/abutnaru/untanglr |
max_upload_size | |
id | 426018 |
size | 4,415,695 |
Untanglr takes in some mangled words and makes sense of them so you don't have to. It goes through the input and splits it probabilistically into words. The crate includes both a bin.rs and a lib.rs, so it can be used either as a command-line utility or as a library in your own code.
Pass the tangled words as a CLI argument:
$ untanglr thequickbrownfoxjumpedoverthelazydog
the quick brown fox jumped over the lazy dog
Or use it in your projects:
extern crate untanglr;

fn main() {
    // Build the language model once, then reuse it for every input.
    let lm = untanglr::LanguageModel::new();
    println!("{:?}", lm.untangle("helloworld"));
}
To install untanglr as a command-line tool, make sure cargo is installed and run:
$ cargo install untanglr
Note: Don't be discouraged if this project hasn't been updated in a while. I will address potential issues, but the crate does not need regular updates.
I developed this project around Derek Anderson's wordninja Python implementation, as a way to practice Rust while producing something useful.
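For readers curious how frequency-based splitting of this kind can work, here is a minimal sketch of a wordninja-style approach. It is not untanglr's actual source. The function names `build_costs` and `split_words`, the constant `UNKNOWN_CHAR_COST`, and the idea of passing in a small ranked word list are all assumptions made for illustration. Each word gets a cost derived from its frequency rank (following Zipf's law), and dynamic programming picks the split with the lowest total cost.

```rust
use std::collections::HashMap;

/// Hypothetical penalty for a single character that is not a known word.
/// It keeps the total cost finite so the rest of the input still splits sensibly.
const UNKNOWN_CHAR_COST: f64 = 100.0;

/// Build a cost table from a word list ordered from most to least frequent.
/// Assumed cost for the word at 0-based rank i: ln((i + 1) * ln(N)).
fn build_costs(ranked_words: &[&str]) -> (HashMap<String, f64>, usize) {
    let n = ranked_words.len() as f64;
    let mut costs = HashMap::new();
    let mut max_len = 0;
    for (i, w) in ranked_words.iter().enumerate() {
        costs.insert(w.to_string(), ((i as f64 + 1.0) * n.ln()).ln());
        max_len = max_len.max(w.len());
    }
    (costs, max_len)
}

/// Split `input` into the lowest-cost sequence of words.
/// Assumes ASCII input; anything else is returned unsplit.
fn split_words(input: &str, costs: &HashMap<String, f64>, max_len: usize) -> Vec<String> {
    if !input.is_ascii() {
        return vec![input.to_string()];
    }
    let s = input.to_lowercase();
    let n = s.len();

    // best[i] = (lowest cost of splitting s[..i], length of the last word in that split)
    let mut best: Vec<(f64, usize)> = vec![(0.0, 0)];
    for i in 1..=n {
        let mut cand = (f64::INFINITY, 1);
        for k in 1..=max_len.min(i) {
            let piece = &s[i - k..i];
            let piece_cost = costs.get(piece).copied().unwrap_or(if k == 1 {
                UNKNOWN_CHAR_COST
            } else {
                f64::INFINITY
            });
            let total = best[i - k].0 + piece_cost;
            if total < cand.0 {
                cand = (total, k);
            }
        }
        best.push(cand);
    }

    // Walk backwards through the table to recover the chosen words.
    let mut words = Vec::new();
    let mut i = n;
    while i > 0 {
        let k = best[i].1;
        words.push(s[i - k..i].to_string());
        i -= k;
    }
    words.reverse();
    words
}
```

With a cost table built from a ranked list that contains `hello` and `world`, calling `split_words("helloworld", &costs, max_len)` returns the two words `hello` and `world`. A real model would load a large frequency-ranked vocabulary instead of a hand-written list.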