| Crates.io | langram |
| lib.rs | langram |
| version | 0.7.0 |
| created_at | 2025-04-14 12:26:11.014442+00 |
| updated_at | 2025-08-02 16:45:10.447816+00 |
| description | Natural language detection library |
| homepage | |
| repository | https://github.com/RoDmitry/langram |
| max_upload_size | |
| id | 1632757 |
| size | 92,490 |
One language can be written in multiple scripts, so each combination is detected as a distinct ScriptLanguage (language + script).
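A minimal sketch of the idea, not langram's actual definitions: the enums, the `ScriptLanguage` struct layout, and the `guess_script` helper below are all hypothetical and exist only to show why the same language in two scripts yields two distinct results. Serbian is used because it is commonly written in both Cyrillic and Latin.

```rust
// Hypothetical types for illustration only; langram's real types may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Language {
    Serbian,
    English,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Script {
    Cyrillic,
    Latin,
}

/// A detection result is the pair (language, script), not the language alone.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ScriptLanguage {
    language: Language,
    script: Script,
}

/// Very rough script guess: counts letters from the basic Cyrillic block
/// versus ASCII Latin letters. Ties go to Cyrillic. Returns `None` when the
/// text contains neither kind of letter.
fn guess_script(text: &str) -> Option<Script> {
    let mut cyrillic = 0usize;
    let mut latin = 0usize;
    for ch in text.chars() {
        match ch {
            '\u{0400}'..='\u{04FF}' => cyrillic += 1,
            'a'..='z' | 'A'..='Z' => latin += 1,
            _ => {}
        }
    }
    match (cyrillic, latin) {
        (0, 0) => None,
        (c, l) if c >= l => Some(Script::Cyrillic),
        _ => Some(Script::Latin),
    }
}
```

Under these assumptions, "Добар дан" and "Dobar dan" would map to two different `ScriptLanguage` values that share the same `Language::Serbian`.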
Uses alphabet_detector as a word separator and language prefilter.
Based on a modified n-gram language model algorithm that combines character n-grams (1 to 5 characters) with single-word n-grams.
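To make the feature set concrete, here is a rough sketch of extracting character n-grams of length 1 to 5 together with whole words. It is an illustration under assumptions, not langram's implementation: the function names are hypothetical, and plain whitespace splitting stands in for the word separation that alphabet_detector performs.

```rust
/// Returns every character n-gram of `word` with length 1..=max_n,
/// shortest first. Works on `char`s, so multi-byte letters are safe.
fn char_ngrams(word: &str, max_n: usize) -> Vec<String> {
    let chars: Vec<char> = word.chars().collect();
    let mut out = Vec::new();
    // Never ask for n-grams longer than the word itself.
    for n in 1..=max_n.min(chars.len()) {
        for window in chars.windows(n) {
            out.push(window.iter().collect());
        }
    }
    out
}

/// Splits `text` on whitespace (an assumption made for this sketch) and
/// returns (character n-grams of length 1..=5, whole words).
fn extract_features(text: &str) -> (Vec<String>, Vec<String>) {
    let mut ngrams = Vec::new();
    let mut words = Vec::new();
    for word in text.split_whitespace() {
        ngrams.extend(char_ngrams(word, 5));
        words.push(word.to_string());
    }
    (ngrams, words)
}
```

A model would then look up each of these features per candidate language and combine the probabilities; how langram weights and combines them is not covered by this sketch.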
ModelsStorage with all models preloaded uses around 4.1GB of RAM (2.4GB using max_trigrams). Unloading each language model after use would be possible but is not implemented; it would be slower but would use around 300MB of RAM.
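The memory trade-off described above can be sketched as follows. This is purely hypothetical, since the library does not implement it: `Model`, `score_on_demand`, and the `load` callback are invented names, and a real model would hold far more than one map. The point is only that loading one model at a time keeps peak memory near the size of a single model, at the cost of reloading on every call.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one language's n-gram model.
struct Model {
    ngram_probs: HashMap<String, f64>,
}

/// Scores `ngram` against each language, loading one model at a time.
/// `load` is a caller-supplied function (for example, reading from disk).
fn score_on_demand(
    languages: &[&str],
    ngram: &str,
    load: impl Fn(&str) -> Model,
) -> Vec<(String, f64)> {
    let mut scores = Vec::with_capacity(languages.len());
    for &lang in languages {
        // Only this one model is resident in memory during the iteration.
        let model = load(lang);
        let p = model.ngram_probs.get(ngram).copied().unwrap_or(0.0);
        scores.push((lang.to_string(), p));
        // `model` goes out of scope here and its memory is released.
    }
    scores
}
```

A preloaded storage avoids the repeated `load` calls entirely, which is why it is faster and also why it needs all models in RAM at once.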
This library is a complete rewrite of Lingua: much faster, more accurate, with more languages, etc.
Comparison with other language detectors
To use it, you need to patch langram_models in Cargo.toml, either from the Git repository or from a local path:
[patch.crates-io]
langram_models = { git = "https://github.com/RoDmitry/langram_models.git" }
[patch.crates-io]
langram_models = { path = "../langram_models" }
The local path variant is more advanced: it allows you to remove model n-grams, so that the final executable is lighter.