langram

Crates.io: langram
lib.rs: langram
version: 0.7.0
created_at: 2025-04-14 12:26:11.014442+00
updated_at: 2025-08-02 16:45:10.447816+00
description: Natural language detection library
repository: https://github.com/RoDmitry/langram
id: 1632757
size: 92,490
Dmitry Rodionov (RoDmitry)

documentation: https://docs.rs/langram/

README

Langram - the most accurate language detection library

Crate API

314 ScriptLanguages (187 models + 127 single language scripts)

A language can be written in multiple scripts; each combination is detected as a separate ScriptLanguage (language + script).
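The language + script pairing can be pictured with a small sketch. The types and the function below (`Language`, `Script`, `ScriptLanguagePair`, `script_languages_for`) are hypothetical names invented for illustration; they are not langram's actual definitions, and the two-language list is only an example.

```rust
// Hypothetical types for illustration only; NOT langram's real API.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Language {
    English,
    Serbian,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Script {
    Cyrillic,
    Latin,
}

/// One detectable unit: a language written in a specific script.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ScriptLanguagePair {
    language: Language,
    script: Script,
}

/// Lists every (language, script) pair for a language in this toy example.
/// Serbian is written in both Cyrillic and Latin, so it yields two pairs.
fn script_languages_for(language: Language) -> Vec<ScriptLanguagePair> {
    let scripts: &[Script] = match language {
        Language::English => &[Script::Latin],
        Language::Serbian => &[Script::Cyrillic, Script::Latin],
    };
    scripts
        .iter()
        .map(|&script| ScriptLanguagePair { language, script })
        .collect()
}
```

Under this model, Serbian text in Cyrillic and Serbian text in Latin would be reported as two different results, which is why the number of detectable units can exceed the number of languages.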

Uses alphabet_detector as both a word separator and a language prefilter.

Based on a modified n-gram language model algorithm that uses character n-grams (1 to 5 characters) and single-word n-grams.
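To make the n-gram sizes concrete, here is a minimal sketch of extracting character n-grams of 1 to 5 characters from each word. It is not langram's implementation: `char_ngrams` and `text_features` are hypothetical helpers, and `split_whitespace` is a simple stand-in for the word separation that the library delegates to alphabet_detector.

```rust
/// Collects every character n-gram of length `min_len..=max_len` from `word`.
/// Works on Unicode scalar values, so non-ASCII words are handled correctly.
fn char_ngrams(word: &str, min_len: usize, max_len: usize) -> Vec<String> {
    let chars: Vec<char> = word.chars().collect();
    let mut ngrams = Vec::new();
    for n in min_len..=max_len {
        // Skip sizes that cannot produce a window (`windows(0)` would panic).
        if n == 0 || n > chars.len() {
            continue;
        }
        for window in chars.windows(n) {
            ngrams.push(window.iter().collect());
        }
    }
    ngrams
}

/// Pairs each whitespace-separated word (the single-word n-gram) with its
/// character n-grams of 1 to 5 characters.
fn text_features(text: &str) -> Vec<(String, Vec<String>)> {
    text.split_whitespace()
        .map(|word| (word.to_string(), char_ngrams(word, 1, 5)))
        .collect()
}
```

For example, `char_ngrams("abc", 1, 5)` returns `a`, `b`, `c`, `ab`, `bc`, `abc`; sizes 4 and 5 contribute nothing because the word is too short. A real model would then look up a probability for each of these n-grams per language.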

ModelsStorage with all models preloaded uses around 4.1GB of RAM (2.4GB using max_trigrams). It would be possible (not yet implemented) to unload each language model after use; this would be slower but would use only around 300MB of RAM.

This library is a complete rewrite of Lingua: much faster, more accurate, with more languages, etc.

Accuracy report

Comparison with other language detectors

Setup

To use it, you need to patch langram_models in Cargo.toml:

  • From Git:
[patch.crates-io]
langram_models = { git = "https://github.com/RoDmitry/langram_models.git" }

  • From a local copy:
[patch.crates-io]
langram_models = { path = "../langram_models" }

The local copy option is more advanced: it lets you remove model n-grams so that the final executable is lighter.
