langram

Crates.io	langram
lib.rs	langram
version	0.11.0
created_at	2025-04-14 12:26:11.014442+00
updated_at	2026-01-22 16:43:04.660446+00
description	Natural language detection library
homepage
repository	https://github.com/RoDmitry/langram
max_upload_size
id	1632757
size	77,155

Dmitry Rodionov (RoDmitry)

documentation

https://docs.rs/langram/

README

Langram - the most accurate language detection library

346 ScriptLanguages (188 models + 158 scripts with no models)

Usage examples in docs.rs.

One language can be written in multiple scripts, and each combination is detected as a separate ScriptLanguage (language + script).

Uses alphabet_detector as a word separator + language prefilter.

Based on a modified n-gram language model algorithm that uses character n-grams (1 to 5 characters) and single-word n-grams.
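To make the character n-gram idea concrete, here is a minimal sketch of how the n-grams of length 1 to 5 could be enumerated for one word. The function name `char_ngrams` is hypothetical and is not part of the langram API; the library's actual extraction and scoring are not described here and may differ.

```rust
/// Collects every character n-gram of length 1..=max_n from a single word.
/// Iterates over `char`s rather than bytes, so multi-byte scripts are handled.
/// Hypothetical helper for illustration only, not a langram function.
fn char_ngrams(word: &str, max_n: usize) -> Vec<String> {
    let chars: Vec<char> = word.chars().collect();
    let mut ngrams = Vec::new();
    // Never ask for n-grams longer than the word itself.
    for n in 1..=max_n.min(chars.len()) {
        for window in chars.windows(n) {
            ngrams.push(window.iter().collect());
        }
    }
    ngrams
}

fn main() {
    let ngrams = char_ngrams("слово", 5);
    println!("{} n-grams: {:?}", ngrams.len(), ngrams);
}
```

A word of k characters yields k unigrams, k - 1 bigrams, and so on, which is why short words contribute only a handful of features.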

RAM requirements are low. The process may map up to the full size of the provided models binary file, but this memory is shared (virtual address space, mmap), so that amount of RAM does not have to be available. If the whole models file cannot be cached in RAM, however, its speed will be affected.

This library is a complete rewrite of Lingua: much faster, more accurate, more languages, etc.

It is also more accurate than Whatlang and Whichlang. More info in the Comparison with other language detectors.

To better understand the accuracy of different modes, look into the Accuracy report.

Setup

To use this library, you need a binary models file. Place it next to the executable, or set LANGRAM_MODELS_PATH to its location.

It can be:

Downloaded from langram_models releases;
Built from langram_models (recommended for big-endian targets). This option is more advanced and lets you remove model n-grams and recompile, so that the models binary is lighter.
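The lookup order described in this section (an explicit LANGRAM_MODELS_PATH first, otherwise a file next to the executable) can be sketched as follows, for example to check a deployment before starting the detector. This is an illustration under assumptions: that LANGRAM_MODELS_PATH is read as an environment variable at run time, and that the file is called `models.bin`, which is a placeholder name. The function `resolve_models_path` is hypothetical and not part of langram, whose own lookup may work differently.

```rust
use std::env;
use std::path::PathBuf;

/// Picks a models file location: an explicit, non-empty override wins,
/// otherwise the file is expected in the executable's directory.
/// Hypothetical helper; `file_name` is a placeholder, not the real file name.
fn resolve_models_path(
    override_path: Option<String>,
    exe_dir: PathBuf,
    file_name: &str,
) -> PathBuf {
    match override_path {
        Some(p) if !p.is_empty() => PathBuf::from(p),
        _ => exe_dir.join(file_name),
    }
}

fn main() -> std::io::Result<()> {
    // Assumption: LANGRAM_MODELS_PATH is an environment variable.
    let override_path = env::var("LANGRAM_MODELS_PATH").ok();
    let exe = env::current_exe()?;
    let exe_dir = exe.parent().map(|p| p.to_path_buf()).unwrap_or_default();
    // "models.bin" is a placeholder file name for this sketch.
    let path = resolve_models_path(override_path, exe_dir, "models.bin");
    println!("models file expected at: {}", path.display());
    Ok(())
}
```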

Commit count: 85