Crates.io | rust_readability |
lib.rs | rust_readability |
version | 0.2.0 |
source | src |
created_at | 2022-08-29 03:42:24.649401 |
updated_at | 2024-07-17 03:52:14.846167 |
description | A package to assess the complexity of texts using a variety of readability formulas. |
homepage | |
repository | https://github.com/ian-nai/rust_readability |
max_upload_size | |
id | 654155 |
size | 18,606 |
This is a Rust package for assessing the complexity of texts using a variety of well-established readability formulas. The package includes implementations of the Lix, Rix, Flesch, Flesch-Kincaid, Linsear Write, Coleman-Liau, and Automated Readability Index methods.
Each method has a corresponding function. When given either a file or a string, each function prints and returns the corresponding readability index. Call each function like so:
use rust_readability::lix;
lix("path/to/file.txt");
// or, for a string: lix_string("your string");
To remove stopwords before assessing the readability of a text, use the functions stopwords_file (to remove stopwords from a file) and stopwords_string (to remove stopwords from a string), like so:
use rust_readability::stopwords_string;
stopwords_string("this is an example string", "[your language]");
use rust_readability::stopwords_file;
stopwords_file("path/to/file.txt", "[your language]");
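To illustrate what stopword removal does to a text before it is scored, here is a minimal sketch. It assumes a tiny hand-written English list; `STOPWORDS` and `remove_stopwords` are hypothetical names used only for this example and are not part of rust_readability, whose own word lists and matching rules may differ.

```rust
use std::collections::HashSet;

// Hypothetical, deliberately tiny stopword list (illustration only).
const STOPWORDS: [&str; 4] = ["this", "is", "an", "the"];

/// Returns `text` with every word found in STOPWORDS removed.
/// Matching is case-insensitive and words are split on whitespace.
fn remove_stopwords(text: &str) -> String {
    let stop: HashSet<&str> = STOPWORDS.iter().copied().collect();
    text.split_whitespace()
        .filter(|word| !stop.contains(word.to_lowercase().as_str()))
        .collect::<Vec<_>>()
        .join(" ")
}
```

With this list, `remove_stopwords("this is an example string")` yields `"example string"`, which shortens sentences and so changes any sentence-length-based score.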
Stopwords can be removed for any language included in NLTK's stopwords lists.
Lix is a readability measure developed by Carl-Hugo Björnsson. For a given text, the Lix value is the sum of the average sentence length in words and the percentage of words longer than six letters.
lix("path/to/file.txt");
lix_string("your string");
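The Lix definition can be sketched in a few lines. This is an illustration under simple assumptions (sentences end at `.`, `!` or `?`; words are whitespace-separated; only alphabetic characters count as letters), and `lix_sketch` is a hypothetical name, not the crate's implementation.

```rust
/// Lix = (words / sentences) + 100 * (long words / words),
/// where a long word has more than six letters.
fn lix_sketch(text: &str) -> f64 {
    // Naive sentence split; empty fragments are ignored.
    let sentences = text
        .split(|c: char| c == '.' || c == '!' || c == '?')
        .filter(|s| !s.trim().is_empty())
        .count() as f64;
    // Letter count of each whitespace-separated word (punctuation excluded).
    let letters_per_word: Vec<usize> = text
        .split_whitespace()
        .map(|w| w.chars().filter(|c| c.is_alphabetic()).count())
        .filter(|&n| n > 0)
        .collect();
    let words = letters_per_word.len() as f64;
    let long_words = letters_per_word.iter().filter(|&&n| n > 6).count() as f64;
    if words == 0.0 || sentences == 0.0 {
        return 0.0; // Nothing to score.
    }
    words / sentences + 100.0 * long_words / words
}
```

For "The cat sat. Elephants wandered." there are 5 words, 2 sentences, and 2 long words, so the sketch returns 5/2 + 100 * 2/5 = 42.5.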
Also developed by Björnsson, the Rix value is the number of longer words (>6 letters) divided by the number of sentences.
rix("path/to/file.txt");
rix_string("your string");
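As a rough illustration of the Rix ratio, the sketch below counts long words and sentences with the same simple assumptions one might make by hand (sentences end at `.`, `!` or `?`; only alphabetic characters count as letters). `rix_sketch` is a hypothetical name and the crate's own tokenization may differ.

```rust
/// Rix = (words with more than six letters) / (number of sentences).
fn rix_sketch(text: &str) -> f64 {
    // Naive sentence split; empty fragments are ignored.
    let sentences = text
        .split(|c: char| matches!(c, '.' | '!' | '?'))
        .filter(|s| !s.trim().is_empty())
        .count();
    // A long word has more than six alphabetic characters.
    let long_words = text
        .split_whitespace()
        .filter(|w| w.chars().filter(|c| c.is_alphabetic()).count() > 6)
        .count();
    if sentences == 0 {
        return 0.0; // Nothing to score.
    }
    long_words as f64 / sentences as f64
}
```

For "The cat sat. Elephants wandered." the sketch finds 2 long words in 2 sentences and returns 1.0.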
The Flesch reading-ease test uses the average number of words per sentence and the average number of syllables per word to produce a score that typically ranges from 0 to 100, with lower scores indicating text that is harder to read.
flesch("path/to/file.txt");
flesch_string("your string"); // string variant; name assumed to follow the lix_string pattern
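For reference, the standard Flesch reading-ease formula can be written directly from word, sentence, and syllable counts. The function name `flesch_reading_ease` is hypothetical and the counts are assumed to be computed elsewhere; how rust_readability counts syllables is not shown here.

```rust
/// Flesch reading ease:
/// 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words).
/// Assumes `words` and `sentences` are both non-zero.
fn flesch_reading_ease(words: usize, sentences: usize, syllables: usize) -> f64 {
    let words_per_sentence = words as f64 / sentences as f64;
    let syllables_per_word = syllables as f64 / words as f64;
    206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
}
```

A 100-word passage with 5 sentences and 150 syllables scores 206.835 - 20.3 - 126.9 = 59.635.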
Flesch-Kincaid presents the Flesch reading-ease test as a grade level.
flesch_kincaid("path/to/file.txt");
flesch_kincaid_string("your string"); // string variant; name assumed to follow the lix_string pattern
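The standard Flesch-Kincaid grade-level formula uses the same two averages as the reading-ease test with different weights. In this sketch, `flesch_kincaid_grade` is a hypothetical name and the counts are assumed to be supplied by the caller.

```rust
/// Flesch-Kincaid grade level:
/// 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59.
/// Assumes `words` and `sentences` are both non-zero.
fn flesch_kincaid_grade(words: usize, sentences: usize, syllables: usize) -> f64 {
    let words_per_sentence = words as f64 / sentences as f64;
    let syllables_per_word = syllables as f64 / words as f64;
    0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
}
```

A 100-word passage with 5 sentences and 150 syllables gives 7.8 + 17.7 - 15.59 = 9.91, roughly a tenth-grade level.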
Linsear Write is a readability metric developed specifically for technical and scientific texts. It evaluates the syllable count of each word and the number of sentences in a 100-word sample to produce a grade-level metric.
linsear_write("path/to/file.txt");
linsear_write_string("your string");
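The usual Linsear Write procedure scores each word in the sample as easy (one or two syllables, 1 point) or hard (three or more syllables, 3 points), divides the total by the number of sentences, and then adjusts the result. The sketch below follows that commonly published procedure; `linsear_write_grade` is a hypothetical name, the per-word syllable counts are assumed to be given, and the crate's own implementation may differ in detail.

```rust
/// Linsear Write grade from per-word syllable counts and a sentence count.
/// Assumes `sentences` is non-zero and the slice holds the sampled words.
fn linsear_write_grade(syllables_per_word: &[usize], sentences: usize) -> f64 {
    // Easy words (1-2 syllables) score 1 point, hard words (3+) score 3.
    let points: usize = syllables_per_word
        .iter()
        .map(|&s| if s >= 3 { 3 } else { 1 })
        .sum();
    let provisional = points as f64 / sentences as f64;
    // Above 20: halve. Otherwise: halve and subtract 1.
    if provisional > 20.0 {
        provisional / 2.0
    } else {
        provisional / 2.0 - 1.0
    }
}
```

For example, eight easy words and two hard words in a single sentence give 14 points, a provisional score of 14, and a grade of 14 / 2 - 1 = 6.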
The Coleman-Liau index relies on the average word length in letters and average sentence length in words to produce a grade-level metric.
coleman_liau("path/to/file.txt");
coleman_liau_string("your string"); // string variant; name assumed to follow the lix_string pattern
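The standard Coleman-Liau formula works on letters and sentences per 100 words. In this sketch the counts are assumed to be supplied by the caller, and `coleman_liau_index` is a hypothetical name rather than the crate's API.

```rust
/// Coleman-Liau index: 0.0588 * L - 0.296 * S - 15.8, where
/// L = average letters per 100 words and S = average sentences per 100 words.
/// Assumes `words` is non-zero.
fn coleman_liau_index(letters: usize, words: usize, sentences: usize) -> f64 {
    let l = letters as f64 / words as f64 * 100.0;
    let s = sentences as f64 / words as f64 * 100.0;
    0.0588 * l - 0.296 * s - 15.8
}
```

A 100-word passage with 450 letters and 5 sentences gives 26.46 - 1.48 - 15.8 = 9.18.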
The automated readability index (ARI) similarly relies on the average number of characters per word and the average number of words per sentence to produce a score that approximates the grade level needed to understand the text.
ari("path/to/file.txt");
ari_string("your string"); // string variant; name assumed to follow the lix_string pattern
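The standard ARI formula can be computed directly from character, word, and sentence counts. The name `ari_score` is hypothetical and the counts are assumed to be computed elsewhere.

```rust
/// Automated readability index:
/// 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43.
/// Assumes `words` and `sentences` are both non-zero.
fn ari_score(characters: usize, words: usize, sentences: usize) -> f64 {
    let chars_per_word = characters as f64 / words as f64;
    let words_per_sentence = words as f64 / sentences as f64;
    4.71 * chars_per_word + 0.5 * words_per_sentence - 21.43
}
```

A 100-word passage with 450 characters and 5 sentences gives 21.195 + 10 - 21.43 = 9.765.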
There are two ways to install rust_readability on your system: through cargo or via source code.
rust_readability can be installed using Cargo, Rust's package manager. To add it as a dependency of your project, run the following command from the project directory:
cargo add rust_readability
To install from the source code, you can clone the repository:
git clone https://github.com/ian-nai/rust_readability.git