| Crates.io | rustrict |
| lib.rs | rustrict |
| version | 0.7.36 |
| created_at | 2021-09-12 20:54:28.616894+00 |
| updated_at | 2025-08-06 16:57:36.85222+00 |
| description | rustrict is a profanity filter for Rust |
| homepage | |
| repository | https://github.com/finnbear/rustrict/ |
| max_upload_size | |
| id | 450196 |
| size | 1,471,475 |
rustrict is a profanity filter for Rust.
Disclaimer: Multiple source files (.txt, .csv, .rs test cases) contain profanity. Viewer discretion is advised.
Input can be a `&str` or an `Iterator<Item = char>`. Optional `context`, `customize`, and `width` features extend the core filter, and no `regex` is used (a custom trie is used instead), keeping it fast in both release mode and debug mode.

Usage with strings (`&str`):

```rust
use rustrict::CensorStr;

let censored: String = "hello crap".censor();
let inappropriate: bool = "f u c k".is_inappropriate();

assert_eq!(censored, "hello c***");
assert!(inappropriate);
```
Usage with iterators (`Iterator<Item = char>`):

```rust
use rustrict::CensorIter;

let censored: String = "hello crap".chars().censor().collect();

assert_eq!(censored, "hello c***");
```
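Because the adapter yields plain `char`s, it composes with ordinary iterator combinators. A small illustrative sketch (the uppercasing step is ours, not part of the crate):

```rust
use rustrict::CensorIter;

// Censoring chains with other iterator adapters; the
// uppercasing here is purely illustrative.
let shouty: String = "hello crap"
    .chars()
    .censor()
    .map(|c| c.to_ascii_uppercase())
    .collect();

assert_eq!(shouty, "HELLO C***");
```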
By constructing a `Censor`, one can avoid scanning the text multiple times to get a censored `String` and/or answer multiple `is` queries. This also opens up more customization options (the defaults are shown below).
```rust
use rustrict::{Censor, Type};

let (censored, analysis) = Censor::from_str("123 Crap")
    .with_censor_threshold(Type::INAPPROPRIATE)
    .with_censor_first_character_threshold(Type::OFFENSIVE & Type::SEVERE)
    .with_ignore_false_positives(false)
    .with_ignore_self_censoring(false)
    .with_censor_replacement('*')
    .censor_and_analyze();

assert_eq!(censored, "123 C***");
assert!(analysis.is(Type::INAPPROPRIATE));
assert!(analysis.isnt(Type::PROFANE & Type::SEVERE | Type::SEXUAL));
```
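If only the analysis is needed, the censored `String` can be skipped entirely. A minimal sketch, assuming the `analyze` method behaves like `censor_and_analyze` without producing output:

```rust
use rustrict::{Censor, Type};

// Analysis only: no censored String is allocated (assumes
// Censor::analyze; verify against the current docs).
let analysis: Type = Censor::from_str("123 Crap").analyze();

assert!(analysis.is(Type::INAPPROPRIATE));
```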
If you cannot afford to let anything slip through, or have reason to believe a particular user is trying to evade the filter, you can check whether their input matches a short list of known-safe strings:
```rust
use rustrict::{CensorStr, Type};

// Figure out if a user is trying to evade the filter.
assert!("pron".is(Type::EVASIVE));
assert!("porn".isnt(Type::EVASIVE));

// Only let safe messages through.
assert!("Hello there!".is(Type::SAFE));
assert!("nice work.".is(Type::SAFE));
assert!("yes".is(Type::SAFE));
assert!("NVM".is(Type::SAFE));
assert!("gtg".is(Type::SAFE));
assert!("not a common phrase".isnt(Type::SAFE));
```
If you want to add custom profanities or safe words, enable the `customize` feature.
```rust
#[cfg(feature = "customize")]
{
    use rustrict::{add_word, CensorStr, Type};

    // You must take care not to call these when the crate is being
    // used in any other way (to avoid concurrent mutation).
    unsafe {
        add_word("reallyreallybadword", (Type::PROFANE & Type::SEVERE) | Type::MEAN);
        add_word("mybrandname", Type::SAFE);
    }

    assert!("Reallllllyreallllllybaaaadword".is(Type::PROFANE));
    assert!("MyBrandName".is(Type::SAFE));
}
```
If your use-case is chat moderation and you store data on a per-user basis, you can use `rustrict::Context` as a reference implementation:
```rust
#[cfg(feature = "context")]
{
    use rustrict::{BlockReason, Context};
    use std::time::Duration;

    pub struct User {
        context: Context,
    }

    let mut bob = User {
        context: Context::default(),
    };

    // Ok messages go right through.
    assert_eq!(bob.context.process(String::from("hello")), Ok(String::from("hello")));

    // Bad words are censored.
    assert_eq!(bob.context.process(String::from("crap")), Ok(String::from("c***")));

    // Reports can be taken from other users. (After many reports or
    // inappropriate messages, only known-safe messages get through.)
    for _ in 0..5 {
        bob.context.report();
    }

    // Once many bad words are used or reports are made, the first
    // letter of future bad words gets censored too.
    assert_eq!(bob.context.process(String::from("crap")), Ok(String::from("****")));

    // Users can also be manually muted.
    bob.context.mute_for(Duration::from_secs(2));
    assert!(matches!(bob.context.process(String::from("anything")), Err(BlockReason::Muted(_))));
}
```
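In a real chat server, one `Context` would typically be kept per connected user. A minimal sketch of that bookkeeping (the map and user id are ours, not part of the crate):

```rust
#[cfg(feature = "context")]
{
    use rustrict::Context;
    use std::collections::HashMap;

    // Hypothetical per-user storage: one Context per user id,
    // created lazily on the user's first message.
    let mut contexts: HashMap<u64, Context> = HashMap::new();

    let user_id: u64 = 42;
    let context = contexts.entry(user_id).or_default();

    assert_eq!(context.process(String::from("hello")), Ok(String::from("hello")));
}
```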
To compare filters, the first 100,000 items of this list are used as a dataset. Positive accuracy is the percentage of profanity detected as profanity. Negative accuracy is the percentage of clean text detected as clean.
| Crate | Accuracy | Positive Accuracy | Negative Accuracy | Time |
|---|---|---|---|---|
| rustrict | 79.99% | 94.02% | 76.49% | 10s |
| censor | 76.16% | 72.76% | 77.01% | 23s |
| stfu | 91.74% | 77.69% | 95.25% | 45s |
| profane-rs | 80.47% | 73.79% | 82.14% | 52s |
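The two accuracy columns follow directly from those definitions. A hedged sketch of the computation over a labeled dataset (this harness is illustrative, not the crate's actual benchmark):

```rust
use rustrict::CensorStr;

// Illustrative accuracy computation over (text, is_actually_profane)
// pairs; returns (positive accuracy, negative accuracy).
fn accuracies(samples: &[(&str, bool)]) -> (f64, f64) {
    let (mut true_pos, mut pos, mut true_neg, mut neg) = (0u32, 0u32, 0u32, 0u32);
    for &(text, profane) in samples {
        let detected = text.is_inappropriate();
        if profane {
            pos += 1;
            if detected {
                true_pos += 1;
            }
        } else {
            neg += 1;
            if !detected {
                true_neg += 1;
            }
        }
    }
    (f64::from(true_pos) / f64::from(pos), f64::from(true_neg) / f64::from(neg))
}
```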
If you make an adjustment that would affect false positives, such as adding profanity, you will need to run `false_positive_finder`:

1. Run `make downloads` to download the required word lists and dictionaries.
2. Run `make false_positives` to automatically find false positives.

If you modify `replacements_extra.csv`, run `make replacements` to rebuild `replacements.csv`.

Finally, run `make test` for a full test or `make test_debug` for a fast test.
Licensed under either of

- Apache License, Version 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (http://opensource.org/licenses/MIT)

at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.