| Field | Value |
|---|---|
| Crates.io | nl-sre-english |
| lib.rs | nl-sre-english |
| version | 0.1.4 |
| created_at | 2026-01-21 23:19:40.893105+00 |
| updated_at | 2026-01-22 20:03:54.795149+00 |
| description | Deterministic Semantic Disambiguation Engine for English - 1500+ verbs, 5500+ words, Zero dependencies, Pure Rust |
| homepage | |
| repository | https://github.com/Yatrogenesis/nl-sre-english |
| max_upload_size | |
| id | 2060348 |
| size | 8,942,422 |
Deterministic Semantic Disambiguation Engine for English
A comprehensive English verb database with semantic disambiguation capabilities, designed for natural language processing, command parsing, and AI applications.
The engine organizes verbs into semantic categories for easy programmatic access:
| Category | Description | Examples |
|---|---|---|
| Movement | Motion and locomotion | walk, run, fly, swim, climb |
| Perception | Sensing and perceiving | see, hear, feel, smell, taste |
| Communication | Speaking and speech acts | say, tell, speak, ask, answer |
| Cognition | Mental processes | think, know, believe, understand |
| Emotion | Emotional states | love, hate, fear, hope, enjoy |
| Physical | Physical manipulation | hit, cut, push, pull, throw |
| State | States of being | be, exist, remain, stay |
| Change | Change of state | become, grow, transform |
| Transfer | Giving and receiving | give, take, send, receive |
| Creation | Making and producing | make, create, build, write |
| Destruction | Breaking and destroying | destroy, break, kill, damage |
| Control | Controlling and managing | control, manage, lead, govern |
| Possession | Owning and having | own, have, possess, acquire |
| Social | Social interaction | meet, help, fight, cooperate |
| Consumption | Eating and drinking | eat, drink, consume, breathe |
| Body | Bodily functions | sleep, wake, sit, stand, lie |
| Weather | Weather phenomena | rain, snow, blow, shine |
| Measurement | Measuring and comparing | measure, weigh, count, compare |
| Aspectual | Beginning, ending, continuing | begin, end, continue, stop |
| Causation | Causing and enabling | cause, allow, prevent, force |
| Attempt | Trying and succeeding/failing | try, succeed, fail, practice |
| Modal | Modal and semi-modal | want, need, can, should |
| Position | Body position and location | put, place, set, remove |
| Connection | Joining and separating | connect, join, separate, split |
| Emission | Light and sound emission | shine, glow, ring, buzz |
use nl_sre_english::SemanticDisambiguator;
use nl_sre_english::verbs::FunctionalCategory;
fn main() {
    let disambiguator = SemanticDisambiguator::new();
    // Process a sentence
    let result = disambiguator.process("The cat runs quickly across the room");
    println!("Detected actions:");
    for action in &result.detected_actions {
        println!(" - {} (base: {}, category: {})",
            action.verb,
            action.base_form,
            action.category.name()
        );
    }
    // Get verbs by category
    let movement_verbs = disambiguator.verbs_by_category(FunctionalCategory::Movement);
    println!("Movement verbs: {:?}", &movement_verbs[..5]);
}
use nl_sre_english::verbs::{VerbDatabase, FunctionalCategory, VerbGroup};
fn main() {
    let db = VerbDatabase::with_builtin();
    // Look up any verb form
    if let Some(entry) = db.lookup("running") {
        println!("Base form: {}", entry.base); // "run"
        println!("Category: {}", entry.category.name()); // "Movement"
        println!("Group: {}", entry.group.name()); // "Run"
        println!("Irregular: {}", entry.irregular); // true
        println!("Forms: {} / {} / {} / {}",
            entry.base,
            entry.past,
            entry.past_participle,
            entry.present_participle
        );
    }
    // Get all verbs in a category
    let emotions = db.by_category(FunctionalCategory::Emotion);
    for verb in emotions.iter().take(10) {
        println!("{}: {}", verb.base, verb.synonyms.join(", "));
    }
    // Get all verbs in a specific group
    let running_verbs = db.by_group(VerbGroup::Run);
    // Returns: run, sprint, dash, race, jog, rush, hurry, bolt...
}
use nl_sre_english::command_parser::CommandParser;
fn main() {
    let mut parser = CommandParser::new();
    // Parse natural language commands
    if let Some(cmd) = parser.parse("please walk to the store") {
        println!("Action: {}", cmd.action); // "walk"
        println!("Category: {}", cmd.category.name()); // "Movement"
        println!("Subject: {:?}", cmd.subject); // Some("please")
        println!("Object: {:?}", cmd.object); // Some("to the store")
    }
    // Parse multiple commands
    let commands = parser.parse_all("Run to the store. Buy some milk. Come back home.");
    // Returns 3 parsed commands
}
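To make the idea of turning free text into structured commands more concrete, here is a rough, self-contained sketch that splits input into sentences and treats the first token found in a verb table as the action. It is not the crate's `CommandParser`: `ParsedCommand`, `verb_table`, `parse_sketch`, `parse_all_sketch`, and the category assignments in the table are all hypothetical names and values chosen for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical result type for this sketch; not the crate's command type.
#[derive(Debug, PartialEq)]
struct ParsedCommand {
    action: String,
    category: &'static str,
    object: Option<String>,
}

/// Tiny stand-in verb table (verb -> category name); contents are assumed.
fn verb_table() -> HashMap<&'static str, &'static str> {
    HashMap::from([
        ("walk", "Movement"),
        ("run", "Movement"),
        ("come", "Movement"),
        ("buy", "Transfer"),
    ])
}

/// Parse one sentence: the first known verb becomes the action and
/// every token after it becomes the object.
fn parse_sketch(
    sentence: &str,
    verbs: &HashMap<&'static str, &'static str>,
) -> Option<ParsedCommand> {
    // Lowercase each token and strip punctuation from its edges.
    let tokens: Vec<String> = sentence
        .split_whitespace()
        .map(|t| t.trim_matches(|c: char| !c.is_alphanumeric()).to_lowercase())
        .filter(|t| !t.is_empty())
        .collect();
    let idx = tokens.iter().position(|t| verbs.contains_key(t.as_str()))?;
    let rest = tokens[idx + 1..].join(" ");
    Some(ParsedCommand {
        action: tokens[idx].clone(),
        category: verbs[tokens[idx].as_str()],
        object: if rest.is_empty() { None } else { Some(rest) },
    })
}

/// Split text on sentence-ending punctuation and parse each piece,
/// skipping pieces that contain no known verb.
fn parse_all_sketch(text: &str) -> Vec<ParsedCommand> {
    let verbs = verb_table();
    text.split(|c: char| c == '.' || c == '!' || c == '?')
        .filter_map(|s| parse_sketch(s, &verbs))
        .collect()
}
```

Calling `parse_all_sketch("Run to the store. Buy some milk. Come back home.")` yields three commands in this sketch. A real parser would also need to handle inflected forms, subjects, and verbs that appear mid-sentence as nouns.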
Total verb entries: 1,514 (including multi-category)
Unique verb base forms: 1,312
Irregular verbs: 133
Regular verbs: 1,381
Total forms indexed: ~5,300
Functional categories: 25
Verb groups: 80+
Dictionary words: 4,971 (COCA corpus)
| Operation | Throughput | Latency |
|---|---|---|
| Verb lookup | 12.2M ops/sec | 0.08 µs |
| Spell correction (BK-Tree) | 1.2K ops/sec | 804 µs |
| Command parsing | 918K ops/sec | 1.1 µs |
| Contraction expansion | 4.6M ops/sec | 0.22 µs |
BK-Tree speedup: 1.6x over linear search for fuzzy matching.
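The two numeric columns in the table are related: mean latency is roughly the reciprocal of throughput. The sketch below shows that conversion and one simple way such figures could be measured with the standard library. It is not the crate's benchmark harness; `latency_us`, `measure_ops_per_sec`, and `bench_lookup_sketch` are hypothetical helpers, and the hash-map lookup is only a stand-in for verb lookup.

```rust
use std::collections::HashMap;
use std::hint::black_box;
use std::time::Instant;

/// Mean latency in microseconds implied by a throughput in ops/sec.
fn latency_us(ops_per_sec: f64) -> f64 {
    1_000_000.0 / ops_per_sec
}

/// Run `op` `iters` times and return the observed throughput in ops/sec.
fn measure_ops_per_sec<F: FnMut()>(iters: u32, mut op: F) -> f64 {
    let start = Instant::now();
    for _ in 0..iters {
        op();
    }
    // Guard against a zero elapsed time on very coarse clocks.
    let secs = start.elapsed().as_secs_f64().max(f64::MIN_POSITIVE);
    f64::from(iters) / secs
}

/// Hypothetical micro-benchmark: hash-map lookups standing in for verb lookup.
fn bench_lookup_sketch() -> f64 {
    let table: HashMap<&str, &str> =
        HashMap::from([("running", "run"), ("ran", "run"), ("walked", "walk")]);
    measure_ops_per_sec(1_000_000, || {
        // black_box keeps the optimizer from removing the lookup.
        black_box(table.get(black_box("running")));
    })
}
```

For example, `latency_us(12_200_000.0)` is about 0.082, which matches the 0.08 µs shown for verb lookup after rounding. Absolute numbers from `bench_lookup_sketch()` depend on the machine and build profile, so they are only meaningful in `--release` builds and relative to each other.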
Some verbs belong to multiple semantic categories depending on context:
use nl_sre_english::VerbDatabase;
let db = VerbDatabase::with_builtin();
// "run" has multiple meanings
let categories = db.get_all_categories("run");
// Returns: [Movement, Control]
// - "run to the store" -> Movement
// - "run a company" -> Control
// Get all entries for a verb
let entries = db.lookup_all("run");
// Returns Vec with 2 entries, each with different category
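One way to picture a lookup that returns several entries per verb is an index from each verb form to a list of senses. The following is a minimal sketch under that assumption, not the crate's internal layout; `Category`, `Entry`, `MultiIndex`, and `sample_index` are hypothetical names, and only two categories are modeled.

```rust
use std::collections::HashMap;

/// Two categories are enough for this sketch; the crate lists 25.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Category {
    Movement,
    Control,
}

/// Hypothetical entry type: one sense of a verb.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    base: &'static str,
    category: Category,
}

/// Maps every indexed form to all senses that share it.
struct MultiIndex {
    by_form: HashMap<&'static str, Vec<Entry>>,
}

impl MultiIndex {
    fn new() -> Self {
        Self { by_form: HashMap::new() }
    }

    /// Register one sense under each of its inflected forms.
    fn insert(&mut self, forms: &[&'static str], entry: Entry) {
        for &form in forms {
            self.by_form.entry(form).or_default().push(entry.clone());
        }
    }

    /// All senses for a form, or an empty slice if the form is unknown.
    fn lookup_all(&self, form: &str) -> &[Entry] {
        self.by_form.get(form).map(Vec::as_slice).unwrap_or(&[])
    }

    /// Categories of every sense, in insertion order.
    fn categories(&self, form: &str) -> Vec<Category> {
        self.lookup_all(form).iter().map(|e| e.category).collect()
    }
}

/// Build an index holding both senses of "run" from the example above.
fn sample_index() -> MultiIndex {
    let mut idx = MultiIndex::new();
    let forms = ["run", "runs", "ran", "running"];
    idx.insert(&forms, Entry { base: "run", category: Category::Movement });
    idx.insert(&forms, Entry { base: "run", category: Category::Control });
    idx
}
```

With this layout, `sample_index().categories("ran")` returns both `Movement` and `Control`; choosing between them for a given sentence is a separate disambiguation step that this sketch does not attempt.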
┌────────────────────────────────────────────────────┐
│               SemanticDisambiguator                │
├────────────────────────────────────────────────────┤
│  ┌──────────────┐  ┌────────────┐  ┌────────────┐  │
│  │ VerbDatabase │  │ Grammar    │  │ Dictionary │  │
│  │ 1312 verbs   │  │ Parser     │  │ 5500+      │  │
│  │ 25 categories│  │ POS tag    │  │ words      │  │
│  └──────────────┘  └────────────┘  └────────────┘  │
├────────────────────────────────────────────────────┤
│          CommandParser (NL → Structured)           │
└────────────────────────────────────────────────────┘
cargo build --release
cargo run --example verb_groups
cargo test
The spell correction system has been optimized with a Burkhard-Keller Tree (BK-Tree) implementation:
use nl_sre_english::EnglishDictionary;
let dict = EnglishDictionary::new();
// Fast fuzzy search for spell correction
let suggestions = dict.find_similar("helo", 2);
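// Returns (word, distance) pairs, e.g. [("hello", 1), ("help", 1), ("held", 1), ...]
// (each of these is a single edit from "helo" under Levenshtein distance)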
Benchmark improvement: ~1.6x faster than linear search for typical spell correction queries (4,971-word dictionary).
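For readers unfamiliar with the data structure, here is a minimal BK-tree sketch. It assumes plain Levenshtein distance as the metric, which the text above does not confirm, and the names `levenshtein`, `BkNode`, and `BkTree` are illustrative rather than the crate's own types. The key idea is that each child edge is labeled with its distance to the parent, so the triangle inequality lets a search skip whole subtrees.

```rust
use std::collections::HashMap;

/// Levenshtein edit distance using a rolling one-row DP table.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            // substitution, deletion, insertion
            cur.push((prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1));
        }
        prev = cur;
    }
    prev[b.len()]
}

struct BkNode {
    word: String,
    /// Children keyed by their edit distance to this node's word.
    children: HashMap<usize, BkNode>,
}

impl BkNode {
    fn new(word: &str) -> Self {
        Self { word: word.to_string(), children: HashMap::new() }
    }

    fn insert(&mut self, word: &str) {
        let d = levenshtein(&self.word, word);
        if d == 0 {
            return; // already present
        }
        if let Some(child) = self.children.get_mut(&d) {
            child.insert(word);
        } else {
            self.children.insert(d, BkNode::new(word));
        }
    }

    fn search(&self, query: &str, max_dist: usize, out: &mut Vec<(String, usize)>) {
        let d = levenshtein(&self.word, query);
        if d <= max_dist {
            out.push((self.word.clone(), d));
        }
        // Triangle inequality: only edges in [d - max_dist, d + max_dist] can match.
        let lo = d.saturating_sub(max_dist);
        let hi = d + max_dist;
        for (&edge, child) in &self.children {
            if edge >= lo && edge <= hi {
                child.search(query, max_dist, out);
            }
        }
    }
}

struct BkTree {
    root: Option<BkNode>,
}

impl BkTree {
    fn new() -> Self {
        Self { root: None }
    }

    fn insert(&mut self, word: &str) {
        match self.root {
            Some(ref mut root) => root.insert(word),
            None => self.root = Some(BkNode::new(word)),
        }
    }

    /// Matches within `max_dist`, sorted by distance and then alphabetically.
    fn find_similar(&self, query: &str, max_dist: usize) -> Vec<(String, usize)> {
        let mut out = Vec::new();
        if let Some(root) = &self.root {
            root.search(query, max_dist, &mut out);
        }
        out.sort_by(|a, b| a.1.cmp(&b.1).then_with(|| a.0.cmp(&b.0)));
        out
    }
}
```

After inserting "hello", "help", "held", "world", and "rust", a call to `find_similar("helo", 2)` returns the first three words, each at distance 1, and never descends into the subtrees holding "world" or "rust".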
The tokenizer now automatically expands 50+ common English contractions:
use nl_sre_english::EnglishGrammar;
let grammar = EnglishGrammar::new();
// "don't" expands to "do" + "not"
let tokens = grammar.tokenize("I don't know");
assert_eq!(tokens, vec!["i", "do", "not", "know"]);
// "we'll" expands to "we" + "will"
let tokens = grammar.tokenize("We'll see");
assert_eq!(tokens, vec!["we", "will", "see"]);
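As an illustration of table-driven contraction expansion, the sketch below lowercases each token, strips surrounding punctuation, and replaces known contractions with their expanded parts. It is an assumption about how such a tokenizer could work, not the crate's `EnglishGrammar::tokenize`; `contraction_table` and `tokenize_sketch` are hypothetical, and the table holds only four sample entries rather than the 50+ mentioned above.

```rust
use std::collections::HashMap;

/// Small sample table (contraction -> expansion); entries are illustrative.
fn contraction_table() -> HashMap<&'static str, &'static [&'static str]> {
    HashMap::from([
        ("don't", &["do", "not"][..]),
        ("we'll", &["we", "will"][..]),
        ("can't", &["can", "not"][..]),
        ("i'm", &["i", "am"][..]),
    ])
}

/// Lowercase, strip edge punctuation, and expand known contractions.
fn tokenize_sketch(text: &str) -> Vec<String> {
    let table = contraction_table();
    let mut out = Vec::new();
    for raw in text.split_whitespace() {
        // Normalize typographic apostrophes so table lookups succeed.
        let lowered = raw.to_lowercase().replace('\u{2019}', "'");
        // Trim punctuation at the edges but keep the inner apostrophe.
        let word = lowered.trim_matches(|c: char| !c.is_alphanumeric());
        if word.is_empty() {
            continue;
        }
        match table.get(word) {
            Some(parts) => out.extend(parts.iter().map(|p| p.to_string())),
            None => out.push(word.to_string()),
        }
    }
    out
}
```

In this sketch, `tokenize_sketch("I don't know")` produces `["i", "do", "not", "know"]`. Ambiguous contractions such as "it's" (it is / it has) are deliberately left out of the sample table, since resolving them needs context that a plain lookup cannot provide.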
Supported contractions:
Added comprehensive regression tests (tests/regression.rs) covering:
Note on Scalability
This repository demonstrates the logical architecture of a deterministic NLP engine: its algorithmic foundations, data structures, and API design patterns. All components are production-ready and optimized for performance.
For industrial-scale applications that require any of the following, contact Avermex Research Division for enterprise implementation details:
- 160K+ word lexicons with WordNet integration
- Zero-copy memory-mapped binary formats for sub-millisecond loading
- Word Sense Disambiguation (WSD) with configurable weights
- Synset hierarchies (hypernyms/hyponyms) for semantic reasoning
MIT License - See LICENSE for details.
Francisco Molina-Burgos, Avermex Research Division, Merida, Yucatan, Mexico
Part of the NL-SRE (Natural Language Semantic Rule Engine) family
See also: NL-SRE-Semantico (Spanish)