| Field | Value |
|---|---|
| Crates.io | rml-core |
| lib.rs | rml-core |
| version | 0.1.0 |
| created_at | 2025-04-01 04:06:19.967044+00 |
| updated_at | 2025-04-01 04:06:19.967044+00 |
| description | A simple N-gram language model implementation in Rust |
| homepage | |
| repository | https://github.com/LEVOGNE/rml-core |
| max_upload_size | |
| id | 1614385 |
| size | 25,483 |
rml-core is a simple N-gram language model implemented in Rust. The name is a play on "LLM" (Large Language Model) and stands for "Rust Language Model".
This project implements a character-level N-gram language model using a basic neural architecture with one hidden layer. The model uses a context of 4 characters to predict the next character in a sequence.
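To make the context/target setup concrete, the sketch below shows one way such 4-character training pairs can be derived from raw text. It is purely illustrative (the `context_target_pairs` helper is hypothetical); the crate's own `prepare_training_data` may handle encoding, vocabulary, and edge cases differently.

```rust
// Purely illustrative: slide a 4-character window over the text and pair
// each window with the character that follows it. Not rml-core's API.
fn context_target_pairs(text: &str) -> Vec<(String, char)> {
    let chars: Vec<char> = text.chars().collect();
    let mut pairs = Vec::new();
    for window in chars.windows(5) {
        let context: String = window[..4].iter().collect();
        let target: char = window[4];
        pairs.push((context, target));
    }
    pairs
}

fn main() {
    // "To be or not" yields ("To b", 'e'), ("o be", ' '), (" be ", 'o'), ...
    for (context, target) in context_target_pairs("To be or not") {
        println!("{:?} -> {:?}", context, target);
    }
}
```

In practice you would use the crate's `prepare_training_data` instead, as shown in the library example further down.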
In your `Cargo.toml`:

```toml
[dependencies]
rml-core = "0.1.0"
```
To train a model on a text file:

```bash
cargo run --bin train path/to/input.txt path/to/output/model [epochs]
```

Example:

```bash
cargo run --bin train data/shakespeare.txt model.bin 10
```
To generate text from a trained model, pass the model path, a seed string, and an optional length:

```bash
cargo run --bin generate path/to/model "Seed Text" [length]
```

Example:

```bash
cargo run --bin generate model.bin "To be" 200
```
The crate can also be used as a library:

```rust
use rml_core::{NGramModel, prepare_training_data};

// Training: turn raw text into (context, target) pairs and train on them
let text = std::fs::read_to_string("data/input.txt").unwrap();
let training_data = prepare_training_data(&text);

let mut model = NGramModel::new();
for (context, target) in training_data {
    model.train(&context, target);
}
model.save("model.bin").unwrap();

// Generation: load a previously saved model
let mut model = NGramModel::load("model.bin").unwrap();
// Use model.forward() and sampling logic to produce text
```
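The generation step above is only sketched. Assuming, purely for illustration, that `model.forward` yields a probability distribution over the vocabulary as a `Vec<f32>` and that you have the matching list of vocabulary characters (neither is rml-core's documented API), a temperature-based sampling helper could look roughly like this, using the `rand` crate as an extra dependency:

```rust
use rand::Rng;

// Hypothetical helper: `probs` is assumed to be a probability distribution
// over `vocab`, in the same order. Not part of rml-core's documented API.
fn sample_next_char(probs: &[f32], vocab: &[char], temperature: f32) -> char {
    // Re-weight the distribution; a temperature below 1.0 (e.g. 0.3)
    // sharpens it, so likely characters become even more likely.
    let weights: Vec<f32> = probs
        .iter()
        .map(|p| (p.max(1e-12).ln() / temperature).exp())
        .collect();
    let total: f32 = weights.iter().sum();

    // Inverse-CDF sampling over the re-weighted distribution.
    let mut r = rand::thread_rng().gen_range(0.0..total);
    for (i, w) in weights.iter().enumerate() {
        if r < *w {
            return vocab[i];
        }
        r -= w;
    }
    *vocab.last().unwrap()
}
```

A generation loop would then repeatedly call `model.forward` on the last 4 characters of the running text, sample the next character with a helper like this, and append it until the requested length is reached.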
| Component | Value |
|---|---|
| Context Size | 4 characters |
| Hidden Layer | 128 neurons |
| Learning Rate | 0.005 |
| Sampling Temperature | 0.3 (conservative) |
| Vocabulary | a-z, A-Z, 0-9, punctuation |
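For context, temperature sampling typically rescales the model's output scores before a character is drawn (the crate's exact implementation may differ in detail):

$$
p_i^{(T)} = \frac{\exp(z_i / T)}{\sum_j \exp(z_j / T)}
$$

where $z_i$ is the model's score (logit or log-probability) for character $i$. With $T = 0.3 < 1$ the distribution is sharpened, so the most likely characters are chosen even more often, which is why the default is described as conservative.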
A complete train-then-generate workflow:

```bash
# Train on Shakespeare for 10 epochs
cargo run --bin train data/shakespeare.txt shakespeare_model 10

# Generate 200 characters using "To be" as the seed
cargo run --bin generate shakespeare_model "To be" 200
```
Contributions are welcome! Please feel free to open an issue or submit a pull request.
This project is licensed under the MIT License.