Crates.io | ezllama |
lib.rs | ezllama |
version | 0.3.1 |
created_at | 2025-04-19 07:24:50.978392+00 |
updated_at | 2025-04-19 14:53:40.500207+00 |
description | An opinionated, simple Rust interface for local LLMs, powered by llama-cpp-2 |
homepage | |
repository | https://github.com/torkleyy/ezllama |
max_upload_size | |
id | 1640390 |
size | 68,328 |
An opinionated, simple Rust interface for local LLMs, powered by llama-cpp-2.
Right now it only supports the basics, but I may add more features as I need them.
You can try out the chatbot example from this repo:
cargo run --example chatbot --features <backend> -- model.gguf
Add ezllama to your Cargo.toml:
[dependencies]
ezllama = "0.3.1"
For GPU acceleration, enable the appropriate feature (fair warning: CUDA takes a long time to compile, so go run some errands while it builds):
[dependencies]
ezllama = { version = "0.3.1", features = ["cuda"] }   # For CUDA support
# or
ezllama = { version = "0.3.1", features = ["metal"] }  # For Metal support (macOS)
# or
ezllama = { version = "0.3.1", features = ["vulkan"] } # For Vulkan support
Note: Make sure you grab a GGUF model from Hugging Face or elsewhere.
use ezllama::{ContextParams, Model, ModelParams, Result};
use std::io::Write;
use std::path::PathBuf;

fn main() -> Result<()> {
    // Initialize the model
    let model_params = ModelParams {
        model_path: PathBuf::from("path/to/your/model.gguf"),
        ..Default::default()
    };
    let context_params = ContextParams {
        ctx_size: Some(2048),
        ..Default::default()
    };
    let model = Model::new(&model_params)?;
    let mut chat_session = model.create_chat_session(&context_params)?;

    // First turn: queue the user message, then stream up to 128 tokens
    chat_session.add_user_message("Hello, can you introduce yourself?");
    let response1 = chat_session.generate()?.take(128);
    print!("Assistant: ");
    for token in response1 {
        print!("{}", token);
        std::io::stdout().flush()?;
    }
    println!();

    // Second turn: prompt() adds the message and generates in one step
    let response2 = chat_session.prompt("What can you help me with?")?.join();
    println!("Assistant: {}", response2);

    Ok(())
}
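Since `generate()` streams tokens through an iterator (as the `take(128)` call above shows), and `prompt()` appears to return the same kind of iterator, the streaming loop is easy to factor out. Here is a minimal sketch of such a helper, assuming only that the yielded tokens implement `Display` (the loop above already relies on that); the name `stream_to_stdout` is illustrative, not part of the crate:

use std::fmt::Display;
use std::io::{self, Write};

// Print tokens as they arrive, flushing after each one so the
// output streams instead of appearing all at once.
fn stream_to_stdout<T: Display>(tokens: impl Iterator<Item = T>) -> io::Result<()> {
    for token in tokens {
        print!("{}", token);
        io::stdout().flush()?;
    }
    println!();
    Ok(())
}

With that in place, the first turn above collapses to `stream_to_stdout(chat_session.generate()?.take(128))?;`.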
// Create a text session for plain text completion
// (reusing `model` and `context_params` from the example above)
let mut text_session = model.create_text_session(&context_params)?;
let output = text_session.prompt("Once upon a time")?.join();
println!("{}", output);

// Continue generating from the existing context
let more_output = text_session.prompt(" and then")?.join();
println!("{}", more_output);
// Create a chat session with a system message
let mut chat_session = model.create_chat_session_with_system(
    "You are a helpful assistant that specializes in Rust programming.",
    &context_params,
)?;

// Or add a system message to an existing session
let mut chat_session = model.create_chat_session(&context_params)?;
chat_session.add_system_message("You are a helpful assistant.");
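Either way, the session is then used like the one in the first example; a quick sketch (the question text is just a placeholder):

// Continue the session against the system prompt set above
let answer = chat_session.prompt("How do I read a file into a String?")?.join();
println!("Assistant: {}", answer);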
// One-shot completion with system message
let response = model.chat_completion_with_system(
    "You are a concise assistant.",
    "Explain quantum computing.",
    &context_params,
)?.join();
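Assuming `chat_completion_with_system` returns the same token iterator as the session APIs (it supports `.join()` like they do), the one-shot result can also be streamed token by token, e.g. with the illustrative `stream_to_stdout` helper sketched earlier:

// Stream the one-shot completion instead of collecting it
let tokens = model.chat_completion_with_system(
    "You are a concise assistant.",
    "Explain quantum computing.",
    &context_params,
)?;
stream_to_stdout(tokens)?;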
// Create a chat session with a custom template
let template = "{{0_role}}: {{0_content}}\n{{1_role}}: {{1_content}}";
let mut chat_session = model.create_chat_session_with_template(template.to_string(), &context_params)?;
Licensed under MIT.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you shall be licensed as above, without any additional terms or conditions.