Crates.io | lmonade-runtime |
lib.rs | lmonade-runtime |
version | 0.1.0-alpha.2 |
created_at | 2025-08-20 14:10:53.345932+00 |
updated_at | 2025-08-21 22:07:16.999729+00 |
description | Actor-based runtime for LLM inference orchestration and resource management |
homepage | |
repository | https://jgok76.gitea.cloud/femtomc/lmonade |
max_upload_size | |
id | 1803469 |
size | 468,440 |
High-performance actor-based runtime for LLM inference orchestration.
This crate provides the core runtime infrastructure for Lmonade:
- src/actor/
- src/actor/model_hub.rs
- src/resources/
- src/llm_engine.rs
- src/model_runner.rs
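The actor pattern underlying these modules can be sketched with plain threads and channels. This is an illustrative sketch only, not the crate's API: the names `Msg`, `spawn_hub`, and the echo "inference" are hypothetical, and the real runtime uses async actors.

```rust
use std::sync::mpsc;
use std::thread;

// Messages an actor-style model hub might accept (hypothetical names).
enum Msg {
    LoadModel { model_id: String },
    Generate { prompt: String, reply: mpsc::Sender<String> },
    Shutdown,
}

// Spawn a hub actor: it owns its state and serially processes messages
// from a channel, so no locking is needed around that state.
fn spawn_hub() -> mpsc::Sender<Msg> {
    let (tx, rx) = mpsc::channel::<Msg>();
    thread::spawn(move || {
        let mut loaded: Vec<String> = Vec::new();
        for msg in rx {
            match msg {
                Msg::LoadModel { model_id } => loaded.push(model_id),
                Msg::Generate { prompt, reply } => {
                    // A real engine would run inference; we echo instead.
                    let _ = reply.send(format!("echo: {prompt}"));
                }
                Msg::Shutdown => break,
            }
        }
    });
    tx
}

fn main() {
    let hub = spawn_hub();
    hub.send(Msg::LoadModel { model_id: "tinyllama".into() }).unwrap();

    let (reply_tx, reply_rx) = mpsc::channel();
    hub.send(Msg::Generate { prompt: "Hello".into(), reply: reply_tx }).unwrap();
    println!("{}", reply_rx.recv().unwrap()); // prints "echo: Hello"

    hub.send(Msg::Shutdown).unwrap();
}
```

Because each actor processes one message at a time, callers interact with models only through message sends and replies, which is the property the hub design above relies on.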
The runtime uses an actor-based architecture in which models are loaded and queried by sending request messages to a central hub. Example usage:
use lmonade_runtime::actor::model_hub::ModelHub;
use lmonade_runtime::actor::messages::{LoadModelRequest, GenerateRequest};

// Calls below use `.await?`, so they must run inside an async function
// driven by an async runtime (e.g. tokio).
async fn example() -> Result<(), Box<dyn std::error::Error>> {
    // Create and start the model hub
    let hub = ModelHub::new(Default::default()).await?;

    // Load a model
    hub.load_model(LoadModelRequest {
        model_id: "tinyllama".to_string(),
        model_path: "/path/to/model".into(),
        config: Default::default(),
    }).await?;

    // Generate text
    let response = hub.generate(GenerateRequest {
        model_id: "tinyllama".to_string(),
        prompt: "Hello, world!".to_string(),
        params: Default::default(),
    }).await?;

    Ok(())
}
For detailed documentation, see the repository: https://jgok76.gitea.cloud/femtomc/lmonade
The runtime is under active development: basic TinyLlama inference is working, while production features such as distributed serving and advanced scheduling are still in progress.