| Field | Value |
|---|---|
| Crates.io | infernum-server |
| lib.rs | infernum-server |
| version | 0.1.0 |
| created_at | 2025-12-03 07:24:12.803617+00 |
| updated_at | 2025-12-03 07:24:12.803617+00 |
| description | HTTP API server with OpenAI-compatible endpoints |
| repository | https://github.com/Daemoniorum-LLC/infernum-framework |
| id | 1963511 |
| size | 157,793 |
HTTP API server with OpenAI-compatible endpoints for the Infernum LLM inference framework.
infernum-server provides a production-ready HTTP server that exposes Infernum's LLM capabilities through OpenAI-compatible API endpoints. This lets you point any OpenAI client library or tool at locally running models.
Start the server programmatically:

```rust
use infernum_server::Server;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let server = Server::new("0.0.0.0:8080").await?;
    server.run().await
}
```
Or use the CLI:
```bash
infernum server --port 8080
```
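Because the API is OpenAI-compatible, any HTTP client works against it. Below is a minimal sketch of a non-streaming chat request using `reqwest` (with its `json` feature) and `serde_json`; the model name `"my-local-model"` is a placeholder, so substitute whatever `/v1/models` reports for your setup:

```rust
use serde_json::json;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client = reqwest::Client::new();

    // "my-local-model" is a placeholder; list real names via GET /v1/models.
    let body = json!({
        "model": "my-local-model",
        "messages": [{ "role": "user", "content": "Hello!" }]
    });

    let response: serde_json::Value = client
        .post("http://127.0.0.1:8080/v1/chat/completions")
        .json(&body)
        .send()
        .await?
        .json()
        .await?;

    // OpenAI-style responses carry the reply at choices[0].message.content.
    if let Some(reply) = response["choices"][0]["message"]["content"].as_str() {
        println!("{reply}");
    }
    Ok(())
}
```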
The server exposes the following endpoints:

| Endpoint | Method | Description |
|---|---|---|
| `/v1/chat/completions` | POST | Chat completions (streaming and non-streaming) |
| `/v1/completions` | POST | Text completions |
| `/v1/models` | GET | List available models |
| `/health` | GET | Health check |
| `/metrics` | GET | Prometheus metrics |
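For streaming, OpenAI-compatible servers emit server-sent events: `data: {json}` lines terminated by `data: [DONE]`. A rough sketch using `reqwest`'s `stream` feature follows; a production client should buffer partial lines across chunk boundaries, which this sketch skips for brevity:

```rust
use futures_util::StreamExt;
use serde_json::json;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let body = json!({
        "model": "my-local-model", // placeholder; see GET /v1/models
        "messages": [{ "role": "user", "content": "Tell me a story." }],
        "stream": true
    });

    let response = reqwest::Client::new()
        .post("http://127.0.0.1:8080/v1/chat/completions")
        .json(&body)
        .send()
        .await?;

    // Read the SSE stream chunk by chunk (assumes each chunk holds whole lines).
    let mut stream = response.bytes_stream();
    while let Some(chunk) = stream.next().await {
        for line in String::from_utf8_lossy(&chunk?).lines() {
            if let Some(data) = line.strip_prefix("data: ") {
                if data == "[DONE]" {
                    println!();
                    return Ok(());
                }
                let event: serde_json::Value = serde_json::from_str(data)?;
                // Incremental tokens arrive at choices[0].delta.content.
                if let Some(token) = event["choices"][0]["delta"]["content"].as_str() {
                    print!("{token}");
                }
            }
        }
    }
    Ok(())
}
```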
This crate is part of the Infernum ecosystem; see the repository at https://github.com/Daemoniorum-LLC/infernum-framework for the companion crates.
Licensed under either of the Apache License, Version 2.0, or the MIT license, at your option.