| Crates.io | litellm-rs |
| lib.rs | litellm-rs |
| version | 0.1.3 |
| created_at | 2025-07-28 10:04:27.436908+00 |
| updated_at | 2025-09-18 00:51:32.879315+00 |
| description | A high-performance AI Gateway written in Rust, providing OpenAI-compatible APIs with intelligent routing, load balancing, and enterprise features |
| homepage | https://github.com/majiayu000/litellm-rs |
| repository | https://github.com/majiayu000/litellm-rs |
| max_upload_size | |
| id | 1770953 |
| size | 3,732,724 |
A high-performance Rust library for unified LLM API access.
litellm-rs provides a simple, consistent interface to interact with multiple AI providers (OpenAI, Anthropic, Google, Azure, and more) through a single, unified API. Built with Rust's performance and safety guarantees, it simplifies multi-provider AI integration in production systems.
```rust
use litellm_rs::{completion, user_message};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Works with any supported provider
    let response = completion(
        "gpt-4", // or "claude-3", "gemini-pro", etc.
        vec![user_message("Hello!")],
        None,
    ).await?;

    println!("{}", response.choices[0].message.content.as_ref().unwrap());
    Ok(())
}
```
Add this to your Cargo.toml:
```toml
[dependencies]
litellm-rs = "0.1.3"
tokio = { version = "1.0", features = ["full"] }
serde_json = "1.0"
```
Or build from source:
```bash
git clone https://github.com/majiayu000/litellm-rs.git
cd litellm-rs
cargo build --release
```
```rust
use litellm_rs::{completion, user_message, system_message, CompletionOptions};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Set your API key
    std::env::set_var("OPENAI_API_KEY", "your-openai-key");

    // Simple completion call
    let response = completion(
        "gpt-4",
        vec![user_message("Hello, how are you?")],
        None,
    ).await?;
    println!("Response: {}", response.choices[0].message.content.as_ref().unwrap());

    // With system message and options
    let response = completion(
        "gpt-4",
        vec![
            system_message("You are a helpful assistant."),
            user_message("Explain quantum computing"),
        ],
        Some(CompletionOptions {
            temperature: Some(0.7),
            max_tokens: Some(150),
            ..Default::default()
        }),
    ).await?;
    println!("AI: {}", response.choices[0].message.content.as_ref().unwrap());
    Ok(())
}
```
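Since `content` on the returned message is an `Option`, code that cannot guarantee a text reply may prefer to handle the `None` case instead of unwrapping. A minimal sketch, assuming the same response shape shown above:

```rust
use litellm_rs::{completion, user_message};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let response = completion("gpt-4", vec![user_message("Hello!")], None).await?;

    // Handle a missing text payload explicitly rather than panicking.
    match response.choices.first().and_then(|c| c.message.content.as_ref()) {
        Some(text) => println!("AI: {}", text),
        None => eprintln!("The provider returned no text content"),
    }
    Ok(())
}
```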
```rust
use litellm_rs::{completion, user_message, CompletionOptions};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Set API keys for different providers
    std::env::set_var("OPENAI_API_KEY", "your-openai-key");
    std::env::set_var("ANTHROPIC_API_KEY", "your-anthropic-key");
    std::env::set_var("GROQ_API_KEY", "your-groq-key");

    // Call OpenAI
    let openai_response = completion(
        "gpt-4",
        vec![user_message("Hello from OpenAI!")],
        None,
    ).await?;

    // Call Anthropic Claude
    let claude_response = completion(
        "anthropic/claude-3-sonnet-20240229",
        vec![user_message("Hello from Claude!")],
        None,
    ).await?;

    // Call Groq (with reasoning)
    let groq_response = completion(
        "groq/deepseek-r1-distill-llama-70b",
        vec![user_message("Solve this math problem: 2+2=?")],
        Some(CompletionOptions {
            extra_params: {
                let mut params = std::collections::HashMap::new();
                params.insert("reasoning_effort".to_string(), serde_json::json!("medium"));
                params
            },
            ..Default::default()
        }),
    ).await?;

    println!("OpenAI: {}", openai_response.choices[0].message.content.as_ref().unwrap());
    println!("Claude: {}", claude_response.choices[0].message.content.as_ref().unwrap());
    println!("Groq: {}", groq_response.choices[0].message.content.as_ref().unwrap());
    Ok(())
}
```
```rust
use litellm_rs::{completion, user_message, CompletionOptions};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Call any OpenAI-compatible API with custom endpoint
    let response = completion(
        "llama-3.1-70b", // Model name
        vec![user_message("Hello from custom endpoint!")],
        Some(CompletionOptions {
            api_key: Some("your-custom-api-key".to_string()),
            api_base: Some("https://your-custom-endpoint.com/v1".to_string()),
            ..Default::default()
        }),
    ).await?;
    println!("Custom API: {}", response.choices[0].message.content.as_ref().unwrap());
    Ok(())
}
```
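The same two options can point at a local OpenAI-compatible server. The sketch below assumes an Ollama-style endpoint at `http://localhost:11434/v1` and a locally pulled model name; both are illustrative placeholders, not values defined by litellm-rs:

```rust
use litellm_rs::{completion, user_message, CompletionOptions};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical local endpoint and model name; adjust to whatever your server exposes.
    let response = completion(
        "llama3.1",
        vec![user_message("Hello from a local model!")],
        Some(CompletionOptions {
            api_key: Some("not-needed-for-local".to_string()),
            api_base: Some("http://localhost:11434/v1".to_string()),
            ..Default::default()
        }),
    ).await?;
    println!("Local: {}", response.choices[0].message.content.as_ref().unwrap());
    Ok(())
}
```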
Start the server:
```bash
# Set your API keys
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"

# Start the proxy server
cargo run

# Server starts on http://localhost:8000
```
Make requests:
```bash
# OpenAI GPT-4
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello, how are you?"}]
  }'

# Anthropic Claude
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-sonnet",
    "messages": [{"role": "user", "content": "Hello, how are you?"}]
  }'
```
Example response:

```json
{
  "id": "chatcmpl-1214900a-6cdd-4148-b663-b5e2f642b4de",
  "created": 1751494488,
  "model": "claude-3-sonnet",
  "object": "chat.completion",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Hello! I'm doing well, thank you for asking. How are you doing today?",
        "role": "assistant"
      }
    }
  ],
  "usage": {
    "completion_tokens": 17,
    "prompt_tokens": 12,
    "total_tokens": 29
  }
}
```
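Because the proxy speaks the OpenAI chat-completions wire format shown above, any HTTP client can call it. A minimal sketch using `reqwest` and `serde_json` as client-side dependencies (not part of litellm-rs) that extracts the reply text:

```rust
use serde_json::{json, Value};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let body = json!({
        "model": "claude-3-sonnet",
        "messages": [{"role": "user", "content": "Hello, how are you?"}]
    });

    // POST to the locally running gateway started with `cargo run`.
    // Requires reqwest with the `json` feature enabled.
    let response: Value = reqwest::Client::new()
        .post("http://localhost:8000/v1/chat/completions")
        .json(&body)
        .send()
        .await?
        .json()
        .await?;

    // Pull the assistant text out of the OpenAI-style response shape.
    if let Some(content) = response["choices"][0]["message"]["content"].as_str() {
        println!("{}", content);
    }
    Ok(())
}
```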
Call any model supported by a provider by setting `model=<model_name>` in the request. See Supported Providers for the complete list.
litellm-rs supports streaming the model response back; pass `stream=true` to receive a streaming response.
Streaming is supported for all providers (OpenAI, Anthropic, Google, Azure, Groq, etc.).
```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Tell me a story"}],
    "stream": true
  }'
```
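Assuming the gateway follows the OpenAI server-sent-event chunk format (`data: {...}` lines terminated by `data: [DONE]`), a Rust client can consume the stream directly. A minimal sketch using `reqwest` (with the `json` and `stream` features) and `futures-util` as client-side dependencies; it naively assumes each network chunk contains whole lines:

```rust
use futures_util::StreamExt;
use serde_json::{json, Value};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let body = json!({
        "model": "gpt-4",
        "messages": [{"role": "user", "content": "Tell me a story"}],
        "stream": true
    });

    let response = reqwest::Client::new()
        .post("http://localhost:8000/v1/chat/completions")
        .json(&body)
        .send()
        .await?;

    // Read the raw SSE body chunk by chunk and print each content delta as it arrives.
    let mut stream = response.bytes_stream();
    while let Some(chunk) = stream.next().await {
        for line in String::from_utf8_lossy(&chunk?).lines() {
            let Some(payload) = line.strip_prefix("data: ") else { continue };
            if payload == "[DONE]" {
                return Ok(());
            }
            if let Ok(event) = serde_json::from_str::<Value>(payload) {
                if let Some(delta) = event["choices"][0]["delta"]["content"].as_str() {
                    print!("{}", delta);
                }
            }
        }
    }
    Ok(())
}
```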
Create a config/gateway.yaml file:
```yaml
server:
  host: "0.0.0.0"
  port: 8000

providers:
  openai:
    api_key: "${OPENAI_API_KEY}"
  anthropic:
    api_key: "${ANTHROPIC_API_KEY}"
  google:
    api_key: "${GOOGLE_API_KEY}"

router:
  strategy: "round_robin"
  max_retries: 3
  timeout: 60
```
See config/gateway.yaml.example for a complete example.
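For intuition, the `round_robin` strategy simply cycles through the configured providers on successive requests. A standalone sketch of the idea (illustrative only, not the gateway's actual internals):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Illustrative round-robin selector: each call returns the next provider in order.
struct RoundRobin {
    providers: Vec<String>,
    next: AtomicUsize,
}

impl RoundRobin {
    fn pick(&self) -> &str {
        // fetch_add hands out increasing indices; the modulo keeps them in range.
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.providers.len();
        &self.providers[i]
    }
}

fn main() {
    let router = RoundRobin {
        providers: vec!["openai".into(), "anthropic".into(), "google".into()],
        next: AtomicUsize::new(0),
    };
    for _ in 0..4 {
        println!("routing to {}", router.pick()); // openai, anthropic, google, openai
    }
}
```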
| Metric | Value | Notes |
|---|---|---|
| Throughput | 10,000+ req/s | On 8-core CPU |
| Latency | <10ms | Routing overhead |
| Memory | ~50MB | Base footprint |
| Startup | <100ms | Cold start time |
```bash
docker run -p 8000:8000 \
  -e OPENAI_API_KEY="$OPENAI_API_KEY" \
  -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
  litellm-rs:latest
```
```bash
kubectl apply -f deployment/kubernetes/
```
```bash
# Download the latest release
curl -L https://github.com/majiayu000/litellm-rs/releases/latest/download/litellm-rs-linux-amd64 -o litellm-rs
chmod +x litellm-rs
./litellm-rs
```
Contributions are welcome! Please read our Contributing Guide for details.
```bash
# Setup
git clone https://github.com/majiayu000/litellm-rs.git
cd litellm-rs

# Test
cargo test

# Format
cargo fmt

# Lint
cargo clippy
```
See GitHub Issues for the detailed roadmap.
Licensed under the MIT License. See LICENSE for details.
Special thanks to the Rust community and all contributors to this project.
Built with Rust for performance and reliability.