| Crates.io | llm-gateway |
| lib.rs | llm-gateway |
| version | 0.1.1 |
| created_at | 2025-09-12 14:49:57.050175+00 |
| updated_at | 2025-09-14 14:46:12.137035+00 |
| description | A lightweight proxy library to normalize requests and responses across multiple LLM providers. |
| homepage | |
| repository | https://github.com/xinference/llm-gateway |
| max_upload_size | |
| id | 1835903 |
| size | 91,743 |
A lightweight Rust library to normalize requests and responses across multiple LLM providers.
Features:
- Unified chat and chat_stream APIs across DeepSeek, GLM (Zhipu), Qwen, and Kimi (Moonshot)
- Configuration via YAML file, environment variables, or an explicit Config
- OpenAI-compatible SSE streaming
- Optional model discovery with TTL-based caching
Config via YAML (recommended)
```yaml
# llm-gateway.config.yaml
# Do NOT commit this file (contains secrets)
deepseek:
  api_key: {{YOUR_DEEPSEEK_API_KEY}}
  base_url: https://api.deepseek.com/v1
glm:
  api_key: {{YOUR_ZHIPU_API_KEY}}
  base_url: https://open.bigmodel.cn/api/paas/v4
qwen:
  api_key: {{YOUR_QWEN_API_KEY}}
  base_url: https://dashscope.aliyun.com/compatible-mode/v1
kimi:
  api_key: {{YOUR_KIMI_API_KEY}}
  base_url: https://api.moonshot.cn/v1
```
Load it in code automatically:
```rust
use llm_gateway::Client;
use llm_gateway::types::ChatRequest;

# async fn run() -> anyhow::Result<()> {
let client = Client::from_yaml_file_auto().unwrap_or_else(|| Client::from_env());
let resp = client.chat(ChatRequest {
    model: "deepseek/deepseek-chat".into(),
    messages: vec![("user".into(), "Hello".into())],
}).await?;
println!("{}", resp.text);
# Ok(())
# }
```
Add to your Cargo.toml:
```toml
[dependencies]
llm-gateway = "0.1.1"
```
Initialize the client from environment variables:
```rust
use llm_gateway::Client;
use llm_gateway::types::ChatRequest;

# async fn run() -> anyhow::Result<()> {
let client = Client::from_env();
let resp = client.chat(ChatRequest {
    model: "deepseek/deepseek-chat".into(),
    messages: vec![("user".into(), "Hello".into())],
}).await?;
println!("{}", resp.text);
# Ok(())
# }
```
Or configure explicitly:
```rust
use llm_gateway::{Client, Config, ProviderConfig};
use llm_gateway::types::ChatRequest;

# async fn run() -> anyhow::Result<()> {
let client = Client::with_config(Config {
    deepseek: ProviderConfig {
        base_url: Some("https://api.deepseek.com/v1".into()),
        api_key: Some("{{DEEPSEEK_API_KEY}}".into()),
    },
    glm: ProviderConfig { base_url: None, api_key: None },
    qwen: ProviderConfig { base_url: None, api_key: None },
    kimi: ProviderConfig { base_url: None, api_key: None },
});
let resp = client.chat(ChatRequest {
    model: "deepseek/deepseek-chat".into(),
    messages: vec![("user".into(), "Hello".into())],
}).await?;
println!("{}", resp.text);
# Ok(())
# }
```
Environment variables recognized by from_env() supply the per-provider api_key and base_url values (the same fields shown in the YAML config above).
Streaming:
```rust
use futures_util::StreamExt;
use llm_gateway::Client;
use llm_gateway::types::ChatRequest;

# async fn run() -> anyhow::Result<()> {
let client = Client::from_env();
let mut stream = client.chat_stream(ChatRequest {
    model: "qwen/qwen2-7b-instruct".into(),
    messages: vec![("user".into(), "Streaming test".into())],
}).await?;
while let Some(chunk) = stream.next().await {
    let text = chunk?; // each item is a text delta
    print!("{}", text);
}
# Ok(())
# }
```
Note: The implementation uses OpenAI-compatible SSE parsing (lines beginning with `data: `). If a provider deviates, we can add provider-specific parsers.
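For reference, here is a minimal sketch of how a single OpenAI-compatible SSE line is typically handled. It illustrates the wire format only and is not llm-gateway's internal parser; `serde_json` is assumed for the JSON step.

```rust
// Illustration of the OpenAI-compatible SSE format, not llm-gateway internals.
fn parse_sse_line(line: &str) -> Option<String> {
    // Payload lines look like: data: {"choices":[{"delta":{"content":"Hi"}}]}
    let payload = line.strip_prefix("data: ")?.trim();
    if payload == "[DONE]" {
        return None; // end-of-stream sentinel
    }
    // The text delta lives at choices[0].delta.content.
    let v: serde_json::Value = serde_json::from_str(payload).ok()?;
    v["choices"][0]["delta"]["content"].as_str().map(str::to_owned)
}
```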
Model discovery and caching
This section complements the quick-start snippets above and shows the full path from configuration to calling chat, including model discovery and caching.
With discovery enabled:
```yaml
# Only config controls discovery in library mode
discover_models: true
discover_models_ttl_secs: 600 # 10 minutes; when 0, a built-in default of 300s is used
deepseek:
  base_url: https://api.deepseek.com/v1
qwen:
  base_url: https://dashscope.aliyun.com/compatible-mode/v1
glm:
  base_url: https://open.bigmodel.cn/api/paas/v4
kimi:
  base_url: https://api.moonshot.cn/v1
```
Or with discovery disabled and models pinned explicitly:
```yaml
discover_models: false
deepseek:
  base_url: https://api.deepseek.com/v1
  models:
    - deepseek-chat
    - deepseek-reasoner
```
Tip: Avoid committing API keys to the repo. Put them in your local config or inject at runtime.
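As one way to inject a key at runtime, here is a minimal sketch that reads the secret from an environment variable into the explicit Config shown earlier. The variable name DEEPSEEK_API_KEY is illustrative, not necessarily the name from_env() itself reads.

```rust
use llm_gateway::{Client, Config, ProviderConfig};

// Sketch: pull the secret from the process environment at startup instead of
// hardcoding it. DEEPSEEK_API_KEY is an illustrative variable name.
fn build_client() -> Client {
    Client::with_config(Config {
        deepseek: ProviderConfig {
            base_url: Some("https://api.deepseek.com/v1".into()),
            api_key: std::env::var("DEEPSEEK_API_KEY").ok(),
        },
        glm: ProviderConfig { base_url: None, api_key: None },
        qwen: ProviderConfig { base_url: None, api_key: None },
        kimi: ProviderConfig { base_url: None, api_key: None },
    })
}
```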
Load the config and list models:
```rust
use llm_gateway::Client;

# async fn run() -> anyhow::Result<()> {
let client = if let Some(c) = Client::from_yaml_file_auto() {
    c
} else {
    // Fallback to environment-driven defaults for provider api_key/base_url only;
    // discovery and TTL are config-only in library mode
    Client::from_env()
};

// Models known locally from the config
let local = client.list_models();
println!("local: {:?}", local);

// Models discovered from providers, subject to the discovery/TTL settings above
let auto = client.list_models_auto().await?;
println!("auto: {:?}", auto);
# Ok(())
# }
```
Cache semantics: discovered models are cached for discover_models_ttl_secs seconds; a value of 0 falls back to the built-in default of 300 seconds (see the YAML above).
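As a rough sketch of that refresh rule, based only on the config comment above (this is an illustration, not the crate's internal cache type):

```rust
use std::time::{Duration, Instant};

// Assumed TTL semantics mirroring the config comment above: a ttl of 0 means
// "use the built-in 300 s default".
struct DiscoveryCache {
    fetched_at: Option<Instant>,
    ttl: Duration,
}

impl DiscoveryCache {
    fn new(ttl_secs: u64) -> Self {
        let secs = if ttl_secs == 0 { 300 } else { ttl_secs };
        Self { fetched_at: None, ttl: Duration::from_secs(secs) }
    }

    /// A refresh is needed on first use or once the TTL has elapsed.
    fn needs_refresh(&self) -> bool {
        self.fetched_at.map_or(true, |t| t.elapsed() >= self.ttl)
    }
}
```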
Then call chat as usual:
```rust
use llm_gateway::types::ChatRequest;

// Continues with the `client` from the discovery example above.
let resp = client.chat(ChatRequest {
    model: "glm/glm-4".into(), // prefer the provider/model form
    messages: vec![("user".into(), "Write a 2-line haiku about Rust".into())],
}).await?;
println!("{}", resp.text);
```
Troubleshooting
See also: ../../docs/config.sample.yaml
License: MIT