| Crates.io | llmkit |
| lib.rs | llmkit |
| version | 0.1.3 |
| created_at | 2026-01-12 02:51:42.226334+00 |
| updated_at | 2026-01-16 03:09:17.947345+00 |
| description | Production-grade LLM client - 100+ providers, 11,000+ models. Pure Rust. |
| homepage | |
| repository | https://github.com/yfedoseev/llmkit |
| max_upload_size | |
| id | 2036913 |
| size | 4,858,578 |
The production-grade LLM client. One API for 100+ providers. Pure Rust core with native bindings.
11,000+ models · 100+ providers · Rust | Python | Node.js
┌──────────────┐
│ Rust Core │
└──────┬───────┘
┌──────────┬─────────┼─────────┬──────────┐
▼ ▼ ▼ ▼ ▼
┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐
│Python │ │ Node │ │ WASM │ │ Go │ │ ... │
│ ✅ │ │ ✅ │ │ Soon │ │ Soon │ │ │
└───────┘ └───────┘ └───────┘ └───────┘ └───────┘
Documentation · Changelog · Contributing
LLMKit is written in pure Rust — no Python runtime, no garbage-collector pauses, and memory safety enforced at compile time. Deploy with confidence knowing your LLM infrastructure won't degrade over time or crash under load.
| Feature | Description |
|---|---|
| Smart Router | ML-based provider selection optimizing for latency, cost, or reliability |
| Circuit Breaker | Automatic failure detection and recovery with anomaly detection |
| Rate Limiting | Lock-free, hierarchical rate limiting at scale |
| Cost Tracking | Multi-tenant metering with cache-aware pricing |
| Guardrails | PII detection, secret scanning, prompt injection prevention |
| Observability | OpenTelemetry integration for tracing and metrics |
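The Smart Router and Circuit Breaker features above combine into a familiar fallback pattern: try providers in priority order and skip any whose recent failures have tripped a breaker. The sketch below is plain Python with a stubbed provider call — it illustrates the general pattern only, not llmkit's actual API or internals.

```python
import time

class Breaker:
    """Minimal circuit breaker: opens after N consecutive failures,
    then allows a probe request again after a cooldown."""
    def __init__(self, threshold=3, cooldown=30.0):
        self.threshold = threshold      # consecutive failures before opening
        self.cooldown = cooldown        # seconds to stay open
        self.failures = 0
        self.opened_at = None

    def allow(self):
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown:
            self.opened_at = None       # half-open: allow one probe request
            self.failures = 0
            return True
        return False

    def record(self, ok):
        if ok:
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()

def complete_with_fallback(providers, breakers, call):
    """Try providers in priority order, skipping open breakers."""
    for name in providers:
        b = breakers[name]
        if not b.allow():
            continue                    # breaker open: skip this provider
        try:
            result = call(name)
            b.record(ok=True)
            return result
        except Exception:
            b.record(ok=False)
    raise RuntimeError("all providers failed or unavailable")

# Stubbed demo: the first provider always errors, the second succeeds.
breakers = {"anthropic": Breaker(), "openai": Breaker()}
def fake_call(name):
    if name == "anthropic":
        raise RuntimeError("timeout")
    return f"response from {name}"

print(complete_with_fallback(["anthropic", "openai"], breakers, fake_call))
# prints "response from openai"
```

A production router adds what this sketch leaves out — latency/cost scoring per provider, anomaly detection, and lock-free state — but the failover skeleton is the same.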
```rust
use llmkit::{LLMKitClient, Message, CompletionRequest};

let client = LLMKitClient::from_env()?;
let response = client.complete(
    CompletionRequest::new("anthropic/claude-sonnet-4-20250514", vec![Message::user("Hello!")])
).await?;
println!("{}", response.text_content());
```
```python
from llmkit import LLMKitClient, Message, CompletionRequest

client = LLMKitClient.from_env()
response = client.complete(CompletionRequest(
    model="openai/gpt-4o",
    messages=[Message.user("Hello!")]
))
print(response.text_content())
```
```javascript
import { LLMKitClient, Message, CompletionRequest } from 'llmkit-node'

const client = LLMKitClient.fromEnv()
const response = await client.complete(
  new CompletionRequest('anthropic/claude-sonnet-4-20250514', [Message.user('Hello!')])
)
console.log(response.textContent())
```
```toml
[dependencies]
llmkit = { version = "0.1", features = ["anthropic", "openai"] }
```

```bash
pip install llmkit-python
npm install llmkit-node
```
| Chat | Media | Specialized |
|---|---|---|
| Streaming | Image Generation | Embeddings |
| Tool Calling | Vision/Images | Token Counting |
| Structured Output | Audio STT/TTS | Batch Processing |
| Extended Thinking | Video Generation | Model Registry |
| Prompt Caching | | 11,000+ Models |
| Category | Providers |
|---|---|
| Core | Anthropic, OpenAI, Azure OpenAI |
| Cloud | AWS Bedrock, Google Vertex AI, Google AI |
| Fast Inference | Groq, Mistral, Cerebras, SambaNova, Fireworks, DeepSeek |
| Enterprise | Cohere, AI21 |
| Hosted | Together, Perplexity, DeepInfra, OpenRouter |
| Local | Ollama, LM Studio, vLLM |
| Audio | Deepgram, ElevenLabs |
| Video | Runware |
See PROVIDERS.md for the full list with environment variables.
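Throughout this README, model ids follow a `provider/model` convention (e.g. `anthropic/claude-sonnet-4-20250514`, `openai/gpt-4o`), so switching providers is just a string change. A minimal helper illustrating the split — this is an illustrative function, not part of llmkit's API:

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split a 'provider/model' id into its two parts.
    Illustrative only; llmkit resolves these ids internally."""
    provider, _, model = model_id.partition("/")
    if not model:
        raise ValueError(f"expected 'provider/model', got {model_id!r}")
    return provider, model

print(split_model_id("anthropic/claude-sonnet-4-20250514"))
# → ('anthropic', 'claude-sonnet-4-20250514')
```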
```rust
use futures::StreamExt; // needed for `stream.next()`

let mut stream = client.complete_stream(request).await?;
while let Some(chunk) = stream.next().await {
    if let Some(text) = chunk?.text() { print!("{}", text); }
}
```
```python
from llmkit import ToolBuilder

tool = ToolBuilder("get_weather") \
    .description("Get current weather") \
    .string_param("city", "City name", required=True) \
    .build()

request = CompletionRequest(model, messages).with_tools([tool])
```
```python
# Cache large system prompts - save up to 90% on repeated calls
request = CompletionRequest(
    model="anthropic/claude-sonnet-4-20250514",
    messages=[Message.system(large_prompt), Message.user("Question")]
).with_cache()
```
```javascript
// Unified reasoning API across providers
const request = new CompletionRequest('anthropic/claude-sonnet-4-20250514', messages)
  .withThinking({ budgetTokens: 10000 })
const response = await client.complete(request)
console.log(response.thinkingContent()) // See the reasoning process
console.log(response.textContent())     // Final answer
```
```python
from llmkit import get_model_info, get_models_by_provider

# Get model details - no API calls, instant lookup
info = get_model_info("anthropic/claude-sonnet-4-20250514")
print(f"Context: {info.context_window}, Price: ${info.input_price}/1M tokens")

# Find models by provider
anthropic_models = get_models_by_provider("anthropic")
```
For more examples, see examples/.
```bash
git clone https://github.com/yfedoseev/llmkit
cd llmkit
cargo build --release
cargo test

# Python bindings
cd llmkit-python && maturin develop

# Node.js bindings
cd llmkit-node && pnpm install && pnpm build
```
We welcome contributions! See CONTRIBUTING.md for guidelines.
Dual-licensed under MIT or Apache-2.0.
Built with Rust · Production Ready · GitHub