| Crates.io | cllient |
| lib.rs | cllient |
| version | 0.1.2 |
| created_at | 2025-10-28 06:01:16.045973+00 |
| updated_at | 2025-12-26 09:49:27.747531+00 |
| description | A comprehensive Rust client for LLM APIs with unified interface and model management |
| homepage | https://github.com/JuggernautLabs/cllient |
| repository | https://github.com/JuggernautLabs/cllient |
| max_upload_size | |
| id | 1904244 |
| size | 591,836 |
A config-driven LLM client in Rust. Define providers and models in YAML instead of code.
┌──────────────────────────────────────────────────────────────────┐
│                           YAML Configs                           │
│   ┌─────────────────┐      ┌─────────────────┐                   │
│   │ service/        │      │ family/         │                   │
│   │  openai.yaml    │◄─────│  gpt/4o.yaml    │ model references  │
│   │  anthropic.yaml │◄─────│  claude/*.yaml  │ service by name   │
│   │  deepseek.yaml  │◄─────│  deepseek/*.yaml│                   │
│   └─────────────────┘      └─────────────────┘                   │
│            │                        │                            │
│            ▼                        ▼                            │
│   ┌─────────────────────────────────────────┐                    │
│   │              ModelRegistry              │                    │
│   │  - Loads configs (embedded or files)    │                    │
│   │  - Renders Handlebars templates         │                    │
│   │  - Substitutes env vars (API keys)      │                    │
│   └─────────────────────────────────────────┘                    │
│                        │                                         │
│                        ▼                                         │
│   ┌─────────────────────────────────────────┐                    │
│   │          HTTP + SSE Streaming           │                    │
│   │  Provider-specific parsers (OpenAI,     │                    │
│   │  Anthropic, Google) extract content     │                    │
│   └─────────────────────────────────────────┘                    │
│                        │                                         │
│                        ▼                                         │
│   ┌─────────────────────────────────────────┐                    │
│   │          Streaming JSON Output          │                    │
│   │  Valid JSON emitted incrementally -     │                    │
│   │  watch the response build in real-time  │                    │
│   └─────────────────────────────────────────┘                    │
└──────────────────────────────────────────────────────────────────┘
Experimental: Proof-of-concept with 3 tested providers (OpenAI, Anthropic, DeepSeek). 242 additional models available via OpenRouter but not validated. Not production-ready.
# Install from source
cargo install --path .
# Set up API keys
cp .env.example .env
# Edit .env with your API keys
# List available models (alias: ls)
cllient list
# Simple completion (JSON output by default)
cllient ask gpt-4o-mini "What is Rust programming?"
# Streaming - outputs valid JSON incrementally as tokens arrive
cllient stream deepseek-chat "Tell me a story"
# Human-readable output with decorations
cllient --pretty stream deepseek-chat "Tell me a story"
# Clean output for piping (just the response, no JSON)
cllient --clean ask gpt-4o-mini "Hello" | wc -w
# Interactive chat
cllient chat claude-3-haiku-20240307
# Compare models
cllient compare gpt-4o-mini,claude-3-haiku "Explain quantum computing"
# Get help on any command
cllient --help
cllient ask --help
cllient [OPTIONS] <COMMAND>
Commands:
list (ls) List available models
list-services List available services/providers
ask Single completion request
stream Streaming completion request
chat Interactive chat session
compare Compare multiple models on same prompt
debug-response Debug API response issues
Options:
-v, --verbose Enable debug logging
--pretty Human-readable output with decorations
--clean Raw response only (for piping)
-h, --help Print help (works on all subcommands)
-V, --version Print version
use cllient::{ModelRegistry, Message};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let registry = ModelRegistry::new()?;

    // Simple single message
    let response = registry
        .from_id("gpt-4o-mini")?
        .prompt("Hello, world!")
        .send()
        .await?;

    // Multi-turn conversation
    let response = registry
        .from_id("claude-3-haiku-20240307")?
        .messages(vec![
            Message::system("You are a helpful assistant"),
            Message::user("What is 2+2?"),
            Message::assistant("4"),
            Message::user("And 4+4?"),
        ])
        .send()
        .await?;

    // Use cheapest model matching pattern
    let response = registry
        .use_cheapest("claude-*")?
        .prompt("Explain AI")
        .send()
        .await?;

    println!("Response: {}", response.content);
    Ok(())
}
The library provides real-time streaming via StreamEvent. Each event represents a piece of the response as it arrives:
use cllient::{ClientFactory, EmbeddedConfigLoader, EmbeddedClientFactory, CompletionRequest};
use cllient::streaming::StreamEvent;
use futures::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let loader = EmbeddedConfigLoader::new()?;
    let factory = EmbeddedClientFactory::new(loader);
    let client = factory.create_client("gpt-4o-mini")?;

    let request = CompletionRequest::text("user", "Tell me a story")
        .with_streaming(true);

    let mut stream = client.complete_stream(&request).await?;
    while let Some(event) = stream.next().await {
        match event? {
            StreamEvent::Content(text) => print!("{}", text), // Token chunk
            StreamEvent::Start => println!("--- stream started ---"),
            StreamEvent::Finish(reason) => println!("\n--- done: {:?} ---", reason),
            StreamEvent::Usage { input_tokens, output_tokens, .. } => {
                println!("Tokens: in={:?}, out={:?}", input_tokens, output_tokens);
            }
            StreamEvent::Error(e) => eprintln!("Error: {}", e),
            _ => {}
        }
    }
    Ok(())
}
StreamEvent variants:
| Variant | Description |
|---|---|
| Content(String) | A chunk of text content (the main output) |
| Start | Stream has started |
| Finish(Option<String>) | Stream complete, with optional finish reason |
| Usage { input_tokens, output_tokens, total_tokens } | Token usage stats |
| Role(String) | Role information (usually at start) |
| Error(String) | An error occurred |
| Raw(String) | Raw event data (for debugging) |
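If you only need the final text, these events can be folded into a single string. Below is a minimal sketch, not part of the crate's API, assuming the stream yields Result<StreamEvent, E> items as in the example above:

use cllient::streaming::StreamEvent;
use futures::{Stream, StreamExt};

// Hypothetical helper: accumulate Content chunks until the provider
// signals completion with Finish.
async fn collect_text<S, E>(mut stream: S) -> Result<String, E>
where
    S: Stream<Item = Result<StreamEvent, E>> + Unpin,
{
    let mut text = String::new();
    while let Some(event) = stream.next().await {
        match event? {
            StreamEvent::Content(chunk) => text.push_str(&chunk), // append each token chunk
            StreamEvent::Finish(_) => break,                      // response is complete
            _ => {} // Start, Role, Usage, Raw, Error are ignored in this sketch
        }
    }
    Ok(text)
}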
Use StreamEventExt::parse_json<T>() to extract typed data from streaming responses. This is useful when the LLM returns structured JSON (tool calls, structured output, etc.):
use cllient::streaming::{StreamEventExt, StreamItem};
use serde::Deserialize;
use schemars::JsonSchema;
use futures::StreamExt;

#[derive(Deserialize, JsonSchema)]
struct ToolCall {
    name: String,
    args: serde_json::Value,
}

let stream = client.complete_stream(&request).await?;
let mut typed_stream = stream.parse_json::<ToolCall>();

while let Some(item) = typed_stream.next().await {
    match item {
        StreamItem::Data(tool) => println!("Tool call: {}", tool.name),
        StreamItem::Text(t) => print!("{}", t.text), // Non-JSON text
        StreamItem::Token(tok) => print!("{}", tok), // Raw tokens
    }
}
StreamItem<T> variants:
| Variant | Description |
|---|---|
| Data(T) | Successfully parsed JSON matching your type |
| Text(TextContent) | Plain text or JSON that doesn't match type T |
| Token(String) | Individual token for real-time display |
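As a small variation on the loop above (a sketch reusing typed_stream and ToolCall from that example), you can keep only the values that parsed successfully:

// Collect only successfully parsed ToolCall values, ignoring plain text
// and raw tokens (continues from `typed_stream` above).
let mut calls: Vec<ToolCall> = Vec::new();
while let Some(item) = typed_stream.next().await {
    if let StreamItem::Data(tool) = item {
        calls.push(tool);
    }
}
println!("collected {} tool call(s)", calls.len());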
The core idea is "unassociated truth" - services and models are separate entities that can be mixed:
Traditional: gpt-4 → hardcoded to OpenAI API
cllient: gpt-4 → { OpenAI, Azure, OpenRouter }
Model configs reference service configs
Service configs are HTTP templates
How it works:
Service configs (config/service/*.yaml) define HTTP request templates using Handlebars:
http:
  request: |
    POST /v1/chat/completions HTTP/1.1
    Authorization: Bearer ${OPENAI_API_KEY}

    {"model": "{{model_id}}", "messages": {{json messages}}}
Model configs (config/family/*/*.yaml) reference a service and add metadata:
model:
  id: gpt-4o-mini
  service: openai  # Points to config/service/openai.yaml
  capabilities:
    context_window: 128000
At runtime, the registry loads these configs, renders the Handlebars templates, and makes the HTTP requests.
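To make the rendering step concrete, here is a rough sketch of template rendering plus env-var substitution using the handlebars crate directly. This is not cllient's actual code path; the variable names and the plain {{{messages}}} placeholder (standing in for the crate's {{json messages}} helper) are assumptions for illustration:

use handlebars::Handlebars;
use serde_json::json;

// Sketch only: render an HTTP request template like the one above, then
// substitute the ${OPENAI_API_KEY} placeholder from the environment.
fn render_request() -> Result<String, Box<dyn std::error::Error>> {
    // Triple braces keep the pre-serialized messages JSON from being HTML-escaped.
    let template = "POST /v1/chat/completions HTTP/1.1\n\
                    Authorization: Bearer ${OPENAI_API_KEY}\n\
                    \n\
                    {\"model\": \"{{model_id}}\", \"messages\": {{{messages}}}}";

    let rendered = Handlebars::new().render_template(
        template,
        &json!({
            "model_id": "gpt-4o-mini",
            "messages": serde_json::to_string(&json!([
                {"role": "user", "content": "Hello"}
            ]))?,
        }),
    )?;

    // ${VAR} placeholders are filled in from the environment at request time.
    Ok(rendered.replace("${OPENAI_API_KEY}", &std::env::var("OPENAI_API_KEY")?))
}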
Why this is neat:
Core Components:
Tested & Working (97 models across 5 direct integrations):
OpenAI: gpt-4o, gpt-4o-mini, o1, o1-mini, etc.
Anthropic: claude-3-opus, claude-3-5-sonnet, claude-3-haiku
DeepSeek: deepseek-chat, deepseek-coder, deepseek-v3
Google: gemini-2.0-flash, gemini-pro, gemma variants
Via OpenRouter (242 models, untested):
Requires the OPEN_ROUTER_API_KEY environment variable.
# Tested providers (add to .env)
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
DEEPSEEK_API_KEY=your_deepseek_key
# OpenRouter (gives access to 242 models)
OPEN_ROUTER_API_KEY=your_openrouter_key
HTTP_REFERER=http://localhost:3000 # Required by OpenRouter
X_TITLE=cllient # Optional app name
# Google (untested but configured)
GOOGLE_API_KEY=your_google_key
# Optional: Custom config directory
CLLIENT_CONFIG_DIR=/path/to/custom/configs
# config/service/newprovider.yaml
service:
  name: NewProvider
  base_url: https://api.newprovider.com

http:
  request: |
    POST /v1/chat/completions HTTP/1.1
    Authorization: Bearer ${NEWPROVIDER_API_KEY}

    {
      "model": "{{model_id}}",
      "messages": {{json messages}},
      "stream": {{stream}}
    }
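Then add a model config under config/family/ that references the new service (see the model config example above), and the registry API should pick it up. A hypothetical smoke test, where the model id newprovider-chat is made up for illustration:

use cllient::ModelRegistry;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Loads service + family configs, including the new provider.
    let registry = ModelRegistry::new()?;

    // "newprovider-chat" is a hypothetical model id defined in config/family/.
    let response = registry
        .from_id("newprovider-chat")?
        .prompt("ping")
        .send()
        .await?;

    println!("{}", response.content);
    Ok(())
}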
Complete Configuration Guide: 4. Configuration Documentation
# Build project
make build
# Run tests
make test
# See all commands
make help
# Fetch latest models from providers
make models-openai
make models-anthropic
make models-all
# Pricing analysis
make pricing-cheapest
make cost-analysis
Complete Development Guide: 6. Development Documentation
Completed:
Aspirational (Not Started):
This is an experimental project, so contributions are welcome but come with caveats:
Good first contributions:
Before contributing:
If you still want to help:
git checkout -b feature/amazing-feature
make test && make lint
See: 6. Development Guide for detailed workflows.
Use this if:
Don't use this if:
Alternatives to consider:
openai, anthropic, google-generativeai Python packages
This project is licensed under the terms specified in the LICENSE file.
src/bin/cllient.rs - CLI implementation
src/runtime.rs - High-level Runtime API
src/client.rs - HTTP client and streaming
src/config.rs - Configuration system
config/ - Service and model configurations
docs/ - Complete documentation
Quick Links: Documentation | Examples | Configuration | Architecture
Disclaimer: This is experimental research software. For production use, consider LiteLLM or official provider SDKs.