| Crates.io | llm-connector |
| lib.rs | llm-connector |
| version | 0.5.13 |
| created_at | 2025-09-28 10:58:27.040517+00 |
| updated_at | 2026-01-03 13:36:58.209845+00 |
| description | Next-generation Rust library for LLM protocol abstraction with native multi-modal support. Supports 11+ providers (OpenAI, Anthropic, Google, Aliyun, Zhipu, Ollama, Tencent, Volcengine, LongCat, Moonshot, DeepSeek) with clean Protocol/Provider separation, type-safe interface, and universal streaming. |
| homepage | |
| repository | https://github.com/lipish/llm-connector |
| max_upload_size | |
| id | 1858240 |
| size | 603,704 |
Next-generation Rust library for LLM protocol abstraction with native multi-modal support.
Supports 11+ providers: OpenAI, Anthropic, Google, Aliyun, Zhipu, Ollama, Tencent, Volcengine, LongCat, Moonshot, DeepSeek, and more. Clean architecture with a unified output format, multi-modal content support, and configuration-driven design for maximum flexibility.
Add to your Cargo.toml:
[dependencies]
llm-connector = "0.5.13"
tokio = { version = "1", features = ["full"] }
Optional features:
# Streaming support (enables chat_stream() and the StreamingResponse type)
llm-connector = { version = "0.5.13", features = ["streaming"] }
use llm_connector::{LlmClient, types::{ChatRequest, Message, Role}};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// OpenAI
let client = LlmClient::openai("sk-...")?;
// Anthropic Claude
let client = LlmClient::anthropic("sk-ant-...")?;
// Aliyun DashScope
let client = LlmClient::aliyun("sk-...")?;
// Zhipu GLM
let client = LlmClient::zhipu("your-api-key")?;
// Ollama (local, no API key needed)
let client = LlmClient::ollama()?;
let request = ChatRequest {
model: "gpt-4".to_string(),
messages: vec![Message::text(Role::User, "Hello!")],
..Default::default()
};
let response = client.chat(&request).await?;
println!("Response: {}", response.content);
Ok(())
}
Send text and images in a single message:
use llm_connector::{LlmClient, types::{ChatRequest, Message, MessageBlock, Role}};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let client = LlmClient::openai("sk-...")?;
// Text + Image URL
let request = ChatRequest {
model: "gpt-4o".to_string(),
messages: vec![
Message::new(
Role::User,
vec![
MessageBlock::text("What's in this image?"),
MessageBlock::image_url("https://example.com/image.jpg"),
],
),
],
..Default::default()
};
// Text + Base64 Image
let request = ChatRequest {
model: "gpt-4o".to_string(),
messages: vec![
Message::new(
Role::User,
vec![
MessageBlock::text("Analyze this image"),
MessageBlock::image_base64("image/jpeg", base64_data),
],
),
],
..Default::default()
};
let response = client.chat(&request).await?;
println!("Response: {}", response.content);
Ok(())
}
Supported Content Types:
- MessageBlock::text(text) - Text content
- MessageBlock::image_url(url) - Image from URL (OpenAI format)
- MessageBlock::image_base64(media_type, data) - Base64-encoded image
- MessageBlock::image_url_anthropic(url) - Image from URL (Anthropic format)
Provider Support:
See examples/multimodal_basic.rs for more examples.
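The Anthropic-format image block follows the same pattern; a minimal sketch using the image_url_anthropic constructor listed above (model name and URL are placeholders):
use llm_connector::{LlmClient, types::{ChatRequest, Message, MessageBlock, Role}};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = LlmClient::anthropic("sk-ant-...")?;
    let request = ChatRequest {
        model: "claude-3-5-sonnet-20241022".to_string(),
        messages: vec![Message::new(
            Role::User,
            vec![
                MessageBlock::text("Describe this image"),
                // Anthropic-format image URL block
                MessageBlock::image_url_anthropic("https://example.com/image.jpg"),
            ],
        )],
        max_tokens: Some(200), // Anthropic requires max_tokens
        ..Default::default()
    };
    let response = client.chat(&request).await?;
    println!("Response: {}", response.content);
    Ok(())
}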
Get a list of all supported provider names:
use llm_connector::LlmClient;
fn main() {
let providers = LlmClient::supported_providers();
for provider in providers {
println!("{}", provider);
}
}
Output:
openai
aliyun
anthropic
zhipu
ollama
tencent
volcengine
longcat_anthropic
azure_openai
openai_compatible
See examples/list_providers.rs for a complete example.
llm-connector supports 11+ LLM providers with a unified interface:
| Provider | Quick Start | Features |
|---|---|---|
| OpenAI | LlmClient::openai("sk-...") | Chat, Streaming, Tools, Multi-modal, Reasoning (o1) |
| Anthropic | LlmClient::anthropic("sk-ant-...") | Chat, Streaming, Multi-modal |
| Aliyun | LlmClient::aliyun("sk-...") | Chat, Streaming, Qwen models |
| Zhipu | LlmClient::zhipu("key") | Chat, Streaming, Tools, GLM models |
| Ollama | LlmClient::ollama() | Chat, Streaming, Local models, Model management |
| Tencent | LlmClient::tencent("id", "key") | Chat, Streaming, Hunyuan models (Native V3) |
| Volcengine | LlmClient::volcengine("key") | Chat, Streaming, Reasoning (Doubao-Seed-Code) |
| DeepSeek | LlmClient::deepseek("sk-...") | Chat, Streaming, Reasoning (R1) |
| Moonshot | LlmClient::moonshot("sk-...") | Chat, Streaming, Long context |
| Google | LlmClient::google("key") | Chat, Gemini models |
| LongCat | LlmClient::longcat_openai("ak-...") | Chat, Streaming |
For detailed provider documentation and advanced configuration, see:
llm-connector supports OpenAI-compatible function calling (tools) with both streaming and non-streaming modes.
use llm_connector::{LlmClient, types::*};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let client = LlmClient::zhipu("your-api-key")?;
// Define tools
let tools = vec![Tool {
tool_type: "function".to_string(),
function: Function {
name: "get_weather".to_string(),
description: Some("Get weather information for a city".to_string()),
parameters: serde_json::json!({
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City name, e.g. Beijing, Shanghai"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "Temperature unit"
}
},
"required": ["location"]
}),
},
}];
let request = ChatRequest {
model: "glm-4-plus".to_string(),
messages: vec![Message::text(Role::User, "What's the weather in Beijing?")],
tools: Some(tools),
tool_choice: Some(ToolChoice::auto()),
..Default::default()
};
let response = client.chat(&request).await?;
// Check if tools were called
if let Some(choice) = response.choices.first() {
if let Some(tool_calls) = &choice.message.tool_calls {
for call in tool_calls {
println!("Tool: {}", call.function.name);
println!("Arguments: {}", call.function.arguments);
}
}
}
Ok(())
}
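Once the model returns a tool call, your application parses the arguments, runs the matching local function, and sends the result back in a follow-up request (see the multi-round example referenced below). A minimal sketch of the dispatch step, assuming function.arguments is a JSON-encoded string as printed above; get_weather here is a hypothetical stand-in:
use serde_json::Value;
// Hypothetical local implementation of the get_weather tool.
fn get_weather(location: &str, unit: &str) -> String {
    serde_json::json!({ "location": location, "temperature": 22, "unit": unit }).to_string()
}
// Dispatch a single tool call returned by the model.
fn dispatch_tool_call(name: &str, arguments: &str) -> Option<String> {
    match name {
        "get_weather" => {
            // Parse the JSON arguments produced by the model.
            let args: Value = serde_json::from_str(arguments).ok()?;
            let location = args.get("location")?.as_str()?;
            let unit = args.get("unit").and_then(|u| u.as_str()).unwrap_or("celsius");
            Some(get_weather(location, unit))
        }
        _ => None, // unknown tool name
    }
}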
use futures_util::StreamExt;
let request = ChatRequest {
model: "glm-4-plus".to_string(),
messages: vec![Message::text(Role::User, "What's the weather?")],
tools: Some(tools),
stream: Some(true),
..Default::default()
};
let mut stream = client.chat_stream(&request).await?;
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
if let Some(choice) = chunk.choices.first() {
if let Some(tool_calls) = &choice.delta.tool_calls {
// Process tool calls incrementally
for call in tool_calls {
println!("Tool: {}", call.function.name);
}
}
}
}
Key Features:
For complete examples, see:
- examples/zhipu_tools.rs - Basic tool calling
- examples/zhipu_multiround_tools.rs - Multi-round conversations with tools
- examples/test_aliyun_streaming_tools.rs - Streaming tool calls
Technical Details: See docs/ARCHITECTURE.md for implementation details.
llm-connector provides unified streaming support across all providers with the streaming feature.
[dependencies]
llm-connector = { version = "0.5.12", features = ["streaming"] }
use llm_connector::{LlmClient, types::{ChatRequest, Message, Role}};
use futures_util::StreamExt;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let client = LlmClient::openai("sk-...")?;
let request = ChatRequest {
model: "gpt-4".to_string(),
messages: vec![Message::text(Role::User, "Tell me a story")],
stream: Some(true),
..Default::default()
};
let mut stream = client.chat_stream(&request).await?;
while let Some(chunk) = stream.next().await {
if let Some(content) = chunk?.get_content() {
print!("{}", content);
}
}
Ok(())
}
All providers return the same StreamingResponse type, making it easy to switch between providers without changing your code.
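Because of that, a streaming helper can be written once and reused with any client. A minimal sketch, assuming chat_stream() yields Result items with the LlmConnectorError type shown in the error-handling section:
use futures_util::StreamExt;
use llm_connector::{LlmClient, error::LlmConnectorError, types::ChatRequest};
// Collect a streamed reply from any provider into a single String.
async fn collect_stream(client: &LlmClient, request: &ChatRequest) -> Result<String, LlmConnectorError> {
    let mut stream = client.chat_stream(request).await?;
    let mut text = String::new();
    while let Some(chunk) = stream.next().await {
        // Every chunk is the same StreamingResponse type, regardless of provider.
        if let Some(content) = chunk?.get_content() {
            text.push_str(&content);
        }
    }
    Ok(text)
}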
For examples, see:
- examples/anthropic_streaming.rs
- examples/ollama_streaming.rs
- examples/volcengine_streaming.rs
Standard OpenAI API format with multiple deployment options.
// OpenAI (default)
let client = LlmClient::openai("sk-...")?;
// Custom base URL
let client = LlmClient::openai_with_base_url("sk-...", "https://api.deepseek.com")?;
// Azure OpenAI
let client = LlmClient::azure_openai(
"your-key",
"https://your-resource.openai.azure.com",
"2024-02-15-preview"
)?;
// OpenAI-compatible services
let client = LlmClient::openai_compatible("sk-...", "https://api.deepseek.com", "deepseek")?;
Features:
- List available models with models()
Example Models: gpt-4, gpt-4-turbo, gpt-3.5-turbo, o1-preview, o1-mini
Claude Messages API with multiple deployment options.
// Standard Anthropic API
let client = LlmClient::anthropic("sk-ant-...")?;
// Google Vertex AI
let client = LlmClient::anthropic_vertex("project-id", "us-central1", "access-token")?;
// Amazon Bedrock
let client = LlmClient::anthropic_bedrock("us-east-1", "access-key", "secret-key")?;
Models: claude-3-5-sonnet-20241022, claude-3-opus, claude-3-haiku
Supports both native and OpenAI-compatible formats.
// Native format
let client = LlmClient::zhipu("your-api-key")?;
// OpenAI-compatible format (recommended)
let client = LlmClient::zhipu_openai_compatible("your-api-key")?;
Models: glm-4, glm-4-flash, glm-4-air, glm-4-plus, glm-4x
Custom protocol for Qwen models with regional support.
// Default (China)
let client = LlmClient::aliyun("sk-...")?;
// International
let client = LlmClient::aliyun_international("sk-...")?;
// Private cloud
let client = LlmClient::aliyun_private("sk-...", "https://your-endpoint.com")?;
Models: qwen-turbo, qwen-plus, qwen-max
Local LLM server with comprehensive model management.
// Default: localhost:11434
let client = LlmClient::ollama()?;
// Custom URL
let client = LlmClient::ollama_with_base_url("http://192.168.1.100:11434")?;
// With custom configuration
let client = LlmClient::ollama_with_config(
"http://localhost:11434",
Some(120), // timeout in seconds
None // proxy
)?;
Models: llama3.2, llama3.1, mistral, mixtral, qwen2.5, etc.
Features:
OpenAI-compatible API for Tencent Cloud.
// Default (Native API v3)
let client = LlmClient::tencent("secret-id", "secret-key")?;
// With custom configuration
let client = LlmClient::tencent_with_config(
"secret-id",
"secret-key",
None, // base_url (uses default)
Some(60), // timeout in seconds
None // proxy
)?;
// Streaming example
#[cfg(feature = "streaming")]
{
use futures_util::StreamExt;
let request = ChatRequest {
model: "hunyuan-standard".to_string(),
messages: vec![Message::user("Write a short poem about the ocean.")],
stream: Some(true),
..Default::default()
};
let mut stream = client.chat_stream(&request).await?;
while let Some(chunk) = stream.next().await {
if let Some(content) = chunk?.get_content() {
print!("{}", content);
}
}
}
Models: hunyuan-lite, hunyuan-standard, hunyuan-pro, hunyuan-turbo
See examples/tencent_native_streaming.rs for a runnable end-to-end streaming example.
OpenAI-compatible API with custom endpoint paths. Supports both standard chat models and reasoning models (Doubao-Seed-Code).
// Default
let client = LlmClient::volcengine("api-key")?;
// With custom configuration
let client = LlmClient::volcengine_with_config(
"api-key",
None, // base_url (uses default: https://ark.cn-beijing.volces.com)
Some(120), // timeout in seconds
None // proxy
)?;
// Streaming example (works with both standard and reasoning models)
#[cfg(feature = "streaming")]
{
use futures_util::StreamExt;
let request = ChatRequest {
model: "ep-20250118155555-xxxxx".to_string(), // Use endpoint ID as model
messages: vec![Message::user("Introduce yourself")],
stream: Some(true),
..Default::default()
};
let mut stream = client.chat_stream(&request).await?;
while let Some(chunk) = stream.next().await {
if let Some(content) = chunk?.get_content() {
print!("{}", content);
}
}
}
Endpoint: Uses /api/v3/chat/completions instead of /v1/chat/completions
Models:
- Standard chat models (endpoint IDs like ep-...)
- Reasoning models such as Doubao-Seed-Code (use the reasoning_content field, automatically handled)
Streaming Support: Full support for both standard and reasoning models. The library automatically extracts content from the appropriate field (content or reasoning_content).
Supports both OpenAI and Anthropic formats.
// OpenAI format
let client = LlmClient::longcat_openai("ak-...")?;
// Anthropic format (with Bearer auth)
let client = LlmClient::longcat_anthropic("ak-...")?;
Models: LongCat-Flash-Chat and other LongCat models
Note: LongCat's Anthropic format uses Authorization: Bearer instead of x-api-key
OpenAI-compatible API for Moonshot AI.
// Default
let client = LlmClient::moonshot("sk-...")?;
// With custom configuration
let client = LlmClient::moonshot_with_config(
"sk-...",
None, // base_url (uses default)
Some(60), // timeout in seconds
None // proxy
)?;
Models: moonshot-v1-8k, moonshot-v1-32k, moonshot-v1-128k
Features:
OpenAI-compatible API with reasoning models support.
// Default
let client = LlmClient::deepseek("sk-...")?;
// With custom configuration
let client = LlmClient::deepseek_with_config(
"sk-...",
None, // base_url (uses default)
Some(60), // timeout in seconds
None // proxy
)?;
Models:
- deepseek-chat - Standard chat model
- deepseek-reasoner - Reasoning model with thinking process
Features:
Reasoning Model Example:
let request = ChatRequest {
model: "deepseek-reasoner".to_string(),
messages: vec![Message {
role: Role::User,
content: "Which is larger, 9.11 or 9.9?".to_string(),
..Default::default()
}],
..Default::default()
};
let response = client.chat(&request).await?;
// Get reasoning process (thinking)
if let Some(reasoning) = response.reasoning_content {
println!("Thinking: {}", reasoning);
}
// Get final answer
println!("Answer: {}", response.content);
Native Google Gemini API support.
// Environment variable (recommended)
// export GEMINI_API_KEY="your-api-key"
// Default
let client = LlmClient::google("api-key")?;
// With custom configuration
let client = LlmClient::google_with_config(
"api-key",
None, // base_url (uses default)
Some(60), // timeout in seconds
None // proxy
)?;
Models: gemini-2.0-flash, gemini-2.0-flash-lite-preview-02-05, gemini-1.5-flash, gemini-1.5-pro
Features:
Notes:
- See examples/google_basic.rs for a runnable end-to-end example.
- See examples/google_streaming.rs for streaming.
Streaming:
Run the streaming example:
export GEMINI_API_KEY="your-api-key"
export GEMINI_MODEL="gemini-2.5-flash"
cargo run --example google_streaming --features streaming
Supported environment variables in google_streaming example:
- GEMINI_API_KEY: Google AI Studio API key
- GEMINI_MODEL: model name (e.g. gemini-2.5-flash)
- GEMINI_PROMPT: override the prompt text
- GEMINI_MAX_TOKENS: override max_tokens (e.g. 4096)
- GEMINI_TIMEOUT_SECS: per-chunk timeout used by the example
Access Ollama-specific features through the special interface:
let client = LlmClient::ollama()?;
// Access Ollama-specific features
if let Some(ollama) = client.as_ollama() {
// List all installed models
let models = ollama.models().await?;
for model in models {
println!("Available model: {}", model);
}
// Pull a new model
ollama.pull_model("llama3.2").await?;
// Get detailed model information
let details = ollama.show_model("llama3.2").await?;
println!("Model format: {}", details.details.format);
// Check if model exists
let exists = ollama.model_exists("llama3.2").await?;
println!("Model exists: {}", exists);
// Delete a model
ollama.delete_model("llama3.2").await?;
}
- models() - Get all locally installed models
- pull_model(name) - Download models from the registry
- delete_model(name) - Remove local models
- show_model(name) - Get comprehensive model information
- model_exists(name) - Verify whether a model is installed
The library provides comprehensive streaming support with universal format abstraction for maximum flexibility:
use futures_util::StreamExt;
use llm_connector::{LlmClient, types::{ChatRequest, Message, Role}};
let client = LlmClient::anthropic("sk-ant-...")?;
let request = ChatRequest {
model: "claude-3-5-sonnet-20241022".to_string(),
messages: vec![Message {
role: Role::User,
content: "Hello!".to_string(),
..Default::default()
}],
max_tokens: Some(200),
..Default::default()
};
let mut stream = client.chat_stream(&request).await?;
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
if let Some(content) = chunk.get_content() {
print!("{}", content);
}
}
For perfect compatibility with tools like Zed.dev, use the pure Ollama streaming format:
use futures_util::StreamExt;
// Use pure Ollama format (perfect for Zed.dev)
let mut stream = client.chat_stream_ollama(&request).await?;
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
// chunk is now a pure OllamaStreamChunk
if !chunk.message.content.is_empty() {
print!("{}", chunk.message.content);
}
// Check for final chunk
if chunk.done {
println!("\nStreaming complete!");
break;
}
}
For backward compatibility, the embedded format is still available:
use futures_util::StreamExt;
// Use embedded Ollama format (legacy)
let mut stream = client.chat_stream_ollama_embedded(&request).await?;
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
// chunk.content contains Ollama-formatted JSON string
if let Ok(ollama_chunk) = serde_json::from_str::<serde_json::Value>(&chunk.content) {
if let Some(content) = ollama_chunk
.get("message")
.and_then(|m| m.get("content"))
.and_then(|c| c.as_str())
{
print!("{}", content);
}
}
}
For real-time streaming responses, use the streaming interface:
use llm_connector::types::{ChatRequest, Message};
use futures_util::StreamExt;
let request = ChatRequest {
model: "gpt-4".to_string(),
messages: vec![Message::user("Tell me a story")],
stream: Some(true),
..Default::default()
};
let mut stream = client.chat_stream(&request).await?;
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
// Get content from the current chunk
if let Some(content) = chunk.get_content() {
print!("{}", content);
}
// Access reasoning content (for providers that support it)
if let Some(reasoning) = &chunk.reasoning_content {
println!("Reasoning: {}", reasoning);
}
}
The streaming response provides rich information and convenience methods:
let mut stream = client.chat_stream(&request).await?;
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
// Access structured data
println!("Model: {}", chunk.model);
println!("ID: {}", chunk.id);
// Get content from first choice
if let Some(content) = chunk.get_content() {
print!("{}", content);
}
// Access all choices
for choice in &chunk.choices {
if let Some(content) = &choice.delta.content {
print!("{}", content);
}
}
// Check for completion
if chunk.choices.iter().any(|c| c.finish_reason.is_some()) {
println!("\nStream completed!");
break;
}
}
| Format | Output Example | Use Case |
|---|---|---|
| JSON | {"content":"hello"} | API responses, standard JSON |
| SSE | data: {"content":"hello"}\n\n | Web real-time streaming |
| NDJSON | {"content":"hello"}\n | Log processing, data pipelines |
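As an illustration of these wire formats (not a library API), a chunk's content can be re-wrapped with serde_json; a minimal sketch:
use serde_json::json;
// Wrap a chunk's text content in the three wire formats from the table above.
fn emit_formats(content: &str) -> (String, String, String) {
    let payload = json!({ "content": content }).to_string();
    let json_line = payload.clone();                  // {"content":"hello"}
    let sse_frame = format!("data: {}\n\n", payload); // data: {"content":"hello"}\n\n
    let ndjson_line = format!("{}\n", payload);       // {"content":"hello"}\n
    (json_line, sse_frame, ndjson_line)
}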
Anthropic streams are converted from their native message_start, content_block_delta, message_delta, and message_stop events.
Fetch the latest available models from the API:
let client = LlmClient::openai("sk-...")?;
// Fetch models online from the API
let models = client.models().await?;
println!("Available models: {:?}", models);
Supported by:
- Ollama (via /api/tags)
Example Results:
["deepseek-chat", "deepseek-reasoner"]["glm-4.5", "glm-4.5-air", "glm-4.6"]["moonshot-v1-32k", "kimi-latest", ...]Recommendation:
- Cache models() results to avoid repeated API calls
let request = ChatRequest {
model: "gpt-4".to_string(),
messages: vec![
Message::system("You are a helpful assistant."),
Message::user("Hello!"),
],
temperature: Some(0.7),
max_tokens: Some(100),
..Default::default()
};
let request = ChatRequest {
model: "claude-3-5-sonnet-20241022".to_string(),
messages: vec![Message::user("Hello!")],
max_tokens: Some(200), // Required for Anthropic
..Default::default()
};
let request = ChatRequest {
model: "qwen-max".to_string(),
messages: vec![Message::user("Hello!")],
..Default::default()
};
let request = ChatRequest {
model: "llama3.2".to_string(),
messages: vec![Message::user("Hello!")],
..Default::default()
};
If you expose an Ollama-compatible API while the backend actually calls Zhipu glm-4.6 (remote gateway), you do NOT need any local model installation. Just point the client to your gateway and use the model id defined by your service:
use futures_util::StreamExt;
use llm_connector::{LlmClient, types::{ChatRequest, Message}};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Point to your remote Ollama-compatible gateway (replace with your actual URL)
let client = LlmClient::ollama_with_base_url("https://your-ollama-gateway.example.com")?;
let request = ChatRequest {
model: "glm-4.6".to_string(),
messages: vec![Message::user("Briefly explain the benefits of streaming.")],
max_tokens: Some(128),
..Default::default()
};
let mut stream = client.chat_stream(&request).await?;
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
if let Some(content) = chunk.get_content() {
print!("{}", content);
}
}
Ok(())
}
Run example (requires streaming feature):
cargo run --example ollama_streaming --features streaming
Note: This setup targets a remote Ollama-compatible gateway. The model id is defined by your backend (e.g. glm-4.6); no local installation is required. If your gateway uses a different identifier, replace it accordingly.
llm-connector provides universal support for reasoning models across different providers. No matter which field the reasoning content is in (reasoning_content, reasoning, thought, thinking), it's automatically extracted and available via get_content().
| Provider | Model | Reasoning Field | Status |
|---|---|---|---|
| Volcengine | Doubao-Seed-Code | reasoning_content | Verified |
| DeepSeek | DeepSeek R1 | reasoning_content / reasoning | Supported |
| OpenAI | o1-preview, o1-mini | thought / reasoning_content | Supported |
| Qwen | Qwen-Plus | reasoning | Supported |
| Anthropic | Claude 3.5 Sonnet | thinking | Supported |
The same code works for all reasoning models:
use futures_util::StreamExt;
// Works with Volcengine Doubao-Seed-Code
let provider = LlmClient::volcengine_with_config("api-key", None, Some(60), None)?;
// Works with DeepSeek R1
// let provider = LlmClient::openai_with_base_url("deepseek-key", "https://api.deepseek.com")?;
// Works with OpenAI o1
// let provider = LlmClient::openai("openai-key")?;
let request = ChatRequest {
model: "ep-20250118155555-xxxxx".to_string(), // or "deepseek-reasoner", "o1-preview", etc.
messages: vec![Message::user("Solve this problem")],
stream: Some(true),
..Default::default()
};
let mut stream = provider.chat_stream(&request).await?;
while let Some(chunk) = stream.next().await {
if let Some(content) = chunk?.get_content() {
print!("{}", content); // Automatically extracts reasoning content
}
}
Key Benefits:
- The content field takes precedence when available
See the Reasoning Models Support Guide for detailed documentation.
use llm_connector::error::LlmConnectorError;
match client.chat(&request).await {
Ok(response) => {
println!("Response: {}", response.choices[0].message.content);
}
Err(e) => {
match e {
LlmConnectorError::AuthenticationError(msg) => {
eprintln!("Auth error: {}", msg);
}
LlmConnectorError::RateLimitError(msg) => {
eprintln!("Rate limit: {}", msg);
}
LlmConnectorError::UnsupportedOperation(msg) => {
eprintln!("Not supported: {}", msg);
}
_ => eprintln!("Error: {}", e),
}
}
}
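Rate-limit errors can also be retried with a short backoff. A minimal sketch built on the same error enum, assuming chat() returns LlmConnectorError and a response with a content field as in the examples above (the delays are arbitrary):
use std::time::Duration;
use llm_connector::{LlmClient, error::LlmConnectorError, types::ChatRequest};
// Retry a chat call a few times when the provider reports a rate limit.
async fn chat_with_retry(client: &LlmClient, request: &ChatRequest) -> Result<String, LlmConnectorError> {
    let mut attempts: u64 = 0;
    loop {
        match client.chat(request).await {
            Ok(response) => return Ok(response.content),
            Err(LlmConnectorError::RateLimitError(msg)) if attempts < 3 => {
                attempts += 1;
                eprintln!("Rate limited ({}), retry attempt {}...", msg, attempts);
                // Simple linear backoff; tune to your provider's limits.
                tokio::time::sleep(Duration::from_secs(2 * attempts)).await;
            }
            Err(e) => return Err(e),
        }
    }
}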
let client = LlmClient::openai("your-api-key");
export OPENAI_API_KEY="sk-your-key"
export ANTHROPIC_API_KEY="sk-ant-your-key"
export ALIYUN_API_KEY="sk-your-key"
use std::env;
let api_key = env::var("OPENAI_API_KEY")?;
let client = LlmClient::openai(&api_key)?;
let client = LlmClient::openai("sk-...")?;
// Get provider name
println!("Provider: {}", client.provider_name());
// Fetch models online (requires API call)
let models = client.models().await?;
println!("Available models: {:?}", models);
Many providers return hidden or provider-specific keys for model reasoning content (chain-of-thought). To simplify usage across providers, we normalize four common keys:
reasoning_content, reasoning, thought, thinking
Post-processing automatically scans the raw JSON and fills these optional fields on both regular messages (Message) and streaming deltas (Delta). You can read the first available value via a convenience method:
// Non-streaming
let msg = &response.choices[0].message;
if let Some(reason) = msg.reasoning_any() {
println!("Reasoning: {}", reason);
}
// Streaming
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
if let Some(reason) = chunk.choices[0].delta.reasoning_any() {
println!("Reasoning (stream): {}", reason);
}
}
Notes:
- The fields are None if the provider does not return any reasoning keys.
- StreamingResponse also backfills its top-level reasoning_content from the first delta that contains reasoning.
All providers output the same unified StreamingResponse format, regardless of their native API format.
Different Input Formats → Protocol Conversion → Unified StreamingResponse
// Same code works with ANY provider
let mut stream = client.chat_stream(&request).await?;
while let Some(chunk) = stream.next().await {
let chunk = chunk?; // Always StreamingResponse
// Unified access methods
if let Some(content) = chunk.get_content() {
print!("{}", content);
}
if let Some(reason) = chunk.get_finish_reason() {
println!("\nfinish_reason: {}", reason);
}
if let Some(usage) = chunk.usage {
println!("usage: {:?}", usage);
}
}
| Provider | Native Format | Conversion | Complexity |
|---|---|---|---|
| OpenAI | OpenAI standard | Direct mapping | Simple |
| Tencent | OpenAI compatible | Direct mapping | Simple |
| Volcengine | OpenAI compatible | Direct mapping | Simple |
| Anthropic | Multi-event stream | Custom parser | Complex |
| Aliyun | DashScope format | Custom parser | Medium |
| Zhipu | GLM format | Custom parser | Medium |
All conversions happen transparently in the Protocol layer - you just get consistent StreamingResponse objects!
Authentication Error:
Authentication failed: Incorrect API key provided
Solutions:
Timeout Error:
Request timeout
Solutions:
- Increase the timeout with the *_with_timeout() methods
Model Not Found:
Model not found
Solutions:
- Use fetch_models() to get available models
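A minimal sketch of that check, using the models() listing shown earlier and assuming it returns a list of model-name strings (provider, key, and model are placeholders):
use llm_connector::LlmClient;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = LlmClient::openai("sk-...")?;
    let wanted = "gpt-4";
    // Fetch the provider's current model list and verify the id before chatting.
    let models = client.models().await?;
    if models.iter().any(|m| m.as_str() == wanted) {
        println!("Model {} is available", wanted);
    } else {
        eprintln!("Model {} not found; available: {:?}", wanted, models);
    }
    Ok(())
}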
Tencent Native API v3
- Native request signing (TC3-HMAC-SHA256)
- tencent() now accepts (secret_id, secret_key)
Streaming Tool Calls Fix
Universal Reasoning Models Support
Simplified Configuration Architecture
- Unified streaming through the chat_stream() method
For the complete changelog, see CHANGELOG.md
Minimal by Design:
Protocol-first:
Check out the examples/ directory for various usage examples:
# Basic usage examples
cargo run --example openai_basic
cargo run --example anthropic_streaming --features streaming
cargo run --example aliyun_basic
cargo run --example zhipu_basic
cargo run --example ollama_basic
cargo run --example tencent_basic
# Multi-modal support
cargo run --example multimodal_basic
# Ollama model management
cargo run --example ollama_model_management
cargo run --example ollama_streaming --features streaming
# Function calling / Tools
cargo run --example zhipu_tools
cargo run --example zhipu_multiround_tools
# Streaming examples
cargo run --example volcengine_streaming --features streaming
cargo run --example test_longcat_anthropic_streaming --features streaming
# List all available providers
cargo run --example list_providers
Basic Examples:
- openai_basic.rs - Simple OpenAI chat example
- anthropic_streaming.rs - Anthropic streaming with proper event handling
- aliyun_basic.rs - Aliyun DashScope basic usage
- zhipu_basic.rs - Zhipu GLM basic usage
- ollama_basic.rs - Ollama local model usage
- tencent_basic.rs - Tencent Hunyuan basic usage
Advanced Examples:
- multimodal_basic.rs - Multi-modal content (text + images)
- ollama_model_management.rs - Complete Ollama model CRUD operations
- zhipu_tools.rs - Function calling with Zhipu
- zhipu_multiround_tools.rs - Multi-round conversation with tools
- volcengine_streaming.rs - Volcengine streaming with reasoning models
Utility Examples:
- list_providers.rs - List all available providers and their configurations
Contributions are welcome! Feel free to submit a Pull Request, and check RULES.md for project rules and docs/CONTRIBUTING.md for development guidelines.
MIT