siumai

Crates.io	siumai
lib.rs	siumai
version	0.11.0-beta.5
created_at	2025-06-20 14:37:04.392186+00
updated_at	2026-01-14 16:54:04.575567+00
description	A unified LLM interface library for Rust
homepage	https://github.com/YumchaLabs/siumai
repository	https://github.com/YumchaLabs/siumai
max_upload_size
id	1719712
size	15,867,032

Latias94 (Latias94)

documentation

https://docs.rs/siumai

README

Siumai — Unified LLM Interface for Rust

Siumai (烧卖) is a type-safe Rust library that provides a single, consistent API over multiple LLM providers. It focuses on clear abstractions, predictable behavior, and practical extensibility.

This README keeps things straightforward: what you can do, how to customize, and short examples.

What It Provides

Unified clients for multiple providers (OpenAI, Anthropic, Google Gemini, Ollama, Groq, xAI, and OpenAI‑compatible vendors)
Capability traits for chat, streaming, tools, vision, audio, files, embeddings, and rerank
Streaming with start/delta/usage/end events and cancellation
Tool calling and a lightweight orchestrator for multi‑step workflows
Structured outputs:
- Provider‑native structured outputs (OpenAI/Anthropic/Gemini, etc.)
- Provider‑agnostic decoding helpers with JSON repair and validation (via siumai-extras)
HTTP interceptors, middleware, and a simple retry facade
Optional extras for telemetry, OpenTelemetry, schema validation, and server adapters

Install

[dependencies]
siumai = "0.11.0-beta.5"
tokio = { version = "1", features = ["rt-multi-thread", "macros"] }

Migration (beta.5)

Upgrading from 0.11.0-beta.4 (or earlier)?

See docs/migration/migration-0.11.0-beta.5.md

Feature flags (enable only what you need):

# One provider
siumai = { version = "0.11.0-beta.5", features = ["openai"] }

# Multiple providers
siumai = { version = "0.11.0-beta.5", features = ["openai", "anthropic", "google"] }

# All
siumai = { version = "0.11.0-beta.5", features = ["all-providers"] }

Note: siumai enables openai by default. Disable defaults via default-features = false.

Optional package for advanced utilities:

[dependencies]
siumai = "0.11.0-beta.5"
siumai-extras = { version = "0.11.0-beta.5", features = ["schema", "telemetry", "opentelemetry", "server", "mcp"] }

Usage

Registry (recommended)

Use the registry to resolve models via provider:model and get a handle with a uniform API.

use siumai::prelude::unified::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let reg = registry::global();
    let model = reg.language_model("openai:gpt-4o-mini")?;
    let resp = model.chat(vec![user!("Hello")], None).await?;
    println!("{}", resp.content_text().unwrap_or_default());
    Ok(())
}

Supported examples of provider:model:

openai:gpt-4o, openai:gpt-4o-mini
anthropic:claude-3-5-sonnet-20240620
anthropic-vertex:claude-3-5-sonnet-20240620
gemini:gemini-2.0-flash-exp
groq:llama-3.1-70b-versatile
xai:grok-beta
ollama:llama3.2
minimaxi:minimax-text-01

OpenAI‑compatible vendors follow the same pattern (API keys read as {PROVIDER_ID}_API_KEY when possible). See docs for details.

Builder (unified or provider‑specific)

Provider‑specific client:

use siumai::prelude::unified::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Provider::openai()
        .api_key(std::env::var("OPENAI_API_KEY")?)
        .model("gpt-4o")
        .build()
        .await?;

    let resp = client.chat(vec![user!("Hi")]).await?;
    println!("{}", resp.content_text().unwrap_or_default());
    Ok(())
}

Unified interface (portable across providers):

use siumai::prelude::unified::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Siumai::builder()
        .anthropic()
        .api_key(std::env::var("ANTHROPIC_API_KEY")?)
        .model("claude-3-5-sonnet")
        .build()
        .await?;

    let resp = client.chat(vec![user!("What's Rust?")]).await?;
    println!("{}", resp.content_text().unwrap_or_default());
    Ok(())
}

OpenAI‑compatible (custom base URL):

use siumai::prelude::unified::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // For OpenAI-compatible local endpoints, you can use any non-empty API key.
    let vllm = Siumai::builder().openai()
        .api_key("dummy")
        .base_url("http://localhost:8000/v1")
        .model("meta-llama/Llama-3.1-8B-Instruct")
        .build()
        .await?;

    let resp = vllm.chat(vec![user!("Hello from vLLM")]).await?;
    println!("{}", resp.content_text().unwrap_or_default());
    Ok(())
}

Streaming

use futures::StreamExt;
use siumai::prelude::unified::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Siumai::builder().openai()
        .api_key(std::env::var("OPENAI_API_KEY")?)
        .model("gpt-4o-mini")
        .build()
        .await?;

    let mut stream = client.chat_stream(vec![user!("Stream a long answer")], None).await?;
    while let Some(ev) = stream.next().await {
        if let Ok(ChatStreamEvent::ContentDelta { delta, .. }) = ev { print!("{}", delta); }
    }
    Ok(())
}

Structured output

1) Provider‑agnostic decoding (recommended for cross‑provider flows)

Use siumai-extras to parse model text into typed JSON with optional schema validation and repair:

use serde::Deserialize;
use siumai::prelude::unified::*;
use siumai_extras::highlevel::object::generate_object;

#[derive(Deserialize, Debug)]
struct Post { title: String }

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Siumai::builder().openai().model("gpt-4o-mini").build().await?;
    let (post, _resp) = generate_object::<Post>(
        &client,
        vec![user!("Return JSON: {\"title\":\"hi\"}")],
        None,
        Default::default(),
    ).await?;
    println!("{}", post.title);
    Ok(())
}

Under the hood this uses siumai_extras::structured_output::OutputDecodeConfig to:

enforce shape hints (object/array/enum)
optionally validate against a JSON Schema
repair common issues (markdown fences, trailing commas, partial slices)

2) Provider‑native structured outputs (example: OpenAI Responses API)

For providers that expose native structured outputs, configure them via provider options. You still can combine them with the decoding helpers above if you want:

use siumai::prelude::unified::*;
use siumai::provider_ext::openai::options::{OpenAiChatRequestExt, OpenAiOptions, ResponsesApiConfig};
use serde_json::json;

let schema = json!({"type":"object","properties":{"title":{"type":"string"}},"required":["title"]});
let req = ChatRequestBuilder::new()
    .message(user!("Return an object with title"))
    .build()
    .with_openai_options(OpenAiOptions::new().with_responses_api(
        ResponsesApiConfig::new().with_response_format(json!({
            "type": "json_object",
            "json_schema": { "schema": schema, "strict": true }
        }))
    ));
let resp = client.chat_request(req).await?;
// Optionally: further validate/repair/deserialize using `siumai-extras` helpers.

Retries

use siumai::prelude::unified::*;
use siumai::retry_api::{retry_with, RetryOptions};

let text = client
    .ask_with_retry("Hello".to_string(), RetryOptions::backoff())
    .await?;

Customization

HTTP client and headers
Middleware chain (defaults, clamping, reasoning extraction)
HTTP interceptors (request/response hooks, SSE observation)
Retry options and backoff

HTTP configuration

You have three practical ways to control HTTP behavior, from simple to advanced.

Per‑builder toggles (most common)

use siumai::prelude::unified::*;

// Unified builder (Siumai::builder) with SSE stability control
let client = Siumai::builder()
    .openai()
    .api_key(std::env::var("OPENAI_API_KEY")?)
    .model("gpt-4o-mini")
    .http_timeout(std::time::Duration::from_secs(30))
    .http_connect_timeout(std::time::Duration::from_secs(10))
    .http_user_agent("my-app/1.0")
    .http_header("X-User-Project", "acme")
    .http_stream_disable_compression(true) // keep SSE stable; default can be controlled by env
    .build()
    .await?;

HttpConfig builder + shared client builder (centralized configuration)

use siumai::experimental::execution::http::client::build_http_client_from_config;
use siumai::prelude::unified::*;

// Construct a reusable HTTP config
let http_cfg = HttpConfig::builder()
    .timeout(Some(std::time::Duration::from_secs(30)))
    .connect_timeout(Some(std::time::Duration::from_secs(10)))
    .user_agent(Some("my-app/1.0"))
    .proxy(Some("http://proxy.example.com:8080"))
    .header("X-User-Project", "acme")
    .stream_disable_compression(true) // explicit SSE stability
    .build();

// Build reqwest client using the shared helper
let http = build_http_client_from_config(&http_cfg)?;

// Inject it into a builder (takes precedence over other HTTP settings)
let client = Siumai::builder()
    .with_http_client(http)
    .openai()
    .api_key(std::env::var("OPENAI_API_KEY")?)
    .model("gpt-4o-mini")
    .build()
    .await?;

Fully custom reqwest client (maximum control)

use siumai::prelude::unified::*;

let http = reqwest::Client::builder()
    .timeout(std::time::Duration::from_secs(30))
    // .danger_accept_invalid_certs(true) // if needed for dev
    .build()?;

let client = Siumai::builder()
    .with_http_client(http)
    .openai()
    .api_key(std::env::var("OPENAI_API_KEY")?)
    .model("gpt-4o-mini")
    .build()
    .await?;

Notes:

Streaming stability: By default, stream_disable_compression is derived from SIUMAI_STREAM_DISABLE_COMPRESSION (true unless set to false|0|off|no). You can override it per request via the unified builder method http_stream_disable_compression.
When a custom reqwest::Client is provided via .with_http_client(..), it takes precedence over any other HTTP settings on the builder.

Registry with custom middleware and interceptors:

use siumai::prelude::unified::*;
use siumai::experimental::execution::middleware::samples::chain_default_and_clamp;
use siumai::experimental::execution::http::interceptor::LoggingInterceptor;
use siumai::prelude::unified::registry::{create_provider_registry, RegistryOptions};
use std::collections::HashMap;
use std::sync::Arc;

let reg = create_provider_registry(
    HashMap::new(),
    Some(RegistryOptions {
        separator: ':',
        language_model_middleware: chain_default_and_clamp(),
        http_interceptors: vec![Arc::new(LoggingInterceptor)],
        http_config: None,
        retry_options: None,
        max_cache_entries: Some(128),
        client_ttl: None,
        auto_middleware: true,
    })
);

Extras (`siumai-extras`)

Telemetry subscribers and helpers
OpenTelemetry middleware (W3C Trace Context)
JSON schema validation
Server adapters (Axum SSE)
MCP utilities

See the siumai-extras crate for details and examples.

Examples

Examples are under siumai/examples/:

01-quickstart — basic chat, streaming, provider switching
02-core-api — chat, streaming, tools, multimodal
03-advanced-features — middleware, retry, orchestrator, error types
04-provider-specific — provider‑unique capabilities
05-integrations — registry, MCP, telemetry
06-applications — chatbot, code assistant, API server

Typical commands:

cargo run --example basic-chat --features openai
cargo run --example streaming --features openai
cargo run --example basic-orchestrator --features openai

Status and notes

OpenAI Responses API web_search is wired through hosted_tools::openai::web_search and the OpenAI Responses pipeline, but is still considered experimental and may change.
Several modules were reorganized in 0.11: HTTP helpers live under execution::http::*, Vertex helpers under auth::vertex. See CHANGELOG for migration notes.

API keys and environment variables:

OpenAI: .api_key(..) or OPENAI_API_KEY
Anthropic: .api_key(..) or ANTHROPIC_API_KEY
Groq: .api_key(..) or GROQ_API_KEY
Gemini: .api_key(..) or GEMINI_API_KEY
xAI: .api_key(..) or XAI_API_KEY
Ollama: no API key
OpenAI‑compatible via Registry: reads {PROVIDER_ID}_API_KEY (e.g., DEEPSEEK_API_KEY)
OpenAI‑compatible via Builder: .api_key(..) or {PROVIDER_ID}_API_KEY

Acknowledgements

This project draws inspiration from:

Vercel AI SDK (adapter patterns)
Cherry Studio (transformer design)

Changelog and license

See CHANGELOG.md for detailed changes and migration tips.

Licensed under either of:

Apache License, Version 2.0, or
MIT license

at your option.

Commit count: 487