| Crates.io | api_ollama |
| lib.rs | api_ollama |
| version | 0.2.0 |
| created_at | 2025-11-06 19:03:49.440138+00 |
| updated_at | 2025-11-30 00:20:27.722187+00 |
| description | Ollama local LLM runtime API client for HTTP communication. |
| homepage | https://github.com/Wandalen/api_llm/tree/master/api/ollama |
| repository | https://github.com/Wandalen/api_llm/tree/master/api/ollama |
| max_upload_size | |
| id | 1920127 |
| size | 1,495,530 |
Rust HTTP client for the Ollama local LLM runtime API.
This API crate is designed as a stateless HTTP client with zero persistence requirements, which keeps deployments lightweight and container-friendly and avoids operational complexity.
It exposes Ollama's API directly, without abstraction layers, so developers can access every capability with explicit control.
The crate's feature set spans core capabilities, enterprise reliability, and API patterns; see the feature table below.
```toml
[dependencies]
api_ollama = { version = "0.2.0", features = ["full"] }
```
```rust
use api_ollama::{ OllamaClient, ChatRequest, ChatMessage, MessageRole };

#[tokio::main]
async fn main() -> Result< (), Box< dyn std::error::Error > >
{
  let mut client = OllamaClient::new(
    "http://localhost:11434".to_string(),
    std::time::Duration::from_secs( 30 )
  );

  // Check availability
  if !client.is_available().await
  {
    println!( "Ollama is not available" );
    return Ok( () );
  }

  // List available models
  let models = client.list_models().await?;
  println!( "Available models: {:?}", models );

  // Send chat request
  let request = ChatRequest
  {
    model: "llama3.2".to_string(),
    messages: vec![ ChatMessage
    {
      role: MessageRole::User,
      content: "Hello!".to_string(),
      images: None,
      #[cfg( feature = "tool_calling" )]
      tool_calls: None,
    }],
    stream: None,
    options: None,
    #[cfg( feature = "tool_calling" )]
    tools: None,
    #[cfg( feature = "tool_calling" )]
    tool_messages: None,
  };

  let response = client.chat( request ).await?;
  println!( "Response: {:?}", response );

  Ok( () )
}
```
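Fields guarded by `#[cfg( feature = "tool_calling" )]` are compiled only when that feature is enabled, so the same example builds with or without tool-calling support.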
| Feature | Description |
|---|---|
| enabled | Master switch for basic functionality |
| streaming | Real-time streaming responses |
| embeddings | Text embedding generation |
| vision_support | Image inputs for vision models |
| tool_calling | Function/tool calling support |
| builder_patterns | Fluent builder APIs |
| retry | Exponential backoff retry |
| circuit_breaker | Circuit breaker pattern |
| rate_limiting | Token bucket rate limiting |
| failover | Automatic endpoint failover |
| health_checks | Endpoint health monitoring |
| request_caching | Response caching with TTL |
| sync_api | Synchronous blocking API |
| full | Enable all features |
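If the umbrella `full` feature pulls in more than you need, individual features can be enabled instead. A minimal sketch, assuming only the feature names listed in the table above:

```toml
[dependencies]
# Enable only the pieces you need instead of the umbrella "full" feature.
api_ollama = { version = "0.2.0", features = [
  "enabled",   # master switch for basic functionality
  "streaming", # real-time streaming responses
  "retry",     # exponential backoff retry
] }
```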
```sh
# Unit tests
cargo nextest run

# Integration tests (requires running Ollama)
cargo nextest run --features integration

# Full validation
w3 .test level::3
```
Testing Policy: Integration tests require a running Ollama instance. Tests fail clearly when Ollama is unavailable.
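A minimal sketch of an integration test that follows this policy, assuming the `OllamaClient::new` and `is_available` signatures from the example above and tokio's test macro as a dev dependency:

```rust
use api_ollama::OllamaClient;

#[tokio::test]
async fn ollama_endpoint_is_reachable()
{
  let mut client = OllamaClient::new(
    "http://localhost:11434".to_string(),
    std::time::Duration::from_secs( 5 )
  );

  // Fail with a clear message instead of timing out deep inside another test.
  assert!(
    client.is_available().await,
    "integration tests require a running Ollama instance at http://localhost:11434"
  );
}
```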
MIT