| Crates.io | herolib-ai |
| lib.rs | herolib-ai |
| version | 0.3.13 |
| created_at | 2025-12-27 19:44:50.086484+00 |
| updated_at | 2026-01-24 05:24:28.742892+00 |
| description | AI client with multi-provider support (Groq, OpenRouter, SambaNova) and automatic failover |
| homepage | |
| repository | https://github.com/threefoldtech/sal |
| max_upload_size | |
| id | 2007716 |
| size | 172,945 |
AI client with multi-provider support (Groq, OpenRouter, SambaNova) and automatic failover.
This crate provides a unified AI client that supports multiple providers with automatic failover: if a request to one provider fails, the same request is retried against the next configured provider.
Add to your `Cargo.toml`:

```toml
[dependencies]
herolib-ai = "0.3.13"
```
Set API keys using environment variables:
```bash
export GROQ_API_KEY="your-groq-key"
export OPENROUTER_API_KEY="your-openrouter-key"
export SAMBANOVA_API_KEY="your-sambanova-key"
```
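If no key is set, requests will only fail later at call time. A minimal pre-flight sketch, using only `std::env` and the variable names above (the exit-on-missing behavior is illustrative, not something the crate does):

```rust
use std::env;

fn main() {
    // At least one provider key must be set for from_env() to be useful.
    let keys = ["GROQ_API_KEY", "OPENROUTER_API_KEY", "SAMBANOVA_API_KEY"];
    let configured: Vec<&str> = keys
        .iter()
        .copied()
        .filter(|k| env::var(k).is_ok())
        .collect();
    if configured.is_empty() {
        eprintln!("No AI provider API keys found; set one of: {}", keys.join(", "));
        std::process::exit(1);
    }
    println!("Configured provider keys detected: {:?}", configured);
}
```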
```rust
use herolib_ai::{AiClient, Model, PromptBuilderExt};

let client = AiClient::from_env();

let response = client
    .prompt()
    .model(Model::Llama3_3_70B)
    .system("You are a helpful coding assistant")
    .user("Write a hello world in Rust")
    .execute_content()
    .unwrap();

println!("{}", response);
```
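The `.unwrap()` above implies `execute_content()` returns a `Result`. A sketch of explicit error handling, assuming only that the error type implements `Debug`:

```rust
use herolib_ai::{AiClient, Model, PromptBuilderExt};

let client = AiClient::from_env();

// Handle the failure case explicitly instead of unwrapping.
// (Assumption: an Err surfaces only after automatic failover across
// all configured providers has been exhausted.)
match client
    .prompt()
    .model(Model::Llama3_3_70B)
    .user("Write a hello world in Rust")
    .execute_content()
{
    Ok(text) => println!("{}", text),
    Err(e) => eprintln!("All providers failed: {:?}", e),
}
```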
```rust
use herolib_ai::{AiClient, Model, PromptBuilderExt};

/// Verifies that the response is valid JSON.
/// Returns Ok(()) if valid, or Err with feedback for the AI to retry.
fn verify_json(content: &str) -> Result<(), String> {
    match serde_json::from_str::<serde_json::Value>(content) {
        Ok(_) => Ok(()),
        Err(e) => Err(format!("Invalid JSON: {}. Please output only valid JSON.", e)),
    }
}

let client = AiClient::from_env();

let response = client
    .prompt()
    .model(Model::Qwen2_5Coder32B)
    .system("You are a JSON generator. Only output valid JSON.")
    .user("Generate a JSON object with name and age fields")
    .verify(verify_json)
    .max_retries(3)
    .execute_verified()
    .unwrap();
```
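The verifier hook also composes with typed deserialization. A sketch building on the same API, assuming `serde` (with the `derive` feature) and `serde_json` are in `Cargo.toml`; the `Person` struct is illustrative, not part of the crate:

```rust
use herolib_ai::{AiClient, Model, PromptBuilderExt};
use serde::Deserialize;

// Illustrative target shape; not part of herolib-ai.
#[derive(Deserialize)]
struct Person {
    name: String,
    age: u32,
}

/// Rejects responses that are valid JSON but don't match the Person schema,
/// so retries are driven by schema errors rather than raw JSON validity.
fn verify_person(content: &str) -> Result<(), String> {
    serde_json::from_str::<Person>(content)
        .map(|_| ())
        .map_err(|e| format!("Response does not match the {{name, age}} schema: {}. Output only that JSON object.", e))
}

let client = AiClient::from_env();

let response = client
    .prompt()
    .model(Model::Qwen2_5Coder32B)
    .system("You are a JSON generator. Only output valid JSON.")
    .user("Generate a JSON object with name and age fields")
    .verify(verify_person)
    .max_retries(3)
    .execute_verified()
    .unwrap();
```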
GPT-OSS 20B is OpenAI's efficient 20B parameter model with reasoning support, available through Groq:
```rust
use herolib_ai::{AiClient, Model, PromptBuilderExt};

let client = AiClient::from_env();

let response = client
    .prompt()
    .model(Model::GptOss20B)
    .system("You are a helpful assistant with strong reasoning capabilities")
    .user("Explain how machine learning differs from traditional programming")
    .execute_content()
    .unwrap();

println!("{}", response);
```
```rust
use herolib_ai::{AiClient, Provider, ProviderConfig, Model};

let client = AiClient::new()
    .with_provider(ProviderConfig::new(Provider::Groq, "your-api-key"))
    .with_provider(ProviderConfig::new(Provider::OpenRouter, "your-api-key"))
    .with_default_temperature(0.7)
    .with_default_max_tokens(2000);
```
| Model | Description | Providers |
|---|---|---|
| `llama3_3_70b` | Fast, capable model for general tasks | Groq, SambaNova, OpenRouter |
| `llama3_1_70b` | Versatile model for various tasks | Groq, SambaNova, OpenRouter |
| `llama3_1_8b` | Small, fast model for simple tasks | Groq, SambaNova, OpenRouter |
| `qwen2_5_coder_32b` | Specialized for code generation | Groq, SambaNova, OpenRouter |
| `deepseek_coder_v2_5` | Advanced coding model | OpenRouter, SambaNova |
| `deepseek_v3` | Latest DeepSeek model | OpenRouter, SambaNova |
| `llama3_1_405b` | Largest Llama model for complex tasks | SambaNova, OpenRouter |
| `mixtral_8x7b` | Efficient mixture-of-experts model | Groq, OpenRouter |
| `llama3_2_90b_vision` | Multimodal model with vision | Groq, OpenRouter |
| `llama3_2_11b_vision` | Smaller vision model | Groq, SambaNova, OpenRouter |
| `nemotron_nano_30b` | NVIDIA MoE model with reasoning | OpenRouter |
| `gpt_oss_120b` | OpenAI's open-weight 120B MoE model | Groq, SambaNova, OpenRouter |
| `gpt_oss_20b` | OpenAI's efficient 20B MoE model | Groq |
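Only `Llama3_3_70B`, `Qwen2_5Coder32B`, and `GptOss20B` appear as Rust enum variants elsewhere in this README, so a small task-to-model picker can be sketched using just those (illustrative, not a crate API):

```rust
use herolib_ai::Model;

/// Illustrative helper: map a coarse task category to a model from the
/// table above. Only variants already shown in this README are used.
fn model_for_task(task: &str) -> Model {
    match task {
        "code" => Model::Qwen2_5Coder32B,  // specialized for code generation
        "reasoning" => Model::GptOss20B,   // reasoning support via Groq
        _ => Model::Llama3_3_70B,          // fast, capable general default
    }
}
```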
Embedding models convert text into vector representations for semantic search, RAG, and similarity tasks.
| Model | Description | Dimensions | Context | Provider |
|---|---|---|---|---|
| `text_embedding_3_small` | OpenAI fast, efficient embedding | 1536 | 8,191 | OpenRouter |
| `qwen3_embedding_8b` | Multilingual embedding model | - | 32,768 | OpenRouter |
```rust
use herolib_ai::{AiClient, EmbeddingModel};

let client = AiClient::from_env();

// Single text embedding
let response = client
    .embed(EmbeddingModel::Qwen3Embedding8B, "Hello, world!")
    .unwrap();
println!("Vector dimensions: {}", response.embedding().unwrap().len());

// Batch embedding
let texts = vec!["Hello".to_string(), "World".to_string()];
let response = client
    .embed_batch(EmbeddingModel::Qwen3Embedding8B, texts)
    .unwrap();
for embedding in response.embeddings() {
    println!("Embedding length: {}", embedding.len());
}
```
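For the similarity tasks mentioned above, cosine similarity over the returned vectors is the usual measure. A self-contained sketch, assuming the embedding elements are `f32`:

```rust
/// Cosine similarity between two embedding vectors.
/// Assumes equal length; returns 0.0 for zero-magnitude inputs.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}
```

Pair it with `embed_batch` from the snippet above to rank candidate texts against a query embedding.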
Speech-to-text transcription using Whisper models via Groq's ultra-fast inference.
| Model | Description | Speed | Translation | Provider |
|---|---|---|---|---|
| `whisper_large_v3_turbo` | Fast multilingual transcription | 216x real-time | No | Groq |
| `whisper_large_v3` | High accuracy transcription | 189x real-time | Yes | Groq |
Supported audio formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm (max 25MB).
```rust
use herolib_ai::{AiClient, TranscriptionModel, TranscriptionOptions};
use std::path::Path;

let client = AiClient::from_env();

// Simple transcription from file
let response = client
    .transcribe_file(TranscriptionModel::WhisperLargeV3Turbo, Path::new("audio.mp3"))
    .unwrap();
println!("Transcription: {}", response.text);

// With options (language hint, temperature)
let options = TranscriptionOptions::new()
    .with_language("en")
    .with_temperature(0.0);
let response = client
    .transcribe_file_with_options(
        TranscriptionModel::WhisperLargeV3Turbo,
        Path::new("audio.mp3"),
        options,
    )
    .unwrap();

// Verbose response with timestamps
let audio_bytes = std::fs::read("audio.mp3").unwrap(); // raw bytes for the bytes-based API
let options = TranscriptionOptions::new().with_language("en");
let response = client
    .transcribe_bytes_verbose(
        TranscriptionModel::WhisperLargeV3,
        &audio_bytes,
        "audio.mp3",
        options,
    )
    .unwrap();
println!("Duration: {:?}s", response.duration);
for segment in response.segments.unwrap_or_default() {
    println!("[{:.2}s - {:.2}s] {}", segment.start, segment.end, segment.text);
}
```
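Given the 25MB upload limit noted above, a pre-flight size check avoids a doomed upload. A sketch using only `std::fs`; the constant mirrors the limit stated in this README:

```rust
use std::fs;
use std::path::Path;

const MAX_UPLOAD_BYTES: u64 = 25 * 1024 * 1024; // 25MB limit from the docs above

/// Returns Err if the audio file exceeds the provider's upload limit.
fn check_audio_size(path: &Path) -> Result<(), String> {
    let size = fs::metadata(path)
        .map_err(|e| format!("Cannot stat {}: {}", path.display(), e))?
        .len();
    if size > MAX_UPLOAD_BYTES {
        Err(format!(
            "{} is {} bytes; max is {} bytes",
            path.display(),
            size,
            MAX_UPLOAD_BYTES
        ))
    } else {
        Ok(())
    }
}
```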
| Provider | API Endpoint | Environment Variable |
|---|---|---|
| Groq | https://api.groq.com/openai/v1/chat/completions | `GROQ_API_KEY` |
| OpenRouter | https://openrouter.ai/api/v1/chat/completions | `OPENROUTER_API_KEY` |
| SambaNova | https://api.sambanova.ai/v1/chat/completions | `SAMBANOVA_API_KEY` |

The `modeltest` binary tests model availability across all configured providers.
```bash
# Build and run
cargo run --bin modeltest

# Or after building
./target/debug/modeltest
```
Example output:

```text
herolib-ai Model Test Utility
Testing model availability across all providers

Configured providers:
  - Groq
  - OpenRouter

======================================================================
Phase 1: Querying Provider Model Lists
======================================================================
Querying Groq... OK (42 models)
Querying OpenRouter... OK (256 models)

======================================================================
Phase 2: Validating Model Mappings
======================================================================
Llama 3.3 70B (Llama 3.3 70B - Fast, capable model for general tasks):
  Groq llama-3.3-70b-versatile -> OK
  OpenRouter meta-llama/llama-3.3-70b-instruct -> OK

======================================================================
Phase 3: Testing Models with 'whoami' Query
======================================================================

--------------------------------------------------
Testing: Llama 3.3 70B
--------------------------------------------------
Groq (llama-3.3-70b-versatile)... OK (523ms)
  Response: I'm LLaMA, a large language model trained by Meta AI.

======================================================================
Final Report
======================================================================
Test Summary:
  Total tests: 20
  Successful: 18
  Failed: 2
  Success rate: 90.0%
```
```bash
./build.sh
./run.sh
```
License: Apache-2.0