| Crates.io | herolib-ai |
| lib.rs | herolib-ai |
| version | 0.3.13 |
| created_at | 2025-12-27 19:44:50.086484+00 |
| updated_at | 2026-01-24 05:24:28.742892+00 |
| description | AI client with multi-provider support (Groq, OpenRouter, SambaNova) and automatic failover |
| homepage | |
| repository | https://github.com/threefoldtech/sal |
| max_upload_size | |
| id | 2007716 |
| size | 172,945 |
AI client with multi-provider support (Groq, OpenRouter, SambaNova) and automatic failover.
This crate provides a unified AI client that supports multiple providers with automatic failover: if a request to one provider fails, the same request is retried against the next configured provider.
Add to your `Cargo.toml`:

```toml
[dependencies]
herolib-ai = "0.3.13"
```
Set API keys using environment variables:
```bash
export GROQ_API_KEY="your-groq-key"
export OPENROUTER_API_KEY="your-openrouter-key"
export SAMBANOVA_API_KEY="your-sambanova-key"
```
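If no key is set, requests will only fail later at call time. A minimal pre-flight sketch, using only `std::env` and the variable names above (the exit-on-missing behavior is illustrative, not something the crate does):

```rust
use std::env;

fn main() {
    // At least one provider key must be set for from_env() to be useful.
    let keys = ["GROQ_API_KEY", "OPENROUTER_API_KEY", "SAMBANOVA_API_KEY"];
    let configured: Vec<&str> = keys
        .iter()
        .copied()
        .filter(|k| env::var(k).is_ok())
        .collect();
    if configured.is_empty() {
        eprintln!("No AI provider API keys found; set one of: {}", keys.join(", "));
        std::process::exit(1);
    }
    println!("Configured provider keys detected: {:?}", configured);
}
```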
```rust
use herolib_ai::{AiClient, Model, PromptBuilderExt};

let client = AiClient::from_env();

let response = client
    .prompt()
    .model(Model::Llama3_3_70B)
    .system("You are a helpful coding assistant")
    .user("Write a hello world in Rust")
    .execute_content()
    .unwrap();

println!("{}", response);
```
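The `.unwrap()` above implies `execute_content()` returns a `Result`. A sketch of explicit error handling, assuming only that the error type implements `Debug`:

```rust
use herolib_ai::{AiClient, Model, PromptBuilderExt};

let client = AiClient::from_env();

// Handle the failure case explicitly instead of unwrapping.
// (Assumption: an Err surfaces only after automatic failover across
// all configured providers has been exhausted.)
match client
    .prompt()
    .model(Model::Llama3_3_70B)
    .user("Write a hello world in Rust")
    .execute_content()
{
    Ok(text) => println!("{}", text),
    Err(e) => eprintln!("All providers failed: {:?}", e),
}
```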
```rust
use herolib_ai::{AiClient, Model, PromptBuilderExt};

/// Verifies that the response is valid JSON.
/// Returns Ok(()) if valid, or Err with feedback for the AI to retry.
fn verify_json(content: &str) -> Result<(), String> {
    match serde_json::from_str::<serde_json::Value>(content) {
        Ok(_) => Ok(()),
        Err(e) => Err(format!("Invalid JSON: {}. Please output only valid JSON.", e)),
    }
}

let client = AiClient::from_env();

let response = client
    .prompt()
    .model(Model::Qwen2_5Coder32B)
    .system("You are a JSON generator. Only output valid JSON.")
    .user("Generate a JSON object with name and age fields")
    .verify(verify_json)
    .max_retries(3)
    .execute_verified()
    .unwrap();
```
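The verifier hook also composes with typed deserialization. A sketch building on the same API, assuming `serde` (with the `derive` feature) and `serde_json` are in `Cargo.toml`; the `Person` struct is illustrative, not part of the crate:

```rust
use herolib_ai::{AiClient, Model, PromptBuilderExt};
use serde::Deserialize;

// Illustrative target shape; not part of herolib-ai.
#[derive(Deserialize)]
struct Person {
    name: String,
    age: u32,
}

/// Rejects responses that are valid JSON but don't match the Person schema,
/// so retries are driven by schema errors rather than raw JSON validity.
fn verify_person(content: &str) -> Result<(), String> {
    serde_json::from_str::<Person>(content)
        .map(|_| ())
        .map_err(|e| format!("Response does not match the {{name, age}} schema: {}. Output only that JSON object.", e))
}

let client = AiClient::from_env();

let response = client
    .prompt()
    .model(Model::Qwen2_5Coder32B)
    .system("You are a JSON generator. Only output valid JSON.")
    .user("Generate a JSON object with name and age fields")
    .verify(verify_person)
    .max_retries(3)
    .execute_verified()
    .unwrap();
```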
GPT-OSS 20B is OpenAI's efficient 20B parameter model with reasoning support, available through Groq:
```rust
use herolib_ai::{AiClient, Model, PromptBuilderExt};

let client = AiClient::from_env();

let response = client
    .prompt()
    .model(Model::GptOss20B)
    .system("You are a helpful assistant with strong reasoning capabilities")
    .user("Explain how machine learning differs from traditional programming")
    .execute_content()
    .unwrap();

println!("{}", response);
```
```rust
use herolib_ai::{AiClient, Provider, ProviderConfig, Model};

let client = AiClient::new()
    .with_provider(ProviderConfig::new(Provider::Groq, "your-api-key"))
    .with_provider(ProviderConfig::new(Provider::OpenRouter, "your-api-key"))
    .with_default_temperature(0.7)
    .with_default_max_tokens(2000);
```
| Model | Description | Providers |
|---|---|---|
| `llama3_3_70b` | Fast, capable model for general tasks | Groq, SambaNova, OpenRouter |
| `llama3_1_70b` | Versatile model for various tasks | Groq, SambaNova, OpenRouter |
| `llama3_1_8b` | Small, fast model for simple tasks | Groq, SambaNova, OpenRouter |
| `qwen2_5_coder_32b` | Specialized for code generation | Groq, SambaNova, OpenRouter |
| `deepseek_coder_v2_5` | Advanced coding model | OpenRouter, SambaNova |
| `deepseek_v3` | Latest DeepSeek model | OpenRouter, SambaNova |
| `llama3_1_405b` | Largest Llama model for complex tasks | SambaNova, OpenRouter |
| `mixtral_8x7b` | Efficient mixture-of-experts model | Groq, OpenRouter |
| `llama3_2_90b_vision` | Multimodal model with vision | Groq, OpenRouter |
| `llama3_2_11b_vision` | Smaller vision model | Groq, SambaNova, OpenRouter |
| `nemotron_nano_30b` | NVIDIA MoE model with reasoning | OpenRouter |
| `gpt_oss_120b` | OpenAI's open-weight 120B MoE model | Groq, SambaNova, OpenRouter |
| `gpt_oss_20b` | OpenAI's efficient 20B MoE model | Groq |
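Only `Llama3_3_70B`, `Qwen2_5Coder32B`, and `GptOss20B` appear as Rust enum variants elsewhere in this README, so a small task-to-model picker can be sketched using just those (illustrative, not a crate API):

```rust
use herolib_ai::Model;

/// Illustrative helper: map a coarse task category to a model from the
/// table above. Only variants already shown in this README are used.
fn model_for_task(task: &str) -> Model {
    match task {
        "code" => Model::Qwen2_5Coder32B,  // specialized for code generation
        "reasoning" => Model::GptOss20B,   // reasoning support via Groq
        _ => Model::Llama3_3_70B,          // fast, capable general default
    }
}
```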
Embedding models convert text into vector representations for semantic search, RAG, and similarity tasks.
| Model | Description | Dimensions | Context | Provider |
|---|---|---|---|---|
| `text_embedding_3_small` | OpenAI fast, efficient embedding | 1536 | 8,191 | OpenRouter |
| `qwen3_embedding_8b` | Multilingual embedding model | - | 32,768 | OpenRouter |
```rust
use herolib_ai::{AiClient, EmbeddingModel};

let client = AiClient::from_env();

// Single text embedding
let response = client
    .embed(EmbeddingModel::Qwen3Embedding8B, "Hello, world!")
    .unwrap();
println!("Vector dimensions: {}", response.embedding().unwrap().len());

// Batch embedding
let texts = vec!["Hello".to_string(), "World".to_string()];
let response = client
    .embed_batch(EmbeddingModel::Qwen3Embedding8B, texts)
    .unwrap();
for embedding in response.embeddings() {
    println!("Embedding length: {}", embedding.len());
}
```
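For the similarity tasks mentioned above, cosine similarity over the returned vectors is the usual measure. A self-contained sketch, assuming the embedding elements are `f32`:

```rust
/// Cosine similarity between two embedding vectors.
/// Assumes equal length; returns 0.0 for zero-magnitude inputs.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}
```

Pair it with `embed_batch` from the snippet above to rank candidate texts against a query embedding.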
Speech-to-text transcription using Whisper models via Groq's ultra-fast inference.
| Model | Description | Speed | Translation | Provider |
|---|---|---|---|---|
| `whisper_large_v3_turbo` | Fast multilingual transcription | 216x real-time | No | Groq |
| `whisper_large_v3` | High accuracy transcription | 189x real-time | Yes | Groq |
Supported audio formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm (max 25MB).
```rust
use herolib_ai::{AiClient, TranscriptionModel, TranscriptionOptions};
use std::path::Path;

let client = AiClient::from_env();

// Simple transcription from file
let response = client
    .transcribe_file(TranscriptionModel::WhisperLargeV3Turbo, Path::new("audio.mp3"))
    .unwrap();
println!("Transcription: {}", response.text);

// With options (language hint, temperature)
let options = TranscriptionOptions::new()
    .with_language("en")
    .with_temperature(0.0);
let response = client
    .transcribe_file_with_options(
        TranscriptionModel::WhisperLargeV3Turbo,
        Path::new("audio.mp3"),
        options,
    )
    .unwrap();

// Verbose response with timestamps
let audio_bytes = std::fs::read("audio.mp3").unwrap(); // raw bytes for the bytes-based API
let options = TranscriptionOptions::new().with_language("en");
let response = client
    .transcribe_bytes_verbose(
        TranscriptionModel::WhisperLargeV3,
        &audio_bytes,
        "audio.mp3",
        options,
    )
    .unwrap();
println!("Duration: {:?}s", response.duration);
for segment in response.segments.unwrap_or_default() {
    println!("[{:.2}s - {:.2}s] {}", segment.start, segment.end, segment.text);
}
```
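Given the 25MB upload limit noted above, a pre-flight size check avoids a doomed upload. A sketch using only `std::fs`; the constant mirrors the limit stated in this README:

```rust
use std::fs;
use std::path::Path;

const MAX_UPLOAD_BYTES: u64 = 25 * 1024 * 1024; // 25MB limit from the docs above

/// Returns Err if the audio file exceeds the provider's upload limit.
fn check_audio_size(path: &Path) -> Result<(), String> {
    let size = fs::metadata(path)
        .map_err(|e| format!("Cannot stat {}: {}", path.display(), e))?
        .len();
    if size > MAX_UPLOAD_BYTES {
        Err(format!(
            "{} is {} bytes; max is {} bytes",
            path.display(),
            size,
            MAX_UPLOAD_BYTES
        ))
    } else {
        Ok(())
    }
}
```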
| Provider | API Endpoint | Environment Variable |
|---|---|---|
| Groq | https://api.groq.com/openai/v1/chat/completions | `GROQ_API_KEY` |
| OpenRouter | https://openrouter.ai/api/v1/chat/completions | `OPENROUTER_API_KEY` |
| SambaNova | https://api.sambanova.ai/v1/chat/completions | `SAMBANOVA_API_KEY` |

The `modeltest` binary tests model availability across all configured providers.
```bash
# Build and run
cargo run --bin modeltest

# Or after building
./target/debug/modeltest
```
Example output:

```text
herolib-ai Model Test Utility
Testing model availability across all providers

Configured providers:
  - Groq
  - OpenRouter

======================================================================
Phase 1: Querying Provider Model Lists
======================================================================
Querying Groq... OK (42 models)
Querying OpenRouter... OK (256 models)

======================================================================
Phase 2: Validating Model Mappings
======================================================================
Llama 3.3 70B (Llama 3.3 70B - Fast, capable model for general tasks):
  Groq llama-3.3-70b-versatile -> OK
  OpenRouter meta-llama/llama-3.3-70b-instruct -> OK

======================================================================
Phase 3: Testing Models with 'whoami' Query
======================================================================

--------------------------------------------------
Testing: Llama 3.3 70B
--------------------------------------------------
Groq (llama-3.3-70b-versatile)... OK (523ms)
  Response: I'm LLaMA, a large language model trained by Meta AI.

======================================================================
Final Report
======================================================================
Test Summary:
  Total tests: 20
  Successful: 18
  Failed: 2
  Success rate: 90.0%
```
```bash
./build.sh
./run.sh
```
License: Apache-2.0