| Crates.io | llm-kit-groq |
| lib.rs | llm-kit-groq |
| version | 0.1.0 |
| created_at | 2026-01-18 20:38:07.04793+00 |
| updated_at | 2026-01-18 20:38:07.04793+00 |
| description | Groq provider implementation for the LLM Kit - supports chat and transcription models |
| homepage | |
| repository | https://github.com/saribmah/llm-kit |
| max_upload_size | |
| id | 2053052 |
| size | 177,179 |
Groq provider for LLM Kit - Ultra-fast LLM inference with open-source models powered by Groq's LPU architecture.
Note: This provider uses the standardized builder pattern. See the Quick Start section for the recommended usage.
Add this to your Cargo.toml:
[dependencies]
llm-kit-groq = "0.1"
llm-kit-core = "0.1"
llm-kit-provider = "0.1"
tokio = { version = "1", features = ["full"] }
use llm_kit_groq::GroqClient;
use llm_kit_provider::language_model::LanguageModel;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Create provider using the client builder
let provider = GroqClient::new()
.api_key("your-api-key") // Or use GROQ_API_KEY env var
.build();
// Create a language model
let model = provider.chat_model("llama-3.1-8b-instant");
println!("Model: {}", model.model_id());
println!("Provider: {}", model.provider());
Ok(())
}
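Once a model is created, it can be run through the GenerateText builder from llm-kit-core, covered in detail below. A minimal sketch, assuming the result exposes the generated output via a text field (check the crate docs for the exact shape):

use llm_kit_core::{GenerateText, Prompt};

// Run a simple completion with the model created above.
let result = GenerateText::new(model, Prompt::text("Say hello in one sentence."))
    .execute()
    .await?;
// The text field is an assumption here; the result type is defined in llm-kit-core.
println!("{}", result.text);

Alternatively, the provider can be constructed directly from its settings type instead of the client builder: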
use llm_kit_groq::{GroqProvider, GroqProviderSettings};
use llm_kit_provider::language_model::LanguageModel;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Create provider with settings
let provider = GroqProvider::new(GroqProviderSettings::default());
let model = provider.chat_model("llama-3.1-8b-instant");
println!("Model: {}", model.model_id());
Ok(())
}
Set your Groq API key as an environment variable:
export GROQ_API_KEY=your-api-key
export GROQ_BASE_URL=https://api.groq.com/openai/v1 # Optional
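With the variable set, the key can be loaded without hard-coding it. A minimal sketch using the builder's explicit loader:

use llm_kit_groq::GroqClient;

// Read the API key from the GROQ_API_KEY environment variable
// instead of passing it in code.
let provider = GroqClient::new()
    .load_api_key_from_env()
    .build();

For full control, the builder also accepts explicit configuration: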
use llm_kit_groq::GroqClient;
let provider = GroqClient::new()
.api_key("your-api-key")
.base_url("https://api.groq.com/openai/v1")
.header("Custom-Header", "value")
.name("my-groq-provider")
.build();
use llm_kit_groq::{GroqProvider, GroqProviderSettings};
let settings = GroqProviderSettings::new()
.with_api_key("your-api-key")
.with_base_url("https://api.groq.com/openai/v1")
.add_header("Custom-Header", "value")
.with_name("my-groq-provider");
let provider = GroqProvider::new(settings);
The GroqClient builder supports:
- .api_key(key) - Set the API key (overrides the GROQ_API_KEY environment variable)
- .load_api_key_from_env() - Explicitly load the API key from the GROQ_API_KEY environment variable
- .base_url(url) - Set a custom base URL (default: https://api.groq.com/openai/v1)
- .name(name) - Set the provider name (optional)
- .header(key, value) - Add a single custom header
- .headers(map) - Add multiple custom headers (see the sketch after this list)
- .build() - Build the provider
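The headers(map) method merges several custom headers in one call. A minimal sketch, assuming the method accepts a std::collections::HashMap<String, String> (the exact map type is not documented here):

use std::collections::HashMap;
use llm_kit_groq::GroqClient;

// Collect custom headers, then attach them all at once.
let mut custom_headers = HashMap::new();
custom_headers.insert("X-Request-Source".to_string(), "readme-example".to_string());
custom_headers.insert("X-Trace-Id".to_string(), "abc-123".to_string());

let provider = GroqClient::new()
    .headers(custom_headers) // assumed to take a HashMap; see the crate docs
    .build();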
All Groq models are supported across multiple model families.

Chat models:
- llama-3.1-8b-instant
- llama-3.3-70b-versatile
- meta-llama/llama-guard-4-12b
- meta-llama/llama-4-maverick-17b-128e-instruct
- meta-llama/llama-4-scout-17b-16e-instruct
- gemma2-9b-it
- deepseek-r1-distill-llama-70b
- qwen/qwen3-32b
- mixtral-8x7b-32768

Transcription models:
- whisper-large-v3 - Most accurate Whisper model
- whisper-large-v3-turbo - Faster Whisper variant with lower latency
- distil-whisper-large-v3-en - English-optimized distilled model for faster performance

Speech models:
- playai-tts - PlayAI text-to-speech model

For a complete list of available models, see the Groq Models documentation.
Groq supports advanced features through provider options.
Control how reasoning content is returned:
use llm_kit_core::GenerateText;
use serde_json::json;
let result = GenerateText::new(model, prompt)
.provider_options(json!({
"reasoningFormat": "parsed" // "parsed", "raw", or "hidden"
}))
.execute()
.await?;
Configure computational effort for reasoning models:
use llm_kit_core::GenerateText;
use serde_json::json;
let result = GenerateText::new(model, prompt)
.provider_options(json!({
"reasoningEffort": "high" // Effort level as string
}))
.execute()
.await?;
Enable or disable parallel tool execution (default: true):
use llm_kit_core::GenerateText;
use serde_json::json;
let result = GenerateText::new(model, prompt)
    .tools(tools) // tools: tool definitions built elsewhere (see the tool-calling examples)
.provider_options(json!({
"parallelToolCalls": true
}))
.execute()
.await?;
Select the service tier for processing:
use llm_kit_core::GenerateText;
use serde_json::json;
let result = GenerateText::new(model, prompt)
.provider_options(json!({
"serviceTier": "flex" // "on_demand", "flex", or "auto"
}))
.execute()
.await?;
Groq provides metadata about cached tokens to help optimize performance:
use llm_kit_groq::GroqClient;
use llm_kit_core::{GenerateText, Prompt};
let provider = GroqClient::new().build();
let model = provider.chat_model("llama-3.1-8b-instant");
let result = GenerateText::new(model, Prompt::text("Hello!"))
.execute()
.await?;
// Access cache metadata
if let Some(metadata) = result.provider_metadata {
if let Some(groq) = metadata.get("groq") {
println!("Cached tokens: {:?}", groq.get("cachedTokens"));
}
}
| Option | Type | Description |
|---|---|---|
| reasoningFormat | string | Reasoning content format: "parsed", "raw", or "hidden" |
| reasoningEffort | string | Computational effort level for reasoning models |
| parallelToolCalls | bool | Enable parallel tool execution (default: true) |
| structuredOutputs | bool | Enable structured outputs (default: true) |
| serviceTier | string | Service tier: "on_demand", "flex", or "auto" |
| user | string | End-user identifier for abuse monitoring |
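These options can be combined in a single provider_options payload. A minimal sketch, reusing the model and prompt setup from the earlier examples:

use llm_kit_core::GenerateText;
use serde_json::json;

// Combine several provider options in one request; the values mirror the table above.
let result = GenerateText::new(model, prompt)
    .provider_options(json!({
        "reasoningFormat": "parsed",
        "serviceTier": "auto",
        "user": "user-1234" // end-user identifier for abuse monitoring
    }))
    .execute()
    .await?;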
See the examples/ directory for complete examples:
- chat.rs - Basic chat completion using do_generate() directly
- stream.rs - Streaming responses using do_stream() directly
- chat_tool_calling.rs - Tool calling using do_generate() directly
- stream_tool_calling.rs - Streaming with tools using do_stream() directly
- speech_generation.rs - Text-to-speech using do_generate() directly
- transcription.rs - Audio transcription using do_transcribe() directly

Run examples with:
export GROQ_API_KEY="your-api-key"
cargo run --example chat
cargo run --example stream
cargo run --example chat_tool_calling
cargo run --example stream_tool_calling
cargo run --example speech_generation
cargo run --example transcription
Licensed under:
Contributions are welcome! Please see the Contributing Guide for more details.