| Crates.io | api_huggingface |
| lib.rs | api_huggingface |
| version | 0.3.0 |
| created_at | 2025-11-06 19:03:04.371462+00 |
| updated_at | 2025-11-29 20:43:16.591298+00 |
| description | HuggingFace's API for accessing large language models (LLMs) and embeddings. |
| homepage | https://github.com/Wandalen/api_llm/tree/master/api/huggingface |
| repository | https://github.com/Wandalen/api_llm/tree/master/api/huggingface |
| max_upload_size | |
| id | 1920126 |
| size | 1,485,477 |
Comprehensive Rust client for HuggingFace Inference API with Router API support for Pro models.
This API crate is designed as a stateless HTTP client with zero persistence requirements. This keeps deployments lightweight and container-friendly and eliminates operational complexity.
The design goal is to expose all server-side functionality transparently while maintaining zero client-side intelligence or automatic behaviors.
Add to your `Cargo.toml`:

```toml
[dependencies]
api_huggingface = "0.3.0"
```
```rust
use api_huggingface::
{
  Client,
  environment::HuggingFaceEnvironmentImpl,
  components::{ input::InferenceParameters, models::Models },
  secret::Secret,
};

#[ tokio::main ]
async fn main() -> Result< (), Box< dyn std::error::Error > >
{
  let api_key = Secret::load_from_env( "HUGGINGFACE_API_KEY" )?;
  let env = HuggingFaceEnvironmentImpl::build( api_key, None )?;
  let client = Client::build( env )?;

  let params = InferenceParameters::new()
    .with_temperature( 0.7 )
    .with_max_new_tokens( 100 );

  let response = client.inference()
    .create_with_parameters
    (
      "What is the capital of France?",
      Models::llama_3_1_8b_instruct(),
      params
    )
    .await?;

  println!( "Response: {:?}", response );
  Ok( () )
}
```
```rust
use api_huggingface::{ Client, environment::HuggingFaceEnvironmentImpl, secret::Secret };

#[ tokio::main ]
async fn main() -> Result< (), Box< dyn std::error::Error > >
{
  let api_key = Secret::load_from_env( "HUGGINGFACE_API_KEY" )?;
  let env = HuggingFaceEnvironmentImpl::build( api_key, None )?;
  let client = Client::build( env )?;

  let embedding1 = client.embeddings().create( "Hello world" ).await?;
  let embedding2 = client.embeddings().create( "Hi there" ).await?;

  let similarity = client.embeddings().similarity( &embedding1, &embedding2 )?;
  println!( "Similarity: {:.4}", similarity );
  Ok( () )
}
```
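Similarity between embeddings is conventionally the cosine of the angle between the two vectors. As a standalone sketch of that computation (this helper is illustrative and independent of the crate; the crate's own `similarity` method may differ in normalization or edge-case handling):

```rust
// Cosine similarity between two embedding vectors:
// dot( a, b ) / ( ||a|| * ||b|| ), in [-1, 1] for non-zero inputs.
fn cosine_similarity( a : &[ f32 ], b : &[ f32 ] ) -> f32
{
  let dot : f32 = a.iter().zip( b ).map( | ( x, y ) | x * y ).sum();
  let norm_a : f32 = a.iter().map( | x | x * x ).sum::< f32 >().sqrt();
  let norm_b : f32 = b.iter().map( | x | x * x ).sum::< f32 >().sqrt();
  // Guard against zero vectors to avoid division by zero.
  if norm_a == 0.0 || norm_b == 0.0 { return 0.0; }
  dot / ( norm_a * norm_b )
}

fn main()
{
  let a = vec![ 1.0, 0.0 ];
  let b = vec![ 0.0, 1.0 ];
  // Orthogonal vectors have similarity 0; identical vectors have similarity 1.
  println!( "{}", cosine_similarity( &a, &b ) );
  println!( "{}", cosine_similarity( &a, &a ) );
}
```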
Create `secret/-secrets.sh` in your workspace root:

```shell
#!/bin/bash
export HUGGINGFACE_API_KEY="hf_your-key-here"
```
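The script only takes effect if it is sourced (not executed), so the export lands in your current shell. A self-contained check, using the placeholder path and key from above:

```shell
# Create the secrets file (placeholder key) and source it so the
# variable is visible to this shell and any child processes.
mkdir -p secret
cat > secret/-secrets.sh << 'EOF'
export HUGGINGFACE_API_KEY="hf_your-key-here"
EOF

. secret/-secrets.sh
echo "$HUGGINGFACE_API_KEY"
```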
Get your API key from huggingface.co/settings/tokens.
- `default` - Core async inference and embeddings
- `inference` - Text generation API
- `embeddings` - Embeddings generation
- `models` - Model discovery and status
- `inference-streaming` - SSE streaming support
- `embeddings-similarity` - Similarity calculations
- `embeddings-batch` - Batch processing
- `circuit-breaker` - Failure detection and recovery
- `rate-limiting` - Token bucket rate limiting
- `failover` - Multi-endpoint failover
- `health-checks` - Background health monitoring
- `dynamic-config` - Runtime configuration
- `sync` - Synchronous API wrappers
- `caching` - LRU caching with TTL
- `performance-metrics` - Request tracking
- `full` - All features enabled
- `integration` - Integration tests with real API

| Model | Provider | Capabilities |
|---|---|---|
| moonshotai/Kimi-K2-Instruct-0905 | groq | Chat completions |
| meta-llama/Llama-3.1-8B-Instruct | various | Chat completions |
| mistralai/Mistral-7B-Instruct | various | Chat completions |
| codellama/CodeLlama-34b-Instruct | various | Code generation |
| Model | Task |
|---|---|
| facebook/bart-large-cnn | Summarization |
| gpt2 | Text generation |
| sentence-transformers/all-MiniLM-L6-v2 | Embeddings |
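Individual capabilities can be selected through Cargo features. A minimal sketch of a trimmed-down dependency declaration (feature names are taken from the list above; whether `default` must be disabled to drop unused capabilities is an assumption about this crate's feature graph):

```toml
[dependencies]
api_huggingface = { version = "0.3.0", default-features = false, features = [
  "inference",
  "embeddings",
  "inference-streaming",
] }
```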
All dependencies workspace-managed for consistency.
```shell
cargo clippy -- -D warnings
```

License: MIT