llm-kit-huggingface

Crate: llm-kit-huggingface (crates.io)
Version: 0.1.0
Published: 2026-01-18
Description: Hugging Face provider for LLM Kit
Repository: https://github.com/saribmah/llm-kit
Author: Sarib Mahmood (saribmah)

README

LLM Kit Hugging Face

Hugging Face provider for LLM Kit: a complete integration with the Hugging Face Responses API for chat models, including streaming, tool calling, multi-modal input, and structured output.

Note: This provider uses the standardized builder pattern. See the Quick Start section for the recommended usage.

Features

  • Text Generation: Generate text using Hugging Face models via the Responses API
  • Streaming: Stream responses in real-time with support for tool calls
  • Tool Calling: Support for function calling with automatic execution
  • Multi-modal: Support for text and image inputs
  • Provider-Executed Tools: MCP (Model Context Protocol) integration for server-side tool execution
  • Source Annotations: Automatic source citations in responses
  • Structured Output: JSON schema support for constrained generation
  • Reasoning Content: Support for models with reasoning capabilities

Installation

Add this to your Cargo.toml:

[dependencies]
llm-kit-huggingface = "0.1"
llm-kit-core = "0.1"
llm-kit-provider = "0.1"
tokio = { version = "1", features = ["full"] }

Quick Start

Using the Client Builder (Recommended)

use llm_kit_huggingface::HuggingFaceClient;
use llm_kit_provider::language_model::LanguageModel;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create provider using the client builder
    let provider = HuggingFaceClient::new()
        .api_key("your-api-key")  // Or use HUGGINGFACE_API_KEY env var
        .build();
    
    // Create a language model
    let model = provider.responses("meta-llama/Llama-3.1-8B-Instruct");
    
    println!("Model: {}", model.model_id());
    println!("Provider: {}", model.provider());
    Ok(())
}

Using Settings Directly (Alternative)

use llm_kit_huggingface::{HuggingFaceProvider, HuggingFaceProviderSettings};
use llm_kit_provider::language_model::LanguageModel;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create provider with settings
    let provider = HuggingFaceProvider::new(
        HuggingFaceProviderSettings::new()
            .with_api_key("your-api-key")
    );
    
    let model = provider.responses("meta-llama/Llama-3.1-8B-Instruct");
    
    println!("Model: {}", model.model_id());
    Ok(())
}

Configuration

Environment Variables

Set your Hugging Face API key as an environment variable:

export HUGGINGFACE_API_KEY=your-api-key

Using the Client Builder

use llm_kit_huggingface::HuggingFaceClient;

let provider = HuggingFaceClient::new()
    .api_key("your-api-key")
    .base_url("https://router.huggingface.co/v1")
    .header("Custom-Header", "value")
    .name("my-huggingface")
    .build();

Builder Methods

The HuggingFaceClient builder supports:

  • .api_key(key) - Set the API key (overrides HUGGINGFACE_API_KEY environment variable)
  • .base_url(url) - Set custom base URL (default: https://router.huggingface.co/v1)
  • .name(name) - Set provider name (optional)
  • .header(key, value) - Add a single custom header
  • .headers(map) - Add multiple custom headers
  • .build() - Build the provider
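
As a sketch of combining these options, `.headers(map)` can be used to set several custom headers at once alongside a single `.header(...)` call. The map type accepted by `.headers(...)` is an assumption here (a `HashMap<String, String>`); check the crate docs for the exact signature.

```rust
use std::collections::HashMap;

use llm_kit_huggingface::HuggingFaceClient;

// Build several custom headers up front. The concrete map type expected by
// `.headers(...)` is assumed to be HashMap<String, String>.
let mut headers = HashMap::new();
headers.insert("X-Request-Source".to_string(), "docs-example".to_string());
headers.insert("X-Team".to_string(), "platform".to_string());

let provider = HuggingFaceClient::new()
    .api_key("your-api-key")
    .headers(headers)          // add multiple headers in one call
    .header("X-Trace", "on")   // plus one more individual header
    .build();
```

Headers set on the builder apply to every request the provider makes, which is useful for request tracing or routing through a proxy.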

Provider-Specific Options

Provider-Executed Tools (MCP)

Hugging Face Responses API supports Model Context Protocol (MCP), allowing tools to be executed on the provider side:

use llm_kit_core::{GenerateText, prompt::Prompt, tool::ToolSet};

// `tool_set` is a `ToolSet` built elsewhere with your function definitions.
// When tools are called, check whether they were executed by the provider.
let result = GenerateText::new(model, Prompt::text("What's the weather?"))
    .tools(tool_set)
    .execute()
    .await?;

for content in result.content {
    if let LanguageModelContent::ToolCall(call) = content {
        if call.provider_executed == Some(true) {
            println!("Tool executed by provider: {}", call.tool_name);
        }
    }
}

Source Annotations

The API automatically returns source citations as separate content items:

for content in result.content {
    if let LanguageModelContent::Source(source) = content {
        println!("Source: {} - {}", source.url, source.title);
    }
}

Structured Output

Use JSON schema to constrain model output:

use llm_kit_core::response_format::ResponseFormat;
use serde_json::json;

let result = GenerateText::new(model, Prompt::text("Generate a person"))
    .response_format(ResponseFormat::json_schema(
        json!({
            "type": "object",
            "properties": {
                "name": { "type": "string" },
                "age": { "type": "number" }
            }
        })
    ))
    .execute()
    .await?;

Multi-modal Inputs

Include images in your prompts:

use llm_kit_core::prompt::Prompt;

let prompt = Prompt::new()
    .add_text("What's in this image?")
    .add_image("https://example.com/image.jpg");

let result = GenerateText::new(model, prompt)
    .execute()
    .await?;

Supported Models

The provider includes constants for popular models:

use llm_kit_huggingface::{
    LLAMA_3_1_8B_INSTRUCT,
    LLAMA_3_1_70B_INSTRUCT,
    LLAMA_3_3_70B_INSTRUCT,
    DEEPSEEK_V3_1,
    DEEPSEEK_R1,
    QWEN3_32B,
    QWEN2_5_7B_INSTRUCT,
    GEMMA_2_9B_IT,
    KIMI_K2_INSTRUCT,
};

let model = provider.responses(LLAMA_3_1_8B_INSTRUCT);

All models available via the Hugging Face Responses API are supported. You can also use any model ID as a string:

let model = provider.responses("meta-llama/Llama-3.1-8B-Instruct");

For a complete list of available models, see the Hugging Face Responses API documentation.

Supported Settings

Setting             Supported  Notes
temperature         Yes        Temperature for sampling
top_p               Yes        Nucleus sampling
max_output_tokens   Yes        Maximum tokens to generate
tools               Yes        Function calling with MCP support
tool_choice         Yes        auto, required, specific tool
response_format     Yes        JSON schema support
top_k               No         Not supported by API
seed                No         Not supported by API
presence_penalty    No         Not supported by API
frequency_penalty   No         Not supported by API
stop_sequences      No         Not supported by API
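
The supported settings above are typically applied on the generation call. A minimal sketch, assuming `GenerateText` exposes builder-style setters with these names (the setter names and the `result.text` field are assumptions, not confirmed by this README):

```rust
use llm_kit_core::{GenerateText, prompt::Prompt};

// Hypothetical setter names -- consult the llm-kit-core docs for the exact API.
let result = GenerateText::new(model, Prompt::text("Summarize Rust's ownership model"))
    .temperature(0.7)        // sampling temperature
    .top_p(0.9)              // nucleus sampling
    .max_output_tokens(256)  // cap on generated tokens
    .execute()
    .await?;

println!("{}", result.text);
```

Unsupported settings (top_k, seed, penalties, stop sequences) are limitations of the Responses API itself, so setting them would have no effect regardless of the client.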

Examples

See the examples/ directory for complete examples:

  • chat.rs - Basic chat completion with usage statistics
  • stream.rs - Streaming responses with real-time output
  • chat_tool_calling.rs - Tool calling with function definitions
  • stream_tool_calling.rs - Streaming with tool calls

Run examples with:

export HUGGINGFACE_API_KEY=your-api-key
cargo run --example chat
cargo run --example stream
cargo run --example chat_tool_calling
cargo run --example stream_tool_calling

License

MIT

Contributing

Contributions are welcome! Please see the Contributing Guide for more details.
