llm-kit-huggingface

Crate: llm-kit-huggingface (crates.io)
Version: 0.1.0
Published: 2026-01-18
Description: Hugging Face provider for LLM Kit
Repository: https://github.com/saribmah/llm-kit
Author: Sarib Mahmood (saribmah)

README

LLM Kit Hugging Face

Hugging Face provider for LLM Kit: a complete integration with the Hugging Face Responses API for chat models, including streaming, tool calling, multi-modal input, and structured output.

Note: This provider uses the standardized builder pattern. See the Quick Start section for the recommended usage.

Features

  • Text Generation: Generate text using Hugging Face models via the Responses API
  • Streaming: Stream responses in real-time with support for tool calls
  • Tool Calling: Support for function calling with automatic execution
  • Multi-modal: Support for text and image inputs
  • Provider-Executed Tools: MCP (Model Context Protocol) integration for server-side tool execution
  • Source Annotations: Automatic source citations in responses
  • Structured Output: JSON schema support for constrained generation
  • Reasoning Content: Support for models with reasoning capabilities

Installation

Add this to your Cargo.toml:

[dependencies]
llm-kit-huggingface = "0.1"
llm-kit-core = "0.1"
llm-kit-provider = "0.1"
tokio = { version = "1", features = ["full"] }

Quick Start

Using the Client Builder (Recommended)

use llm_kit_huggingface::HuggingFaceClient;
use llm_kit_provider::language_model::LanguageModel;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create provider using the client builder
    let provider = HuggingFaceClient::new()
        .api_key("your-api-key")  // Or use HUGGINGFACE_API_KEY env var
        .build();
    
    // Create a language model
    let model = provider.responses("meta-llama/Llama-3.1-8B-Instruct");
    
    println!("Model: {}", model.model_id());
    println!("Provider: {}", model.provider());
    Ok(())
}

Using Settings Directly (Alternative)

use llm_kit_huggingface::{HuggingFaceProvider, HuggingFaceProviderSettings};
use llm_kit_provider::language_model::LanguageModel;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create provider with settings
    let provider = HuggingFaceProvider::new(
        HuggingFaceProviderSettings::new()
            .with_api_key("your-api-key")
    );
    
    let model = provider.responses("meta-llama/Llama-3.1-8B-Instruct");
    
    println!("Model: {}", model.model_id());
    Ok(())
}

Configuration

Environment Variables

Set your Hugging Face API key as an environment variable:

export HUGGINGFACE_API_KEY=your-api-key

Using the Client Builder

use llm_kit_huggingface::HuggingFaceClient;

let provider = HuggingFaceClient::new()
    .api_key("your-api-key")
    .base_url("https://router.huggingface.co/v1")
    .header("Custom-Header", "value")
    .name("my-huggingface")
    .build();

Builder Methods

The HuggingFaceClient builder supports:

  • .api_key(key) - Set the API key (overrides HUGGINGFACE_API_KEY environment variable)
  • .base_url(url) - Set custom base URL (default: https://router.huggingface.co/v1)
  • .name(name) - Set provider name (optional)
  • .header(key, value) - Add a single custom header
  • .headers(map) - Add multiple custom headers
  • .build() - Build the provider
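
As a sketch of combining these options, `.headers(map)` can be used to set several custom headers at once alongside a single `.header(...)` call. The map type accepted by `.headers(...)` is an assumption here (a `HashMap<String, String>`); check the crate docs for the exact signature.

```rust
use std::collections::HashMap;

use llm_kit_huggingface::HuggingFaceClient;

// Build several custom headers up front. The concrete map type expected by
// `.headers(...)` is assumed to be HashMap<String, String>.
let mut headers = HashMap::new();
headers.insert("X-Request-Source".to_string(), "docs-example".to_string());
headers.insert("X-Team".to_string(), "platform".to_string());

let provider = HuggingFaceClient::new()
    .api_key("your-api-key")
    .headers(headers)          // add multiple headers in one call
    .header("X-Trace", "on")   // plus one more individual header
    .build();
```

Headers set on the builder apply to every request the provider makes, which is useful for request tracing or routing through a proxy.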

Provider-Specific Options

Provider-Executed Tools (MCP)

Hugging Face Responses API supports Model Context Protocol (MCP), allowing tools to be executed on the provider side:

use llm_kit_core::{GenerateText, prompt::Prompt, tool::ToolSet};

// `tool_set` is a `ToolSet` built elsewhere with your function definitions.
// When tools are called, check whether they were executed by the provider.
let result = GenerateText::new(model, Prompt::text("What's the weather?"))
    .tools(tool_set)
    .execute()
    .await?;

for content in result.content {
    if let LanguageModelContent::ToolCall(call) = content {
        if call.provider_executed == Some(true) {
            println!("Tool executed by provider: {}", call.tool_name);
        }
    }
}

Source Annotations

The API automatically returns source citations as separate content items:

for content in result.content {
    if let LanguageModelContent::Source(source) = content {
        println!("Source: {} - {}", source.url, source.title);
    }
}

Structured Output

Use JSON schema to constrain model output:

use llm_kit_core::response_format::ResponseFormat;
use serde_json::json;

let result = GenerateText::new(model, Prompt::text("Generate a person"))
    .response_format(ResponseFormat::json_schema(
        json!({
            "type": "object",
            "properties": {
                "name": { "type": "string" },
                "age": { "type": "number" }
            }
        })
    ))
    .execute()
    .await?;

Multi-modal Inputs

Include images in your prompts:

use llm_kit_core::prompt::Prompt;

let prompt = Prompt::new()
    .add_text("What's in this image?")
    .add_image("https://example.com/image.jpg");

let result = GenerateText::new(model, prompt)
    .execute()
    .await?;

Supported Models

The provider includes constants for popular models:

use llm_kit_huggingface::{
    LLAMA_3_1_8B_INSTRUCT,
    LLAMA_3_1_70B_INSTRUCT,
    LLAMA_3_3_70B_INSTRUCT,
    DEEPSEEK_V3_1,
    DEEPSEEK_R1,
    QWEN3_32B,
    QWEN2_5_7B_INSTRUCT,
    GEMMA_2_9B_IT,
    KIMI_K2_INSTRUCT,
};

let model = provider.responses(LLAMA_3_1_8B_INSTRUCT);

All models available via the Hugging Face Responses API are supported. You can also use any model ID as a string:

let model = provider.responses("meta-llama/Llama-3.1-8B-Instruct");

For a complete list of available models, see the Hugging Face Responses API documentation.

Supported Settings

Setting             Supported  Notes
temperature         Yes        Temperature for sampling
top_p               Yes        Nucleus sampling
max_output_tokens   Yes        Maximum tokens to generate
tools               Yes        Function calling with MCP support
tool_choice         Yes        auto, required, specific tool
response_format     Yes        JSON schema support
top_k               No         Not supported by API
seed                No         Not supported by API
presence_penalty    No         Not supported by API
frequency_penalty   No         Not supported by API
stop_sequences      No         Not supported by API
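
The supported settings above are typically applied on the generation call. A minimal sketch, assuming `GenerateText` exposes builder-style setters with these names (the setter names and the `result.text` field are assumptions, not confirmed by this README):

```rust
use llm_kit_core::{GenerateText, prompt::Prompt};

// Hypothetical setter names -- consult the llm-kit-core docs for the exact API.
let result = GenerateText::new(model, Prompt::text("Summarize Rust's ownership model"))
    .temperature(0.7)        // sampling temperature
    .top_p(0.9)              // nucleus sampling
    .max_output_tokens(256)  // cap on generated tokens
    .execute()
    .await?;

println!("{}", result.text);
```

Unsupported settings (top_k, seed, penalties, stop sequences) are limitations of the Responses API itself, so setting them would have no effect regardless of the client.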

Examples

See the examples/ directory for complete examples:

  • chat.rs - Basic chat completion with usage statistics
  • stream.rs - Streaming responses with real-time output
  • chat_tool_calling.rs - Tool calling with function definitions
  • stream_tool_calling.rs - Streaming with tool calls

Run examples with:

export HUGGINGFACE_API_KEY=your-api-key
cargo run --example chat
cargo run --example stream
cargo run --example chat_tool_calling
cargo run --example stream_tool_calling

License

MIT

Contributing

Contributions are welcome! Please see the Contributing Guide for more details.
