omniference 0.1.3

A multi-protocol inference engine with provider adapters.

Repository: https://github.com/Henrik-3/omniference
Author: Henrik Mertens (Henrik-3)

README

Omniference

A flexible, multi-protocol inference engine that provides a unified interface to AI model providers such as Ollama and OpenAI through a common API.

Features

  • Multi-provider support: Ollama, OpenAI, and extensible architecture for more providers
  • Streaming support: Real-time streaming responses from AI models
  • OpenAI-compatible API: Drop-in replacement for OpenAI's API
  • Multiple interfaces: HTTP server, Discord bot, CLI, library usage
  • Embeddable: Can be integrated into existing Axum applications
  • Async/await: Built on Tokio for high-performance async operations
  • Type-safe: Strong typing throughout the library

Architecture

The library is organized in layers:

  • Core Layer: Router, adapters, and types (pure inference logic)
  • Service Layer: Provider management and model resolution
  • Interface Layer: HTTP APIs, Discord bot, CLI, etc.
  • Application Layer: Full server or embeddable components
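
In code, these layers are reached through the two entry points used throughout this README: OmniferenceEngine roughly covers the core and service layers, while OmniferenceServer adds the interface and application layers on top. Both are shown in the examples below.

// Core + service layers: drive inference directly from your own async code.
use omniference::OmniferenceEngine;

// Interface + application layers: expose an OpenAI-compatible HTTP surface,
// standalone or nested into an existing Axum router.
use omniference::server::OmniferenceServer;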

Quick Start

Installation

Add to your Cargo.toml:

[dependencies]
omniference = "0.1.0"

Usage Examples

1. Library Usage

Use Omniference as a library in any async context:

use omniference::{OmniferenceEngine, types::{ProviderConfig, ProviderKind, ProviderEndpoint}};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Create engine
    let mut engine = OmniferenceEngine::new();
    
    // Register provider
    engine.register_provider(ProviderConfig {
        name: "ollama".to_string(),
        endpoint: ProviderEndpoint {
            kind: ProviderKind::Ollama,
            base_url: "http://localhost:11434".to_string(),
            api_key: None,
            extra_headers: std::collections::BTreeMap::new(),
            timeout: Some(30000),
        },
        enabled: true,
    }).await?;
    
    // Create chat request
    let request = omniference::types::ChatRequestIR {
        model: omniference::types::ModelRef {
            alias: "llama3.2".to_string(),
            // Pick the provider of the first discovered model (assumes at least one is available).
            provider: engine.list_models().await[0].provider.clone(),
        },
        messages: vec![
            omniference::types::Message {
                role: "user".to_string(),
                content: "Hello! How are you?".to_string(),
            },
        ],
        metadata: std::collections::HashMap::new(),
        stream: false,
    };
    
    // Execute chat
    let response = engine.chat_complete(request).await?;
    println!("Response: {}", response);
    
    Ok(())
}

2. Standalone HTTP Server

Run as a standalone HTTP server with an OpenAI-compatible API:

use omniference::{server::OmniferenceServer, types::{ProviderConfig, ProviderKind, ProviderEndpoint}};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let mut server = OmniferenceServer::new();
    
    server.add_provider(ProviderConfig {
        name: "ollama".to_string(),
        endpoint: ProviderEndpoint {
            kind: ProviderKind::Ollama,
            base_url: "http://localhost:11434".to_string(),
            api_key: None,
            extra_headers: std::collections::BTreeMap::new(),
            timeout: Some(30000),
        },
        enabled: true,
    }).await?;
    
    server.run("0.0.0.0:8080").await
}

3. Embedded in Existing Axum Application

Integrate into an existing Axum application:

use axum::{routing::get, Router};
use omniference::{server::OmniferenceServer, types::{ProviderConfig, ProviderKind, ProviderEndpoint}};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Your existing app
    let app = Router::new()
        .route("/", get(|| async { "Welcome to my app!" }));
    
    // Add Omniference
    let mut omniference_server = OmniferenceServer::new();
    
    omniference_server.add_provider(ProviderConfig {
        name: "ollama".to_string(),
        endpoint: ProviderEndpoint {
            kind: ProviderKind::Ollama,
            base_url: "http://localhost:11434".to_string(),
            api_key: None,
            extra_headers: std::collections::BTreeMap::new(),
            timeout: Some(30000),
        },
        enabled: true,
    }).await?;
    
    // Mount Omniference under /ai
    let app = app.nest("/ai", omniference_server.app());
    
    // Run combined app
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await?;
    axum::serve(listener, app).await?;
    
    Ok(())
}
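
With the Omniference routes nested under /ai, the endpoints listed under API Endpoints below are served beneath that prefix (assuming the mounted router exposes the same paths as the standalone server), for example:

curl http://localhost:3000/ai/api/openai-compatible/v1/models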

4. Discord Bot Integration

Create a Discord bot with AI capabilities. Discord support is behind the optional discord feature, so enable it in Cargo.toml:

[dependencies]
omniference = { version = "0.1.0", features = ["discord"] }

Then wire the engine into your bot:

use omniference::{OmniferenceEngine, types::{ProviderConfig, ProviderKind, ProviderEndpoint}};

// Set up engine in your Discord bot handler
let mut engine = OmniferenceEngine::new();
engine.register_provider(ProviderConfig {
    // ... provider configuration
}).await?;

// Use in message handlers (`request` is built as in the library example above)
let response = engine.chat_complete(request).await?;

Examples

The crate includes several examples:

  • cargo run --example library_usage - Basic library usage
  • cargo run --example embedded_axum - Embed in existing Axum app
  • cargo run --example standalone_server - Run as standalone server
  • cargo run --example discord_bot - Discord bot integration

API Endpoints

When running as a server, Omniference provides:

  • POST /api/openai/v1/responses - OpenAI Responses API (new, OpenAI-only)
  • GET /api/openai/v1/models - List available models
  • POST /api/openai-compatible/v1/chat/completions - OpenAI-compatible Chat Completions
  • GET /api/openai-compatible/v1/models - OpenAI-compatible models endpoint
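
For example, a chat completion against the OpenAI-compatible endpoint of the standalone server, using the provider-prefixed model format described under Model Resolution below. The request body follows the standard OpenAI Chat Completions shape; the host, port, and model name are illustrative:

curl http://localhost:8080/api/openai-compatible/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ollama/llama3.2",
    "messages": [{"role": "user", "content": "Hello! How are you?"}],
    "stream": false
  }'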

Configuration

Examples via .env

Examples read configuration from environment variables. Copy .env.example to .env and set values as needed:

cp .env.example .env
# then edit .env

Key variables:

  • OLLAMA_BASE_URL (default http://localhost:11434)
  • OPENAI_BASE_URL (default https://api.openai.com)
  • OPENAI_API_KEY (required for OpenAI-compatible examples)
  • DISCORD_TOKEN (required for the Discord example)
  • SERVER_ADDR and EMBEDDED_SERVER_ADDR to change example ports
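
A sample .env for the examples (values below are illustrative placeholders; set only what the example you run actually needs):

OLLAMA_BASE_URL=http://localhost:11434
OPENAI_BASE_URL=https://api.openai.com
OPENAI_API_KEY=your-openai-api-key
DISCORD_TOKEN=your-discord-bot-token
SERVER_ADDR=0.0.0.0:8080
EMBEDDED_SERVER_ADDR=0.0.0.0:3000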

Provider Configuration

Configure providers with custom endpoints and settings:

ProviderConfig {
    name: "my-ollama".to_string(),
    endpoint: ProviderEndpoint {
        kind: ProviderKind::Ollama,
        base_url: "http://custom-server:11434".to_string(),
        api_key: Some("your-api-key".to_string()),
        extra_headers: {
            let mut headers = std::collections::BTreeMap::new();
            headers.insert("Custom-Header".to_string(), "value".to_string());
            headers
        },
        timeout: Some(60000),
    },
    enabled: true,
}
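
An OpenAI-backed provider is configured the same way. A minimal sketch, assuming a ProviderKind variant for OpenAI exists (check the crate's ProviderKind enum for the exact name) and reading credentials from the environment variables described above:

ProviderConfig {
    name: "openai".to_string(),
    endpoint: ProviderEndpoint {
        // Variant name assumed; consult the ProviderKind enum in the crate docs.
        kind: ProviderKind::OpenAI,
        base_url: std::env::var("OPENAI_BASE_URL")
            .unwrap_or_else(|_| "https://api.openai.com".to_string()),
        api_key: std::env::var("OPENAI_API_KEY").ok(),
        extra_headers: std::collections::BTreeMap::new(),
        timeout: Some(60000),
    },
    enabled: true,
}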

Model Resolution

Models are auto-discovered from providers and can be referenced using:

  • Provider-prefixed format: ollama/llama3.2
  • Direct model names: llama3.2
  • Custom aliases configured by your application

Building and Testing

# Build the library
cargo build

# Run tests
cargo test

# Run examples
cargo run --example library_usage
cargo run --example standalone_server

# Run with Discord support
cargo run --example discord_bot --features discord

Features

  • default: Core functionality without optional dependencies
  • discord: Enables Discord bot integration with Serenity

License

MIT - see LICENSE for details.

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

Roadmap

  • Support for more providers (OpenAI, Anthropic, etc.)
  • Additional protocol skins (Anthropic Messages API, etc.)
  • Advanced configuration management
  • Performance optimizations and caching
  • Monitoring and metrics
  • Load balancing and failover