rexis-llm

Crates.io: rexis-llm
lib.rs: rexis-llm
version: 0.1.0
created_at: 2025-11-09 11:29:45.39326+00
updated_at: 2025-11-09 11:29:45.39326+00
description: Rexis LLM - Multi-provider LLM client with streaming, tool calling, and JSON schema support
homepage: https://github.com/0xteamhq/rexis/tree/main/crates/rexis-llm
repository: https://github.com/0xteamhq/rexis
id: 1924027
size: 312,759
owner: vasanth (0xvasanth)
documentation: https://docs.rs/rexis-llm

README

RSLLM - Rust LLM Client Library

RSLLM is a Rust-native client library for Large Language Models with multi-provider support, streaming capabilities, and type-safe interfaces.

πŸš€ Features

  • πŸ€– Multi-Provider Support: OpenAI, Anthropic Claude, Ollama, and more
  • ⚑ Streaming Responses: Real-time token streaming with async iterators
  • πŸ›‘οΈ Type Safety: Compile-time guarantees for API contracts
  • πŸ“Š Memory Efficient: Zero-copy operations where possible
  • πŸ”Œ Easy Integration: Seamless integration with RAG frameworks like RRAG
  • βš™οΈ Configurable: Flexible configuration with builder patterns
  • 🌊 Async-First: Built around async/await from the ground up

πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Application   │───▢│    RSLLM        │───▢│   LLM Provider  β”‚
β”‚   (RRAG, etc)   β”‚    β”‚    Client       β”‚    β”‚  (OpenAI/etc)   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                β”‚
                                β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Streaming     │◀───│   Provider      │◀───│    HTTP/API     β”‚
β”‚   Response      β”‚    β”‚   Abstraction   β”‚    β”‚    Transport    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
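
The middle layer is the seam that keeps application code provider-agnostic: every backend sits behind one shared async interface, and the client simply dispatches to whichever provider it was built with. A minimal sketch of that shape (trait and type names here are illustrative, not the crate's actual API):

use async_trait::async_trait;

// Illustrative types only; the crate's real names and signatures may differ.
pub struct ChatMessage { pub role: String, pub content: String }
pub struct ChatResponse { pub content: String }
pub type ProviderError = Box<dyn std::error::Error + Send + Sync>;

// Each backend (OpenAI, Claude, Ollama, ...) implements one shared trait,
// so the client can hold a `Box<dyn LlmProvider>` and stay backend-agnostic.
#[async_trait]
pub trait LlmProvider: Send + Sync {
    async fn chat_completion(&self, messages: &[ChatMessage]) -> Result<ChatResponse, ProviderError>;
}

pub struct Client {
    provider: Box<dyn LlmProvider>,
}

impl Client {
    // The public API simply forwards to the selected provider.
    pub async fn chat_completion(&self, messages: Vec<ChatMessage>) -> Result<ChatResponse, ProviderError> {
        self.provider.chat_completion(&messages).await
    }
}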

πŸš€ Quick Start

Add RSLLM to your Cargo.toml:

[dependencies]
rsllm = "0.1"
tokio = { version = "1.0", features = ["full"] }

Basic Chat Completion

use rsllm::{Client, Provider, ChatMessage, MessageRole};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create client with OpenAI provider
    let client = Client::builder()
        .provider(Provider::OpenAI)
        .api_key("your-api-key")
        .model("gpt-4")
        .build()?;

    // Simple chat completion
    let messages = vec![
        ChatMessage::new(MessageRole::User, "What is Rust?")
    ];

    let response = client.chat_completion(messages).await?;
    println!("Response: {}", response.content);

    Ok(())
}

Streaming Responses

use rsllm::{Client, Provider, ChatMessage, MessageRole};
use futures::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::builder()
        .provider(Provider::OpenAI)
        .api_key("your-api-key")
        .model("gpt-4")
        .build()?;

    let messages = vec![
        ChatMessage::new(MessageRole::User, "Tell me a story")
    ];

    let mut stream = client.chat_completion_stream(messages).await?;

    while let Some(chunk) = stream.next().await {
        println!("{}", chunk?.content);
    }

    Ok(())
}
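
To capture the complete response as well as printing tokens live, accumulate the chunks as they arrive. This sketch reuses only the stream API shown above (and assumes, as above, that each chunk carries a content string):

let mut stream = client.chat_completion_stream(messages).await?;
let mut full_text = String::new();

while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    print!("{}", chunk.content);        // print tokens as they stream in
    full_text.push_str(&chunk.content); // and keep the full response text
}

println!();
println!("Received {} characters in total", full_text.len());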

Multiple Providers

use rsllm::{Client, Provider};

// OpenAI
let openai_client = Client::builder()
    .provider(Provider::OpenAI)
    .api_key("openai-api-key")
    .model("gpt-4")
    .build()?;

// Anthropic Claude
let claude_client = Client::builder()
    .provider(Provider::Claude)
    .api_key("claude-api-key")
    .model("claude-3-sonnet")
    .build()?;

// Local Ollama
let ollama_client = Client::builder()
    .provider(Provider::Ollama)
    .base_url("http://localhost:11434")
    .model("llama3.1")
    .build()?;
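
Because every backend goes through the same builder, choosing a provider at runtime is just a matter of mapping a name to a Provider variant. A small hypothetical helper, built only from the builder calls shown above:

use rsllm::{Client, Provider};

// Hypothetical helper: choose a backend from a config string at runtime.
fn client_for(name: &str, api_key: &str) -> Result<Client, Box<dyn std::error::Error>> {
    let provider = match name {
        "openai" => Provider::OpenAI,
        "claude" => Provider::Claude,
        "ollama" => Provider::Ollama,
        other => return Err(format!("unknown provider: {other}").into()),
    };

    Ok(Client::builder()
        .provider(provider)
        .api_key(api_key)
        .build()?)
}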

πŸ”§ Configuration

RSLLM supports extensive configuration options:

use rsllm::{Client, Provider, ClientConfig};
use std::time::Duration;

let client = Client::builder()
    .provider(Provider::OpenAI)
    .api_key("your-api-key")
    .model("gpt-4")
    .base_url("https://api.openai.com/v1")
    .timeout(Duration::from_secs(60))
    .max_tokens(4096)
    .temperature(0.7)
    .build()?;

Environment Variables

RSLLM supports configuration through environment variables, perfect for CI/CD pipelines, different deployment environments, and custom/self-hosted endpoints:

use rsllm::Client;

// Load configuration from environment variables
let client = Client::from_env()?;

Supported Environment Variables

Provider Configuration:

  • RSLLM_PROVIDER - Provider name (openai, claude, ollama)
  • RSLLM_API_KEY - API key for the provider

Base URL Configuration (supports custom/self-hosted endpoints):

  • RSLLM_BASE_URL - Generic base URL (works for any provider)
  • RSLLM_OPENAI_BASE_URL - OpenAI-specific base URL (overrides generic)
  • RSLLM_OLLAMA_BASE_URL - Ollama-specific base URL (overrides generic)
  • RSLLM_CLAUDE_BASE_URL - Claude-specific base URL (overrides generic)

Model Configuration (supports custom models and fine-tuned models):

  • RSLLM_MODEL - Generic model name
  • RSLLM_OPENAI_MODEL - OpenAI-specific model (overrides generic)
  • RSLLM_OLLAMA_MODEL - Ollama-specific model (overrides generic)
  • RSLLM_CLAUDE_MODEL - Claude-specific model (overrides generic)

Other Settings:

  • RSLLM_TEMPERATURE - Temperature setting (0.0 to 2.0)
  • RSLLM_MAX_TOKENS - Maximum tokens to generate
  • RSLLM_TIMEOUT - Request timeout in seconds

Example .env file:

# Using Ollama with custom model
RSLLM_PROVIDER=ollama
RSLLM_OLLAMA_BASE_URL=http://localhost:11434/api
RSLLM_OLLAMA_MODEL=llama3.2:3b
RSLLM_TEMPERATURE=0.7

# Or using a self-hosted OpenAI-compatible endpoint
RSLLM_PROVIDER=openai
RSLLM_OPENAI_BASE_URL=https://my-custom-llm.example.com/v1
RSLLM_OPENAI_MODEL=my-fine-tuned-gpt-4
RSLLM_API_KEY=my-custom-api-key

Key Features:

  • βœ… Custom Models Supported: Use any model name, not limited to predefined lists
  • βœ… Custom Endpoints: Point to self-hosted or custom LLM endpoints
  • βœ… URL Flexibility: Base URLs work with or without trailing slashes
  • βœ… Provider-Specific Priority: Provider-specific env vars override generic ones (see the sketch below)
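
The lookup order is: provider-specific variable first, then the generic one, then the builder defaults. A rough sketch of that precedence (illustrative only, not the crate's internals):

use std::env;

// Illustrative: resolve one setting the way the tables above describe,
// e.g. RSLLM_OLLAMA_BASE_URL overrides RSLLM_BASE_URL when the provider is ollama.
fn resolve(provider: &str, key: &str) -> Option<String> {
    let specific = format!("RSLLM_{}_{}", provider.to_uppercase(), key); // RSLLM_OLLAMA_BASE_URL
    let generic = format!("RSLLM_{}", key);                              // RSLLM_BASE_URL
    env::var(&specific).or_else(|_| env::var(&generic)).ok()
}

// resolve("ollama", "BASE_URL") returns the provider-specific value if set,
// otherwise the generic value, otherwise None (fall back to builder defaults).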

🌟 Supported Providers

Provider          Status  Models                          Streaming
OpenAI            βœ…      GPT-4, GPT-3.5                  βœ…
Anthropic Claude  βœ…      Claude-3 (Sonnet, Opus, Haiku)  βœ…
Ollama            βœ…      Llama, Mistral, CodeLlama       βœ…
Azure OpenAI      🚧      GPT-4, GPT-3.5                  🚧
Cohere            πŸ“      Command                         πŸ“
Google Gemini     πŸ“      Gemini Pro                      πŸ“

Legend: βœ… Supported | 🚧 In Progress | πŸ“ Planned

πŸ“– Documentation

Full API documentation is available at https://docs.rs/rexis-llm.

πŸ”§ Feature Flags

[dependencies.rsllm]
version = "0.1"
features = [
    "openai",        # OpenAI provider support
    "claude",        # Anthropic Claude support
    "ollama",        # Ollama local model support
    "streaming",     # Streaming response support
    "json-schema",   # JSON schema support for structured outputs
]

🀝 Integration with RRAG

RSLLM is designed to work seamlessly with the RRAG framework:

use rrag::prelude::*;
use rsllm::{Client, Provider};

let llm_client = Client::builder()
    .provider(Provider::OpenAI)
    .api_key("your-api-key")
    .build()?;

let rag_system = RragSystemBuilder::new()
    .with_llm_client(llm_client)
    .build()
    .await?;

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

🀝 Contributing

Contributions are welcome! Please see our Contributing Guidelines for details.


Part of the RRAG ecosystem - Build powerful RAG applications with Rust.
