llmprogram

version: 0.1.0
created_at: 2025-08-25 14:00:44 UTC
updated_at: 2025-08-25 14:00:44 UTC
description: A Rust library that provides a structured and powerful way to create and run programs that use Large Language Models (LLMs). It uses a YAML-based configuration to define the behavior of your LLM programs, making them easy to create, manage, and share.
homepage: https://github.com/skelf-research/llmprogram-rs
repository: https://github.com/skelf-research/llmprogram-rs
owner: Dipankar Sarkar (dipankar)

documentation

https://docs.rs/llmprogram

README

LLM Program (Rust Implementation)

llmprogram is a Rust crate that provides a structured and powerful way to create and run programs that use Large Language Models (LLMs). It uses a YAML-based configuration to define the behavior of your LLM programs, making them easy to create, manage, and share.

Features

  • YAML-based Configuration: Define your LLM programs using simple and intuitive YAML files.
  • Input/Output Validation: Use JSON schemas to validate the inputs and outputs of your programs, ensuring data integrity.
  • Tera Templating: Use the power of Tera templates (Rust's Jinja2 equivalent) to create dynamic prompts for your LLMs (see the short rendering sketch after this list).
  • Caching: Built-in support for Redis caching to save time and reduce costs.
  • Execution Logging: Automatically log program executions to a SQLite database for analysis and debugging.
  • Analytics: Comprehensive analytics tracking with SQLite for token usage, LLM calls, program usage, and timing metrics.
  • Streaming: Support for streaming responses from the LLM.
  • Batch Processing: Process multiple inputs in parallel for improved performance.
  • CLI for Dataset Generation: A command-line interface to generate instruction datasets for LLM fine-tuning from your logged data.
  • AI-Assisted YAML Generation: Generate LLM program YAML files automatically based on natural language descriptions.
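
To make the templating step concrete, here is a minimal standalone sketch using the tera crate directly. It only illustrates what prompt rendering amounts to; it is not llmprogram's internal API:

use tera::{Context, Tera};

fn main() -> tera::Result<()> {
    // Render a prompt template with an input variable, as llmprogram
    // does conceptually before sending the prompt to the LLM.
    let mut context = Context::new();
    context.insert("text", "I love this product!");

    let prompt = Tera::one_off("Analyze the following text:\n{{ text }}", &context, false)?;
    println!("{}", prompt);
    Ok(())
}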

Installation

Add this to your Cargo.toml:

[dependencies]
llmprogram = "0.1.0"

Or install the CLI globally:

cargo install llmprogram

Usage

CLI Usage

  1. Set your OpenAI API Key:

    export OPENAI_API_KEY='your-api-key'
    
  2. Create a program YAML file:

    Create a file named sentiment_analysis.yaml:

    name: sentiment_analysis
    description: Analyzes the sentiment of a given text.
    version: 1.0.0
    
    model:
      provider: openai
      name: gpt-4.1-mini
      temperature: 0.5
      max_tokens: 100
      response_format: json_object
    
    system_prompt: |
      You are a sentiment analysis expert. Analyze the sentiment of the given text and return a JSON response with the following format:
      - sentiment (string): "positive", "negative", or "neutral"
      - score (number): A score from -1 (most negative) to 1 (most positive)
    
    input_schema:
      type: object
      required:
        - text
      properties:
        text:
          type: string
          description: The text to analyze.
    
    output_schema:
      type: object
      required:
        - sentiment
        - score
      properties:
        sentiment:
          type: string
          enum: ["positive", "negative", "neutral"]
        score:
          type: number
          minimum: -1
          maximum: 1
    
    template: |
      Analyze the following text:
      {{text}}
    
  3. Run the program using the CLI:

    # Using a JSON input file
    llmprogram run sentiment_analysis.yaml --inputs examples/sentiment_inputs.json
    
    # Using inline JSON
    llmprogram run sentiment_analysis.yaml --input-json '{"text": "I love this product!"}'
    
    # Using stdin
    echo '{"text": "I love this product!"}' | llmprogram run sentiment_analysis.yaml
    
    # Using streaming output
    llmprogram run sentiment_analysis.yaml --inputs examples/sentiment_inputs.json --stream
    
    # Saving output to a file
    llmprogram run sentiment_analysis.yaml --inputs examples/sentiment_inputs.json --output result.json
    

Programmatic Usage

You can also use the llmprogram library directly in your Rust code:

use llmprogram::LLMProgram;
use std::collections::HashMap;
use serde_json::Value;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create and run the sentiment analysis program
    let program = LLMProgram::new("sentiment_analysis.yaml")?;
    
    let mut inputs = HashMap::new();
    inputs.insert("text".to_string(), Value::String("I love this new product! It is amazing.".to_string()));
    
    let result = program.run(&inputs).await?;
    println!("{}", serde_json::to_string_pretty(&result)?);
    
    Ok(())
}
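
The batch-processing feature can also be approximated by hand: assuming run takes &self as shown above, you can fan out several inputs concurrently with join_all from the futures crate (an extra dependency here). This is only a sketch; any dedicated batch API the crate exposes may differ:

use futures::future::join_all;
use llmprogram::LLMProgram;
use serde_json::Value;
use std::collections::HashMap;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let program = LLMProgram::new("sentiment_analysis.yaml")?;
    let texts = ["I love this!", "This is terrible.", "It is okay."];

    // Build one input map per text, then run all of them concurrently.
    let jobs = texts.iter().map(|text| {
        let mut inputs = HashMap::new();
        inputs.insert("text".to_string(), Value::String(text.to_string()));
        let program = &program;
        async move { program.run(&inputs).await }
    });

    for result in join_all(jobs).await {
        println!("{}", serde_json::to_string_pretty(&result?)?);
    }
    Ok(())
}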

Configuration

The behavior of each LLM program is defined in a YAML file. Here are the key sections:

  • name, description, version: Basic metadata for your program.
  • model: Defines the LLM provider, model name, and other parameters like temperature and max_tokens.
  • system_prompt: The instructions that are given to the LLM to guide its behavior.
  • input_schema: A JSON schema that defines the expected input for the program. The program validates inputs against this schema before execution (a standalone validation sketch follows this list).
  • output_schema: A JSON schema that defines the expected output from the LLM. The program will validate the LLM's output against this schema.
  • template: A Tera template that is used to generate the prompt that is sent to the LLM. The template is rendered with the input variables.
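
To illustrate what the input_schema check amounts to, here is a standalone sketch using the jsonschema crate (API as of its 0.17 line). llmprogram's internal validator may differ; this only demonstrates JSON Schema semantics:

use jsonschema::JSONSchema;
use serde_json::json;

fn main() {
    // Same shape as the input_schema from the sentiment example above.
    let schema = json!({
        "type": "object",
        "required": ["text"],
        "properties": { "text": { "type": "string" } }
    });
    let compiled = JSONSchema::compile(&schema).expect("schema is valid");

    // A conforming input passes; one missing the required key fails.
    assert!(compiled.is_valid(&json!({ "text": "I love this product!" })));
    assert!(!compiled.is_valid(&json!({ "score": 42 })));
}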

Using with other OpenAI-compatible endpoints

You can use llmprogram with any OpenAI-compatible endpoint, such as Ollama. To do this, pass an api_key and base_url to LLMProgram::new_with_options:

let program = LLMProgram::new_with_options(
    "your_program.yaml",
    Some("your-api-key".to_string()),
    Some("http://localhost:11434/v1".to_string()),  // example for Ollama
    true,  // enable_cache
    "redis://localhost:6379"
)?;

Caching

llmprogram supports caching of LLM responses to Redis to improve performance and reduce costs. To enable caching, you need to have a Redis server running.

By default, caching is enabled. You can disable it or configure the Redis connection and cache TTL (time-to-live) when you create an LLMProgram instance:

let program = LLMProgram::new_with_options(
    "your_program.yaml",
    None,  // api_key
    None,  // base_url
    false, // enable_cache
    "redis://localhost:6379"
)?;

Logging and Dataset Generation

llmprogram automatically logs every execution of a program to a SQLite database. The database file is created in the same directory as the program YAML file, with a .db extension.

This logging feature is not just for debugging; it's also a powerful tool for creating high-quality datasets for fine-tuning your own LLMs. Each record in the log contains:

  • function_input: The input given to the program.
  • function_output: The output received from the LLM.
  • llm_input: The prompt sent to the LLM.
  • llm_output: The raw response from the LLM.
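
Because the log is an ordinary SQLite file, you can also inspect it directly. Below is a sketch using the rusqlite crate; the table name "logs" is an assumption (the column names come from the field list above), so check the actual schema with the sqlite3 shell first:

use rusqlite::Connection;

fn main() -> rusqlite::Result<()> {
    // Open the log database created next to the program YAML file.
    let conn = Connection::open("sentiment_analysis.db")?;

    // NOTE: the table name "logs" is an assumption; run `.schema` in
    // the sqlite3 shell to discover the real table and column names.
    let mut stmt =
        conn.prepare("SELECT function_input, function_output FROM logs LIMIT 5")?;
    let rows = stmt.query_map([], |row| {
        Ok((row.get::<_, String>(0)?, row.get::<_, String>(1)?))
    })?;

    for row in rows {
        let (input, output) = row?;
        println!("input:  {}\noutput: {}\n", input, output);
    }
    Ok(())
}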

Generating a Dataset

You can use the built-in CLI to generate an instruction dataset from the logged data. The dataset is created in JSONL format, which is commonly used for fine-tuning.

llmprogram generate-dataset /path/to/your_program.db /path/to/your_dataset.jsonl

Each line in the output file will be a JSON object with the following keys:

  • instruction: The system prompt and the user prompt, combined to form the instruction for the LLM.
  • output: The output from the LLM.
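
For illustration only (the values here are invented), a single line of the generated JSONL might look like:

{"instruction": "You are a sentiment analysis expert. [...] Analyze the following text:\nI love this product!", "output": "{\"sentiment\": \"positive\", \"score\": 0.9}"}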

Command-Line Interface (CLI)

llmprogram comes with a command-line interface for common tasks.

run

Run an LLM program with inputs from the command line, a file, or stdin.

Usage:

# First, set your OpenAI API key
export OPENAI_API_KEY='your-api-key'

# Run with inputs from a JSON file
llmprogram run program.yaml --inputs inputs.json

# Run with inputs from command line
llmprogram run program.yaml --input-json '{"text": "I love this product!"}'

# Run with inputs from stdin
echo '{"text": "I love this product!"}' | llmprogram run program.yaml

# Run with streaming output
llmprogram run program.yaml --inputs inputs.json --stream

# Save output to a file
llmprogram run program.yaml --inputs inputs.json --output result.json

Arguments:

  • program_path: The path to the program YAML file.
  • --inputs, -i: Path to JSON/YAML file containing inputs.
  • --input-json: JSON string of inputs.
  • --output, -o: Path to output file (default: stdout).
  • --stream, -s: Stream the response.

generate-yaml

Generate an LLM program YAML file from a natural language description, using an AI assistant.

Usage:

# Generate a YAML program with a simple description
llmprogram generate-yaml "Create a program that analyzes the sentiment of text" --output sentiment_analyzer.yaml

# Generate a YAML program with examples
llmprogram generate-yaml "Create a program that extracts key information from customer reviews" \
  --example-input "The battery life on this phone is amazing! It lasts all day." \
  --example-output '{"product_quality": "positive", "battery": "positive", "durability": "neutral"}' \
  --output review_analyzer.yaml

# Generate a YAML program and output to stdout
llmprogram generate-yaml "Create a program that summarizes long texts"

Arguments:

  • description: A detailed description of what the LLM program should do.
  • --example-input: Example of the input the program will receive.
  • --example-output: Example of the output the program should generate.
  • --output, -o: Path to output YAML file (default: stdout).
  • --api-key: OpenAI API key (optional, defaults to OPENAI_API_KEY env var).

analytics

Show analytics data collected from LLM program executions.

Usage:

# Show all analytics data
llmprogram analytics

# Show analytics for a specific program
llmprogram analytics --program sentiment_analysis

# Show analytics for a specific model
llmprogram analytics --model gpt-4

# Use a custom analytics database path
llmprogram analytics --db-path /path/to/custom/analytics.db

Arguments:

  • --db-path: Path to the analytics database (default: llmprogram_analytics.db).
  • --program: Filter by program name.
  • --model: Filter by model name.

generate-dataset

Generate an instruction dataset for LLM fine-tuning from a SQLite log file.

Usage:

llmprogram generate-dataset <database_path> <output_path>

Arguments:

  • database_path: The path to the SQLite database file.
  • output_path: The path to write the generated dataset to.

Examples

You can find more examples in the examples directory:

  • Sentiment Analysis: A simple program to analyze the sentiment of a piece of text. (examples/sentiment_analysis.yaml)
  • Code Generator: A program that generates Python code from a natural language description. (examples/code_generator.yaml)
  • Email Generator: A program that generates professional emails based on input parameters. (examples/email_generator.yaml)

To run the examples:

  1. Navigate to the project directory.

  2. Run the corresponding example command:

    # Using the CLI with a JSON input file
    llmprogram run examples/sentiment_analysis.yaml --inputs examples/sentiment_inputs.json
    
    # Using the CLI with batch processing
    llmprogram run examples/sentiment_analysis.yaml --inputs examples/sentiment_batch_inputs.json
    
    # Using the CLI with streaming
    llmprogram run examples/sentiment_analysis.yaml --inputs examples/sentiment_inputs.json --stream
    
    # Using the CLI and saving output to a file
    llmprogram run examples/sentiment_analysis.yaml --inputs examples/sentiment_inputs.json --output result.json
    
    # View analytics data
    llmprogram analytics
    
    # View analytics for a specific program
    llmprogram analytics --program sentiment_analysis
    
    # Generate a new YAML program
    llmprogram generate-yaml "Create a program that classifies email priority" \
      --example-input "Subject: Urgent meeting tomorrow. Body: Please prepare the Q3 report." \
      --example-output '{"priority": "high", "category": "work", "response_required": true}' \
      --output email_classifier.yaml
    
    # Generate a dataset
    llmprogram generate-dataset sentiment_analysis.db dataset.jsonl
    

Development

To run the tests for this package:

cargo test

To build the documentation:

cargo doc --open

License

MIT

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
