llm-cost-ops

Crates.io: llm-cost-ops
lib.rs: llm-cost-ops
Version: 0.1.1
Created: 2025-11-16 07:38:58 UTC
Updated: 2025-11-16 08:07:23 UTC
Description: Core library for cost operations on LLM deployments
Homepage: https://github.com/globalbusinessadvisors/llm-cost-ops
Repository: https://github.com/globalbusinessadvisors/llm-cost-ops
Size: 595,277 bytes
Owner: GBA (globalbusinessadvisors)
Documentation: https://docs.rs/llm-cost-ops
README

LLM Cost Ops

Enterprise-grade cost operations platform for LLM deployments

A comprehensive, production-ready Rust library for tracking, analyzing, and optimizing costs across multiple Large Language Model (LLM) providers including OpenAI, Anthropic, Google Vertex AI, Azure OpenAI, AWS Bedrock, Cohere, and Mistral.

Features

Core Cost Management

  • Multi-Provider Support - OpenAI, Anthropic, Google Vertex AI, Azure, AWS Bedrock, Cohere, Mistral
  • Flexible Pricing Models - Per-token, per-request, tiered pricing with cache discounts
  • Real-time Cost Calculation - Sub-millisecond cost computations with decimal precision (worked arithmetic after this list)
  • Multi-Currency Support - USD, EUR, GBP with automatic conversion
  • Usage Analytics - Token consumption, cost trends, provider comparisons
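
To make the decimal-precision claim concrete, here is the Quick Start pricing from below ($30 per million input tokens, $60 per million output tokens) worked out by hand with the rust_decimal types the examples use. This is standalone arithmetic, not a call into the library's API:

use rust_decimal_macros::dec;

fn main() {
    // Quick Start pricing: $30 / 1M input tokens, $60 / 1M output tokens.
    let input_price_per_million = dec!(30.0);
    let output_price_per_million = dec!(60.0);
    let million = dec!(1000000);

    // Token counts from the Basic Usage example below.
    let prompt_tokens = dec!(1500);
    let completion_tokens = dec!(500);

    // Exact decimal arithmetic avoids the rounding drift that binary
    // floats would accumulate across millions of records.
    let input_cost = prompt_tokens * input_price_per_million / million;
    let output_cost = completion_tokens * output_price_per_million / million;
    let total = input_cost + output_cost;

    assert_eq!(total, dec!(0.075)); // $0.045 + $0.030
    println!("total: ${total}");
}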

Data Ingestion

  • Webhook Server - High-performance ingestion with validation
  • Stream Processing - NATS and Redis Streams support
  • Rate Limiting - Per-organization limits with burst allowance (sketched after this list)
  • Batch Processing - Bulk ingestion with partial success handling
  • Dead Letter Queue - Failed ingestion handling with retry policies
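
The rate-limiting bullet above pairs a steady per-organization rate with a burst ceiling, which is the classic token-bucket shape. A minimal sketch of that pattern follows; every name in it is made up for illustration and nothing here is the llm-cost-ops API:

use std::collections::HashMap;
use std::time::Instant;

struct Bucket {
    tokens: f64,   // credits currently available
    last: Instant, // last refill time
}

struct RateLimiter {
    rate: f64,  // credits refilled per second
    burst: f64, // maximum bucket size (the burst allowance)
    buckets: HashMap<String, Bucket>,
}

impl RateLimiter {
    fn new(rate: f64, burst: f64) -> Self {
        Self { rate, burst, buckets: HashMap::new() }
    }

    // Returns true if `org_id` may send one more request right now.
    fn allow(&mut self, org_id: &str) -> bool {
        let now = Instant::now();
        let bucket = self.buckets.entry(org_id.to_string()).or_insert(Bucket {
            tokens: self.burst,
            last: now,
        });
        // Refill in proportion to elapsed time, capped at the burst size.
        let elapsed = now.duration_since(bucket.last).as_secs_f64();
        bucket.tokens = (bucket.tokens + elapsed * self.rate).min(self.burst);
        bucket.last = now;
        if bucket.tokens >= 1.0 {
            bucket.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    // 10 requests/second steady state, bursts of up to 20.
    let mut limiter = RateLimiter::new(10.0, 20.0);
    assert!(limiter.allow("org-123"));
}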

Compliance & Governance

  • GDPR Compliance - Data Subject Access Requests, Right to Erasure, Consent Management
  • Breach Notifications - Automated 72-hour breach notification workflow
  • Audit Logging - Comprehensive audit trail with tamper detection (hash-chain sketch after this list)
  • Policy Management - Retention policies, access policies, data classification
  • SOC2 Compliance - Automated compliance checks and evidence collection
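
Tamper detection for audit trails is commonly implemented as a hash chain: each entry commits to the hash of the entry before it, so editing or deleting any record invalidates every later hash. A minimal sketch of that idea using the sha2 crate; it illustrates the general technique, not this crate's actual audit internals:

use sha2::{Digest, Sha256};

// Hypothetical hash-chained audit log (names are illustrative only).
struct AuditLog {
    entries: Vec<(String, [u8; 32])>, // (message, chained hash)
}

impl AuditLog {
    fn new() -> Self {
        Self { entries: Vec::new() }
    }

    fn append(&mut self, message: &str) {
        // Hash of the previous entry, or all zeros for the first one.
        let prev = self.entries.last().map(|(_, h)| *h).unwrap_or([0u8; 32]);
        let mut hasher = Sha256::new();
        hasher.update(prev);
        hasher.update(message.as_bytes());
        self.entries.push((message.to_string(), hasher.finalize().into()));
    }

    // Recompute the chain; any mutation of an earlier entry is detected.
    fn verify(&self) -> bool {
        let mut prev = [0u8; 32];
        self.entries.iter().all(|(msg, stored)| {
            let mut hasher = Sha256::new();
            hasher.update(prev);
            hasher.update(msg.as_bytes());
            let h: [u8; 32] = hasher.finalize().into();
            prev = h;
            &h == stored
        })
    }
}

fn main() {
    let mut log = AuditLog::new();
    log.append("user-789 exported report.csv");
    assert!(log.verify());
}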

Authentication & Security

  • JWT Authentication - Secure token-based authentication (token round trip sketched after this list)
  • Role-Based Access Control (RBAC) - Granular permission system
  • API Key Management - Secure key rotation and revocation
  • Audit Middleware - Automatic request/response logging
  • Rate Limiting - Protect APIs from abuse
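
As a rough illustration of the JWT round trip the server performs, here is issue-and-verify using the widely used jsonwebtoken crate. Both the dependency choice and the claim fields are assumptions made for the example, not details confirmed by this README:

use jsonwebtoken::{decode, encode, DecodingKey, EncodingKey, Header, Validation};
use serde::{Deserialize, Serialize};

// Minimal claims struct; field names are illustrative, not the
// library's actual claim schema.
#[derive(Serialize, Deserialize)]
struct Claims {
    sub: String, // subject, e.g. a user id
    org: String, // organization the token is scoped to
    exp: usize,  // expiry as a Unix timestamp (checked by default)
}

fn main() -> Result<(), jsonwebtoken::errors::Error> {
    let secret = b"your-secret-key";
    let claims = Claims {
        sub: "user-789".into(),
        org: "org-123".into(),
        exp: 4_102_444_800, // 2100-01-01, far enough out for a demo
    };

    // Issue a token...
    let token = encode(&Header::default(), &claims, &EncodingKey::from_secret(secret))?;

    // ...and verify it on the way back in.
    let data = decode::<Claims>(&token, &DecodingKey::from_secret(secret), &Validation::default())?;
    assert_eq!(data.claims.org, "org-123");
    Ok(())
}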

Observability

  • Prometheus Metrics - Request counts, latencies, error rates (sketched after this list)
  • Distributed Tracing - OpenTelemetry integration
  • Structured Logging - JSON logs with correlation IDs
  • Health Checks - Database, cache, external service monitoring
  • Performance Monitoring - Query optimization, slow query detection
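
The metrics bullet above maps onto the usual Prometheus counter-and-histogram pattern. Here is a sketch with the common prometheus crate; the metric names are invented for the example and are not necessarily what llm-cost-ops exports on its metrics port:

use prometheus::{Encoder, Histogram, HistogramOpts, IntCounter, Opts, Registry, TextEncoder};

fn main() -> Result<(), prometheus::Error> {
    let registry = Registry::new();

    // One counter for request volume, one histogram for latency.
    let requests = IntCounter::with_opts(Opts::new(
        "ingest_requests_total",
        "Total ingestion requests received",
    ))?;
    let latency = Histogram::with_opts(HistogramOpts::new(
        "ingest_latency_seconds",
        "Ingestion request latency in seconds",
    ))?;
    registry.register(Box::new(requests.clone()))?;
    registry.register(Box::new(latency.clone()))?;

    // Record one request.
    requests.inc();
    latency.observe(0.012);

    // Render the exposition format a /metrics endpoint would serve.
    let mut buf = Vec::new();
    TextEncoder::new().encode(&registry.gather(), &mut buf)?;
    println!("{}", String::from_utf8_lossy(&buf));
    Ok(())
}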

Export & Reporting

  • Multiple Formats - CSV, Excel, JSON, PDF
  • Scheduled Reports - Automated daily/weekly/monthly reports
  • Email Delivery - Automated report distribution
  • Custom Templates - Handlebars-based report templates (sketched after this list)
  • Data Aggregation - Flexible grouping by organization, project, provider, model
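
Handlebars-based templating comes down to rendering aggregated fields into a template string. A minimal sketch with the handlebars crate, where the template and field names are made up for the example:

use handlebars::Handlebars;
use serde_json::json;

fn main() -> Result<(), handlebars::RenderError> {
    let hb = Handlebars::new();
    // A one-line report template; real templates would live in files.
    let out = hb.render_template(
        "{{org}}: {{records}} records, total ${{total}}",
        &json!({ "org": "org-123", "records": 42, "total": "0.075" }),
    )?;
    assert_eq!(out, "org-123: 42 records, total $0.075");
    Ok(())
}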

Quick Start

Add to your Cargo.toml:

[dependencies]
llm-cost-ops = "0.1"
tokio = { version = "1", features = ["full"] }

Basic Usage

use llm_cost_ops::{
    CostCalculator, PricingTable, PricingStructure,
    UsageRecord, Provider, ModelIdentifier, Currency
};
use rust_decimal_macros::dec;
use chrono::Utc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a pricing table for GPT-4
    let pricing = PricingTable::new(
        Provider::OpenAI,
        "gpt-4".to_string(),
        PricingStructure::simple_per_token(
            dec!(30.0),  // $30 per million input tokens
            dec!(60.0),  // $60 per million output tokens
        ),
    );

    // Create a usage record
    let usage = UsageRecord {
        id: uuid::Uuid::new_v4(),
        timestamp: Utc::now(),
        provider: Provider::OpenAI,
        model: ModelIdentifier::new("gpt-4".to_string(), 8192),
        organization_id: "org-123".to_string(),
        project_id: Some("project-456".to_string()),
        user_id: Some("user-789".to_string()),
        prompt_tokens: 1500,
        completion_tokens: 500,
        total_tokens: 2000,
        cached_tokens: None,
        reasoning_tokens: None,
        latency_ms: Some(1200),
        time_to_first_token_ms: Some(150),
        tags: vec!["production".to_string()],
        metadata: serde_json::json!({"endpoint": "/v1/chat"}),
        ingested_at: Utc::now(),
        source: llm_cost_ops::IngestionSource::Api {
            endpoint: "/v1/usage".to_string(),
        },
    };

    // Calculate cost
    let calculator = CostCalculator::new();
    let cost_record = calculator.calculate(&usage, &pricing)?;

    println!("Input cost: ${}", cost_record.input_cost);
    println!("Output cost: ${}", cost_record.output_cost);
    println!("Total cost: ${}", cost_record.total_cost);

    Ok(())
}

Using the Database Repository

use llm_cost_ops::{
    DatabaseConfig, UsageRepository, CostRepository,
    UsageRecord, CostRecord
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Configure database
    let config = DatabaseConfig::sqlite("cost-ops.db".into());
    let db = config.connect().await?;

    // Run migrations
    llm_cost_ops::run_migrations(&db).await?;

    // Create repositories
    let usage_repo = UsageRepository::new(db.clone());
    let cost_repo = CostRepository::new(db);

    // Insert the usage and cost records built in the Basic Usage example
    // above (`usage` and `cost_record` are assumed to be in scope here).
    usage_repo.insert(&usage).await?;
    cost_repo.insert(&cost_record).await?;

    // Query usage by organization
    let org_usage = usage_repo
        .find_by_organization("org-123", None, None)
        .await?;

    println!("Found {} usage records", org_usage.len());

    Ok(())
}

Starting the Web Server

use llm_cost_ops::{start_server, ServerConfig, DatabaseConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Configure database
    let db_config = DatabaseConfig::sqlite("cost-ops.db".into());
    let db = db_config.connect().await?;

    // Configure server
    let config = ServerConfig {
        host: "0.0.0.0".to_string(),
        port: 8080,
        database: db_config,
        jwt_secret: "your-secret-key".to_string(),
        enable_metrics: true,
        metrics_port: 9090,
        log_level: "info".to_string(),
    };

    // Start server
    start_server(config).await?;

    Ok(())
}

CLI Usage

Install the CLI tool:

cargo install llm-cost-ops

Initialize the database:

cost-ops init --database-url sqlite:cost-ops.db

Import usage data:

cost-ops import --file usage-data.json --format openai

Generate reports:

cost-ops report --organization org-123 --format csv --output report.csv

Query costs:

cost-ops query \
  --organization org-123 \
  --start-date 2024-01-01 \
  --end-date 2024-01-31

Docker Deployment

# Build Docker image
docker build -f deployment/Dockerfile -t llm-cost-ops:latest .

# Run with Docker Compose
docker-compose -f deployment/docker-compose.yml up -d

Supported Providers

OpenAI, Anthropic, Google Vertex AI, Azure OpenAI, AWS Bedrock, Cohere, and Mistral are all supported. Per-provider coverage of per-token pricing, tiered pricing, cache discounts, and reasoning tokens is detailed in the feature matrix in the repository README.

Performance

  • Cost Calculation: <1ms per record
  • Query Performance: <100ms p99 for complex aggregations
  • Throughput: 1000+ requests/second
  • Test Coverage: 90%+ with 554+ test cases
  • Memory Safety: No unsafe code in core logic

Architecture

┌─────────────────────────────────────────────────────────────┐
│                     LLM Cost Ops Platform                    │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐     │
│  │  Ingestion   │  │   Storage    │  │  Analytics   │     │
│  │   Engine     │→ │   Layer      │→ │   Engine     │     │
│  └──────────────┘  └──────────────┘  └──────────────┘     │
│         ↓                 ↓                  ↓              │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐     │
│  │  Validation  │  │  PostgreSQL/ │  │   Reporting  │     │
│  │  & Transform │  │   SQLite     │  │   & Export   │     │
│  └──────────────┘  └──────────────┘  └──────────────┘     │
│         ↓                 ↓                  ↓              │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐     │
│  │ Rate Limiting│  │  Compliance  │  │ Observability│     │
│  │  & DLQ       │  │   & GDPR     │  │  & Metrics   │     │
│  └──────────────┘  └──────────────┘  └──────────────┘     │
└─────────────────────────────────────────────────────────────┘

License

Licensed under the Apache License, Version 2.0. See LICENSE for details.

Contributing

Contributions are welcome! Please see the full README on GitHub for contribution guidelines.
