| Crates.io | llm-edge-proxy |
| lib.rs | llm-edge-proxy |
| version | 0.1.0 |
| created_at | 2025-11-09 02:01:39.359469+00 |
| updated_at | 2025-11-09 02:01:39.359469+00 |
| description | Core HTTP proxy functionality for LLM Edge Agent |
| homepage | |
| repository | https://github.com/globalbusinessadvisors/llm-edge-agent |
| max_upload_size | |
| id | 1923509 |
| size | 99,141 |
Core HTTP proxy functionality for LLM Edge Agent - A high-performance, production-ready HTTP/HTTPS proxy server for LLM requests.
This crate provides the foundational HTTP/HTTPS server layer with:
POST /v1/chat/completions - Chat completion endpointPOST /v1/completions - Legacy completion endpointAdd this to your Cargo.toml:
[dependencies]
llm-edge-proxy = "0.1.0"
tokio = { version = "1.40", features = ["full"] }
anyhow = "1.0"
use llm_edge_proxy::{Config, build_app, serve};
#[tokio::main]
async fn main() -> anyhow::Result<()> {
// Load configuration
let config = Config::from_env()?;
// Build application with middleware
let app = build_app(config.clone()).await?;
// Start server
let addr = config.server.address.parse()?;
serve(addr, app).await?;
Ok(())
}
When using this crate independently:
use llm_edge_proxy::{Config, ProxyConfig, ServerConfig, build_app, serve};
#[tokio::main]
async fn main() -> anyhow::Result<()> {
// Manual configuration
let config = Config {
server: ServerConfig {
address: "0.0.0.0:8080".to_string(),
timeout_seconds: 30,
max_request_size: 10_485_760,
},
proxy: ProxyConfig {
auth_enabled: true,
api_keys: vec!["your-api-key".to_string()],
rate_limit_enabled: true,
rate_limit_rpm: 1000,
rate_limit_burst: 100,
},
};
let app = build_app(config.clone()).await?;
let addr = config.server.address.parse()?;
serve(addr, app).await?;
Ok(())
}
All configuration via environment variables:
# Server
SERVER_ADDRESS=0.0.0.0:8080
SERVER_TIMEOUT_SECONDS=30
MAX_REQUEST_SIZE=10485760
# Authentication
AUTH_ENABLED=true
API_KEYS=key1,key2
AUTH_HEALTH_CHECK=false
# Rate Limiting
RATE_LIMIT_ENABLED=true
RATE_LIMIT_RPM=1000
RATE_LIMIT_BURST=100
# Observability
LOG_LEVEL=info
ENABLE_TRACING=true
ENABLE_METRICS=true
# General health
GET /health
# Kubernetes readiness
GET /health/ready
# Kubernetes liveness
GET /health/live
# Prometheus metrics
GET /metrics
# Chat completions (OpenAI-compatible)
POST /v1/chat/completions
Content-Type: application/json
x-api-key: your-api-key
{
"model": "gpt-4",
"messages": [
{"role": "user", "content": "Hello!"}
]
}
┌──────────────────────────────────────┐
│ TLS Termination (Rustls) │
└────────────────┬─────────────────────┘
↓
┌──────────────────────────────────────┐
│ Authentication Middleware │
│ (API key validation) │
└────────────────┬─────────────────────┘
↓
┌──────────────────────────────────────┐
│ Rate Limiting Middleware │
│ (tower-governor) │
└────────────────┬─────────────────────┘
↓
┌──────────────────────────────────────┐
│ Request Timeout Middleware │
└────────────────┬─────────────────────┘
↓
┌──────────────────────────────────────┐
│ Route Handlers │
│ (health, metrics, proxy endpoints) │
└──────────────────────────────────────┘
config/ - Configuration managementerror.rs - Error types and response handlingmiddleware/ - Authentication, rate limiting, timeoutserver/ - HTTP server, routes, TLS, tracinglib.rs - Public APIcargo test --package llm-edge-proxy
# Debug
cargo build --package llm-edge-proxy
# Release (optimized)
cargo build --package llm-edge-proxy --release
# Set environment
export AUTH_ENABLED=false
export RATE_LIMIT_ENABLED=false
# Run (requires main binary in llm-edge-agent crate)
cargo run --package llm-edge-agent
| Metric | Target | Status |
|---|---|---|
| Latency overhead | <5ms P95 | ✅ |
| Throughput | >5,000 req/s | ✅ |
| Memory usage | <512MB | ✅ |
| CPU usage (idle) | <2% | ✅ |
Key dependencies:
Layer 1 provides integration points for Layer 2:
Contributions are welcome! Please see the main repository for guidelines.
This crate is part of the LLM Edge Agent ecosystem:
llm-edge-cache - Multi-tier caching systemllm-edge-routing - Intelligent routing enginellm-edge-providers - LLM provider adaptersllm-edge-security - Security layerllm-edge-monitoring - ObservabilityLicensed under the Apache License, Version 2.0. See LICENSE for details.
✅ Layer 1 Complete (1,027 LOC)
Next: Layer 2 (Caching, Providers, Routing)