| Crates.io | llm-optimizer |
| lib.rs | llm-optimizer |
| version | 0.1.1 |
| created_at | 2025-11-11 02:55:32.308457+00 |
| updated_at | 2025-11-11 02:55:32.308457+00 |
| description | Production-ready main service binary for LLM Auto Optimizer |
| homepage | https://github.com/globalbusinessadvisors/llm-auto-optimizer |
| repository | https://github.com/globalbusinessadvisors/llm-auto-optimizer |
| id | 1926643 |
| size | 302,772 |
Production-ready main service binary for the LLM Auto Optimizer system.

The llm-optimizer binary is a single executable that orchestrates all components of the system, including:

- REST and gRPC APIs (ports 8080 and 50051)
- Collector, Processor, Storage, and Integrations services
- A service manager handling lifecycle, health monitoring, and auto recovery
- A Prometheus-compatible /metrics endpoint
┌─────────────────────────────────────────────────────────────────┐
│                       LLM Auto Optimizer                        │
│                                                                 │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐         │
│  │   REST API   │   │   gRPC API   │   │ Integrations │         │
│  │  Port 8080   │   │  Port 50051  │   │   Service    │         │
│  └──────┬───────┘   └──────┬───────┘   └──────┬───────┘         │
│         │                  │                  │                 │
│         └──────────────────┼──────────────────┘                 │
│                            │                                    │
│                   ┌────────▼────────┐                           │
│                   │    Processor    │                           │
│                   │     Service     │                           │
│                   └────────┬────────┘                           │
│                            │                                    │
│         ┌──────────────────┼──────────────────┐                 │
│         │                  │                  │                 │
│  ┌──────▼───────┐   ┌──────▼───────┐   ┌──────▼───────┐         │
│  │  Collector   │   │   Storage    │   │ Integrations │         │
│  │   Service    │   │   Service    │   │   Service    │         │
│  └──────────────┘   └──────────────┘   └──────────────┘         │
│                                                                 │
│  ┌──────────────────────────────────────────────────────┐       │
│  │            Service Manager & Orchestrator            │       │
│  │  - Dependency Resolution   - Health Monitoring       │       │
│  │  - Lifecycle Management    - Auto Recovery           │       │
│  │  - Signal Handling         - Metrics Aggregation     │       │
│  └──────────────────────────────────────────────────────┘       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
Services start in dependency order:

Storage Service (no dependencies)
↓
Collector Service (no dependencies)
↓
Integrations Service (no dependencies)
↓
Processor Service (depends on: collector, storage)
↓
REST API (depends on: processor, storage)
↓
gRPC API (depends on: processor, storage)
The Service Manager orchestrates all services with:

- Dependency resolution and lifecycle management
- Health monitoring that tracks the health of every service
- Prometheus-compatible metrics aggregation
- Unix signal handling (SIGTERM, SIGINT, SIGHUP) and automatic recovery
# Clone the repository
git clone https://github.com/llm-devops/llm-auto-optimizer
cd llm-auto-optimizer
# Build the binary
cargo build --release -p llm-optimizer
# The binary will be at: target/release/llm-optimizer
# Or install the binary into ~/.cargo/bin
cargo install --path crates/llm-optimizer
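To confirm the installation, the version flag makes a quick smoke test:

# Should print the crate version (0.1.1 at the time of writing)
llm-optimizer --version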
Create a configuration file (TOML or YAML):
# config.toml
[service]
name = "llm-optimizer"
environment = "production"
host = "0.0.0.0"
[collector]
enabled = true
kafka_brokers = ["localhost:9092"]
kafka_topic = "llm-feedback"
[processor]
enabled = true
worker_threads = 4
[rest_api]
enabled = true
port = 8080
[grpc_api]
enabled = true
port = 50051
[storage]
postgres_url = "postgres://localhost:5432/llm_optimizer"
redis_url = "redis://localhost:6379"
sled_path = "./data/sled"
[observability]
log_level = "info"
json_logging = true
metrics_port = 9090
See config.toml.example for all available options.
Override configuration using environment variables with the LLM_OPTIMIZER prefix:
export LLM_OPTIMIZER__SERVICE__NAME="my-optimizer"
export LLM_OPTIMIZER__REST_API__PORT="8888"
export LLM_OPTIMIZER__OBSERVABILITY__LOG_LEVEL="debug"
Note: Use double underscores (__) to separate nested keys.
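As a sketch of the resulting precedence (assuming, as is typical for layered configuration, that environment variables override values from the file), an override can also be passed inline for a single run:

# Run once with the REST port overridden to 8888, leaving config.toml untouched
LLM_OPTIMIZER__REST_API__PORT="8888" llm-optimizer --config config.toml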
# Start with default configuration
llm-optimizer
# Start with custom configuration file
llm-optimizer --config config.toml
# Override log level
llm-optimizer --config config.toml --log-level debug
# Enable JSON logging
llm-optimizer --config config.toml --json-logs
# Validate configuration without starting
llm-optimizer --config config.toml --validate-config
# Print default configuration
llm-optimizer --print-default-config > default-config.toml
Options:
  -c, --config <FILE>         Path to configuration file
  -l, --log-level <LEVEL>     Override log level (trace, debug, info, warn, error)
      --json-logs             Enable JSON logging
      --validate-config       Validate configuration and exit
      --print-default-config  Print default configuration and exit
  -h, --help                  Print help
  -V, --version               Print version
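Put together, a common first run generates a configuration file, edits it, and validates it before starting anything:

# Generate a starting configuration, then edit it
llm-optimizer --print-default-config > config.toml

# Check the result without starting any services
llm-optimizer --config config.toml --validate-config && echo "config OK"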
Once running, the service exposes:

- REST API: http://localhost:8080
  - GET /health
  - GET /metrics (internal)
  - GET /docs (OpenAPI/Swagger)
- gRPC API: localhost:50051
- Prometheus metrics: http://localhost:9090/metrics

Startup sequence:

1. Parse command line arguments
2. Load and validate configuration
3. Initialize observability (logging, tracing)
4. Create shared state (config, metrics, health monitor)
5. Initialize signal handler
6. Create service manager
7. Register all services (in dependency order):
a. Storage Service
b. Collector Service
c. Integrations Service
d. Processor Service
e. REST API Service
f. gRPC API Service
8. Start all services (in dependency order)
9. Start resource monitoring
10. Start metrics HTTP server
11. Start health monitoring
12. Enter main event loop (wait for signals)
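Because services come up one at a time, scripts should poll the health endpoint rather than sleep for a fixed delay. A minimal sketch, assuming the /health response format shown further below:

# Wait up to ~30s for the service to report healthy
for i in $(seq 1 30); do
  if curl -sf http://localhost:8080/health | grep -q '"status":[[:space:]]*"healthy"'; then
    echo "service is up"
    break
  fi
  sleep 1
done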
Shutdown sequence:

1. Receive shutdown signal (SIGTERM, SIGINT, or Ctrl+C)
2. Log shutdown initiation
3. Stop all services (in reverse dependency order):
a. gRPC API Service
b. REST API Service
c. Processor Service
d. Integrations Service
e. Collector Service
f. Storage Service
4. Wait for graceful shutdown (with timeout)
5. Generate final health report
6. Exit cleanly
# Send SIGTERM
kill -TERM <pid>
# Or use Ctrl+C
The service will stop all services in reverse dependency order, wait for graceful shutdown (with a timeout), and exit cleanly.
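A quick way to exercise this path end to end (a sketch; the sleep just gives services time to start):

# Start in the background, then request a graceful shutdown
llm-optimizer --config config.toml &
PID=$!
sleep 5
kill -TERM "$PID"
# wait returns the process's exit status; 0 means it shut down cleanly
wait "$PID" && echo "exited cleanly"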
# Send SIGHUP
kill -HUP <pid>
The service will reload its configuration without restarting.
curl http://localhost:8080/health
Response:
{
  "status": "healthy",
  "uptime_secs": 3600,
  "services": {
    "storage": {
      "state": "Running",
      "healthy": true,
      "consecutive_failures": 0,
      "message": null,
      "metadata": {}
    },
    "processor": {
      "state": "Running",
      "healthy": true,
      "consecutive_failures": 0,
      "message": null,
      "metadata": {
        "events_processed": "1000",
        "windows_triggered": "50"
      }
    }
  }
}
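For scripting, the response is easy to slice with jq (assuming jq is installed; field names as in the example above):

# Overall status
curl -s http://localhost:8080/health | jq -r '.status'

# Per-service health, one "name  true/false" pair per line
curl -s http://localhost:8080/health | jq -r '.services | to_entries[] | "\(.key)\t\(.value.healthy)"'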
curl http://localhost:9090/metrics
Available metrics:
- service_status{service="..."} - Service status (1=running, 0=stopped)
- service_health{service="..."} - Service health (1=healthy, 0=unhealthy)
- service_uptime_seconds{service="..."} - Service uptime
- requests_total{operation="...",status="..."} - Total requests
- request_duration_seconds{operation="...",status="..."} - Request duration histogram
- active_connections{service="..."} - Active connections
- memory_usage_bytes - Memory usage
- cpu_usage_percent - CPU usage

The service manager automatically attempts to recover failed services:
# In ServiceManagerConfig
health_check_interval = "30s"
max_restart_attempts = 3
restart_backoff_base = "1s"
restart_backoff_max = "60s"
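Whether a recovery succeeded can be confirmed from the status metrics listed above, for example:

# 1 = running/healthy, 0 = stopped/unhealthy, labelled per service
curl -s http://localhost:9090/metrics | grep -E '^service_(status|health)'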
Create /etc/systemd/system/llm-optimizer.service:
[Unit]
Description=LLM Auto Optimizer
After=network.target
[Service]
Type=simple
User=llm-optimizer
Group=llm-optimizer
WorkingDirectory=/opt/llm-optimizer
ExecStart=/usr/local/bin/llm-optimizer --config /etc/llm-optimizer/config.toml
Restart=always
RestartSec=10s
StandardOutput=journal
StandardError=journal
# Resource limits
LimitNOFILE=65536
LimitNPROC=32768
[Install]
WantedBy=multi-user.target
Start the service:
sudo systemctl daemon-reload
sudo systemctl enable llm-optimizer
sudo systemctl start llm-optimizer
sudo systemctl status llm-optimizer
FROM rust:1.75 as builder
WORKDIR /app
COPY . .
RUN cargo build --release -p llm-optimizer
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y ca-certificates && rm -rf /var/lib/apt/lists/*
COPY --from=builder /app/target/release/llm-optimizer /usr/local/bin/
COPY --from=builder /app/crates/llm-optimizer/config.toml.example /etc/llm-optimizer/config.toml
EXPOSE 8080 50051 9090
CMD ["llm-optimizer", "--config", "/etc/llm-optimizer/config.toml"]
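A sketch of building and running the image locally (the llm-optimizer:latest tag matches the Kubernetes manifest below; mount a real config over the baked-in example as needed):

# Build the image from the Dockerfile above
docker build -t llm-optimizer:latest .

# Run with the REST, gRPC, and metrics ports published
docker run -d --name llm-optimizer \
  -p 8080:8080 -p 50051:50051 -p 9090:9090 \
  llm-optimizer:latest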
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-optimizer
spec:
  replicas: 3
  selector:
    matchLabels:
      app: llm-optimizer
  template:
    metadata:
      labels:
        app: llm-optimizer
    spec:
      containers:
        - name: llm-optimizer
          image: llm-optimizer:latest
          ports:
            - containerPort: 8080
              name: rest
            - containerPort: 50051
              name: grpc
            - containerPort: 9090
              name: metrics
          env:
            - name: LLM_OPTIMIZER__OBSERVABILITY__LOG_LEVEL
              value: "info"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "2000m"
# prometheus.yml
scrape_configs:
  - job_name: 'llm-optimizer'
    static_configs:
      - targets: ['localhost:9090']
    metrics_path: '/metrics'
    scrape_interval: 15s
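Before reloading Prometheus, the file can be sanity-checked with promtool (shipped with Prometheus), and the target verified by hand:

# Validate the scrape configuration
promtool check config prometheus.yml

# Confirm the target actually serves metrics
curl -s http://localhost:9090/metrics | head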
Import the included Grafana dashboard for monitoring.
Check logs:
# Systemd
sudo journalctl -u llm-optimizer -f
# Docker
docker logs -f llm-optimizer
Common issues include an invalid configuration file or unreachable dependencies (Kafka, PostgreSQL, Redis).
Monitor memory metrics:
curl http://localhost:9090/metrics | grep memory_usage_bytes
If usage stays high, adjust the configuration, e.g. reduce worker_threads in the [processor] section:
Check health status:
curl http://localhost:8080/health
Common causes show up in the per-service state, message, and consecutive_failures fields of the health response.
# Build
cargo build -p llm-optimizer

# Run tests
cargo test -p llm-optimizer
# With default configuration
cargo run -p llm-optimizer
# With custom configuration
cargo run -p llm-optimizer -- --config dev-config.toml --log-level debug
See the main repository CONTRIBUTING.md for contribution guidelines.
Apache License 2.0 - See LICENSE for details.