| Crates.io | iron_runtime |
| lib.rs | iron_runtime |
| version | 0.4.0 |
| created_at | 2025-11-25 17:03:33.141489+00 |
| updated_at | 2025-12-18 09:32:17.218022+00 |
| description | Agent runtime with LLM request routing and translation |
| homepage | |
| repository | https://github.com/.../iron_runtime |
| max_upload_size | |
| id | 1950087 |
| size | 173,423 |
Agent orchestration and Python bridge for AI agent execution. Provides LlmRouter - a local proxy server for transparent LLM API key management with OpenAI and Anthropic support.
> [!IMPORTANT]
> Audience: Platform contributors developing the iron_runtime Rust crate.
> End users: see the iron_sdk documentation instead; just pip install iron-sdk
The Python library is built using maturin and PyO3.
Prerequisites:
- uv (install via curl -LsSf https://astral.sh/uv/install.sh | sh)
Development Install:
cd module/iron_runtime
# Install dependencies and setup environment
uv sync # Automatically creates .venv and installs all dev dependencies
# Build and install in development mode
uv run maturin develop
# Verify installation
uv run python -c "from iron_cage import LlmRouter; print('OK')"
Build Wheel:
# Build wheel for distribution
uv run maturin build --release
# Wheel will be in target/wheels/
ls target/wheels/
# iron_cage-0.1.0-cp38-abi3-*.whl
To use the crate directly from Rust (see the Rust example below), add it to your Cargo.toml:
[dependencies]
iron_runtime = { path = "../iron_runtime" }
Keys are fetched from the Iron Cage server:
import os
from iron_cage import LlmRouter
from openai import OpenAI
router = LlmRouter(
api_key=os.environ["IC_TOKEN"],
server_url=os.environ["IC_SERVER"],
)
client = OpenAI(base_url=router.base_url, api_key=router.api_key)
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
router.stop()
Use your own API key with optional budget tracking and analytics sync:
import os
from iron_cage import LlmRouter
from openai import OpenAI
router = LlmRouter(
provider_key=os.environ["OPENAI_API_KEY"], # Your actual API key
api_key=os.environ["IC_TOKEN"], # For analytics auth
server_url=os.environ["IC_SERVER"], # Analytics server
budget=10.0, # $10 budget limit
)
client = OpenAI(base_url=router.base_url, api_key=router.api_key)
# Make requests...
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": "Hello!"}],
)
# Check spending
if router.budget_status:
spent, limit = router.budget_status
print(f"Spent: ${spent:.4f} / ${limit:.2f}")
router.stop() # Flushes analytics to server
With Anthropic:
import os
from iron_cage import LlmRouter
from anthropic import Anthropic
ic_token = os.environ["IC_TOKEN"]
ic_server = os.environ["IC_SERVER"]
router = LlmRouter(api_key=ic_token, server_url=ic_server)
# Anthropic API doesn't use /v1 suffix
client = Anthropic(
base_url=router.base_url.replace("/v1", ""),
api_key=router.api_key,
)
response = client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=100,
messages=[{"role": "user", "content": "Hello!"}],
)
print(response.content[0].text)
router.stop()
Gateway Mode (OpenAI client for Claude):
Use the same OpenAI client for both OpenAI and Claude models - just change the model name:
import os
from iron_cage import LlmRouter
from openai import OpenAI
ic_token = os.environ["IC_TOKEN"]
ic_server = os.environ["IC_SERVER"]
router = LlmRouter(api_key=ic_token, server_url=ic_server)
client = OpenAI(base_url=router.base_url, api_key=router.api_key)
# Same client works for both providers!
response = client.chat.completions.create(
model="claude-sonnet-4-20250514", # Claude model with OpenAI client!
messages=[
{"role": "system", "content": "You are helpful."},
{"role": "user", "content": "Hello!"}
],
max_tokens=100
)
print(response.choices[0].message.content) # OpenAI format response
router.stop()
The router automatically:
- detects the target provider from the model name
- translates requests and responses between the OpenAI and Anthropic formats
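For example, the same proxied client can target either provider just by changing the model name. This is a sketch reusing the client from the gateway example above (run it before router.stop()):
# OpenAI model through the proxy
openai_reply = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
# Claude model through the same client; the router translates the request and response
claude_reply = client.chat.completions.create(
    model="claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Hello!"}],
)
# Both replies come back in OpenAI chat-completions format
print(openai_reply.choices[0].message.content)
print(claude_reply.choices[0].message.content)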
Context Manager:
with LlmRouter(api_key=token, server_url=url) as router:
client = OpenAI(base_url=router.base_url, api_key=router.api_key)
# ... use client
# Router automatically stops on exit

Visual Guide:
See root readme for detailed architecture explanation.
Local HTTP proxy server for LLM API requests with automatic key management.
Constructor:
LlmRouter(
api_key: str, # Iron Cage token (IC_TOKEN)
server_url: str, # Iron Cage server URL
cache_ttl_seconds: int = 300, # Key cache TTL
provider_key: str | None = None, # Direct mode: your API key
budget: float | None = None, # Direct mode: spending limit in USD
)
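For reference, a minimal sketch of constructing the router in each mode, reusing the environment variables from the examples above (the budget value is illustrative):
import os
from iron_cage import LlmRouter
# Server mode: no provider_key, LLM keys are fetched from the Iron Cage server
server_router = LlmRouter(
    api_key=os.environ["IC_TOKEN"],
    server_url=os.environ["IC_SERVER"],
)
# Direct mode: bring your own provider key and track spend locally
direct_router = LlmRouter(
    provider_key=os.environ["OPENAI_API_KEY"],
    api_key=os.environ["IC_TOKEN"],
    server_url=os.environ["IC_SERVER"],
    budget=5.0,  # spending limit in USD (illustrative)
)
server_router.stop()
direct_router.stop()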
Modes:
- provider_key not set: keys are fetched from the server
- provider_key set: use your own API key with local budget tracking
Properties:
| Property | Type | Description |
|---|---|---|
| base_url | str | Proxy URL for OpenAI client (http://127.0.0.1:{port}/v1) |
| api_key | str | IC token for client authentication |
| port | int | Port the proxy is listening on |
| provider | str | Auto-detected provider ("openai" or "anthropic") |
| is_running | bool | Whether the proxy is running |
| budget_status | tuple[float, float] \| None | (spent, limit) in USD in direct mode; None in server mode |
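A small sketch of reading these properties, assuming a server-mode router using the same environment variables as the examples above (actual values will vary):
import os
from iron_cage import LlmRouter
with LlmRouter(api_key=os.environ["IC_TOKEN"], server_url=os.environ["IC_SERVER"]) as router:
    print(router.base_url)       # e.g. http://127.0.0.1:<port>/v1
    print(router.port)           # port the local proxy is listening on
    print(router.provider)       # "openai" or "anthropic"
    print(router.is_running)     # True while the proxy is running
    print(router.budget_status)  # None here; (spent, limit) when a budget is set in direct mode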
Methods:
| Method | Description |
|---|---|
| stop() | Stop the proxy server (flushes analytics in direct mode) |
| `__enter__()` / `__exit__()` | Context manager support |
Analytics (Direct Mode):
In direct mode, the router automatically tracks per-request spend against the configured budget and flushes the collected analytics to the server when stop() is called.
Provider Auto-Detection:
- sk-ant- prefix → Anthropic
- other sk-* keys → OpenAI
Test environment setup (inside module/iron_runtime):
cd module/iron_runtime
uv sync --extra dev --extra examples    # installs pytest + OpenAI/Anthropic clients into .venv
uv run maturin develop                  # builds the iron_cage extension into the venv
uv run python - <<'PY'
from iron_cage import LlmRouter; print('iron_cage import OK', LlmRouter)
PY
From repo root:
uv venv .venv && source .venv/bin/activate    # or deactivate any other venv to avoid the "VIRTUAL_ENV ... does not match" warning
cd module/iron_runtime
uv sync --extra dev --extra examples
uv run maturin develop                        # a warning about python/.../iron_runtime is expected; optionally pip install patchelf to silence the rpath warning
uv run python - <<'PY'
from iron_cage import LlmRouter; print('import OK', LlmRouter)
PY
IC_TOKEN=ic_xxx IC_SERVER=http://localhost:3001 uv run pytest python/tests/test_llm_router_e2e.py -v
Running the end-to-end tests:
- For more verbose output, pass -s to pytest and set RUST_LOG=trace.
- Set IC_TOKEN + IC_SERVER for server-mode tests; set OPENAI_API_KEY or ANTHROPIC_API_KEY for direct-mode budget tests (these spend real money; use tiny budgets).
- IC_TOKEN=ic_xxx IC_SERVER=http://localhost:3001 uv run pytest python/tests/test_llm_router_e2e.py -v
- OPENAI_API_KEY=sk-xxx uv run pytest python/tests/test_budget_e2e.py -k openai -v
- IC_TOKEN=... IC_SERVER=... OPENAI_API_KEY=... uv run pytest python/tests -v
- Set RUST_LOG=info to see router logs during test runs.
Rust unit tests:
cd module/iron_runtime
cargo test -p iron_runtime
uv run python python/examples/test_manual.py openai # Test OpenAI API
uv run python python/examples/test_manual.py anthropic # Test Anthropic API
uv run python python/examples/test_manual.py gateway # Test OpenAI client → Claude
use iron_runtime::LlmRouter;
// Credentials come from the same environment variables as the Python examples
let api_key = std::env::var("IC_TOKEN")?;
let server_url = std::env::var("IC_SERVER")?;
// Create router (starts the local proxy)
let mut router = LlmRouter::create(
    api_key,
    server_url,
    300, // key cache TTL in seconds
)?;
let base_url = router.get_base_url();
println!("Proxy running at: {}", base_url);
// Use with HTTP client...
router.shutdown();
Responsibilities: Bridges Python AI agents with Rust-based safety, cost, and reliability infrastructure via PyO3. Provides LlmRouter for transparent API key management and request proxying. Manages agent lifecycle (spawn, monitor, shutdown), intercepts LLM calls for policy enforcement, coordinates tokio async runtime, and provides WebSocket server for real-time dashboard updates.
In Scope:
Out of Scope:
| File | Responsibility |
|---|---|
| lib.rs | Core runtime for AI agent execution |
| llm_router/ | LLM Router - Local proxy for LLM API requests |
Notes:
Apache-2.0