llmproxy

Crates.io	llmproxy
lib.rs	llmproxy
version	0.2.2
created_at	2025-05-17 12:22:06.713276+00
updated_at	2025-12-20 13:02:08.602863+00
description	A simple HTTP proxy server for llm api requests
id	1677830
size	121,218

PuQing (AndPuQing)

Model Service CLI

A command-line interface (CLI) to interact with the Model Service Orchestrator/Proxy. This tool allows you to register, unregister, and list model services managed by the backend server.

Features

Register: Register a new model service (e.g., a vLLM instance) with the orchestrator, specifying its model name and address.
Unregister: Remove a previously registered model service from the orchestrator using its index number or address.
List: Display all currently registered model services in a clean table format with index numbers for easy reference.

Prerequisites

Rust Toolchain: You need Rust and Cargo installed to build the CLI. Visit rust-lang.org for installation instructions.
Running Backend Server: The Axum-based Model Service Orchestrator/Proxy must be running and accessible. By default, this CLI expects the server to be at http://127.0.0.1:11450.

Building

Clone the repository (if you have not already):

git clone <your-repo-url>
cd <repository-name>

Build the CLI:
- For a development build:
```
cargo build
```
  The executable will be in ./target/debug/llmproxy.
- For a release (optimized) build:
```
cargo build --release
```
  The executable will be in ./target/release/llmproxy.

Usage

The general command structure is:

./path/to/llmproxy <COMMAND> [OPTIONS]

You can get help for the main command or any subcommand:

./path/to/llmproxy --help
./path/to/llmproxy register --help

Commands

1. `register`

Registers a new model service with the orchestrator.

Options:

--model-name <MODEL_NAME>: The name of the model being served (e.g., "Qwen/Qwen2-7B-Instruct"). (Required)
--addr <ADDR>: The address (host:port) of the model service (e.g., "localhost:8001"). (Required)

Example:

./target/debug/llmproxy register --model-name "Qwen/Qwen2-7B-Instruct" --addr "127.0.0.1:8001"

Expected Output (Success):

✔ Registered Qwen/Qwen2-7B-Instruct at 127.0.0.1:8001
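
The `--addr` option takes a `host:port` string. As a minimal sketch of how such a value could be checked before it is sent to the server, the hypothetical helper `parse_addr` below splits on the last colon and parses the port as a `u16`. This is an assumption for illustration; the CLI's actual validation is not described in this README and may differ.

```rust
/// Splits a `host:port` string into its host and numeric port.
/// Hypothetical helper: the real CLI may validate `--addr` differently.
fn parse_addr(addr: &str) -> Result<(String, u16), String> {
    // Split on the last ':' so the port is always the final component.
    let (host, port) = addr
        .rsplit_once(':')
        .ok_or_else(|| format!("expected host:port, got `{addr}`"))?;
    if host.is_empty() {
        return Err(format!("missing host in `{addr}`"));
    }
    // Parsing as u16 rejects non-numeric ports and values above 65535.
    let port: u16 = port
        .parse()
        .map_err(|_| format!("invalid port `{port}` in `{addr}`"))?;
    Ok((host.to_string(), port))
}
```

Under these assumptions, `parse_addr("localhost:8001")` yields the host `localhost` and port `8001`, while `parse_addr("localhost")` and `parse_addr("127.0.0.1:99999")` both return an error.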

2. `unregister`

Unregisters an existing model service from the orchestrator using its index number or address.

Arguments:

<TARGET>: Service index (e.g., 1, 2, 3) or address (e.g., localhost:8001). (Required)

Examples:

# Unregister by index number (most convenient)
./target/debug/llmproxy unregister 1

# Unregister by address (backward compatible)
./target/debug/llmproxy unregister "127.0.0.1:8001"

Expected Output (Success):

✔ Unregistered service #1 (127.0.0.1:8001)

or:

✔ Unregistered service at 127.0.0.1:8001

Error Example:

✖ Index 5 not found. Only 2 services are registered.
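
The index-or-address behaviour of `<TARGET>` can be sketched as follows. The names `Target`, `parse_target`, and `resolve` are hypothetical and not taken from the crate; the sketch assumes that a purely numeric argument is an index, that indices are 1-based as in the `list` output, and that the "No service registered" message is a placeholder. Only the out-of-range message mirrors the one shown above.

```rust
/// How the user identified the service to unregister (hypothetical type).
#[derive(Debug, PartialEq)]
enum Target {
    /// 1-based position as shown in the `list` output.
    Index(usize),
    /// `host:port` address of the service.
    Address(String),
}

/// A purely numeric argument is treated as an index, anything else as an address.
fn parse_target(raw: &str) -> Target {
    match raw.parse::<usize>() {
        Ok(n) => Target::Index(n),
        Err(_) => Target::Address(raw.to_string()),
    }
}

/// Resolves a target against the registered addresses and returns the
/// matching address, or an error message.
fn resolve(target: &Target, services: &[String]) -> Result<String, String> {
    match target {
        Target::Index(n) => {
            // Indices are 1-based, so 0 and anything past the end are invalid.
            if *n >= 1 && *n <= services.len() {
                Ok(services[*n - 1].clone())
            } else {
                Err(format!(
                    "Index {} not found. Only {} services are registered.",
                    n,
                    services.len()
                ))
            }
        }
        Target::Address(addr) => services
            .iter()
            .find(|s| *s == addr)
            .cloned()
            // Placeholder wording: the real CLI's message is not documented here.
            .ok_or_else(|| format!("No service registered at {addr}")),
    }
}
```

Keeping parsing separate from resolution means the out-of-range check runs against the current service list, which is what lets the error report how many services are registered.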

3. `list`

Lists all currently registered model services in a clean table format with index numbers.

Example:

./target/debug/llmproxy list

Expected Output:

✔ 2 registered services

Label  Model                          Address
#1     meta-llama/Llama-2-7b-chat-hf  10.150.10.75:18012
#2     Qwen/Qwen2-7B-Instruct         127.0.0.1:8001

💡 You can unregister services by index or address:
  → llmproxy unregister 1
  → llmproxy unregister "localhost:8001"

When no services are registered:

ℹ No model services are currently registered
  → Use llmproxy register --model-name <MODEL> --addr <ADDRESS> to register a new service
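
A table like the one above can be produced with the width specifiers of Rust's standard formatting macros. The sketch below is illustrative only: `Service` and `render_table` are hypothetical names, and the crate may use a table library or a different layout. It pads each column to its longest cell and separates columns with two spaces.

```rust
/// One row of the listing: (model name, address). Hypothetical type alias.
type Service = (String, String);

/// Renders the services as a left-aligned, space-padded table.
fn render_table(services: &[Service]) -> String {
    // Labels are 1-based index numbers prefixed with '#'.
    let labels: Vec<String> = (1..=services.len()).map(|i| format!("#{i}")).collect();

    // Each column is as wide as its longest cell, header included.
    let label_w = labels.iter().map(|l| l.len()).max().unwrap_or(0).max("Label".len());
    let model_w = services.iter().map(|(m, _)| m.len()).max().unwrap_or(0).max("Model".len());

    // The last column is not padded, so lines carry no trailing spaces.
    let mut out = format!("{:<label_w$}  {:<model_w$}  {}\n", "Label", "Model", "Address");
    for (label, (model, addr)) in labels.iter().zip(services) {
        out.push_str(&format!("{label:<label_w$}  {model:<model_w$}  {addr}\n"));
    }
    out
}
```

Note that `len()` counts bytes, so this alignment is only correct for ASCII model names and addresses.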

Backend Server

This CLI tool is a client for the Axum-based backend server. Ensure the server is running and reachable (http://127.0.0.1:11450 by default). The server is responsible for:

Maintaining the list of active model services.
Proxying incoming requests to the appropriate registered model service based on the model field in the request body.
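
The two responsibilities above can be sketched with a small in-memory registry. This is a simplified illustration, not the server's actual code: the `Registry` type and its methods are hypothetical, the sketch assumes the `model` field has already been extracted from the request body, and it simply picks the first service registered for that model.

```rust
/// In-memory list of registered services (hypothetical structure).
#[derive(Default)]
struct Registry {
    /// (model name, address) pairs in registration order.
    services: Vec<(String, String)>,
}

impl Registry {
    /// Adds a service for `model` reachable at `addr`.
    fn register(&mut self, model: &str, addr: &str) {
        self.services.push((model.to_string(), addr.to_string()));
    }

    /// Removes the service at `addr`; returns whether anything was removed.
    fn unregister(&mut self, addr: &str) -> bool {
        let before = self.services.len();
        self.services.retain(|(_, a)| a != addr);
        self.services.len() != before
    }

    /// Picks the upstream address for a request whose body names `model`.
    /// Returns `None` when no service is registered for that model.
    fn route(&self, model: &str) -> Option<&str> {
        self.services
            .iter()
            .find(|(m, _)| m == model)
            .map(|(_, a)| a.as_str())
    }
}
```

A real proxy would also need shared, synchronized access to this state across request handlers, which the sketch leaves out.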

To start the server with Cargo:

cargo run --release --bin llmproxyd

Troubleshooting

Connection Refused: Ensure the backend server is running and accessible at http://127.0.0.1:11450 (or the configured address if you modify the BASE_URL in the CLI source).
```
✖ Cannot connect to llmproxyd server
  → Make sure the server is running on http://127.0.0.1:11450
  → Start it with: llmproxyd
```
Invalid Index: When using numeric indices, ensure they are within the valid range:
```
✖ Index 5 not found. Only 2 services are registered.
```
Server Errors: If the server rejects a request, the CLI prints the error with context and a suggested resolution.
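
When diagnosing a refused connection, it can help to check whether anything is listening on the server address at all. The following is a minimal sketch using only the Rust standard library; `server_reachable` is a hypothetical helper and not part of the CLI. It confirms only that a TCP connection can be opened, not that the process behind it is llmproxyd.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` succeeds within `timeout`.
fn server_reachable(addr: &SocketAddr, timeout: Duration) -> bool {
    TcpStream::connect_timeout(addr, timeout).is_ok()
}
```

With the default address from this README, the check would be `server_reachable(&"127.0.0.1:11450".parse().unwrap(), Duration::from_secs(2))`.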
