| Crates.io | reflex-server |
| lib.rs | reflex-server |
| version | 0.2.2 |
| created_at | 2025-12-16 11:45:33.772191+00 |
| updated_at | 2025-12-29 01:47:01.699175+00 |
| description | OpenAI-compatible HTTP gateway for Reflex cache |
| homepage | |
| repository | https://github.com/rawcontext/reflex |
| max_upload_size | |
| id | 1987591 |
| size | 282,447 |
reflex-server owns the HTTP gateway (Axum) and builds the reflex binary.
It depends on the core library crate (reflex-cache) for the cache tiers, storage, embeddings, and vector DB integration.
# Requires Qdrant running (gRPC on 6334)
docker run -d -p 6334:6334 -p 6333:6333 qdrant/qdrant
RUST_LOG=debug cargo run -p reflex-server
cargo run -p reflex-server --release
From repo root:
docker compose up -d
GET /healthzGET /readyPOST /v1/chat/completions (OpenAI-compatible)Responses include X-Reflex-Status:
hit-l1-exact: exact request matchhit-l3-verified: semantic hit verified by L3miss: forwarded to provider and storedThe response choices[].message.content is Tauq-encoded.
Most commonly used env vars:
| Variable | Default | Notes |
|---|---|---|
REFLEX_PORT |
8080 |
HTTP port |
REFLEX_BIND_ADDR |
127.0.0.1 |
Bind address |
REFLEX_QDRANT_URL |
http://localhost:6334 |
Qdrant gRPC |
REFLEX_STORAGE_PATH |
./.data |
Storage base path |
REFLEX_L1_CAPACITY |
10000 |
L1 capacity |
REFLEX_MODEL_PATH |
(unset) | Unset = stub embedder |
REFLEX_RERANKER_PATH |
(unset) | Optional reranker |
REFLEX_RERANKER_THRESHOLD |
0.70 |
L3 threshold |
REFLEX_MOCK_PROVIDER |
(unset) | Set to bypass real provider calls |
export OPENAI_BASE_URL=http://localhost:8080/v1
export OPENAI_API_KEY=sk-your-key
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-your-key")
resp = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Explain quicksort"}],
)
Build/run with one of:
--features metal (Apple Silicon)--features cuda (NVIDIA)--features cpu (docs.rs-style CPU flag; usually you can omit)Example:
cargo run -p reflex-server --release --no-default-features --features metal
cargo test -p reflex-server
Real integration tests (needs Qdrant):
REFLEX_QDRANT_URL=http://localhost:6334 cargo test -p reflex-server --test integration_real