| Crates.io | ort-openrouter-cli |
| lib.rs | ort-openrouter-cli |
| version | 0.3.4 |
| created_at | 2026-01-15 02:17:28.597026+00 |
| updated_at | 2026-01-16 01:45:45.369666+00 |
| description | Open Router CLI |
| homepage | |
| repository | https://github.com/grahamking/ort |
| max_upload_size | |
| id | 2044363 |
| size | 282,855 |
rustup toolchain install nightly-2026-01-07 --component rust-src
cargo +nightly-2026-01-07 install --locked -Zbuild-std=core,alloc ort-openrouter-cli
The binary is called ort.
Linux only.
ort sends your prompts to AI models on openrouter.ai.
It is built the old-fashioned way, in solid Rust. It doesn't slow you down with Python interpreters. This is a compact 160 KiB ELF binary. It does not use the Rust std library or any external Rust crates.
It's direct. Use the default model with no fuss: ort "What is the capital of France?". And if you mess up, it tells you straight: OPENROUTER_API_KEY is not set. That's an environment variable.
If you're new here in town, it'll introduce you: ort list [-json]. Everyone's a model here.
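If you're setting up for the first time, a minimal session looks something like this (the key value is a placeholder; get yours from openrouter.ai):
export OPENROUTER_API_KEY="sk-..."     # placeholder key
ort list                               # see the available models
ort "What is the capital of France?"   # query the default model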
You like to know who you're talking to so -m <model> selects your conversation partner, and it knows you don't want to impose more than necessary so -r off|none|low|medium|high|<toks> sets reasoning effort. But you have your own priorities, we all do. Use -p price|throughput|latency for that.
It is from a time when we countered bad arguments with good arguments, so it will show you the reasoning with -rr. As long as you're clear about what you want, it will respect your system prompt -s "<system prompt>". We all got to live here together, and we're the better for it.
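Put together, a fussier call might look like this; the model slug is just the one from the aliases later in this README, and the prompts are made up:
ort -m openai/gpt-5.1 -r high -rr -s "Answer as a careful reviewer" "Is this unsafe block sound?"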
Like a good friend, it remembers. -c will continue a conversation. And like a real friend, it accepts you how you are. In a tmux pane? It continues that conversation, not the one happening in the pane next door.
As an honest CLI, it cares about the small stuff. For humans there are ANSI codes. If you pipe the output somewhere else, it's clean ASCII. And you can pipe input in too. You do you.
Harking back to a time when we trusted each other, it doesn't check TLS certificates, and because we know our neighbours, it uses hard-coded DNS if you set dns in the config file - you should. I don't mind telling you, those two can get the city folks riled up.
In short, ort is an honest CLI for openrouter, like it says on the box.
Usage:
ort [-m <model>] [-s "<system prompt>"] [-p <price|throughput|latency>] [-pr provider-slug] [-r off|none|low|medium|high|<toks>] [-rr] [-q] [-c] <prompt>
Use Kimi K2, select the provider with lowest price, and set a system prompt:
ort -p price -m moonshotai/kimi-k2 -s "Respond like a pirate" "Write a limerick about AI"
price is lowest price, throughput is lowest inter-token latency, latency is lowest time to first token.
Continue a conversation: ort -c "Are you sure?". All the fields (model, priority, provider, system prompt, etc.) default to those of the previous message, but you can override them here, for example continuing the conversation with a different model or a higher reasoning effort. The provider of the previous message is set as the first choice, to benefit from caching.
Accepts piped stdin: echo 'What is the capital of South Africa?' | ort -m z-ai/glm-4.5-air:free
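Circling back to -c: carrying on the last conversation but with a different model and more reasoning looks something like this (model slug and prompt are illustrative):
ort -c -m openai/gpt-5.1 -r high "Can you double-check that?"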
ort has both a debug and a release build. The debug build and the tests are normal: cargo build and cargo test from workspace root.
To build in release mode use ./build_release.sh. This tries to make the smallest binary possible. It uses immediate abort panic, and specific RUSTFLAGS. Running cargo build --release alone will not work.
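For a feel of what the script does, the invocation is roughly along these lines; the RUSTFLAGS here are an assumption, the authoritative flags are in build_release.sh:
# Sketch only - the real flags live in build_release.sh
RUSTFLAGS="-C panic=abort -C opt-level=z -C strip=symbols" \
  cargo +nightly-2026-01-07 build --release --locked \
  -Zbuild-std=core,alloc -Zbuild-std-features=panic_immediate_abort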
Continuation (-c) is TMUX aware. It continues the last conversation from the current tmux pane. That means you can carry on multiple conversations, one per pane. If there is no previous conversation for this pane, or you are not in tmux, it uses the most recent conversation globally.
The conversations are stored in ${XDG_CACHE_HOME}/ort/last-*.json. To disable storing them set save_to_file to false in config.
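They are ordinary JSON files, so you can peek at the most recent one with standard tools (jq here is only pretty-printing; nothing about the layout is assumed):
ls -t "${XDG_CACHE_HOME:-$HOME/.cache}"/ort/last-*.json | head -n 1 | xargs jq .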
Stats printed at the end:
The model that replied, which matters if you asked for openrouter/auto. Useful if you're doing evals because now the output includes the model name.
Timings such as time to first token (TTFT). Without -rr the reasoning tokens are not displayed, which can make the TTFT look wrong.
The API key and defaults can be stored in ${XDG_CONFIG_HOME}/ort.json, which is usually ~/.config/ort.json. There are also some settings you can use to go faster, such as dns.
{
"keys": [{"name": "openrouter", "value": "sk-..."}],
"settings": {
"save_to_file": true,
"dns": ["104.18.2.115", "104.18.3.115"]
},
"prompt_opts": {
"model": "deepseek/deepseek-r1-0528",
"system": "Make your answer concise but complete. No yapping. Direct professional tone. No emoji.",
"priority": "latency",
"quiet": false,
"show_reasoning": false,
"reasoning": {
"enabled": true,
"effort": "medium"
}
}
}
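The dns addresses in the example above are just what openrouter.ai resolved to at the time; you can refresh them with an ordinary lookup (dig is used here only as a convenient resolver, any equivalent works):
dig +short openrouter.ai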
Here are the settings that are not available on the command line:
save_to_file: Whether to also write the output to ${XDG_CACHE_HOME}/ort/last.json. Defaults to true. The continuation (-c) feature needs this.
dns: The IP address(es) of openrouter.ai. This saves time, no DNS lookups. Allows up to 16 addresses, although fewer is probably better.
ort does not validate TLS certificates. This makes the binary smaller and the connection start faster. It also makes it vulnerable to man-in-the-middle attacks. You can verify easily enough by browsing to openrouter.ai to see if your browser shows an alert. If it doesn't, you're likely fine. It would be possible to identify an ort request vs a browser request by fingerprinting the TLS ClientHello, but that is extremely unlikely at this stage.
The biggest risk is a man-in-the-middle gaining your OpenRouter API key. I have a $5 daily limit on mine, which I recommend. Your prompts are already likely saved by the provider, so hopefully you weren't expecting those to be secret anyway.
A man-in-the-middle attack could be perpetrated by the owner of the wi-fi router (so use a VPN when not at home!), your ISP, or a major network provider between your ISP and openrouter.
To summarize, we gain a smaller binary and faster requests, in exchange for a very small risk of losing up to $5 a day, that you can mitigate simply by browsing to openrouter.ai. Security engineers hate this one weird trick.
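If you prefer to do that check from a terminal rather than a browser, a generic openssl probe shows the certificate chain your network is actually serving (openssl is not part of ort, it's just the standard tool for this):
openssl s_client -connect openrouter.ai:443 -servername openrouter.ai </dev/null 2>/dev/null | openssl x509 -noout -issuer -subject -dates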
ort has its own TLS 1.3 stack, including all crypto operations: AES-128 GCM, HMAC, HKDF, ECDH/X25519, SHA-256. GPT-5 wrote the bulk of these implementations, one at a time. I wrote the tests first. They have not been reviewed by anyone else.
Here's what I found for the models I use regularly; example flag forms follow the list.
Required reasoning, MUST pass -r low|medium|high flag:
(-r none for no reasoning; it's both required and optional)
Optional reasoning with effort, pass -r off|low|medium|high:
Optional reasoning with tokens, pass e.g. -r off|4096:
Always fixed reasoning, cannot be configured or disabled:
No reasoning:
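Whichever category a model falls into, the flag forms themselves look like this (the first model slug is illustrative; use a model from the token category for the second form):
ort -m openai/gpt-5.1 -r medium "Summarise the borrow checker rules"
ort -m <model> -r 4096 "Summarise the borrow checker rules"   # token budget instead of an effort level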
All of this is in my .bash_aliases file. These are all bash aliases, so you type your prompt right after: q Is this a question?.
First make them token efficient with a system prompt:
SYSTEM_PROMPT="Make your answer concise but complete. No yapping. Direct professional tone. No emoji."
I use this hundreds of times a day for all my easy questions; it's quicker than reading the docs. This model is very fast, almost free, and high quality. Sterling job Google!
alias q='ort -p latency -m google/gemini-2.5-flash-lite-preview-09-2025 -r off -s "$SYSTEM_PROMPT"'
I just added multi-model support. After a bit of investigation, these are the best combination of fast, token efficient, and good quality:
alias qc='ort -p latency -r off -s "$SYSTEM_PROMPT" \
-m amazon/nova-lite-v1 \
-m qwen/qwen3-next-80b-a3b-instruct \
-m x-ai/grok-4-fast \
-m google/gemini-2.5-flash-lite-preview-09-2025'
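It works like q, with the prompt straight after the alias:
qc What's the difference between a mutex and a semaphore?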
Sometimes you need to write some code or ask a hard question. These are IMHO the smartest models today. GPT 5 and 5.1 write very good Rust.
alias gpt='ort -m openai/gpt-5.1 -r medium -s "$SYSTEM_PROMPT"'
alias gemini='ort -m google/gemini-3-pro-preview -r medium'
For open models, I like these two, although I don't use them as much:
alias glm='ort -r medium -pr z-ai -m z-ai/glm-4.6:exacto -s "$SYSTEM_PROMPT"'
alias m2='ort -m minimax/minimax-m2 -r medium -s "$SYSTEM_PROMPT"'
Here's an advanced example of how I use it in tmux:
#!/bin/bash
# Query multiple models in tmux panes
# Usage: xx Prompt goes here
#
# - Resets window panes to three horizontal rows
# - Runs the query in three LLMs, one in each pane
SYSTEM_PROMPT="Make your answer concise but complete. No yapping. Direct professional tone. No emoji."
MODEL_1=z-ai/glm-4.5 # Hybrid reasoning, -r useful here
MODEL_2=moonshotai/kimi-k2-0905 # No reasoning
MODEL_3=deepseek/deepseek-chat-v3.1 # Hybrid
# Close all other panes in the current window (keep only the current one)
tmux kill-pane -a
# Split the current pane horizontally to create 3 equal panes
tmux split-window -v
tmux split-window -v
# Select all panes and distribute them evenly
# try also: even-vertical (2 or 3 panes) or tiled (good for 4 panes)
tmux select-layout even-vertical
# Run commands in each pane
tmux send-keys -t 0 "ort -m $MODEL_1 -r medium -s \"$SYSTEM_PROMPT\" \"$*\"" Enter
tmux send-keys -t 1 "ort -m $MODEL_2 -s \"$SYSTEM_PROMPT\" \"$*\"" Enter
tmux send-keys -t 2 "ort -m $MODEL_3 -r medium -s \"$SYSTEM_PROMPT\" \"$*\"" Enter
# Optional: Select the first pane
tmux select-pane -t 0
If you mostly use models from the big-three labs I highly recommend trying OpenRouter. You get to use Qwen, DeepSeek, Kimi, GLM - all of which have impressed me - and all sorts of cutting-edge experiments, such as the diffusion model Mercury. Original version written by GPT-5 to my spec.
MIT Licence.