helmsman

Crates.io: helmsman
lib.rs: helmsman
version: 0.0.0
created_at: 2026-01-23 23:52:03.660555+00
updated_at: 2026-01-23 23:52:03.660555+00
description: Adaptive instruction server
homepage: https://github.com/seuros/helmsman
repository: https://github.com/seuros/helmsman
id: 2065717
size: 130,870
Abdelkader Boudih (seuros)

documentation: https://docs.rs/helmsman

README

Helmsman

Adaptive instruction server for AI coding agents.

The Problem

Static AGENTS.md files create instruction entropy collapse.

Using the same instructions for Opus, Sonnet, and Haiku is a fantasy. The models have different capabilities, different costs, and different failure modes. Static instructions:

  • Rot silently - written once, never updated, drift from reality
  • Waste tokens - Opus doesn't need step-by-step guidance
  • Cause failures - Haiku needs guardrails it doesn't get
  • Ignore context - "use brew" when you're on Arch with mise

The real cost isn't just tokens - it's instruction determinism. You can't control what you can't adapt.

The Solution

Helmsman serves dynamic, context-aware instructions via MCP (and CLI):

  • Model-aware: Opus gets minimal guidance, Haiku gets verbose hand-holding
  • Environment-aware: Detects OS, shell, available tools
  • Template-based: Jinja2 templates adapt to context

{% if model.tier == "agi" %}
Verify packages exist. You know what to do.
{% else %}
1. Read the file first with Read tool
2. Check for existing patterns
3. Verify packages exist - do NOT invent them
{% endif %}

{% if env.has_mise %}
Use mise for runtime management.
{% elif env.has_brew %}
Use brew for packages.
{% endif %}

Install

cargo install helmsman

Quick Start

  1. Create AGENTS.md.j2 in your project root

  2. Add to .mcp.json:

{
  "mcpServers": {
    "helmsman": {
      "type": "stdio",
      "command": "helmsman"
    }
  }
}

  3. Call the prompt:

/helmsman:instructions claude-opus-4-5-20251101

Tiers

Three capability tiers, parallel to Anthropic's model siblings. Could expand to 4-5, but we're not going the OpenAI route of 40 model names (nano, mini, micro, medium, large...).

monkey

Follows instructions. Useful but needs guardrails. Tell it exactly what to do, what NOT to do, and keep it inside the perimeter. Without guidance, it will hallucinate packages and invent APIs.

Examples: Haiku, GPT-5.2 mini, Gemini Flash

engineer

Knows the basics. Competent but lacks judgment. Will delete your 300GB cache to fix a bug. Bug fixed, but now you wait 4 hours for it to rebuild. Needs boundaries, not hand-holding.

Examples: Sonnet, GPT-5.2 medium, Gemini Pro

agi

The architect. Don't explain how to use cargo or how to publish. It knows. Give it constraints and goals, not procedures. Wasting tokens on step-by-step instructions is burning money.

Examples: Opus, GPT-5.2 high/xhigh, DeepSeek R3

Shortcuts: a/architect (agi), e/eng/standard (engineer), m/basic/simple (monkey)
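
The shortcut table above is a small lookup. A minimal Rust sketch of it, not Helmsman's actual source: the `Tier` enum, the `resolve_tier_alias` name, and the case-insensitive matching are all assumptions made for illustration.

```rust
/// The three capability tiers named in this README.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tier {
    Agi,
    Engineer,
    Monkey,
}

/// Resolve a tier name or shortcut to a tier.
/// Returns None for anything else, such as a full model id.
/// Case-insensitive matching is an assumption of this sketch.
fn resolve_tier_alias(input: &str) -> Option<Tier> {
    match input.trim().to_ascii_lowercase().as_str() {
        "agi" | "a" | "architect" => Some(Tier::Agi),
        "engineer" | "e" | "eng" | "standard" => Some(Tier::Engineer),
        "monkey" | "m" | "basic" | "simple" => Some(Tier::Monkey),
        _ => None,
    }
}

fn main() {
    for arg in ["a", "eng", "basic", "claude-opus-4-5-20251101"] {
        println!("{} -> {:?}", arg, resolve_tier_alias(arg));
    }
}
```

A full model id falls through to `None` here, so a caller would try pattern-based resolution next.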

CLI

helmsman                              # MCP server mode
helmsman -i                           # print instructions (default tier)
helmsman -i m                         # monkey tier
helmsman -i basic                     # monkey tier (neutral alias)
helmsman -i a                         # agi tier
helmsman -i architect                 # agi tier (neutral alias)
helmsman -i claude-opus-4-5-20251101  # resolves to agi tier
helmsman -i gpt-4o-mini               # resolves to monkey tier

# Override tier mapping for new/unknown models
helmsman -i unknown-model --tier engineer

# Show diff between tiers
helmsman -i a --diff e                # show AGI vs Engineer differences

helmsman -s commit                    # render .skills/commit.j2
helmsman -l                           # list available skills
helmsman --validate                   # check skill syntax
helmsman -t                           # show token count

Template Context

{# Model #}
{{ model.id }}        {# "claude-opus-4-5-20251101" #}
{{ model.tier }}      {# "agi", "engineer", "monkey" #}

{# Environment #}
{{ env.os }}          {# "macos", "arch", "debian", "alpine" #}
{{ env.shell }}       {# "zsh", "bash", "fish", "sh" #}
{{ env.in_docker }}   {# true/false #}
{{ env.in_ssh }}      {# true/false #}

{# Tools #}
{{ env.has_mise }}
{{ env.has_brew }}
{{ env.has_apt }}
{{ env.has_gh }}
{{ env.has_git }}
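
For readers who think in types, the context above can be pictured as two plain structs. This is a hedged sketch only: `ModelContext`, `EnvContext`, and `package_hint` are hypothetical names, and the real crate may shape its context differently. `package_hint` repeats the `has_mise` / `has_brew` branching from the template example earlier in this README.

```rust
/// Hypothetical mirror of the `model` template variable.
#[allow(dead_code)]
#[derive(Debug, Clone)]
struct ModelContext {
    id: String,   // e.g. "claude-opus-4-5-20251101"
    tier: String, // "agi", "engineer", or "monkey"
}

/// Hypothetical mirror of the `env` template variable.
#[allow(dead_code)]
#[derive(Debug, Clone, Default)]
struct EnvContext {
    os: String,
    shell: String,
    in_docker: bool,
    in_ssh: bool,
    has_mise: bool,
    has_brew: bool,
    has_apt: bool,
    has_gh: bool,
    has_git: bool,
}

/// Same branching as `{% if env.has_mise %} ... {% elif env.has_brew %}`:
/// mise wins when both are present, and nothing is emitted when neither is.
fn package_hint(env: &EnvContext) -> Option<&'static str> {
    if env.has_mise {
        Some("Use mise for runtime management.")
    } else if env.has_brew {
        Some("Use brew for packages.")
    } else {
        None
    }
}

fn main() {
    let model = ModelContext {
        id: "claude-opus-4-5-20251101".to_string(),
        tier: "agi".to_string(),
    };
    let env = EnvContext {
        os: "arch".to_string(),
        shell: "zsh".to_string(),
        has_mise: true,
        has_git: true,
        ..Default::default()
    };
    println!("{} ({}) on {}/{}", model.id, model.tier, env.os, env.shell);
    println!("{:?}", package_hint(&env));
}
```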

Configuration

helmsman.toml - tier definitions:

[tiers.agi]
max_context = 8000
features = ["planning", "architecture", "autonomous"]

[tiers.monkey]
max_context = 1000
features = ["quick_tasks", "simple_edits"]

[defaults]
tier = "engineer"

src/models.toml - model → tier mappings (embedded into the binary):

[models]
"claude-opus-*" = "agi"
"claude-*-sonnet-*" = "engineer"
"claude-*-haiku-*" = "monkey"
"gpt-5.2-xhigh*" = "agi"
"gpt-5.2-high*" = "agi"
"gpt-5.2-medium*" = "engineer"
"gpt-5.2-mini*" = "monkey"
"deepseek-r3*" = "agi"
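
To make the pattern table concrete, here is a minimal sketch of how such globs could resolve a model id to a tier. It is not the crate's actual matcher. Three things are assumptions: `*` matches any run of characters, the first matching entry wins, and unmatched ids fall back to a caller-supplied default. `glob_match`, `resolve_tier`, and `MODEL_PATTERNS` are hypothetical names.

```rust
/// Pattern table copied from the models.toml excerpt above.
const MODEL_PATTERNS: &[(&str, &str)] = &[
    ("claude-opus-*", "agi"),
    ("claude-*-sonnet-*", "engineer"),
    ("claude-*-haiku-*", "monkey"),
    ("gpt-5.2-xhigh*", "agi"),
    ("gpt-5.2-high*", "agi"),
    ("gpt-5.2-medium*", "engineer"),
    ("gpt-5.2-mini*", "monkey"),
    ("deepseek-r3*", "agi"),
];

/// Match `text` against `pattern`, where `*` matches any run of bytes
/// (including none) and every other byte matches itself.
fn glob_match(pattern: &str, text: &str) -> bool {
    let (p, t) = (pattern.as_bytes(), text.as_bytes());
    let (mut pi, mut ti) = (0usize, 0usize);
    // Position of the most recent `*` and the text index it was tried at.
    let mut backtrack: Option<(usize, usize)> = None;

    while ti < t.len() {
        if pi < p.len() && p[pi] == b'*' {
            backtrack = Some((pi, ti));
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some((star_pi, star_ti)) = backtrack {
            // Let the last `*` absorb one more byte, then retry.
            pi = star_pi + 1;
            ti = star_ti + 1;
            backtrack = Some((star_pi, star_ti + 1));
        } else {
            return false;
        }
    }
    // Only trailing `*`s may remain in the pattern.
    p[pi..].iter().all(|&c| c == b'*')
}

/// First matching pattern wins; unknown models get `default_tier`.
fn resolve_tier(model_id: &str, default_tier: &'static str) -> &'static str {
    MODEL_PATTERNS
        .iter()
        .find(|entry| glob_match(entry.0, model_id))
        .map(|entry| entry.1)
        .unwrap_or(default_tier)
}

fn main() {
    for id in ["claude-opus-4-5-20251101", "gpt-5.2-mini", "unknown-model"] {
        println!("{} -> {}", id, resolve_tier(id, "engineer"));
    }
}
```

Under these assumptions, an id such as `unknown-model` matches nothing and gets the `[defaults]` tier, which is the case the `--tier` override exists for.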

Skills

Project skills live in .skills/ and are discovered automatically. Files prefixed with _ are partials.
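
The discovery rule can be sketched in a few lines of Rust. This is an illustration under assumptions, not the crate's code: it assumes skills are `.j2` files (as in `.skills/commit.j2`), that the skill name is the file stem, and that anything else in the directory is ignored. `SkillFile`, `classify`, and `list_skills` are hypothetical names.

```rust
use std::{fs, io, path::Path};

/// How a file name found in `.skills/` is treated in this sketch.
#[derive(Debug, PartialEq, Eq)]
enum SkillFile<'a> {
    Skill(&'a str),   // renderable skill, e.g. "commit" for commit.j2
    Partial(&'a str), // `_`-prefixed partial, e.g. "header" for _header.j2
    Ignored,          // anything that is not a named .j2 file
}

fn classify(file_name: &str) -> SkillFile<'_> {
    match file_name.strip_suffix(".j2") {
        Some(stem) if stem.is_empty() || stem == "_" => SkillFile::Ignored,
        Some(stem) => match stem.strip_prefix('_') {
            Some(name) => SkillFile::Partial(name),
            None => SkillFile::Skill(stem),
        },
        None => SkillFile::Ignored,
    }
}

/// List skill names in `dir`, skipping partials and unrelated files.
fn list_skills(dir: &Path) -> io::Result<Vec<String>> {
    let mut skills = Vec::new();
    for entry in fs::read_dir(dir)? {
        let file_name = entry?.file_name();
        if let Some(name) = file_name.to_str() {
            if let SkillFile::Skill(skill) = classify(name) {
                skills.push(skill.to_string());
            }
        }
    }
    skills.sort();
    Ok(skills)
}

fn main() {
    match list_skills(Path::new(".skills")) {
        Ok(skills) => println!("{:?}", skills),
        Err(err) => eprintln!("could not read .skills: {}", err),
    }
}
```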

Environment Detection

Helmsman detects OS, shell, and available tools automatically. Best effort only, never authoritative.

Known edge cases:

  • SSH + Docker may report wrong shell ($SHELL lies)
  • Alpine/busybox may lack expected binaries
  • Container detection uses heuristics (cgroup parsing)

Use these values for optimization hints, not hard requirements.
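
The kind of best-effort probing described above might look like the following sketch. The specific signals are assumptions, not confirmed Helmsman behavior: the basename of `$SHELL`, the `SSH_CONNECTION` variable, the `/.dockerenv` file, container runtime names in `/proc/1/cgroup`, and a `PATH` scan for tools. All function names are hypothetical.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::Path;

/// Shell name from a `$SHELL`-style path, e.g. "/usr/bin/zsh" gives "zsh".
/// This reports the login shell, which may not be the shell actually running.
fn shell_name(shell_path: &str) -> Option<&str> {
    Path::new(shell_path).file_name()?.to_str()
}

/// Heuristic: does the cgroup file content mention a container runtime?
fn cgroup_suggests_container(cgroup: &str) -> bool {
    ["docker", "kubepods", "containerd", "lxc"]
        .iter()
        .any(|marker| cgroup.contains(*marker))
}

/// Is a file called `tool` present in any directory of `path_var`?
/// The executable bit is not checked, so this can give false positives.
fn has_tool(tool: &str, path_var: &OsStr) -> bool {
    env::split_paths(path_var).any(|dir| dir.join(tool).is_file())
}

fn main() {
    let shell = env::var("SHELL").ok();
    let path = env::var_os("PATH").unwrap_or_default();
    let cgroup = std::fs::read_to_string("/proc/1/cgroup").unwrap_or_default();

    println!("shell: {:?}", shell.as_deref().and_then(shell_name));
    println!("in_ssh: {}", env::var_os("SSH_CONNECTION").is_some());
    println!(
        "in_docker: {}",
        Path::new("/.dockerenv").exists() || cgroup_suggests_container(&cgroup)
    );
    for tool in ["mise", "brew", "apt", "gh", "git"] {
        println!("has_{}: {}", tool, has_tool(tool, &path));
    }
}
```

Every probe here can be wrong in the edge cases listed above, which is why the results are better treated as hints than as requirements.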

Non-Goals

Things Helmsman deliberately doesn't do:

  • Prompt engineering framework - not here to optimize your prompts
  • Model memory/learning - stateless, no persistence between calls
  • Teaching tool - assumes you know what you're doing
  • Configuration management - use real tools for that

Helmsman is infrastructure, not a product.

License

BSD-3-Clause
