| Crates.io | agent-execution-harness |
| lib.rs | agent-execution-harness |
| version | 0.2.0 |
| created_at | 2026-01-20 04:57:56.771952+00 |
| updated_at | 2026-01-21 17:48:26.003419+00 |
| description | A test harness for validating AI agent behavior against steering guides |
| homepage | |
| repository | https://github.com/tatimblin/agent-execution-harness |
| max_upload_size | |
| id | 2055790 |
| size | 155,365 |
A test harness for validating AI agent (Claude Code) behavior against steering guides. This tool executes Claude with prompts and asserts on the tool calls made.
cargo install agent-execution-harness
brew tap tatimblin/agent-execution-harness
brew install agent-execution-harness
git clone https://github.com/tatimblin/agent-execution-harness
cd agent-execution-harness
cargo build --release
The harness has two main commands:
Execute Claude with a test file and evaluate assertions:
# Run a single test
harness run test.yaml
# Run all tests in a directory
harness run tests/
# With verbose output and custom working directory
harness run test.yaml -v -w /path/to/workdir
Analyze a pre-existing Claude session log against test assertions:
harness analyze test.yaml session.jsonl
Tests are defined in YAML files:
name: "Test name"
prompt: "The prompt to send to Claude"
assertions:
- tool: Read
called: true
params:
file_path: "*.txt" # glob pattern
- tool: Bash
called: false
- tool: Write
called_after: Read
called: true/false - Whether a tool was calledparams - Parameter matching (supports glob patterns, regex, or exact match)called_after - Ordering assertions (tool A must be called after tool B)# Build the project
cargo build
# Run tests
cargo test
# Run a specific test
cargo test test_name
This project has fully automated releases to both crates.io and Homebrew:
Create crates.io API token:
CARGO_REGISTRY_TOKEN secret in GitHub repo settingsCreate GitHub personal access token:
repo permissions for your homebrew-tap repoHOMEBREW_TAP_TOKEN secret in GitHub repo settingsCreate Homebrew tap repository:
gh repo create tatimblin/homebrew-agent-execution-harness --public
# Bump version and trigger fully automated release
./release.sh patch # 0.1.0 -> 0.1.1
./release.sh minor # 0.1.0 -> 0.2.0
./release.sh major # 0.1.0 -> 1.0.0
./release.sh 1.2.3 # specific version
This single command automatically:
Cargo.toml and Cargo.lockNo manual steps required! 🎉
MIT