agent-execution-harness

Crates.io	agent-execution-harness
lib.rs	agent-execution-harness
version	0.2.0
created_at	2026-01-20 04:57:56.771952+00
updated_at	2026-01-21 17:48:26.003419+00
description	A test harness for validating AI agent behavior against steering guides
homepage
repository	https://github.com/tatimblin/agent-execution-harness
max_upload_size
id	2055790
size	155,365

Tristan Timblin (tatimblin)

documentation

README

Agent Execution Harness

A test harness for validating AI agent (Claude Code) behavior against steering guides. This tool executes Claude with prompts and asserts on the tool calls made.

Installation

From crates.io

cargo install agent-execution-harness

From Homebrew

brew tap tatimblin/agent-execution-harness
brew install agent-execution-harness

From source

git clone https://github.com/tatimblin/agent-execution-harness
cd agent-execution-harness
cargo build --release

Usage

The harness has two main commands:

Run tests

Execute Claude with a test file and evaluate assertions:

# Run a single test
harness run test.yaml

# Run all tests in a directory
harness run tests/

# With verbose output and custom working directory
harness run test.yaml -v -w /path/to/workdir

Analyze existing sessions

Analyze a pre-existing Claude session log against test assertions:

harness analyze test.yaml session.jsonl

Test File Format

Tests are defined in YAML files:

name: "Test name"
prompt: "The prompt to send to Claude"
assertions:
  - tool: Read
    called: true
    params:
      file_path: "*.txt"  # glob pattern
  - tool: Bash
    called: false
  - tool: Write
    called_after: Read

Assertion Types

called: true/false - Whether a tool was called
params - Parameter matching (supports glob patterns, regex, or exact match)
called_after - Ordering assertions (tool A must be called after tool B)

Development

# Build the project
cargo build

# Run tests
cargo test

# Run a specific test
cargo test test_name

Release Process

This project has fully automated releases to both crates.io and Homebrew:

One-time setup

Create crates.io API token:
- Go to https://crates.io/settings/tokens and create a token
- Add it as CARGO_REGISTRY_TOKEN secret in GitHub repo settings
Create GitHub personal access token:
- Go to GitHub Settings → Developer settings → Personal access tokens
- Create a token with repo permissions for your homebrew-tap repo
- Add it as HOMEBREW_TAP_TOKEN secret in GitHub repo settings

Create Homebrew tap repository:

gh repo create tatimblin/homebrew-agent-execution-harness --public

Create a new release

# Bump version and trigger fully automated release
./release.sh patch  # 0.1.0 -> 0.1.1
./release.sh minor  # 0.1.0 -> 0.2.0
./release.sh major  # 0.1.0 -> 1.0.0
./release.sh 1.2.3  # specific version

This single command automatically:

Updates version in Cargo.toml and Cargo.lock
Creates and pushes a git tag
Triggers GitHub Actions that:
- Builds binaries for all platforms
- Creates GitHub release with binaries
- Publishes to crates.io
- Calculates SHA256 hashes for Homebrew formula
- Updates and pushes the Homebrew formula to your tap

No manual steps required! 🎉

Architecture

parser.rs - Parses Claude Code JSONL session logs
assertions.rs - Test structure and assertion evaluation
executor.rs - Executes Claude and finds session logs
watcher.rs - File watching for incremental parsing

License

MIT

Commit count: 39