agent-execution-harness

Crates.ioagent-execution-harness
lib.rsagent-execution-harness
version0.2.0
created_at2026-01-20 04:57:56.771952+00
updated_at2026-01-21 17:48:26.003419+00
descriptionA test harness for validating AI agent behavior against steering guides
homepage
repositoryhttps://github.com/tatimblin/agent-execution-harness
max_upload_size
id2055790
size155,365
Tristan Timblin (tatimblin)

documentation

README

Agent Execution Harness

A test harness for validating AI agent (Claude Code) behavior against steering guides. This tool executes Claude with prompts and asserts on the tool calls made.

Installation

From crates.io

cargo install agent-execution-harness

From Homebrew

brew tap tatimblin/agent-execution-harness
brew install agent-execution-harness

From source

git clone https://github.com/tatimblin/agent-execution-harness
cd agent-execution-harness
cargo build --release

Usage

The harness has two main commands:

Run tests

Execute Claude with a test file and evaluate assertions:

# Run a single test
harness run test.yaml

# Run all tests in a directory
harness run tests/

# With verbose output and custom working directory
harness run test.yaml -v -w /path/to/workdir

Analyze existing sessions

Analyze a pre-existing Claude session log against test assertions:

harness analyze test.yaml session.jsonl

Test File Format

Tests are defined in YAML files:

name: "Test name"
prompt: "The prompt to send to Claude"
assertions:
  - tool: Read
    called: true
    params:
      file_path: "*.txt"  # glob pattern
  - tool: Bash
    called: false
  - tool: Write
    called_after: Read

Assertion Types

  • called: true/false - Whether a tool was called
  • params - Parameter matching (supports glob patterns, regex, or exact match)
  • called_after - Ordering assertions (tool A must be called after tool B)

Development

# Build the project
cargo build

# Run tests
cargo test

# Run a specific test
cargo test test_name

Release Process

This project has fully automated releases to both crates.io and Homebrew:

One-time setup

  1. Create crates.io API token:

  2. Create GitHub personal access token:

    • Go to GitHub Settings → Developer settings → Personal access tokens
    • Create a token with repo permissions for your homebrew-tap repo
    • Add it as HOMEBREW_TAP_TOKEN secret in GitHub repo settings
  3. Create Homebrew tap repository:

    gh repo create tatimblin/homebrew-agent-execution-harness --public
    

Create a new release

# Bump version and trigger fully automated release
./release.sh patch  # 0.1.0 -> 0.1.1
./release.sh minor  # 0.1.0 -> 0.2.0
./release.sh major  # 0.1.0 -> 1.0.0
./release.sh 1.2.3  # specific version

This single command automatically:

  1. Updates version in Cargo.toml and Cargo.lock
  2. Creates and pushes a git tag
  3. Triggers GitHub Actions that:
    • Builds binaries for all platforms
    • Creates GitHub release with binaries
    • Publishes to crates.io
    • Calculates SHA256 hashes for Homebrew formula
    • Updates and pushes the Homebrew formula to your tap

No manual steps required! 🎉

Architecture

  • parser.rs - Parses Claude Code JSONL session logs
  • assertions.rs - Test structure and assertion evaluation
  • executor.rs - Executes Claude and finds session logs
  • watcher.rs - File watching for incremental parsing

License

MIT

Commit count: 39

cargo fmt