serdes-ai-evals 0.1.2 (published by janfeddersen-wq)
Created: 2026-01-15 · Updated: 2026-01-23
Repository: https://github.com/janfeddersen-wq/serdesAI

README

serdes-ai-evals

Evaluation framework for testing and benchmarking serdes-ai agents

This crate provides evaluation and testing capabilities for SerdesAI:

  • Test case definitions
  • Evaluation metrics (accuracy, latency, cost)
  • Benchmark harness
  • Regression testing
  • LLM-as-judge evaluators
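
As a rough illustration of the simplest evaluator style in the list above, a substring check of the kind `expected_contains` implies could look like the following. This is a standalone sketch with hypothetical names, not the crate's actual types:

```rust
/// Sketch of a substring-based evaluator (hypothetical; not
/// serdes-ai-evals' real API). A case passes if the agent's output
/// contains the expected substring.
struct ContainsCheck {
    expected: String,
}

impl ContainsCheck {
    fn passes(&self, output: &str) -> bool {
        output.contains(&self.expected)
    }
}

fn main() {
    let check = ContainsCheck { expected: "4".to_string() };
    assert!(check.passes("2 + 2 equals 4"));
    assert!(!check.passes("I'm not sure"));
}
```

More sophisticated evaluators (LLM-as-judge, latency or cost thresholds) would follow the same pass/fail shape but score the output differently.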

Installation

[dependencies]
serdes-ai-evals = "0.1"

Usage

use serdes_ai_evals::{EvalSuite, TestCase, Evaluator};

// Runs inside an async function; `agent` is an agent constructed with
// the main serdes-ai crate.
let suite = EvalSuite::new("my-agent-tests")
    .case(TestCase::new("greeting")
        .input("Hello!")
        .expected_contains("Hello"))
    .case(TestCase::new("math")
        .input("What is 2+2?")
        .expected_contains("4"));

let results = suite.run(&agent).await?;
println!("Pass rate: {:.1}%", results.pass_rate() * 100.0);
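
The `pass_rate()` figure printed above is, presumably, just the fraction of test cases that passed. As a self-contained sketch (not the crate's implementation):

```rust
/// Sketch of how a pass rate like `results.pass_rate()` is typically
/// computed: passing cases divided by total cases. Illustrative only;
/// not serdes-ai-evals' actual code.
fn pass_rate(outcomes: &[bool]) -> f64 {
    if outcomes.is_empty() {
        return 0.0; // avoid dividing by zero on an empty suite
    }
    outcomes.iter().filter(|&&passed| passed).count() as f64 / outcomes.len() as f64
}

fn main() {
    let outcomes = [true, true, false, true]; // 3 of 4 cases passed
    println!("Pass rate: {:.1}%", pass_rate(&outcomes) * 100.0); // Pass rate: 75.0%
}
```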

Part of SerdesAI

This crate is part of the SerdesAI workspace.

For most use cases, you should depend on the main serdes-ai crate, which re-exports these types.

License

MIT License - see LICENSE for details.
