serdes-ai-evals 0.1.2 (published by janfeddersen-wq)
Created: 2026-01-15 · Updated: 2026-01-23
Repository: https://github.com/janfeddersen-wq/serdesAI

README

serdes-ai-evals

Evaluation framework for testing and benchmarking serdes-ai agents

This crate provides evaluation and testing capabilities for SerdesAI:

  • Test case definitions
  • Evaluation metrics (accuracy, latency, cost)
  • Benchmark harness
  • Regression testing
  • LLM-as-judge evaluators
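
As a rough illustration of the simplest evaluator style in the list above, a substring check of the kind `expected_contains` implies could look like the following. This is a standalone sketch with hypothetical names, not the crate's actual types:

```rust
/// Sketch of a substring-based evaluator (hypothetical; not
/// serdes-ai-evals' real API). A case passes if the agent's output
/// contains the expected substring.
struct ContainsCheck {
    expected: String,
}

impl ContainsCheck {
    fn passes(&self, output: &str) -> bool {
        output.contains(&self.expected)
    }
}

fn main() {
    let check = ContainsCheck { expected: "4".to_string() };
    assert!(check.passes("2 + 2 equals 4"));
    assert!(!check.passes("I'm not sure"));
}
```

More sophisticated evaluators (LLM-as-judge, latency or cost thresholds) would follow the same pass/fail shape but score the output differently.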

Installation

[dependencies]
serdes-ai-evals = "0.1"

Usage

use serdes_ai_evals::{EvalSuite, TestCase, Evaluator};

// Runs inside an async function; `agent` is an agent constructed with
// the main serdes-ai crate.
let suite = EvalSuite::new("my-agent-tests")
    .case(TestCase::new("greeting")
        .input("Hello!")
        .expected_contains("Hello"))
    .case(TestCase::new("math")
        .input("What is 2+2?")
        .expected_contains("4"));

let results = suite.run(&agent).await?;
println!("Pass rate: {:.1}%", results.pass_rate() * 100.0);
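
The `pass_rate()` figure printed above is, presumably, just the fraction of test cases that passed. As a self-contained sketch (not the crate's implementation):

```rust
/// Sketch of how a pass rate like `results.pass_rate()` is typically
/// computed: passing cases divided by total cases. Illustrative only;
/// not serdes-ai-evals' actual code.
fn pass_rate(outcomes: &[bool]) -> f64 {
    if outcomes.is_empty() {
        return 0.0; // avoid dividing by zero on an empty suite
    }
    outcomes.iter().filter(|&&passed| passed).count() as f64 / outcomes.len() as f64
}

fn main() {
    let outcomes = [true, true, false, true]; // 3 of 4 cases passed
    println!("Pass rate: {:.1}%", pass_rate(&outcomes) * 100.0); // Pass rate: 75.0%
}
```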

Part of SerdesAI

This crate is part of the SerdesAI workspace.

For most use cases, you should depend on the main serdes-ai crate, which re-exports these types.

License

MIT License - see LICENSE for details.
