Evaluate AI agents with Unix-style pipeline commands. Schema-driven adapters for any CLI agent, trajectory capture, pass@k metrics, and multi-run comparison.
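The pass@k metric mentioned above can be illustrated with a short sketch. This is not taken from this repository's source; it assumes the commonly used unbiased estimator, where `n` trials are run for a task, `c` of them pass, and pass@k is the probability that at least one of `k` randomly drawn trials passes. The function name `pass_at_k` and its parameters are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n trials of which c passed.

    Assumption: this is the standard unbiased estimator
    1 - C(n - c, k) / C(n, k), not necessarily what this tool computes.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # With fewer than k failing trials, every draw of k contains a pass.
    if n - c < k:
        return 1.0
    # Probability that all k drawn trials are failures, subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # 10 trials, 3 passed: chance that a single random trial passes.
    print(pass_at_k(n=10, c=3, k=1))
```

With `k=1` the estimate reduces to the plain pass rate `c / n`; larger `k` values reward agents that succeed at least occasionally across repeated runs.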
Stars: 2
Forks: 0
Watchers: 2
Open Issues: 1
Commits
a579bd4 fix(trajectory): capture step timestamps at event time and extract tool input/output (#53)
08ed406 feat(schemas): add discriminated union to QualityMetricsSchema (#51)
073de66 feat(compare): extract latency and quality metrics from trials data (#49)
60b70d0 chore: migrate from .plaited/ to .agents/ and update publish workflow (#47)
3fb01dc feat: add --stdin flag and memory docs for container orchestration (#46)
f21ee5e feat(capture,trials): add parallelization with worker pool (#44)
cc2cb3e fix(compare): add type discriminator to reliability metrics output (#43)