Automated harness evolution for AI agents. A Claude Code plugin that iteratively optimizes system prompts, routing, retrieval, and orchestration code using full-trace counterfactual diagnosis. Based on Meta-Harness (Lee et al., 2026).
Stars: 4
Forks: 2
Watchers: 4
Open Issues: 0
331 commits
1da4274 fix: check both run.error and outputs.error for rate-limit detection
1a79875 Merge pull request #24 from raphaelchristi/fix/eval-reliability
40964bb refactor: move RATE_LIMIT_RE to _common.py, add --sample-split warning, update docs
acace0f fix: evolve skill passes --sample-split train in light mode
42c7d4e fix: split-aware sampling + minimum N guard for reliable held_out comparison
94447f9 fix: rate-limit detection regex, stderr tail truncation, certify canary
7b4e3b9 Merge pull request #23 from raphaelchristi/feat/compound-learning-certify
5db61f7 fix: address code review — regex format, phase heading, terminology, tests
8206f1f docs: compound learning + certify in CLAUDE.md, FEATURES.md + tests
8247ca9 feat: add /harness:certify skill — verify score stability via repeated eval
620d58a feat: consolidator flags promotion candidates + deploy offers learning promotion
dcaff68 feat: add promote_learnings.py — compound learning from evolution to CLAUDE.md
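
The most recent commit describes checking both run.error and outputs.error when detecting rate limits. A minimal sketch of what such a check could look like follows. The regex, the function name is_rate_limited, and the shape of the run and outputs values are all assumptions made for illustration; the repository's actual RATE_LIMIT_RE in _common.py is not shown on this page and may differ.

```python
import re
from typing import Any, Mapping, Optional

# Hypothetical pattern: the real RATE_LIMIT_RE in _common.py is not shown here.
RATE_LIMIT_RE = re.compile(
    r"rate[\s_-]?limit|\b429\b|too many requests", re.IGNORECASE
)


def is_rate_limited(
    run_error: Optional[str],
    outputs: Optional[Mapping[str, Any]],
) -> bool:
    """Return True if either error field looks like a rate-limit failure.

    run_error stands in for run.error and outputs["error"] for outputs.error;
    both field layouts are assumed, not taken from the repository.
    """
    outputs_error = (outputs or {}).get("error")
    for message in (run_error, outputs_error):
        # Skip missing or empty fields, then test the text against the pattern.
        if message and RATE_LIMIT_RE.search(str(message)):
            return True
    return False


if __name__ == "__main__":
    print(is_rate_limited("HTTP 429 Too Many Requests", None))
    print(is_rate_limited(None, {"error": "Rate limit exceeded"}))
    print(is_rate_limited(None, {"error": "connection timeout"}))
```

Checking both fields matters when a run can finish without a top-level error while the error text is recorded only in its outputs; testing just one field would miss those runs.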