TashanGKD/agent-eval-benchmarks - GitHub Explorer

Stars

0

Forks

0

Watchers

0

Open Issues

0

Repository Health Score

55/100

Fair

Overall repository health assessment

Activity

Regular updates - updated this month

20/30

67%

Recent Commits

feat: add data analysis + 4 result figures (Phase 1)

Boyuan-Zheng•1 week ago

docs: add README per standards + Phase 1 experiment results

Boyuan-Zheng•1 week ago

exp: StreamBench pilot, 50 rounds, A vs B: HotpotQA multi-hop QA results

BoYuan•2 weeks ago

exp: run experiments 1+3 baseline — AgentHarm + ARC-AGI-2 results

BoYuan•2 weeks ago

feat: add eval_adapter — research plan preparation phase complete

BoYuan•2 weeks ago

docs: add research plan v1.0 for self-evolving agent evaluation

BoYuan•2 weeks ago

init: add agent-eval-benchmarks project with 15 benchmark repos

BoYuan•2 weeks ago