A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
Stars: 3.3k
Forks: 244
Watchers: 3.3k
Open Issues: 69
Commit counts: 19, 17, 12, 11, 3, 2, 2, 2, 1, 1
Docs: clarify Python 3.9 recommended for dependency install (a3cc91a)
Add lite presets for starting and evaluating minimal task suite (5acc44b)