AISBench Benchmark is a model evaluation tool built on OpenCompass, compatible with OpenCompass’s configuration system, dataset structure, and model backend implementation, while extending support for service-based models.
Stars: 54
Forks: 23
Watchers: 54
Open Issues: 53
Commit counts: 47, 32, 15, 10, 9, 6, 1, 1, 1, 1
d8bd4ea [Design update] Update Design of Judge Model and GEdit Bench in design doc (#214)
1cc180c Fix the issue where TTFT and TPOT have no data when running Kimi2.5 i… (#153)
e0397d8 [bugfix] fix bugs about reuse infer in judge infer cases (#202)
7f6b780 【BugFix】Fix model configuration compatibility in datasets and postprocessors (#190)