A comprehensive human-in-the-loop evaluation platform for Large Language Models, built for AI alignment and safety research. This Flask-based application enables human evaluators to provide structured feedback on LLM outputs across multiple quality dimensions.
Stars: 2
Forks: 0
Watchers: 2
Open Issues: 0
Commits: 4