Found 22 repositories (showing 22)
braintrustdata
AutoEvals is a tool for quickly and easily evaluating AI model outputs using best practices.
zhouzypaul
AutoEval: Autonomous Evaluation of Generalist Robot Manipulation Policies in the Real World | CoRL 2025
Simon4Yan
Automatic model evaluation (AutoEval) in CVPR'21 & TPAMI'22
LogSSim
Trustworthy Evaluation of Robotic Manipulation: A New Benchmark and AutoEval Methods
madhurprash
A framework to generate automated evals for your agentic application.
PengZirong
dify-autoeval
scikit-autoeval
scikit-autoeval: automatic evaluation for ML models
datar-psa
An autoevals-inspired toolkit for scoring AI model responses in Go.
MuhammedFarhanSyed
AutoEval AI – Automatic Exam Paper Evaluation System
frontsail-ai
A minimal, vitest-native evals library for LLM applications. Built-in metrics, G-Eval LLM-as-judge, autoevals integration, and pretty console output.
happy99996
AutoEval
Hamxea
LLM_AutoEval
one-aalam
TypeScript-first LLM evaluation library built on Autoevals with Vitest integration
chrisco210
AutoEval processes forms and data
biocentral
Biocentral Hub: Autoeval Service (plm leaderboard)
compdemocracy
Narrative summary coverage of underlying comments, basis for autoeval
Richard-Wth
AutoEval framework for data generation and model upgrading.
ranfysvalle02
Master RAG system development and validation. This project uses MongoDB Atlas & Azure OpenAI, showing how to build, test, and evaluate AI applications with autoevals for measurable performance.
kclip
code for the paper "Adaptive Prediction-Powered AutoEval with Reliability and Efficiency Guarantees"
andrew-lastmile
Streamlit app that uses LastMile AI's AutoEval evaluators fine-tuned on gemini and openai responses
mjc-ma
An attack method for AutoEval that aims to push the test-set accuracy from low to high. The attack is based on the confidence method, a branch of AutoEval.
kunalhonde03
AutoEval is a smart web-based academic analytics system that automates result classification, attendance detection, and analysis from online assessment data. It enables teachers to upload Excel results, instantly filter branch- and division-wise records, generate insights, and export structured reports, reducing manual effort, errors, and analysis time.