Found 237,956 repositories (showing 30)
openai
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
EleutherAI
A framework for few-shot evaluation of language models.
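For context, the harness's README documents a `simple_evaluate` entry point; a minimal sketch under that assumption, using a small Hugging Face checkpoint and the `hellaswag` task (both the package and the task data must be downloadable):

```python
# Sketch of driving EleutherAI's lm-evaluation-harness programmatically.
# Assumes `pip install lm-eval`; model/task names here are illustrative.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                         # Hugging Face backend
    model_args="pretrained=gpt2",       # any HF causal LM checkpoint
    tasks=["hellaswag"],
    num_fewshot=0,
)
print(results["results"]["hellaswag"])  # per-task metrics
```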
:boom: React Native UI Library based on Eva Design System :new_moon_with_face::sparkles: Dark Mode
akveo
A pack of more than 480 beautifully crafted Open Source icons. SVG, Sketch, Web Font and Animations support.
akveo
:boom: Customizable Angular UI Library based on Eva Design System :new_moon_with_face::sparkles: Dark Mode
ktr0731
Evans: more expressive universal gRPC client
openai
No description available
EvolvingLMMs-Lab
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
openai
Code for the paper "Evaluating Large Language Models Trained on Code"
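That repository ships the HumanEval dataset alongside a pass@k checker; a minimal sketch of the generate-then-score loop it documents, where `generate_one_completion` is a hypothetical stand-in for a real model call:

```python
# Sketch of producing a samples file for openai/human-eval's checker.
# `generate_one_completion` is a hypothetical placeholder, not part of the library.
from human_eval.data import read_problems, write_jsonl

def generate_one_completion(prompt: str) -> str:
    # Placeholder: should return code completing the function body in `prompt`.
    return "    return None\n"

problems = read_problems()  # task_id -> {"prompt": ..., "test": ...}
samples = [
    dict(task_id=task_id, completion=generate_one_completion(p["prompt"]))
    for task_id, p in problems.items()
]
write_jsonl("samples.jsonl", samples)
# Scoring then runs via the bundled CLI:
#   evaluate_functional_correctness samples.jsonl
```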
1y0n
掩日 (YanRi) - a generator for antivirus-evasion executables
georgia-tech-db
Database system for AI-powered apps
baaivision
EVA Series: Visual Representation Fantasies from BAAI
modelscope
A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.
huggingface
🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
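As a quick illustration, the library follows a load/compute pattern; a minimal sketch, assuming the package is installed and the `accuracy` metric script can be fetched:

```python
# Minimal use of huggingface/evaluate: load a metric, then score predictions.
import evaluate

accuracy = evaluate.load("accuracy")
result = accuracy.compute(
    references=[0, 1, 1, 0],   # ground-truth labels
    predictions=[0, 1, 0, 0],  # model outputs
)
print(result)  # {'accuracy': 0.75}
```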
facebookresearch
A python tool for evaluating the quality of sentence embeddings.
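SentEval's documented pattern is to hand `prepare` and `batcher` callbacks to its engine; a rough sketch under that assumption, where the random-vector "encoder" is a placeholder for a real sentence embedder and `PATH_TO_DATA` must point at the downloaded transfer-task data:

```python
# Sketch of facebookresearch/SentEval usage, following its README pattern.
import numpy as np
import senteval

def prepare(params, samples):
    # Called once per task; build vocab or fit preprocessing here if needed.
    return

def batcher(params, batch):
    # Must return one embedding per sentence; random 128-d vectors as a stand-in.
    return np.random.rand(len(batch), 128)

params = {"task_path": "PATH_TO_DATA", "usepytorch": False, "kfold": 5}
se = senteval.engine.SE(params, batcher, prepare)
results = se.eval(["MR", "STS12"])  # transfer tasks to score embeddings on
print(results)
```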
huggingface
Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!
rmcelreath
Statistical Rethinking course at MPI-EVA from Dec 2018 through Feb 2019
Cloud-CV
:cloud: :rocket: :bar_chart: :chart_with_upwards_trend: Evaluating state of the art in AI
tatsu-lab
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Veil-Framework
Veil Evasion is no longer supported, use Veil 3.0!
TTLabs
Javascript library for browser to S3 multipart resumable uploads
eva-engine
Eva.js is a front-end game engine specifically for creating interactive game projects.
evalplus
Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024
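EvalPlus's README describes dataset helpers mirroring the HumanEval flow; a sketch assuming those helpers (`get_human_eval_plus`, `write_jsonl`) are available, with a hypothetical `solve` stub in place of an actual model:

```python
# Sketch of preparing samples for evalplus scoring, per its documented helpers.
# `solve` is a hypothetical placeholder for an LLM call.
from evalplus.data import get_human_eval_plus, write_jsonl

def solve(prompt: str) -> str:
    return "    pass\n"  # placeholder completion

samples = [
    dict(task_id=task_id, solution=problem["prompt"] + solve(problem["prompt"]))
    for task_id, problem in get_human_eval_plus().items()
]
write_jsonl("samples.jsonl", samples)
# Scoring then runs via the project's CLI, e.g.:
#   evalplus.evaluate --dataset humaneval --samples samples.jsonl
```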
MLGroupJLU
The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".
BytedanceSpeech
No description available
oddcod3
Python antivirus evasion tool
mattpocock
Evaluate your LLM-powered apps with TypeScript
Maluuba
Evaluation code for various unsupervised automated metrics for Natural Language Generation.
silentmatt
Mathematical expression evaluator in JavaScript
refreshdotdev
An MCP server that autonomously evaluates web applications.