Found 50 repositories (showing 30)
alopatenko
A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use cases, promote the adoption of best practices in LLM assessment, and critically assess the effectiveness of these evaluation methods.
fhuthmacher
No description available
Nihility123
No description available
Paulyang80
Found that running Vicuna and Llama 2 on A100 versus V100 GPUs produces different results, while other models such as Falcon do not show this issue.
AbiramiSukumaran
This lab evaluates a foundation model and a fine-tuned model using automatic metrics computed on the evaluation data. You will use the following Google Cloud products: Vertex AI Pipelines, Vertex AI Evaluation Services, Vertex AI Model Registry, and Vertex AI Endpoints.
metunlp
No description available
NessimBenA
No description available
liuchengyuan123
No description available
prasannavj
LLM Evaluation Approach
animeshsen01
No description available
isathish
LLM Evaluation Framework
LisaComments
llmevaluation
emmavicto
No description available
ShikhaSomvanshi
No description available
AlmajedA
This project is a web application designed to evaluate the responses of different Large Language Models (LLMs) in a Question and Answering task.
krishnaperumal26
No description available
dawndigit
No description available
ki2batt
No description available
sarupetceju
No description available
chaaiitanya
No description available
kavin-crypto
RAG / LLM evaluation using Ragas – implemented metrics to quantify answer quality in Retrieval-Augmented Generation.
AnuSree2468
No description available
Sneh14
This repository provides a testing and evaluation framework for Custom Large Language Models (LLMs) built on Retrieval-Augmented Generation (RAG) architecture.
errodriguez
EvidentlyAI LLM Evaluation Course - cohort 2025
sonephyo
Evaluating different AI models using LLM-as-a-Judge and scoring different LLM responses.
mailmahee
No description available
KirthiCTS
No description available
laboni68
No description available
ankita1124
RAG LLM Evaluation
rodydubey
Repository dedicated to testing LLM evaluation using the Hugging Face Open LLM Leaderboard and custom evaluations.