Found 50 repositories (showing 30)
alopatenko
A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use cases, promote the adoption of best practices in LLM assessment, and critically assess the effectiveness of these evaluation methods.
fhuthmacher
No description available
Nihility123
No description available
Paulyang80
Found that running Vicuna and Llama 2 on A100 versus V100 GPUs produces different results, while other models such as Falcon do not show this issue.
AbiramiSukumaran
This lab evaluates a foundation model and a fine-tuned model using automatic metrics computed on the evaluation data. You will use the following Google Cloud products: Vertex AI Pipelines, Vertex AI Evaluation Services, Vertex AI Model Registry, and Vertex AI Endpoints.
metunlp
No description available
NessimBenA
No description available
liuchengyuan123
No description available
prasannavj
LLM Evaluation Approach
animeshsen01
No description available
isathish
LLM Evaluation Framework
LisaComments
llmevaluation
emmavicto
No description available
ShikhaSomvanshi
No description available
AlmajedA
This project is a web application designed to evaluate the responses of different Large Language Models (LLMs) in a Question and Answering task.
krishnaperumal26
No description available
dawndigit
No description available
ki2batt
No description available
sarupetceju
No description available
chaaiitanya
No description available
kavin-crypto
RAG / LLM evaluation using Ragas – implemented metrics to quantify answer quality in Retrieval-Augmented Generation.
AnuSree2468
No description available
Sneh14
This repository provides a testing and evaluation framework for Custom Large Language Models (LLMs) built on Retrieval-Augmented Generation (RAG) architecture.
errodriguez
EvidentlyAI LLM Evaluation Course - cohort 2025
sonephyo
Evaluating different AI models using LLM-as-a-Judge and scoring different LLM responses.
mailmahee
No description available
KirthiCTS
No description available
laboni68
No description available
ankita1124
RAG LLM Evaluation
rodydubey
Repository dedicated to testing LLM evaluation using the Hugging Face Open LLM Leaderboard and custom evaluations.