Found 19 repositories (showing 19)
No description available
Vrindiesel
A Transformer-based Response Evaluator for Open-Domain Spoken Conversation
deviate-dv8
Reference repository for the thesis "Enhancing Language Model Efficiency through Sequence-Level Knowledge Distillation with Sparse Transformers", in the context of a GPT conversation evaluator
AbhishekManiTripathi
A small application that takes query logs (conversation history, user query, bot answer, and context) and evaluates the query on various metrics
Pranay-Bhilare
A scalable system for evaluating conversation turns on hundreds to thousands of linguistic, pragmatic, safety, and emotional facets using open-weight LLMs.
heavyoryx
No description available
rishabhmohan
No description available
alipinch93
Automated evaluation pipeline for RunGopher voice agent conversations — PII stripping, GPT-4o summarization, clustering, and branded report generation
kmalhotra18
Career Conversation Chatbot with evaluator LLM
shouvanikhaldar
A project dedicated to the AWS AI Agent Global Hackathon initiative
No description available
vrushalideo
Customer service AI evaluator using the Anthropic API. Scores support conversations across 5 quality dimensions.
AnshumaanKarna92
Research implementation of a multi-agent problem solving architecture with teacher, student, evaluator, and coordinator agents working collaboratively through structured conversation.
udayangaac
Poker LARVIS is a command-line poker hand evaluator module. It simulates a conversation between a human and an AI assistant named LARVIS.
Hashem-Tabbaa
An agentic AI application built with Spring AI and Anthropic Claude that demonstrates core agentic patterns: orchestration, parallel sub-agents, tool calling, structured output, conversation memory, context engineering, and evaluator-optimizer loops.
Bikashnaik07
A voice-first, agentic AI system that helps users identify and apply for government welfare schemes in Hindi. Built using a Planner-Executor-Evaluator architecture with conversation memory and multi-tool integration.
akshaykulgod
A lightweight evaluator that scores LLM responses for relevance, completeness, hallucination risk, and latency/cost. It uses semantic embeddings, claim grounding, and caching for fast, low-cost, scalable analysis, producing an explainable JSON report for each conversation.
SnehashisRatna
System requirements:
- Voice input and voice output only
- Native language end-to-end (STT → Agent → TTS)
- Planner–Executor–Evaluator agent loop
- At least 2 tools (eligibility engine + scheme retrieval)
- Conversation memory with contradiction handling
- Failure recovery for missing info and STT errors
vmoreli
A toolkit to build and evaluate benchmarks derived from implicit user feedback in conversational datasets. The repository provides data extraction pipelines, LangGraph-based workflows to generate checklist requirements from conversations, and an LLM-driven evaluator that scores model responses against those checklists.
All 19 repositories loaded