Found 19 repositories (showing 19)
No description available
Vrindiesel
A Transformer-based Response Evaluator for Open-Domain Spoken Conversation
deviate-dv8
Reference repository for the thesis "Enhancing Language Model Efficiency through Sequence-Level Knowledge Distillation with Sparse Transformers", in the context of a GPT conversation evaluator
AbhishekManiTripathi
A small application that takes query logs (conversation history, user query, bot answer, and context) and evaluates the query on various metrics
Pranay-Bhilare
A scalable system for evaluating conversation turns on hundreds to thousands of linguistic, pragmatic, safety, and emotional facets using open-weight LLMs.
heavyoryx
No description available
rishabhmohan
No description available
alipinch93
Automated evaluation pipeline for RunGopher voice agent conversations — PII stripping, GPT-4o summarization, clustering, and branded report generation
kmalhotra18
Career Conversation Chatbot with evaluator LLM
shouvanikhaldar
A project dedicated to the AWS AI Agent Global Hackathon initiative
No description available
vrushalideo
Customer service AI evaluator using the Anthropic API. Scores support conversations across 5 quality dimensions.
AnshumaanKarna92
Research implementation of a multi-agent problem solving architecture with teacher, student, evaluator, and coordinator agents working collaboratively through structured conversation.
udayangaac
Poker LARVIS is a command-line poker hand evaluator module. It simulates a conversation between a human and an AI assistant named LARVIS.
Hashem-Tabbaa
An agentic AI application built with Spring AI and Anthropic Claude that demonstrates core agentic patterns: orchestration, parallel sub-agents, tool calling, structured output, conversation memory, context engineering, and evaluator-optimizer loops.
Bikashnaik07
A voice-first, agentic AI system that helps users identify and apply for government welfare schemes in Hindi. Built using a Planner-Executor-Evaluator architecture with conversation memory and multi-tool integration.
akshaykulgod
A lightweight evaluator that scores LLM responses for relevance, completeness, hallucination risk, and latency/cost. It uses semantic embeddings, claim grounding, and caching for fast, low-cost, scalable analysis, producing an explainable JSON report for each conversation.
SnehashisRatna
System requirements:
- Voice input and voice output only
- Native language end-to-end (STT → Agent → TTS)
- Planner–Executor–Evaluator agent loop
- At least 2 tools (eligibility engine + scheme retrieval)
- Conversation memory with contradiction handling
- Failure recovery for missing info and STT errors
vmoreli
A toolkit to build and evaluate benchmarks derived from implicit user feedback in conversational datasets. The repository provides data extraction pipelines, LangGraph-based workflows to generate checklist requirements from conversations, and an LLM-driven evaluator that scores model responses against those checklists.
All 19 repositories loaded