Found 41 repositories (showing 30)
pilot7747
This repository provides data and code for the paper "Vox Populi, Vox DIY: Benchmark Dataset for Crowdsourced Audio Transcription".
emirdemirel
This is a subset of the DALI set consisting of 240 polyphonic recordings that is used to benchmark lyrics transcription evaluation.
Toloka
Benchmark Dataset for Crowdsourced Audio Transcription
MaayanLab
Benchmarking enrichment analysis algorithms using transcription factor-gene and drug-gene libraries.
kalpalabs
CLI utility for benchmarking transcription models on Indic datasets
micdarau
Benchmark transcription APIs against real meeting audio. Measure WER, diarization, latency, and cost.
ehabmmoaty
Benchmark ASR models (Whisper, VibeVoice-ASR, Qwen3-ASR, XEUS, Azure Speech) for Arabic/English transcription — built for Anees AI companion
Bilel-Eljaamii
Benchmarking OpenAI Whisper models (tiny→turbo) for classical Arabic poetry transcription (Amr ibn Kulthum’s Mu'allaqat). Metrics: speed, accuracy, disk usage. Error analysis on diacritics (tashkeel) & archaic vocabulary. Includes Python scripts, dataset (audio samples), and visualizations. #ArabicNLP #ASR #Whisper
extrange
Speech to text model benchmarks
Maryland-State-Innovation-Team
A pipeline to construct state-specific audio transcription benchmarks
danielrosehill
A compilation of resources (model profiles, benchmarks, docs) for multimodal AI models with audio understanding (esp. focused on ASR and transcription use-cases)
amanmsiddiqui
An open-source cognitive benchmarking suite for N=1 biohacking. Features automated AI grading (Gemini 3.0), voice transcription (Whisper), and longitudinal data visualization.
jhu-sheridan-libraries
A comprehensive testing and benchmarking suite for Whisper speech recognition models, focusing on transcription and diarization performance. This project tests C++ and Python implementations to evaluate Whisper's capabilities across different scenarios.
Pandagan-85
ReAct-based AI agent using LangGraph for GAIA benchmark evaluation. Handles audio/video transcription, web search, file analysis, and complex reasoning chains. Achieves autonomous execution on real-world assistant tasks with 15+ specialized tools.
mting4life
Fall into a new career at Amphion! Enjoy working with THE BEST in the country (all US-based). NOW HIRING - Home-Based Employee Status Careers: FT/PT Speech Recognition Editors for days, nights and weekends. Multiple positions using our new technology, "Triton," incorporating M*Modal and Benchmark KB. Also recruiting for experienced eScription and iChart MTs. Requires at least two years of acute-care hospital and/or clinic (at least four specialties) medical transcription/medical editing experience in addition to any formal training. Proven ability to move between accounts with urgency and accuracy a must! Up to 90% of workload will involve voice recognition editing. Work schedule you design needs to include one weekend day or night. PC and high-speed cable-modem or DSL required; no dial-up or satellite, please. Guaranteed hourly rate for the 30 days of employment. Pay per line with production bonuses (65 VBC line) and quality incentives (AHDI Book of Style utilized). Evening and weekend pay differentials. Full-time benefits include Health, Life, Dental, Vision, Flexible Spending Account, 401k, Paid Time Off, CHDS/RHDS Credential Maintenance Reimbursement, Referral Bonuses, and Direct Deposit of Paychecks (on time!). Friendly, knowledgeable, technical support staff with daily feedback from an experienced transcription management team. If you're looking for a career, not just another job, get ready to shine bright at Amphion. We have stable, new, large accounts and we are looking for contributors to our success. If you have proven work independence and an excellent quality and production history, contact us today. Amphion is poised for great things in 2014 and we'd love to include you! Become a fan of Amphion on Facebook! Amphion Medical Solutions. . . Where plenty of work gets done -- but fun makes a regular appearance! On-line application and skills assessment available to you 24/7: http://amphionms.mttest.com Equal Opportunity Employer
nuhs-projects
Speech to text model benchmarks
k-rks
Benchmark Dataset for Crowdsourced Audio Transcription
Erdosity
Speech-To-Text (STT)/Transcription Services Benchmarking Tool
philip-brohan
Solve the auto-transcription benchmark 2 with Machine learning
omar-elamin
AI vendor eval platform — benchmark transcription vendors side-by-side
DevStrategist
Benchmark voice dictation apps — measure transcription latency with precise timing metrics
Files for the Bachelor thesis "Benchmarking deep learning tools for transcription factor prediction".
searchandrescuegg
Benchmarking small large language models for use with transcribe (transcription of fire dispatch calls)
ASRBench
A command-line tool for the ASRBench framework, simplifying audio transcription system benchmarking with a single config file, supporting popular and custom transcription systems
jensse
A Python script to benchmark Whisper transcription performance on varying hardware (CPU vs. GPU) using Norwegian audio files.
jv813yh
A comprehensive benchmarking project designed to evaluate and compare three different Retrieval-Augmented Generation (RAG) architectures using clinical medical transcription data.
renganathc
A fully reproducible and deterministic benchmark evaluating Vision Language Models on structured Dilbert comic strip transcription with strictly defined evaluation rules.
MehediHasan-ds
A high-performance, offline real-time speech-to-text system optimized for CPU only. This project benchmarks and compares Vosk and Whisper.cpp models for call-quality speech transcription with minimal latency.
gracee3
Small, practical toolkit for audio cleaning and batch transcription using WhisperX, with a simple benchmarking harness for testing ASR configurations and performance.
UmrbekAbdullayev
This project benchmarks multiple Uzbek speech-to-text (ASR) models on the same audio files using Hugging Face pipelines. It automatically loads each model, runs transcription, and saves the output into a `/results` folder.