Found 4,965 repositories (showing 30)
Blaizzy
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
coqui-ai
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
meizhong986
ASR/STT subtitle generator. Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD. Noise-robust for JAV
rafaelpadilla
Object Detection Metrics. 14 object detection metrics: mean Average Precision (mAP), Average Recall (AR), Spatio-Temporal Tube Average Precision (STT-AP). This project supports different bounding box formats as in COCO, PASCAL, Imagenet, etc.
ARahim3
Fine-tune LLMs on your Mac with Apple Silicon. SFT, DPO, GRPO, Vision, TTS, STT, Embedding, and OCR fine-tuning — natively on MLX. Unsloth-compatible API.
Prajwal100
Complete E-commerce Website in Laravel 10 - Full-featured eCommerce solution with modern UI, admin panel, PayPal integration, and powered by NepVox AI (TTS, STT, TTI)
snakers4
Open STT
lobehub
🎤 Lobe TTS - A high-quality & reliable TTS/STT library for Server and Browser
VRCWizard
Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) (VTuber TTS)
rapidaai
Rapida is an open-source, end-to-end voice AI orchestration platform for building real-time conversational voice agents with audio streaming, STT, TTS, VAD, multi-channel integration, agent state management, and observability.
evancohen
💬 /so.nus/ STT (speech to text) for Node with offline hotword detection
inket
A simple macOS app for monitoring the status of cloud services
StarmoonAI
A conversational, AI device + software framework for companionship, entertainment, education, healthcare, IoT applications, and DIY robotics. Built with Python, NextJS, Arduino, ESP32, LLMs (GPT-4o), Deepgram STT and Azure TTS 🤖
A list of Korean speech recognition (STT) APIs, with performance benchmarks for each.
toverainc
Open source, local, and self-hosted highly optimized language inference server supporting ASR/STT, TTS, and LLM across WebRTC, REST, and WS
ShayneP
Local voice AI powered by Ollama, Kokoro, Nemotron STT, and LiveKit.
Ikaros-521
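Real-time STT that connects to the OpenAI API / Zhipu AI (streaming LLM) and GPT-SOVITS / Edge-TTS, making cross-network service calls through a web page to achieve real-time conversation.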
NsLearning
Striving to create a full-featured language-learning application powered by ChatGPT, TTS, STT and other AI models. Supports conversation, speaking assessment, memorizing words in context, listening tests, and so on.
Siddhesh2377
On-device AI for Android — LLM chat (GGUF/llama.cpp), vision models (VLM), image generation (Stable Diffusion), tool calling, AI personas, RAG knowledge packs, TTS/STT. Fully offline, zero subscriptions, open-source.
This AI Smart Speaker uses speech recognition, TTS (text-to-speech), and STT (speech-to-text) to enable voice and vision-driven conversations, with additional web search capabilities via OpenAI and Langchain agents.
twelvet-projects
(Spring Boot 3.X microservices framework) A microservices framework based on Spring Boot 3.X with Spring Cloud Alibaba / Spring Cloud Tencent + React. 🔝 🔝 Star the repo to follow updates. ChatGPT (RAG, TTS, STT, LLM)
JasonJarvan
An open-source AI interview assistant that uses the OpenAI Whisper model for STT (speech-to-text) transcription, then passes the questions to ChatGPT to answer.
gaborvecsei
Live-Transcription (STT) with Whisper PoC
disler
Fast STT, LLM, and TTS for personal AI assistants using OpenAI, Groq, AssemblyAI and ElevenLabs.
Elleo
NOTE: This plugin is now deprecated in favour of the coqui-stt branch in gst-plugins-bad: https://gitlab.freedesktop.org/philn/gstreamer/-/tree/coqui-stt/subprojects/gst-plugins-bad/ext/coqui
proj-airi
🎤💬 Full example of implementing ChatGPT's realtime voice from scratch with VAD + STT + LLM + TTS technology stack within almost one file!
coqui-ai
Open models for Coqui STT
dialogflow
A best practice for streaming audio from a browser microphone to Dialogflow or Google Cloud STT by using websockets.
MycroftAI
RETIRED - OpenSTT is now retired. If you would like more information on Mycroft AI's open source STT projects, please visit:
coqui-ai
🐸STT integration examples