Search Results

Found 73,132 repositories (showing 30)

whisper.cpp

ggml-org

💚100

Port of OpenAI's Whisper model in C/C++

48.3k

5.4k

MIT

C++

Updated 12 minutes ago

inference, openai, speech-recognition, +3

TTS

coqui-ai

💚100

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

45.0k

6.0k

MPL-2.0

Python

Updated 31 minutes ago

deep-learning, glow-tts, hifigan, +16

DeepSpeech

mozilla

💚100

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

26.8k

4.1k

MPL-2.0

C++

Updated 14 minutes ago

deep-learning, deepspeech, embedded, +7

faster-whisper

SYSTRAN

💚95

Faster Whisper transcription with CTranslate2

21.9k

1.8k

MIT

Python

Updated 13 minutes ago

deep-learning, inference, openai, +5

whisperX

m-bain

💚95

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

21.1k

2.2k

BSD-2-Clause

Python

Updated 1 hour ago

asr, speech, speech-recognition, +2

index-tts

💚100

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

19.8k

2.4k

NOASSERTION

Python

Updated 2 hours ago

bigvgan, cross-lingual, indextts, +4

Handy

cjpais

💚100

A free, open source, and extensible speech-to-text application that works completely offline.

19.1k

1.5k

MIT

Rust

Updated 9 minutes ago

accessibility, cross-platform, speech-to-text, +1

leon

leon-ai

💚94

🧠 Leon is your open-source personal assistant.

17.1k

1.4k

MIT

TypeScript

Updated 13 minutes ago

ai, ai-assistant, artificial-intelligence, +17

NeMo

NVIDIA-NeMo

💚100

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

17.0k

3.4k

Apache-2.0

Python

Updated 2 hours ago

asr, deeplearning, generative-ai, +7

pyvideotrans

jianchang512

💚95

Translate the video from one language to another and embed dubbing & subtitles.

16.8k

2.0k

GPL-3.0

Python

Updated 27 minutes ago

speech-to-text, text-to-speech, video-transition

FunASR

modelscope

💚100

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

15.5k

1.6k

MIT

Python

Updated 2 hours ago

audio-visual-speech-recognition, conformer, dfsmn, +12

kaldi

kaldi-asr

💚100

kaldi-asr/kaldi is the official location of the Kaldi project.

15.4k

5.4k

NOASSERTION

Shell

Updated 3 hours ago

c-plus-plus, cuda, kaldi, +6

vosk-api

alphacep

💚94

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

14.5k

1.7k

Apache-2.0

Jupyter Notebook

Updated 3 hours ago

android, asr, deep-learning, +17

PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

12.6k

2.0k

Apache-2.0

Python

Updated 5 hours ago

asr, code-switch, conformer, +17

speechbrain

💚91

A PyTorch-based Speech Toolkit

11.4k

1.7k

Apache-2.0

Python

Updated 1 hour ago

asr, audio, audio-processing, +17

sherpa-onnx

k2-fsa

💛89

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, websocket server/client, support 12 programming languages

11.3k

1.3k

Apache-2.0

C++

Updated 4 hours ago

aarch64, android, arm32, +17

meetily

Zackriya-Solutions

💚91

Privacy first, AI meeting assistant with 4x faster Parakeet/Whisper live transcription, speaker diarization, and Ollama summarization built on Rust. 100% local processing. no cloud required. Meetily (Meetly Ai - https://meetily.ai) is the #1 Self-hosted, Open-source Ai meeting note taker for macOS & Windows.

10.9k

1.0k

MIT

Rust

Updated 25 minutes ago

ai, ai-meeting-assistant, llm, +16

piper

rhasspy

💚90

A fast, local neural text to speech system

10.8k

941

MIT

C++

Updated 36 minutes ago

speech-synthesis, text-to-speech, tts

edge-tts

rany2

💚90

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

10.5k

986

NOASSERTION

Python

Updated 34 minutes ago

speech-synthesis, text-to-speech, tts

TTS

mozilla

💚93

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

10.1k

1.3k

MPL-2.0

Jupyter Notebook

Updated 14 hours ago

dataset-analysis, deep-learning, gantts, +13

WhisperLiveKit

QuentinFuxa

💚90

Simultaneous speech-to-text models

10.0k

1.0k

Apache-2.0

Python

Updated 3 hours ago

RealtimeSTT

KoljaB

💛88

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.

9.6k

830

MIT

Python

Updated 4 hours ago

python, realtime, speech-to-text

speech_recognition

Uberi

💚94

Speech recognition module for Python, supporting several engines and APIs, online and offline.

9.0k

2.4k

BSD-3-Clause

Python

Updated 1 day ago

audio, python, speech-recognition, +1

VoiceCraft

jasonppy

💛86

Zero-Shot Speech Editing and Text-to-Speech in the Wild

8.5k

797

NOASSERTION

Jupyter Notebook

Updated 12 hours ago

ASRT_SpeechRecognition

nl8590687

💚93

A Deep-Learning-Based Chinese Speech Recognition System

8.4k

1.9k

GPL-3.0

Python

Updated 5 hours ago

asrt, chinese-speech-recognition, cnn, +7

SenseVoice

FunAudioLLM

💛80

Multilingual Voice Understanding Model

7.9k

716

NOASSERTION

Python

Updated 55 minutes ago

ai, aigc, asr, +10

vits

jaywalnut310

💚92

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

7.8k

1.4k

MIT

Python

Updated 12 hours ago

deep-learning, pytorch, speech-synthesis, +2

ChatTTS-ui

jianchang512

💛87

A simple local web interface that uses ChatTTS to synthesize text into speech and can also expose an API for external use.

7.5k

906

NOASSERTION

Python

Updated 4 hours ago

chattts, tts

MeloTTS

myshell-ai

💛88

High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.

7.3k

1.0k

MIT

Python

Updated 31 minutes ago

chinese, english, french, +6

Zonos

Zyphra

💛85

Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS providers.

7.2k

821

Apache-2.0

Python

Updated 7 hours ago
