Search Results

Found 16,855 repositories (showing 30)

FunASR

modelscope

💚100

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

15.6k

1.6k

MIT

Python

Updated 9 hours ago

audio-visual-speech-recognition, conformer, dfsmn, +12

sherpa-onnx

k2-fsa

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, websocket server/client, support 12 programming languages

11.5k

1.3k

Apache-2.0

C++

Updated 49 minutes ago

aarch64, android, arm32, +17

silero-vad

snakers4

💛86

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

8.7k

759

MIT

Python

Updated 2 hours ago

onnx, onnx-runtime, onnxruntime, +9

ffsubsync

smacke

💛81

Automagically synchronize subtitles with video.

7.6k

316

MIT

Python

Updated 2 hours ago

alignment, audio, caption, +17

vaderSentiment

cjhutto

💛86

VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.

5.0k

1.1k

MIT

Python

Updated 1 day ago

faster-whisper-GUI

CheshireCC

💛75

faster_whisper GUI with PySide6

2.9k

168

AGPL-3.0

Python

Updated 1 day ago

asr, faster-whisper, openai, +5

ten-vad

TEN-framework

💛74

Voice Activity Detector (VAD) : low-latency, high-performance and lightweight

2.1k

163

NOASSERTION

Updated 6 hours ago

audio, automatic-speech-recognition, conversational-ai, +9

vad

ricky0123

💛75

Voice activity detector (VAD) for the browser with a simple API

1.9k

259

NOASSERTION

TypeScript

Updated 6 hours ago

onnxruntime, silero-vad, speech-to-text, +4

FluidAudio

FluidInference

🧡69

Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity detection, and speaker diarization. In Swift, powered by SOTA open source.

1.8k

243

Apache-2.0

Swift

Updated 5 hours ago

ane, asr, audio, +16

sherpa-ncnn

k2-fsa

💛74

Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn without Internet connection. Support iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A etc.

1.7k

211

Apache-2.0

C++

Updated 12 hours ago

asr, c, cpp, +7

WhisperJAV

meizhong986

💛73

ASR/STT subtitle generator. Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD. Noise-robust for JAV

1.4k

125

MIT

Python

Updated 32 minutes ago

aitranslate, hallucination, japanese, +10

VAD

hustvl

🧡68

[ICCV 2023 & ICLR 2026] VAD: Vectorized Scene Representation for Efficient Autonomous Driving

1.3k

152

Apache-2.0

Python

Updated 21 hours ago

autonomous-driving, end-to-end

VAD

jtkim-kaist

🧡68

Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.

869

233

MATLAB

Updated 3 days ago

acam, attention, bdnn, +9

auditok

amsehili

💛72

An audio/acoustic activity detection and audio segmentation tool

845

100

MIT

Python

Updated 1 day ago

audio-activities, audio-data, audio-segmentation, +3

voice-ai

rapidaai

💛73

Rapida is an open-source, end-to-end voice AI orchestration platform for building real-time conversational voice agents with audio streaming, STT, TTS, VAD, multi-channel integration, agent state management, and observability.

718

180

NOASSERTION

Updated 5 hours ago

agent-framework, ai-voice, ai-voice-agent, +17

vader.vim

junegunn

🧡56

A simple Vimscript test framework

597

Vim Script

Updated 3 days ago

libfvad

dpirch

💛72

Voice activity detection (VAD) library, based on WebRTC's VAD engine

588

191

BSD-3-Clause

Updated 3 days ago

speech-swift

soniqo

💛71

AI speech toolkit for Apple Silicon — ASR, TTS, speech-to-speech, VAD, and diarization powered by MLX and CoreML

572

Apache-2.0

Swift

Updated 54 minutes ago

apple-silicon, asr, coreml, +13

WhisperS2T

shashikg

🧡66

An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines

564

MIT

Jupyter Notebook

Updated 1 day ago

asr, deep-learning, speech-recognition, +6

api4sensevoice

0x5446

🧡66

API and websocket server for sensevoice. It has inherited some enhanced features, such as VAD detection, real-time streaming recognition, and speaker verification.

539

Python

Updated 16 hours ago

EventVAD

YihuaJerry

🧡61

[MM 2025] EventVAD: Training-Free Event-Aware Video Anomaly Detection

529

Python

Updated 3 hours ago

ICASSP-2023-24-Papers

DmitryRyumin

💛71

ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!

521

MIT

Python

Updated 2 days ago

asr, denoising, domain-adaptation, +17

android-vad

gkonovalov

💛71

Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.

482

MIT

Updated 2 days ago

android, audio-processing, deep-neural-networks, +17

VAD-python

marsbroshok

❤️37

Voice Activity Detector in Python

480

130

Python

Updated 3 months ago

vadnet

hcmlab

💛71

Real-time Voice Activity Detection in Noisy Environments using Deep Neural Networks

461

LGPL-3.0

Python

Updated 5 days ago

FireRedASR2S

FireRedTeam

🧡66

A SOTA Industrial-Grade All-in-One ASR system with ASR, VAD, LID, and Punc modules. FireRedASR2 supports Chinese (Mandarin, 20+ dialects/accents), English, code-switching, and both speech and singing ASR. FireRedVAD supports speech/singing/music in 100+ langs. FireRedLID supports 100+ langs and 20+ zh dialects. FireRedPunc supports zh and en.

461

Apache-2.0

Python

Updated 11 hours ago

asr, asr-pipeline, audio-event-classification, +15

RuntimeAudioImporter

gtreshchev

🧡61

Runtime Audio Importer plugin for Unreal Engine. Importing audio of various formats at runtime.

405

MIT

C++

Updated 3 weeks ago

audio, audio-converter, audio-files, +17

Stealthy-Kernelmode-Injector

charliewolfe

🧡66

Manual mapper that uses PTE manipulation, Virtual Address Descriptor (VAD) manipulation, and forceful memory allocation to hide executable pages. (VAD hide / NX bit swapping)

392

Updated 1 day ago

Bench2DriveZoo

Thinklab-SJTU

🧡56

BEVFormer, UniAD, VAD in Closed-Loop CARLA Evaluation with World Model RL Expert Think2Drive

377

NOASSERTION

Python

Updated 5 days ago

voice_activity_detection

filippogiruzzi

🧡61

Voice Activity Detection based on Deep Learning & TensorFlow

370

GPL-3.0

Python

Updated 2 weeks ago

artificial-intelligence, deep-learning, deep-neural-networks, +15
