Found 328 repositories (showing 30)
speechbrain
A PyTorch-based Speech Toolkit
speechbrain
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain, users can easily create speech processing systems, including speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
speechbrain
This repository contains the SpeechBrain Benchmarks
askrella
Transcription and TTS REST API (OpenAI Whisper, SpeechBrain)
speechbrain
Extensions to YAML syntax for better python interaction
ns2250225
A fast target speaker extraction (TSE) service based on SpeechBrain
guxm2021
[ISMIR 2022] Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription
guxm2021
[TOMM 2024] Automatic Lyric Transcription and Automatic Music Transcription from Multimodal Singing
maximus-choi
Real-time speaker diarization using straightforward, intuitive logic, with high accuracy thanks to SpeechBrain, Pyannote, and WeSpeaker models
jordicapde
StutterFormer is an AI model that takes a speech sample containing stuttering disfluencies and returns it with the disfluencies attenuated or removed.
sangramsingnk
Text-to-speech recipe: users can create speech signals from input text using text-to-speech (TTS), also referred to as speech synthesis. SpeechBrain supports popular TTS models, such as Tacotron 2, and vocoders, such as HiFi-GAN.
nuaazs
Backend of an anti-fraud system based on speaker identification (voiceprint) technology
benluks
Low-latency ASR using SpeechBrain StreamingASR and torchaudio StreamReader.
luomingshuang
In this repository, I combine k2 with SpeechBrain for accurate and fast decoding.
alumae
VoxLingua107 recipe for SpeechBrain
lucadellalib
Target speaker automatic speech recognition (TS-ASR)
JusperLee
Chinese-language documentation for SpeechBrain
yan-gao-GY
No description available
SELMA-project
audio, NLP, ML with huggingface, nvidia/nemo, speechbrain
OSU-slatelab
A recipe for disfluency detection on the LibriStutter dataset using SpeechBrain
caizexin
No description available
shahad-mahmud
Incremental learning for automatic speech recognition (ASR)
Processing EEG data using SpeechBrain-MOABB, with model tuning to get the best results
A Streamlit web app for speaker diarization and identification in audio files. Upload or record audio, transcribe conversations, and automatically segment and label speakers using reference samples. This app makes it easy to analyze multi-speaker audio, export transcripts, and identify "who spoke when" for meetings, interviews, and more.
amitpuri
Record voice, transcribe a prompt, generate an image from the prompt, create variations, get a description of a celebrity and upload it, plus other use cases on KB
aalto-speech
Implementation of different curriculum learning (CL) methods for SpeechBrain's ASR recipes
Hguimaraes
[Research] 2nd-place solution for Task 1 of the L3DAS21 challenge, using an FCN architecture and perceptual losses. Implemented with the SpeechBrain toolkit
progressionnetwork
Attempting to build a custom pipeline using 100k hours of Russian speech data, leveraging Wav2Vec2 and speechbrain/spkrec-ecapa-voxceleb for embedding extraction, combined with non-standard clustering approaches.
lgpearson1771
Train custom wake word models with openWakeWord. A granular 13-step pipeline with compatibility patches for torchaudio 2.10+, Piper TTS, and SpeechBrain. Generates tiny ONNX models (~200 KB) for real-time keyword detection, like building your own "Hey Siri" trigger. WSL2/Linux + CUDA required.
Audio source separation with a Whisper/ECAPA-TDNN speaker counter and the pre-trained speechbrain/sepformer-libri3mix and speechbrain/sepformer-wsj02mix models for speech separation, implemented with SpeechBrain.