Search Results

Found 428 repositories (showing 30)

pytorch-kaldi

mravanelli

🧡62

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

2.4k

445

Python

Updated 1 week ago

asr, deep-learning, deep-neural-networks, +14

denoiser

facebookresearch

💛70

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model working on the raw waveform that runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.

1.9k

318

NOASSERTION

Python

Updated 3 days ago

SONAR

facebookresearch

🧡67

SONAR, a new multilingual and multimodal fixed-size sentence embedding space, with a full suite of speech and text encoders and decoders.

881

100

NOASSERTION

Python

Updated 5 days ago

dsd

szechyjs

🧡69

Digital Speech Decoder

773

324

NOASSERTION

C++

Updated 3 days ago

c, c-plus-plus, dsd

pyctcdecode

kensho-technologies

❤️41

A fast and lightweight python-based CTC beam search decoder for speech recognition.

469

101

Apache-2.0

Python

Updated 1 month ago

brainmagick

facebookresearch

🧡66

Training and evaluation pipeline for MEG and EEG brain signal encoding and decoding using deep learning. Code for our paper "Decoding speech perception from non-invasive brain recordings" published in Nature Machine Intelligence, 2023.

463

NOASSERTION

Python

Updated 5 days ago

kaldi-active-grammar

daanzu

🧡61

Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time

346

AGPL-3.0

Python

Updated 16 hours ago

coding, command-and-control, dictation, +11

dsdcc

f4exb

🧡56

Digital Speech Decoder (DSD) rewritten as a C++ library

322

C++

Updated 2 weeks ago

tfkaldi

vrenkens

❤️41

Speech recognition software where the neural net is trained with TensorFlow and GMM training and decoding is done in Kaldi

173

MIT

Python

Updated 5 months ago

gr-dsd

argilo

❤️36

GNU Radio block for Digital Speech Decoder

148

GPL-3.0

C++

Updated 3 months ago

hacktoberfest

neural_speech_decoding

flinkerlab

❤️40

No description available

114

GPL-3.0

Jupyter Notebook

Updated 1 month ago

NAST-S2x

ictnlp

❤️45

A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup.

Python

Updated 1 month ago

non-autoregressive, non-autoregressive-transformers, simultaneous-translation, +2

multimodal-decoding

UCSF-Chang-Lab-BRAVO

🧡50

Code associated with the paper titled "A high-performance neuroprosthesis for speech decoding and avatar control", published in Nature in 2023.

Jupyter Notebook

Updated 3 weeks ago

AliParaformerAsr

manyeyes

🧡55

C# library for decoding Paraformer and SenseVoice models, used in speech recognition (ASR)

Apache-2.0

Updated 1 week ago

speechy

chrisenytc

❤️40

A speech recognition API service to decode audio to text

MIT

JavaScript

Updated 2 years ago

Speech2Face

ravising-h

🧡60

Image Processing, Speech Processing, Encoder Decoder, Research Paper implementation

GPL-3.0

Jupyter Notebook

Updated 3 weeks ago

encoder-decoder, face-detection, face-normalization, +6

juicer

idiap

❤️40

Juicer is a Weighted Finite State Transducer (WFST) based decoder for Automatic Speech Recognition (ASR).

NOASSERTION

C++

Updated 1 year ago

BangalASR

menon92

🧡60

Transformer based Bangla Speech Recognition | Encoder Decoder Architecture

MIT

Jupyter Notebook

Updated 3 weeks ago

attention-is-all-you-need, bangla, bangla-asr, +6

ecog2txt

jgmakin

❤️45

code for decoding speech as text from neural data

Python

Updated 1 month ago

alex-asr

UFAL-DSG

❤️40

Online decoder for Kaldi NNET2 and GMM speech recognition models with Python bindings.

NOASSERTION

Python

Updated 3 years ago

D2Former

alibabasglab

❤️35

This repository contains the audio samples for "D2Former: A Fully Complex Dual-Path Dual-Decoder Conformer Network using Joint Complex Masking and Complex Spectral Mapping for Monaural Speech Enhancement" which is submitted to ICASSP 2023.

MIT

Python

Updated 3 months ago

Stable-Hybrid-Auditory-Filterbanks

felixperfler

❤️40

[Interspeech 2024] Hold Me Tight: Stable Encoder-Decoder Design for Speech Enhancement

BSD-3-Clause-Clear

Python

Updated 5 months ago

pytorch_MLP_for_ASR

mravanelli

❤️35

This code implements a basic MLP for speech recognition. The MLP is trained with pytorch, while feature extraction, alignments, and decoding are performed with Kaldi. The current implementation supports dropout and batch normalization. An example for phoneme recognition using the standard TIMIT dataset is provided.

Perl

Updated 3 months ago

asr, cuda, deep-learning, +11

NeuSpeech1

NeuSpeech

🧡55

Decode Neural signal as Speech

Apache-2.0

Python

Updated 1 week ago

theano-kaldi-rnn

mravanelli

❤️35

THEANO-KALDI-RNNs is a project implementing various Recurrent Neural Networks (RNNs) for RNN-HMM speech recognition. The Theano Code is coupled with the Kaldi decoder.

Perl

Updated 5 months ago

deep-learning, deep-neural-networks, gated-recurrent-units, +7

Chisco

zhangzihan-is-good

❤️45

We constructed an EEG dataset based on imagined speech and performed semantic decoding on it.

MIT

Python

Updated 2 months ago

Singing-Vocal-Beat-Tracking

mjhydri

❤️40

This repo contains the source code of the first deep learning-based singing voice beat tracking system. It leverages WavLM and DistilHuBERT pre-trained speech models to create vocal embeddings and trains linear multi-head self-attention layers on top of them to extract vocal beat activations. Then, it uses an HMM decoder to infer singing beats and tempo.

MIT

Python

Updated 9 months ago

beat-tracking, hubert, linear-transformer, +5

silk-sdk

laysent

❤️30

node-gyp version of Silk Speech Codec, able to decode/encode audio from/to silk format (widely used by Tencent apps, such as WeChat/WeiXin, QQ)

MIT

Updated 1 year ago

silk, silk-v3

llama-mimi

llm-jp

🧡60

Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequences of interleaved semantic and acoustic tokens.

BSD-3-Clause

Python

Updated 2 weeks ago

fesde

lee-jhwn

❤️45

Toward Fully-End-to-End Listened Speech Decoding from EEG Signals (Interspeech 2024)

Python

Updated 1 month ago
