Found 2 repositories (showing 2)
ayushdh96
End-to-end speaker diarization and transcription system combining Faster-Whisper (ASR), Pyannote (VAD/segmentation), NVIDIA NeMo (speaker embeddings + clustering), and a CTC forced aligner for precise word-level timestamps. Includes a React + Vite frontend and a Flask + FastAPI backend for easy audio upload, diarized transcripts, and summaries.
ayushdh96
Repository for the diarization pipeline built on NeMo and Pyannote, intended for the website hosted on the Proxmox server.