Search Results

Found 4,225 repositories (showing 30)

MARS5-TTS

Camb-ai

💛75

MARS5 speech model (TTS) from CAMB.AI

2.8k

243

AGPL-3.0

Jupyter Notebook

Updated 4 days ago

prosody, speech, speech-synthesis, +3

Stable-Diffusion

FLUX, Stable Diffusion, SDXL, SD3, LoRA, Fine Tuning, DreamBooth, Training, Automatic1111, Forge WebUI, SwarmUI, DeepFake, TTS, Animation, Text To Video, Tutorials, Guides, Lectures, Courses, ComfyUI, Google Colab, RunPod, Kaggle, NoteBooks, ControlNet, TTS, Voice Cloning, AI, AI News, ML, ML News, News, Tech, Tech News, Kohya, Midjourney, RunPod

2.7k

364

GPL-3.0

JavaScript

Updated 5 hours ago

ai-art, coding, deepfake-generation, +17

Video-Materials-AutoGEN-Workstation

Norsico

🧡69

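A (short-video) generation workstation that combines content planning, automatic AI copywriting, batch automatic TTS voiceover, (AI) image asset compositing, automatic ASR extraction of spoken subtitle scripts, and free-form AI creation. Makes it easy to manage each episode's video project.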

1.4k

272

Python

Updated 1 day ago

VibeVoice-ComfyUI

Enemyx-net

🧡69

A comprehensive ComfyUI integration for Microsoft's VibeVoice text-to-speech model, enabling high-quality single and multi-speaker voice synthesis directly within your ComfyUI workflows.

1.4k

226

MIT

Python

Updated 7 hours ago

ai-audio, ai-tts, ai-voice, +13

Speech-AI-Forge

lenML

🧡63

🍦 Speech-AI-Forge is a project built around TTS generation models that implements an API server and a Gradio-based WebUI.

1.4k

185

AGPL-3.0

Python

Updated 21 hours ago

agent, asr, chattts, +17

langchain4j-aideepin

moyangzhan

💛74

AI-based productivity tools (chat, drawing, knowledge base/RAG, workflow, MCP marketplace, voice input and output via ASR and TTS, long-term memory, etc.)

1.2k

296

MIT

Java

Updated 2 days ago

ai-agent, ai-workflow, graphrag, +4

MouseTooltipTranslator

ttop32

🧡63

Mouseover Translate Any Language At Once - Chrome Extension: PDF Translator, EBOOK, EPUB, OCR, TTS, NETFLIX, YOUTUBE DUAL SUBTITLES, GOOGLE DOCS, AI, VIEWER, GMAIL, WRITING, IMAGE, DUAL SUBS, MANGA, HOVER, DICTIONARY, WEBTOON, EDGE, JAPANESE, ENGLISH

1.2k

164

MIT

JavaScript

Updated 8 hours ago

browser-extension, chrome, chrome-extension, +17

MOSS-TTS

OpenMOSS

🧡67

MOSS‑TTS Family is an open‑source speech and sound generation model family from MOSI.AI and the OpenMOSS team. It is designed for high‑fidelity, high‑expressiveness, and complex real‑world scenarios, covering stable long‑form speech, multi‑speaker dialogue, voice/character design, environmental sound effects, and real‑time streaming TTS.

1.1k

102

Apache-2.0

Python

Updated 36 minutes ago

audio, audio-tokenizer, llm, +3

vertex-ai-creative-studio

GoogleCloudPlatform

🧡69

GenMedia Creative Studio is a Vertex AI generative media user experience highlighting the use of Imagen, Veo, Gemini 🍌, Gemini TTS, Chirp 3, Lyria and other generative media APIs on Google Cloud.

1.0k

329

Apache-2.0

Jupyter Notebook

Updated 48 minutes ago

chirp, gemini, gemini-tts, +6

handcrafted-persona-engine

fagenorn

🧡62

An AI-powered interactive avatar engine using Live2D, LLM, ASR, TTS, and RVC. Ideal for VTubing, streaming, and virtual assistant applications.

1.0k

118

Updated 1 day ago

ai, ai-vtuber, ai-waifu, +3

Complete-Ecommerce-in-laravel-10

Prajwal100

🧡67

Complete E-commerce Website in Laravel 10 - Full-featured eCommerce solution with modern UI, admin panel, PayPal integration, and powered by NepVox AI (TTS, STT, TTI)

1.0k

563

MIT

Blade

Updated 1 week ago

advance-ecommerce-project, e-commerce, ecommerce, +8

openedai-speech

matatonic

💛72

An OpenAI API compatible text to speech server using Coqui AI's xtts_v2 and/or piper tts as the backend.

856

131

AGPL-3.0

Python

Updated 2 days ago

SmartJavaAI

geekwenjie

💛72

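🔥🔥🔥 A free, offline AI algorithm toolkit for Java that supports face recognition, liveness detection, facial expression recognition, object detection, instance segmentation, pedestrian detection, OCR text recognition, license plate recognition, table recognition, ASR+TTS, machine translation, and more; it is ready to use once added as a Maven dependency. Supports PyTorch and TensorFlow, and integrates mainstream models including Mtcnn, InsightFace, SeetaFace6, YOLOv8~v12, PaddleOCR (PPOCRv5), and Whisper.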

805

139

NOASSERTION

Java

Updated 26 minutes ago

android, asr, clip, +17

ChatGPT-Next-Web-Pro

vual

🧡67

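Based on chatgpt-next-web, this project adds Midjourney image generation, supports mj-plus AI face swapping and inpainting, integrates stable-diffusion, and supports OSS, FastGPT knowledge bases, Suno, and Luma. It supports multimodal models such as dall-e-3, gpt-4-vision-preview, whisper, and tts, as well as gpt-4-all and the GPTs store. A newly added version provides back-office administration, including login and registration, platform model API key management, plan management, and message saving.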

795

168

Updated 14 minutes ago

admin, annyun-ai, chatgpt, +12

JJYB_AI_VideoAutoCut

jianjieyiban

🧡67

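JJYB_AI 智剪 (Smart Cut) - an intelligent automatic video editing and AI narration tool (offline TTS, original narration, mashup editing, AI voiceover)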

786

154

HTML

Updated 9 hours ago

voice-ai

rapidaai

💛73

Rapida is an open-source, end-to-end voice AI orchestration platform for building real-time conversational voice agents with audio streaming, STT, TTS, VAD, multi-channel integration, agent state management, and observability.

709

191

NOASSERTION

Updated 1 day ago

agent-framework, ai-voice, ai-voice-agent, +17

ZerolanLiveRobot

AkagawaTsurunaki

💛71

AI VTuber with LLM, ASR, TTS, OCR, CV and more technologies to live stream or play Minecraft with you.

648

MIT

Python

Updated 10 hours ago

ai, ai-vtuber, asr, +8

speech-swift

soniqo

💛71

AI speech toolkit for Apple Silicon — ASR, TTS, speech-to-speech, VAD, and diarization powered by MLX and CoreML

549

Apache-2.0

Swift

Updated 5 hours ago

apple-silicon, asr, coreml, +13

Starmoon

StarmoonAI

💛71

A conversational, AI device + software framework for companionship, entertainment, education, healthcare, IoT applications, and DIY robotics. Built with Python, NextJS, Arduino, ESP32, LLMs (GPT-4o), Deepgram STT and Azure TTS 🤖

543

GPL-3.0

TypeScript

Updated 8 hours ago

esp32, gpt, iot, +6

alexandria-audiobook

Finrandojin

💛71

AI-powered multi-voice audiobook generator — LLM script annotation, voice cloning, voice design, LoRA training, per-line style control, and export to MP3, chaptered M4B, or Audacity multi-track. Built on Qwen3-TTS.

487

MIT

Python

Updated 3 hours ago

ai, audiobook, audiobook-generator, +16

RealtimeSTT_LLM_TTS

Ikaros-521

🧡66

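Real-time STT connected to the OpenAI API/Zhipu AI (streaming LLM) and GPT-SOVITS/Edge-TTS; services are called across networks through a web page to achieve real-time conversation.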

436

MIT

Python

Updated 3 days ago

llm, python, stt, +1

Kuebiko

adi-panda

🧡66

An AI Twitch TTS Chat Bot using GPT-3 and Google Cloud TTS

388

Python

Updated 3 days ago

Maix-Speech

sipeed

🧡51

Maix Speech AI lib, a fast and small speech lib running on embedded devices, including ASR, chat, TTS etc.

361

NOASSERTION

Python

Updated 1 month ago

asr, r329, riscv, +1

video-podcast-maker

Agents365-ai

💛71

AI-powered video podcast creation skill for coding agents. Supports Bilibili & YouTube, multi-language (zh-CN/en-US), 6 TTS engines (Edge/Azure/ElevenLabs/OpenAI/Doubao/CosyVoice), 4K Remotion rendering.

361

MIT

TypeScript

Updated 13 minutes ago

agent-skills, ai-video, bilibili, +14

LangHelper

NsLearning

🧡61

Striving to create a full-featured language-learning application powered by ChatGPT, TTS, STT, and other AI models. Supports conversation, speaking assessment, memorizing words in context, listening tests, and so on.

348

MIT

Rust

Updated 1 week ago

ai, asr, assessment, +10

ToolNeuron

Siddhesh2377

🧡66

On-device AI for Android — LLM chat (GGUF/llama.cpp), vision models (VLM), image generation (Stable Diffusion), tool calling, AI personas, RAG knowledge packs, TTS/STT. Fully offline, zero subscriptions, open-source.

324

Apache-2.0

Kotlin

Updated 1 hour ago

ai-personas, android, gguf-models, +13

ChatGPT-OpenAI-Smart-Speaker

Olney1

🧡61

This AI Smart Speaker uses speech recognition, TTS (text-to-speech), and STT (speech-to-text) to enable voice and vision-driven conversations, with additional web search capabilities via OpenAI and Langchain agents.

311

MIT

Python

Updated 3 weeks ago

agents, ai, artificial-intelligence, +14

jarvis

llm-guy

🧡61

Jarvis is a voice-activated, conversational AI assistant powered by a local LLM (Qwen via Ollama). It listens for a wake word, processes spoken commands using a local language model with LangChain, and responds out loud via TTS. It supports tool-calling for dynamic functions like checking the current time.

299

Python

Updated 8 hours ago

Vocalis

Lex-au

🧡56

Speech-to-speech AI assistant with natural conversation flow, mid-speech interruption, vision capabilities and AI-initiated follow-ups. Features low-latency audio streaming, dynamic visual feedback, and works with local LLM/TTS services via OpenAI-compatible endpoints.

294

Apache-2.0

TypeScript

Updated 1 week ago

artificial-intelligence, conversational-ai, speech-to-speech, +1

ai-devices

developersdigest

🧡66

AI Device Template Featuring Whisper, TTS, Groq, Llama3, OpenAI and more

293

MIT

TypeScript

Updated 5 days ago

function-calling, gpt-4-vision, groq, +9

GitHub Explorer

Search Results

MARS5-TTS

Stable-Diffusion

Video-Materials-AutoGEN-Workstation

VibeVoice-ComfyUI

Speech-AI-Forge

langchain4j-aideepin

MouseTooltipTranslator

MOSS-TTS

vertex-ai-creative-studio

handcrafted-persona-engine

Complete-Ecommerce-in-laravel-10

openedai-speech

SmartJavaAI

ChatGPT-Next-Web-Pro

JJYB_AI_VideoAutoCut

voice-ai

ZerolanLiveRobot

speech-swift

Starmoon

alexandria-audiobook

RealtimeSTT_LLM_TTS

Kuebiko

Maix-Speech

video-podcast-maker

LangHelper

ToolNeuron

ChatGPT-OpenAI-Smart-Speaker

jarvis

Vocalis

ai-devices