Found 2,447 repositories (showing 30)
OpenBMB
A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone
google-gemini
A react-based starter app for using the Live API over websockets with Gemini
Intent-Lab
Real-time AI assistant for Meta Ray-Ban smart glasses -- voice + vision + agentic actions via Gemini Live and OpenClaw
kizuna-ai-lab
Live speech translation powered by on-device AI and cloud providers: OpenAI, Google Gemini, Palabra.ai, Kizuna AI, Volcengine, and more
cruzyjapan
A responsive web-based UI that provides an intuitive interface for Google's Gemini CLI, enabling AI-assisted coding from any device. Features include interactive chat, integrated terminal, file explorer with live editing, Git integration, and session management.
Immergo is an immersive language learning application powered by the Google Gemini Live SDK. It simulates real-world roleplay scenarios (e.g., buying a bus ticket, ordering coffee) to help users practice speaking in various languages with an AI that acts as a native speaker.
ViaAnthroposBenevolentia
Vanilla JS web interface for Gemini 2.0 flash-exp Multimodal API with text, audio, camera, screen inputs and audio responses and function calling
PleasePrompto
Control Claude Code, Codex CLI and Gemini CLI from Telegram. Live streaming, persistent memory, cron jobs, webhooks, Docker sandboxing.
heiko-hotz
A developer guide for Gemini's Multimodal Live API
pipecat-ai
Chat Application Starter Kit - Gemini Multimodal Live API + Pipecat
danilobatson
AI Trading Agent that transforms social media sentiment into actionable trading signals using LunarCrush analytics and Google Gemini AI. Features real-time progress tracking, background job processing with Inngest, and live dashboard updates via Supabase subscriptions. Built with Next.js 15, TypeScript, and modern AI integration patterns.
pipecat-ai
Gemini Multimodal Live + WebRTC in a single `app.ts`
ykdojo
macOS voice assistant with global hotkeys - transcribe speech to text with offline models (WhisperKit or Parakeet) or cloud-based Gemini API, capture and transcribe screen recordings with visual context, and read selected text aloud with Gemini Live.
gunpal5
Most complete C# .Net SDK for Google Generative AI and Vertex AI (Google Gemini), featuring function calling, easiest JSON Mode, multi-modal live streaming, chat sessions, and more!
google-gemini
Gemini Live provides multimodal realtime agent capabilities. Build voice agents that can process vision and text in realtime.
smnandre
Symfony UX skills for Claude, Gemini, Codex, ... Live Component, Twig Component, Turbo, Stimulus
joryeugene
Edit database tables like Vim buffers. Staged mutations + live SQL preview, transaction undo, schema browser + ER diagrams + DDL, FK navigation, cross-database federation, data profiling, SQL notebooks, AI SQL via Anthropic/OpenAI/Gemini/Ollama, Parquet/CSV/remote files. PostgreSQL · SQLite · MySQL · DuckDB · MotherDuck
alandaitch
Real-time AI fact-checker for YouTube videos and live streams. Uses Gemini 2.0 Flash with Google Search grounding.
Intent-Lab
Real-time transcription and AI assistant for Meta Ray-Ban smart glasses. Live speech-to-text, speaker diarization, Gemini Live vision+voice, and WebRTC streaming.
lalomorales22
play mp4 files through terminal, chat with claude, grok, chatgpt, and gemini, live stream via web cam, 3d visualizations in CLI
yeyu2
Gemini Multimodal Live App with Next.js Framework, welcome to my YouTube channel for more interesting projects.
coding-by-feng
Real-time dashboard that turns AI coding agent sessions (Claude Code, Gemini CLI, Codex) into animated 3D robots, with live terminals, prompt history, tool logs, and queuing. Runs on any device.
aicc2025
Turn any SIP call into a realtime AI voice agent (OpenAI Realtime / Deepgram/Gemini Live)
dylanpersonguy
Polymarket Trading Bot - Autonomous AI prediction market bot with multi-model ensemble forecasting (GPT-4o, Claude, Gemini), automated research engine, 15+ risk checks, whale tracking, fractional Kelly sizing, and real-time 9-tab monitoring dashboard. Paper & live trading. Open source.
rayl15
Open-source iOS app connecting Meta Ray-Ban smart glasses to AI assistants (OpenClaw + Gemini Live)
daily-co
A demo using Gemini Live where you describe a word and your AI partner tries to guess it
ahmad2b
A real-time voice/call AI agent that lets you talk to a LangGraph agent over LiveKit, similar to "voice mode" experiences in ChatGPT Voice, OpenAI Realtime API sessions, and Gemini Live. This repo demonstrates adapting any LangGraph agent into a full-duplex, low-latency voice assistant using LiveKit Agents.
mohdhd
AI-powered project spec generator: go from idea to implementation-ready spec in minutes. Multi-model support (GPT-5.2, Gemini 3, Claude), live design previews, voice input, and export to markdown.
arii
A real-time fitness monitoring dashboard that streams live heart rate data from Bluetooth devices to a multi-client web interface, with Spotify playback control and a Tabata interval timer. Developed by a fully automated CI/CD system using Gemini AI and Jules.
addyosmani
JARVIS built using the Gemini Live API