Found 212 repositories (showing 30)
th1nhhdk
A local, offline (after initial setup), portable OCR tool that processes images and PDF files using DeepSeek-OCR AI, running directly on your machine.
stirling-image
Stirling-PDF but for images. 30+ tools and local AI in a single Docker container - resize, compress, remove backgrounds, upscale, OCR, and more. No cloud, no telemetry. Your images never leave your machine.
SethRobinson
Live AI-powered screen translation via LLMs & GPU OCR. 26 languages, manga support, PDF/CBZ conversion, audio reading, cloud or fully local operation. For gamers, manga readers, and language learners.
illegal-instruction-co
Semantic search for your local files: find by meaning, not keywords. 120+ file types, OCR, MCP server for AI agents. 100% private.
maniotrix
Vision-first AI agent for desktop automation. Fully offline. Powered by YOLO, OCR & ResNet — building towards local intelligence.
AcoranGonzalezMoray
⛓️💥 VirtualPet is an all-in-one app that integrates features like AI-powered interactive offline chat, image generation, OCR capabilities, an AI-powered code editor, a built-in web browser, a mini video player, shortcut management, and access to public anime content for local viewing and much more, all within a seamless user interface.
bendusy
Full local AI inference stack on Apple Silicon via MLX — LLM, ASR, Embedding, OCR, TTS, Transcription
alexandertaboriskiy
NavixMind is a local-first AI agent for Android (evt. iOS) that runs a ReAct reasoning loop powered by the Claude API. It can execute Python code in a sandboxed environment, process video/audio with FFmpeg, perform OCR via ML Kit, fetch and parse web pages, read/create PDFs, and integrate with Calendar/Gmail, all from a mobile chat interface.
SharanyaAchanta
LexTransition AI is an open-source, offline-first legal assistant. It helps users navigate the transition from old Indian laws (IPC/CrPC/IEA) to the new **BNS/BNSS/BSA** frameworks. Using local Machine Learning and OCR, it analyzes legal documents and maps law sections with 100% grounded accuracy.
Model Context Protocol (MCP) server that enables AI assistants to analyze images using xAI's Grok vision API. Supports URL and local file processing with OCR capabilities.
delta-dash
This project is a comprehensive, AI-powered media server designed to manage, search, and interact with your local media library. It uses a FastAPI backend, a database for storing metadata, and various AI models for automatic tagging, OCR, and more.
AndyCG03
AI API Service - REST API built with FastAPI for local AI services: text generation, audio transcription, embeddings, and OCR. Optimized for low-resource environments, with API key authentication and granular permission control. Runs locally with no cloud dependency.
MKarthik730
A privacy-first AI desktop assistant with a glassmorphic floating UI, local LLM (Qwen2.5 3B), OCR, voice control, and deep system integration — built with PyQt6.
Offline-first, multimodal AI agent for Windows 11 + WSL2 — integrates voice, vision, OCR, RAG search, and terminal control into one secure local brain. Powered by Ollama (Qwen 7B), Whisper, Piper TTS, and FAISS, it lets your PC listen, see, read, and act — all privately, offline, and auditable.
KDhiraj152
Local-first AI platform for education, research, and complex problem-solving. 8-model pipeline (Qwen, IndicTrans2, Whisper V3, GOT-OCR, BGE-M3) with hardware-adaptive inference (MLX/CoreML/MPS/CUDA), RAG search, multi-tier caching. Handles math, coding, academic research — multilingual with 10 Indian languages. Safe without being restricted.
flatmarstheory
Detect books via webcam, extract cover text using OCR, and generate AI-powered summaries with a local LLM. Real-time GUI built with OpenCV, Tkinter, and Ollama.
Abhiz2411
🌟 AI-powered web app for live problem-solving! Features camera capture, OCR, and AI solutions with cost-efficient local processing. Built with Next.js, TypeScript, TailwindCSS, Tesseract.js, and OpenAI GPT-4. 🚀 Perfect for students and educators!
Luka12-dev
Photo-To-Text is a powerful OCR & AI image-to-text converter. Extracts text using Tesseract or generates descriptions with local BLIP AI. Supports Desktop GUI/CLI and Web interface. Fast, modern, dark-themed, and optimized for both lightweight and heavy systems.
queai-project
Centralize access to **local**, **free** and **open source** Artificial Intelligence solutions, allowing any person or team to install, run and combine AI modules (Chat, RAG, STT, TTS, OCR, etc.) on their own computer without depending on the cloud.
jonaskern-dev
Intelligent PDF document processing for macOS with OCR and AI-powered classification. Automatically extracts, analyzes, and organizes scanned documents using local Ollama models (granite3.2-vision, granite3.3). Drag & drop interface, Finder integration, smart file naming. Privacy-first: all processing runs locally.
mizuharaa
SeroAI is a fast, local-first deepfake detection app (Flask + React/Vite) that fuses multiple forensic signals—watermark OCR, temporal/optical flow, rPPG, artifacts, face dynamics, and scene logic—into calibrated verdicts for videos and images. Capable of detecting Deepfake usage in the latest AI models like SoraAI, Veo, etc.
Treast
A privacy-focused CLI tool that automates OCR and local AI processing to intelligently organize and archive documents.
ewraj
Local AI-powered file organizer that understands file meaning using embeddings, OCR, and vision models. Runs fully offline.
Crimsab
PatentHub is a self-hosted platform for analyzing patents and scientific documents. It combines global search with a local RAG system to index and chat with PDFs, generate AI-powered technical summaries, and manage documents with OCR and local storage.
AsutoshPati
Turn boring text into stunning visuals with this Python pipeline. Artikle extracts text from images using OCR, summarizes it with AI (OpenAI, Gemini, or Ollama), and generates beautiful illustrations through DALL-E or local diffusion models, inspired by Harry Potter's magical newspapers! #image-generation #visualization #ocr
Yolazega
The LocalTax AI Assistant is a fully local, privacy-first tax automation pipeline for freelancers and small business owners who want maximum control over their sensitive tax data. The system combines local OCR parsing, an AI-powered conversational logic layer, smart matching with bank statements, and automatic form-filling for tax software (e.g. WI
Jeeva-V-2003
AI-powered web scraping pipeline that extracts product data from any website using HTML parsing + OCR, cleans it with local LLMs (via LM Studio), intelligently merges variants, and exports a structured Excel file — fully automated and offline-ready.
bradrame
This project uses a fast-reasoning local LLM to make computer navigation decisions based on fast, optimized OCR (EasyOCR) results. This project has the potential to rival ChatGPT's Operator, Playwright, and Selenium.
krasimirkostadinov
An offline AI-powered KYC document processing system that extracts structured data from Bulgarian ID cards using Tesseract OCR and local LLMs (Mistral 7B via Ollama). Built with React/TypeScript frontend and Node.js backend, featuring Docker containerization for easy deployment.
AIchat - 12GB VRAM - Agentic RAG with GOT-OCR2.0