Found 204 repositories (showing 30)
edwineas
Ubuntu Text Capture is a Python tool that captures a selected area of the screen, extracts text using Tesseract OCR, and copies it to the clipboard. It includes a customizable GNOME keyboard shortcut (Shift + Ctrl + T) for quick activation, making text extraction from images fast and easy.
This project offers an efficient method for identifying and recognizing handwritten text from images. Using a Convolutional Recurrent Neural Network (CRNN) for Optical Character Recognition (OCR), it effectively extracts text from images, aiding in the digitization of handwritten documents and automated text extraction.
mjawadshahid
Automate the extraction of key data fields from invoice images using YOLOv8 and OCR. Train custom models to detect fields like invoice ID, total amount, and address, then extract text and export to Excel. Ideal for streamlining data entry and reducing manual effort.
Jacky0111
Explore the world of Optical Character Recognition (OCR) with this beginner-friendly PaddleOCR tutorial. From installation to hands-on projects, this repository guides you through the essentials, making OCR accessible for beginners and intermediate users. Dive in and unlock the potential of text extraction from images using PaddleOCR
Mohsinrazaa
Qubitrics - ML Intern assignment (Text Extraction from image using OCR)
AbhishekMudaraddi
Extract text from PDFs using Google Vision API. This script converts PDF pages to images, preprocesses them for OCR accuracy, and uses Google Vision API for text extraction. It supports parallel processing for efficiency and saves extracted text in a structured format for each PDF.
varshhhy7
Parseo v1 is the first version of a FastAPI-based OCR microservice for text extraction from images and PDFs. Using Tesseract, Docker, and AWS deployment, it provides REST endpoints for scalable parsing. Includes automated testing via pre-commit hooks, making it reliable, developer-friendly, and ready for production.
josmarcristello
Python-based OCR tool using EasyOCR and OpenCV for automated text extraction from images. Customizable image preprocessing steps and options for GPU acceleration make this a versatile and efficient solution for various OCR tasks
Mrigank005
This Python script automates the extraction of text from images using Tesseract OCR. It processes all images in the test_images/ folder and saves the extracted text as .txt files in the extracted_texts/ directory, maintaining the original image filenames.
sonikumaramukions
This project demonstrates how to perform text detection from images using Python. It extracts and identifies text from images using powerful libraries like Tesseract OCR and OpenCV. This can be useful for building applications like document scanners, license plate readers, or data extraction tools.
Applying OCR to manually selected regions of interest (using mouse drag) for text extraction from images
01satria
Free Telegram OCR Bot using Google Apps Script. High-speed text extraction from images with zero hosting costs.
ceodaniyal
This repository contains a Python script to extract text from images using OpenAI's GPT-4 API. The script supports text extraction from both online image URLs and locally stored images (converted to base64). It ensures accurate and structured text extraction, making it a powerful tool for OCR-like tasks. The extracted text is saved to a file
arathikrishnaam
This project extracts medical data from images/PDFs using OCR, validates parameters against normal ranges, and generates reports in tabular and PDF formats. It includes text extraction, validation of health metrics, data visualization, and speech synthesis, automating the process of analyzing and reporting patient health data.
treeleafrnd
This Optical Character Recognition (OCR) task automates data extraction from printed or written text in a scanned document or image file and converts the text into machine-readable form. We created our own datasets for the OCR model. Dataset and output link: https://drive.google.com/drive/folders/10w_Gg4HGCKYTlCCCVXc98nmIWOKl-dhF?usp=sharing Steps of the task, for image alignment: First, images are read with the OpenCV library and resized to (600, 600) pixels. The resized image is converted to grayscale for edge detection, and a Canny edge detector is then used to find the edges. Most of the input images contain horizontal lines, so I apply a Hough transform to detect them and use the detected lines to calculate slopes. Because the algorithm detects many lines, some of which may be undesired, I use the median of the slopes, which limits the influence of those outliers. From that slope I calculate the angle of inclination and finally rotate the image by that angle.
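The median-slope step described above can be sketched as follows. This is a minimal illustration, not the repository's code: the function name `skew_angle_degrees` and the `(x1, y1, x2, y2)` segment format are assumptions, chosen to match what a probabilistic Hough transform typically returns. It uses only the Python standard library.

```python
import math
from statistics import median


def skew_angle_degrees(lines):
    """Estimate the skew angle from line segments (hypothetical helper).

    `lines` is an iterable of (x1, y1, x2, y2) tuples, assumed to come
    from a line detector such as a Hough transform.
    """
    slopes = []
    for x1, y1, x2, y2 in lines:
        if x2 == x1:
            continue  # vertical segment: slope is undefined, so skip it
        slopes.append((y2 - y1) / (x2 - x1))
    if not slopes:
        return 0.0  # nothing usable detected: assume no skew
    # The median ignores a minority of stray, steeply sloped segments.
    return math.degrees(math.atan(median(slopes)))


# Two nearly horizontal segments and one steep outlier.
segments = [(0, 0, 100, 10), (0, 50, 100, 60), (0, 0, 10, 100)]
angle = skew_angle_degrees(segments)
```

The resulting angle could then be passed to a rotation routine. The sign of the correction depends on the image coordinate convention, so it is worth checking on a known sample first.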
prathamj937
No description available
SuryaXanden
Text extraction from an image using Tesseract engine for OCR.
SwekeR-463
An OCR tool using Qwen2-VL for Hindi and English text extraction from images
rakshak-salve
Effortlessly extract text from images using OCR and deep learning. This toolkit offers a user-friendly web app and batch processing for fast, accurate text extraction.
hakankopal
This repository provides a Python code example for text extraction from images and PDFs using pytesseract. Use this code as a starting point for developing OCR applications.
DeleLinus
A Telegram bot that automates text extraction from images using OCR technology. Upload images with text, and the bot swiftly analyzes them. Seamlessly integrates with Google Sheets for easy data logging. Simplifies digitizing and organizing textual information.
ArashAzma
Advanced license plate recognition system using state-of-the-art object detection and OCR technologies. The pipeline integrates YOLOv11📸 for license plate detection, OpenCV for image preprocessing, and GOT-OCR2_0 from huggingface🤗 for text extraction.
arif05khan
OCR-Text-Scanner is an application for extracting data from images of Indian passports and driving licenses. Built with a Flask backend using Tesseract for OCR and regex for data extraction, and a React frontend with Tailwind CSS, it provides a streamlined, user-friendly interface for retrieving key document information accurately.
HBX814
This project automates text extraction from Hindi/Sanskrit PDFs using a pipeline that converts pages to images with pdf2image and applies OCR via Tesseract. It cleans and segments text based on Devanagari punctuation, preserving key elements like dates. The output, structured for NLP tasks, enhances accessibility to Indian language documents.
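Segmenting on Devanagari punctuation, as described above, might look like the sketch below. It is an assumption about the approach, not the project's actual code: the helper name `split_devanagari_sentences` is hypothetical, and the only delimiters considered are the danda (।) and double danda (॥). Since the Latin full stop is not a delimiter, dates such as 15.08.1947 stay in one piece.

```python
import re


def split_devanagari_sentences(text):
    """Split text after each danda or double danda (hypothetical helper)."""
    # The lookbehind keeps the punctuation attached to its sentence;
    # \s* swallows the whitespace that follows it.
    parts = re.split(r"(?<=[।॥])\s*", text)
    return [p.strip() for p in parts if p.strip()]


sample = "राम घर गया। सीता 15.08.1947 को आई।"
sentences = split_devanagari_sentences(sample)
```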
tanishra
DocQuery AI – An AI-powered RAG-based chatbot that lets you upload multiple medical PDFs and chat with them in natural language. It uses FAISS vector database, embeddings, and LLMs to provide accurate, context-aware answers. Supports text extraction from PDFs using PyPDF, and OCR via Tesseract for images inside PDFs.
Config files for my GitHub profile.
anjan-71
Text Extraction from Images using Tesseract OCR.
Chandradeep-0311
Extraction of text from invoice images using different OCR applications
0xZee
🧾 OCR AI Streamlit App – Text Extraction from Images using Groq Vision LLM
xcrap-dev
Xcrap Image Text Extractor is a package of the Xcrap framework that abstracts the extraction of texts from images using the node-tesseract-ocr library.