Found 495 repositories(showing 30)
robela
a console application that would run on Windows server to scan user’s Bill and Receipts, which are either captured by camera or in form of an electronic file like pdf etc. 1. All the invoices/receipts will be uploaded on server in a folder 2. The uploaded invoices/receipts will be scanned by OCR app and extract following information from the file and put them in database table - Vendor/Party Name - Invoice date - Tax amount - Total amount - Line items(Item Name, Item Qty, Item rate, Item Tax & Item Amount) 3. The processing of OCR should be done with 90% of accuracy 4. Application designed be able to handle the noise & quality of the uploaded invoice images.
Keyvanhardani
German-OCR is specifically trained to extract text from German documents including invoices, receipts, forms, and other business documents.
mayaramyadav
Laravel OCR & Document Data Extractor – A powerful OCR and document parsing engine for Laravel. It provides intelligent text extraction, structured data parsing, and AI-powered cleanup for documents like invoices, receipts, and PDFs.
ARUN-S-CODER
Extract tables from invoice images, process text using OCR, extract entities and relationships using LLM and traditional methods, and construct a visual knowledge graph.
dinispeixoto
System that uses OCR (Optical Character Recognition) to extract data from invoice photos (e.g. products, supermarket, prices), and displays it in a dashboard.
agrawalamod
Image Analysis Project: Built a system to take images of receipts or invoices as input and segment the text on the receipt. Using Tesseract OCR, we extracted the text from the receipts and stored it in a file. Future work: Use NLP libraries to get structured data.
mjawadshahid
Automate the extraction of key data fields from invoice images using YOLOv8 and OCR. Train custom models to detect fields like invoice ID, total amount, and address, then extract text and export to Excel. Ideal for streamlining data entry and reducing manual effort.
agrawalamod
Ported OpevCV version to Android. Image Analysis Project: Built a system to take images of receipts or invoices as input and segment the text on the receipt. Using Tesseract OCR, we extracted the text from the receipts and stored it in a file.
Bliitze
Optical character recognition (OCR) allows you to extract printed or handwritten text from images, such as photos of street signs and products, as well as from documents—invoices, bills, financial reports, articles, and more. Microsoft's OCR technologies support extracting printed text in several languages.
Extracting relevant information like invoice number, date, amount etc. from PDF files using OCR and NLP techniques
giruu
This Flask application empowers users to seamlessly upload image files like invoices or receipts, extract text using robust OCR technologies, and efficiently isolate key fields using precise regular expressions and multiprocessing to streamline data extraction and enhance productivity.
hrushikesh009
A TensorFlow OCR solution,Leveraging advanced object detection models like EfficientDet, this tool simplifies date retrieval, streamlining restaurant management processes. Enhance accuracy and efficiency in financial record-keeping with InvoiceDateExtractor.
ai92-github
Extract data from images, pdf, invoices, receipts | Extract tables from pdf, images and convert to Excel/CSV | OCR complex pdfs, images.
A professional OCR automation web application built with Streamlit and EasyOCR, designed to extract structured invoice data from PDFs and images.
satyamaditya
Assignment By - MASTERS INDIA Project - Extract invoice number, invoice date, line items from invoice images. Project Details - MySelf Aditya, After my research toward this assignment, i found, this is the problem of OCR (OPTICAL CHARACTER RECOGNITION) Basically the work of OCR is to transform & extract the data from semi-structured(BILLs, INVOICES) or un-structured(CONTRACT, LEGAL DOCUMENTS) to structured format(CSV, EXCEL, XML, DATABASES). By this project i got idea about REAL-LIFE Problem, the problem of entire enterprise are data entry. Data entry is preety expensive & time consuming So, basically OCR creates an environment without manual data entry. 90% of stuffs can be configure by OCR & humans will only interrupt when accuracy is not good or some intervention is required. TOOLS of OCR are available in the market like- ABBYY, ROSSUM, AUTOMATION ANYWHERE, XTRACTA for the sake of Assignment, I did this Project with the help of CV2 (Computer Vision) and PYTESSERACT (python library for OCR). ''' ''' the main problem with this project is that we dont have any similar kind of format or structure, [things can be done- REGEX or ROI] initially i was looking for regex to solve this problem, but again due to variation in every format, i can just fetch the invoice date automatically by regex. if i show you the images, we can fetch the each given date by REGEX then store it in list, we know that invoice date will be store firstly & be the index 0. And hence we can easily fetch 0th index element every time & append it in CSV. But the problem arrives in remains 2 that we don't have any common starting point or end point hence we can't apply regex here......
khadibd
A Python-based invoice OCR extractor that automatically reads invoices from a folder (PDFs or images), extracts key fields and line items, and exports the results to a formatted Excel report.
JAYASIMMA
No description available
alexngun
An OCR - Optical Character Recognition that can extract financial data from images of invoices.
carlosrod723
A Python analysis using YOLO V4 for object detection and Tesseract OCR for text recognition. This custom OCR system automates invoice scanning by detecting and extracting key fields like Invoice number, Billing Date, and Total amount, streamlining the process of digital invoice reconciliation.
MaxineXiong
This repository houses a UiPath automation solution tailored for Invoice Extraction in RPA workflows. Tasked with reading each table row and extracting invoice details from each invoice photo through OCR, the solution outputs a CSV file with the extracted data alongside the table's ID and Due Date, specifically for two invoice suppliers.
atorhub
ANJ Dual OCR Parser — AI-powered invoice/bill extractor featuring dual-pass OCR (quick + enhanced), smart parsing, automatic field detection, and multi-format export (JSON, CSV, XLSX, PDF, ZIP). 100% client-side, secure, lightweight, and deployable on GitHub Pages or Netlify.
IbrahimShadi
Rules-driven PDF analyzer & auto-renamer. Classifies (invoice / flight ticket / passport / other) with per-class probabilities, extracts key fields, optional OCR for scans, and renames files based on metadata.
RittickSR
An neural network and OCR based approach to extract data from invoices
suhasdatascientist
Given a set of invoice pdfs and OCR output, you need to extract important business fields and invoice line items from the OCR output.
AmmarAhm3d
Invoice-Gemini-Extracter: Python tool to extract structured invoice data (fields and line items) from PDFs/images using OCR, preprocessing, and Google Gemini-powered extraction/normalization.
emredeveloper
Invoice AI Extractor is an AI-powered tool that extracts structured data from invoices automatically. It combines OCR and modern language models to transform unstructured invoice documents into clean, machine-readable formats such as JSON and CSV.
Remon128
This is a Notebook which detects Invoice Images Data and Extracting records like Invoice Date and Items Description based on tesseract OCR Engine.
Ramakm
A robust API to extract line items and totals from different invoices using Tesseract OCR and FastAPI.
Govindcoderr
Intelligent Document Processing Agent – Extracts invoices using OCR + LLM, validates data, and syncs with ERP (zoho) automatically.
A robust system for extracting and validating data from invoice PDFs using advanced OCR and computer vision techniques.