Found 122 repositories (showing 30)
This project offers an efficient method for identifying and recognizing handwritten text from images. Using a Convolutional Recurrent Neural Network (CRNN) for Optical Character Recognition (OCR), it effectively extracts text from images, aiding in the digitization of handwritten documents and automated text extraction.
bharathbhimshetty
# Denoising Dirty Documents

Optical Character Recognition (OCR) is the process of getting typed or handwritten documents into a digitized format. If you've read a classic novel on a digital reading device or had your doctor pull up old healthcare records via the hospital computer system, you've probably benefited from OCR. OCR makes previously static content editable, searchable, and much easier to share. But a lot of documents eager for digitization are being held back. Coffee stains, faded sun spots, dog-eared pages, and lots of wrinkles are keeping some printed documents offline and in the past. This competition challenges you to give these documents a machine learning makeover. Given a dataset of images of scanned text that has seen better days, you're challenged to remove the noise. Improving the ease of document enhancement will help us get that rare mathematics book on our e-reader before the next beach vacation. We've kicked off the fun with a few handy scripts to get you started on the dataset.

## Acknowledgements

Kaggle is hosting this competition for the machine learning community to use for fun and practice. This dataset was created by RM.J. Castro-Bleda, S. España-Boquera, J. Pastor-Pellicer, F. Zamora-Martinez. We also thank the UCI machine learning repository for hosting the dataset. If you use the problem in publication, please cite: Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science.

## AIM:

* To denoise the images using an encoder-decoder model

## Dataset:

* https://www.kaggle.com/c/denoising-dirty-documents/data
* We are provided two sets of images, train and test. These images contain various styles of text, to which synthetic noise has been added to simulate real-world, messy artifacts. The training set includes the test without the noise (train_cleaned). You must create an algorithm to clean the images in the test set.
oladimeji-kazeem
Handwriting Transcription using Deep Learning is a project aimed at converting handwritten text into digital text. This project leverages state-of-the-art deep learning techniques to recognize and transcribe handwritten text from images, making it useful for digitizing handwritten notes, documents, and more.
ikonthomas
I will tackle the issue of handwritten scripts usually written by doctors and nurses. In my project I will convert handwritten scripts into digital records, which can then be stored in non-relational databases. The data used to build the model was collected from the IAM Handwriting Database. This model is built on handwriting from 657 different writers. Each writer has written multiple paragraphs, and sentences have been extracted from those paragraphs.
Redvanisation
A React and soon native app that converts handwritten or printed text on paper to computer text (digitizing it), giving you a faster and easier way to get things done. The app might also be able to tell which language a piece of text was written in after scanning it.
BrunoFelalaga
Full-featured SwiftUI iOS app for digitizing handwritten notes and image-based text using Apple’s Vision OCR, with text-to-speech, Google Keep export, Core Data persistence, and accessibility support. Designed for seamless capture, transcription, organization, and sharing of handwritten or scanned notes.
This project uses a fine-tuned Llama3.2-VL model to extract handwritten text from images with high precision. Ideal for digitizing notes, letters, or documents, it features a user-friendly Gradio interface for seamless interaction.
This Streamlit app converts handwritten notes into editable digital text using EasyOCR. It is useful for students, researchers, and professionals who want to digitize their handwritten content.
AmineRACHID
This project focuses on the recognition of handwritten text in two languages: English and Tifinagh. Handwritten text recognition plays a crucial role in various applications, including document digitization, optical character recognition (OCR), and language preservation.
AminexRACHID
This project focuses on the recognition of handwritten text in two languages: English and Tifinagh. Handwritten text recognition plays a crucial role in various applications, including document digitization, optical character recognition (OCR), and language preservation.
AbhishekMudaraddi
This project extracts handwritten Kannada text from PDF images using PyMuPDF and Tesseract OCR. It processes images for better contrast using OpenCV, improving text recognition accuracy. The script also filters out unwanted patterns like URLs and digits, ensuring clean output. This tool is ideal for digitizing Kannada handwritten documents.
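The post-OCR filtering step mentioned in this description (dropping URLs and digits) could look roughly like the sketch below. This is not the repository's code: the function name `clean_ocr_text` and both regular expressions are assumptions made for illustration, and the OCR call itself is left out.

```python
import re

# Hypothetical patterns; the repository's actual filters are not shown in its description.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+")
# In Python 3 str patterns, \d matches any Unicode decimal digit,
# so Kannada digits are covered as well as ASCII ones.
DIGIT_PATTERN = re.compile(r"\d+")


def clean_ocr_text(text: str) -> str:
    """Remove URLs and digit runs from OCR output, then tidy whitespace."""
    text = URL_PATTERN.sub(" ", text)    # strip URLs first so their digits go with them
    text = DIGIT_PATTERN.sub(" ", text)  # then strip any remaining digit runs
    # Collapse repeated spaces on each line and drop lines left empty.
    lines = (" ".join(line.split()) for line in text.splitlines())
    return "\n".join(line for line in lines if line)
```

Removing URLs before digits avoids leaving URL fragments behind when a link contains numbers.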
IshaniSen2612
A system that can interpret and digitize handwritten text using ML.
AartiBhagtani
A system where handwritten text from a form is recognized and converted into digitized text.
afislonge
This project's primary goal is to create a system for handwriting recognition utilizing machine learning models. The objective is to precisely translate handwritten text into a digital format. To solve the problem of digitizing handwritten content, an algorithm model that can identify and understand human handwriting must be developed.
ManInTheHam
PrescriptionOCR - A deep learning–powered system that extracts and digitizes text from handwritten medical prescriptions.
Shivansh13sri
InkSight (Handwritten-text recognition) is a modern web application that uses Optical Character Recognition (OCR) to extract handwritten text from images. Designed for students, researchers, and professionals, the app helps digitize handwritten content quickly and accurately.
Mohammad-Ghouse-virtuoso
End-to-end Handwritten Text Recognition (HTR) using deep learning. Converts handwritten images to digital text using CNNs, RNNs, and CTC loss. Supports training, inference, and integration into applications for document digitization and transcription.
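For readers unfamiliar with the CTC step named in this description, greedy CTC decoding takes the best class per time step, merges consecutive repeats, and drops the blank symbol. The sketch below is a plain-Python illustration under stated assumptions: the blank is class 0, class k maps to `alphabet[k - 1]`, and the name `ctc_greedy_decode` is hypothetical rather than taken from the repository.

```python
from typing import List, Optional, Sequence

BLANK = 0  # assumed index of the CTC blank class


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy CTC decoding: argmax per frame, merge repeats, drop blanks.

    frame_scores[t][k] is the score of class k at time step t.
    Class 0 is the blank; class k > 0 maps to alphabet[k - 1].
    """
    # Best class index for every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    decoded: List[str] = []
    previous: Optional[int] = None
    for label in best_path:
        # Emit a character only when the label changes and is not the blank.
        if label != previous and label != BLANK:
            decoded.append(alphabet[label - 1])
        previous = label
    return "".join(decoded)
```

A blank between two identical labels is what lets the decoder output a doubled letter, which is why repeats are merged before blanks are removed.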
Satvika26
Created an OCR model using PyTesseract and OpenCV to extract and digitize handwritten text from images. Tech Stack: Python, OpenCV, PyTesseract
Owl-Hacks
We won 3rd place at SM Hacks for this project. A Django Web App that scans and digitizes handwritten notes into text.
SrijanRoy12
A machine learning project to recognize handwritten characters (A-Z, 0-9) using image processing and neural networks for text digitization and analysis.
Yash-Kunwar
Developed a handwriting recognition model leveraging machine learning techniques to accurately interpret and digitize handwritten text (digits). Designed and trained the model using a labeled dataset, achieving high accuracy in recognizing diverse handwriting styles and improving text digitization efficiency.
Usaidkhxn
Developed an OCR-based system to recognize and digitize handwritten text, enabling efficient conversion of manual notes into digital format. Designed for accuracy, scalability, and real-world usability in document processing.
The recognition of handwritten text has emerged as a fundamental aspect of the progress in computer vision applications, particularly in the realms of historical document processing, authentication, and the digitization of languages.
DarylFernandes99
A CNN-based OCR system that extracts both typed and handwritten text from images. Using a multi-layered convolutional neural network trained on the EMNIST dataset, it processes images through hierarchical contour detection to identify and recognize alphanumeric characters. Ideal for document digitization and text extraction applications.
Ambri28
Handwriting recognition is the process of automatically converting handwritten text into digital format using machine learning techniques. This technology plays a crucial role in Optical Character Recognition (OCR) systems, enabling applications such as automatic form processing, document digitization
Zeeshier
Smart Notes is an intelligent educational and productivity app that helps you capture, digitize, and organize your handwritten or printed notes effortlessly. Powered by Google ML Kit and Firebase, it recognizes text from images with high accuracy and saves your notes securely in the cloud for instant access — anytime, anywhere.
Character recognition has been capturing the interest of researchers since the beginning of the nineteenth century. While the Optical Character Recognition for printed material is very robust and widespread nowadays, the recognition of handwritten materials lags behind. In our digital era more and more historical, handwritten documents are digitized and made available to the general public. However, these digital copies of handwritten materials lack the automatic content recognition feature of their printed materials counterparts. We are proposing a practical, accurate, and computationally efficient method for Old English character recognition from manuscript images. Our method relies on a modern machine learning model, Artificial Neural Networks, to perform character recognition based on individual character images cropped directly from the images of the manuscript pages. We propose model dimensionality reduction methods that improve accuracy and computational effectiveness. Our experimental results show that the model we propose outperforms previous attempts as well as current automatic text recognition techniques
PavlovIvan77
Used for digitizing handwritten text.
kyletolle
For converting handwritten journal pages into digitized text.
NagadiLeelaRao
An AI-Powered Handwritten and Printed Text Medical Prescription digitizer