Search Results

Found 334 repositories (showing 30)

trocr

rsommerfeld

🧡66

Powerful handwritten text recognition. A simple-to-use, unofficial implementation of the paper "TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models".

257

MIT

Python

Updated 1 day ago

computer-vision, handwritten-text-recognition, ocr, +3

Handwriting-OCR-ScriptReader

LPBeaulieu

🧡60

ScriptReader allows you to perform Optical Character Recognition (OCR) on your handwritten notes!

AGPL-3.0

Python

Updated 2 weeks ago

OCR

OlehOnyshchak

❤️35

Optical Character Recognition system for handwritten math expressions

MIT

Jupyter Notebook

Updated 4 months ago

handwriting-recognition, keras, latex, +6

KannadaHandwritingRecognition

SubhrajyotiSen

❤️25

Optical Character Recognition of handwritten documents in Kannada

Apache-2.0

Python

Updated 4 months ago

bigSunOCR

Wrste

🧡60

This project aims to achieve good results in handwritten mathematical formulas, printed formulas, complex formula samples, or comprehensive optical character recognition tasks.

NOASSERTION

Python

Updated 2 weeks ago

hocrux

dropbox

❤️40

Handwritten optical character recognition

Apache-2.0

Python

Updated 3 years ago

Automate-identification-and-recognition-of-handwritten-text-from-an-image

VMD7

🧡50

This project offers an efficient method for identifying and recognizing handwritten text from images. Using a Convolutional Recurrent Neural Network (CRNN) for Optical Character Recognition (OCR), it effectively extracts text from images, aiding in the digitization of handwritten documents and automated text extraction.

MIT

Jupyter Notebook

Updated 1 month ago

crnn, crnn-kreas, crnn-ocr, +5

OCR_Math

oceanusxiv

❤️25

Optical Character Recognition for handwritten mathematical formulae using Neural Net

Python

Updated 2 years ago

handwritten-prescription-recognition

ronitkathuria15

❤️20

The Optical Character Recognition (OCR) system consists of a neural network built using Python and TensorFlow that was trained on over 115,000 word images from the IAM On-Line Handwriting Database (IAM-OnDB). The neural network consists of 5 Convolutional Neural Network (CNN) layers, 2 Recurrent Neural Network (RNN) layers, and a final Connectionist Temporal Classification (CTC) layer. As the input image is fed into the CNN layers, a non-linear ReLU function is applied to extract relevant features from the image. The ReLU function is preferred due to the lower likelihood of a vanishing gradient (which arises when network parameters and hyperparameters are not properly set) relative to a sigmoid function. In the RNN layers, the Long Short-Term Memory (LSTM) implementation is used due to its ability to propagate information through long distances. During training, the CTC layer is given the RNN output matrix and the ground truth text to compute the loss value, and the mean of the loss values of the batch elements is used to train the OCR system. This mean is fed into an RMSProp optimizer, which minimizes the loss. For inference, the CTC layer decodes the RNN output matrix into the final text. The OCR system reports an accuracy rate of 95.7% for the IAM Test Dataset, but this accuracy falls to 89.4% for unseen handwritten doctors’ prescriptions.

Python

Updated 1 year ago
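
The handwritten-prescription-recognition description above says that, at inference time, the CTC layer decodes the RNN output matrix into the final text. The listing does not say which decoding strategy that repository uses, so the following is only a minimal sketch of one common option, greedy (best-path) decoding. The function name `ctc_greedy_decode`, the `charset` argument, and the convention that the blank label is the last class index are all assumptions made for illustration, not details taken from the repository.

```python
from itertools import groupby


def ctc_greedy_decode(scores, charset):
    """Greedy (best-path) CTC decoding of a per-timestep score matrix.

    scores  : list of timesteps, each a list of len(charset) + 1 scores
    charset : string of recognizable characters; the blank label is
              assumed (hypothetically) to be the extra, last class index
    """
    blank = len(charset)
    # 1. Pick the highest-scoring class at every timestep.
    best_path = [max(range(len(step)), key=lambda i: step[i]) for step in scores]
    # 2. Collapse runs of identical consecutive labels.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Drop blanks and map the remaining labels to characters.
    return "".join(charset[label] for label in collapsed if label != blank)


# Toy RNN output: 6 timesteps over the classes 'a', 'b', blank.
toy_scores = [
    [0.8, 0.1, 0.1],  # a
    [0.7, 0.2, 0.1],  # a (repeat, collapsed)
    [0.1, 0.1, 0.8],  # blank separates the two a's
    [0.6, 0.3, 0.1],  # a
    [0.2, 0.7, 0.1],  # b
    [0.1, 0.8, 0.1],  # b (repeat, collapsed)
]
decoded = ctc_greedy_decode(toy_scores, "ab")
print(decoded)  # aab
```

The blank between the first two runs of `a` is what lets a doubled letter survive the collapse step; without it the output would contain a single `a`. Beam-search decoding is a common alternative when a language model or dictionary is available.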

Handwritten-Optical-Character-Recognition

Sagyam

❤️40

Responsive web app that can recognize handwritten characters drawn on a canvas and perform arithmetic calculations.

GPL-3.0

Jupyter Notebook

Updated 3 months ago

calculator, calculator-javascript, convolutional-neural-networks, +2

Urdu-OCR

muhammadsohaib60

❤️30

Our project is based on one of the most important applications of machine learning, i.e. pattern recognition. Optical character recognition or optical character reader is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo or from subtitle text superimposed on an image. We are working on developing an OCR for Urdu. We studied a couple of research papers related to our project. So far, we have found that both Arabic and Urdu are written in Perso-Arabic script; at the written level, therefore, they share similarities. The styles of Arabic and Persian writing have a heavy influence on the Urdu script. There are 6 major styles for writing Arabic, Persian and Pashto as well. Urdu is written in Naskh writing style, which is the most famous of all. Optical character recognition (OCR) is the process of converting an image of text, such as a scanned paper document or electronic fax file, into computer-editable text [1]. The text in an image is not editable: the letters are made of tiny dots (pixels) that together form a picture of text. During OCR, the software analyzes an image and converts the pictures of the characters to editable text based on the patterns of the pixels in the image. After OCR, the converted text can be exported and used with a variety of word-processing, page layout and spreadsheet applications [2]. One of the main aims of OCR is to emulate the human ability to read at a much faster rate by associating symbolic identities with images of characters. Its potential applications include screen readers, refreshable Braille displays [3], reading customer-filled forms, reading postal addresses off envelopes, archiving and retrieving text, etc. OCR’s ultimate goal is to develop a communication interface between the computer and its potential users. Urdu is the national language of Pakistan. It is a language that is understood by over 300 million people belonging to Pakistan, India and Bangladesh. Due to its historical database of literature, there is definitely a need to devise automatic systems for conversion of this literature into electronic form that may be accessible on the worldwide web. Although much work has been done in the field of OCR, languages using the Arabic script, like Farsi, Urdu and Arabic, have received the least attention. This is due in part to a lack of interest in the field and in part to the intricacies of the Arabic script. Owing to this state of indifference, there remains a huge amount of Urdu and Arabic literature unattended and rotting away on some old shelves. The proposed research aims to develop workable solutions to many of the problems faced in the realization of an OCR designed specifically for Urdu Noori Nastaleeq script, which is widely used in Urdu newspapers, governmental documents and books. The underlying processes first isolate and classify ligatures based on certain carefully chosen special, contour and statistical features and eventually recognize them with the aid of Feed-Forward Back Propagation Neural Networks. The input to the system is a monochrome bitmap image file of Urdu text written in Noori Nastaleeq and the output is the equivalent text converted to an editable text file.

Jupyter Notebook

Updated 11 months ago

Denoising-Dirty-Documents

bharathbhimshetty

❤️35

# Denoising Dirty Documents
Optical Character Recognition (OCR) is the process of getting typed or handwritten documents into a digitized format. If you've read a classic novel on a digital reading device or had your doctor pull up old healthcare records via the hospital computer system, you've probably benefited from OCR. OCR makes previously static content editable, searchable, and much easier to share. But a lot of documents eager for digitization are being held back. Coffee stains, faded sun spots, dog-eared pages, and lots of wrinkles are keeping some printed documents offline and in the past. This competition challenges you to give these documents a machine learning makeover. Given a dataset of images of scanned text that has seen better days, you're challenged to remove the noise. Improving the ease of document enhancement will help us get that rare mathematics book on our e-reader before the next beach vacation. We've kicked off the fun with a few handy scripts to get you started on the dataset.
Acknowledgements: Kaggle is hosting this competition for the machine learning community to use for fun and practice. This dataset was created by RM.J. Castro-Bleda, S. España-Boquera, J. Pastor-Pellicer, F. Zamora-Martinez. We also thank the UCI machine learning repository for hosting the dataset. If you use the problem in a publication, please cite: Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science.
## AIM:
* To denoise the images using an Encoder-Decoder model
## Dataset:
* https://www.kaggle.com/c/denoising-dirty-documents/data
* We are provided two sets of images, train and test. These images contain various styles of text, to which synthetic noise has been added to simulate real-world, messy artifacts. The training set includes the text without the noise (train_cleaned). You must create an algorithm to clean the images in the test set.

Jupyter Notebook

Updated 11 months ago

ocr

prasunroy

❤️40

🔶 An optical character recognition system for handwritten English, Bengali and Devanagari characters.

MIT

Python

Updated 1 year ago

convolutional-neural-networks, optical-character-recognition

OCR-for-Arabic-Scripts

mohamedbassel24

❤️35

Optical character recognition or optical character reader (OCR) is the recognition process of text obtained from media in the form of typed, handwritten or printed text into machine-encoded text form. The text in question may be presented in the form of a scanned document, a photo of a document, a scene-photo or from subtitle text superimposed on an image.

Python

Updated 9 months ago

HandWritten-Text-Recognizer

Mattral

❤️45

Streamlit Web Interface for Handwritten Text Recognition (HTR), Optical Character Recognition (OCR) implemented with TensorFlow and trained on the IAM off-line HTR dataset. The model takes images of single words or text lines (multiple words) as input and outputs the recognized text.

Jupyter Notebook

Updated 2 months ago

OCR-Net-MAUI

Bliitze

❤️40

Optical character recognition (OCR) allows you to extract printed or handwritten text from images, such as photos of street signs and products, as well as from documents—invoices, bills, financial reports, articles, and more. Microsoft's OCR technologies support extracting printed text in several languages.

MIT

Updated 2 years ago

Handwritten-Equation-Solver

prateeek1

❤️45

A Handwritten Equation Solver built using a Convolutional Neural Network and Optical Character Recognition. The CNN model is used for recognition of digits and symbols. OCR is used for processing the image and for segmentation.

Jupyter Notebook

Updated 1 month ago

convolutional-neural-networks, deep-learning, deep-neural-networks, +1

EXtracting-data-from-images

yuvaraj23

❤️20

Optical Character Recognition is the process by which images of handwritten, printed, or typed text are converted into machine-encoded text. Automated recognition of documents, bill receipts, credit cards, car plates, and billboards significantly simplifies the way we collect and process data. The goal of our project was to develop an API using Flask for receipt recognition. Let's take a closer look.

Python

Updated 9 months ago

Notes-to-Notes

bhngupta

❤️25

Smart web app made in Flask to digitize 👋 handwritten notes and save them in Google Keep. The app takes images in any format (.png, .jpg, .jpeg or any other); an ML model scans the complete image 📷 for possible text, which is then read out by the OCR (Optical Character Recognition). 🎁

MIT

CSS

Updated 1 year ago

deep-learning, flask, mser, +2

Optical-Handwritten-Character-Recognition

bridgetmansour

❤️35

This program converts a picture of handwriting to printable text using MATLAB's Computer Vision Toolbox.

MATLAB

Updated 1 year ago

KannadaHandwritingRecognition

kspook

❤️35

Optical Character Recognition of handwritten documents in Kannada

Python

Updated 2 years ago

OCRProject

RawanRefaat

❤️35

Optical Character Recognition of Handwritten Arabic Numerals/Digits using an Artificial Neural Network

Python

Updated 2 years ago

CalculusOCR

JignasP

❤️20

CalculusOCR: A Python package that performs optical character recognition on handwritten calculus expressions and outputs LaTeX code, a SymPy equation, and the solution.

MIT

Python

Updated 1 year ago

Automatic-Handwritten-conversion

harshithaproject

❤️35

Recognition of handwritten English alphabets has been broadly studied in previous years. Optical character recognition (OCR) has been used to convert printed text into editable text. OCR is a very useful and popular method in various applications. The accuracy of OCR can depend on text pre-processing and segmentation algorithms. This project seeks to classify an individual handwritten word so that handwritten text can be translated to a digital form. We used two main approaches to accomplish this task: classifying words directly and character segmentation. For the former, we use a Convolutional Neural Network (CNN) with various architectures to train a model that can accurately classify words.

Python

Updated 1 year ago

Optical-Character-Recognition

sujal7

❤️35

Optical Character Recognition to recognize handwritten characters using Convolutional Neural Network (CNN).

Jupyter Notebook

Updated 1 year ago

tamil-handwritten-character-recognition

rshika

❤️35

Tamil Handwritten Character Recognition (Tamil HCR) is an image processing software project developed using MATLAB #c. This is my final-year engineering project. The concept of this project is recognising handwritten text using optical character recognition and converting it into digital text through neural networks.

Updated 1 year ago

HTR

GOKUL-REDDY

❤️20

Handwritten text recognition is something we see in the CamScanner app, which we use daily. That app uses OCR (Optical Character Recognition) and a normal fast recognition mode, which is not accurate at all. OCR works, but not completely. So I started with OCR to get text recognized, but I faced many problems with it, such as needing an API key and having to pay. I then decided to go offline and found “Py Tesseract”, which is an Optical Character Recognition (OCR) tool for Python. Together they can be used to read the contents of a section of the screen. Further application in NLP (Natural Language Processing) also helps a lot. Hence, I worked with Py Tesseract and got results that are almost accurate and much better than CamScanner's fast recognition.

Jupyter Notebook

Updated 1 year ago

Image-To-Text

Electro-nics

❤️35

Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image

Python

Updated 1 year ago

OCR_Tool

BashitaliShaikh

❤️35

OCR (optical character recognition) is the use of technology to distinguish printed or handwritten text characters inside digital images of physical documents, such as a scanned paper document. The basic process of OCR involves examining the text of a document and translating the characters into code that can be used for data processing. OCR is sometimes also referred to as text recognition. OCR systems are made up of a combination of hardware and software that is used to convert physical documents into machine-readable text. Hardware, such as an optical scanner or specialized circuit board is used to copy or read text while software typically handles the advanced processing. Software can also take advantage of artificial intelligence (AI) to implement more advanced methods of intelligent character recognition (ICR), like identifying languages or styles of handwriting.

Python

Updated 9 months ago

Optical-Character-Recognition.-OCR

mohdzahidK

❤️40

Optical character recognition or optical character reader is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo or from subtitle text superimposed on an image. The first step of OCR is using a scanner to process the physical form of a document. Once all pages are copied, OCR software converts the document into a two-color, or black and white, version. The scanned-in image or bitmap is analyzed for light and dark areas, where the dark areas are identified as characters that need to be recognized and light areas are identified as background. Pattern Recognition & Feature Detection

Apache-2.0

Python

Updated 9 months ago
