Found 279 repositories (showing 30)
Academic-Hammer
Table structure recognition dataset for the paper "Complicated Table Structure Recognition"
houking-can
Best PDF Converter! PDF to any format, pdf2word/excel/xml/html/txt...
luantak
Extract text from plaintext, .docx, .odt and .rtf files. Pure Go.
flyyuan
Convert scanned (photocopied) PDF books to plain-text TXT for GPTs to use as a knowledge base
songisking
A Python script that converts PDF to TXT using PDFMiner
jayhenry
No description available
clulab
Convert PDF files to TXT
jamalmazrui
Batch convert PDF files to text under Windows, using several text extraction methods or OCR
shakeel
Extract raw text from PDF files
mmahdibarghi
A Python program that converts Persian PDFs of any format (including PDFs created from images) to text files
flyingeek
A PDF to text converter for Scriptable App (iOS) working offline
Academic-Hammer
Convert PDF to any format for easy analysis
mattharrison
Python generators that are useful in manipulating text (particularly creating mobi from pdf2txt)
veer66
Thai pdf to text script
MoinDalvs
Business objective: The document classification solution should significantly reduce manual effort in HRM. It should achieve a higher level of accuracy and automation with minimal human intervention. Sample data set details: resumes and financial documents.
patch0000
pdf2txt sample
SPACESODA
Convert PDFs into clean, LLM-ready text.
AzozzALFiras
A simple, free tool for extracting text from scanned PDFs and images using OCR, and converting images to PDFs. It processes files locally in the browser, ensuring privacy and security while enabling users to effortlessly convert documents and images into editable text or PDF format.
robarchibald
A simple Go package for extracting text from a PDF file
tos-kamiya
Convert multi-column pdf to text with `poppler` and `tesseract`
pynight
Convert pdf into raw text
FileFormatInfo
Simple server to extract text from a PDF
undebuggable
📄 Extract text page by page from OCR-ed and non-OCR-ed PDFs.
bryanoliveira
A PDF to text converter using an MLP-based OCR.
wangx404
A Python script for people who want to convert PDF to TXT using the Tencent API.
gaazau
Based on pdfminer.six; converts PDF files into text or images
Josh-Been
Convert PDF documents to TXT, with cleanup that can remove formulas, diagrams, line breaks, separated hyphens, etc.
nikaiw
A dumb Burp extension that converts PDF to text in every server response
kalle07
Convert your PDF to TXT with a nice GUI; no OCR, no images
Anbo-WU
The repository includes pdf2txt, text segmentation, API invocation for LLMs, et cetera. Its core purpose is to provide technical support for knowledge consulting tasks in the StatChat project.