Found 646 repositories (showing 30)
jsvine
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
harshhh28
Hia (Health Insights Agent) - AI Agent to analyze blood reports and provide detailed health insights.
genieincodebottle
Collection of PDF parsing tools, including AI-based options (Docling, Claude, OpenAI, Gemini, Meta's Llama Vision, unstructured-io) and libraries such as pdfminer, PyMuPDF, and pdfplumber, for efficient snapshot, text, table, and metadata extraction.
A Python script that collects financial data from the ju-chao website and can download PDF files from it; more importantly, it can parse the data you want from those PDF files using pdfplumber.
hasan-py
Chat with PDF using LangChain, Streamlit, Ollama (for LLM inference), and PDFPlumber. Overall, it is an example of a Retrieval-Augmented Generation (RAG) system with the DeepSeek R1 model.
theaifutureguy
AI-powered legal chatbot that leverages Retrieval-Augmented Generation (RAG) with DeepSeek R1 for advanced legal reasoning and document analysis. It provides a sophisticated legal assistant that can process and analyze complex legal documents, retrieve relevant information using advanced vector search, and generate nuanced legal analysis.
amitvikramraj
Extracting details from resumes (CVs) and matching them with job descriptions (JDs) using a pretrained model like DistilBERT, then ranking them using cosine similarity.
aborruso
Alice PDF is a CLI that extracts tables from PDFs—native or scanned—using Camelot, Mistral OCR, AWS Textract, or pdfplumber and saves them as CSV files
jaspreetsidhu3
Convert PDF into an audiobook.
No description available
eriston
Using PDFPlumber for PDF data extraction
dannguyen
NICAR 2019 workshop on using Python and PDFplumber to extract text from PDFs
vishnupriyanpr
Medical AI Report Summarizer is a smart web application that allows users to upload medical reports (PDF, JPG, PNG), automatically extract text, highlight medical terms, and generate an AI-based summary. Users can also export the final summary as a clean PDF.
Developed an AI-driven system using Groq + LLaMA 3 for automated material estimation from architectural PDFs. Implemented OCR, PyMuPDF, and pdfplumber to extract structured room-wise data, optimizing material takeoff and generating accurate CSV reports for construction planning.
No description available
ZStarryChen
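This project builds an ESG evaluation system for listed companies. It uses Python web scraping to collect ESG report texts and the corresponding rating data from the Hong Kong Stock Exchange and the Sina Finance ESG rating platform to build a dataset. For PDF files, it uses the fitz and pdfplumber libraries for content extraction and data cleaning, vectorizes the data with a pretrained BERT model, builds an ESG rating model, and compares the performance of four classic classification models on this task: k-nearest neighbors (KNN), support vector machine (SVM), decision tree (DT), and random forest (RF).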
Yigtwxx
Fırat University Assistant: An offline Turkish question-answering and document search system built on local PDFs using FastAPI, pdfplumber, and BM25.
No description available
jsfenfen
Advanced PDF manipulation with pdfplumber for NICAR 2020 / New Orleans
Amanbig
A modern web application that integrates a conversational AI chatbot with real-time user interactions and smooth animations. Built using React, Framer Motion, Lucide Icons, and ShadCN Components on the frontend and FastAPI on the backend. [Below is website deployed Frontend]
renan-siqueira
This project facilitates the extraction of text from PDF files using various Python libraries. It is designed to be flexible, allowing a choice among different text extraction libraries and supporting both a single PDF file and a directory containing multiple PDF files.
jchristn
A simple C# shell wrapper for the wonderful pdfplumber library in Python to extract text from .PDF files
TechFreak2003
BIA (Blood Report Insights Agent) - AI Agent to analyze blood reports and provide detailed health insights.
justin-thakral
DullyPDF allows you to automatically convert PDFs to fillable forms for free. Search & Fill from an API or database, with US e-signature support. The backend is Python + FastAPI, hosted on Cloud Run. The frontend is TypeScript + React, hosted on Firebase.
Zaaccckkkk
A tool for extracting images from tables in PDF documents by detecting their bounding boxes and dominant colors. Utilizes pdfplumber for PDF parsing and PIL and fitz for image processing.
vikrantRajan
This is my exploration of a variety of Python 🐍 libraries. I have built geospatial data analytics systems from CSV files, as well as image and video processing tools such as face detection and motion detection. I also built a website with Flask (and three.js) and apps connecting to several types of databases, created a simple budgeting app that reads, writes, and updates .txt files, and created a simple graphical user interface for Mac.
Amaan-developpeur
A lightweight, local Retrieval-Augmented Generation (RAG) system for domain-specific Q&A over financial documents. Uses pdfplumber for PDF parsing, sentence-transformers for dense retrieval, and optionally connects to local LLMs (e.g., Ollama + Mistral). Runs on FastAPI with a custom frontend.
caiofariaas
No description available
jackburrus
Rust-native PDF extraction. 73x faster than pdfplumber.
Aashishh1
The Indian Teams Dedicated Audio Lines🏏