Found 1,126 repositories (showing 30)
pymupdf
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
ArtifexSoftware
Open source Python library for converting PDF to DOCX.
CBIhalsen
(eBook and PDF translation) A multilingual eBook processing tool supporting all eBook formats. Features online and offline translation while preserving original layouts. Compatible with both scanned and digital PDFs. Elegant user interface. The world's highest-performing open-source layout-preserving eBook translator.
pymupdf
PyMuPDF4LLM
Krasjet
A CLI toolset to generate table of contents for PDF files automatically.
pymupdf
Demos, examples and utilities using PyMuPDF
QPromise
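Supports the free interfaces of Google Translate, Baidu Translate, and Youdao Translate. Built on Django and PyMuPDF, it translates PDF documents from English to Chinese while keeping the PDF layout largely unchanged. The translated document can be downloaded in DOCX or PDF format. It largely solves the formatting problems that arise when copying text from CAJ Chinese papers, and meets basic needs for reading papers and writing summaries.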
lucasrla
Extract annotations (highlights and scribbles) from PDF, EPUB, and notebooks marked with reMarkable tablets. Export to Markdown, PDF, PNG, SVG
genieincodebottle
Collection of PDF parsing approaches, including AI-based ones (Docling, Claude, OpenAI, Gemini, Meta's Llama Vision, unstructured-io) and libraries such as pdfminer, pymupdf, and pdfplumber, for efficient snapshot, text, table, and metadata extraction.
norbusan
Packaging of pymupdf for Debian
vb64
Markdown to pdf renderer
M1ck4
Smart PDF to Markdown converter with intelligent heading detection, automatic header/footer removal, orphan fragment merging, and image export. Features a user-friendly GUI with preview mode, persistent settings, and per-page error recovery. Optimized for Obsidian and other Markdown-based note-taking workflows.
Zain-Bin-Arshad
A pure Python PDF viewer that provides the same functionality as other well-known PDF viewers.
devxzh
A small tool based on PyQt5 and PyMuPDF for batch-adding table-of-contents bookmarks, enhancing PDFs, and splitting and merging PDFs.
zhangzongrui
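Uses the PyMuPDF library to implement PDF-to-Word conversion, PDF-to-image conversion, image-to-PDF conversion, merging, splitting, and other functions.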
pymupdf
No description available
shayanalibhatti
A simple implementation of a PDF-to-audio converter.
Ambuj123-lab
Enterprise RAG ecosystem managing 15,000+ semantic chunks. Features hybrid parsing (LlamaParse/PyMuPDF) and 256-dim MRL embeddings for 512MB RAM environments
seehiong
A privacy-first PDF processing engine that deconstructs documents into their core elements—text, images, and tables—and reconstructs them into pristine, structured Markdown. Self-hosted React + FastAPI stack with local AI vision models via LiteLLM/Ollama. Your data never leaves your machine.
benitomartin
Multimodal RAG with PyMuPDF
ssibb
PDF Diff Viewer, a side-by-side, visual highlight, sync-scroll, PDF comparer, written in Python. Open source, mostly powered by PyMuPDF and Tkinter. Optional support for git diff, for a better comparison algorithm.
TheWatcherMultiversal
pdfgui_tools is a user interface tool developed in Qt and Python that integrates with poppler-utils and PyPDF2 for PDF document management. It's a simple and user-friendly tool that includes various utilities.
xxao
Unified Python drawing API
AIAnytime
PyMuPDF4LLM for data extraction. Build better, more efficient RAG.
DadaNanjesha
A comprehensive web application that detects AI-generated content in PDF documents and transforms AI text into natural human-like writing. Built with Streamlit, spaCy, and Hugging Face transformers.
RicePasteM
This project is a Python tool designed to separate color and black & white pages from a PDF file into two separate PDF files. It utilizes the PyMuPDF library to read and manipulate PDF files, distinguishing between color and black & white pages based on page color mode.
johnsmith2078
A PDF reader based on PyQt6 and PyMuPDF, with integrated OCR recognition and TTS read-aloud features.
dalanicolai
Extend pdf-tools annotation capabilities via pymupdf
pymupdf
Help file downloads, early ZIP binaries, wheels for retired Python 2.7, 3.5.
pymupdf
An integration package connecting PyMuPDF4LLM to LangChain