Search Results

Found 1,126 repositories (showing 30)

PyMuPDF

pymupdf

💛86

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

9.4k

709

AGPL-3.0

Python

Updated 5 hours ago

data-science, epub, extract-data, +12

pdf2docx

ArtifexSoftware

💛73

Open source Python library for converting PDF to DOCX.

3.4k

477

MIT

Python

Updated 1 day ago

docx, extract-table, pdf-converter, +2

PolyglotPDF

(eBook, PDFs Translation) A multilingual eBook processing tool supporting all eBook formats. Features online and offline translation while preserving original layouts. Compatible with both scanned and digital PDFs. Elegant user interface. The world's highest-performing open-source layout-preserving eBook translator.

1.9k

264

GPL-3.0

Python

Updated 3 days ago

deepseek, ebook, formulas, +6

pymupdf4llm

pymupdf

💛73

PyMuPDF4LLM

1.5k

195

AGPL-3.0

Python

Updated 13 hours ago

pdf.tocgen

Krasjet

🧡66

A CLI toolset to generate table of contents for PDF files automatically.

825

GPL-3.0

Python

Updated 17 hours ago

cli, pdf, pdf-document, +5

PyMuPDF-Utilities

pymupdf

🧡62

Demos, examples and utilities using PyMuPDF

713

177

AGPL-3.0

Jupyter Notebook

Updated 1 week ago

mupdf, ocr, pdf, +2

EasyTrans

QPromise

🧡62

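Supports the free interfaces of Google Translate, Baidu Translate, and Youdao Translate. Built on Django and PyMuPDF, it translates PDF documents from English to Chinese while keeping the PDF layout largely unchanged. Translated documents can be downloaded in DOCX and PDF formats. It largely solves the formatting problems that arise when copying text from CAJ Chinese papers, and covers the basic needs of reading papers and writing summaries.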

577

143

Python

Updated 1 day ago

pdf-to-word, pdf-trans

remarks

lucasrla

🧡61

Extract annotations (highlights and scribbles) from PDF, EPUB, and notebooks marked with reMarkable tablets. Export to Markdown, PDF, PNG, SVG

390

GPL-3.0

Python

Updated 1 week ago

annotations, epub, highlighting, +11

parsemypdf

genieincodebottle

💛71

Collection of PDF parsing libraries like AI based docling, claude, openai, gemini, meta's llama-vision, unstructured-io, and pdfminer, pymupdf, pdfplumber etc for efficient snapshot, text, table, and metadata extraction.

172

MIT

Python

Updated 4 days ago

camelot, claude, docling, +15

pymupdf-debian

norbusan

❤️40

Packaging of pymupdf for Debian

149

GPL-3.0

SWIG

Updated 6 months ago

markdown-pdf

vb64

🧡65

Markdown to pdf renderer

142

AGPL-3.0

Python

Updated 5 days ago

markdown, markdown-it, mermaid-diagrams, +3

pdfmd

M1ck4

💛70

Smart PDF to Markdown converter with intelligent heading detection, automatic header/footer removal, orphan fragment merging, and image export. Features a user-friendly GUI with preview mode, persistent settings, and per-page error recovery. Optimized for Obsidian and other Markdown-based note-taking workflows.

118

MIT

Python

Updated 3 days ago

cli-tool, gui-application, markdown, +9

pdf-viewer

Zain-Bin-Arshad

❤️35

A Pure Python PDFViewer, which provides functionalities same as other famous PDFViewers.

Python

Updated 4 months ago

fitz, pdf, pdf-viewer, +6

PDFTools

devxzh

🧡60

A small tool built on PyQt5 and PyMuPDF for batch-adding table-of-contents bookmarks, enhancing PDFs, and splitting and merging PDFs.

MIT

Python

Updated 3 weeks ago

add-catalog, bookmark, pdf, +4

Python-PDF-tools

zhangzongrui

🧡55

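Uses the PyMuPDF library to implement PDF-to-Word conversion, PDF-to-image conversion, image-to-PDF conversion, merging, splitting, and other functions.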

Python

Updated 3 weeks ago

pymupdf4llm-mcp

pymupdf

🧡55

No description available

AGPL-3.0

Python

Updated 1 hour ago

Designing-a-PDF-Audiobook-using-Python

shayanalibhatti

❤️45

In this code, a simple implementation of PDF to audio converter is shown

Python

Updated 1 month ago

audio-converter, gtts, pdf-reader, +7

agentic-rag-financial-parser

Ambuj123-lab

🧡65

Enterprise RAG ecosystem managing 15,000+ semantic chunks. Features hybrid parsing (LlamaParse/PyMuPDF) and 256-dim MRL embeddings for 512MB RAM environments

Python

Updated 3 hours ago

agentic-rag, fastapi, genai, +6

pdfusion

seehiong

💛70

A privacy-first PDF processing engine that deconstructs documents into their core elements—text, images, and tables—and reconstructs them into pristine, structured Markdown. Self-hosted React + FastAPI stack with local AI vision models via LiteLLM/Ollama. Your data never leaves your machine.

NOASSERTION

Python

Updated 23 hours ago

backend, document-processing, fastapi, +11

multimodal-llm-pymupdf4llm

benitomartin

❤️35

Multimodal RAG with PyMuPDF

Jupyter Notebook

Updated 4 months ago

llama-index, openai, pymupdf, +2

PDF-Diff-Viewer

ssibb

❤️45

PDF Diff Viewer, a side-by-side, visual highlight, sync-scroll, PDF comparer, written in Python. Open source, mostly powered by PyMuPDF and Tkinter. Optional support for git diff, for a better comparison algorithm.

GPL-3.0

Python

Updated 1 month ago

compare-pdf, pdf-comparator, pdf-compare, +4

pdfgui_tools

TheWatcherMultiversal

❤️35

pdfgui_tools is a user interface tool developed in Qt and Python that integrates with poppler-utils and PyPDF2 for PDF document management. It's a simple and user-friendly tool that includes various utilities.

GPL-3.0

Python

Updated 6 months ago

gnu-linux, linux, pdf, +7

pero

xxao

❤️40

Unified Python drawing API

MIT

Python

Updated 6 months ago

drawing, pycairo, pymupdf, +8

PyMuPDF4LLM-for-Data-Extraction

AIAnytime

❤️40

PyMuPDF4LLM for Data Extraction. Build better and efficient RAG.

MIT

Jupyter Notebook

Updated 4 months ago

AI-content-detector-Humanizer

DadaNanjesha

🧡65

A comprehensive web application that detects AI-generated content in PDF documents and transforms AI text into natural human-like writing. Built with Streamlit, spaCy, and Hugging Face transformers.

MIT

Python

Updated 13 hours ago

huggingface-transformers, nltk-python, open-source, +5

Color-BW-Separator-for-PDF

RicePasteM

🧡60

This project is a Python tool designed to separate color and black & white pages from a PDF file into two separate PDF files. It utilizes the PyMuPDF library to read and manipulate PDF files, distinguishing between color and black & white pages based on page color mode.

MIT

Python

Updated 2 weeks ago

pdftts

johnsmith2078

❤️35

A PDF reader based on PyQt6 and PyMuPDF, with integrated OCR recognition and TTS read-aloud.

Python

Updated 4 months ago

pymupdf-mode.el

dalanicolai

❤️35

Extend pdf-tools annotation capabilities via pymupdf

Emacs Lisp

Updated 5 months ago

PyMuPDF-Optional-Material

pymupdf

❤️35

Help file downloads, early ZIP binaries, wheels for retired Python 2.7, 3.5.

AGPL-3.0

Updated 9 months ago

fitz, mupdf, pdf, +3

langchain-pymupdf4llm

pymupdf

🧡50

An integration package connecting PyMuPDF4LLM to LangChain

AGPL-3.0

Python

Updated 1 week ago

langchain, langchain-python, pymupdf4llm
