#1 PDF Application on GitHub that lets you edit PDFs on any device anywhere
Updated Feb 21, 2026 - TypeScript
A Privacy First PDF Toolkit
AI Chatbot for analyzing/extracting information from data in conversational format.
An out-of-the-box local Web UI for DeepSeek-OCR. Built with FastAPI + Vue.js, it supports PDF/Image uploads, progress tracking, and result visualization with bounding boxes. Easily experience the power of a top-tier OCR model.
Building on the existing general text recognition capabilities, this release adds handwritten OCR, layout detection, and table detection and recognition, covering all scenarios involving printed text, handwritten text, and document structure analysis.
Open-source batch OCR workbench: a free, fully local alternative to ABBYY FineReader, suited to digitizing books and documents. Powered by Ollama + GLM-OCR + PP-DocLayoutV3, ~0.5s/page on RTX 4090. Three-panel editor, layout-aware, PDF/image batch processing, Markdown/Word export.
Convert scanned PDFs into searchable text locally using Vision LLMs (olmOCR). 100% private, offline, and free. Features a modern Web UI & CLI.
A powerful and user-friendly tool based on OCRmyPDF, offering a seamless GUI for conversion of image-based PDFs into searchable text.
A simple, free tool for extracting text from scanned PDFs and images using OCR, and converting images to PDFs. It processes files locally in the browser, ensuring privacy and security while enabling users to effortlessly convert documents and images into editable text or PDF format.
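LLM PDF OCR tool and Markdown/LaTeX article translation tool. Supports paragraph-by-paragraph translation and direct proofreading. Supports mathematical formulas. Based on large language model (LLM) APIs.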
Client-side tool to check and fix PDF accessibility. Analyze PDFs for text layer accessibility, detect image-only pages, and rebuild selectable text layers with browser-based OCR—no server or backend required. Perfect for privacy-first and legacy environments.
A document processing service designed to extract structured text (Markdown) from various file formats using OCR (Tesseract) and native parsers.
A tool for comparing and merging PDFs, displaying the differences between them, and running OCR on them.
OCR-enabled PDF text extraction in Python with pypdf and Azure Document Intelligence.
GPicy - AI-driven image processing for your sporadic needs.
Windows-focused fork of Typhoon OCR featuring a modern Next.js web app. Supports multi-page PDF/image OCR to Markdown/HTML, interactive preview, and URL import.
PDFScalpel is a forensic PDF analysis and CTF toolkit for security researchers, digital forensics analysts, and penetration testers, providing deep insight into PDF structure, encryption, malware, steganography, metadata, revisions, and document authenticity.
PDF to Markdown OCR using vision-language models with multi-GPU support