All notable changes to PDF Content Extractor & Translator are documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- PDF/A compliance conversion
- Batch translation for multiple documents
- OCR quality enhancement
- Custom font upload for annotations
- Language auto-detection
The first stable release of PDF Content Extractor & Translator, featuring a complete suite of privacy-focused PDF tools.
- Full Document Extraction — Convert PDFs to Word (.docx) with structure preservation
- Table Extraction to CSV — Automatically detect and export tables
- Table Extraction to Word — Export tables in document format
- ODT Export — OpenDocument Text format support
- Offline Translation — 10 languages supported via Argos Translate
  - English, Spanish, French, German, Italian, Portuguese, Polish, Russian, Dutch, Chinese
- No API Keys Required — Fully local translation processing
- Text Annotations — Add text anywhere on PDF pages
- Highlight Tool — Mark important sections
- Redaction Tool — Permanently remove sensitive content
- Shape Annotations — Rectangle, ellipse, line, arrow tools
- Digital Signatures — Draw, type, or upload signature images
- Sticky Notes — Add comment annotations
- Insert Pages — Add blank pages or pages from other PDFs
- Delete Pages — Remove unwanted pages
- Rotate Pages — 90° clockwise/counter-clockwise rotation
- Reorder Pages — Drag-and-drop page reorganization
- Merge PDFs — Combine multiple documents into one
- Split PDF — Extract page ranges into separate files
- Compress PDF — Reduce file size via Ghostscript
- Compare PDFs — Visual diff between document versions
- Repair PDF — Attempt to fix corrupted files
- PDF to JPG — Export pages as images
- Watermark — Add text watermarks to all pages
- Local AI Chat — Q&A about PDFs via Ollama
- Document Indexing — RAG-based retrieval with ChromaDB
- ReAct Agent — Agentic workflow with tool calling
- Multi-Model Support — Switch between installed Ollama models
- Dark Mode — System-aware theme toggle
- Ribbon Toolbar — Office-style tabbed interface
- Thumbnail Sidebar — Page navigation with previews
- Command Palette — Keyboard-driven command access (Ctrl+K)
- Batch Operations — Multi-select actions on home page
- Bug Reporter — Built-in issue reporting with logs
- Docker Support — One-command deployment
- Celery Workers — Background task processing
- Redis Queue — Reliable task management
- Structured Logging — Rotating log files with levels
- MCP Server — Model Context Protocol for AI assistants
- CLI Scripts — Command-line extraction tools
- All processing happens locally — no cloud uploads
- Path traversal prevention via `secure_filename()`
- Input validation on all API endpoints
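
To illustrate the kind of check that path traversal prevention involves, here is a minimal sketch using only the Python standard library. It is not the project's code and does not reproduce `secure_filename()` itself; the changelog does not say which library provides that function. The helper name `safe_upload_path` and the upload directory are hypothetical.

```python
from pathlib import Path


def safe_upload_path(upload_dir: str, filename: str) -> Path:
    """Return a path inside upload_dir for filename, or raise ValueError.

    Hypothetical helper for illustration only.
    """
    base = Path(upload_dir).resolve()
    # Keep only the final path component, so "../../etc/passwd" becomes "passwd".
    name = Path(filename.replace("\\", "/")).name
    if name in ("", ".", ".."):
        raise ValueError(f"unusable filename: {filename!r}")
    candidate = (base / name).resolve()
    # Defense in depth: the resolved path must still sit under the base directory.
    if base not in candidate.parents:
        raise ValueError(f"path escapes upload directory: {filename!r}")
    return candidate
```

Stripping directory components first and then verifying the resolved path gives two independent checks, so a symlink or an unusual separator that slips past the first step is still caught by the second.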
- Beta release for internal testing
- Core extraction functionality
- Basic annotation tools
- Initial translation support
- Large PDFs (>100 pages) may time out
- Some table layouts not detected correctly
- Alpha release
- PDF viewing and navigation
- Basic page operations
- Migrated from pdfminer to Docling for extraction
| Version | Date | Highlights |
|---|---|---|
| 1.0.0 | 2025-12-15 | First stable release with all core features |
| 0.9.0 | 2025-11-01 | Beta with extraction and translation |
| 0.8.0 | 2025-09-15 | Alpha with PDF viewing |
- Docker Users: run `docker-compose pull`, then `docker-compose up --build`.
- Manual Installation: run `git pull origin main`, then `pip install -r requirements.txt`.
- Breaking Changes: None from 0.9.x
For feature requests and bug reports, please use GitHub Issues.