An Obsidian plugin that transcribes images to Markdown using local Ollama vision models.
Point it at any image in your vault and get structured Markdown back — headings, lists, tables, code blocks — all extracted by a vision AI running on your own machine. No data leaves your computer.
- Transcribe a single image via the command palette or right-click context menu
- Batch-transcribe an entire folder of images (with optional subfolder inclusion)
- Creates a `.md` file alongside each image with the transcribed content
- Install, select, and remove AI models directly from the command palette, no terminal needed
- Progress tracking for batch operations with per-file status
- Configurable prompt so you can tailor the transcription instructions
The plugin recommends these vision models for transcription:
`maternion/LightOnOCR-2:1b`, `qwen3.5:2b`, `qwen3.5:4b`, `qwen3.5:9b`, `qwen3.5:27b`, `qwen3.5:35b`
Any other Ollama vision model can be installed directly from the settings or via the Ollama CLI.
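For reference, installing a model from the plugin settings presumably maps onto Ollama's model-pull API. Here is a minimal sketch, assuming the default server URL and Ollama's documented `/api/pull` endpoint; the helper name `pull_model` is mine, not part of the plugin:

```python
import json
import urllib.error
import urllib.request

def pull_model(name: str, base_url: str = "http://localhost:11434") -> bool:
    """Ask a local Ollama server to download a model.

    Returns True when the server accepts the request, False when it is
    unreachable (e.g. Ollama is not running). Note: with "stream": False
    the server only responds once the download finishes, so a real pull
    of a multi-gigabyte model needs a generous timeout.
    """
    payload = json.dumps({"name": name, "stream": False}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/pull",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

The equivalent CLI command is `ollama pull <model>`, which streams download progress to the terminal.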
- Ollama installed and running locally
- Desktop Obsidian (this plugin is desktop-only)
- Install the plugin from Settings > Community plugins
- Enable it
- Open Settings > Transcriber and verify the Ollama server URL (default: `http://localhost:11434`)
- Click Test to confirm the connection
- Install a model: open the command palette (Ctrl/Cmd+P) and run Install AI model, or install from settings
- Right-click any image in your vault and select Transcribe image
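Under the hood, a transcription request like the one this workflow triggers can be sketched against Ollama's `/api/generate` endpoint, which accepts base64-encoded images for vision models. The helper below only builds the JSON body; the function name, default prompt, and model argument are illustrative, not the plugin's actual internals:

```python
import base64
from pathlib import Path

def build_transcribe_request(
    image_path: str,
    model: str,
    prompt: str = "Transcribe this image to Markdown.",
) -> dict:
    """Build the JSON body for POST {server}/api/generate.

    Vision models receive the image as a base64 string in the
    "images" array alongside the text prompt.
    """
    image_b64 = base64.b64encode(Path(image_path).read_bytes()).decode("ascii")
    return {
        "model": model,
        "prompt": prompt,
        "images": [image_b64],
        "stream": False,
    }
```

POSTing this body with `Content-Type: application/json` returns a JSON object whose `response` field contains the generated Markdown, which a tool like this plugin can then write to a sibling `.md` file.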
See the user guide for detailed usage, configuration, and troubleshooting.
Created by Sébastien Dubois.
Licensed under the MIT License.
