
Iroha-P/FormulaSnap


English | 简体中文

FormulaSnap

Screenshot math formulas, recognize with AI vision models, convert to MathML, and paste directly into MathType. Built for academic writing workflows.

A Chrome / Edge extension that uses AI vision models to recognize math formulas and output both LaTeX and MathML — paste the MathML straight into MathType in Word and keep editing.

Screenshots

Popup Mode

[Screenshot: Popup Demo]

Full-Page Mode

[Screenshot: Web Demo]

Features

  • Clipboard paste — Use Win+Shift+S to capture any region (inside or outside the browser), then press Ctrl+V to recognize
  • Multiple input methods — Clipboard paste, image URL, or local file upload
  • Multi-provider AI — SiliconFlow (free), Google Gemini (free tier), OpenAI-compatible, and Anthropic Claude
  • Free models included — SiliconFlow offers completely free GLM-4.1V and Qwen3-VL vision models
  • One-click copy — Copy MathML (paste directly into MathType) or raw LaTeX
  • Live preview — Rendered formula preview, editable LaTeX with re-render
  • History — Automatically saves the most recent 20 recognitions
  • Two modes — Popup and full-page
  • Dark tech theme — Clean, minimal dark UI

Supported Providers

| Provider | Best for | Free tier |
|---|---|---|
| SiliconFlow | Recommended for users in China; fast, with free models | GLM-4.1V-9B-Thinking, Qwen3-VL-8B |
| Google Gemini | High-quality multimodal understanding | Gemini 2.5 Flash Lite (1000 requests/day) |
| OpenAI Compatible | Custom endpoints, third-party relays | Depends on provider |
| Anthropic Claude | Best recognition quality for complex formulas | None (paid) |

Quick Start

Step 1 — Install the extension

  1. Download this repository to your machine
  2. Open chrome://extensions/ in Chrome or Edge
  3. Turn on Developer mode (top-right)
  4. Click Load unpacked and select the formula-snap folder (the one that contains manifest.json)
  5. The extension icon appears in the browser toolbar

Step 2 — Get an API key

You need an API key from an AI vision provider. SiliconFlow is the recommended default — it's fast, has free vision models, and requires no credit card.


Option 1 — SiliconFlow (recommended)

Completely free vision models, no credit card required.

  1. Visit https://cloud.siliconflow.cn/
  2. Sign up with a phone number or email
  3. Go to the console and open API Keys, or visit https://cloud.siliconflow.cn/account/ak directly
  4. Click Create API Key and copy the key (starts with sk-)
  5. Back in FormulaSnap, click the settings icon
  6. Select SiliconFlow as the provider
  7. Paste the API key
  8. Choose GLM-4.1V-9B Thinking (free) or Qwen3-VL-8B (free)
  9. Click Save

Free model notes:

  • GLM-4.1V-9B Thinking — Free, supports chain-of-thought reasoning, strong formula recognition
  • Qwen3-VL-8B — Free, fast
  • Qwen3-VL-32B — Low-cost paid, higher accuracy (recommended when budget allows)

Option 2 — Google Gemini (free tier, requires VPN in some regions)

  1. Visit https://aistudio.google.com/apikey
  2. Sign in with a Google account
  3. Click Create API Key and pick a project (or create one)
  4. Copy the API key (starts with AIzaSy)
  5. In FormulaSnap, select Google Gemini in settings
  6. Paste the key and choose Gemini 2.5 Flash Lite
  7. Save

Free tier notes:

  • Gemini 2.5 Flash Lite — 1000 requests/day, fastest, recommended
  • Gemini 2.5 Flash — 250 requests/day, with thinking
  • Gemini 2.5 Pro — 100 requests/day, best quality
  • Note: The Gemini 2.0 series was retired in March 2026 — use the 2.5 series
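
For readers curious what a Gemini recognition call might look like, here is a minimal sketch. It assumes the public `generateContent` REST endpoint and the model ID `gemini-2.5-flash-lite`; the function names and the prompt text are hypothetical and are not taken from the extension's source.

```javascript
// Build fetch() arguments for a Gemini generateContent call (sketch only).
function buildGeminiRequest({ apiKey, model, base64Image, mimeType = "image/png", prompt }) {
  return {
    url: `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json", "x-goog-api-key": apiKey },
      body: JSON.stringify({
        contents: [
          {
            parts: [
              // The image travels inline as base64, followed by the instruction text.
              { inline_data: { mime_type: mimeType, data: base64Image } },
              { text: prompt },
            ],
          },
        ],
      }),
    },
  };
}

// Join the text parts of the first candidate; returns "" if there are none.
function readGeminiReply(json) {
  const parts = json?.candidates?.[0]?.content?.parts ?? [];
  return parts.map((part) => part.text ?? "").join("");
}
```

Usage would be along the lines of `const { url, init } = buildGeminiRequest({...}); const json = await (await fetch(url, init)).json();` followed by `readGeminiReply(json)`.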

Option 3 — OpenAI Compatible (for users with custom endpoints)

Works with any service that speaks the OpenAI chat completions API — OpenAI proxies, third-party relays, etc.

  1. Select OpenAI Compatible (Custom) in settings
  2. Enter the API endpoint (e.g. https://api.example.com/v1/chat/completions)
  3. Enter your API key
  4. Enter the model name (e.g. gpt-4o, claude-3-5-sonnet, depending on your provider)
  5. Save
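
As an illustration of what "speaks the OpenAI chat completions API" means for image input, the sketch below builds such a request. The helper names and the prompt text are hypothetical (the extension's real system prompt is not shown in this README); the message shape follows the standard chat completions format with an inline data-URL image.

```javascript
// Hypothetical prompt text, for illustration only.
const SYSTEM_PROMPT = "Return only the LaTeX for the formula in the image.";

// Build fetch() arguments for an OpenAI-style chat completions endpoint.
function buildOpenAIRequest({ endpoint, apiKey, model, base64Image, mimeType = "image/png" }) {
  return {
    url: endpoint,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model,
        messages: [
          { role: "system", content: SYSTEM_PROMPT },
          {
            role: "user",
            content: [
              // Images are passed inline as a data URL.
              { type: "image_url", image_url: { url: `data:${mimeType};base64,${base64Image}` } },
            ],
          },
        ],
      }),
    },
  };
}

// Pull the model's text reply out of a chat completions response.
function readOpenAIReply(json) {
  return json?.choices?.[0]?.message?.content ?? "";
}
```

If a relay rejects the request, comparing its documentation against this shape (endpoint path, `Authorization` header, `image_url` content part) is usually the quickest check.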

Option 4 — Anthropic Claude (paid, best quality)

  1. Visit https://console.anthropic.com/settings/keys
  2. Create an Anthropic account (credit card required)
  3. Create an API key (starts with sk-ant-)
  4. In FormulaSnap, select Claude (Anthropic) and paste the key
  5. Save
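
A rough sketch of a Claude request follows, assuming the Anthropic Messages API. The function names, the default `maxTokens`, and the user instruction text are hypothetical. The `anthropic-dangerous-direct-browser-access` header is Anthropic's opt-in for calls made directly from a browser context; whether the extension needs or sends it is an assumption here.

```javascript
// Build fetch() arguments for the Anthropic Messages API (sketch only).
function buildClaudeRequest({ apiKey, model, base64Image, mimeType = "image/png", systemPrompt, maxTokens = 1024 }) {
  return {
    url: "https://api.anthropic.com/v1/messages",
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        "x-api-key": apiKey,
        "anthropic-version": "2023-06-01",
        // Opt-in for direct browser calls; assumed, not confirmed for this extension.
        "anthropic-dangerous-direct-browser-access": "true",
      },
      body: JSON.stringify({
        model,
        max_tokens: maxTokens,
        system: systemPrompt,
        messages: [
          {
            role: "user",
            content: [
              { type: "image", source: { type: "base64", media_type: mimeType, data: base64Image } },
              { type: "text", text: "Transcribe this formula." },
            ],
          },
        ],
      }),
    },
  };
}

// Concatenate the text blocks of a Messages API response.
function readClaudeReply(json) {
  return (json?.content ?? [])
    .filter((block) => block.type === "text")
    .map((block) => block.text)
    .join("");
}
```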

Step 3 — Use it

| Method | How |
|---|---|
| Paste (recommended) | Press Win+Shift+S to capture the formula, then press Ctrl+V in the extension |
| URL | Switch to the URL tab, paste an image URL, and click recognize |
| Upload | Switch to the Upload tab, drag in an image, or click to select a file |
| Full-page mode | Click the fullscreen button in the popup header to open the full-page app |
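
The paste method relies on the browser's clipboard events. The sketch below shows one way an extension page could pick the screenshot out of a paste event and base64-encode it; `findPastedImage`, `blobToBase64`, and the `recognize` call in the comment are hypothetical names, not the extension's actual functions.

```javascript
// Return the first image file among a paste event's clipboard items, or null.
function findPastedImage(clipboardItems) {
  for (const item of Array.from(clipboardItems ?? [])) {
    if (item.kind === "file" && item.type.startsWith("image/")) {
      return item.getAsFile();
    }
  }
  return null;
}

// Read a File/Blob as base64, dropping the "data:...;base64," prefix.
// Uses FileReader, so this part only runs in a browser context.
function blobToBase64(blob) {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(String(reader.result).split(",")[1]);
    reader.onerror = () => reject(reader.error);
    reader.readAsDataURL(blob);
  });
}

// Browser wiring (recognize is a hypothetical handler):
// document.addEventListener("paste", async (event) => {
//   const file = findPastedImage(event.clipboardData?.items);
//   if (file) recognize(await blobToBase64(file), file.type);
// });
```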

After recognition:

  • Copy MathML → open MathType in Word → Ctrl+V → the formula loads automatically
  • Copy LaTeX → for LaTeX editors or other uses

Project Structure

FormulaSnap/
├── manifest.json                 # Chrome extension config (Manifest V3)
├── popup/
│   ├── popup.html                # Popup UI
│   ├── popup.css                 # Popup styles (dark theme)
│   └── popup.js                  # Popup logic: recognition, history, settings
├── app/
│   ├── app.html                  # Full-page mode
│   ├── app.css                   # Full-page styles
│   └── app.js                    # Full-page logic
├── background/
│   └── service-worker.js         # Service worker (reserved)
├── utils/
│   ├── llm-api.js                # Multi-provider LLM vision API layer
│   └── converter.js              # LaTeX → MathML conversion (Temml)
├── lib/
│   ├── temml.min.js              # Temml library
│   └── temml.css                 # Temml styles
├── assets/
│   └── screenshots/              # README screenshots
└── icons/
    ├── icon16.png
    ├── icon48.png
    └── icon128.png

How It Works

Formula image → AI vision model → LaTeX string → Temml → MathML → Clipboard → MathType

  1. The user provides a formula image (paste / URL / upload)
  2. The image is base64-encoded and sent to the selected AI vision model
  3. A dedicated system prompt instructs the model to output LaTeX only
  4. Temml converts the LaTeX to MathML entirely in the browser
  5. The user copies the MathML and pastes it into MathType in Word
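
Step 4 can be sketched as follows. Models sometimes wrap their answer in a Markdown fence or math delimiters even when told to output LaTeX only, so this example strips those before conversion. `cleanLatex` and `latexToMathML` are hypothetical helpers; the Temml library object is passed in as a parameter, and `renderToString` with `displayMode` is the only Temml call assumed.

```javascript
const TICKS = "`".repeat(3); // a Markdown code fence marker
const FENCE_RE = new RegExp("^" + TICKS + "[a-zA-Z]*\\n([\\s\\S]*?)\\n?" + TICKS + "$");

// One layer of math delimiters, checked in order: $$..$$, \[..\], \(..\), $..$
const DELIMITER_RES = [
  /^\$\$([\s\S]*)\$\$$/,
  /^\\\[([\s\S]*)\\\]$/,
  /^\\\(([\s\S]*)\\\)$/,
  /^\$([\s\S]*)\$$/,
];

// Strip wrappers a model may add around the LaTeX it returns.
function cleanLatex(reply) {
  let text = reply.trim();
  const fenced = text.match(FENCE_RE);
  if (fenced) text = fenced[1].trim();
  for (const re of DELIMITER_RES) {
    const match = text.match(re);
    if (match) {
      text = match[1].trim();
      break; // remove only the outermost layer
    }
  }
  return text;
}

// Convert a model reply to a MathML string using the Temml library object.
function latexToMathML(reply, temmlLib) {
  return temmlLib.renderToString(cleanLatex(reply), { displayMode: true });
}
```

In the extension pages the call would presumably be `latexToMathML(reply, temml)`, with `temml` being the global provided by lib/temml.min.js.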

Data Storage

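All data is stored in the browser via the chrome.storage API. Settings live in chrome.storage.sync, which the browser syncs across your devices when you are signed in with sync enabled; recognition history stays on the local machine: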

| Data | Location | Notes |
|---|---|---|
| API key, provider, model | chrome.storage.sync | Synced across devices |
| Recognition history (last 20) | chrome.storage.local | Local only |

The only outbound network traffic is the image sent to your chosen AI provider for recognition. Nothing is sent anywhere else.
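
The 20-entry history cap could be implemented along these lines. This is a sketch under stated assumptions: the `history` storage key, the entry shape, and both function names are hypothetical, and `storageArea` stands in for `chrome.storage.local` (whose `get` and `set` return promises under Manifest V3).

```javascript
const HISTORY_LIMIT = 20;

// Pure helper: newest entry first, capped at `limit` entries.
function addToHistory(history, entry, limit = HISTORY_LIMIT) {
  return [entry, ...history].slice(0, limit);
}

// storageArea is expected to behave like chrome.storage.local.
async function saveRecognition(storageArea, entry) {
  const { history = [] } = await storageArea.get("history");
  const next = addToHistory(history, entry);
  await storageArea.set({ history: next });
  return next;
}
```

Keeping the trimming logic in a pure function makes it easy to test without a browser; in the extension the call would be `saveRecognition(chrome.storage.local, { latex, mathml })` or similar.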

Shortcuts

| Shortcut | Action |
|---|---|
| Win+Shift+S | Windows system screenshot (captures any region to clipboard) |
| Ctrl+V | Paste the clipboard image into the extension and recognize |

Browser Compatibility

  • Google Chrome 88+
  • Microsoft Edge 88+
  • Other Chromium-based browsers (Manifest V3 support)

Tech Stack

  • Chrome Extension Manifest V3
  • Temml — lightweight LaTeX → MathML converter (~155KB)
  • LLM Vision API — multi-provider support (SiliconFlow / Gemini / OpenAI / Claude)
  • Pure JavaScript, no build step

License

MIT

Credits
