
[API-01] Fixed OCR not working for images. #36

Merged: ddamme05 merged 2 commits into main from AI-000 on Nov 8, 2025

Conversation

@ddamme05 (Owner) commented Nov 8, 2025

Note

Pins the tess4j and Tesseract versions, detects the tessdata path at build time, configures exec-enabled tmpfs and the JNA tmpdir, adds native library load error handling, and updates the OCR config to restore reliable OCR.

  • OCR Runtime/Infra:
    • Dockerfile: Install Tesseract 5.x + Leptonica via PPA; add dev libs; symlink libleptonica.so; detect and persist TESSDATA_PREFIX at build; set JAVA_TOOL_OPTIONS with tmpdir/JNA flags; default TESSDATA_PREFIX to /usr/share/tesseract-ocr/5/tessdata.
    • docker-compose: Mount /tmp and /var/tmp as exec-enabled tmpfs; propagate JAVA_TOOL_OPTIONS, TESSDATA_PREFIX, and JNA_TMPDIR.
    • load-secrets.sh: Source detected tessdata.sh; remove manual JNA tmpdir setup; simplify Java exec.
  • Backend (OCR):
    • Downgrade tess4j to 5.9.0, which is compatible with the system Leptonica.
    • OcrJobHandler: Catch UnsatisfiedLinkError, record native_library_load_failed metric, and rethrow.
    • OcrService: Validate existence of *.traineddata, set datapath explicitly, and add debug logging.
    • application.yml: Default ai.worker.ocr.data-path to /usr/share/tesseract-ocr/5/tessdata.
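
The two backend changes can be illustrated with a minimal, hedged sketch. It is not the code from this PR: `OcrSketch`, `hasTrainedData`, `requireTrainedData`, `runJob`, and the `nativeLibraryLoadFailed` counter are hypothetical names, the counter stands in for whatever metrics library the project uses, and tess4j is deliberately left out so the example runs on the JDK alone. It assumes language data files are named `<language>.traineddata` directly under the data path.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

final class OcrSketch {
    // Hypothetical stand-in for the native_library_load_failed metric.
    static final AtomicLong nativeLibraryLoadFailed = new AtomicLong();

    /** Returns true if dataPath/<language>.traineddata exists as a regular file. */
    static boolean hasTrainedData(Path dataPath, String language) {
        return Files.isRegularFile(dataPath.resolve(language + ".traineddata"));
    }

    /** Fails fast with a descriptive message when the language data is missing. */
    static Path requireTrainedData(Path dataPath, String language) {
        if (!hasTrainedData(dataPath, language)) {
            throw new IllegalStateException(
                "Missing " + language + ".traineddata in " + dataPath);
        }
        // In a real service the validated path would be passed explicitly to
        // tess4j, e.g. via Tesseract.setDatapath(String).
        return dataPath;
    }

    /** Runs a job; counts native library load failures and rethrows them. */
    static <T> T runJob(Supplier<T> job) {
        try {
            return job.get();
        } catch (UnsatisfiedLinkError e) {
            nativeLibraryLoadFailed.incrementAndGet();
            throw e; // rethrow so the caller still sees the failure
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("tessdata");
        Files.createFile(dir.resolve("eng.traineddata"));
        System.out.println("validated: " + requireTrainedData(dir, "eng"));

        try {
            // Simulate the native library failing to load.
            runJob(() -> { throw new UnsatisfiedLinkError("simulated"); });
        } catch (UnsatisfiedLinkError e) {
            System.out.println("failures recorded: " + nativeLibraryLoadFailed.get());
        }
    }
}
```

Validating the file up front turns a missing or mis-detected tessdata directory into a readable startup error rather than a failure inside native code. Rethrowing after recording the metric keeps the job's failure visible to whatever retry or error handling wraps the handler.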

Written by Cursor Bugbot for commit c26441d.

github-actions bot commented Nov 8, 2025

Test Results

75 tests ±0, 17 suites ±0, 17 files ±0
75 ✅ ±0, 0 💤 ±0, 0 ❌ ±0
34s ⏱️ -1s

Results for commit c26441d. ± Comparison against base commit 32a2127.

♻️ This comment has been updated with latest results.

@ddamme05 merged commit 0ec3d6c into main on Nov 8, 2025 (5 checks passed)
@ddamme05 deleted the AI-000 branch on November 8, 2025 at 20:22