ADFA-3938 | Fix OCR property scaling issues #1289

Merged
jatezzz merged 2 commits into stage from fix/ADFA-3938-ocr-property-scaling-experimental
May 12, 2026

@jatezzz jatezzz commented May 11, 2026

Description

This PR fixes a bug in the Computer Vision parser where OCR can misread part of a unit label (such as "dp") as an extra trailing zero. The result was incorrectly scaled properties, for example a widget sized at 1500dp instead of the intended 150dp.

Details

  • Implemented normalizeOcrDimensionNumber within ValueCleanersImpl.kt to detect large numeric parts (4+ digits) that end in a zero before a unit and trim that single trailing zero.
  • Added explicitDimensionRegex to specifically match and normalize strings containing explicit units like dp, sp, px, or dip.
  • Included a new unit test in FuzzyAttributeParserTest.kt to ensure that strings like "1500dp" are correctly scaled back to "150dp".
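
For readers without the diff at hand, the points above can be sketched roughly as follows. This is a minimal reconstruction from the description, not the merged code: the regex pattern, the use of Long parsing, and the wrapper normalizeExplicitDimension are all assumptions made for illustration.

```kotlin
// Hypothetical sketch; the merged ValueCleanersImpl.kt may differ in detail.

// Assumed pattern: a signed integer followed by one of the units named in the PR.
val explicitDimensionRegex = Regex("""^\s*(-?\d+)\s*(dp|sp|px|dip)\s*$""")

fun normalizeOcrDimensionNumber(numericPart: String): String {
    // Parsing canonicalizes the string: "0010" becomes 10 and "-0" becomes 0.
    val value = numericPart.trim().toLongOrNull() ?: return numericPart
    // Four or more digits ending in '0' is treated as a unit glyph misread as a zero.
    val inflated = (value >= 1000 || value <= -1000) && value % 10 == 0L
    return (if (inflated) value / 10 else value).toString()
}

// Illustrative wrapper (not named in the PR): returns null when no explicit unit is present.
fun normalizeExplicitDimension(raw: String): String? {
    val match = explicitDimensionRegex.matchEntire(raw) ?: return null
    val (number, unit) = match.destructured
    return normalizeOcrDimensionNumber(number) + unit
}

fun main() {
    println(normalizeExplicitDimension("1500dp"))       // 150dp
    println(normalizeExplicitDimension("0010dp"))       // 10dp
    println(normalizeExplicitDimension("0000dp"))       // 0dp
    println(normalizeExplicitDimension("150dp"))        // 150dp
    println(normalizeExplicitDimension("wrap_content")) // null
}
```

Canonicalizing before applying the digit-count check matters in this sketch: "0010" has four characters and ends in '0', but it parses to 10 and is therefore left alone.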
Screen.Recording.2026-05-11.at.11.56.14.AM.mov

Ticket

ADFA-3938

Observation

The normalization logic targets dimensions with four or more digits ending in '0', as these are the most common instances of OCR "noise" where the unit is merged into the numeric value. This heuristic avoids affecting smaller, standard dimensions.

@jatezzz jatezzz requested review from a team and Daniel-ADFA May 11, 2026 19:41

coderabbitai Bot commented May 11, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d97c1def-ce67-4a20-93e9-4a82c20be10f

📥 Commits

Reviewing files that changed from the base of the PR and between 375a9dc and de25331.

📒 Files selected for processing (2)
  • cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt
  • cv-image-to-xml/src/test/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParserTest.kt
📝 Walkthrough

DimensionCleaner now detects dimension strings that already specify units (dp, sp, px, dip) via regex and normalizes OCR-derived numeric portions using new trailing-zero correction logic, preserving the original unit suffix. Tests verify normalization for trailing zeros and zero-padding.

Changes

Dimension OCR Normalization

Layer / File(s) Summary
Explicit Dimension Pattern Recognition
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt
New explicitDimensionRegex matches dimension strings with explicit units (dp, sp, px, dip).
OCR Number Normalization Logic
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt
Helper normalizeOcrDimensionNumber canonicalizes integer strings, corrects suspected trailing-zero inflation for large values, and normalizes -0 to 0.
Dimension Cleaner Implementation
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt
DimensionCleaner.clean short-circuits when input already contains a unit, normalizes the numeric portion with OCR-aware logic, and otherwise follows the existing fixed-unit/NumberCleaner flow with added OCR normalization before appending dp.
Dimension Normalization Tests
cv-image-to-xml/src/test/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParserTest.kt
Three new test cases validate normalization of dp dimension strings with trailing zeros and zero-padding (e.g., 1500dp → 150dp, 0010dp → 10dp, 0000dp → 0dp).
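
The control flow summarized for DimensionCleaner.clean might look something like the sketch below. It is an assumption-heavy illustration: cleanDimension and extractNumber are hypothetical names, extractNumber is only a stand-in for the existing NumberCleaner (whose real behavior is not shown in this thread), and the regex is a guess at the explicit-unit pattern.

```kotlin
// Hypothetical sketch of the two branches described in the walkthrough.

// Assumed explicit-unit pattern; the real explicitDimensionRegex is not shown here.
private val explicitDimensionRegex = Regex("""^(-?\d+)\s*(dp|sp|px|dip)$""")

// Compact form of the OCR correction: drop one trailing zero from 4+ digit values.
fun normalizeOcrDimensionNumber(digits: String): String {
    val n = digits.toLongOrNull() ?: return digits
    val inflated = (n >= 1000 || n <= -1000) && n % 10 == 0L
    return (if (inflated) n / 10 else n).toString()
}

// Stand-in for NumberCleaner (assumption): first signed integer found in the text.
private fun extractNumber(text: String): String? = Regex("""-?\d+""").find(text)?.value

fun cleanDimension(raw: String): String {
    val input = raw.trim()
    // Branch 1: the input already names a unit, so normalize the number and keep the unit.
    explicitDimensionRegex.matchEntire(input)?.let { match ->
        val (number, unit) = match.destructured
        return normalizeOcrDimensionNumber(number) + unit
    }
    // Branch 2: no unit present, so normalize the extracted number and append dp.
    val number = extractNumber(input) ?: return input
    return normalizeOcrDimensionNumber(number) + "dp"
}

fun main() {
    println(cleanDimension("1500dp")) // 150dp
    println(cleanDimension("24sp"))   // 24sp
    println(cleanDimension("48"))     // 48dp
    println(cleanDimension("1500"))   // 150dp
}
```

Checking for an explicit unit first is what keeps a value such as 24sp from being rewritten with a dp suffix in this sketch.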

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • appdevforall/CodeOnTheGo#1233: Both PRs modify OCR dimension-number cleaning/normalization logic (main PR adds explicit-unit handling and trailing-zero normalization; related PR adjusts dimension extraction/cleanup).

Suggested reviewers

  • avestaadfa
  • Daniel-ADFA


🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title clearly and specifically references the main change: fixing OCR property scaling issues, which is the primary objective of the changeset.
Description check | ✅ Passed | The description is directly related to the changeset, providing context about the OCR bug fix, implementation details, and test coverage.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
cv-image-to-xml/src/test/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParserTest.kt (1)

151-157: ⚡ Quick win

Expand test coverage for additional edge cases.

The current tests validate the main scenarios, but consider adding tests for:

  1. Boundary case: height: 1000dp - documents whether this normalizes to 100dp or stays 1000dp
  2. No-normalization case: height: 1234dp - verifies values without trailing zeros are unchanged
  3. Negative values: height: -1500dp - ensures negative dimensions normalize correctly to -150dp
  4. Other units: height: 1500sp or height: 1500px - validates normalization works for all supported units
  5. Values below threshold: height: 150dp - confirms small dimensions are unaffected

These tests would make the heuristic behavior more explicit and prevent regressions.
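
One way to pin down the five cases listed above is a small table of input and expected pairs. The sketch below is illustrative only: cleanDimensionUnderTest is a hypothetical string-based stand-in for the real parser path (the actual tests would go through FuzzyAttributeParser.parse, whose signature is not shown in this thread), and the expected value for 1000dp assumes the heuristic does trim it to 100dp.

```kotlin
// Hypothetical stand-in for the code under test, written against the PR's stated rule:
// four or more digits ending in '0' lose one trailing zero.
fun cleanDimensionUnderTest(raw: String): String {
    val match = Regex("""^(-?)0*(\d+)(dp|sp|px|dip)$""").matchEntire(raw.trim()) ?: return raw
    val (sign, digits, unit) = match.destructured
    val corrected = if (digits.length >= 4 && digits.endsWith('0')) digits.dropLast(1) else digits
    // Avoid emitting "-0" for inputs such as "-0dp".
    val keptSign = if (corrected == "0") "" else sign
    return keptSign + corrected + unit
}

// Edge cases from the review comment, paired with the behavior each test would document.
val edgeCases = listOf(
    "1000dp" to "100dp",   // boundary: smallest value the heuristic touches (assumed)
    "1234dp" to "1234dp",  // no trailing zero, so unchanged
    "-1500dp" to "-150dp", // sign preserved
    "1500sp" to "150sp",   // other units behave like dp
    "1500px" to "150px",
    "150dp" to "150dp",    // below the four-digit threshold
)

fun main() {
    for ((input, expected) in edgeCases) {
        val actual = cleanDimensionUnderTest(input)
        check(actual == expected) { "$input: expected $expected but got $actual" }
    }
    println("all ${edgeCases.size} edge cases pass")
}
```

The 1000dp row is the one worth deciding on explicitly, since a layout that really is 1000dp wide is indistinguishable from an inflated 100dp under this rule.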

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@cv-image-to-xml/src/test/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParserTest.kt`
around lines 151 - 157, Add new unit tests to FuzzyAttributeParserTest that call
FuzzyAttributeParser.parse with EditText and annotations exercising the edge
cases: (1) a boundary case like "height: 1000dp" asserting expected normalized
value (document whether it should be "100dp" or "1000dp"), (2) a
no-normalization case "height: 1234dp" asserting it remains "1234dp", (3) a
negative value "height: -1500dp" asserting it normalizes to "-150dp", (4) other
units like "height: 1500sp" and "height: 1500px" asserting they normalize the
same way as dp, and (5) a below-threshold case "height: 150dp" asserting it
stays "150dp"; name tests clearly (e.g., allZeroPaddedBoundary_normalizes,
noNormalization_keepsValue, negativeValue_normalizes, otherUnits_normalize,
belowThreshold_noChange) and use assertEquals against the parsed map entries
(android:layout_height).
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt (1)

70-85: ⚖️ Poor tradeoff

Add test cases for the dimension normalization heuristic boundary conditions.

The heuristic correctly handles the target OCR bug (e.g., 1500dp → 150dp), and existing tests confirm this works. However, the test suite is missing coverage for edge cases:

  • Boundary case: 1000dp (divides to 100dp)
  • No-normalization case: 1234dp (no trailing zero, should remain 1234dp)
  • Negative values: -1500dp (should become -150dp)

Consider adding test cases to document expected behavior for these scenarios, particularly since the heuristic applies to any dimension ≥ 1000 ending in '0'.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt`
around lines 70 - 85, Add unit tests targeting
ValueCleanersImpl.normalizeOcrDimensionNumber to cover the heuristic boundary
cases: assert that "1000" normalizes to "100" (boundary that triggers division),
"1234" remains "1234" (no trailing zero/no normalization), and "-1500"
normalizes to "-150" (negative value preserves sign after division). Use the
same test harness and input formatting as existing tests for numericPart
(strip/keep any "dp" as your test convention) and include clear assertions for
each case so the behavior is documented.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1d9ef608-27d8-406d-928b-95627f2280e7

📥 Commits

Reviewing files that changed from the base of the PR and between 853d4fc and d422a5e.

📒 Files selected for processing (2)
  • cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt
  • cv-image-to-xml/src/test/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParserTest.kt

@jatezzz jatezzz force-pushed the fix/ADFA-3938-ocr-property-scaling-experimental branch from d422a5e to 375a9dc Compare May 12, 2026 13:20
@jatezzz jatezzz requested a review from hal-eisen-adfa May 12, 2026 13:21
jatezzz added 2 commits May 12, 2026 12:56
Prevents misinterpretation of trailing units as an extra zero (e.g., 1500dp to 150dp).
@jatezzz jatezzz force-pushed the fix/ADFA-3938-ocr-property-scaling-experimental branch from 375a9dc to de25331 Compare May 12, 2026 17:56
@jatezzz jatezzz merged commit 531cdcf into stage May 12, 2026
2 checks passed
@jatezzz jatezzz deleted the fix/ADFA-3938-ocr-property-scaling-experimental branch May 12, 2026 18:03