ADFA-3938 | Fix OCR property scaling issues by jatezzz · Pull Request #1289 · appdevforall/CodeOnTheGo

jatezzz · 2026-05-11T19:41:18Z

Description

This PR fixes a bug in the Computer Vision parser where OCR misreads a unit label (such as "dp") as an extra trailing zero. The result was incorrect property scaling, for example a widget sized at 1500dp instead of the intended 150dp.

Details

Implemented normalizeOcrDimensionNumber within ValueCleanersImpl.kt to detect and trim trailing zeros from large numeric parts (4+ digits) that appear before a unit.
Added explicitDimensionRegex to specifically match and normalize strings containing explicit units like dp, sp, px, or dip.
Included a new unit test in FuzzyAttributeParserTest.kt to ensure that strings like "1500dp" are correctly scaled back to "150dp".
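
To make the trailing-zero heuristic concrete, here is a minimal sketch of what a helper like normalizeOcrDimensionNumber could look like. It is a reconstruction from the behavior described in this PR (canonicalize the integer, drop one trailing zero from values of four or more digits, map -0 to 0), not the merged code in ValueCleanersImpl.kt. Dropping exactly one zero and returning non-numeric input unchanged are assumptions.

```kotlin
// Hypothetical reconstruction of the helper described above; the merged
// implementation in ValueCleanersImpl.kt may differ in details.
fun normalizeOcrDimensionNumber(raw: String): String {
    val negative = raw.startsWith("-")
    val digits = raw.removePrefix("-")

    // Assumption: anything that is not a plain integer is returned untouched.
    if (digits.isEmpty() || !digits.all { it in '0'..'9' }) return raw

    // Canonicalize: "0010" -> "10", "0000" -> "0".
    val canonical = digits.trimStart('0').ifEmpty { "0" }

    // Heuristic: a value with 4+ digits ending in '0' is treated as OCR
    // inflation, so one trailing zero is dropped ("1500" -> "150").
    val corrected =
        if (canonical.length >= 4 && canonical.endsWith('0')) canonical.dropLast(1)
        else canonical

    // Keep the sign, but never emit "-0".
    return if (negative && corrected != "0") "-$corrected" else corrected
}
```

Canonicalizing before the length check is what keeps a zero-padded input such as "0010" from being treated as a four-digit value.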

Screen.Recording.2026-05-11.at.11.56.14.AM.mov

Ticket

ADFA-3938

Observation

The normalization logic targets dimensions with four or more digits ending in '0', as these are the most common instances of OCR "noise" where the unit is merged into the numeric value. This heuristic avoids affecting smaller, standard dimensions.

coderabbitai · 2026-05-11T19:49:31Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d97c1def-ce67-4a20-93e9-4a82c20be10f

📥 Commits

Reviewing files that changed from the base of the PR and between 375a9dc and de25331.

📒 Files selected for processing (2)

cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt
cv-image-to-xml/src/test/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParserTest.kt

📝 Walkthrough

DimensionCleaner now detects dimension strings that already specify units (dp, sp, px, dip) via regex and normalizes OCR-derived numeric portions using new trailing-zero correction logic, preserving the original unit suffix. Tests verify normalization for trailing zeros and zero-padding.

Changes

Dimension OCR Normalization

| Layer / File(s) | Summary |
| --- | --- |
| Explicit Dimension Pattern Recognition: `cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt` | New `explicitDimensionRegex` matches dimension strings with explicit units (`dp`, `sp`, `px`, `dip`). |
| OCR Number Normalization Logic: `cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt` | Helper `normalizeOcrDimensionNumber` canonicalizes integer strings, corrects suspected trailing-zero inflation for large values, and normalizes `-0` to `0`. |
| Dimension Cleaner Implementation: `cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt` | `DimensionCleaner.clean` short-circuits when input already contains a unit, normalizes the numeric portion with OCR-aware logic, and otherwise follows the existing fixed-unit/NumberCleaner flow with added OCR normalization before appending `dp`. |
| Dimension Normalization Tests: `cv-image-to-xml/src/test/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParserTest.kt` | Three new test cases validate normalization of `dp` dimension strings with trailing zeros and zero-padding (e.g., `1500dp` → `150dp`, `0010dp` → `10dp`, `0000dp` → `0dp`). |
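
The control flow summarized for `DimensionCleaner.clean` can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: the pattern is a stand-in for `explicitDimensionRegex`, the keyword set for the fixed-unit branch is a guess, the digit extraction is a placeholder for the existing NumberCleaner, and the number normalizer is passed in as a parameter so the sketch does not depend on the real helper.

```kotlin
// Hypothetical stand-in for explicitDimensionRegex; the merged pattern may differ.
private val explicitDimension =
    Regex("""(-?\d+)\s*(dp|dip|sp|px)""", RegexOption.IGNORE_CASE)

// Assumed keyword values for the "fixed-unit" branch; not taken from the PR.
private val fixedValues = setOf("match_parent", "wrap_content")

// Placeholder for the existing NumberCleaner: grabs the first integer, if any.
private val firstInteger = Regex("""-?\d+""")

fun cleanDimension(input: String, normalize: (String) -> String): String {
    val trimmed = input.trim()

    // 1. Explicit unit present: normalize only the number and keep the
    //    unit suffix exactly as it was written.
    explicitDimension.matchEntire(trimmed)?.let { match ->
        val (number, unit) = match.destructured
        return normalize(number) + unit
    }

    // 2. Keyword values pass through unchanged (assumed behavior).
    if (trimmed in fixedValues) return trimmed

    // 3. No unit: extract the number, normalize it, then append "dp".
    val number = firstInteger.find(trimmed)?.value ?: return trimmed
    return normalize(number) + "dp"
}
```

In the PR itself the `normalize` argument would correspond to `normalizeOcrDimensionNumber`; passing it in here simply keeps the three branches visible without restating the heuristic.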

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

appdevforall/CodeOnTheGo#1233: Both PRs modify OCR dimension-number cleaning/normalization logic (main PR adds explicit-unit handling and trailing-zero normalization; related PR adjusts dimension extraction/cleanup).

Suggested reviewers

avestaadfa
Daniel-ADFA

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically references the main change: fixing OCR property scaling issues, which is the primary objective of the changeset. |
| Description check | ✅ Passed | The description is directly related to the changeset, providing context about the OCR bug fix, implementation details, and test coverage. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

🧹 Nitpick comments (2)

cv-image-to-xml/src/test/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParserTest.kt (1)
151-157: ⚡ Quick win

Expand test coverage for additional edge cases.

The current tests validate the main scenarios, but consider adding tests for:

Boundary case: height: 1000dp - documents whether this normalizes to 100dp or stays 1000dp

No-normalization case: height: 1234dp - verifies values without trailing zeros are unchanged

Negative values: height: -1500dp - ensures negative dimensions normalize correctly to -150dp

Other units: height: 1500sp or height: 1500px - validates normalization works for all supported units

Values below threshold: height: 150dp - confirms small dimensions are unaffected

These tests would make the heuristic behavior more explicit and prevent regressions.
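
The suggested edge cases could be captured as a small expectation table like the one below. This is only a sketch: the real tests go through FuzzyAttributeParser.parse with "height: ..." annotations, whose signature is not shown in this thread, so the cleaner under test is passed in as a plain function. The boundary expectation (1000dp becomes 100dp) follows the description of the heuristic in the other review comment and is an assumption about intended behavior.

```kotlin
// Expected results for the reviewer's edge cases. The 1000dp row is an
// assumption about intended behavior, not a confirmed result.
val edgeCases: List<Pair<String, String>> = listOf(
    "1000dp" to "100dp",   // boundary: smallest value the heuristic touches
    "1234dp" to "1234dp",  // no trailing zero, so no normalization
    "-1500dp" to "-150dp", // negative value keeps its sign
    "1500sp" to "150sp",   // other units behave like dp
    "1500px" to "150px",
    "150dp" to "150dp",    // below the 4-digit threshold
)

// Runs every case against a cleaner and returns readable failure messages;
// an empty list means all expectations hold.
fun failingCases(clean: (String) -> String): List<String> =
    edgeCases.mapNotNull { (input, expected) ->
        val actual = clean(input)
        if (actual == expected) null
        else "$input: expected $expected but got $actual"
    }
```

In a JUnit test, one assertion that `failingCases` returns an empty list for the cleaner under test would cover all six cases and report each mismatch by input.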
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@cv-image-to-xml/src/test/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParserTest.kt`
around lines 151 - 157, Add new unit tests to FuzzyAttributeParserTest that call
FuzzyAttributeParser.parse with EditText and annotations exercising the edge
cases: (1) a boundary case like "height: 1000dp" asserting expected normalized
value (document whether it should be "100dp" or "1000dp"), (2) a
no-normalization case "height: 1234dp" asserting it remains "1234dp", (3) a
negative value "height: -1500dp" asserting it normalizes to "-150dp", (4) other
units like "height: 1500sp" and "height: 1500px" asserting they normalize the
same way as dp, and (5) a below-threshold case "height: 150dp" asserting it
stays "150dp"; name tests clearly (e.g., allZeroPaddedBoundary_normalizes,
noNormalization_keepsValue, negativeValue_normalizes, otherUnits_normalize,
belowThreshold_noChange) and use assertEquals against the parsed map entries
(android:layout_height).
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt (1)
70-85: ⚖️ Poor tradeoff

Add test cases for the dimension normalization heuristic boundary conditions.

The heuristic correctly handles the target OCR bug (e.g., 1500dp → 150dp), and existing tests confirm this works. However, the test suite is missing coverage for edge cases:

Boundary case: 1000dp (divides to 100dp)

No-normalization case: 1234dp (no trailing zero, should remain 1234dp)

Negative values: -1500dp (should become -150dp)

Consider adding test cases to document expected behavior for these scenarios, particularly since the heuristic applies to any dimension ≥ 1000 ending in '0'.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt`
around lines 70 - 85, Add unit tests targeting
ValueCleanersImpl.normalizeOcrDimensionNumber to cover the heuristic boundary
cases: assert that "1000" normalizes to "100" (boundary that triggers division),
"1234" remains "1234" (no trailing zero/no normalization), and "-1500"
normalizes to "-150" (negative value preserves sign after division). Use the
same test harness and input formatting as existing tests for numericPart
(strip/keep any "dp" as your test convention) and include clear assertions for
each case so the behavior is documented.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1d9ef608-27d8-406d-928b-95627f2280e7

📥 Commits

Reviewing files that changed from the base of the PR and between 853d4fc and d422a5e.

📒 Files selected for processing (2)

cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt
cv-image-to-xml/src/test/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParserTest.kt

jatezzz requested review from a team and Daniel-ADFA May 11, 2026 19:41

coderabbitai Bot reviewed May 11, 2026

hal-eisen-adfa reviewed May 11, 2026

Comment thread ...src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt Outdated

jatezzz force-pushed the fix/ADFA-3938-ocr-property-scaling-experimental branch from d422a5e to 375a9dc Compare May 12, 2026 13:20

jatezzz requested a review from hal-eisen-adfa May 12, 2026 13:21

hal-eisen-adfa approved these changes May 12, 2026

jatezzz added 2 commits May 12, 2026 12:56

fix(cv): normalize OCR dimension values to correct property scaling

e25b1df

Prevents misinterpretation of trailing units as an extra zero (e.g., 1500dp to 150dp).

refactor: cleaned regex

de25331

jatezzz force-pushed the fix/ADFA-3938-ocr-property-scaling-experimental branch from 375a9dc to de25331 Compare May 12, 2026 17:56

jatezzz merged commit 531cdcf into stage May 12, 2026
2 checks passed

jatezzz deleted the fix/ADFA-3938-ocr-property-scaling-experimental branch May 12, 2026 18:03

coderabbitai Bot mentioned this pull request May 22, 2026

ADFA-4033 | Improve OCR sanitization, value cleaners, and widget support #1333

Merged
