ADFA-3938 | Fix OCR property scaling issues#1289
Conversation
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthrough
ChangesDimension OCR Normalization
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
cv-image-to-xml/src/test/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParserTest.kt (1)
151-157: ⚡ Quick winExpand test coverage for additional edge cases.
The current tests validate the main scenarios, but consider adding tests for:
- Boundary case:
height: 1000dp- documents whether this normalizes to100dpor stays1000dp- No-normalization case:
height: 1234dp- verifies values without trailing zeros are unchanged- Negative values:
height: -1500dp- ensures negative dimensions normalize correctly to-150dp- Other units:
height: 1500sporheight: 1500px- validates normalization works for all supported units- Values below threshold:
height: 150dp- confirms small dimensions are unaffectedThese tests would make the heuristic behavior more explicit and prevent regressions.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cv-image-to-xml/src/test/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParserTest.kt` around lines 151 - 157, Add new unit tests to FuzzyAttributeParserTest that call FuzzyAttributeParser.parse with EditText and annotations exercising the edge cases: (1) a boundary case like "height: 1000dp" asserting expected normalized value (document whether it should be "100dp" or "1000dp"), (2) a no-normalization case "height: 1234dp" asserting it remains "1234dp", (3) a negative value "height: -1500dp" asserting it normalizes to "-150dp", (4) other units like "height: 1500sp" and "height: 1500px" asserting they normalize the same way as dp, and (5) a below-threshold case "height: 150dp" asserting it stays "150dp"; name tests clearly (e.g., allZeroPaddedBoundary_normalizes, noNormalization_keepsValue, negativeValue_normalizes, otherUnits_normalize, belowThreshold_noChange) and use assertEquals against the parsed map entries (android:layout_height).cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt (1)
70-85: ⚖️ Poor tradeoffAdd test cases for the dimension normalization heuristic boundary conditions.
The heuristic correctly handles the target OCR bug (e.g.,
1500dp→150dp), and existing tests confirm this works. However, the test suite is missing coverage for edge cases:
- Boundary case:
1000dp(divides to100dp)- No-normalization case:
1234dp(no trailing zero, should remain1234dp)- Negative values:
-1500dp(should become-150dp)Consider adding test cases to document expected behavior for these scenarios, particularly since the heuristic applies to any dimension ≥ 1000 ending in '0'.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt` around lines 70 - 85, Add unit tests targeting ValueCleanersImpl.normalizeOcrDimensionNumber to cover the heuristic boundary cases: assert that "1000" normalizes to "100" (boundary that triggers division), "1234" remains "1234" (no trailing zero/no normalization), and "-1500" normalizes to "-150" (negative value preserves sign after division). Use the same test harness and input formatting as existing tests for numericPart (strip/keep any "dp" as your test convention) and include clear assertions for each case so the behavior is documented.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In
`@cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.kt`:
- Around line 70-85: Add unit tests targeting
ValueCleanersImpl.normalizeOcrDimensionNumber to cover the heuristic boundary
cases: assert that "1000" normalizes to "100" (boundary that triggers division),
"1234" remains "1234" (no trailing zero/no normalization), and "-1500"
normalizes to "-150" (negative value preserves sign after division). Use the
same test harness and input formatting as existing tests for numericPart
(strip/keep any "dp" as your test convention) and include clear assertions for
each case so the behavior is documented.
In
`@cv-image-to-xml/src/test/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParserTest.kt`:
- Around line 151-157: Add new unit tests to FuzzyAttributeParserTest that call
FuzzyAttributeParser.parse with EditText and annotations exercising the edge
cases: (1) a boundary case like "height: 1000dp" asserting expected normalized
value (document whether it should be "100dp" or "1000dp"), (2) a
no-normalization case "height: 1234dp" asserting it remains "1234dp", (3) a
negative value "height: -1500dp" asserting it normalizes to "-150dp", (4) other
units like "height: 1500sp" and "height: 1500px" asserting they normalize the
same way as dp, and (5) a below-threshold case "height: 150dp" asserting it
stays "150dp"; name tests clearly (e.g., allZeroPaddedBoundary_normalizes,
noNormalization_keepsValue, negativeValue_normalizes, otherUnits_normalize,
belowThreshold_noChange) and use assertEquals against the parsed map entries
(android:layout_height).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1d9ef608-27d8-406d-928b-95627f2280e7
📒 Files selected for processing (2)
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/parser/ValueCleanersImpl.ktcv-image-to-xml/src/test/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParserTest.kt
d422a5e to
375a9dc
Compare
Prevents misinterpretation of trailing units as an extra zero (e.g., 1500dp to 150dp).
375a9dc to
de25331
Compare
Description
This PR fixes a bug in the Computer Vision parser where OCR misinterprets unit labels (like "dp") as a trailing zero. This resulted in incorrect property scaling, such as a widget defaulting to 1500dp instead of the intended 150dp.
Details
normalizeOcrDimensionNumberwithinValueCleanersImpl.ktto detect and trim trailing zeros from large numeric parts (4+ digits) that appear before a unit.explicitDimensionRegexto specifically match and normalize strings containing explicit units like dp, sp, px, or dip.FuzzyAttributeParserTest.ktto ensure that strings like "1500dp" are correctly scaled back to "150dp".Screen.Recording.2026-05-11.at.11.56.14.AM.mov
Ticket
ADFA-3938
Observation
The normalization logic targets dimensions with four or more digits ending in '0', as these are the most common instances of OCR "noise" where the unit is merged into the numeric value. This heuristic avoids affecting smaller, standard dimensions.