📝 Walkthrough
A hardcoded regex pattern used for subset detection in the …
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~2 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/main/java/org/verapdf/pd/font/PDFont.java`:
- Line 50: Update the subset detection to require a literal plus and safely
strip the prefix: change SUBSET_REGEX to enforce a literal '+' (e.g.
"^[A-Z]{6}\\+.+") and replace the blind substring(7) usage with logic that finds
the '+' and uses substring(pos+1) only when the '+' is present at the expected
position (or, better, use the regex capture group to return the portion after
the '+'). Modify the code that currently calls substring(7) so it validates the
'+' exists (and is at index 6) before slicing to avoid corrupting names.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 10e4c98e-127c-48f5-963b-66a5411acea2
📒 Files selected for processing (1)
src/main/java/org/verapdf/pd/font/PDFont.java
private static final Logger LOGGER = Logger.getLogger(PDFont.class.getCanonicalName());

public static final String SUBSET_REGEX = "^[A-Z]{6}+.+";
🧩 Analysis chain
🌐 Web query:
In Java regex, what does the quantifier `{6}+` mean, and does `^[A-Z]{6}+.+` require a literal plus sign character?
💡 Result:
In Java regex, the quantifier `{6}+` means "exactly 6 occurrences of the preceding token, possessively." A possessive quantifier does not give back what it has consumed if a later part of the pattern fails. Because the count here is fixed at exactly 6 and the token is a single-character class, there is nothing to give back, so `[A-Z]{6}+` matches the same strings as the plain `[A-Z]{6}` and the reluctant `[A-Z]{6}?`. Possessive quantifiers only change behavior, and save backtracking, when the repetition count can vary.
The pattern `^[A-Z]{6}+.+` does NOT require a literal plus sign character. The `+` after `{6}` is part of the possessive quantifier syntax for `[A-Z]{6}`. The pattern matches:
- `^`: start of string
- `[A-Z]{6}+`: exactly 6 uppercase letters, possessively
- `.+`: one or more of any character (except line terminators by default)
No literal `+` is needed; the `+` is quantifier syntax, not a character to match.
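The quantifier behavior described above can be checked with a small sketch. The class name, constant names, and sample font names are hypothetical and chosen only for illustration; only the regex strings come from the review.

```java
import java.util.regex.Pattern;

// Illustrative only: this class is not part of veraPDF.
final class PossessiveQuantifierDemo {
    // Pattern as written in the PR: "{6}+" is a possessive quantifier.
    static final Pattern POSSESSIVE = Pattern.compile("^[A-Z]{6}+.+");
    // Same pattern without the possessive marker.
    static final Pattern GREEDY = Pattern.compile("^[A-Z]{6}.+");
    // Escaping the '+' makes it a character that must appear in the input.
    static final Pattern LITERAL_PLUS = Pattern.compile("^[A-Z]{6}\\+.+");

    // True when the whole input matches the pattern.
    static boolean matches(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"ABCDEF+Arial", "ABCDEF1Arial", "HELVETICA", "Arial"};
        for (String name : samples) {
            System.out.printf("%-14s possessive=%b greedy=%b literalPlus=%b%n",
                    name,
                    matches(POSSESSIVE, name),
                    matches(GREEDY, name),
                    matches(LITERAL_PLUS, name));
        }
    }
}
```

With these samples, the possessive and plain forms agree on every input, and both accept `ABCDEF1Arial` and `HELVETICA`; only the escaped form restricts matches to names that contain the `+` separator at index 6.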
Citations:
- 1: https://docs.oracle.com/javase/tutorial/essential/regex/quant.html
- 2: https://docs.oracle.com/en/java/javase/26/docs/api/java.base/java/util/regex/Pattern.html
- 3: https://www.logicbig.com/tutorials/core-java-tutorial/java-regular-expressions/regex-possessive-quantifiers.html
- 4: https://docs.oracle.com/en/java/javase/24/docs/api/java.base/java/util/regex/Pattern.html
🌐 Web query:
For PDF font subset names, is the expected BaseFont format exactly six uppercase letters followed by '+' and the PostScript name (e.g., ABCDEF+FontName)?
💡 Result:
Yes, the expected BaseFont format for PDF font subsets is exactly six uppercase letters followed by '+' and the PostScript name of the original font (e.g., ABCDEF+FontName).
Citations:
- 1: Make BaseFont subset prefix generation deterministic LibrePDF/OpenPDF#623
- 2: https://tex.stackexchange.com/questions/156429/what-is-the-scheme-of-fonts-naming-in-pdfs-generated-by-latex-and-other-software
- 3: https://www.verypdf.com/document/pdf-format-reference/pg_0419.htm
- 4: https://synopse.info/forum/viewtopic.php?id=6229
Enforce literal + in subset regex to prevent false positives and font name corruption
At line 50, the regex `^[A-Z]{6}+.+` does not enforce the literal `+` character that the PDF spec requires for subset fonts. Names without the `+` separator (e.g., "ABCDEF1FontName") therefore match as subsets, and line 83's `substring(7)` then strips their first seven characters, corrupting valid font names.
The PDF subset format requires exactly: six uppercase letters, a literal `+`, then the PostScript name (e.g., `ABCDEF+FontName`).
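A minimal sketch of the failure mode follows. The surrounding `PDFont` code is not shown in this review, so the method below is a hypothetical reconstruction that assumes the name is sliced with `substring(7)` whenever it matches `SUBSET_REGEX`; the class name and the sample inputs are likewise assumptions.

```java
// Illustrative only: a hypothetical reconstruction, not the actual veraPDF code.
final class SubsetStripDemo {
    // Regex as it appears at line 50 of the reviewed file.
    static final String SUBSET_REGEX = "^[A-Z]{6}+.+";

    // Assumed shape of the logic at line 83: match, then drop seven characters.
    static String stripWithOriginalRegex(String fontName) {
        if (fontName.matches(SUBSET_REGEX)) {
            return fontName.substring(7);
        }
        return fontName;
    }

    public static void main(String[] args) {
        // A real subset name is handled as intended.
        System.out.println(stripWithOriginalRegex("ABCDEF+Arial"));    // Arial
        // No '+' at index 6, yet the name is still treated as a subset.
        System.out.println(stripWithOriginalRegex("ABCDEF1FontName")); // FontName
        // An all-caps, non-subset name loses its first seven letters.
        System.out.println(stripWithOriginalRegex("HELVETICA"));       // CA
    }
}
```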
Proposed fix
- public static final String SUBSET_REGEX = "^[A-Z]{6}+.+";
+ public static final String SUBSET_REGEX = "^[A-Z]{6}\\+.+";
Also applies to: 83-83
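One way to cover both lines 50 and 83 at once is to let the regex itself return the part after the `+`, so no fixed index is needed. The sketch below assumes a hypothetical helper class; the names `SubsetPrefixStripper`, `isSubset`, and `stripSubsetPrefix` do not come from the veraPDF code base, and the null handling is an assumption as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: helper names are hypothetical, not from veraPDF.
final class SubsetPrefixStripper {
    // Six uppercase letters, a literal '+', then the captured PostScript name.
    private static final Pattern SUBSET = Pattern.compile("^[A-Z]{6}\\+(.+)");

    // True only for names of the form ABCDEF+FontName.
    static boolean isSubset(String fontName) {
        return fontName != null && SUBSET.matcher(fontName).matches();
    }

    // Returns the name without its subset prefix, or the input unchanged
    // when it is not a subset name.
    static String stripSubsetPrefix(String fontName) {
        if (fontName == null) {
            return null;
        }
        Matcher matcher = SUBSET.matcher(fontName);
        return matcher.matches() ? matcher.group(1) : fontName;
    }

    public static void main(String[] args) {
        System.out.println(stripSubsetPrefix("ABCDEF+Arial"));    // Arial
        System.out.println(stripSubsetPrefix("ABCDEF1FontName")); // ABCDEF1FontName
        System.out.println(stripSubsetPrefix("HELVETICA"));       // HELVETICA
    }
}
```

Using the capture group keeps the detection rule and the slicing rule in one place, so a later change to the prefix format cannot leave a stale `substring(7)` behind.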
Summary by CodeRabbit