Skip to content

Fix handling of DOIs#14704

Merged
koppor merged 2 commits intomainfrom
fix-mp-1160
Dec 23, 2025
Merged

Fix handling of DOIs#14704
koppor merged 2 commits intomainfrom
fix-mp-1160

Conversation

@koppor
Copy link
Copy Markdown
Member

@koppor koppor commented Dec 23, 2025

User description

Closes https://github.com/JabRef/jabref-issue-melting-pot/issues/1160

Steps to test

Use DOI cleanup for entries

Mandatory checks


PR Type

Bug fix


Description

  • Refactored DOI cleanup logic to handle malformed percent-encoded characters gracefully

  • Added error handling for URLDecoder to prevent crashes on invalid encoding

  • Simplified field list initialization using List.of() instead of Arrays.asList()

  • Improved code structure with functional programming patterns and AtomicBoolean tracking

  • Added test cases for wrongly encoded DOI characters in various fields


Diagram Walkthrough

flowchart LR
  A["DOI Field Processing"] --> B["URL Decode with Error Handling"]
  B --> C["Parse and Validate DOI"]
  C --> D["Clean Other Fields"]
  E["Other Fields Processing"] --> F["Extract DOI if Present"]
  F --> G["Set DOI Field if Empty"]
  G --> H["Remove DOI from Source Field"]
  I["ArXiv Processing"] --> J["Infer DOI from ArXiv ID"]
  J --> K["Set DOI if Not Exists"]
  B --> L["Graceful Fallback on Decode Error"]
Loading

File Walkthrough

Relevant files
Bug fix
DoiCleanup.java
Refactor DOI cleanup with error handling                                 

jablib/src/main/java/org/jabref/logic/cleanup/DoiCleanup.java

  • Replaced Arrays.asList() with List.of() for FIELDS initialization
  • Added try-catch block around URLDecoder to handle
    IllegalArgumentException gracefully
  • Refactored DOI field processing using functional programming with
    ifPresent() and map()
  • Introduced AtomicBoolean to track if valid DOI exists in DOI field
  • Restructured logic to separate DOI field cleanup from other field
    processing
  • Simplified ArXiv DOI inference to only execute when DOI field is empty
  • Removed unused imports and improved code readability with better
    variable naming
+58/-50 
Tests
DoiCleanupTest.java
Add tests for malformed percent-encoded DOIs                         

jablib/src/test/java/org/jabref/logic/cleanup/DoiCleanupTest.java

  • Changed @MethodSource annotation to use implicit method name matching
    instead of explicit string reference
  • Renamed test data provider method from provideDoiForAllLowers() to
    changeDoi() to match test method name
  • Added three new test cases for malformed percent-encoded characters
  • Added test for invalid percent encoding in NOTE field that should
    remain unchanged
  • Added test for valid percent-encoded DOI in NOTE field that should be
    moved to DOI field
  • Added test for invalid percent encoding in DOI field that should
    remain unchanged
+21/-3   
Documentation
CHANGELOG.md
Document DOI encoding fix in changelog                                     

CHANGELOG.md

+1/-0     

@koppor koppor requested a review from Siedlerchr December 23, 2025 17:21
@qodo-free-for-open-source-projects
Copy link
Copy Markdown
Contributor

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

🔴
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Silent error handling: The catch block at line 41-43 silently falls back to the original value without logging
the IllegalArgumentException, making debugging difficult.

Referred Code
} catch (IllegalArgumentException e) {
    // If decoding fails, we keep the original value
    decodedDoiFieldValue = currentlyStoredDoi;

Learn more about managing compliance generic rules or creating your own custom rules

Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-free-for-open-source-projects
Copy link
Copy Markdown
Contributor

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Prevent data loss from redundant cleanup

Wrap the DOI search loop in an if block to prevent redundant cleanup and
potential data loss. This ensures the loop only runs if no valid DOI was found
in the main DOI field.

jablib/src/main/java/org/jabref/logic/cleanup/DoiCleanup.java [68-78]

-for (Field field : FIELDS) {
-    entry.getField(field)
-         .flatMap(DOI::parse) // covers a full DOI only
-         .ifPresent(doi -> {
-             if (!validDoiExistsInDoiField.get()) {
-                 Optional<FieldChange> change = entry.setField(StandardField.DOI, doi.asString());
-                 change.ifPresent(changes::add);
-             }
-             removeFieldValue(entry, field, changes);
-         });
+if (!validDoiExistsInDoiField.get()) {
+    for (Field field : FIELDS) {
+        entry.getField(field)
+             .flatMap(DOI::parse) // covers a full DOI only
+             .ifPresent(doi -> {
+                 if (!validDoiExistsInDoiField.get()) {
+                     Optional<FieldChange> change = entry.setField(StandardField.DOI, doi.asString());
+                     change.ifPresent(changes::add);
+                     validDoiExistsInDoiField.set(true);
+                 }
+                 removeFieldValue(entry, field, changes);
+             });
+    }
 }
  • Apply / Chat
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies a logic flaw where a redundant loop can cause unintended side effects and proposes a sound fix that makes the code more robust and correct by preventing multiple DOIs from being moved from different fields.

Medium
Correct test case for malformed DOIs

Correct a test case by changing the field from StandardField.NOTE to
StandardField.DOI. This ensures the test accurately verifies the cleanup
behavior for malformed percent-encoded characters in the DOI field.

jablib/src/test/java/org/jabref/logic/cleanup/DoiCleanupTest.java [83-87]

 // cleanup of wrong percent encoded chars in DOI field
 Arguments.of(new BibEntry()
-                .withField(StandardField.NOTE, "10.18726/2018%7B%%5Ctextunderscore%7D3"),
+                .withField(StandardField.DOI, "10.18726/2018%7B%%5Ctextunderscore%7D3"),
         new BibEntry()
-                .withField(StandardField.NOTE, "10.18726/2018%7B%%5Ctextunderscore%7D3"));
+                .withField(StandardField.DOI, "10.18726/2018%7B%%5Ctextunderscore%7D3"));
  • Apply / Chat
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies a mismatch between the test case's description and its implementation, proposing a change that aligns the test with its intended purpose and improves test coverage for the DOI field.

Medium
  • More

@koppor koppor added this pull request to the merge queue Dec 23, 2025
Merged via the queue into main with commit 0827bd0 Dec 23, 2025
71 of 77 checks passed
@koppor koppor deleted the fix-mp-1160 branch December 23, 2025 19:15
Siedlerchr added a commit that referenced this pull request Dec 23, 2025
* main:
  Update AI usage policy (#14698)
  Fix handling of DOIs (#14704)
  Handle ohter CrossRef response (#14696)
  Fix condition for processing closed issues/PRs
  Translate the English "change to Chinese(simplified)" to the Chinese in the warning dialog (#14690)
  More performance optimization (#14695)
  Add missing dot (and a link)
  Add link to PR template also if checklist is present, but not OK (#14694)
  Fix typo in IntelliJ code style instructions (#14693)
  Add import into new library to Welcome Tab (#14669)
  Add initial search requirements (#14633)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants