Add OpenCitations#14996

Merged
koppor merged 33 commits into main from add-open-citations
Feb 2, 2026

Conversation

@koppor
Member

@koppor koppor commented Feb 1, 2026

User description

This adds support for OpenCitations.

Steps to test

  1. Open Chocolate.bib
  2. Select any entry
  3. Open tab "Citations"
  4. Switch to "OpenCitations"

Mandatory checks


PR Type

Enhancement


Description

  • Add OpenCitations fetcher support for citation relations

  • Implement bidirectional binding for fetcher selection preference

  • Create response model classes for OpenCitations API

  • Fix DOI URL encoding in CrossRef fetcher

  • Unify citation fetcher initialization in CLI commands


Diagram Walkthrough

flowchart LR
  A["CitationFetcherType"] -->|"adds OPEN_CITATIONS"| B["OpenCitationsFetcher"]
  B -->|"uses"| C["CitationItem<br/>CountResponse"]
  B -->|"fetches via"| D["OpenCitations API"]
  E["CitationRelationsTab"] -->|"bidirectional bind"| F["entryEditorPreferences"]
  E -->|"uses"| A
  G["CrossRef"] -->|"URL encode DOI"| H["API Request"]
  I["GetCitingWorks<br/>GetCitedWorks"] -->|"unified initialization"| A

File Walkthrough

Relevant files
Enhancement (8 files)
  • OpenCitationsFetcher.java: Implement OpenCitations citation fetcher core (+160/-0)
  • CitationItem.java: Create CitationItem response model class (+59/-0)
  • CountResponse.java: Create CountResponse model for citation counts (+20/-0)
  • CitationFetcherType.java: Register OpenCitations fetcher in enum (+6/-2)
  • SemanticScholarCitationFetcher.java: Update fetcher name and nullability annotations (+6/-5)
  • CitationRelationsTab.java: Implement bidirectional preference binding for fetcher (+15/-13)
  • GetCitedWorks.java: Change default fetcher to OpenCitations (+9/-10)
  • GetCitingWorks.java: Add provider option and unify fetcher initialization (+28/-2)
Bug fix (1 file)
  • CrossRef.java: URL encode DOI identifiers in API requests (+5/-3)
Documentation (2 files)
  • CHANGELOG.md: Document OpenCitations support and DOI encoding fix (+2/-0)
  • references.md: Clarify references definition and structure (+2/-1)

koppor added 29 commits February 1, 2026 13:06
Now let me check the directory structure and create the model classes:
Updated spec.md and plan.md to use **package-private fields** instead of private fields with getters/setters:

**Changes:**
- Response model classes will use fields without access modifiers (package-private)
- No getters/setters boilerplate needed
- Gson can directly access package-private fields
- Only public helper methods (extractDoi(), toBibEntry()) where needed
Updated spec.md with complete structure examples for all response model classes:

**CitationItem**: Package-private fields (oci, citing, cited, creation, timespan, journalSelfCitation, authorSelfCitation) with public helper methods

**CitationResponse**: Package-private `List<CitationItem> data` field

**CountResponse**: Package-private `String count` field

All classes follow the package-private pattern - no getters/setters needed, Gson handles deserialization directly.
✓ **OpenCitationsFetcher Core Class implemented**

Created `/home/koppor/.zenflow/worktrees/new-task-ed21/jablib/src/main/java/org/jabref/logic/importer/fetcher/citation/opencitations/OpenCitationsFetcher.java` with:

- ✓ Constructor accepting `ImporterPreferences`
- ✓ `getName()` returning "OpenCitations"
- ✓ `getReferences()` - fetches outgoing references from DOI
- ✓ `getCitations()` - fetches incoming citations to DOI
- ✓ `getCitationCount()` - fetches citation count
- ✓ DOI extraction from OpenCitations PID strings
- ✓ BibEntry conversion using CrossRef fetcher
- ✓ Fallback to minimal BibEntry creation
- ✓ Error handling (missing DOI, network errors, JSON parsing)
- ✓ Optional API key support via ImporterPreferences

Plan updated: `plan.md:71` marked as `[x]`
Updated to use `@NullMarked` + `@Nullable` pattern
Done. `fetchBibEntryFromDoi` now handles errors with `createMinimalBibEntry`, and `createMinimalBibEntry` uses method chaining.
Done. Now handles all identifiers:
- **DOI**: Fetches full metadata via CrossRef
- **PMID**: Creates entry with `StandardField.PMID`
- **Other IDs** (omid, openalex): Creates entry using `FieldFactory.parseField()`

No identifiers are skipped.
Simplified. `FieldFactory.parseField()` handles all identifiers including `doi` and `pmid`.
Updated to handle all identifiers from space-separated lists. Each citation/reference now creates a single BibEntry with:
- Full metadata from CrossRef (if DOI present)
- All additional identifiers (PMID, OMID, OpenAlex, etc.) added as fields
Done. Identifiers without colons are now added to BibEntry with `StandardField.NOTE`.
Unified. `getReferences()` and `getCitations()` now delegate to `fetchCitationData()` with endpoint and identifier extractor as parameters.
Refactored. Now creates **one BibEntry per identifier**:
- Each identifier (DOI, PMID, OMID, etc.) → separate BibEntry
- DOI identifiers fetch full metadata via CrossRef
- Other identifiers create minimal entries with just that field
Fixed! Each `CitationItem` now creates **one BibEntry** with:
- Full metadata from CrossRef (if DOI exists) + all other identifiers added
- All identifiers as fields (if no DOI)
I'll register OpenCitations in the CitationFetcherType enum. Let me first locate and read the file.
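
The commit notes above say that `getReferences()` and `getCitations()` delegate to a shared `fetchCitationData()` that takes the endpoint and an identifier extractor as parameters. A minimal sketch of that shape, under stated assumptions: `Item` is a stand-in for `CitationItem`, the `api` function is a hypothetical replacement for the HTTP call and JSON parsing, and the endpoint names and the citing/cited mapping are assumptions rather than something the PR text confirms.

```java
import java.util.List;
import java.util.function.Function;

final class UnifiedFetchSketch {
    // Stand-in for the PR's CitationItem; only the two PID strings are modeled.
    record Item(String citing, String cited) { }

    // Shared path. 'api' is a hypothetical stand-in for "download and parse JSON".
    static List<String> fetchCitationData(String endpoint,
                                          Function<String, List<Item>> api,
                                          Function<Item, String> identifierExtractor) {
        return api.apply(endpoint).stream()
                  .map(identifierExtractor)
                  .toList();
    }

    // Outgoing references: assumed to be the "cited" side of each item.
    static List<String> getReferences(Function<String, List<Item>> api) {
        return fetchCitationData("references", api, Item::cited);
    }

    // Incoming citations: assumed to be the "citing" side of each item.
    static List<String> getCitations(Function<String, List<Item>> api) {
        return fetchCitationData("citations", api, Item::citing);
    }
}
```

Passing the extractor as a method reference keeps the two public methods down to one line each, so any later change to downloading or parsing only has to be made once.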
@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects bot commented Feb 1, 2026

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🔴
URL injection vulnerability

Description: The DOI value is directly concatenated into the API URL without proper URL encoding, which
could lead to URL injection or malformed requests if the DOI contains special characters.
OpenCitationsFetcher.java [50-50]

Referred Code
    return API_BASE_URL + "/" + endpoint + "/doi:" + doi;
}
API key exposure risk

Description: The API key is added as a header value without validation or sanitization, potentially
exposing sensitive credentials if the key contains unexpected characters or is logged.
OpenCitationsFetcher.java [76-76]

Referred Code
.ifPresent(apiKey -> urlDownload.addHeader("authorization", apiKey));
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

🔴
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Silent error handling: The fetchBibEntryFromIdentifiers method catches FetcherException but only logs a warning
without providing actionable context to the caller or handling the failure case
appropriately.

Referred Code
} catch (FetcherException e) {
    LOGGER.warn("Could not fetch BibEntry for DOI: {}", doiIdentifier.get().value(), e);
}
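
One way to avoid a silent failure here is to fall back to a minimal entry, which is what the commit notes describe for `createMinimalBibEntry`. The following is only a sketch of that idea: it models an entry as a plain `Map` instead of JabRef's `BibEntry`, and `fetchOrMinimal` and the `Callable` parameter are hypothetical names introduced for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.Callable;

final class FallbackEntrySketch {

    // Tries the (hypothetical) metadata fetcher; on failure or an empty result,
    // returns an entry that carries at least the DOI.
    static Map<String, String> fetchOrMinimal(String doi,
                                              Callable<Optional<Map<String, String>>> fetcher) {
        try {
            Optional<Map<String, String>> fetched = fetcher.call();
            if (fetched.isPresent()) {
                return fetched.get();
            }
        } catch (Exception e) {
            // Log and continue; the caller still receives a usable entry below.
            System.err.println("Could not fetch entry for DOI " + doi + ": " + e.getMessage());
        }
        return createMinimalEntry(doi);
    }

    static Map<String, String> createMinimalEntry(String doi) {
        Map<String, String> entry = new LinkedHashMap<>();
        entry.put("doi", doi);
        return entry;
    }
}
```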

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Missing input validation: The getApiUrl method constructs API URLs using DOI values without validating or sanitizing
the DOI string before concatenation, which could lead to URL injection if DOI contains
malicious characters.

Referred Code
private String getApiUrl(String endpoint, BibEntry entry) throws FetcherException {
    String doi = entry.getDOI()
                      .orElseThrow(() -> new FetcherException("Entry does not have a DOI"))
                      .asString();
    return API_BASE_URL + "/" + endpoint + "/doi:" + doi;
}

Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@koppor koppor enabled auto-merge February 1, 2026 22:02
@koppor koppor added the component: citation-relations and status: ready-for-review labels Feb 1, 2026
@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects bot commented Feb 1, 2026

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
High-level
Use a generic DOI-based fetcher

Replace the direct dependency on CrossRef fetcher in the new
OpenCitationsFetcher with a more generic DoiFetcher. This will decouple the
components and improve modularity.

Examples:

jablib/src/main/java/org/jabref/logic/importer/fetcher/citation/opencitations/OpenCitationsFetcher.java [35]
    private final CrossRef crossRefFetcher = new CrossRef();
jablib/src/main/java/org/jabref/logic/importer/fetcher/citation/opencitations/OpenCitationsFetcher.java [139]
                Optional<BibEntry> fetchedEntry = crossRefFetcher.performSearchById(doiIdentifier.get().value());

Solution Walkthrough:

Before:

// jablib/src/main/java/org/jabref/logic/importer/fetcher/citation/opencitations/OpenCitationsFetcher.java
public class OpenCitationsFetcher implements CitationFetcher {
    private final ImporterPreferences importerPreferences;
    private final CrossRef crossRefFetcher = new CrossRef();

    public OpenCitationsFetcher(ImporterPreferences importerPreferences) {
        this.importerPreferences = importerPreferences;
    }

    private BibEntry fetchBibEntryFromIdentifiers(List<CitationItem.IdentifierWithField> identifiers) {
        // ...
        if (doiIdentifier.isPresent()) {
            Optional<BibEntry> fetchedEntry = crossRefFetcher.performSearchById(doiIdentifier.get().value());
            // ...
        }
        // ...
    }
}

After:

// jablib/src/main/java/org/jabref/logic/importer/fetcher/citation/opencitations/OpenCitationsFetcher.java
public class OpenCitationsFetcher implements CitationFetcher {
    private final ImporterPreferences importerPreferences;
    private final DoiFetcher doiFetcher;

    public OpenCitationsFetcher(ImporterPreferences importerPreferences) {
        this.importerPreferences = importerPreferences;
        this.doiFetcher = new DoiFetcher(importerPreferences);
    }

    private BibEntry fetchBibEntryFromIdentifiers(List<CitationItem.IdentifierWithField> identifiers) {
        // ...
        if (doiIdentifier.isPresent()) {
            Optional<BibEntry> fetchedEntry = doiFetcher.performSearchById(doiIdentifier.get().value());
            // ...
        }
        // ...
    }
}
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies a tight coupling between the new OpenCitationsFetcher and CrossRef, and proposing to use the more generic DoiFetcher significantly improves modularity and robustness.
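
The decoupling idea can also be illustrated independently of JabRef's classes. In this sketch, `DoiLookup` and `CitationResolver` are hypothetical names; the point is only that the consumer receives its lookup through the constructor instead of instantiating a concrete fetcher itself.

```java
import java.util.Optional;

final class DoiLookupSketch {

    // Hypothetical abstraction: anything that can resolve a DOI to a title.
    interface DoiLookup {
        Optional<String> lookup(String doi);
    }

    static final class CitationResolver {
        private final DoiLookup doiLookup;

        // The dependency is injected, so tests or other backends can swap it.
        CitationResolver(DoiLookup doiLookup) {
            this.doiLookup = doiLookup;
        }

        // Returns the resolved title, or a "doi:" label when nothing was found.
        String describe(String doi) {
            return doiLookup.lookup(doi).orElse("doi:" + doi);
        }
    }
}
```

Because `DoiLookup` has a single method, a lambda is enough to stub it in a test.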

Medium
Possible issue
URL-encode DOI to prevent request errors

URL-encode the DOI in getApiUrl to prevent malformed URLs and ensure correct API
requests, consistent with other fixes in this PR.

jablib/src/main/java/org/jabref/logic/importer/fetcher/citation/opencitations/OpenCitationsFetcher.java [46-51]

 private String getApiUrl(String endpoint, BibEntry entry) throws FetcherException {
     String doi = entry.getDOI()
                       .orElseThrow(() -> new FetcherException("Entry does not have a DOI"))
                       .asString();
-    return API_BASE_URL + "/" + endpoint + "/doi:" + doi;
+    return API_BASE_URL + "/" + endpoint + "/doi:" + URLEncoder.encode(doi, StandardCharsets.UTF_8);
 }
Suggestion importance[1-10]: 8

__

Why: This is a valid and important bug fix. The PR's changelog explicitly mentions fixing an issue with DOIs containing URL-invalid characters, and this suggestion correctly points out that the new OpenCitationsFetcher fails to apply the same fix, which could lead to runtime errors.
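
To see what the suggested encoding does to a DOI, here is a small stand-alone sketch. `buildUrl` and the example base URL are hypothetical; only `URLEncoder.encode(String, Charset)` is the real JDK call. Note that `URLEncoder` applies form encoding, so a slash becomes `%2F` and a space becomes `+`. Whether the OpenCitations API accepts an encoded slash inside the DOI is an assumption that would need checking against the service.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class DoiUrlSketch {

    // Hypothetical helper: joins base URL, endpoint, and the encoded DOI.
    static String buildUrl(String baseUrl, String endpoint, String doi) {
        // Reserved characters such as '/', '<' and '>' are percent-encoded.
        String encodedDoi = URLEncoder.encode(doi, StandardCharsets.UTF_8);
        return baseUrl + "/" + endpoint + "/doi:" + encodedDoi;
    }

    public static void main(String[] args) {
        // Prints https://api.example.org/v1/citations/doi:10.1000%2Fa%3Cb%3E
        System.out.println(buildUrl("https://api.example.org/v1", "citations", "10.1000/a<b>"));
    }
}
```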

Medium
General
Avoid adding identifiers with empty values

In extractIdentifiers, add a check to prevent creating identifiers with empty
values, such as from a "doi: " string.

jablib/src/main/java/org/jabref/logic/importer/fetcher/citation/opencitations/CitationItem.java [30-50]

 List<IdentifierWithField> extractIdentifiers(@Nullable String pidString) {
     if (pidString == null || pidString.isEmpty()) {
         return List.of();
     }
 
     String[] pids = pidString.split("\\s+");
     List<IdentifierWithField> identifiers = new ArrayList<>();
     for (String pid : pids) {
         int colonIndex = pid.indexOf(':');
         if (colonIndex > 0) {
             String prefix = pid.substring(0, colonIndex);
             String value = pid.substring(colonIndex + 1);
-            Field field = FieldFactory.parseField(prefix);
-            identifiers.add(new IdentifierWithField(field, value));
+            if (!value.isEmpty()) {
+                Field field = FieldFactory.parseField(prefix);
+                identifiers.add(new IdentifierWithField(field, value));
+            }
         } else {
             identifiers.add(new IdentifierWithField(StandardField.NOTE, pid));
         }
     }
 
     return identifiers;
 }
Suggestion importance[1-10]: 5

__

Why: The suggestion correctly identifies a potential issue where an identifier with an empty value could be created. While this is a good defensive programming practice, its impact is minor as it's unlikely to receive such malformed data from the API.
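
The parsing rule, including the suggested empty-value guard, can be tried outside JabRef with a self-contained sketch. It assumes a plain `Identifier` record with a string field name in place of the PR's `IdentifierWithField` and `FieldFactory.parseField()`; those substitutions are for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

final class PidParsingSketch {

    // Stand-in for IdentifierWithField; the field is a plain string here.
    record Identifier(String field, String value) { }

    static List<Identifier> extractIdentifiers(String pidString) {
        if (pidString == null || pidString.isBlank()) {
            return List.of();
        }
        List<Identifier> identifiers = new ArrayList<>();
        for (String pid : pidString.trim().split("\\s+")) {
            int colonIndex = pid.indexOf(':');
            if (colonIndex > 0) {
                String value = pid.substring(colonIndex + 1);
                // Guard from the suggestion: skip "prefix:" tokens without a value.
                if (!value.isEmpty()) {
                    identifiers.add(new Identifier(pid.substring(0, colonIndex), value));
                }
            } else {
                // Tokens without a usable prefix are kept as a note.
                identifiers.add(new Identifier("note", pid));
            }
        }
        return identifiers;
    }
}
```

For the input `"doi:10.1000/xyz pmid:123 doi: orphan"`, the whitespace split yields the tokens `doi:10.1000/xyz`, `pmid:123`, `doi:` and `orphan`, so the result has three identifiers: the empty `doi:` token is dropped and `orphan` is stored as a note.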

Low
Learned best practice
Complete incomplete documentation sentence
Suggestion Impact: The commit addressed the incomplete sentence issue but chose a different solution. Instead of adding "or bibliography of a scholarly document" as suggested, the commit simply removed the trailing "or the" to make the sentence complete and grammatically correct.

code diff:

-It is usually described as the outgoing references to other cited works appearing in the reference list or the 
+It is usually described as the outgoing references to other cited works appearing in the reference list.

The sentence appears incomplete and ends abruptly. Complete the sentence to
clearly describe what references represent in the context of scholarly works.

docs/glossary/references.md [13]

-It is usually described as the outgoing references to other cited works appearing in the reference list or the
+It is usually described as the outgoing references to other cited works appearing in the reference list or bibliography of a scholarly document.

[Suggestion processed]

Suggestion importance[1-10]: 5

__

Why:
Relevant best practice - Fix typographical errors in documentation to maintain professionalism and clarity. Incomplete sentences or malformed text should be corrected.

Low

.install(fetcherCombo);
styleTopBarNode(fetcherCombo, 75.0);
fetcherCombo.setValue(entryEditorPreferences.getCitationFetcherType());
fetcherCombo.valueProperty().bindBidirectional(entryEditorPreferences.citationFetcherTypeProperty());
Member

why bidirectional?

Member Author

  1. Common pattern in JabRef's code base
  2. Some other UI element might also allow changing the citation fetcher, e.g., when checking citation counts at org.jabref.gui.fieldeditors.CitationCountEditor.

return 0;
}
try {
return Integer.parseInt(count);
Member

is this class here really necessary?

Member Author

DTO pattern. Better consistency in always using a DTO in this class than case-by-case

--> CitationItem is the other DTO.
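
For context, the DTO under discussion can be sketched roughly as follows. The package-private `String count` field comes from the commit notes, and the `return 0` plus `Integer.parseInt(count)` shape comes from the snippet under review; the class name, the method name, and the choice to return 0 for unparsable input are assumptions.

```java
// Sketch of a count DTO; names are illustrative, not the PR's exact code.
final class CountResponseSketch {
    // Package-private field without getter or setter. A JSON mapper such as
    // Gson is assumed to populate it via reflection.
    String count;

    int countAsInt() {
        if (count == null || count.isBlank()) {
            return 0;
        }
        try {
            return Integer.parseInt(count.trim());
        } catch (NumberFormatException e) {
            // Assumption: treat unparsable counts as "no citations known".
            return 0;
        }
    }
}
```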

protected void bindToEntry(BibEntry entry) {
citationsRelationsTabViewModel.bindToEntry(entry);

// TODO: All this should go to ViewModel
Member

Will you do this, or is it for a follow-up?

Member Author

Maybe me, maybe someone else. This TODO was already there before; I created an issue for it.

@koppor koppor added this pull request to the merge queue Feb 2, 2026
@github-actions github-actions bot added the status: to-be-merged label Feb 2, 2026
Merged via the queue into main with commit 66c86ba Feb 2, 2026
75 checks passed
@koppor koppor deleted the add-open-citations branch February 2, 2026 13:43

Labels

component: citation-relations, Review effort 3/5, status: ready-for-review, status: to-be-merged

3 participants