Please note: To read these instructions in full, please go to https://github.com/IQSS/dataverse/releases/tag/v6.10 rather than the list of releases, which will cut them off.
This release brings new features, enhancements, and bug fixes to Dataverse. Thank you to all of the community members who contributed code, suggestions, bug reports, and other assistance across the project!
Highlights for Dataverse 6.10 include:
- Optionally require acknowledgment of a disclaimer when publishing
- Optionally require embargo reason
- Harvesting improvements
- Croissant support now built in
- Archiving, OAI-ORE, and BagIt export improvements
- Support for REFI-QDA Codebook and Project files
- Review datasets
- New and improved APIs
- Bug fixes
When users click "Publish" on a dataset they have always seen a popup displaying various information to read before clicking "Continue" to proceed with publication.
Now you can optionally require users to check a box in this popup to acknowledge a disclaimer that you specify through a new setting called :PublishDatasetDisclaimerText.
For backward compatibility, APIs will continue to publish without the acknowledgement for now. An API endpoint was added for anyone to retrieve the disclaimer text anonymously.
See the guides and #12051.
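As a sketch of how this might be configured (assuming a local installation and the standard database settings API; the server URL and disclaimer wording below are placeholders), the setting could be written like this:

```shell
# Sketch: set the disclaimer text via the admin settings API.
# SERVER_URL is a placeholder; adjust for your installation.
export SERVER_URL=http://localhost:8080
curl -X PUT -d "I confirm that this dataset meets our deposit policy." \
  "$SERVER_URL/api/admin/settings/:PublishDatasetDisclaimerText"
```

See the guides for the endpoint that returns the disclaimer text to clients.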
It is now possible to configure Dataverse to require an embargo reason when a user creates an embargo on one or more files. By default, the embargo reason is optional. dataverse.feature.require-embargo-reason can be set to true to enable this feature.
In addition, with this release, if an embargo reason is supplied, it must not be blank.
See the guides, #8692, #11956, #12067.
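Feature flags such as this are JVM/MicroProfile config settings; a minimal sketch for enabling it on Payara (installation path as assumed in the upgrade instructions later in these notes):

```shell
# Sketch: enable the require-embargo-reason feature flag (a restart may be needed).
sudo -u dataverse /usr/local/payara6/bin/asadmin create-jvm-options \
  "-Ddataverse.feature.require-embargo-reason=true"
```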
A setting has been added for configuring sleep intervals in between OAI-PMH calls on a per-server basis. This can help when some of the servers you want to harvest from have rate limiting policies. You can set a default sleep time and custom sleep times for servers that need more.
Additionally, this release fixes a problem with harvesting from DataCite OAI-PMH where initial, long-running harvests were failing on sets with large numbers of records.
See :HarvestingClientCallRateLimit in the guides, #11473, and #11486.
Croissant is a metadata export format for machine learning datasets that (until this release) was optional and implemented as an external exporter. The code has been merged into the main Dataverse code base, which means the Croissant format is automatically available in your installation of Dataverse, alongside older formats like Dublin Core and DDI. If you were using the external Croissant exporter, the merged code is equivalent to version 0.1.6. Croissant bugs and feature requests should now be filed against the main Dataverse repo (https://github.com/IQSS/dataverse) and the old repo (https://github.com/gdcc/exporter-croissant) should be considered retired.
As described in the Discoverability section of the Admin Guide, Croissant is inserted into the "head" of the HTML of dataset landing pages, as requested by the Google Dataset Search team so that their tool can filter by datasets that support Croissant. In previous versions of Dataverse, when Croissant was optional and hadn't been enabled, we used the older "Schema.org JSON-LD" format in the "head". If you'd like to keep this behavior, you can use the feature flag dataverse.legacy.schemaorg-in-html-head.
Both Croissant and Schema.org JSON-LD formats can become quite large when the dataset has many files or (for Croissant) when the files have many variables. As of this release, the "head" of the HTML contains a "slim" version of Croissant that doesn't contain information about files or variables. The original, full version of Croissant is still available via the "Export Metadata" dropdown. Both "croissant" and "croissantSlim" formats are available via API.
See also #11254, #12123, #12130, and #12191.
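Assuming the standard metadata export API, the two Croissant variants can be retrieved along these lines (the server URL and persistent identifier below are placeholders):

```shell
export SERVER_URL=https://demo.dataverse.org   # placeholder
export PERSISTENT_ID=doi:10.5072/FK2/EXAMPLE   # placeholder PID
# Full Croissant export, including file and variable metadata
curl "$SERVER_URL/api/datasets/export?exporter=croissant&persistentId=$PERSISTENT_ID"
# Slim variant, as embedded in the HTML "head" of the dataset page
curl "$SERVER_URL/api/datasets/export?exporter=croissantSlim&persistentId=$PERSISTENT_ID"
```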
This release includes multiple updates to the OAI-ORE metadata export and the process of creating archival bags, improving performance, fixing bugs, and adding significant new functionality. See #12144, #12129, #12122, #12104, #12103, #12101, and #12213.
- Multiple performance and scaling improvements have been made for creating archival bags for large datasets, including:
- The duration of archiving tasks triggered from the version table or API is no longer limited by the transaction time limit.
- Temporary storage space requirements have increased by 1/`:BagGeneratorThreads` of the zipped bag size. (Often this is by half because the default value for `:BagGeneratorThreads` is 2.) This is a consequence of changes to avoid timeout errors on larger files/datasets.
- The size of individual data files and the total dataset size that will be included in an archival bag can now be limited. Admins can choose whether files above these limits are transferred along with, but outside, the zipped bag (creating a complete archival copy) or are just referenced (using the concept of a "holey" bag and just listing the oversized files and the Dataverse URLs from which they can be retrieved in a `fetch.txt` file). In the holey bag case, an active service on the archiving platform must retrieve the oversized files (using appropriate credentials as needed) to make a complete copy.
- Superusers can now see a pending status in the dataset version table while archiving is active.
- Workflows are now triggered outside the transactions related to publication, assuring that workflow locks and status updates are always recorded.
- Potential conflicts between archiving/workflows, indexing, and metadata exports after publication have been resolved, avoiding cases where the status/last update times for these actions were not recorded.
- A bug has been fixed where superusers would incorrectly see the "Submit" button to launch archiving from the dataset page version table.
- The local, S3, and Google archivers have been updated to support deleting existing archival files for a version to allow re-creating the bag for a given version.
- For archivers that support file deletion, it is now possible to recreate an archival bag after "Update Current Version" has been used (replacing the original bag). By default, Dataverse will mark the current version's archive as out-of-date, but will not automatically re-archive it.
- A new "obsolete" status has been added to indicate when an archival bag exists for a version but it was created prior to an "Update Current Version" change.
- Improvements have been made to file retrieval for bagging, including retries on errors and when download requests are being throttled.
- A bug causing `:BagGeneratorThreads` to be ignored has been fixed, and the default has been reduced to 2.
- Retrieval of files for inclusion in an archival bag is no longer counted as a download.
- It is now possible to require that all previous versions have been successfully archived before archiving of a newly published version can succeed. This is intended to support use cases where de-duplication of files between dataset versions will be done and is a step towards supporting the Oxford Common File Layout (OCFL).
- The pending status has changed to use the same JSON format as other statuses.
- The export now uses URIs for checksum algorithms, conforming with JSON-LD requirements.
- A bug causing failures with deaccessioned versions has been fixed. This occurred when the deaccession note ("Deaccession Reason" in the UI) was null, which is permissible via the API.
- The https://schema.org/additionalType has been updated to "Dataverse OREMap Format v1.0.2" to reflect format changes.
- The `bag-info.txt` file now correctly includes information for dataset contacts, fixing a bug where nothing was included when multiple contacts were defined. (Multiple contacts were always included in the OAI-ORE file in the bag; only the bag-info file was affected.)
- Values used in the `bag-info.txt` file that may be multi-line (i.e. with embedded CR or LF characters) are now properly indented and wrapped per the BagIt specification (`Internal-Sender-Identifier`, `External-Description`, `Source-Organization`, `Organization-Address`).
- The dataset name is no longer used as a subdirectory within the `data/` directory, reducing issues with unzipping long paths on some filesystems.
- For dataset versions with no files, the empty `manifest-<alg>.txt` file will now use the algorithm from the `:FileFixityChecksumAlgorithm` setting instead of defaulting to MD5.
- A new key, `Dataverse-Bag-Version`, has been added to `bag-info.txt` with the value "1.0" to allow for tracking changes to Dataverse's archival bag generation over time.
- When using the "holey" bag option discussed above, the required `fetch.txt` file will be included.
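The size limits and holey-bag behavior correspond to the new JVM settings listed later in these notes. A sketch of enabling them on Payara (the setting names come from these notes, but the value formats shown are assumptions; check the guides for the exact syntax):

```shell
# Sketch only: value formats are assumed, not confirmed.
sudo -u dataverse /usr/local/payara6/bin/asadmin create-jvm-options \
  "-Ddataverse.bagit.zip.max-file-size=10737418240"
sudo -u dataverse /usr/local/payara6/bin/asadmin create-jvm-options \
  "-Ddataverse.bagit.zip.holey=true"
```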
.qdc and .qdpx files are now detected as REFI-QDA standard Codebook and Project files, respectively, for qualitative data analysis, which allows them to be used with the new REFI QDA Previewers. See gdcc/dataverse-previewers#137 for screenshots.
To enable existing .qdc and .qdpx files to be used with the previewers, their content type (MIME type) will need to be redetected. See #12163.
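A sketch of redetection for a single file, assuming the existing file redetect API (the server URL, superuser API token, and file id below are placeholders):

```shell
export SERVER_URL=http://localhost:8080                # placeholder
export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  # placeholder superuser token
export FILE_ID=42                                      # placeholder database id
# Preview first with dryRun=true, then rerun with dryRun=false to apply
curl -H "X-Dataverse-key:$API_TOKEN" -X POST \
  "$SERVER_URL/api/files/$FILE_ID/redetect?dryRun=true"
```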
Dataverse now supports review datasets, a type of dataset that can be used to review resources such as other datasets in the Dataverse installation itself or various resources in external data repositories. APIs and a new "review" metadata block (with an "Item Reviewed" field) are in place, but the UI for this feature will only be available in a future version of the new React-based Dataverse Frontend (see #876). See the guides, #11747, #12015, #11887, #12115, and #11753. This feature is experimental.
These are features that weren't already mentioned under "highlights" above.
- A new "DATASETMOVED" notification type was added for when datasets are moved from one collection (dataverse) to another. This requires the :SendNotificationOnDatasetMove setting to be enabled. See #11670 and #11805.
- Performance has been improved for the Solr search index. Changes in v6.9 that significantly improved re-indexing performance and lowered memory use (in situations such as when a user's role on the root collection was changed) also slowed reindexing of individual datasets after editing and publication. This release restores/improves the individual dataset reindexing performance while retaining the benefits of the earlier update. This release also avoids creating unused Solr entries for files in drafts of new versions of published datasets (decreasing the Solr database size and thereby improving performance). See #12082, #12093, and #12094.
- In prior versions of Dataverse, configuring a proxy to forward to Dataverse over an HTTP connection could result in failure of signed URLs (e.g. for external tools). This version of Dataverse supports having a proxy send an `X-Forwarded-Proto` header set to HTTPS to avoid this issue. See the guides and #11787.
- Citation Style Language (CSL) output now includes "type:software" or "type:review" when those dataset types are used. See the guides and #11753.
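For example, assuming an nginx reverse proxy terminating TLS in front of Payara, the header might be set like this (a sketch, not a complete configuration):

```nginx
location / {
    proxy_pass http://localhost:8080;
    # Tell Dataverse the original request was HTTPS so signed URLs work correctly
    proxy_set_header X-Forwarded-Proto https;
    proxy_set_header Host $host;
}
```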
rdm-integration is a Dataverse external tool for synchronizing files from various source repositories into Dataverse, with support for background processing, DDI-CDI metadata generation, and high-performance Globus transfers. You can find it on the Integrations section of the Dataverse Admin Guide.
Release 2.0.1 brings several new Globus capabilities:
- Guest downloads — public datasets can be downloaded via Globus without a Dataverse account
- Preview URL support — reviewers can download draft dataset files via Globus using general preview URLs
- Scoped institutional login — `session_required_single_domain` support enables access to institutional Globus endpoints (e.g., HPC clusters); scopes are automatically removed for guest and preview access
- Real-time transfer progress — polling-based progress monitoring with percentage display and status updates (ACTIVE/SUCCEEDED/FAILED)
- Download filtering — only datasets where the user can download all files are shown, avoiding failed transfers for restricted or embargoed content
- Hierarchical file tree — recursive folder selection and color-coded file status
For full details, see the README and GLOBUS_INTEGRATION.md.
See the note above about support for REFI-QDA files. Screenshots of the previewer can be found at gdcc/dataverse-previewers#137.
- The names of host collections were visible when using anonymized preview URLs. See #11085 and #12111.
- As of Dataverse 6.8, the "replace file" feature was not working. See #11976, #12107, and #12157.
- A dataset or collection (dataverse) was still visible in browse/search results immediately after deleting it if you didn't refresh the page. See #11206 and #12072.
- The text in "assign role" notifications now only shows the role that was just assigned. Previously, the notification showed all the roles associated with the dataset. See #11773 and #11915.
- Handles from hdl.handle.net with URLs of `/citation` instead of `/dataset.xhtml` were not properly redirecting. This fix adds a lookup for the alternate PID so the `/citation` endpoint will redirect to `/dataset.xhtml`. See #11943.
- Dataverse no longer sends duplicate COAR Notify Relationship Announcement Workflow messages when new dataset versions are published (and the relationship metadata has not been changed). See #11983.
- A 500 error that occurred when deleting a dataset type by name has been fixed. See #11833 and #11753.
- The Dataset Type facet worked in JSF but not in the SPA; this has been fixed. See #11758 and #11753.
- PIDs could not be generated when the `identifier-generation-style` was set to `storedProcGenerated`. See #12126 and #12127.
- It came to our attention that the Dataverse Uploader GitHub Action was failing with an "unhashable type" error. This has been fixed in a new release, v1.7.
- The MyData API now supports the `metadata_fields`, `sort`, `order`, `show_collections`, and `fq` parameters, which enhances its functionality and brings it in line with the Search API. See the guides and #12009.
- This release removes an undocumented restriction on the API calls to get, set, and delete archival status. They did not work on deaccessioned dataset versions and now do. See #12065.
- Dataset templates can be listed and deleted for a given collection (dataverse). See the guides, #11918, and #11969. The default template can also be set. See the guides, #11914 and #11989.
- Because some clients (such as the new frontend) need to retrieve contact email addresses along with the rest of the dataset metadata, a new query parameter called `ignoreSettingExcludeEmailFromExport` has been introduced. It requires the "EditDataset" permission. See the guides, #11714, and #11819.
- The Change Collection Attributes API now supports `allowedDatasetTypes`. See the guides, #12115, and #11753.
- The API returning information about datasets (`/api/datasets/{id}`) now includes a `locks` field containing a list of the types of all existing locks, e.g. `"locks": ["InReview"]`. See #12008.
- The Access APIs have been cleaned up to localize getting the user from the session for JSF backward compatibility. This bug requires a frontend fix to send the Bearer Token in the API call. See #11740 and #11844.
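For example (server URL and dataset id are placeholders), the locks can be read from the dataset JSON:

```shell
export SERVER_URL=https://demo.dataverse.org   # placeholder
export DATASET_ID=42                           # placeholder database id
# The "data" object in the response now includes a "locks" array, e.g. "locks": ["InReview"]
curl "$SERVER_URL/api/datasets/$DATASET_ID"
```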
This release contains important security updates. If you are not receiving security notices, please sign up by following the steps in the guides.
Generally speaking, see the API Changelog for a list of backward-incompatible API changes.
The filename of the archival zipped bag produced by the LocalSubmitToArchiveCommand archiver now has a "." character before the "v" (for version number) to mirror the filename used by other archivers. For example, the filename will look like `doi-10-5072-fk2-fosg5q.v1.0.zip` rather than `doi-10-5072-fk2-fosg5qv1.0.zip`.
In previous releases of Dataverse, as soon as additional dataset types were added (such as "software", "workflow", etc.), they could be used by all users when creating datasets (via API only). As of this release, on a per-collection basis, superusers must allow these dataset types to be used. See #12115 and #11753.
We mentioned this in the Dataverse 6.6, 6.8, 6.9 release notes, but as a reminder, according to https://www.postgresql.org/support/versioning/ PostgreSQL 13 reached EOL on 13 November 2025. As stated in the Installation Guide, we recommend running PostgreSQL 16 since it is the version we test with in our continuous integration (since February 2025). The Dataverse 5.4 release notes explained the upgrade process from 9 to 13 (e.g. pg_dumpall, etc.) and the steps will be similar. If you have any problems, please feel free to reach out (see "getting help" in these release notes).
This release fixes a bug where the value of the dataverse.auth.oidc.enabled setting (available when provisioning an authentication provider via JVM options) was not being propagated to the current Dataverse user interface (JSF, where enabled=false providers are not displayed for login/registration) or represented in the GET /api/admin/authenticationProviders API call.
A new JVM setting (dataverse.auth.oidc.hidden-jsf) was added to hide an enabled OIDC Provider from the JSF UI.
For Dataverse instances deploying both the current JSF UI and the new SPA UI, this fix allows the OIDC Keycloak provider configured for the SPA to be hidden in the JSF UI. This is useful in cases where it would duplicate other configured providers.
Note: The API to create a new Auth Provider can only be used to create a provider for both JSF and SPA. Use JVM / MicroProfile config setting to create SPA-only providers.
See dataverse.auth.oidc.hidden-jsf in the guides, #11606, and #11922.
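A sketch of enabling the new setting on Payara (a boolean value is assumed here; see the guides for specifics):

```shell
# Sketch: hide an enabled OIDC provider from the JSF UI (a restart may be needed).
sudo -u dataverse /usr/local/payara6/bin/asadmin create-jvm-options \
  "-Ddataverse.auth.oidc.hidden-jsf=true"
```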
- dataverse.auth.oidc.hidden-jsf
- dataverse.bagit.archive-on-version-update
- dataverse.bagit.zip.holey
- dataverse.bagit.zip.max-data-size
- dataverse.bagit.zip.max-file-size
- dataverse.feature.require-embargo-reason
- dataverse.legacy.schemaorg-in-html-head
- :ArchiveOnlyIfEarlierVersionsAreArchived
- :HarvestingClientCallRateLimit
- :PublishDatasetDisclaimerText
- :SendNotificationOnDatasetMove
For the complete list of code changes in this release, see the 6.10 milestone in GitHub.
For help with upgrading, installing, or general questions please see getting help in the Installation Guide.
If this is a new installation, please follow our Installation Guide. Please don't be shy about asking for help if you need it!
Once you are in production, we would be delighted to update our map of Dataverse installations around the world to include yours! Please create an issue or email us at support@dataverse.org to join the club!
You are also very welcome to join the Global Dataverse Community Consortium (GDCC).
Upgrading requires a maintenance window and downtime. Please plan accordingly, create backups of your database, etc.
Note: These instructions assume that you are upgrading from the immediate previous version. That is to say, you've already upgraded through all the 6.x releases and are now running Dataverse 6.9. See tags on GitHub for a list of versions. If you are running an earlier version, the only supported way to upgrade is to progress through the upgrades to all the releases in between before attempting the upgrade to this version.
If you are running Payara as a non-root user (and you should be!), remember not to execute the commands below as root. By default, Payara runs as the dataverse user. In the commands below, we use sudo to run the commands as a non-root user.
Also, we assume that Payara 6 is installed in /usr/local/payara6. If not, adjust as needed.
- Undeploy Dataverse, using the unprivileged service account ("dataverse", by default).

  ```shell
  sudo -u dataverse /usr/local/payara6/bin/asadmin list-applications
  sudo -u dataverse /usr/local/payara6/bin/asadmin undeploy dataverse-6.9
  ```

- Deploy the Dataverse 6.10 war file.

  ```shell
  wget https://github.com/IQSS/dataverse/releases/download/v6.10/dataverse-6.10.war
  sudo -u dataverse /usr/local/payara6/bin/asadmin deploy dataverse-6.10.war
  ```