Merged
Commits
22 commits
9baf2d5
File Metadata Update
stevenwinship Apr 18, 2025
6d871aa
new version checking
stevenwinship Apr 18, 2025
fd800fc
fix test
stevenwinship Apr 18, 2025
dea4904
Merge branch 'develop' into 11392-edit-file-metadata-empty-values
stevenwinship Apr 21, 2025
8dcb065
fix test
stevenwinship Apr 22, 2025
2fce5fd
add to test
stevenwinship Apr 22, 2025
26f07b7
adding info for debugging jenkins test failure
stevenwinship Apr 22, 2025
10939ce
remove jenkins debug
stevenwinship Apr 23, 2025
f5ddcff
per review comments
stevenwinship Apr 24, 2025
1615fb9
per review comments
stevenwinship Apr 24, 2025
8a57fc8
refactor to use last update timestamp instead of version number
stevenwinship Apr 25, 2025
6fedf6e
comment on data/timestamp compare
stevenwinship Apr 25, 2025
eaf49a9
refactor so both datafiles and datasets validate update timestamp the…
stevenwinship Apr 29, 2025
1320d51
refactor optional qp name from sourceInternalVersionTimestamp to sour…
stevenwinship Apr 30, 2025
b61bb1c
Merge branch 'develop' into 11392-edit-file-metadata-empty-values
stevenwinship May 29, 2025
8ef92d2
Merge branch 'develop' into 11392-edit-file-metadata-empty-values
stevenwinship Jun 24, 2025
7563cf8
Suggested doc edits (#11590)
qqmyers Jun 24, 2025
7a7a84f
Merge branch 'develop' into 11392-edit-file-metadata-empty-values
stevenwinship Jun 24, 2025
0eaca6c
Merge branch 'develop' into 11392-edit-file-metadata-empty-values
stevenwinship Jun 26, 2025
0c72a89
remove unused bundle entry
stevenwinship Jul 14, 2025
509e55a
update changelog to move this PR to 6.8
stevenwinship Jul 14, 2025
c465180
Merge branch 'develop' into 11392-edit-file-metadata-empty-values
stevenwinship Jul 21, 2025
7 changes: 7 additions & 0 deletions doc/release-notes/11243-editmetadata-api-extension.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
### Edit Dataset Metadata API extension

- This endpoint now allows removing fields (by sending empty values), as long as they are not required by the dataset.
- New optional ``sourceLastUpdateTime`` query parameter, which prevents inconsistencies by detecting updates made by other users while a dataset is being edited.

NOTE: This release note was updated to conform to the refactoring of the validation as part of issue #11392
7 changes: 7 additions & 0 deletions doc/release-notes/11392-edit-file-metadata-empty-values.md
@@ -0,0 +1,7 @@
### Edit File Metadata empty values should clear data

Previously, the API POST /files/{id}/metadata would ignore fields with empty values. Now the API updates fields that have empty values, essentially clearing the data. Missing fields will still be ignored.

An optional query parameter (``sourceLastUpdateTime``) was added to ensure that a metadata update based on stale data doesn't overwrite newer changes.

See also [the guides](https://dataverse-guide--11359.org.readthedocs.build/en/11359/api/native-api.html#updating-file-metadata), #11392, and #11359.
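The semantics described above can be sketched as a small model (illustrative only, not Dataverse code): an empty value clears the stored field, while an omitted key leaves it untouched.

```java
import java.util.HashMap;
import java.util.Map;

public class EmptyValueSemantics {
    // Illustrative model (not Dataverse code) of the new update semantics:
    // an empty value clears the stored field; a missing key leaves it unchanged.
    public static Map<String, String> applyUpdate(Map<String, String> stored,
                                                  Map<String, String> update) {
        Map<String, String> result = new HashMap<>(stored);
        for (Map.Entry<String, String> e : update.entrySet()) {
            if (e.getValue().isEmpty()) {
                result.remove(e.getKey());   // "" clears the field
            } else {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;                       // keys absent from update stay as-is
    }
}
```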
5 changes: 5 additions & 0 deletions doc/sphinx-guides/source/api/changelog.rst
@@ -7,6 +7,11 @@ This API changelog is experimental and we would love feedback on its usefulness.
:local:
:depth: 1

v6.8
----
- For POST /api/files/{id}/metadata, passing an empty string ("description":"") or array ("categories":[]) will no longer be ignored. Empty fields will now clear the corresponding values in the file's metadata. To leave fields unchanged, simply do not include them in the JSON string.
- For PUT /api/datasets/{id}/editMetadata the query parameter "sourceInternalVersionNumber" has been removed and replaced with "sourceLastUpdateTime" to verify that the data being edited hasn't been modified and isn't stale.

v6.7
----

21 changes: 12 additions & 9 deletions doc/sphinx-guides/source/api/native-api.rst
@@ -2156,26 +2156,26 @@ For these edits your JSON file need only include those dataset fields which you

This endpoint also allows removing fields, as long as they are not required by the dataset. To remove a field, send an empty value (``""``) for individual fields. For multiple fields, send an empty array (``[]``). A sample JSON file for removing fields may be downloaded here: :download:`dataset-edit-metadata-delete-fields-sample.json <../_static/api/dataset-edit-metadata-delete-fields-sample.json>`

If another user updates the dataset version metadata before you send the update request, data inconsistencies may occur. To prevent this, you can use the optional ``sourceInternalVersionNumber`` query parameter. This parameter must include the internal version number corresponding to the dataset version being updated. Note that internal version numbers increase sequentially with each version update.
If another user updates the dataset version metadata before you send the update request, metadata inconsistencies may occur. To prevent this, you can use the optional ``sourceLastUpdateTime`` query parameter. This parameter must include the ``lastUpdateTime`` corresponding to the dataset version being updated. The date must be in the format ``yyyy-MM-dd'T'HH:mm:ss'Z'``.

If this parameter is provided, the update will proceed only if the internal version number remains unchanged. Otherwise, the request will fail with an error.
If this parameter is provided, the update will proceed only if the ``lastUpdateTime`` remains unchanged (meaning no one has updated the dataset metadata since you retrieved it). Otherwise, the request will fail with an error.

Example using ``sourceInternalVersionNumber``:
Example using ``sourceLastUpdateTime``:

.. code-block:: bash

export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export SERVER_URL=https://demo.dataverse.org
export PERSISTENT_IDENTIFIER=doi:10.5072/FK2/BCCP9Z
export SOURCE_INTERNAL_VERSION_NUMBER=5
export SOURCE_LAST_UPDATE_TIME=2025-04-25T13:58:28Z

curl -H "X-Dataverse-key: $API_TOKEN" -X PUT "$SERVER_URL/api/datasets/:persistentId/editMetadata?persistentId=$PERSISTENT_IDENTIFIER&replace=true&sourceInternalVersionNumber=$SOURCE_INTERNAL_VERSION_NUMBER" --upload-file dataset-update-metadata.json
curl -H "X-Dataverse-key: $API_TOKEN" -X PUT "$SERVER_URL/api/datasets/:persistentId/editMetadata?persistentId=$PERSISTENT_IDENTIFIER&replace=true&sourceLastUpdateTime=$SOURCE_LAST_UPDATE_TIME" --upload-file dataset-update-metadata.json

The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash

curl -H "X-Dataverse-key: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X PUT "https://demo.dataverse.org/api/datasets/:persistentId/editMetadata/?persistentId=doi:10.5072/FK2/BCCP9Z&replace=true&sourceInternalVersionNumber=5" --upload-file dataset-update-metadata.json
curl -H "X-Dataverse-key: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X PUT "https://demo.dataverse.org/api/datasets/:persistentId/editMetadata/?persistentId=doi:10.5072/FK2/BCCP9Z&replace=true&sourceLastUpdateTime=2025-04-25T13:58:28Z" --upload-file dataset-update-metadata.json


Delete Dataset Metadata
@@ -4730,6 +4730,8 @@ Updating File Metadata

Updates the file metadata for an existing file, where ``ID`` is the database id of the file to update or ``PERSISTENT_ID`` is the persistent id (DOI or Handle) of the file. Requires a ``jsonString`` expressing the new metadata. No metadata from the previous version of this file will be persisted, so if you want to update a specific field, first get the JSON with the above command and alter the fields you want.

An optional parameter, ``sourceLastUpdateTime`` (in the format ``yyyy-MM-dd'T'HH:mm:ss'Z'``), can be used to verify that the file metadata being edited has not been changed since you last retrieved it, thereby avoiding potential lost metadata updates. The value for ``sourceLastUpdateTime`` can be taken from ``lastUpdateTime`` in the response to a GET $SERVER_URL/api/files/$ID API call.
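As a sketch of that workflow, the ``lastUpdateTime`` value can be pulled out of the GET response before issuing the update. The helper below is hypothetical and the JSON shape is abbreviated; it is not part of the Dataverse codebase.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastUpdateTimeExtract {
    // Hypothetical helper: pull lastUpdateTime out of an (abbreviated) JSON
    // response from GET /api/files/{id}, for reuse as sourceLastUpdateTime.
    public static String extractLastUpdateTime(String json) {
        Matcher m = Pattern.compile("\"lastUpdateTime\"\\s*:\\s*\"([^\"]+)\"")
                           .matcher(json);
        return m.find() ? m.group(1) : null;
    }
}
```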

A curl example using an ``ID``

.. code-block:: bash
@@ -4750,25 +4752,26 @@ The fully expanded example above (without environment variables) looks like this
-F 'jsonData={"description":"My description bbb.","provFreeform":"Test prov freeform","categories":["Data"],"dataFileTags":["Survey"],"restrict":false}' \
"https://demo.dataverse.org/api/files/24/metadata"

A curl example using a ``PERSISTENT_ID``
A curl example using a ``PERSISTENT_ID`` and the ``sourceLastUpdateTime`` parameter:

.. code-block:: bash

export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export SERVER_URL=https://demo.dataverse.org
export PERSISTENT_ID=doi:10.5072/FK2/AAA000
export UPDATE_TIME=2025-04-25T13:58:28Z

curl -H "X-Dataverse-key:$API_TOKEN" -X POST \
-F 'jsonData={"description":"My description bbb.","provFreeform":"Test prov freeform","categories":["Data"],"dataFileTags":["Survey"],"restrict":false}' \
"$SERVER_URL/api/files/:persistentId/metadata?persistentId=$PERSISTENT_ID"
"$SERVER_URL/api/files/:persistentId/metadata?persistentId=$PERSISTENT_ID&sourceLastUpdateTime=$UPDATE_TIME"

The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash

curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X POST \
-F 'jsonData={"description":"My description bbb.","provFreeform":"Test prov freeform","categories":["Data"],"dataFileTags":["Survey"],"restrict":false}' \
"https://demo.dataverse.org/api/files/:persistentId/metadata?persistentId=doi:10.5072/FK2/AAA000"
"https://demo.dataverse.org/api/files/:persistentId/metadata?persistentId=doi:10.5072/FK2/AAA000&sourceLastUpdateTime=2025-04-25T13:58:28Z"

Note: To update the 'tabularTags' property of file metadata, use the 'dataFileTags' key when making API requests. This property is used to update the 'tabularTags' of the file metadata.

20 changes: 17 additions & 3 deletions src/main/java/edu/harvard/iq/dataverse/api/AbstractApiBean.java
@@ -28,6 +28,7 @@
import edu.harvard.iq.dataverse.search.savedsearch.SavedSearchServiceBean;
import edu.harvard.iq.dataverse.settings.SettingsServiceBean;
import edu.harvard.iq.dataverse.util.BundleUtil;
import edu.harvard.iq.dataverse.util.DateUtil;
import edu.harvard.iq.dataverse.util.FileUtil;
import edu.harvard.iq.dataverse.util.SystemConfig;
import edu.harvard.iq.dataverse.util.json.JsonParser;
@@ -52,6 +53,7 @@

import java.io.InputStream;
import java.net.URI;
import java.time.Instant;
import java.util.*;
import java.util.concurrent.Callable;
import java.util.logging.Level;
@@ -447,10 +449,22 @@ public Command<DatasetVersion> handleLatestPublished() {
return dsv;
}

protected void validateInternalVersionNumberIsNotOutdated(Dataset dataset, int internalVersion) throws WrappedResponse {
if (dataset.getLatestVersion().getVersion() > internalVersion) {
protected void validateInternalTimestampIsNotOutdated(DvObject dvObject, String sourceLastUpdateTime) throws WrappedResponse {
Date date = sourceLastUpdateTime != null ? DateUtil.parseDate(sourceLastUpdateTime, "yyyy-MM-dd'T'HH:mm:ss'Z'") : null;
if (date == null) {
throw new WrappedResponse(
badRequest(BundleUtil.getStringFromBundle("abstractApiBean.error.datasetInternalVersionNumberIsOutdated", Collections.singletonList(Integer.toString(internalVersion))))
badRequest(BundleUtil.getStringFromBundle("jsonparser.error.parsing.date", Collections.singletonList(sourceLastUpdateTime)))
);
}
Instant instant = date.toInstant();
Instant updateTimestamp =
(dvObject instanceof DataFile) ? ((DataFile) dvObject).getFileMetadata().getDatasetVersion().getLastUpdateTime().toInstant() :
(dvObject instanceof Dataset) ? ((Dataset) dvObject).getLatestVersion().getLastUpdateTime().toInstant() :
instant;
// Compare at second granularity, since the JSON output returns dates in this format only to the second
if (updateTimestamp.getEpochSecond() != instant.getEpochSecond()) {
throw new WrappedResponse(
badRequest(BundleUtil.getStringFromBundle("abstractApiBean.error.internalVersionTimestampIsOutdated", Collections.singletonList(sourceLastUpdateTime)))
);
}
}
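A standalone sketch of the second-granularity comparison above (names are illustrative; the real implementation parses with Dataverse's DateUtil and reads the stored timestamp from the DvObject):

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class TimestampCheck {
    // Illustrative standalone version of the validation check; not the actual
    // Dataverse implementation.
    public static boolean isOutdated(String sourceLastUpdateTime, Instant storedLastUpdate) {
        Instant supplied = Instant.from(
                DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss'Z'")
                        .withZone(ZoneOffset.UTC)
                        .parse(sourceLastUpdateTime));
        // Compare at second granularity: the JSON API emits timestamps only
        // to the second, while the database value may carry milliseconds.
        return supplied.getEpochSecond() != storedLastUpdate.getEpochSecond();
    }
}
```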
8 changes: 5 additions & 3 deletions src/main/java/edu/harvard/iq/dataverse/api/Datasets.java
@@ -1118,12 +1118,14 @@ private String getCompoundDisplayValue (DatasetFieldCompoundValue dscv){
@PUT
@AuthRequired
@Path("{id}/editMetadata")
public Response editVersionMetadata(@Context ContainerRequestContext crc, String jsonBody, @PathParam("id") String id, @QueryParam("replace") boolean replaceData, @QueryParam("sourceInternalVersionNumber") Integer sourceInternalVersionNumber) {
public Response editVersionMetadata(@Context ContainerRequestContext crc, String jsonBody, @PathParam("id") String id,
@QueryParam("replace") boolean replaceData,
@QueryParam("sourceLastUpdateTime") String sourceLastUpdateTime) {
try {
Dataset dataset = findDatasetOrDie(id);

if (sourceInternalVersionNumber != null) {
validateInternalVersionNumberIsNotOutdated(dataset, sourceInternalVersionNumber);
if (sourceLastUpdateTime != null) {
validateInternalTimestampIsNotOutdated(dataset, sourceLastUpdateTime);
}

JsonObject json = JsonUtil.getJsonObject(jsonBody);
12 changes: 9 additions & 3 deletions src/main/java/edu/harvard/iq/dataverse/api/Files.java
@@ -410,8 +410,7 @@ public Response deleteFileInDataset(@Context ContainerRequestContext crc, @PathP
@AuthRequired
@Path("{id}/metadata")
public Response updateFileMetadata(@Context ContainerRequestContext crc, @FormDataParam("jsonData") String jsonData,
@PathParam("id") String fileIdOrPersistentId
) throws DataFileTagException, CommandException {
@PathParam("id") String fileIdOrPersistentId, @QueryParam("sourceLastUpdateTime") String sourceLastUpdateTime) {

FileMetadata upFmd = null;

@@ -429,6 +428,13 @@ public Response updateFileMetadata(@Context ContainerRequestContext crc, @FormDa
return error(BAD_REQUEST, "Error attempting get the requested data file.");
}

if (sourceLastUpdateTime != null) {
try {
validateInternalTimestampIsNotOutdated(df, sourceLastUpdateTime);
} catch (WrappedResponse wr) {
return wr.getResponse();
}
}

//You shouldn't be trying to edit a datafile that has been replaced
List<Long> result = em.createNamedQuery("DataFile.findDataFileThatReplacedId", Long.class)
@@ -519,7 +525,7 @@ public Response updateFileMetadata(@Context ContainerRequestContext crc, @FormDa
return error(Response.Status.INTERNAL_SERVER_ERROR, "Error adding metadata to DataFile: " + e);
}

} catch (WrappedResponse wr) {
} catch (CommandException | WrappedResponse ex) {
return error(BAD_REQUEST, "An error has occurred attempting to update the requested DataFile, likely related to permissions.");
}

Original file line number Diff line number Diff line change
@@ -194,46 +194,28 @@ public boolean getTabIngest() {
return this.tabIngest;
}

public boolean hasCategories(){
if ((categories == null)||(this.categories.isEmpty())){
return false;
}
return true;
public boolean hasCategories() {
return categories != null;
}

public boolean hasFileDataTags(){
if ((dataFileTags == null)||(this.dataFileTags.isEmpty())){
return false;
}
return true;
public boolean hasFileDataTags() {
return dataFileTags != null;
}

public boolean hasDescription(){
if ((description == null)||(this.description.isEmpty())){
return false;
}
return true;
return description != null;
}

public boolean hasDirectoryLabel(){
if ((directoryLabel == null)||(this.directoryLabel.isEmpty())){
return false;
}
return true;
public boolean hasDirectoryLabel() {
return directoryLabel != null;
}

public boolean hasLabel(){
if ((label == null)||(this.label.isEmpty())){
return false;
}
return true;
public boolean hasLabel() {
return label != null;
}

public boolean hasProvFreeform(){
if ((provFreeForm == null)||(this.provFreeForm.isEmpty())){
return false;
}
return true;
public boolean hasProvFreeform() {
return provFreeForm != null;
}

public boolean hasStorageIdentifier() {
@@ -245,15 +227,15 @@ public String getStorageIdentifier() {
}

public boolean hasFileName() {
return ((fileName!=null)&&(!fileName.isEmpty()));
return fileName != null;
}

public String getFileName() {
return fileName;
}

public boolean hasMimetype() {
return ((mimeType!=null)&&(!mimeType.isEmpty()));
return mimeType != null;
}

public String getMimeType() {
@@ -266,7 +248,7 @@ public void setCheckSum(String checkSum, ChecksumType type) {
}

public boolean hasCheckSum() {
return ((checkSumValue!=null)&&(!checkSumValue.isEmpty()));
return checkSumValue != null;
}

public String getCheckSum() {
@@ -294,15 +276,10 @@ public void setFileSize(long fileSize) {
* @param tags
*/
public void setCategories(List<String> newCategories) {

if (newCategories != null) {
newCategories = Util.removeDuplicatesNullsEmptyStrings(newCategories);
if (newCategories.isEmpty()) {
newCategories = null;
}
this.categories = newCategories;
}

this.categories = newCategories;
}

/**
@@ -495,27 +472,20 @@ private void addFileDataTags(List<String> potentialTags) throws DataFileTagExcep
}

potentialTags = Util.removeDuplicatesNullsEmptyStrings(potentialTags);

if (potentialTags.isEmpty()){
return;
}


// Make a new list
this.dataFileTags = new ArrayList<>();
List<String> newList = new ArrayList<>();

// Add valid potential tags to the list
for (String tagToCheck : potentialTags){
if (DataFileTag.isDataFileTag(tagToCheck)){
this.dataFileTags.add(tagToCheck);
newList.add(tagToCheck);
}else{
String errMsg = BundleUtil.getStringFromBundle("file.addreplace.error.invalid_datafile_tag");
throw new DataFileTagException(errMsg + " [" + tagToCheck + "]. Please use one of the following: " + DataFileTag.getListofLabelsAsString());
}
}
// Shouldn't happen....
if (dataFileTags.isEmpty()){
dataFileTags = null;
}
this.dataFileTags = newList;
}

private void msg(String s){
Original file line number Diff line number Diff line change
@@ -905,7 +905,8 @@ public static JsonObjectBuilder json(DataFile df, FileMetadata fileMetadata, boo
.add("tabularData", df.isTabularData())
.add("tabularTags", getTabularFileTags(df))
.add("creationDate", df.getCreateDateFormattedYYYYMMDD())
.add("publicationDate", df.getPublicationDateFormattedYYYYMMDD());
.add("publicationDate", df.getPublicationDateFormattedYYYYMMDD())
.add("lastUpdateTime", format(fileMetadata.getDatasetVersion().getLastUpdateTime()));
Dataset dfOwner = df.getOwner();
if (dfOwner != null) {
builder.add("fileAccessRequest", dfOwner.isFileAccessRequest());
2 changes: 1 addition & 1 deletion src/main/java/propertyFiles/Bundle.properties
@@ -3220,7 +3220,7 @@ datasetFieldValidator.error.emptyRequiredSingleValueForField=Empty required valu
updateDatasetFieldsCommand.api.processDatasetUpdate.parseError=Error parsing dataset update: {0}

#AbstractApiBean.java
abstractApiBean.error.datasetInternalVersionNumberIsOutdated=Dataset internal version number {0} is outdated
abstractApiBean.error.internalVersionTimestampIsOutdated=Internal version timestamp {0} is outdated

#RoleAssigneeServiceBean.java
roleAssigneeServiceBean.error.dataverseRequestCannotBeNull=DataverseRequest cannot be null.