What steps does it take to reproduce the issue?
See the dataset file page here.
- This page says the "Original File MD5" begins with `9e9be...`. But this is not true.
- The "Stata Binary (Original File Format)" file has an md5 hash beginning with `20ddc4...`.
- The "Tab-Delimited" file has an md5 hash beginning with `1f75c2...`.
- However, the "Tab-Delimited" file without the header row (`cat file.tab | tail --lines=+2 | md5sum`) has an md5 hash beginning with `9e9be...`.
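For anyone who wants to repeat the comparison without a shell, here is a minimal Python sketch of the same check. The function names and the `file.tab` path are only illustrative, and it assumes the header row ends at the first `\n` byte, which is what `tail --lines=+2` assumes as well.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the hex MD5 digest of the given bytes."""
    return hashlib.md5(data).hexdigest()


def md5_without_header(data: bytes) -> str:
    """Hash the data with the first line removed.

    Mirrors `tail --lines=+2 | md5sum`: everything up to and including
    the first newline is dropped before hashing.
    """
    _, _, rest = data.partition(b"\n")
    return md5_hex(rest)


def compare_hashes(path: str) -> dict:
    """Hash a downloaded file both as-is and without its header row."""
    with open(path, "rb") as f:
        data = f.read()
    return {
        "full": md5_hex(data),
        "without_header": md5_without_header(data),
    }
```

Calling `compare_hashes("file.tab")` on the downloaded tab-delimited file and comparing both values against the hash shown on the file page should show which of the two the page is reporting.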
This is a bug because it will lead users to believe that they downloaded a corrupted file.
There are two parts to this problem: the incorrect label, and the header row being cut off before hashing. The label should be "Tab-Delimited File MD5", not "Original File MD5." Cutting off the header row is more interesting. Why does Dataverse send one file to the user, but hash a transformed version of that file?
- When does this issue occur?
Unknown
- Which page(s) does it occur on?
The Stata files of this dataset that I checked by hand.
Which version of Dataverse are you using?
The one hosted at https://dataverse.harvard.edu/, 5.13 build 1244-79d6e57