Description
Data submitted to NDA through the standard data submission endpoint (NOT BSMN-S3) are distributed across five buckets: gpop, NDAR_Central_1, NDAR_Central_2, NDAR_Central_3, and NDAR_Central_4. Requests to the submission API (https://nda.nih.gov/api/submission/docs/swagger-ui.html) return these locations for any files related to a submission.
The following Python functions take the URL returned by the submission service and return a dictionary with the bucket and key for objects in the NDAR_Central_* and nda-bsmn locations. The dictionary can be passed as arguments to boto functions for working with the S3 API.
    def ndar_central_location(self, file):
        # Drop the scheme, then split the remainder into bucket and key.
        bucket, key = (file['file_remote_path']
                       .split('//')[1]
                       .split('/', 1))
        return {'Bucket': bucket, 'Key': key}

    def nda_bsmn_location(self, file):
        # Drop the scheme and bucket, then prefix the DataSubmissions path
        # with the submission id and the whole key with the collection id.
        original_key = (file['file_remote_path']
                        .split('//')[1]
                        .split('/', 1)[1]
                        .replace('ndar_data/DataSubmissions',
                                 'submission_{}/ndar_data/DataSubmissions'.format(self.submission_id)))
        nda_bsmn_key = 'collection_{}/{}'.format(self.collection_id, original_key)
        return {'Bucket': 'nda-bsmn', 'Key': nda_bsmn_key}

These functions are included in an update to the NDASubmissionFiles class. The file argument each accepts is an item from the list returned by /api/submission/submission_id/files. That response is passed as an initialization argument to the NDASubmissionFiles class.

    # Excerpt: relies on json and requests being imported at module level.
    files = []
    request = requests.get(
        self.submission_api + '/{}/files'.format(s),
        headers=self.headers,
        auth=self.auth
    )
    try:
        files = json.loads(request.text)
        submission_files.append({'files': NDASubmissionFiles(files, collection_id, s),
                                 'collection_id': collection_id,
                                 'submission_id': s})
    except json.decoder.JSONDecodeError:
        print('Error occurred retrieving files from submission {}'.format(s))
        print('Request returned {}'.format(request.text))
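
To make the key manipulation concrete, here is a minimal standalone sketch of the two location functions applied to one sample path. It is an illustration under assumptions, not the class as merged: the functions are rewritten as plain functions, the `s3://` scheme, the sample path, and the collection and submission ids are hypothetical, and the `.replace(...)` call is an assumed reading of how the key is rewritten.

```python
def ndar_central_location(file_remote_path):
    """Split '<scheme>://<bucket>/<key>' into a Bucket/Key dictionary."""
    bucket, key = file_remote_path.split('//')[1].split('/', 1)
    return {'Bucket': bucket, 'Key': key}


def nda_bsmn_location(file_remote_path, collection_id, submission_id):
    """Map the same path onto the nda-bsmn bucket layout."""
    original_key = (file_remote_path
                    .split('//')[1]
                    .split('/', 1)[1]
                    # Assumption: the key is rewritten with str.replace.
                    .replace('ndar_data/DataSubmissions',
                             'submission_{}/ndar_data/DataSubmissions'.format(submission_id)))
    return {'Bucket': 'nda-bsmn',
            'Key': 'collection_{}/{}'.format(collection_id, original_key)}


# Hypothetical values, chosen only to show the string manipulation.
sample_path = 's3://NDAR_Central_1/ndar_data/DataSubmissions/example/file.csv'

central = ndar_central_location(sample_path)
bsmn = nda_bsmn_location(sample_path, collection_id=1234, submission_id=5678)

print(central)
print(bsmn)
```

Because the keys are named `Bucket` and `Key`, either dictionary can be unpacked into a call that takes those keyword arguments, for example `head_object(**central)` on a boto3 S3 client, if boto3 is the boto version in use.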