Accessing CMR Data in R #482

Merged
smk0033 merged 3 commits into develop from access-cmr-r
Apr 1, 2025

Conversation

smk0033 (Contributor) commented Mar 24, 2025

@review-notebook-app

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

@@ -0,0 +1,603 @@
{
wildintellect (Collaborator), Mar 24, 2025

Should this print show some granule names, an example granule, or the granule count?
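One possible shape for that print, as a rough sketch only: it assumes the search results have already been parsed into a list of entries that each carry a `title` field. The `granules` object, its placeholder titles, and the `summarize_granules()` helper are hypothetical and not taken from the notebook.

```r
# Hypothetical helper: print the granule count and the first few granule names.
summarize_granules <- function(granules, n = 3) {
  # Pull the title out of each entry (assumes every entry has a `title`).
  titles <- vapply(granules, function(g) g$title, character(1))
  cat("Granules found:", length(titles), "\n")
  if (length(titles) == 0) {
    return(invisible(titles))
  }
  cat("First", min(n, length(titles)), "granule name(s):\n")
  cat(paste0("  ", head(titles, n)), sep = "\n")
  invisible(titles)
}

# Placeholder entries standing in for a parsed search response.
granules <- list(
  list(title = "example_granule_001.h5"),
  list(title = "example_granule_002.h5")
)
titles <- summarize_granules(granules)
```

Showing both the count and a couple of names would let readers confirm the search worked without printing the full response.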


@@ -0,0 +1,603 @@
{
wildintellect (Collaborator), Mar 24, 2025

We should reference the page about DAAC AWS access so people know how to find the credential URLs for other DAACs.


smk0033 (Contributor Author)

I may be wrong or overlooking it, but I don't think we have a document that lists the other DAAC credential URLs or explains how to find them. The "Direct DAAC S3 Bucket Access (BETA)" notebook has S3 links to specific granules but not the general DAAC credential URL, though we do link to the cloud collections in Earthdata Search. "Accessing Data Providers by NASA's DAACs" doesn't have S3

Collaborator

Openscapes has a list of these, but I think MAAP only supports 5 right now. If those aren't listed in the docs, they should be.

hrodmn (Contributor) commented Mar 25, 2025

Thanks for putting this together @HarshiniGirish and @smk0033! I left a few suggestions, but I'm left wondering whether it is possible to read HDF files directly from S3 within R. Users really should not need to download files manually to read the bytes they want to use! If it is not possible, it might be worth exploring a reticulate solution that uses a Python library capable of reading from S3. I tried a few things, such as setting the AWS environment variables like this:

Sys.setenv(
  "AWS_ACCESS_KEY_ID" = credentials$accessKeyId,
  "AWS_SECRET_ACCESS_KEY" = credentials$secretAccessKey,
  "AWS_SESSION_TOKEN" = credentials$sessionToken,
  "AWS_REGION" = "us-west-2"
)

but I was not able to read the HDF file with /vsis3/{bucket}/{path}.hdf.
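For anyone who wants to reproduce the attempt, here is a minimal sketch of the setup described above. It assumes temporary credentials in a list with the same field names as the snippet above; the helpers `s3_to_vsis3()` and `set_aws_env()`, the bucket, the key, and the credential values are all hypothetical placeholders. The final read is left commented out because, as noted, it did not succeed for the HDF file in this test.

```r
# Hypothetical helper: turn an s3:// URL into a GDAL /vsis3/ path.
s3_to_vsis3 <- function(s3_url) {
  stopifnot(is.character(s3_url), length(s3_url) == 1, startsWith(s3_url, "s3://"))
  sub("^s3://", "/vsis3/", s3_url)
}

# Hypothetical helper: export temporary credentials as environment variables
# so that GDAL-based readers can pick them up.
set_aws_env <- function(credentials, region = "us-west-2") {
  Sys.setenv(
    AWS_ACCESS_KEY_ID = credentials$accessKeyId,
    AWS_SECRET_ACCESS_KEY = credentials$secretAccessKey,
    AWS_SESSION_TOKEN = credentials$sessionToken,
    AWS_REGION = region
  )
  invisible(TRUE)
}

# Placeholder values only: not real credentials or a real granule.
credentials <- list(
  accessKeyId = "EXAMPLE_KEY",
  secretAccessKey = "EXAMPLE_SECRET",
  sessionToken = "EXAMPLE_TOKEN"
)
set_aws_env(credentials)

vsi_path <- s3_to_vsis3("s3://example-bucket/path/to/granule.hdf")
print(vsi_path)

# The read itself depends on the GDAL build and its HDF driver:
# r <- terra::rast(vsi_path)
```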

wildintellect (Collaborator)

@hrodmn I think reading directly from S3 is unexplored territory in R. There are two issues: authenticating, and then whether the HDF reader library supports network-based reads. This works for terra and sf because GDAL/OGR handles it all. For some NetCDF files it would likely work with terra and stars too; I'm not sure whether the ncdf4 library can do either of the required steps.

@smk0033 smk0033 requested review from hrodmn and wildintellect March 27, 2025 20:38
hrodmn (Contributor) left a comment

Thanks for putting this together. Adding the collection search details in "Searching CMR from R" would be great.

smk0033 (Contributor Author) commented Apr 1, 2025

Thanks Henry! Will do. I'll be sure to get a PR out for that this week.

@smk0033 smk0033 merged commit dd226b3 into develop Apr 1, 2025
@smk0033 smk0033 deleted the access-cmr-r branch April 1, 2025 14:31

4 participants