Reading Files in Private S3 Buckets #898

@vb-stephane

I'm having issues using tabix to read from private AWS S3 buckets. tabix was compiled with the --enable-libcurl option and has no problem reading from public buckets, for example:

tabix s3://1000genomes/release/20130502/ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz 1:100000-150000

However, when I attempt to read from a private bucket, which requires AWS credentials, I get the following error:

tabix -H s3://xx/NA18961.minbranch1.raw.snps.indels.g.vcf.gz
[E::hts_open_format] Failed to open file s3://xx/NA18961.minbranch1.raw.snps.indels.g.vcf.gz
Could not read s3://xx/NA18961.minbranch1.raw.snps.indels.g.vcf.gz

This happens even though the AWS CLI is configured with the correct credentials (saved under ~/.aws/) and the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are set. Listing the file with the AWS CLI works without any problem:

aws s3 ls s3://xx/NA18961.minbranch1.raw.snps.indels.g.vcf.gz
2019-07-18 17:11:25 829053904 NA18961.minbranch1.raw.snps.indels.g.vcf.gz
2019-07-18 17:13:26 3300333 NA18961.minbranch1.raw.snps.indels.g.vcf.gz.tbi

I'm at a loss as to what I should do. As far as I understand, tabix (and samtools / bcftools) should be able to read from private buckets as long as access credentials are supplied through either the configuration files or the environment variables.
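
One thing that may be worth ruling out when relying on the environment variables: a variable that is set in the current shell but not exported is not passed on to child processes such as tabix. The following is a minimal sketch of such a check, assuming a POSIX shell; `check_exported` is a hypothetical helper written for illustration, not part of htslib or the AWS CLI, and it prints only the variable names, never their values.

```shell
# Hypothetical helper: report whether each named variable is exported,
# i.e. visible to child processes. `env` runs as a child process, so it
# only lists exported variables.
check_exported() {
    for name in "$@"; do
        if env | grep "^${name}=" >/dev/null; then
            echo "${name}: exported"
        else
            echo "${name}: not exported"
        fi
    done
}

check_exported AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
```

If either variable is reported as not exported, running `export AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY` in the same shell before calling tabix would make them visible. This only checks visibility; it says nothing about whether the credentials themselves are accepted.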
