
VarSome API Python client rewrite #7

Merged
RikMaxSpeed merged 34 commits into master from vcf-annotator on Jan 15, 2018

Conversation

@ckopanos
Member

Includes new vcf annotator object
Several code fixes
Unit tests
Updated documentation

Comment thread README.md Outdated
varsome_api_run.py -g hg19 -q 'chr7-140453136-A-T' -p add-all-data=1

Without an API key you will not be able to perform batch requests as well.
The script should complete without errors and display aprox 6,700 lines of data from dann, dbnsfp, ensemble_transcripts, gerp, gnomad_exomes, gnomad_exomes_coverage, icgc_somatic, ncbi_clinvar2, pub_med_articles, refseq_transcripts, sanger_cosmic_public, uniprot_variants, wustl_civic etc.

It's ensembl_transcripts

Member Author

ok done

Comment thread scripts/varsome_api_run.py Outdated


def annotate_variant(argv):
    parser = argparse.ArgumentParser(description='Sample Variant API calls')

description='Sample Varsome API calls'

Member Author

done

Comment thread varsome_api/models/elements/ncbi.py Outdated
rsid = fields.ListField(items_types=(int,), help_text="RS ID")


class ClinVarDisease(models.Base):

ClinVar and ClinVarDisease are not needed anymore, we only have them for old annotations for the portal, right?

Member Author

Yes correct.

Comment thread varsome_api/models/variant.py Outdated
sanger_cosmic = fields.ListField(required=False, items_types=(Cosmic,), help_text="Sanger Cosmic")
sanger_cosmic_public = fields.ListField(required=False, items_types=(CosmicPublic,), help_text="Cosmic")
sanger_cosmic_licensed = fields.ListField(required=False, items_types=(CosmicLicensed,), help_text="Cosmic")
ncbi_clinvar = fields.ListField(required=False, items_types=(ClinVar,), help_text="ClinVar")

ncbi_clinvar is not in the API anymore

Member Author

Well, the schema still returns them, and so do the available GET parameters. I will remove them from the JSON models. Maybe we should also patch the API and remove the parameters, but we should keep the serializers for the portal.

Member Author

Ok done, removed clinvar (1). I will prepare a patch for the api as well

Comment thread README.md Outdated

varsome_api_annotate_vcf.py -g hg19 -k api_key -i input.vcf -o annotated_vcf.vcf -p add-all-data=1

Notice however that not all available annotations will be present in the annotated_vcf file. Only a subset
Collaborator

This is confusing. Why won't all annotations be present? Which ones will be there and which ones will be missing? How does the script "decide"? Is it random? Is it a specific set of annotations, the same each time?

Comment thread README.md Outdated
print(e) # 404 (invalid reference genome)

### Example Usage
To view available request parameters (used in the params method parameter) refer to an example at [api.varsome.com](https://api.varsome.com)
Collaborator

If "params" is code, it should be formatted as code (`params`). I can't tell here whether the method is called "parameter" or "params".

Comment thread README.md Outdated
api = VarSomeAPIClient(api_key)
# fetch information about a variant into a dictionary
result = api.lookup('chr7-140453136-A-T', params={'add-source-databases': 'gnomad-exomes,refseq-transcripts'}, ref_genome='hg19')
annotated_variant = AnnotatedVariant(**result)
Collaborator

The ** here makes all subsequent lines bold. It looks like the code is not actually in a code block, so it is parsed as markdown.

Comment thread scripts/varsome_api_run.py Outdated
@@ -0,0 +1,112 @@
#!/usr/bin/env python
Collaborator

Shouldn't this (and all other shebang lines) be #!/usr/bin/env python3? env python will likely still resolve to python2 on many systems.

Member Author

Well, if you are on a system where python2 is still the default (which is none except Macs) and you don't create a python3 virtual env, then yes, it will fail.
So leave it on python3. The setup.py script has a requirement for python3, but I guess it's OK to change that to python3.
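
The failure mode discussed in this thread (a script started by a Python 2 interpreter) can also be made explicit with a guard at the top of each script. The following is a minimal sketch, not code from this PR: the helper name `require_python` is hypothetical, and the 3.5 minimum is an assumption based on the version requirement raised elsewhere in this review.

```python
import sys

# Assumed minimum version; adjust to whatever the package really supports.
MIN_VERSION = (3, 5)


def require_python(version_info=None, minimum=MIN_VERSION):
    """Raise a clear error when the interpreter is older than `minimum`.

    `version_info` defaults to the running interpreter; passing a tuple
    makes the check easy to exercise in tests.
    """
    current = tuple((version_info or sys.version_info)[:2])
    if current < minimum:
        # %-formatting is used on purpose so this file also parses on Python 2.
        raise RuntimeError(
            "Python %d.%d or newer is required, found %d.%d"
            % (minimum + current)
        )
    return current
```

Calling `require_python()` before any other work would turn an obscure syntax or import error into a readable message, whichever interpreter `env` resolves to.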

Comment thread README.md Outdated

for a list of available options
Please visit the [api documentation](https://api.varsome.com) to find out how to use the api and
what values does the api provide as a response to lookup requests
Collaborator

Change

what values does the api provide as a response to lookup requests

to

what values the api provides as a response to lookup requests

Comment thread README.md Outdated
This client is still in beta but it is a good for playing around with the API.

### Python versions
Python version 3 is supported.
Contributor

Requires at least Python 3.5, you can download the latest version from www.python.org

Contributor

i.e., Python 3.4 doesn't work.

Comment thread README.md Outdated
within your code, or do
There are several ways to create a virtual environment, but you can refer to [pip installation](https://pip.pypa.io/en/stable/installing/) and
[virtualenv installation](https://virtualenv.pypa.io/en/stable/installation/) to first install these 2 tools if you don't
have them already installed via a package manager (Linux) or HomeBrew (MacOS), etc.
Contributor

Remember to use "sudo -H" when installing on Mac.

Comment thread README.md Outdated
varsome_api_run.py -g hg19 -k api_key -i variants.txt -o annotations.txt -p add-all-data=1

The command above will read variants from `variants.txt` and dump the annotations to `annotations.txt`.
If you don't use the `-k` parameter, the script will do as many requests as there are variants in `variants.txt`,
Contributor

For any substantial number of variants you will need to register for an API key. You can try the software without the -k apiKey parameter but you will quickly bump into safeguard limits...

or something like that.
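
The difference the README text describes (one request per variant without a key, batch requests with one) can be sketched as follows. This is an illustration only: `plan_requests` is a hypothetical helper, not part of the client, and the default `batch_size` is an arbitrary assumption since the real limit is not stated in this thread.

```python
def plan_requests(variants, api_key=None, batch_size=1000):
    """Group variants into the payloads a client would send.

    Without an API key every variant becomes its own request;
    with a key, variants are grouped into batches of `batch_size`.
    """
    if api_key is None:
        # No key: batch requests are unavailable, so one lookup per variant.
        return [[variant] for variant in variants]
    # Key present: slice the list into consecutive batches.
    return [
        variants[start:start + batch_size]
        for start in range(0, len(variants), batch_size)
    ]
```

The number of payloads returned is the number of HTTP requests the run would need, which is why a large `variants.txt` quickly runs into the limits for unauthenticated use.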

Comment thread README.md
varsome_api_annotate_vcf.py -g hg19 -k api_key -i input.vcf -o annotated_vcf.vcf -p add-all-data=1

Notice, however, that not all available annotations will be present in the `annotated_vcf.vcf` file. Only a subset
of the returned annotations will be available when running this script. See the "Using the client in your code"
Contributor

What's missing? Why?

Member Author

I have added only a subset of annotations to the output.vcf. There are too many possible annotations to add them all, and adding header info to the VCF file for all of them might take several more days of tests and changes. Besides, you don't know what a user might need from all of these annotations. This is why the user is referred to the "Using the client in your code" section, which explains how to override the VCFAnnotator class with the annotations they want.
After all, the intended use is for the end user to take the Python class objects and develop an app of their own, not to use the run.py and annotate_vcf scripts. Even if they do want to use the scripts, they can copy the code and use their own overridden VCF annotator class.
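
The override pattern described in the comment above can be sketched in isolation. Everything here is a stand-in written for illustration: `BaseAnnotator`, `annotate_record`, and the annotation keys are hypothetical and do not reproduce the real `VCFAnnotator` interface, which is documented in the README section the comment refers to.

```python
class BaseAnnotator:
    """Stand-in for the client's VCF annotator (NOT the real class)."""

    def annotate_record(self, record, variant_result):
        # Hook intended to be overridden; the default adds nothing.
        return record

    def annotate(self, records, results):
        # Pair each record with its API result and apply the hook to a copy.
        return [
            self.annotate_record(dict(record), result)
            for record, result in zip(records, results)
        ]


class GnomadOnlyAnnotator(BaseAnnotator):
    """Hypothetical subclass that copies only the annotations it wants."""

    wanted = ("gnomad_exomes_af", "gnomad_exomes_an")  # illustrative keys

    def annotate_record(self, record, variant_result):
        for key in self.wanted:
            if key in variant_result:
                record[key] = variant_result[key]
        return record
```

The design point is that the base class owns the iteration while the subclass decides which fields reach the output, so each user writes only the subset they need.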

Comment thread README.md Outdated

You will also not be able to perform batch requests without an API key.

To obtain an API key please [contact us](mailto:support@saphetor.com)
Contributor

Move this to the top of this section.

Comment thread README.md Outdated

### How to get an API key

You are generally not required to have an API key to use the API but, without one, the number of requests
Contributor

Change this to: "You can use the API without an API key, but performance and the number of queries will be limited in order to safeguard our platform's reliability."

Comment thread setup.py
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.2',
'Programming Language :: Python :: 3.3',
'Programming Language :: Python :: 3.4',
Contributor

You probably need to remove up to 3.4 included.
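
One way to keep the classifiers and the installable versions in agreement is to derive both from a single minimum version. This is a sketch under assumptions, not the PR's setup.py: `build_setup_kwargs` and the list of candidate versions are hypothetical, while `python_requires` is the standard setuptools keyword that makes pip refuse to install on older interpreters.

```python
# Assumed minimum version, following the "at least Python 3.5" review comment.
MIN_PYTHON = (3, 5)

# Illustrative list of minor versions a classifier might be emitted for.
CANDIDATE_VERSIONS = [(3, 2), (3, 3), (3, 4), (3, 5), (3, 6)]


def build_setup_kwargs(minimum=MIN_PYTHON):
    """Return the version-related keyword arguments for setuptools.setup()."""
    classifiers = ["Programming Language :: Python :: 3"]
    classifiers += [
        "Programming Language :: Python :: %d.%d" % version
        for version in CANDIDATE_VERSIONS
        if version >= minimum  # drops 3.2 through 3.4 for a 3.5 minimum
    ]
    return {
        "python_requires": ">=%d.%d" % minimum,
        "classifiers": classifiers,
    }
```

The result would be passed along with the other arguments, for example `setup(name=..., **build_setup_kwargs())`.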

Comment thread scripts/varsome_api_annotate_vcf.py Outdated
@@ -0,0 +1,41 @@
#!/usr/bin/env python
Contributor

Many (all?) of these files are missing (c) 2018 Saphetor at the top.
We need to decide which license we are using for these.
And then include the appropriate header in all files.
Preferably something that means that:
1- We own the original code.
2- If somebody modifies the code they still have to include the copyright message.
3- Would be nice if they send bug-fixes back to us.
4- They can sell the code to 3rd party users, but they must credit Saphetor.

Member Author

Not sure if we need all this info in all files. The repository contains a license file which can cover all the code


@RikMaxSpeed changed the title from "Rewritted python client" to "VarSome API Python client rewrite" on Jan 15, 2018
@RikMaxSpeed merged commit ff57060 into master on Jan 15, 2018
@ckopanos deleted the vcf-annotator branch on January 17, 2018, 11:44

4 participants