greninger-lab/vadr-models-mev

Measles virus (MeV) genome annotation


How to annotate MeV genomes with VADR

Steps for using VADR for MeV annotation:

  1. Download and install the latest version of VADR, following the instructions on this page. Alternatively, you can use the StaPH-B VADR 1.6.3-hav-flu2 Docker image created by Curtis Kapsak (image names: staphb/vadr:1.6.3-hav-flu2 and staphb/vadr:latest), available on Docker Hub and Quay. A brief README for the Docker image is here.

  2. Clone the latest MeV VADR model library from this repository:

git clone git@github.com:greninger-lab/vadr-models-mev.git

    Note the path to the directory name created plus the "mev" subdirectory (e.g. /path/to/vadr-models-mev/mev) as <mev-models-dir-path> for step 4.

  3. Remove terminal ambiguous nucleotides from your input fasta sequence file using the fasta-trim-terminal-ambigs.pl script in $VADRSCRIPTSDIR/miniscripts/.

    To remove terminal ambiguous nucleotides from your sequence file <input-fasta-file>, discard sequences shorter than 50 nt or longer than 18,000 nt, and write the result to a new trimmed file <trimmed-fasta-file>, execute:

$VADRSCRIPTSDIR/miniscripts/fasta-trim-terminal-ambigs.pl --minlen 50 --maxlen 18000 <input-fasta-file> > <trimmed-fasta-file>
  4. Run the v-annotate.pl program on the trimmed fasta file of MeV sequences using the recommended command below.
v-annotate.pl -r --indefclass 0.01 --mkey mev --mdir <mev-models-dir-path> <fasta-file-to-annotate> <output-directory-to-create>
  5. Running the v-annotate.pl command in step 4 generates a number of files in <output-directory-to-create>. Among them are 5-column tab-delimited feature table files with the suffix .tbl: one for passing sequences (XXXXX.vadr.pass.tbl) and one for failing sequences (XXXXX.vadr.fail.tbl). The format of the .tbl files is described here: https://www.ncbi.nlm.nih.gov/genbank/feature_table/

    More information about understanding failures and error alerts can be found in the VADR documentation here: https://github.com/ncbi/vadr/blob/master/documentation/annotate.md
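Steps 3 to 5 can be wrapped in a small shell helper. The following is only a sketch, assuming VADR is installed, $VADRSCRIPTSDIR is set, and v-annotate.pl is on your PATH. The function names and the name chosen for the trimmed file are illustrative and not part of VADR. The sequence count relies on the ">Feature" header line that starts each sequence's entry in the feature table format.

```shell
# Steps 3 and 4: trim terminal ambiguous nucleotides, then annotate.
# Arguments: <input-fasta-file> <mev-models-dir-path> <output-directory-to-create>
run_mev_vadr() {
  local input_fasta="$1" models_dir="$2" out_dir="$3"
  # Hypothetical naming choice for the trimmed file (not a VADR convention).
  local trimmed_fasta="${input_fasta%.*}.trimmed.fasta"

  # Step 3: same command and options as shown above.
  "$VADRSCRIPTSDIR/miniscripts/fasta-trim-terminal-ambigs.pl" \
    --minlen 50 --maxlen 18000 "$input_fasta" > "$trimmed_fasta" || return 1

  # Step 4: recommended v-annotate.pl command.
  v-annotate.pl -r --indefclass 0.01 --mkey mev --mdir "$models_dir" \
    "$trimmed_fasta" "$out_dir"
}

# Count the sequences in one .tbl file by counting ">Feature" header lines.
count_tbl_seqs() {
  # grep -c prints 0 but exits non-zero when nothing matches; tolerate that.
  grep -c '^>Feature ' "$1" || true
}

# Step 5: print "<file name><TAB><sequence count>" for each pass/fail table
# found in the output directory.
summarize_tbl() {
  local out_dir="$1" f
  for f in "$out_dir"/*.vadr.pass.tbl "$out_dir"/*.vadr.fail.tbl; do
    [ -e "$f" ] || continue   # skip patterns that matched no file
    printf '%s\t%s\n' "$(basename "$f")" "$(count_tbl_seqs "$f")"
  done
}
```

For example, `run_mev_vadr input.fasta /path/to/vadr-models-mev/mev mev-out && summarize_tbl mev-out` would annotate input.fasta and then report how many sequences landed in the pass and fail tables (input.fasta and mev-out are placeholder names).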


MeV VADR model library

  • The VADR model library for MeV annotation was developed using one representative sequence from each of the 15 genotypes currently represented among the complete MeV genomes published in GenBank: AF266288(A), OR290098(B3), LC655230(C1), MG912589(C2), AB481088(D3), OP236009(D4), JN635406(D5), DQ227319(D6), JN635410(D7), PP101943(D8), KY969476(D9), MN017369(D11), GMG912591(G2), KC164758(G3), MZ442631(H1).
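To check which models a cloned copy of the library defines, you can list the MODEL lines of its model info file. This is a sketch under assumptions this page does not confirm: that the file is named mev.minfo inside the "mev" subdirectory (following the --mkey mev option), and that each representative sequence has its own model. The function name is illustrative.

```shell
# Print the name of every model defined in a VADR .minfo file.
# Model definitions are the lines whose first field is "MODEL";
# the second field is the model name.
list_minfo_models() {
  awk '$1 == "MODEL" { print $2 }' "$1"
}
```

Running `list_minfo_models /path/to/vadr-models-mev/mev/mev.minfo | wc -l` would then give the number of models, which can be compared against the 15 genotypes listed above.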

Reference

  • The recommended citation for using VADR is: Alejandro A Schäffer, Eneida L Hatcher, Linda Yankie, Lara Shonkwiler, J Rodney Brister, Ilene Karsch-Mizrachi, Eric P Nawrocki; VADR: validation and annotation of virus sequence submissions to GenBank. BMC Bioinformatics 21, 211 (2020). https://doi.org/10.1186/s12859-020-3537-3

  • This page was adapted for MeV from Mpox virus annotation

