greninger-lab/vadr-models-muv

Mumps virus (MuV) genome annotation


How to annotate MuV genomes with VADR

Steps for using VADR for MuV annotation:

  1. Download and install the latest version of VADR, following the instructions on this page. Alternatively, use the StaPH-B VADR 1.6.3-hav-flu2 Docker image created by Curtis Kapsak (image names: staphb/vadr:1.6.3-hav-flu2 and staphb/vadr:latest), available on Docker Hub and Quay. A brief README for the Docker image is here.

  2. Clone the latest MuV VADR model library from this repository:

    git clone git@github.com:greninger-lab/vadr-models-muv.git

    Note the path to the directory name created plus the "muv" subdirectory (e.g. /path/to/vadr-models-muv/muv) as <muv-models-dir-path> for step 4.

  3. Remove terminal ambiguous nucleotides from your input fasta sequence file using the fasta-trim-terminal-ambigs.pl script in $VADRSCRIPTSDIR/miniscripts/.

    To trim terminal ambiguous nucleotides from your sequence file <input-fasta-file>, discard sequences shorter than 50 nt or longer than 18,000 nt after trimming, and write the result to a new file <trimmed-fasta-file>, execute:

    $VADRSCRIPTSDIR/miniscripts/fasta-trim-terminal-ambigs.pl --minlen 50 --maxlen 18000 <input-fasta-file> > <trimmed-fasta-file>

  4. Run the v-annotate.pl program on the trimmed fasta file of MuV sequences from step 3 (passed as <fasta-file-to-annotate>) using the recommended command below.

    v-annotate.pl -r --indefclass 0.025 --mkey muv --mdir <muv-models-dir-path> <fasta-file-to-annotate> <output-directory-to-create>

  5. After the v-annotate.pl command in step 4 finishes, <output-directory-to-create> will contain a number of files. Among them are 5-column tab-delimited feature table files ending in .tbl: one for passing sequences (XXXXX.vadr.pass.tbl) and one for failing sequences (XXXXX.vadr.fail.tbl). The format of the .tbl files is described here: https://www.ncbi.nlm.nih.gov/genbank/feature_table/

    More information about understanding failures and error alerts can be found in the VADR documentation here: https://github.com/ncbi/vadr/blob/master/documentation/annotate.md
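Steps 3 to 5 can be chained in a small shell wrapper. The sketch below is illustrative only: the function names (muv_annotate, count_tbl_seqs, muv_summarize) and the ".trimmed.fa" file naming are hypothetical and not part of VADR. It assumes that $VADRSCRIPTSDIR is set, that v-annotate.pl is on your PATH, and that each sequence in a .tbl file begins with a ">Feature" header line, as in the GenBank feature table format.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and file-naming conventions are assumptions.

# Steps 3 and 4: trim one fasta file, then annotate it.
# usage: muv_annotate <input-fasta-file> <muv-models-dir-path> <output-directory-to-create>
muv_annotate() {
  local in_fa="$1" models_dir="$2" out_dir="$3"
  local trimmed="${in_fa%.*}.trimmed.fa"   # e.g. seqs.fa -> seqs.trimmed.fa

  # Step 3: trim terminal ambiguous nucleotides, drop very short/long sequences
  "$VADRSCRIPTSDIR/miniscripts/fasta-trim-terminal-ambigs.pl" \
    --minlen 50 --maxlen 18000 "$in_fa" > "$trimmed" || return 1

  # Step 4: annotate with the recommended options from this README
  v-annotate.pl -r --indefclass 0.025 --mkey muv --mdir "$models_dir" \
    "$trimmed" "$out_dir"
}

# Count ">Feature" header lines (one per sequence) across the given .tbl files.
count_tbl_seqs() {
  local n=0 f
  for f in "$@"; do
    if [ -f "$f" ]; then
      n=$(( n + $(grep -c '^>Feature ' "$f") ))
    fi
  done
  echo "$n"
}

# Step 5: report how many sequences landed in the pass and fail tables.
# usage: muv_summarize <output-directory>
muv_summarize() {
  local out_dir="$1"
  echo "pass: $(count_tbl_seqs "$out_dir"/*.vadr.pass.tbl)"
  echo "fail: $(count_tbl_seqs "$out_dir"/*.vadr.fail.tbl)"
}
```

For example, after sourcing the file, `muv_annotate seqs.fa /path/to/vadr-models-muv/muv muv-out && muv_summarize muv-out` would run both steps and print the two counts. Sequences counted under "fail" still need to be checked against the alerts described in the VADR documentation.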


MuV VADR model library

  • The VADR model library for MuV annotation was built from one representative sequence for each of the 12 genotypes currently represented among the complete MuV genomes published in GenBank: MH426702 (A), MK279727 (B), KX953297 (C), MW819866 (D), DQ649478 (F), OR822028 (G), KY969483 (H), AY309060 (I), KF878079 (J), MT238690 (K), KF878081 (L), AY685921 (N).

Reference

  • The recommended citation for using VADR is: Alejandro A Schäffer, Eneida L Hatcher, Linda Yankie, Lara Shonkwiler, J Rodney Brister, Ilene Karsch-Mizrachi, Eric P Nawrocki; VADR: validation and annotation of virus sequence submissions to GenBank. BMC Bioinformatics 21, 211 (2020). https://doi.org/10.1186/s12859-020-3537-3

  • This page was adapted for MuV from Mpox virus annotation

