Hepatitis A virus (HAV) genome annotation

How to annotate HAV genomes with VADR

HAV VADR model library

Additional VADR documentation

Reference

How to annotate HAV genomes with VADR

Steps for using VADR for HAV annotation:

1. Download and install the latest version of VADR, following the VADR installation instructions (https://github.com/ncbi/vadr/blob/master/documentation/install.md). Alternatively, you can use the StaPH-B VADR 1.6.4 docker image created by Curtis Kapsak (docker image names: staphb/vadr:1.6.4 and staphb/vadr:latest), available on Docker Hub and Quay. A brief README for the docker image is also available.

2. Clone the latest HAV VADR model library from this repository:

git clone git@github.com:greninger-lab/vadr-models-hav.git

Note the path to the cloned directory plus its "hav" subdirectory (e.g. /path/to/vadr-models-hav/hav); this is <hav-models-dir-path> in step 4.

3. Remove terminal ambiguous nucleotides from your input fasta sequence file using the fasta-trim-terminal-ambigs.pl script in $VADRSCRIPTSDIR/miniscripts/.

To trim terminal ambiguous nucleotides from your sequence file <input-fasta-file>, discard sequences shorter than 50 or longer than 8000 nucleotides, and write the result to a new trimmed file <trimmed-fasta-file>, execute:

$VADRSCRIPTSDIR/miniscripts/fasta-trim-terminal-ambigs.pl --minlen 50 --maxlen 8000 <input-fasta-file> > <trimmed-fasta-file>
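
Before trimming, it can help to preview which sequences fall outside the 50 to 8000 nucleotide window. The sketch below is a hypothetical helper (fasta_length_report is not part of VADR) that reports untrimmed sequence lengths with awk. Because it does not strip terminal ambiguous nucleotides, its keep/drop labels only approximate what fasta-trim-terminal-ambigs.pl would decide; the script itself remains authoritative.

```shell
# Hypothetical helper (not part of VADR): print each sequence name, its
# untrimmed length, and whether that length lies within [min, max].
fasta_length_report() {
    # $1 = FASTA file, $2 = minimum length, $3 = maximum length
    awk -v min="$2" -v max="$3" '
        # Emit one tab-separated line for the sequence read so far.
        function report() {
            if (name != "") {
                status = (len >= min && len <= max) ? "keep" : "drop"
                printf "%s\t%d\t%s\n", name, len, status
            }
        }
        # Header line: flush the previous sequence, start a new one.
        /^>/ { report(); name = substr($1, 2); len = 0; next }
        # Sequence line: ignore whitespace and add to the running length.
        { gsub(/[ \t\r]/, ""); len += length($0) }
        END { report() }
    ' "$1"
}
```

For example, fasta_length_report <input-fasta-file> 50 8000 would print one line per sequence with its name, length, and a keep or drop label.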

4. Run the v-annotate.pl program on the trimmed fasta file of HAV sequences (the <trimmed-fasta-file> from step 3, passed as <fasta-file-to-annotate>) using the recommended command below.

v-annotate.pl -r --mkey hav --mdir <hav-models-dir-path> <fasta-file-to-annotate> <output-directory-to-create>

5. After running the v-annotate.pl command in step 4, a number of files will have been generated in <output-directory-to-create>. Among them are 5-column tab-delimited feature table files with the suffix .tbl: one for passing sequences (XXXXX.vadr.pass.tbl) and one for failing sequences (XXXXX.vadr.fail.tbl). The format of the .tbl files is described here: https://www.ncbi.nlm.nih.gov/genbank/feature_table/
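
To see quickly which sequences passed or failed, the sequence names can be pulled from the .tbl files. This is a minimal sketch, assuming each sequence's table begins with a line of the form ">Feature <sequence-name>" as in the GenBank feature table format; list_tbl_sequences is a hypothetical helper name, not a VADR command.

```shell
# Hypothetical helper (not part of VADR): list the sequence names found in a
# feature table file, one per line, by reading the ">Feature" header lines.
list_tbl_sequences() {
    # $1 = path to a .tbl file
    awk '/^>Feature/ { print $2 }' "$1"
}
```

For example, piping list_tbl_sequences <output-directory-to-create>/XXXXX.vadr.fail.tbl into wc -l would give the number of failing sequences.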

More information about understanding failures and error alerts can be found in the VADR documentation here: https://github.com/ncbi/vadr/blob/master/documentation/annotate.md

HAV VADR model library

The VADR model library for HAV annotation was developed using one representative sequence for each of the 6 genotypes currently available among the complete HAV genomes published in GenBank: OK625565 (IA), KP879217 (IB), AY644676 (IIA), AY644670 (IIB), DQ991029 (IIIA), and AB258387 (IIIB).
Some of the model genomes have been modified slightly at the 3' end so that their polyA tails have a consistent length, which facilitates consistent behavior across the models. These are:
- OK625565 (IA) removed 7 As from 3' end
- KP879217 (IB) added 18 As to 3' end
- DQ991029 (IIIA) removed 10 As from 3' end

Additional VADR documentation

Reference

The recommended citation for using VADR is: Alejandro A Schäffer, Eneida L Hatcher, Linda Yankie, Lara Shonkwiler, J Rodney Brister, Ilene Karsch-Mizrachi, Eric P Nawrocki; VADR: validation and annotation of virus sequence submissions to GenBank. BMC Bioinformatics 21, 211 (2020). https://doi.org/10.1186/s12859-020-3537-3
This page was adapted for HAV from the Mpox virus annotation page.
