Automated analysis and visualization generation of the LD12 genomes and the freshwater genomes comparison paper.
- All code written by Lucas Sinclair.
The published paper for which this pipeline was made can be found here:
http://www.nature.com/ismej/journal/vaop/ncurrent/full/ismej2015260a.html
The code is documented with many docstrings. In addition, here is an overview of what happens in the analysis:
The command line tool supports a few optional arguments:
- e_value: Minimum e-value in similarity search. Defaults to 0.0001.
- mcl_factor: The MCL clustering factor. Defaults to 1.5.
- seq_type: Either nucl or prot. Defaults to prot.
- num_threads: Number of threads to use. Defaults to the number of cores on the current machine.
- min_identity: Minimum identity in similarity search. Defaults to 0.97.
- min_coverage: Minimum query coverage in similarity search. Defaults to 0.97.
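
To make the optional arguments and their defaults concrete, here is a minimal sketch of how such options could be declared with Python's standard argparse module. It is an illustration only, not the pipeline's actual parsing code: the function name build_parser, the use of argparse, and the "--" flag spelling are assumptions. Only the argument names and default values come from the list above.

```python
import argparse
import multiprocessing


def build_parser():
    """Hypothetical parser mirroring the documented optional arguments."""
    parser = argparse.ArgumentParser(
        description="Sketch of the optional arguments (not the real tool)."
    )
    # Thresholds for the similarity search
    parser.add_argument("--e_value", type=float, default=0.0001,
                        help="e-value threshold in similarity search")
    parser.add_argument("--min_identity", type=float, default=0.97,
                        help="minimum identity in similarity search")
    parser.add_argument("--min_coverage", type=float, default=0.97,
                        help="minimum query coverage in similarity search")
    # Clustering and sequence type
    parser.add_argument("--mcl_factor", type=float, default=1.5,
                        help="MCL clustering factor")
    parser.add_argument("--seq_type", choices=["nucl", "prot"], default="prot",
                        help="sequence type: nucl or prot")
    # Defaults to the number of cores on the current machine
    parser.add_argument("--num_threads", type=int,
                        default=multiprocessing.cpu_count(),
                        help="number of threads to use")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```

With this sketch, running the script without any flags yields the documented defaults, and passing for instance --seq_type nucl --mcl_factor 2.0 overrides only those two values.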
