v1.5.0 - Await another voice
New features
- Binning was reworked. Binners can now be selected from the command line using the -binners option. The options --nomaxbin and --nometabat are now obsolete. Steps 14 (metabat) and 15 (maxbin) were dropped and replaced by a single step that performs all binning. This changes the numbering of subsequent scripts and results. The script versionchange.pl was introduced to make old results compatible with the current version.
- Added the utility script sqm_mapper.pl, which maps reads to a given reference using one of the included sequence aligners (Bowtie2, BWA) and estimates the abundance of the contigs and ORFs in the reference.
- Reworked the utility script sqm_annot.pl. This script performs functional and taxonomic annotation for a set of genes or genomes. Genomes must be nucleotide sequences, while gene sequences can be either nucleotides or amino acids. All sequence files must be in fasta format.
- Added CONCOCT as an extra option for binning.
- Added the possibility of selecting only the functions/taxa of interest when using sqmreads2tables.py to create summary tables from sqm_reads.pl and sqm_longreads.pl projects. This is achieved by passing an extra -q/--query parameter to sqmreads2tables.py. Query syntax is similar to that of anvi-filter-sqm.py.
- Added the utility script add_databases.pl, which adds one or several new databases to the results of an existing project. The script runs DIAMOND searches for the new databases and then re-runs several SqueezeMeta scripts to include the new database(s) in the existing results. The following scripts are invoked: 07, 12, 13 and 21.
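The renumbering caused by merging the two binning steps can be pictured with a small sketch. It is only an illustration: it assumes that the merged binning step keeps number 14 and that every later step moves down by one. The function name new_step_number is hypothetical, and the real conversion of old projects is done by versionchange.pl, which may handle more than step numbers.

```python
def new_step_number(old_step: int) -> int:
    """Map a pre-1.5 step number to its assumed 1.5 equivalent.

    Assumption (not taken from the SqueezeMeta code): old steps 14 and 15
    collapse into a single step 14, and all later steps shift down by one.
    """
    if old_step < 1:
        raise ValueError("step numbers start at 1")
    if old_step <= 14:
        # Steps before binning, and the first binning step, keep their number.
        return old_step
    if old_step == 15:
        # The second binning step is absorbed into the merged step 14.
        return 14
    # Everything after binning moves down by one position.
    return old_step - 1


if __name__ == "__main__":
    for old in (13, 14, 15, 16, 22):
        print(f"old step {old:02d} -> new step {new_step_number(old):02d}")
```

Under this assumption, a result file produced by an old step 16 would correspond to step 15 in a v1.5 project.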
Minor changes / bugfixes
- Added the --norename flag to SqueezeMeta.pl, to keep the original contig names produced by the assemblers or already present in the external assembly provided with the -extassembly parameter. Contig names containing underscores may break the pipeline, so use this flag with caution.
- Added compatibility with anvi'o 7.1 in the anvi-load-sqm.py and anvi-filter-sqm.py scripts.
- Updated canu to version 2.2.
- Updated flye to version 2.9.
- Fixed a bug in which the last contig and its length were not included in calculations.
- Changed the automatic calculation of the -b parameter in DIAMOND from free_ram/5 to free_ram/8 to be more conservative with memory usage.
- Fixed a bug in which sqm2anvio.pl would fail if the file names contained the substring "sam" (other than having ".sam" as the extension).
- Added the --very-sensitive-local parameter to bowtie2 calls to increase performance.
- Fixed an issue in the blastxcollapse.pl script that appeared when the number of sequences was smaller than the number of threads.
- Added the option to use minimap2 as a mapper in the sqm_mapper.pl script.
- Corrected an error in which the RPKM of the contigs was multiplied by 10^9 rather than 10^6.
- Fixed an issue in which the minpath step generated files in the wrong paths.
- SQMtools: Added the metadata_groups parameter to plotTaxonomy, plotFunctions, plotBars and plotHeatmap to divide samples between different subplots.
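To make the RPKM correction listed above concrete, here is a minimal sketch of the usual definition (reads per kilobase of sequence per million mapped reads). It is not taken from the SqueezeMeta code: the function name rpkm and its arguments are hypothetical. Once the contig length is expressed in kilobases, the scaling factor on the read total is 10^6, and a factor of 10^9 at that point would inflate every value a thousandfold.

```python
def rpkm(mapped_reads: int, length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase per million mapped reads (standard definition).

    mapped_reads:        reads mapping to this contig
    length_bp:           contig length in base pairs
    total_mapped_reads:  reads mapped in the whole sample
    """
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total read count must be positive")
    length_kb = length_bp / 1_000                      # bp -> kilobases
    millions_of_reads = total_mapped_reads / 1_000_000  # reads -> millions
    return mapped_reads / (length_kb * millions_of_reads)


if __name__ == "__main__":
    # 500 reads on a 2 kb contig in a sample with 10 million mapped reads:
    # 500 / (2 * 10) = 25
    print(rpkm(500, 2_000, 10_000_000))
```

The same value can be written as mapped_reads * 10^6 / (length_kb * total_mapped_reads), which is where the 10^6 factor comes from.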
Compatibility Changes
- Results generated by previous versions of SqueezeMeta will not load into SQMtools 0.7.0 (which corresponds to SqueezeMeta release 1.5). We provide the utility script versionchange.pl to make older projects compatible with the new versions.
- Conversely, projects generated with SqueezeMeta v1.5 will not load into older versions of SQMtools.
As always, please open an issue if something's not working for you.