v1.5.0 - Await another voice

@fpusan fpusan released this 31 Dec 16:52
· 401 commits to master since this release

New features

  • Binning was overhauled. Binners can now be selected from the command line using the -binners option. The --nomaxbin and --nometabat options are now obsolete. Steps 14 (metabat) and 15 (maxbin) were dropped and replaced by a single step that performs all binning. This changes the numbering of subsequent scripts and results. The script versionchange.pl was introduced to make results from older versions compatible with the current one.
  • Added the utility script sqm_mapper.pl, which maps reads to a given reference using one of the included sequence aligners (Bowtie2, BWA) and estimates the abundance of the contigs and ORFs in the reference.
  • Reworked the utility script sqm_annot.pl. This script performs functional and taxonomic annotation for a set of genes or genomes. Genomes must be nucleotide sequences, while gene sequences can be either nucleotides or amino acids. All sequence files must be in fasta format.
  • Added CONCOCT as an extra option for binning.
  • Added the possibility of selecting only the functions/taxa of interest when using sqmreads2tables.py to create summary tables from sqm_reads.pl and sqm_longreads.pl projects. This is achieved by passing an extra -q/--query parameter to sqmreads2tables.py. Query syntax is similar to that of anvi-filter-sqm.py.
  • Added the utility script add_databases.pl, which adds one or several new databases to the results of an existing project. The script runs DIAMOND searches against the new databases and then re-runs several SqueezeMeta scripts to include the new database(s) in the existing results. The following scripts are invoked: 07, 12, 13 and 21.

Minor changes / bugfixes

  • Added the --norename flag to SqueezeMeta.pl, to keep the original contig names produced by the assemblers, or already present in the external assembly provided with the -extassembly parameter. Contig names containing underscores may break the pipeline, so use it with caution.
  • Added compatibility for anvi'o 7.1 in the anvi-load-sqm.py and anvi-filter-sqm.py scripts.
  • Updated canu to version 2.2.
  • Updated flye to version 2.9.
  • Fixed a bug in which neither the last contig nor its length were included in calculations.
  • Changed the automatic calculation of the -b parameter in DIAMOND from free_ram/5 to free_ram/8 to be more conservative with memory usage.
  • Fixed a bug in which sqm2anvio.pl would fail if the file names contained the substring "sam" anywhere other than in the ".sam" extension.
  • Added the --very-sensitive-local parameter to bowtie2 calls to increase mapping sensitivity.
  • Fixed an issue in the blastxcollapse.pl script that appeared when the number of sequences was smaller than the number of threads.
  • Added support for minimap2 as a mapper in the sqm_mapper.pl script.
  • Corrected an error in which the RPKM of the contigs was multiplied by 10^9 rather than 10^6.
  • Fixed an issue in which the minpath step generated files in the wrong paths.
  • SQMtools: Added the metadata_groups parameter to plotTaxonomy, plotFunctions, plotBars and plotHeatmap to split samples across different subplots.
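
To give a feel for the change to DIAMOND's -b parameter listed above, here is a minimal Python sketch of the old and new rules. It is not the code used by SqueezeMeta: the function name diamond_block_size is hypothetical, and it assumes free RAM is expressed in GB and that no rounding or clamping is applied, neither of which is stated in these notes.

```python
def diamond_block_size(free_ram_gb: float, divisor: int = 8) -> float:
    """Return a candidate value for DIAMOND's -b option.

    Hypothetical helper: assumes free RAM is given in GB and that the
    value is used as-is (no rounding or minimum/maximum applied).
    """
    if free_ram_gb <= 0:
        raise ValueError("free_ram_gb must be positive")
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    return free_ram_gb / divisor


if __name__ == "__main__":
    free_ram_gb = 64  # example figure, not taken from the release notes
    old_b = diamond_block_size(free_ram_gb, divisor=5)  # previous rule: free_ram/5
    new_b = diamond_block_size(free_ram_gb, divisor=8)  # current rule: free_ram/8
    print(f"old rule: -b {old_b:.1f}, new rule: -b {new_b:.1f}")
```

Under these assumptions, a machine with 64 GB of free RAM would go from -b 12.8 to -b 8.0, leaving more headroom for the rest of the pipeline.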

Compatibility Changes

  • Results generated by previous versions of SqueezeMeta will not load into SQMtools 0.7.0 (which corresponds to SqueezeMeta release 1.5). We provide the utility script versionchange.pl in order to make older projects compatible with the new versions.
  • Conversely, projects generated with SqueezeMeta v1.5 will not load into older versions of SQMtools.

As always, please open an issue if something's not working for you.