Release v1.7.0 Birds of a feather · jtamames/SqueezeMeta

SqueezeMeta

Compatibility note

SqueezeMeta will now expect the CheckM2 database to be present in its database directory. If you downloaded the SqueezeMeta database before this release, you only need to download that extra file from here (make sure to uncompress it too!)

New features

We have revamped all the documentation and moved it to Read The Docs! We will no longer provide a PDF version of the documentation
SqueezeMeta can now be used to annotate a set of pre-existing genomes/bins and quantify their abundance in different samples. A directory containing genomes/bins can be provided through the -extbins parameter, which will run the pipeline on that pre-existing set of bins/genomes. This is similar to what -extassembly would do with a single FASTA file, but will treat each FASTA file in the input directory as a different bin
SqueezeMeta can now be used to quickly obtain bins from metagenomes, skipping the taxonomic/functional annotation of contigs and ORFs. We have added the --onlybins flag to SqueezeMeta.pl, in order to quickly perform assembly, binning and bin QC/annotation
SqueezeMeta can now optionally run GTDB-Tk for the taxonomic classification of bins, if the --gtdbtk flag is provided when calling the pipeline. Note that we do not redistribute the GTDB-Tk databases and they must be obtained separately. By default we expect them to be in a directory named gtdb inside the SqueezeMeta database directory, but a custom location can be provided via the -gtdbtk_data_path argument
Switched to using CheckM2 for the calculation of bin completeness/contamination. This gets rid of several bugs related to CheckM1 not having updated its taxonomy to the current standard (e.g. "Pseudomonadota" instead of "Proteobacteria"). As a consequence, strain heterogeneity is no longer available in the bin results (though we've left an empty column there for backwards compatibility reasons)
-taxbinmode has been deprecated, as GTDB-Tk can provide better bin-level taxonomies
Added the --fastnr flag, which in turn will pass the --fast flag to DIAMOND when running classification against the nr database in Step 4 of the pipeline. This is significantly faster at the expense of some accuracy, but didn't seem to change the results significantly in our test.
We have simplified the way we calculate disparity for contigs and bins, see details here
sqm2tables.py is now called at the end of SqueezeMeta runs
We're moving towards using conda packages rather than vendoring SqueezeMeta's dependencies, see details here
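As a rough illustration of how the new bin-related flags fit together, here is a minimal sketch that assembles a SqueezeMeta.pl command line. The helper name build_squeezemeta_command and the project, samples file and directory names are hypothetical. The -m, -p, -s and -f arguments are not described in these notes and are assumed here from the usual SqueezeMeta.pl usage, so please check them against the documentation for your installation before running anything.

```python
import shlex
from typing import List, Optional


def build_squeezemeta_command(
    project: str,
    samples_file: str,
    reads_dir: str,
    mode: str = "coassembly",
    only_bins: bool = False,
    gtdbtk: bool = False,
    gtdbtk_data_path: Optional[str] = None,
    fastnr: bool = False,
) -> List[str]:
    """Assemble an argument list for SqueezeMeta.pl (hypothetical helper).

    -m/-p/-s/-f are assumed from typical SqueezeMeta usage; the remaining
    flags are the ones introduced in this release.
    """
    cmd = ["SqueezeMeta.pl", "-m", mode, "-p", project,
           "-s", samples_file, "-f", reads_dir]
    if only_bins:
        # Assembly, binning and bin QC/annotation only
        cmd.append("--onlybins")
    if gtdbtk:
        # Classify bins with GTDB-Tk (databases must be obtained separately)
        cmd.append("--gtdbtk")
        if gtdbtk_data_path is not None:
            # Only needed when the databases are not in <SqueezeMeta db dir>/gtdb
            cmd += ["-gtdbtk_data_path", gtdbtk_data_path]
    if fastnr:
        # Passes --fast to DIAMOND when searching against nr
        cmd.append("--fastnr")
    return cmd


if __name__ == "__main__":
    # Placeholder project name and paths, for illustration only
    cmd = build_squeezemeta_command(
        "myproject", "my.samples", "raw_reads",
        only_bins=True, gtdbtk=True,
    )
    print(shlex.join(cmd))
```

The command is built as a list rather than a single string so that it can be handed to subprocess.run without shell quoting issues.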

Minor changes / bugfixes

Contig names and bin names now start with the project name, to make it easy to distinguish contigs/bins coming from different SqueezeMeta runs
Added read group tags identifying the sample from which the reads come to the BAM files produced in step 10
Removed the make_databases_alt.pl and configure_nodb_alt.pl scripts, as the standard make_databases.pl, download_databases.pl and configure_nodb.pl scripts are now able to switch to a mirror if our server is unavailable
Added the -g parameter which will control the value of the -g|--global-ranking parameter in DIAMOND when running it against the nr database
Use forking instead of threads in scripts 06 and 10 to reduce memory usage when multithreading
Fixed a bug that prevented sqm_hmm_reads.pl from working since it was trying to download legacy PFAM databases that are no longer reachable
Fixed a bug in which some ORFs would be duplicated if the pipeline went through step 13 on restart
Fixed the calculation of present pathways in step 20
Fixed a bug that prevented SqueezeMeta from working with newer versions of MetaBAT2
Several bugfixes to SqueezeMeta's behaviour when restarting a run
We now use the scaffolds.fasta result instead of the contigs.fasta one when running SPAdes with the -a spades or -a spades-base modes (we still use the transcripts.fasta result if running it with the -a rnaspades mode)
Fixed a bug in which sqm_annot.pl wasn't passing the right number of threads to subprocesses
Fixed a bug in step 10 when the total number of contigs was smaller than the available threads

SQMtools

New features

We have revamped all the documentation and moved it to Read The Docs! A PDF version will still be present as part of the CRAN release
SQMtools now supports loading more than one project into the same object. loadSQM can now be used to load the output of different SqueezeMeta runs into a single object that can be subsetted and plotted as a standard SQM object (see details here). This facilitates the analysis of e.g. sequential runs in which each sample was processed independently
We now provide basic functions for defining/modifying/curating bins within SQMtools, and the possibility of recalculating bin completeness/contamination after adding contigs to or removing them from a bin (either manually or through a subset function). See details here and here
Added exportContigs, exportORFs and exportBins to export the sequences present in a SQM or SQMbunch object
We changed the default way of calculating copy numbers from using RecA as a reference to using the median coverage of 10 Universal Single Copy Genes. This behaviour can be controlled via the single_copy_genes parameter in loadSQM
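To make the change in copy number calculation more concrete, here is a minimal sketch of the idea, written in Python rather than R and independent of SQMtools itself. It assumes that a function's copy number is its coverage divided by the reference coverage, where the reference is the median coverage of the single copy genes. The function name copy_numbers, the gene identifiers and all coverage values are hypothetical, and the actual SQMtools implementation may differ in its details.

```python
from statistics import median
from typing import Dict


def copy_numbers(
    function_coverage: Dict[str, float],
    uscg_coverage: Dict[str, float],
) -> Dict[str, float]:
    """Normalize function coverages by the median coverage of a set of
    universal single copy genes (USCGs).

    Assumption: copy number = coverage / median(USCG coverages).
    """
    if not uscg_coverage:
        raise ValueError("at least one single copy gene coverage is required")
    reference = median(uscg_coverage.values())
    if reference <= 0:
        raise ValueError("median single copy gene coverage must be positive")
    return {name: cov / reference for name, cov in function_coverage.items()}


if __name__ == "__main__":
    # Made-up coverages for three hypothetical single copy genes
    uscgs = {"uscg_a": 8.0, "uscg_b": 10.0, "uscg_c": 12.0}
    # Made-up coverages for two hypothetical functions
    functions = {"function_x": 25.0, "function_y": 5.0}
    print(copy_numbers(functions, uscgs))
```

Using the median of several marker genes rather than a single gene such as RecA makes the reference less sensitive to one gene being poorly assembled or unevenly covered in a given sample.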

Minor changes / bugfixes

Added the load_sequences argument to loadSQM to control whether contig/ORF sequences should be loaded. Setting it to FALSE will decrease memory usage
Added an output_dir parameter to exportPathway
Start and end positions of ORFs are now tracked explicitly in SQM$orfs$table
copy_number is now the default quantification method used by plotFunctions and exportPathway, when available.
Fixed some IDs missing from SQM names and paths vectors after running combineSQMlite
Fixed a bug in which the data.table package wasn't attached when loading SQMtools
Fixed a bug that occurred when subsetting was attempted with only one ORF/contig
