Check out our pre-print here: https://www.biorxiv.org/content/10.64898/2026.02.10.704860v1
This repo contains the Nextflow pipeline, binaries, and scripts to run a tglow-pipeline instance for the analysis of high-content imaging data. A detailed walkthrough of the installation, configuration, and pipeline steps is given on the wiki, and a full list of options can be found in docs/parameters.md. A guided tutorial with example data is available here.
There are three components to the overall workflow:
- tglow-pipeline - Nextflow files and Python scripts for running pipeline processes
- tglow-core - A Python library with IO, parsing, and convenience functions based on AICSImageIO.
- tglow-r - A Seurat-like R package for analyzing the output HCI features
See here for full install instructions of all pipeline components.
This README gives a high-level overview; for a more detailed guide, please see the wiki. The pipeline consists of two main stages:
- stage: prepare and standardize raw images into a well/field-organized OME-TIFF layout with metadata.
- run_pipeline: perform image processing and feature extraction on the staged images.
Both stages are implemented as Nextflow workflows and can be run independently using -entry stage|run_pipeline.
Some steps in the pipeline require GPUs to be available: segmentation and deconvolution. Deconvolution will not run without a GPU. Segmentation (Cellpose) will run on CPU, but we only recommend this when generating masks in 2D; in 3D, the computational burden for large datasets is too high for CPU.
The pipeline is intended to run on high-performance compute (HPC) clusters. The bundled resource profiles should work for most HPCs, but some tweaks to queue names and GPU settings may be required, as flags differ between vendors and HPC configurations. Go to conf/processes.config and search for queue and clusterOptions to update them. Furthermore, each HPC is different, with different machines and resource limits, so you may need to add a profile for your HPC environment in the conf folder. The nf-core config directory may be of help for your HPC: https://nf-co.re/configs/. If something is unclear, feel free to raise an issue on GitHub.
If you don't want to run the pipeline on an HPC, you can run it locally by supplying
-profile local.
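For example, a scheduler-specific override in the conf folder might look like the sketch below. The SLURM executor, partition names, and `--gres` flag are placeholders for illustration only; use the values for your cluster, and match the process labels to those used in conf/processes.config.

```groovy
// Hypothetical HPC profile sketch (e.g. conf/my_hpc.config) for a SLURM cluster.
process {
    executor = 'slurm'
    queue    = 'cpu_short'              // your CPU partition name

    withLabel: 'gpu' {
        queue          = 'gpu_queue'    // your GPU partition name
        clusterOptions = '--gres=gpu:1' // GPU request flag; differs per scheduler and vendor
    }
}
```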
Purpose: Stage Revvity/PerkinElmer (currently Phenix or Operetta) acquisitions into a reproducible plate/row/col/field.ome.tiff structure and capture metadata (channel names, pixel sizes, channel order, original index files).
-> If you don't have a Phenix or Operetta export, you can skip this step, but you will need to organize the images with your own script. See more details here.
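If you have, say, single-plane TIFFs with a predictable naming scheme, such a script could be sketched as below. The `rNNcNNfNN` filename pattern and the `my_plate` name are made-up examples; adapt the parsing to your instrument's output, and note that real OME-TIFFs also need their metadata carried over.

```shell
# Toy input: one manually exported TIFF for row 01, col 02, field 03
# (the naming convention here is a made-up example).
mkdir -p raw
touch raw/r01c02f03.tiff

# Sort files into the plate_name/row/col/field.ome.tiff layout the pipeline expects.
for f in raw/*.tiff; do
  base=$(basename "$f" .tiff)   # e.g. r01c02f03
  row=${base:1:2}               # "01"
  col=${base:4:2}               # "02"
  field=${base:7:2}             # "03"
  mkdir -p "staged/my_plate/$row/$col"
  cp "$f" "staged/my_plate/$row/$col/$field.ome.tiff"
done
```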
Input:
- PerkinElmer index.xml / index.idx.xml and raw instrument files (or manually organized raw files).
Output:
- plate_name/row/col/field.ome.tiff (with metadata)
- manifest listing wells to process (used as a Nextflow channel)
- auxiliary files to capture provenance (index.xml, channel maps, etc.) (optional)
Nextflow processes:
- prepare_manifest — create a manifest with wells/fields to run (re-usable Nextflow channel)
- fetch_raw — read raw files and write standardized OME-TIFFs and metadata
Purpose: Run the core image-processing and feature-extraction steps on the staged images. The workflow is modular — many steps are optional or configurable.
Input:
- Staged OME-TIFFs (from stage) and optional per-plate/field metadata (flatfields, registration references, etc.)
Output:
- Segmentation outputs, registration matrices, flatfields, extracted feature tables, and logs/artifacts needed for downstream analysis.
Main processing steps (in typical execution order — each step can be enabled/disabled via config):
- estimate flatfield (Polynomial / BaSiCPY) (optional)
- Parallelization: per plate and channel, or a single flatfield for all plates and channels.
- Output: flatfield images only (no transformed images saved).
- register (cross correlation / pystackreg) (optional)
- Parallelization: per-well
- Output: registration matrices (no transformed images saved).
- Cellpose segmentation
- Parallelization: per-well, GPU-enabled
- Notes: If registration is used, segmentation currently runs on the reference plate. A nucleus channel is optional, but segmentation itself is required.
- Output: 2D or 3D cell & nucleus masks as TIFFs
- deconvolve with CLIJ2-fft (optional)
- Parallelization: per-well, GPU-enabled
- Output: deconvolved images (creates a data copy)
- finalizing images
- Parallelization: per-well
- Applies registration, flatfield correction, scaling, and max projection to the (deconvolved) images and collects the masks
- Output: Analysis-ready OME-TIFFs
- feature extraction with CellProfiler
- Parallelization: per-well
- Stage images into a CellProfiler-compatible layout, apply flatfields and registration (if enabled), and run feature extraction.
- Outputs: CellProfiler artifacts as a zip archive per well
- cellcrops (optional)
- Parallelization: per-well
- Produces an HDF5 file for each field in which each HDF5 group is a cell
- Outputs: HDF5 file with fully processed cell crops
See nextflow.config or the docs/parameters.md for available options and their descriptions.
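As an illustrative sketch only, a configuration file toggling optional steps could look like this. The parameter names below are hypothetical placeholders; check docs/parameters.md for the actual names and defaults.

```groovy
// Hypothetical my_config.config sketch; parameter names are illustrative only.
params {
    out_dir       = './results'  // where the pipeline stores outputs
    flatfield     = true         // estimate flatfields (Polynomial / BaSiCPy)
    registration  = false        // skip registration
    deconvolution = false        // CLIJ2-fft deconvolution; requires a GPU
    cellcrops     = true         // write per-field HDF5 cell crops
}
```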
Prerequisites:
- Nextflow and conda
- Completed install instructions
We strongly recommend configuring the pipeline through a configuration file, although parameters can be overridden on the command line. We recommend a project structure as follows:
- my_project
- results: By default this is where the pipeline stores outputs
- scripts
- logs
- my_config.config
- run_pipeline.sh
- workdir: By default this is the Nextflow workdir
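The suggested skeleton above can be created in one go; the names are suggestions, not requirements.

```shell
# Create the recommended project layout; adjust names to taste.
mkdir -p my_project/{results,scripts,logs,workdir}
touch my_project/my_config.config my_project/run_pipeline.sh
```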
Quick examples:
Stage PerkinElmer data from a raw export:
```shell
nextflow \
  -log logs/stage.nextflow.log \
  run </path/to/main.nf> \
  -profile <your profile> \
  -w ../workdir \
  -resume \
  -entry stage \
  -with-report logs/stage.nextflow.html \
  -with-trace logs/stage.nextflow.trace \
  -c my_config.config
```
Run the main pipeline on staged images:
```shell
nextflow \
  -log logs/run_pipeline.nextflow.log \
  run </path/to/main.nf> \
  -profile <your profile> \
  -w ../workdir \
  -resume \
  -entry run_pipeline \
  -with-report logs/run_pipeline.nextflow.html \
  -with-trace logs/run_pipeline.nextflow.trace \
  -c my_config.config
```
See known issues and notes. If you find an issue, please raise it on GitHub or contact us directly.
- Olivier Bakker
- Francesco Cisterno
