Parallel Processing release of MuTILs

@szolgyen szolgyen released this 21 May 20:20
· 4 commits to main since this release
f36b524

Summary

In this release, MuTILs has been accelerated by implementing parallel processing at the region-of-interest (ROI) level. ROIs are grouped into chunks, and the chunks are distributed among several CPUs for processing. The code has been refactored into three isolated ROI processing units: preprocessing, inference, and postprocessing. Each unit processes chunks independently and forwards data through queues. This makes it possible to balance the workload for optimal performance.
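The chunk-and-queue design described above can be illustrated with a minimal sketch. It is not the MuTILs implementation: the stage functions are placeholders, all names are hypothetical, and threads stand in for the worker processes to keep the example short.

```python
import queue
import threading

SENTINEL = None  # marks the end of the chunk stream


def make_chunks(items, chunk_size):
    """Group items (here: ROI identifiers) into fixed-size chunks."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def stage(work, inbox, outbox):
    """Apply `work` to every ROI of each chunk from `inbox`, then forward the chunk."""
    while True:
        chunk = inbox.get()
        if chunk is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next unit
            break
        outbox.put([work(roi) for roi in chunk])


def run_pipeline(rois, chunk_size=2):
    # Placeholder stage functions (assumption); the real units do image work.
    def preprocess(roi):
        return f"pre({roi})"

    def infer(roi):
        return f"inf({roi})"

    def postprocess(roi):
        return f"post({roi})"

    q_in, q_pre, q_inf, q_out = (queue.Queue() for _ in range(4))
    workers = [
        threading.Thread(target=stage, args=(preprocess, q_in, q_pre)),
        threading.Thread(target=stage, args=(infer, q_pre, q_inf)),
        threading.Thread(target=stage, args=(postprocess, q_inf, q_out)),
    ]
    for worker in workers:
        worker.start()

    # Feed the chunks, then signal that no more work is coming.
    for chunk in make_chunks(rois, chunk_size):
        q_in.put(chunk)
    q_in.put(SENTINEL)

    # Collect finished chunks until the shutdown signal arrives.
    results = []
    while True:
        chunk = q_out.get()
        if chunk is SENTINEL:
            break
        results.extend(chunk)
    for worker in workers:
        worker.join()
    return results
```

Because the units only communicate through queues, a slow unit could in principle be given more workers without touching the others, which is one way the workload balancing mentioned above might be achieved.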

Since MuTILs uses 5 model folds, it has been optimized to use 5 physical GPU devices for inference, one for each model. It also works with fewer GPUs or on CPUs, but without the benefit of this optimization.
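As a rough illustration of the fold-to-device mapping, the sketch below assigns each fold to a device string. The function name, the 0-based fold numbering, and the round-robin sharing when fewer GPUs are available are assumptions made for this example; the release notes only state that fewer GPUs or CPUs also work.

```python
def assign_folds_to_devices(n_folds=5, n_gpus=0):
    """Map each model fold to a device string.

    With n_gpus >= n_folds every fold gets its own GPU. With fewer GPUs the
    folds share devices round-robin (assumption). With no GPU all folds run
    on the CPU.
    """
    if n_gpus <= 0:
        return {fold: "cpu" for fold in range(n_folds)}
    return {fold: f"cuda:{fold % n_gpus}" for fold in range(n_folds)}
```

With five GPUs each fold maps to its own device (`cuda:0` to `cuda:4`); with two GPUs, folds 0, 2, and 4 would share `cuda:0` and folds 1 and 3 would share `cuda:1`.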

Content

The repository consists of four main folders (mutils_panoptic, configs, utils, and tests) and two linked submodule folders: histolab, pinned at commit 0747a43, and histomicstk, pinned at commit e936aa6. Both are forked submodules with version fixes and further performance optimizations. The repository also contains a run_docker.sh file.

The major updates in this release were made in:

  • mutils_panoptic/MuTILsWSIRunner.py - which runs WSI segmentation and analysis. It is the module to run for inference.
  • mutils_panoptic/MuTILsInference.py - which has the ROI processing units as classes (legacy module name).
  • configs/MuTILsWSIRunConfigs.py - which has the configuration as a dataclass.
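For readers unfamiliar with dataclass-based configuration, a minimal sketch of the pattern follows. The class name and every field are hypothetical and do not reflect the actual contents of MuTILsWSIRunConfigs.py; the default paths simply reuse the container mount points from the Docker command in the Usage section.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class ExampleRunConfig:
    """Hypothetical run configuration; not the real MuTILs config class."""

    slides_dir: Path = Path("/home/input")    # where the slides are mounted
    output_dir: Path = Path("/home/output")   # where results are written
    models_dir: Path = Path("/home/models")   # where model weights are mounted
    n_folds: int = 5                          # number of model folds
    # Mutable defaults need a factory so instances do not share one list.
    gpu_ids: list = field(default_factory=lambda: [0, 1, 2, 3, 4])


# Override only what differs from the defaults, e.g. a two-GPU machine.
cfg = ExampleRunConfig(gpu_ids=[0, 1])
```

Keeping the settings in one typed object makes the defaults visible in a single place and lets a run override individual values without editing the processing code.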

Usage

A custom Docker image has been created to host MuTILs for inference: szolgyen/mutils:v2

  • The container can be started with the Docker command:
docker run \
    --name Mutils \
    --gpus '"device=0,1,2,3,4"' \
    --rm \
    -it \
    -v /path/to/the/slides:/home/input \
    -v /path/to/the/output:/home/output \
    -v /path/to/the/mutils/models:/home/models \
    --ulimit core=0 \
    szolgyen/mutils:v2 \
    bash
  • The image comes with this release (v2.0.0) of MuTILs set up.
  • Once the container is running, MuTILs can be started with the command:
    python MuTILs_Panoptic/mutils_panoptic/MuTILsWSIRunner.py

Notes

Python has been upgraded from 3.8.10 to 3.10.12, and the scikit-image library has accordingly been upgraded from version 0.18.1 to 0.25.0. The newer scikit-image has a rewritten slic() function (in skimage/segmentation/slic_superpixels.py), which slightly affects how tissue regions are clustered to build the region adjacency graph (via the rag_threshold() function of histolab/filters/image_filters_functional.py, called in the _get_slide_region_adjacency() method of the MuTILsWSIRunner class). This in turn affects which model fold is assigned to an ROI for segmentation in the _assign_rois_to_rag() method of the MuTILsWSIRunner class. Different model folds predict slightly different outcomes on the same ROIs. As a consequence, the augmented results of the feature extraction produced by v1.0.0 and v2.0.0 of MuTILs are numerically close but not identical.
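When checking output from the two versions against each other, an exact equality test will therefore fail, and a tolerance-based comparison is more appropriate. The sketch below is one way to do this with the standard library; the function name, the dictionary layout of the features, and the tolerance values are assumptions, since the notes do not quantify how close the results are.

```python
import math


def compare_features(old, new, rel_tol=1e-2, abs_tol=1e-6):
    """Return the names of features that differ beyond the given tolerance.

    `old` and `new` map feature names to numeric values (hypothetical layout).
    A feature present in only one of the two mappings counts as a mismatch.
    """
    mismatched = []
    for name in sorted(old.keys() | new.keys()):
        if name not in old or name not in new:
            mismatched.append(name)
        elif not math.isclose(old[name], new[name], rel_tol=rel_tol, abs_tol=abs_tol):
            mismatched.append(name)
    return mismatched
```

An empty return value would indicate that all shared features agree within the chosen tolerance; the tolerance itself has to be picked per feature scale.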