A PyTorch Dataset for loading and processing AI-READI dataset.
The PatientDataset class is a PyTorch Dataset designed to handle patient data from the AI-READI dataset. The dataset includes 3D OCT, 3D OCTA, 2D fundus, and en face images, with 1,067 cases (and growing) and over 1,500 diverse tasks spanning segmentation, classification, and regression. This dataloader supports flexible configuration to filter, preprocess, and transform patient data for machine learning tasks in both image analysis and clinical prediction. This module provides MONAI-compatible dataset classes for loading OCT/OCTA/Photography imaging data, supporting both cache-based and on-demand access. It is tested with monai.data.DataLoader and follows MONAI-style dictionary-based outputs.
Each dataset returns a dictionary per sample with the following keys:
data_dict["frames"]: the imaging data (2D or 3D tensor depending on settings)data_dict["label"]: the ground truth label associated with the sample
We strongly recommend relying on the build_dataset.py script to construct datasets, as it encapsulates essential logic for transformation, caching, and configuration.
- Concept ID-based Data Loading: Loads data with respect to a specific clinical observation or measurement, which is coded as a clinical concept ID.
- Easy Image-Target Pairing: Facilitates seamless loading of image and target pairs via the dataloader, making it convenient for model training.
- Seamless Integration with MONAI Pipeline: Built on MONAI’s dataset class, allowing users to apply flexible preprocessing and transformations as part of their data pipeline.
- Support for Multiple Imaging Modalities and Devices: Handles images from various devices, models, imaging techniques (OCT, IR, CFP, OCTA, etc.), and anatomical regions. Users can specify which device or imaging type to load, making it easy to extract specific datasets via the API.
This guide walks you through setting up a Python environment and installing the aireadi-loader package along with its dependencies.
-
Create environment:
conda create -n aireadi python=3.10 -y conda activate aireadi
-
Install
aireadi-loader:pip install aireadi-loader
-
Install remaining dependencies and example training code:
git clone https://github.com/AI-READI/aireadi_loader.git cd aireadi_loader pip install -r requirements.txt
- This dataloader currently supports Linux systems only.
(Installation has not been tested on Windows or macOS.) - Make sure you have a CUDA-compatible GPU and the correct NVIDIA drivers installed for GPU acceleration.
- Tested with:
- Python 3.10 or 3.11
- CUDA 11.8 or 12.1
🛠️ If you're using Windows, WSL2 with Ubuntu may work, but we recommend Linux for best compatibility and performance.
This example demonstrates how to utilize the AIREADI dataloader for various input configurations.
Note: The core focus of this release is the dataloader itself, not model training.
However, we include example training scripts to showcase how the dataloader can be integrated into practical workflows.
Decide which clinical variable (training target) to predict. Each target is encoded using a concept_id.
You can find the corresponding concept_id for your variable of interest in the AIREADI Data Domain Table.
For example, to predict HbA1c, search for "Hba1c" in the table — the TARGET_CONCEPT_ID will be 3004410.
Please make sure you understand your prediction target — is it a classification (e.g., normal vs. abnormal, or multiclass) or a regression task (e.g., predicting a continuous value like HbA1c level)? This will affect how you configure nb_classes and choose the appropriate loss function.
Choose the imaging condition(s) you want to train on, such as:
- Device model
- Imaging modality
- Anatomical region
Valid combinations can be found in the AIREADI Dataloader Access Table.
For an overview and available images, please refer to the following link:
This dataloader uses a dictionary-based transformation pipeline compatible with MONAI, where inputs are expected in the form of a dictionary (e.g., {"frames": ..., "label": ...}). MONAI's Compose and dictionary-style transforms (like Resized, RandRotated, etc.) are used to apply preprocessing consistently across multimodal or sequence-based data. By default, the ToTensord transform is applied after any user-defined transforms.
train_transform = Compose([
Resized(
keys=["frames"],
spatial_size=(input_size, input_size),
mode="bilinear",
),
ScaleIntensityd(keys=["frames"]),
RandRotated(
keys=["frames"],
range_x=(-0.17, 0.17),
prob=0.5,
mode="bilinear",
),
])Use the build_dataset() function defined in build_dataset.py to construct your dataset.
You will need to set a few required parameters — see the Key Parameters section for details.
Note: The AI-READI dataset comes with a predetermined train/validation/test split to support reproducible research and fair benchmarking. Below is the breakdown of the number of patients in each split by demographic categories such as sex, race, and diabetes status.
Note: shuffle=True has no effect when using PatientFastAccessDataset, which inherits from PyTorch's IterableDataset.
Example:
from build_dataset import build_dataset
from monai.data import DataLoader
train_dataset = build_dataset(is_train=True, args=args)
train_loader = DataLoader(train_dataset, batch_size=8, shuffle=True, num_workers=4)
test_dataset = build_dataset(is_train=False, args=args)
test_loader = DataLoader(test_dataset, batch_size=8, shuffle=False, num_workers=4)During training, data samples are accessed using:
for batch in dataloader:
images = batch["frames"].to(device)
labels = batch["label"].to(device).long() // for classification
...- 2D Model Training: For slice-based inputs such as OCT/OCTA center slices, OCTA slabs, and fundus images (CFP/IR).
- 3D Model Training: For volume-based models that take full OCT or OCTA scans as input.
- Multimodal 2D Training: Combines multiple 2D image types (e.g., CFP + OCTA slab) for multimodal fusion.
Each script shows how to:
- Initialize the dataset and dataloader with selected modalitie(s)
- Iterate through singlemodal/multimodal batches
Refer to the example training scripts for full workflows:
train_2d.pyfor 2D model trainingtrain_3d.pyfor 3D model trainingtrain_multimodal.pyfor multimodal model training
For users aiming to achieve stronger performance, we recommend exploring the following pretrained foundation models we collaborated on:
- RETFound (MAE-based): A self-supervised pretrained model on a large retinal dataset using masked autoencoding.
- OCTCubeM: A 3D foundation model designed for generalizable OCT analysis across diseases, datasets, and imaging devices.
These models can be easily adapted to the AIREADI dataset using the provided dataloader.
Base dataset class for on-demand loading. Used when args.cache_rate == 0. Suitable for low-memory environments or quick debugging.
- Efficient lazy loading to minimize memory usage
- Great for low-resource environments and fast prototyping
- Compatible with MONAI
DataLoader
Caching dataset class used when args.cache_rate > 0. Loads a portion (or all) of the dataset into memory at initialization for faster training.
- Partial or full caching supported via
cache_rate - Good balance between speed and memory
- Compatible with MONAI
DataLoader
Optimized caching class for 2D slice datasets when args.patient_dataset_type == "slice" and imaging is "oct" or "octa".
- Inherits from PyTorch's
IterableDataset, enabling efficient streaming of data without full dataset indexing. - Tailored for slice-based imaging
- Optimized for caching and fast access to OCT/OCTA slices.
- Compatible with MONAI
DataLoader
The root directory containing all dataset files, including imaging data, clinical metadata, and other related files.
After downloading the dataset, the structure should resemble: [your dataset path]/AIREADI/YEAR2/
Make sure to set root_dir to this path (i.e., the full path to the YEAR2 folder). For example:
args.root_dir = "/home/user/datasets/AIREADI/YEAR2/"Specifies the dataset split to use ('train', 'test', 'val'). Determines the subset of patients loaded based on AIREADI predefined splits.
Specifies how the data should be loaded:
slice: Loads imges for CFP and IR imaging, or individual slices from the volume data for OCT or OCTA.center_slice: Loads only the central slice of each volume (only available for volume data; OCT or OCTA).volume: Loads the entire volume (only available for volume data; OCT or OCTA).
Type of imaging data to load. Should be set to one of the following options.
octoctaircfp
Filters data based on the anatomical region imaged. The available options for anatomic_region include:
MaculaMacula_6x6Macula_12x12Wide_FieldOptic_DiscTemporal_PeripheryOptic_Disc_6x6MosaicNasal
Note: Ensure you input a valid combination of anatomic_region with respect to device and imaging conditions. Refer to the AIREADI Dataloader Access Table for possible valid combinations.
Filters patients based on the OCT device used for imaging (e.g., 'Maestro2', 'Cirrus'). This argument is required and must be explicitly set by the user to avoid biasing the dataset toward a particular device or vendor.
Used to filter clinical data based on the specified concept ID. If 0, the dataset returns class indices (cls_idx) from the study_group.
The study_group categorizes patients based on their diabetes status. If concept_id is set to 0, the dataloader assigns labels according to the following categories:
- Healthy:
0 - Pre-diabetes (lifestyle controlled):
1 - Oral medication and/or non-insulin injectable medication controlled:
2 - Insulin-dependent:
3
If concept_id == -1: The dataset returns laterality labels:
0for right eye (OD)1for left eye (OS)
This setting can be helpful for sanity-checking your model pipeline. Since laterality is an easily learnable signal in most imaging modalities, a model trained with concept_id == -1 typically converges near 100% accuracy very quickly.
A list of values to ignore in the clinical data (e.g., [777, 999]).
A preprocessing function applied to the data before any transformations.
Optional transformations to be applied to the data samples (e.g., augmentations).
If None, a default transform will be used, which:
- extracts images from DICOM files,
- maps the raw label using the
concept_id, - and converts the
"frames"and"label"keys totorch.float32tensors usingToTensord.
This ensures the data is in the correct format for training even when no custom transforms are provided.
Specifies the type of en-face slab to load when using OCTA imaging. Should be set to one of the following options when extracting en-face slab data from OCTA:
superficialdeepouter_retinachoriocapillaris
For OCTA flow cube data, set imaging='octa' and keep octa_enface_imaging = None
Additional keyword arguments to configure the dataset.
This dataloader was developed by Jack Strand, Yelena Bagdasarova, and Yuka Kihara from the Computational Ophthalmology Lab, led by Aaron Lee and Cecilia Lee.