This project prepares the data for, and trains, a YOLO nano model for cloud segmentation, using the CloudSen12 and Titan datasets.
- Filters samples with cloud coverage > 55%
- Uses only the Sentinel-2 RGB bands
- Extracts polygons of cloud classes 1 and 2 (these classes represent clouds)
- Converts masks into YOLO-Segmentation format
- Install dependencies: `pip install -r requirements.txt`
- Run the script: `python preprocess_earth.py`
- Output: RGB images and `.txt` annotations in YOLO-Seg format, saved in `datasets/CloudSen12/train/` and `datasets/CloudSen12/val/`
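The mask-to-label conversion described above can be sketched as follows. This is a minimal illustration, not the code in `preprocess_earth.py`: the helper names `cloud_fraction` and `polygon_to_yolo_line` are hypothetical, the mask is assumed to be a 2D grid of integer class ids, the polygon is assumed to have been traced already, and mapping both cloud classes to a single class index `0` is an assumption. The only format fact relied on is that a YOLO-Seg label line is a class index followed by polygon `x y` pairs normalized by the image width and height.

```python
CLOUD_CLASSES = {1, 2}  # classes treated as cloud, per the list above


def cloud_fraction(mask):
    """Return the fraction of pixels whose class id is a cloud class."""
    total = sum(len(row) for row in mask)
    if total == 0:
        return 0.0
    cloudy = sum(1 for row in mask for value in row if value in CLOUD_CLASSES)
    return cloudy / total


def polygon_to_yolo_line(points, width, height, class_id=0):
    """Format pixel-space (x, y) vertices as one YOLO-Seg label line.

    class_id=0 assumes a single merged "cloud" class (an assumption).
    """
    if len(points) < 3:
        raise ValueError("a polygon needs at least three vertices")
    coords = []
    for x, y in points:
        # Normalize to [0, 1] and clamp points that fall outside the image.
        coords.append(min(max(x / width, 0.0), 1.0))
        coords.append(min(max(y / height, 0.0), 1.0))
    return " ".join([str(class_id)] + [f"{c:.6f}" for c in coords])


# Toy 4x4 mask: four pixels belong to classes 1 or 2.
mask = [
    [0, 1, 1, 0],
    [0, 2, 1, 0],
    [0, 0, 0, 3],
    [0, 0, 0, 0],
]
print(cloud_fraction(mask))
print(polygon_to_yolo_line([(1, 0), (3, 0), (3, 2), (1, 2)], width=4, height=4))
```

The 55% filter would compare `cloud_fraction(mask)` against `0.55`; which side of the threshold is kept depends on the script itself.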
- Annotated using LabelMe (polygons)
- Automatically merges and splits into train, val, and test
- Converts `.json` annotations into YOLO-Segmentation format
- Removes corrupted images or those without valid labels
- Merges `train + val → full_train`
- Make sure the original dataset from https://zenodo.org/records/13988492 is placed inside the `datasets` directory, and that the `test` and `train` subdirectories each contain only `labels` and `images`. Any other subdirectories must be deleted.
- Run the script: `python preprocess_titan.py`
- Output: Data and annotations in YOLO-Seg format, saved in `datasets/Titan/`
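The LabelMe-to-YOLO conversion mentioned above could look roughly like this. It is a sketch under stated assumptions, not the logic of `preprocess_titan.py`: the function name `labelme_to_yolo_lines` and the label name `"cloud"` are hypothetical, and the input is assumed to follow the usual LabelMe JSON layout (`imageWidth`, `imageHeight`, and a `shapes` list whose entries carry `label`, `shape_type`, and `points`).

```python
import json


def labelme_to_yolo_lines(labelme, class_ids):
    """Convert one parsed LabelMe annotation into YOLO-Seg label lines."""
    width = labelme["imageWidth"]
    height = labelme["imageHeight"]
    lines = []
    for shape in labelme.get("shapes", []):
        # Skip non-polygon shapes, unknown labels, and degenerate polygons.
        if shape.get("shape_type", "polygon") != "polygon":
            continue
        if shape["label"] not in class_ids or len(shape["points"]) < 3:
            continue
        coords = []
        for x, y in shape["points"]:
            # Normalize pixel coordinates to [0, 1] and clamp to the image.
            coords.append(min(max(x / width, 0.0), 1.0))
            coords.append(min(max(y / height, 0.0), 1.0))
        class_id = class_ids[shape["label"]]
        lines.append(" ".join([str(class_id)] + [f"{c:.6f}" for c in coords]))
    return lines


# Hypothetical annotation: one valid triangle and one two-point shape.
example = json.loads("""
{
  "imageWidth": 200,
  "imageHeight": 100,
  "shapes": [
    {"label": "cloud", "shape_type": "polygon",
     "points": [[20, 10], [100, 10], [100, 50]]},
    {"label": "cloud", "shape_type": "polygon", "points": [[0, 0], [5, 5]]}
  ]
}
""")
lines = labelme_to_yolo_lines(example, {"cloud": 0})
print(lines)
```

An empty result marks an image without valid labels, which is one plausible criterion for the removal step listed above.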
Install Ultralytics YOLOv11: `pip install ultralytics`
Then install all dependencies via requirements.txt: `pip install -r requirements.txt`
Run the scripts in the following order:
- `training_earth.py`: Initial training on CloudSen12
- `tuning_titan.py`: Fine-tuning on the Titan dataset
- `final_model.py`: Retrain the final model on the entire Titan dataset and evaluate its performance
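The three-stage order above can be sketched with the Ultralytics Python API. Everything specific here is an assumption for illustration: the run names, the `yolo11n-seg.pt` starting weights, the epoch count, the `runs/segment/<name>/weights/best.pt` checkpoint layout, and in particular the choice to start the final stage from the CloudSen12 checkpoint (the README does not say which checkpoint `final_model.py` starts from). The helpers `weights_after`, `build_plan`, and `run_stage` are hypothetical and do not come from the project scripts.

```python
def weights_after(run_name, project="runs/segment"):
    """Path where a finished run's best checkpoint is expected (assumed layout)."""
    return f"{project}/{run_name}/weights/best.pt"


def build_plan(base_weights="yolo11n-seg.pt"):
    """Return the stages as (run name, starting weights, dataset YAML)."""
    earth_weights = weights_after("earth")
    return [
        # 1. Initial training on CloudSen12.
        ("earth", base_weights, "yolo_configs/earth.yaml"),
        # 2. Fine-tuning on the Titan train/val split.
        ("titan_tune", earth_weights, "yolo_configs/titan.yaml"),
        # 3. Retraining on the full Titan training set (starting point assumed).
        ("titan_full", earth_weights, "yolo_configs/titan_full.yaml"),
    ]


def run_stage(name, weights, data_yaml, epochs=100, imgsz=640):
    """Train one stage; ultralytics is imported lazily so the plan can be
    inspected on a machine where it is not installed."""
    from ultralytics import YOLO

    model = YOLO(weights)
    model.train(data=data_yaml, epochs=epochs, imgsz=imgsz,
                project="runs/segment", name=name, exist_ok=True)
    return model


plan = build_plan()
for name, weights, data_yaml in plan:
    print(f"{name}: start from {weights}, train on {data_yaml}")
```

To actually train, each tuple can be passed on with `run_stage(*stage)`. Evaluation after the last stage could then use the returned model's `val()` method.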
Make sure the .yaml files in the yolo_configs/ folder point to the correct dataset paths.
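As a rough aid for checking those paths, the sketch below renders a minimal dataset config in the shape Ultralytics expects (`path`, `train`, `val`, `names`) and reports which configured directories are missing. The helper names `dataset_yaml` and `missing_dirs`, the single class name `cloud`, and the `train/images` and `val/images` values are assumptions; the actual files in `yolo_configs/` may differ.

```python
from pathlib import Path


def dataset_yaml(root, train, val, names):
    """Render a minimal Ultralytics-style dataset config as YAML text."""
    lines = [f"path: {root}", f"train: {train}", f"val: {val}", "names:"]
    lines += [f"  {index}: {name}" for index, name in enumerate(names)]
    return "\n".join(lines) + "\n"


def missing_dirs(root, *subdirs):
    """Return the configured directories that do not exist on disk."""
    return [str(Path(root) / sub) for sub in subdirs
            if not (Path(root) / sub).is_dir()]


text = dataset_yaml("datasets/CloudSen12", "train/images", "val/images", ["cloud"])
print(text)
# An empty list means both image directories were found.
print(missing_dirs("datasets/CloudSen12", "train/images", "val/images"))
```

Using an absolute `path` value avoids any ambiguity about how relative paths are resolved.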
To retrain or experiment with new parameters:
- Change the model names and weights in the scripts
- Update the `.yaml` files if needed
.
├── datasets/
│ ├── CloudSen12/
│ │ ├── train/
│ │ │ ├── images/
│ │ │ └── labels/
│ │ └── val/
│ │ ├── images/
│ │ └── labels/
│ ├── Dataset_Zenodo/
│ │ ├── train/
│ │ │ ├── images/
│ │ │ └── labels/
│ │ └── test/
│ │ ├── images/
│ │ └── labels/
│ └── Titan/
│ ├── full_train/
│ ├── train/
│ ├── val/
│ └── test/
├── scripts/
│ ├── final_model.py
│ ├── preprocess_earth.py
│ ├── preprocess_titan.py
│ ├── training_earth.py
│ └── tuning_titan.py
├── yolo_configs/
│ ├── earth.yaml
│ ├── titan.yaml
│ └── titan_full.yaml
├── requirements.txt
└── README.md
- To obtain a consistent subdataset from CloudSen12, the preprocessing phase applies validity checks to the images, which can make the download time-consuming (depending on your internet speed). We therefore recommend using the pre-uploaded version.
- Make sure to run the scripts from the correct directory
Yahn, Zachary; Trent, Douglas; Duncan, Ethan; Seignovert, Benoit; Santerre, John; Nixon, Conor. (2024).
Supplemental Data: Rapid Automated Mapping of Clouds on Titan with Instance Segmentation (2.1) [Data set]. Zenodo.
https://doi.org/10.5281/zenodo.13988492
Licensed under Creative Commons Attribution 4.0 International (CC BY 4.0).