Migrate image datasets to Hugging Face / Kaggle to reduce repo size #15

## Problem

The repository currently stores all reference images (`gemini_black/`, `gemini_white/`, `gemini_random/`) directly in git. As contributors add more images (e.g. `gemini_black_nb_pro/`, `gemini_white_nb_pro/` with 150-200 images each), the repo size will grow significantly. Git isn't designed for large binary datasets, and every clone downloads the full history of these files.

## Suggestion

Migrate image datasets to **Hugging Face Datasets** or **Kaggle Datasets**:

### Option A: Hugging Face Hub (recommended)
- Create a dataset repo at `huggingface.co/datasets/aloshdenny/reverse-synthid-images`
- Organize by folder: `gemini_black/`, `gemini_white/`, `gemini_black_nb_pro/`, etc.
- Contributors can upload via the `huggingface-cli` tool (installed with the `huggingface_hub` package) or the web UI
- Easy to load in Python: `from datasets import load_dataset`
- Free hosting, versioned, supports large files natively via LFS
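A contributor-side upload could look roughly like the sketch below. It assumes the dataset repo proposed above has been created and that the contributor is already authenticated (for example with `huggingface-cli login`). The helper names `list_images` and `upload_image_folder` and the extension list are illustrative, not part of this repo.

```python
from pathlib import Path

# Dataset repo name proposed in this issue; it may not exist yet.
REPO_ID = "aloshdenny/reverse-synthid-images"
# Assumed image extensions; adjust to what the folders actually contain.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def list_images(folder):
    """Return the image files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    )


def upload_image_folder(folder, repo_id=REPO_ID):
    """Upload one local folder (e.g. gemini_black/) to the dataset repo."""
    # Imported lazily so list_images() works without the package installed.
    from huggingface_hub import HfApi

    folder = Path(folder)
    HfApi().upload_folder(
        repo_id=repo_id,
        repo_type="dataset",
        folder_path=str(folder),
        path_in_repo=folder.name,  # keep the same folder name on the Hub
        commit_message=f"Add {len(list_images(folder))} images to {folder.name}",
    )
```

Calling `upload_image_folder("gemini_black_nb_pro")` would then push that folder as a single commit, keeping the folder layout described above.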

### Option B: Kaggle Datasets
- Host at `kaggle.com/datasets/aloshdenny/reverse-synthid-images`
- Contributors upload via Kaggle API
- Good visibility in the ML community

### Repo changes needed
1. Move existing images to the chosen platform
2. Replace image folders with a download script (e.g. `scripts/download_images.py`)
3. Add the dataset link to README
4. Update the contribution guide so contributors upload images to the dataset instead of opening PRs
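As a starting point for step 2, `scripts/download_images.py` could be sketched as follows, assuming Option A is chosen and the dataset repo mirrors the current folder names. The repo ID is the one proposed above and may not exist yet. `missing_folders` and `download_images` are hypothetical names, and the optional `revision` argument is there so a specific dataset version can be pinned.

```python
from pathlib import Path

# Dataset repo name proposed in this issue; it may not exist yet.
REPO_ID = "aloshdenny/reverse-synthid-images"
# Folders currently stored in git, assumed to keep the same names on the Hub.
FOLDERS = ["gemini_black", "gemini_white", "gemini_random"]


def missing_folders(root, folders=FOLDERS):
    """Return the folders under `root` that are absent or contain no files."""
    root = Path(root)
    missing = []
    for name in folders:
        path = root / name
        if not path.is_dir() or not any(p.is_file() for p in path.iterdir()):
            missing.append(name)
    return missing


def download_images(root=".", revision=None):
    """Fetch only the folders that are missing locally; return their names."""
    todo = missing_folders(root)
    if not todo:
        return []  # nothing to do, so no network access is needed

    # Imported lazily so the check above works without the package installed.
    from huggingface_hub import snapshot_download

    snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        revision=revision,  # None means the latest version of the default branch
        local_dir=str(root),
        allow_patterns=[f"{name}/*" for name in todo],
    )
    return todo
```

Running `download_images()` from the repo root after cloning would restore the image folders in place, so existing code that reads `gemini_black/` and the others should keep working unchanged.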

## Current repo size concern

The existing `gemini_black/` (101 images), `gemini_white/` (101 images), and `gemini_random/` (88 images) folders already add 290 images to the repo. With the new `nb_pro` folders requesting 150-200 images each, plus future model variants, the repo could easily exceed 1-2 GB, making clones slow and CI expensive.
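To put numbers on this, the on-disk size of each image folder can be measured with a few lines of standard-library Python. This is only a sketch with hypothetical helper names. It measures the working tree, not the git history; `git count-objects -vH` reports the packed repository size.

```python
from pathlib import Path


def folder_size_bytes(folder):
    """Total size in bytes of all files under `folder`, including subfolders."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())


def size_report(root, folders):
    """Map each existing folder name to its size in bytes."""
    root = Path(root)
    return {
        name: folder_size_bytes(root / name)
        for name in folders
        if (root / name).is_dir()
    }


def print_report(root=".", folders=("gemini_black", "gemini_white", "gemini_random")):
    """Print one line per folder with its size in MiB."""
    for name, size in size_report(root, folders).items():
        print(f"{name}: {size / 2**20:.1f} MiB")
```

Calling `print_report()` from the repo root lists the current folders. Passing the `nb_pro` folder names as well would show how much the new contributions add.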

## Benefits
- **Faster clones** - code-only repo stays small
- **Better for contributors** - uploading images to HF/Kaggle is simpler than large git PRs
- **Versioning** - HF Hub tracks dataset versions properly
- **Discoverability** - datasets on HF/Kaggle get more visibility from the ML research community
