Daily automation to recommend AWS SageMaker GPU instance types for popular Hugging Face models and update the `amazon-sagemaker/repository-metadata` dataset (`modal.json`).
- Uses the Hugging Face Hub CLI (`hf`) to fetch the top 100 models by `likes`, `downloads`, and `trending_score`.
- Computes the union of these lists (up to 300 unique models/day).
- Downloads the current dataset config (`modal.json`) from `amazon-sagemaker/repository-metadata`.
- For models not present in `modal.json`, runs `hf-mem` to estimate memory requirements.
- Picks the cheapest allowed SageMaker instance that should not OOM, based on aggregate GPU VRAM.
- The set of "allowed" instances is defined by `config/instance_catalog.json`. For now we only consider the G5, G6, G6e, P4 (p4d/p4de), and P5 (p5/p5e) families.
- Appends a new entry with:
  - `id`
  - `instanceType`
  - `numGpu` (all GPUs on that instance)
  - `containerStartupHealthCheckTimeout` (hardcoded per instance type)
- `scripts/sync_modal.py`: main entrypoint (discover → diff → hf-mem → map → update → upload/PR)
- `scripts/hf_mem.py`: runs `uvx hf-mem` and parses its output
- `scripts/instance_selection.py`: instance catalog loading and cheapest-instance selection logic
- `scripts/modal_io.py`: reads/writes the `modal.json` config format
- `config/instance_catalog.json`: instance allowlist, VRAM, GPU counts, timeouts, and pricing
- `.github/workflows/daily-sync.yml`: scheduled GitHub Actions workflow
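For illustration, a new entry appended to `modal.json` with the fields listed above might look like the following (the model id, instance choice, and timeout value are hypothetical; the authoritative schema is whatever the dataset already contains):

```json
{
  "id": "meta-llama/Llama-3.1-8B-Instruct",
  "instanceType": "ml.g5.12xlarge",
  "numGpu": 4,
  "containerStartupHealthCheckTimeout": 600
}
```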
Prereqs:

- Python 3.10+
- `uv` installed (so `uvx hf-mem ...` works)
- Hugging Face CLI available via `pip install -U huggingface_hub` (provides the `hf` command)
Auth (needed for gated models + dataset PRs):

```bash
export HF_TOKEN="hf_..."
hf auth login --token "$HF_TOKEN"
```

Dry-run (no upload):

```bash
python scripts/sync_modal.py --dry-run
```

Write + open a PR on the dataset:

```bash
python scripts/sync_modal.py --write --create-pr
```

Set a repository secret named `HF_TOKEN` with write access to the dataset.
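A minimal shape for the scheduled workflow could look like the sketch below. This is an assumption about `.github/workflows/daily-sync.yml`, not its actual contents; the cron schedule, runner, and pinned action versions are illustrative:

```yaml
name: daily-sync
on:
  schedule:
    - cron: "0 6 * * *"   # once a day; the real schedule is an assumption
  workflow_dispatch: {}    # allow manual runs
jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -U huggingface_hub uv
      - run: python scripts/sync_modal.py --write --create-pr
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
```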