autoresearch-modal

A Modal fork of karpathy/autoresearch — run autonomous LLM pretraining research on cloud GPUs without needing a local NVIDIA GPU.

The original autoresearch requires a local NVIDIA GPU (ideally an H100). This fork adds a thin Modal wrapper so you can run everything from a laptop (macOS, Linux, Windows) while training executes on a remote H100 via Modal's serverless GPU infrastructure.

What changed from upstream:

  • Added run_modal.py — Modal wrapper that sends train.py to a remote H100 for each experiment
  • Updated program.md — agent instructions use modal run instead of uv run
  • Everything else (model, optimizer, data, evaluation) is identical to upstream

Quick start

Requirements: Python 3.10+, a Modal account, the modal CLI.

# 1. Install the Modal CLI (if you don't already have it)
pip install modal
modal setup

# 2. One-time data prep (~2 min, runs on a cheap T4)
modal run run_modal.py --prepare-data

# 3. Run a single training experiment (~5 min on H100)
modal run run_modal.py

# 4. Check status of a running/completed experiment
modal run run_modal.py --status

How it works

Each modal run run_modal.py invocation:

  1. Uploads your current train.py and prepare.py to Modal (picks up the agent's latest edits)
  2. Spins up an H100 container with all dependencies pre-installed (image is cached after first build)
  3. Runs training for the fixed 5-minute time budget
  4. Streams output back and saves results to a persistent Modal Volume for polling

Data shards and the tokenizer are stored on a Modal Volume (autoresearch-cache) so they persist across runs.
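The steps above map onto a small Modal app. The following is a hypothetical sketch of the wrapper's shape, not the actual contents of run_modal.py; the function names and the pip dependencies are illustrative, while the volume name (autoresearch-cache) and GPU type come from this README:

```python
# Sketch of a Modal wrapper like run_modal.py (illustrative, not the real file).
import modal

app = modal.App("autoresearch-modal")

# Dependencies are baked into the image, so it is cached after the first build.
image = modal.Image.debian_slim().pip_install("torch")

# Persistent volume holding data shards and the tokenizer across runs.
cache = modal.Volume.from_name("autoresearch-cache", create_if_missing=True)

@app.function(image=image, gpu="H100", volumes={"/cache": cache}, timeout=600)
def train():
    # train.py travels with the app on each invocation, so the agent's
    # latest local edits are what actually run on the remote H100.
    import subprocess
    subprocess.run(["python", "train.py"], check=True)

@app.local_entrypoint()
def main():
    # `modal run` calls this locally; .remote() executes train() in the cloud.
    train.remote()
```

Because the entrypoint runs locally while `train.remote()` executes in the container, output streams back to your terminal, and anything written under /cache persists on the volume for later polling.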

Running the agent

Same as upstream — spin up Claude Code or any coding agent in this repo, then prompt:

Hi have a look at program.md and let's kick off a new experiment! let's do the setup first.

The agent edits train.py locally and runs modal run run_modal.py > run.log 2>&1 for each experiment. The --status flag lets you poll results while training is in progress.
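The launch-and-poll pattern described above looks roughly like this from the repo root (the commands are the ones this README documents; the `tail` step is just one way to inspect the log):

```shell
# Launch an experiment in the background, capturing all output to run.log.
modal run run_modal.py > run.log 2>&1 &

# While training runs remotely, poll for progress.
modal run run_modal.py --status

# Inspect the streamed training output so far.
tail -f run.log
```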

Project structure

prepare.py      — constants, data prep + runtime utilities (do not modify)
train.py        — model, optimizer, training loop (agent modifies this)
run_modal.py    — Modal wrapper (sends train.py to remote H100)
program.md      — agent instructions (uses modal run instead of uv run)
pyproject.toml  — dependencies

Configuration

To change the GPU type, edit the gpu= parameter in run_modal.py:

@app.function(
    image=image,
    gpu="H100",  # Change to "A100-80GB", "A100", "A10G", "L4", "T4", etc.
    ...
)

Original README

See upstream for the full original README, design choices, and context.

License

MIT
