Research Loop is a complete scientific research workflow for your coding agents, built on top of a set of composable "skills" and some initial instructions that make sure your agent uses them.
It starts from the moment you open your coding agent and mention anything research-related. As soon as it sees that you're exploring a topic, it doesn't just start searching and dumping information. Instead, it steps back and asks you what you're really trying to figure out.
Once it's teased the research framing out of the conversation, it explores in parallel — papers, repos, debates, open problems — and shows you the landscape in a synthesis short enough to actually read and act on.
After you've picked a direction, your agent finds the gaps, runs them through a Carlini gate (one question at a time, waiting for your answers), and surfaces the ideas worth pursuing. Then it spins up parallel hypothesis lanes, applies gates between them, and kills the weak ones early.
Next up, once you say "go", it launches a subagent-driven experiment loop — proposing code mutations, running benchmarks, annotating results causally, and building a living knowledge graph that remembers what was tried and why it failed.
When you want to understand something deeply, just say "explain X." The learn skill walks you through how experts actually think about a topic — not a summary, but the underlying reasoning structures: core mental models one at a time, the places where the field genuinely disagrees, questions that expose whether you understand or just recognize, and finally a reverse test where you explain it back. Every gap gets logged to the lab notebook.
There's a bunch more to it, but that's the core of the system. And because the skills trigger automatically, you don't need to do anything special. Your agent just has Research Loop.
If Research Loop has helped you do work that matters and you are so inclined, consider sponsoring the project.
Thanks!
— Alexander
Clone the repo into your research workspace:
git clone https://github.com/moralespanitz/research-loop
cd research-loop
The .claude/ directory is picked up automatically by Claude Code. Open a new session and it's active.
go install github.com/moralespanitz/research-loop/cmd/research-loop@latest
research-loop init
Or with the install script:
curl -fsSL https://raw.githubusercontent.com/moralespanitz/research-loop/main/install.sh | sh
Open a new Claude Code session in the workspace and say anything research-related: "I want to explore transformer memory systems" or "explain policy compression." The agent should automatically load the relevant skill without you typing any command.
- research-loop: Activates when you mention research, a topic, papers, or experiments. Asks one question to confirm framing. Entry point for everything.
- status: Activates when you say "where are we", "what do we have", or "what should I do next". Renders the full decision tree: every gap found, every hypothesis evaluated, every lane alive or killed, every experiment pending or done, plus one specific next decision. Updates knowledge_graph.md on every run.
- learn: Activates when you say "explain", "what is", "I don't understand", or ask about any term. Builds the reasoning structures experts carry: mental models → field debates → diagnostic questions → reverse test. The difference between reading about something and actually understanding it.
- explore: Activates when you want to map a field or find papers. Spawns 4 parallel search agents simultaneously (papers, repos, debates, open problems). Saves everything to the lab notebook. Presents a 3-sentence synthesis, not a data dump.
- idea-selection: Activates when you want to find gaps or evaluate whether an idea is worth pursuing. Runs the conversational Carlini gate: 4 questions (taste, uniqueness, impact, feasibility), one at a time, scored and saved.
- discover: Activates when you want to test multiple angles. Runs 4 parallel hypothesis lanes (incremental, cross-field transfer, assumption challenge, systems/efficiency). Applies Carlini gates between stages. Kills weak lanes early.
- plan: Activates when you have a selected route and need concrete tasks. Breaks work into specific actions with exact file paths, verification steps, and time estimates. Creates a TodoWrite checklist. Nothing vague.
- loop: Activates when you have a hypothesis and want to run experiments. Writes the experiment plan and gets approval before running anything. Drives the PROPOSE → MUTATE → BENCHMARK → ANNOTATE cycle.
- execution: Activates when experiments complete. Annotates results causally, updates the lab notebook, and helps you decide: continue, pivot, or kill.
The agent checks for relevant skills before any response. These are mandatory workflows, not suggestions.
Learning
- learn — Builds expert reasoning structures: mental models, field debates, diagnostic questions, reverse test, gap tracking.
Exploration
- research-loop — Entry point. Conversational advisor. Routes to the right skill based on what you say.
- explore — Parallel field mapping. 4 agents simultaneously. Saves full results to lab notebook.
Idea Development
- idea-selection — Conversational Carlini gate. Taste, uniqueness, impact, feasibility. One question at a time.
- discover — Parallel hypothesis lanes. 4 angles. Gates between stages. Kills weak lanes early.
Experiments
- loop — PROPOSE → MUTATE → BENCHMARK → ANNOTATE cycle. Living knowledge graph.
- execution — Result annotation, causal reasoning, continue/pivot/kill decisions.
Every session accumulates to a lab notebook:
.research-loop/sessions/<slug>/
lab_notebook.md # everything: framing, papers, gaps, scores, results
knowledge_graph.md # living DAG of hypotheses tried and why they failed
autoresearch.jsonl # machine-readable experiment history
Sessions are resumable. Bundles are portable. Any agent can resume from lab_notebook.md alone.
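Because autoresearch.jsonl is described as machine-readable, it can be inspected outside the agent. The sketch below is a minimal reader, not part of Research Loop itself. It assumes only what the .jsonl extension conventionally implies (one JSON object per line) and makes no assumption about field names; `read_experiment_history` is a hypothetical helper name.

```python
import json
from pathlib import Path


def read_experiment_history(session_dir):
    """Return the records in autoresearch.jsonl as a list of dicts.

    Assumes one JSON object per line (the usual JSONL convention).
    No particular field names are assumed.
    """
    history_path = Path(session_dir) / "autoresearch.jsonl"
    records = []
    with history_path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records
```

Call it with the session directory, for example `read_experiment_history(".research-loop/sessions/<slug>")` with `<slug>` replaced by a real session name.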
research-loop init # configure LLM backend
research-loop start <arxiv-url> # ingest paper, extract hypothesis
research-loop loop start # start experiment loop
research-loop list # list all sessions
research-loop resume <session-id> # resume a paused session
research-loop export # export .research bundle
research-loop mcp serve # start MCP bridge server
Research Loop has no commands to memorize. You talk to Claude Code naturally and the right skill activates automatically.
"I want to research policy compression and dopamine in transformers"
The agent confirms the framing, then spawns 4 parallel searches (papers, repos, debates, open problems) and shows you a synthesis — not a dump. Everything is saved to the lab notebook.
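The fan-out described above can be pictured as four independent searches started together and collected by kind. This is an illustrative sketch only: the real searches are run by agents, and `search` and `explore` here are hypothetical stand-ins that return placeholder strings.

```python
from concurrent.futures import ThreadPoolExecutor

SEARCH_KINDS = ["papers", "repos", "debates", "open problems"]


def search(kind, topic):
    """Hypothetical stand-in for one search agent; returns a placeholder."""
    return f"{kind} findings for {topic}"


def explore(topic):
    """Start one search per kind concurrently and collect results by kind."""
    with ThreadPoolExecutor(max_workers=len(SEARCH_KINDS)) as pool:
        futures = {kind: pool.submit(search, kind, topic) for kind in SEARCH_KINDS}
        return {kind: future.result() for kind, future in futures.items()}
```

Running the searches in a pool means the slowest search, not the sum of all four, bounds the wait before the synthesis can be written.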
"Explain rate-distortion theory" "What is the information bottleneck?" "I don't understand fast weight programmers"
The learn skill activates. You get 5 core mental models one at a time, 3 field debates with both sides steel-manned, 5 diagnostic questions that test real understanding vs. memorization, and finally a Socratic reverse test where you explain it back. Every gap you reveal is logged.
"What hasn't been tried in this space?" "Is this idea worth pursuing?"
The Carlini gate runs as a conversation — 4 questions (taste, uniqueness, impact, feasibility), one at a time, with scores saved to the lab notebook. Honest verdict at the end.
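The one-question-at-a-time shape of the gate can be sketched as a loop that blocks on each axis before moving to the next. The four axis names come from the text above; the numeric scale and the pass rule in `verdict` are assumptions made for illustration, since the scoring rule is not specified here.

```python
AXES = ["taste", "uniqueness", "impact", "feasibility"]


def run_gate(ask, axes=AXES):
    """Ask one axis at a time and collect a score for each.

    `ask` is any callable that takes an axis name and returns a number.
    In an interactive session it would prompt the researcher and wait.
    """
    scores = {}
    for axis in axes:
        scores[axis] = ask(axis)  # blocks until this axis is answered
    return scores


def verdict(scores, threshold):
    """Hypothetical rule: pass only if every axis meets the threshold."""
    return all(score >= threshold for score in scores.values())
```

For a quick dry run, pass a dictionary lookup as `ask`, for example `run_gate({"taste": 4, "uniqueness": 3, "impact": 5, "feasibility": 2}.get)`.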
"Let's explore this from different angles" "What are the different ways to approach this hypothesis?"
4 hypothesis lanes run simultaneously — incremental improvement, cross-field transfer, assumption challenge, systems/efficiency. Weak lanes are killed early. You pick the one worth pursuing.
"Let's start running experiments" "I want to test this hypothesis against karpathy/autoresearch"
The loop proposes a code mutation, applies it, runs the benchmark, and annotates the result causally. Repeat. The knowledge graph grows. You can interrupt at any time and resume exactly where you left off.
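As a rough mental model of the cycle, the sketch below drives the four stages in order and keeps a history so each new proposal can see what was already tried. It is a simplified illustration, not the project's implementation: `run_loop` and its four hook arguments are hypothetical, and the history list stands in for the much richer knowledge graph.

```python
def run_loop(propose, mutate, benchmark, annotate, max_iterations):
    """Drive PROPOSE -> MUTATE -> BENCHMARK -> ANNOTATE for a fixed number of rounds.

    All four callables are hypothetical hooks supplied by the caller.
    Returns the history so later proposals can see what was tried.
    """
    history = []
    for _ in range(max_iterations):
        proposal = propose(history)        # PROPOSE: choose the next change
        candidate = mutate(proposal)       # MUTATE: apply it
        result = benchmark(candidate)      # BENCHMARK: measure it
        note = annotate(proposal, result)  # ANNOTATE: record the explanation
        history.append({"proposal": proposal, "result": result, "note": note})
    return history
```

Because the full history is returned rather than only the best result, a caller could persist it after each round, which is one way to get the interrupt-and-resume behavior described above.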
"Where did we leave off?" "Resume my research on dopamine and fast weights"
The lab notebook has everything. The agent reads it and picks up the thread — no re-exploration, no repeated work.
"What does Carlini gate mean?" "Explain what a hypothesis lane is" "I don't understand the knowledge graph"
The learn skill activates mid-flow and teaches the concept, then hands control back to whatever you were doing.
- Researcher in control — the agent proposes, you approve, the agent executes
- One thing at a time — never dump; always present as choices
- Parallel by default — 4 agents simultaneously, not sequentially
- Persist everything — lab notebook accumulates every decision and finding
- Learn, don't just search — understanding deeply is part of the research process
Fellows (autonomous scheduled agents), the full 4-pane TUI, PDF ingestion pipeline, MCP bridge improvements, and the bundle registry are in active development. See ROADMAP.md for the full plan.
- Carlini gate — Nicholas Carlini, "How to win a best paper award". The four axes (taste, uniqueness, impact, feasibility) come directly from his framework.
- learn skill — inspired by this thread by Ihtesham Ali on the difference between reading a subject and understanding it.
- Skill architecture: <SUBAGENT-STOP>, <HARD-GATE>, and the description conventions are borrowed from Superpowers by Jesse Vincent.
Skills live directly in this repository. To contribute:
- Fork the repository
- Create a branch for your skill or Fellow
- Follow the guide in CONTRIBUTING.md
- Submit a PR
See CONTRIBUTING.md for the complete guide, including skill writing rules, Fellow manifest format, and commit conventions.
Pull the latest skills:
git pull origin main
MIT. See LICENSE for details.
- Issues: https://github.com/moralespanitz/research-loop/issues
- Discussions: https://github.com/moralespanitz/research-loop/discussions
- Security: see SECURITY.md