A multi-agent system for automatically fixing GitHub issues using Claude Code.

    +-------------------------------------------------------------------+
    |                       MULTI-AGENT PIPELINE                        |
    +-------------------------------------------------------------------+
    |                                                                   |
    |   +----------+     +----------+     +-------------------+         |
    |   |  TRIAGE  | --> | RESEARCH | --> |  FIX <--> REVIEW  |         |
    |   +----------+     +----------+     |  (up to 3 loops)  |         |
    |        |                |           +-------------------+         |
    |        v                v                     |                   |
    |     Classify         Explore              Implement               |
    |      issue          codebase              & iterate               |
    |                                                                   |
    +-------------------------------------------------------------------+

| Agent | Purpose | Key Outputs |
|---|---|---|
| Triage | Classify if issue is AI-fixable | Classification, confidence, complexity |
| Research | Deep codebase exploration | Root cause, files to modify, patterns |
| Fix | Implement minimal changes | File changes, confidence assessment |
| Review | Self-review the fix | APPROVE / REQUEST_CHANGES / BLOCK |
If the review agent returns REQUEST_CHANGES, the pipeline loops back to the fix agent for a revision:
- Fix agent receives review feedback (concerns, suggestions)
- Fix agent revises the implementation
- Review agent re-evaluates
- Repeat up to 3 times, then block if still not approved
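The revision loop above can be sketched roughly as follows. This is a minimal illustration, not the project's actual orchestrator: `Review`, `fix_review_loop`, `run_fix`, and `run_review` are hypothetical names, and the sketch assumes "up to 3 times" means three fix attempts in total.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Tuple

MAX_REVIEW_LOOPS = 3  # assumption: three fix attempts in total


@dataclass
class Review:
    """Hypothetical review result; the real state file may differ."""
    verdict: str  # "APPROVE", "REQUEST_CHANGES", or "BLOCK"
    concerns: List[str] = field(default_factory=list)


def fix_review_loop(
    run_fix: Callable[[Optional[Review]], Any],
    run_review: Callable[[Any], Review],
    max_loops: int = MAX_REVIEW_LOOPS,
) -> Tuple[str, int]:
    """Return (final verdict, number of fix attempts used)."""
    feedback: Optional[Review] = None
    for attempt in range(1, max_loops + 1):
        fix = run_fix(feedback)      # first call gets no feedback
        review = run_review(fix)
        if review.verdict in ("APPROVE", "BLOCK"):
            return review.verdict, attempt
        feedback = review            # REQUEST_CHANGES: revise and retry
    # Still not approved after the last attempt: treat as blocked.
    return "BLOCK", max_loops
```

Passing the agents in as callables keeps the loop itself easy to test with stubs, independent of any CLI calls.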
The triage agent classifies each issue into one of six categories:
- `FIXABLE_CODE` - Clear bug fixable with code changes
- `FIXABLE_CONFIG` - Configuration/environment changes
- `NEEDS_CLARIFICATION` - Issue too vague
- `NEEDS_HUMAN` - Requires human judgment
- `ALREADY_DONE` - Issue appears resolved
- `OUT_OF_SCOPE` - Not suitable for AI fixing

Only `FIXABLE_CODE` and `FIXABLE_CONFIG` issues proceed to the research stage.
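As an illustration of this gate, the categories could be modeled as an enum with a small allow-list. The names `Classification`, `PROCEED`, and `should_proceed` are hypothetical and are not taken from the project's `models.py`; treating an unrecognized label as "do not proceed" is also an assumption.

```python
from enum import Enum


class Classification(Enum):
    FIXABLE_CODE = "FIXABLE_CODE"
    FIXABLE_CONFIG = "FIXABLE_CONFIG"
    NEEDS_CLARIFICATION = "NEEDS_CLARIFICATION"
    NEEDS_HUMAN = "NEEDS_HUMAN"
    ALREADY_DONE = "ALREADY_DONE"
    OUT_OF_SCOPE = "OUT_OF_SCOPE"


# Only these two categories continue past triage.
PROCEED = frozenset({Classification.FIXABLE_CODE, Classification.FIXABLE_CONFIG})


def should_proceed(label: str) -> bool:
    """True if the triage label allows the pipeline to continue."""
    try:
        return Classification(label) in PROCEED
    except ValueError:
        # Assumption: an unknown label is treated as not fixable.
        return False
```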
The research agent explores the codebase without making changes:
- Locates all relevant files
- Maps architecture and data flow
- Identifies root cause
- Documents patterns to follow
- Notes risks and testing recommendations
The fix agent implements changes based on the research findings:
- Executes focused changes
- Follows identified patterns
- Does NOT commit (pipeline handles commits after review)
- Reports confidence scores
The review agent reviews the fix before PR creation:
- Checks correctness, completeness, safety
- Validates style and scope
- Can block problematic fixes
- Provides concerns and suggestions for revision
- Set up the environment:

      cp .env.sample .env
      # Edit .env with your settings

- Install prerequisites:

      npm install -g @anthropic-ai/claude-code
      # Ensure gh and git are available

- Run on issues:

      ./run.py 5960               # Single issue
      ./run.py 5960 5961 5962     # Multiple issues

- Run from an issue comment invocation (optional):

      ./run.py 6241 --issue-comment-id 123456789

  This mode validates that the issue comment contains a supported @ invocation, posts an acknowledgement comment, and then runs the normal pipeline for the issue.

Environment variables in `.env`:

| Variable | Description | Default |
|---|---|---|
| `GITHUB_REPO` | Repository (owner/repo) | `blockapps/strato-platform` |
| `PROJECT_DIR` | Path to local repo clone | Required |
| `BASE_BRANCH` | Base branch for PRs | `develop` |
| `TRIAGE_TIMEOUT` | Triage agent timeout (seconds) | 120 |
| `RESEARCH_TIMEOUT` | Research agent timeout (seconds) | 300 |
| `FIX_TIMEOUT` | Fix agent timeout (seconds) | 300 |
| `REVIEW_TIMEOUT` | Review agent timeout (seconds) | 180 |
| `MODEL_PROVIDER` | Default LLM provider (`claude` or `codex`) | `claude` |
| `TRIAGE_MODEL_PROVIDER` | Optional provider override for triage | unset |
| `RESEARCH_MODEL_PROVIDER` | Optional provider override for research | unset |
| `FIX_MODEL_PROVIDER` | Optional provider override for fix | unset |
| `REVIEW_MODEL_PROVIDER` | Optional provider override for review | unset |
| `CLAUDE_CLI_COMMAND` | Claude CLI executable | `claude` |
| `CODEX_CLI_COMMAND` | Codex CLI executable | `codex` |

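One plausible way to resolve the provider settings above is a per-stage override that falls back to `MODEL_PROVIDER`, then to `claude`. The sketch below is an assumption about how such a lookup might work, not a copy of the project's `config.py`; `resolve_provider` and `cli_command` are hypothetical helpers, and treating an empty value as unset is a design choice made here.

```python
import os
from typing import Mapping, Optional

VALID_PROVIDERS = ("claude", "codex")


def resolve_provider(stage: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the provider for a stage: stage override, then default."""
    env = os.environ if env is None else env
    # e.g. stage "fix" -> FIX_MODEL_PROVIDER
    override = env.get(f"{stage.upper()}_MODEL_PROVIDER", "").strip()
    provider = override or env.get("MODEL_PROVIDER", "").strip() or "claude"
    if provider not in VALID_PROVIDERS:
        raise ValueError(f"unsupported provider for {stage}: {provider!r}")
    return provider


def cli_command(provider: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Executable for a provider, e.g. CLAUDE_CLI_COMMAND or 'claude'."""
    env = os.environ if env is None else env
    return env.get(f"{provider.upper()}_CLI_COMMAND", "").strip() or provider
```

Accepting `env` as a parameter makes it possible to exercise the fallback order without touching the real process environment.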
Each run creates detailed logs in runs/:
runs/2026-01-15_12-00-00-issue-5960/
├── issue.json # Original issue data
├── triage.prompt.md # Prompt sent to triage agent
├── triage.log # Full triage output
├── triage.state.json # Classification result
├── research.prompt.md # Prompt sent to research agent
├── research.log # Full research output
├── research.state.json # Research findings
├── fix.prompt.md # Prompt sent to fix agent
├── fix.log # Full fix output
├── fix.state.json # Fix result
├── review.prompt.md # Prompt sent to review agent
├── review.log # Full review output
├── review.state.json # Review verdict
├── fix-revision-2.prompt.md # (if revision needed)
├── fix-revision-2.log # (if revision needed)
├── fix-revision-2.state.json # (if revision needed)
└── pipeline.state.json # Aggregate pipeline results
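The naming pattern in this layout can be reproduced with a couple of small helpers. This is a sketch inferred from the example directory name and file names shown above; `run_dir_for` and `stage_files` are hypothetical and may not match how the pipeline actually builds its paths.

```python
from datetime import datetime
from pathlib import Path
from typing import Dict


def run_dir_for(issue_number: int, started_at: datetime,
                runs_root: Path = Path("runs")) -> Path:
    """Directory for one run, e.g. runs/2026-01-15_12-00-00-issue-5960."""
    stamp = started_at.strftime("%Y-%m-%d_%H-%M-%S")
    return runs_root / f"{stamp}-issue-{issue_number}"


def stage_files(run_dir: Path, stage: str) -> Dict[str, Path]:
    """Prompt, log, and state paths for a stage such as 'triage'
    or a revision such as 'fix-revision-2'."""
    return {
        "prompt": run_dir / f"{stage}.prompt.md",
        "log": run_dir / f"{stage}.log",
        "state": run_dir / f"{stage}.state.json",
    }
```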
Each agent reports confidence (0.0-1.0). The pipeline computes an aggregate:
Aggregate = Triage(0.15) + Research(0.20) + Fix(0.35) + Review(0.30)
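Reading the numbers in parentheses as weights, the aggregate is a weighted sum of the four per-agent confidences. The sketch below assumes that interpretation; `WEIGHTS` and `aggregate_confidence` are illustrative names, and the range check is an added safeguard rather than documented behavior.

```python
from typing import Mapping

# Weights as listed in the formula above; they sum to 1.0.
WEIGHTS = {"triage": 0.15, "research": 0.20, "fix": 0.35, "review": 0.30}


def aggregate_confidence(scores: Mapping[str, float]) -> float:
    """Weighted sum of per-agent confidence scores in [0.0, 1.0]."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    for stage in WEIGHTS:
        if not 0.0 <= scores[stage] <= 1.0:
            raise ValueError(f"{stage} confidence out of range: {scores[stage]}")
    return sum(WEIGHTS[stage] * scores[stage] for stage in WEIGHTS)
```

For example, confidences of 0.9 (triage), 0.8 (research), 0.7 (fix), and 0.6 (review) give 0.135 + 0.160 + 0.245 + 0.180, or roughly 0.72. Because the fix and review stages carry 65% of the weight, a weak fix pulls the aggregate down more than a weak triage does.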
strato-fix-all-the-things/
├── run.py # Main entry point
├── src/
│ ├── agents/
│ │ ├── base.py # Base agent class
│ │ ├── triage.py # Triage agent
│ │ ├── research.py # Research agent
│ │ ├── fix.py # Fix agent
│ │ └── review.py # Review agent
│ ├── claude_runner.py # Claude Code CLI wrapper
│ ├── config.py # Configuration loader
│ ├── github_client.py # GitHub API (via gh CLI)
│ ├── git_ops.py # Git operations
│ ├── models.py # Data models
│ └── pipeline.py # Pipeline orchestrator
├── prompts/
│ ├── triage.md # Triage prompt template
│ ├── research.md # Research prompt template
│ ├── fix.md # Fix prompt template
│ ├── fix-revision.md # Fix revision prompt template
│ └── review.md # Review prompt template
├── runs/ # Run logs (gitignored)
├── .env # Environment (gitignored)
├── .env.sample # Environment template
└── README.md
- Fetch issue from GitHub
- Triage classifies if it's AI-fixable
- Research explores codebase, identifies root cause
- Fix implements changes based on research
- Review self-checks the fix
  - If `APPROVE`: proceed to PR
  - If `REQUEST_CHANGES`: loop back to fix (up to 3 times)
  - If `BLOCK`: stop and comment on issue
- Commit changes with detailed message
- Create PR as draft with `ai-fixes-experimental` label
- Comment on issue linking to PR with details

If any stage fails or blocks, the pipeline stops and comments on the issue explaining why.
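The stop-and-comment behavior can be pictured as a loop that halts at the first stage reporting a problem. This is a simplified sketch under stated assumptions: `run_stages` is a hypothetical helper, each stage is modeled as a callable returning `None` on success or a reason string otherwise, and the comment wording is invented for illustration.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# A stage is (name, callable); the callable returns None on success
# or a human-readable reason when the stage fails or blocks.
Stage = Tuple[str, Callable[[], Optional[str]]]


def run_stages(
    stages: Iterable[Stage],
    comment_on_issue: Callable[[str], None],
) -> Tuple[List[str], Optional[str]]:
    """Run stages in order; return (completed stages, stage that stopped)."""
    completed: List[str] = []
    for name, run in stages:
        try:
            reason = run()
        except Exception as exc:  # treat a crash as a stage failure
            reason = f"unexpected error: {exc}"
        if reason is not None:
            # Explain on the issue why the pipeline stopped.
            comment_on_issue(f"Pipeline stopped at {name}: {reason}")
            return completed, name
        completed.append(name)
    return completed, None
```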
- Creates PRs as drafts (not ready for review)
- Review agent can block problematic fixes
- Fix-review loop allows self-correction (up to 3 attempts)
- `.env` files automatically excluded from commits
- Force-syncs to latest `origin/{base_branch}` before each issue
- Cleans up existing branches/PRs to avoid conflicts
- Comments on issues even when blocked/failed (for visibility)
- Python 3.11+
- Claude Code CLI (`npm install -g @anthropic-ai/claude-code`)
- GitHub CLI (`gh`), authenticated
- git

- Phase 2 model-provider and benchmark work is tracked in `docs/model-benchmark-backlog.md`.

Generate benchmark output from historical runs:

    python3 scripts/benchmark_report.py --runs-dir runs --json-out runs/benchmark.json --csv-out runs/benchmark.csv

Apache-2.0. See LICENSE.