feat: add /hyper-plan — recursive codebase improvement with convergence scoring by ShaheerKhawaja · Pull Request #144 · garrytan/gstack

ShaheerKhawaja · 2026-03-18T01:27:22Z

What this does

Adds /hyper-plan — a new skill that chains /plan-ceo-review → /plan-eng-review → execute fixes → /qa into an iterative loop with LLM-as-Judge convergence control.

The problem

Individual /plan-ceo-review and /plan-eng-review passes find issues but don't close the loop. After a review, you manually decide what to fix, fix it, and hope you didn't break something else. There are no convergence criteria, there is no regression detection, and findings aren't tracked across iterations.

How /hyper-plan solves it

It treats codebase quality like gradient descent: each iteration moves toward a target grade, focusing more tightly each round on the weakest dimensions.

Iteration 1 runs full CEO + Engineering review, compiles findings into P0-P3, executes P0/P1 fixes with parallel agents, runs /qa diff-aware to verify, then scores 10 quality dimensions (1-10 each):

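| Code Quality  | Security      | Performance    | UX/UI         | Tests         |
|---------------|---------------|----------------|---------------|---------------|
| Accessibility | Documentation | Error Handling | Observability | Deploy Safety |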

Iterations 2-7 focus on ONLY the 2 lowest-scoring dimensions from the previous round. This focus narrowing is how convergence happens — reviewing everything every round causes thrashing.
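As a rough illustration of the focus-narrowing step, the sketch below picks the two lowest-scoring dimensions from one round of scores. The PR does not say how the overall grade is aggregated or how ties are broken, so the unweighted mean, the tie-breaking order, the function names, and the sample scores are all assumptions made for this example.

```python
from statistics import mean

# The ten dimensions listed in the PR description.
DIMENSIONS = [
    "Code Quality", "Security", "Performance", "UX/UI", "Tests",
    "Accessibility", "Documentation", "Error Handling",
    "Observability", "Deploy Safety",
]


def overall_grade(scores: dict[str, float]) -> float:
    """Assumed aggregation: unweighted mean of the ten 1-10 scores."""
    return round(mean(scores[d] for d in DIMENSIONS), 1)


def pick_focus(scores: dict[str, float], k: int = 2) -> list[str]:
    """Return the k lowest-scoring dimensions.

    sorted() is stable, so ties fall back to DIMENSIONS order
    (an assumption; the PR does not specify tie-breaking).
    """
    return sorted(DIMENSIONS, key=lambda d: scores[d])[:k]


# Invented scores, for illustration only.
scores = {
    "Code Quality": 6, "Security": 4, "Performance": 6, "UX/UI": 7,
    "Tests": 3, "Accessibility": 5, "Documentation": 6,
    "Error Handling": 5, "Observability": 5, "Deploy Safety": 7,
}

print(overall_grade(scores))  # 5.4
print(pick_focus(scores))     # ['Tests', 'Security']
```

With these invented scores, the next iteration would review only Tests and Security.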

Convergence rules:

SUCCESS: overall grade ≥ target (default 8.0/10)
CONVERGED: improvement < 0.2 for 2 consecutive iterations (plateaued)
DEGRADED: any dimension decreased → HALT (something went wrong)
MAX_REACHED: 7 iterations completed
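The four rules can be read as a small decision function. The sketch below is one possible reading, not the skill's actual logic: the PR does not state which rule wins when several apply, so the check order (DEGRADED, then SUCCESS, then CONVERGED, then MAX_REACHED) is an assumption, as are the function name and argument shapes.

```python
def decide(
    grades: list[float],
    prev_scores: dict[str, float],
    curr_scores: dict[str, float],
    target: float = 8.0,
    plateau: float = 0.2,
    max_iterations: int = 7,
) -> str:
    """Return a verdict for the latest iteration.

    grades[0] is the baseline; grades[i] is the overall grade after
    iteration i. prev_scores/curr_scores map dimension name to score.
    """
    iteration = len(grades) - 1

    # DEGRADED: any dimension scored lower than in the previous round.
    if any(curr_scores[d] < prev_scores[d] for d in prev_scores):
        return "DEGRADED"

    # SUCCESS: overall grade reached the target.
    if grades[-1] >= target:
        return "SUCCESS"

    # CONVERGED: improvement below the plateau threshold twice in a row.
    # Rounding avoids float noise such as 0.3 - 0.1 == 0.19999999999999998.
    deltas = [round(b - a, 2) for a, b in zip(grades, grades[1:])]
    if len(deltas) >= 2 and all(d < plateau for d in deltas[-2:]):
        return "CONVERGED"

    # MAX_REACHED: iteration budget exhausted.
    if iteration >= max_iterations:
        return "MAX_REACHED"

    return "CONTINUE"


same = {"Tests": 5, "Security": 6}
print(decide([5.4, 6.2], same, same))                                 # CONTINUE
print(decide([7.0, 7.1, 7.2], same, same))                            # CONVERGED
print(decide([5.4, 6.0], {"Tests": 5, "Security": 6},
             {"Tests": 4, "Security": 7}))                            # DEGRADED
```

Checking DEGRADED first means a regression halts the loop even if the overall grade happens to cross the target in the same round; a different ordering would be an equally defensible reading of the rules.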

Key design decisions:

Judge reads actual source code with file:line evidence — doesn't trust fix-agent self-reports
Every fix batch passes a validation gate (lint + types + tests) before committing
Follows the Completeness Principle — when fixing, fix completely
All artifacts saved to .hyper-plan/ for auditability
Orchestrates existing skills — doesn't replace /plan-ceo-review, /plan-eng-review, or /qa

Example convergence

| Iteration | Grade | Delta | Focus           | Verdict  |
|-----------|-------|-------|-----------------|----------|
| Baseline  | 5.4   | —     | All             | —        |
| 1         | 6.2   | +0.8  | All             | CONTINUE |
| 2         | 6.8   | +0.6  | Tests, Security | CONTINUE |
| 3         | 7.2   | +0.4  | Perf, Deploy    | CONTINUE |
| 4         | 7.5   | +0.3  | UX, Docs        | CONTINUE |
| 5         | 7.8   | +0.3  | Errors, Observ. | CONTINUE |
| 6         | 8.1   | +0.3  | —               | SUCCESS  |
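The table's Delta column and final verdict can be recomputed from the Grade column alone. The short check below uses only the grades shown above, the default target of 8.0, and the 0.2 plateau threshold; the variable names are illustrative.

```python
# Grade column from the table: baseline first, then iterations 1-6.
grades = [5.4, 6.2, 6.8, 7.2, 7.5, 7.8, 8.1]

# Delta between consecutive rows, rounded to one decimal like the table.
deltas = [round(b - a, 1) for a, b in zip(grades, grades[1:])]

# First iteration whose grade meets the default target of 8.0.
first_success = next(i for i, g in enumerate(grades) if i > 0 and g >= 8.0)

# Would the plateau rule (delta < 0.2 twice in a row) have fired?
plateaued = any(a < 0.2 and b < 0.2 for a, b in zip(deltas, deltas[1:]))

print(deltas)         # [0.8, 0.6, 0.4, 0.3, 0.3, 0.3]
print(first_success)  # 6
print(plateaued)      # False
```

Every delta stays at or above 0.3, so the run never plateaus and ends with SUCCESS at iteration 6, inside the 7-iteration budget.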

Files changed

hyper-plan/SKILL.md — new skill (154 lines)

Test plan

Run /hyper-plan on a sample project, verify it chains the 3 review skills
Verify convergence stops when grade ≥ target
Verify HALT triggers when any dimension score decreases
Verify .hyper-plan/ output files are created after each iteration
Verify focused iterations only review the 2 flagged dimensions

…ce scoring

Adds a new skill that chains /plan-ceo-review → /plan-eng-review → execute fixes → /qa into an iterative loop with LLM-as-Judge convergence control.

The problem: individual review passes find issues but don't close the loop. Findings aren't tracked across iterations, there are no convergence criteria, and fixed code doesn't get re-reviewed.

/hyper-plan treats quality like gradient descent: each iteration moves toward a target grade, focusing more tightly each round on the weakest dimensions.

How it works:
- Iteration 1 runs full CEO + Eng review, fixes P0/P1 findings, runs QA, then scores 10 quality dimensions (Code Quality, Security, Performance, UX, Tests, A11y, Docs, Error Handling, Observability, Deploy Safety)
- Iterations 2-7 focus on only the 2 lowest-scoring dimensions from the previous round
- Stops on: SUCCESS (grade >= target), CONVERGED (delta < 0.2 twice), DEGRADED (any dimension decreased), or MAX_REACHED (7 iterations)
- Judge reads actual source code with file:line evidence, with no self-reporting from fixers
- Every fix batch passes a validation gate (lint + types + tests) before committing
- All artifacts saved to .hyper-plan/ for auditability

Follows the Completeness Principle: when fixing an issue, fix it completely.

garrytan · 2026-03-18T04:01:46Z

Single raw SKILL.md, no .tmpl template. Closing.

garrytan closed this Mar 18, 2026

ShaheerKhawaja mentioned this pull request Mar 18, 2026

feat: add /hyper-plan skill — recursive codebase improvement with convergence scoring #166

Closed

5 tasks

ShaheerKhawaja wants to merge 1 commit into garrytan:main from ShaheerKhawaja:feat/hyper-plan-mode