Composable Claude Code skills that enforce a correctness-oriented development workflow. Spec before you code. Test before you implement. Never let an agent grade its own work.
Built with Correctless.
AI coding assistants are fast but sloppy. They write code that works for the happy path, skip edge cases, and silently introduce bugs that don't surface until production. The same model that wrote the code will review it and say "looks good" — because it's confirming its own decisions.
Correctless fixes this by structuring the workflow so that every phase is executed by a different agent with a different lens:
- The spec agent asks "what does correct mean?" and researches current best practices before any code exists
- The review agent reads the spec cold and checks for security gaps, unstated assumptions, and untestable rules
- The test agent writes tests from the spec without knowing the implementation plan
- The test auditor checks whether those tests would actually catch bugs or just pass against mocks
- The implementation agent makes the tests pass without having written them
- The QA agent hunts for bugs with neither the test author's nor the implementer's blind spots
- The verification agent checks spec-to-code correspondence without insider knowledge
Same model — but the framing determines what the agent finds.
You need Claude Code and a Claude Max subscription ($100-200/mo).
/plugin marketplace add joshft/correctless
/plugin install correctless
/csetup
Alternative: Git clone
git clone https://github.com/joshft/correctless.git .claude/skills/workflow
.claude/skills/workflow/setup
/csetupStandard intensity by default. To increase: add "intensity": "high" or "critical" to the workflow section of .correctless/config/workflow-config.json.
git checkout -b feature/my-feature
/cspec
/plugin uninstall correctless
/plugin marketplace remove correctless
/plugin marketplace add joshft/correctless
/plugin install correctless
Then restart Claude Code. Git clone users: cd .claude/skills/workflow && git pull && ./setup
Correctless ships as a single plugin with 28 skills. You choose the intensity that matches your project's risk profile. Seven skills are gated behind intensity thresholds — they check your project's workflow.intensity setting and warn if invoked below their minimum.
| Intensity | Overhead | What You Get | Best For |
|---|---|---|---|
| standard | ~10-15 min | 19 core skills: spec, review, TDD, verify, docs, debug, refactor, release | SaaS, APIs, CLI tools, content sites |
| high | ~30-60 min | + adversarial spec review, convergence auditing, architecture tracking | Auth, payments, sensitive data |
| critical | ~1-2 hours | + Alloy formal modeling, live red team assessment | Security infrastructure, crypto, proxies |
Skills like /cpostmortem and /cdevadv are available at all intensity levels — they're about learning from the past, not adding rigor to the present.
Put another way: Standard intensity is like having someone next to you going through a checklist to make sure your project has some sanity. Critical intensity is like taking your Claude Max subscription tokens, setting them on fire, collecting the ash, and using it to create a tiny diamond.
graph LR
A["/cspec<br/>Write spec"] --> B["/creview<br/>Skeptical review"]
B --> C["/ctdd"]
C --> D["/cverify<br/>Rule coverage"]
D --> E["/cdocs<br/>Update docs"]
E --> F["Merge"]
subgraph "/ctdd — Enforced TDD"
direction LR
C1["RED<br/>Write tests"] --> C2["Test Audit<br/>Would tests catch bugs?"]
C2 --> C3["GREEN<br/>Implement"]
C3 --> C4["/simplify"]
C4 --> C5["QA<br/>Hostile review"]
C5 -.->|"Issues found"| C3
end
C --- C1
style A fill:#339af0,color:#fff
style B fill:#e599f7,color:#000
style C1 fill:#ff6b6b,color:#fff
style C3 fill:#51cf66,color:#fff
style C5 fill:#ffd43b,color:#000
style F fill:#51cf66,color:#fff
Each box is a separate agent. The test writer doesn't know the implementation plan. The QA agent didn't write the tests. A PreToolUse hook blocks source code edits until tests exist — this isn't a suggestion, it's enforced by bash. See the Standard Workflow Guide for state machine diagrams, hook architecture, and phase gating details.
graph LR
A["/cspec<br/>Typed invariants"] --> B["/cmodel<br/>Alloy modeling"]
B --> C["/creview-spec<br/>5-agent adversarial"]
C --> D["/ctdd<br/>+ mutation testing"]
D --> E["/cverify<br/>+ drift detection"]
E --> F["/cupdate-arch"]
F --> G["/cdocs"]
G --> H["/caudit<br/>Olympics"]
style B fill:#ff922b,color:#fff
style C fill:#e599f7,color:#000
style H fill:#ff922b,color:#fff
You don't have to pick intensity manually for every feature. /cspec evaluates signals in your spec — file paths touching auth/payments, STRIDE threat keywords, compliance references, antipattern history — and recommends the right intensity. You confirm or override.
graph LR
A["Feature request"] --> B["/cspec"]
B --> C{"Intensity<br/>detection"}
C -->|"CRUD endpoint"| D["standard"]
C -->|"Auth + payments"| E["high"]
C -->|"Crypto + HIPAA"| F["critical"]
D --> G["Review + TDD"]
E --> H["+ /caudit, /creview-spec"]
F --> I["+ /cmodel, /credteam"]
style D fill:#51cf66,color:#fff
style E fill:#ffd43b,color:#000
style F fill:#ff6b6b,color:#fff
Prompt-level instructions fade as context fills — enforcement that depends on the model is a suggestion. Correctless uses four independent layers:
graph TD
A["Agent wants to<br/>edit source file<br/>during QA phase"] --> B["Layer 1: Gate<br/>PreToolUse hook"]
B -->|"BLOCK"| C["Edit prevented<br/>Model can't bypass bash"]
B -->|"ALLOW<br/>(Bash slip-through)"| D["File modified"]
D --> E["Layer 2: Audit Trail<br/>PostToolUse hook"]
E --> F["Logged with phase context<br/>Alert shown to user"]
F --> G["/cwtf reads trail<br/>reports deviations"]
P["Layer 3: Path-scoped rules<br/>(higher-adherence advisory)"] -.->|"Loaded when a<br/>scoped file is opened"| A
H["Layer 4: Skill Instructions<br/>(prompt-level, advisory)"] -.->|"Subject to<br/>context fade"| A
style B fill:#ff6b6b,color:#fff
style C fill:#ff6b6b,color:#fff
style E fill:#ffd43b,color:#000
style P fill:#ffa94d,color:#000
style H fill:#dee2e6,color:#000
Fresh agents per phase add resilience: each phase spawns a new agent at 0% context with fresh instructions via context: fork. A QA agent at 0% follows hostile-lens instructions perfectly. A single agent at 70% may have forgotten it was supposed to be hostile.
Escaped bugs become antipatterns. Antipatterns become spec rules. Spec rules become tests. Six months in, the workflow knows your project's failure modes better than any individual developer.
graph LR
A["Bug escapes"] --> B["/cpostmortem"]
B --> C["antipatterns.md<br/>(class fix)"]
B --> D["CLAUDE.md<br/>(learning)"]
C --> E["/cspec reads<br/>antipatterns"]
D --> F["Every future<br/>session loads"]
E --> G["Feature N+1<br/>prevents<br/>same bug class"]
F --> G
style B fill:#ff922b,color:#fff
style G fill:#51cf66,color:#fff
| Skill | When to Use | What It Does |
|---|---|---|
/csetup |
First run, or re-run for health check | 19-point health check, convention mining, project scaffolding |
/cspec |
Starting a new feature | Testable rules with research agent, intensity detection |
/creview |
After /cspec | Skeptical review + OWASP security checklist |
/ctdd |
After review approves spec | RED, test audit, GREEN, /simplify, QA — all enforced |
/cverify |
After /ctdd completes | Spec-to-code verification, drift detection |
/cdocs |
After /cverify | Update README, AGENT_CONTEXT, ARCHITECTURE, feature docs |
/carchitect |
New project or missing architecture doc | Structured ARCHITECTURE.md — reverse-engineer or greenfield, entrypoints YAML |
/cauto |
After spec review approved | Orchestrate full pipeline: TDD, verify, docs, PR — with flexible phase resume |
/crelease |
Ready to tag a version | Version bump, changelog, sanity checks, annotated tag |
| Skill | When to Use | What It Does |
|---|---|---|
/cquick |
Small, well-understood changes | TDD without the ceremony — scope-guarded at 50 LOC / 3 files |
/crefactor |
Restructuring without behavior change | Characterization tests, behavioral equivalence, agent separation |
/cdebug |
Stuck on a bug | Root cause, hypothesis, git bisect, TDD fix, class fix |
/cpr-review |
Reviewing an incoming PR | Architecture, security, tests, antipatterns, dep bumps |
| Skill | When to Use | What It Does |
|---|---|---|
/ccontribute |
Contributing to another project | Learn conventions first, match patterns, pre-flight, generate PR |
/cmaintain |
Reviewing a contribution | Scope check, conventions, maintenance burden, pre-written comments |
| Skill | When to Use | What It Does |
|---|---|---|
/cstatus |
Anytime | Current phase, next steps, problem detection |
/chelp |
Need a quick reference | Workflow pipeline, all commands |
/csummary |
After a feature or mid-feature | What the workflow caught, by phase |
/cmetrics |
Monthly or for ROI analysis | Token cost, bugs caught, session analytics, trends |
/cwtf |
Suspect agents took shortcuts | Did agents actually follow instructions? |
/cexplain |
Onboarding or exploring a codebase | Guided mermaid diagrams, prose walkthroughs, HTML export |
| Skill | When to Use | What It Does |
|---|---|---|
/cpostmortem |
After a bug escapes | Trace which phase missed it, add antipattern + class fix |
/cdevadv |
Periodic deep analysis | Devil's advocate — challenge architecture and strategy |
| Skill | Min Intensity | What It Does |
|---|---|---|
/caudit |
high | Olympics convergence audit (QA / Hacker / Performance presets) |
/creview-spec |
high | 5-agent adversarial spec review |
/cupdate-arch |
high | Keep ARCHITECTURE.md current after features land |
/cmodel |
critical | Alloy formal modeling for state machines and protocols |
/credteam |
critical | Live adversarial red team with source code access |
Correctless hooks into Claude Code's infrastructure for real-time feedback and long-term learning. All features below are automatic after /csetup.
graph TB
subgraph "Claude Code Hooks"
A["PreToolUse"] --> H["sensitive-file-guard.sh<br/>Secret protection"]
A --> B["workflow-gate.sh<br/>Phase enforcement"]
C["PostToolUse"] --> D["audit-trail.sh<br/>Adherence feedback"]
C --> G["auto-format.sh<br/>Auto-formatting"]
E["Statusline"] --> F["statusline.sh<br/>Phase + cost + context"]
end
H -->|"block/allow"| M[".env, keys, credentials"]
B -->|"block/allow"| I["Every file edit"]
D -->|"alerts"| J["Real-time violations"]
G -->|"formats"| L["Edited files"]
F -->|"live display"| K["Always visible"]
style H fill:#e64980,color:#fff
style B fill:#ff6b6b,color:#fff
style D fill:#ffd43b,color:#000
style G fill:#74c0fc,color:#000
style F fill:#51cf66,color:#fff
| Hook | Runs | Purpose |
|---|---|---|
| sensitive-file-guard.sh | Before every file edit | Blocks writes to .env, credentials, keys, certificates — fail-closed, no overrides |
| workflow-gate.sh | Before every file edit | Blocks writes that violate the current phase (RED blocks source, QA blocks everything) |
| audit-trail.sh | After every tool call | Logs modifications with phase context, alerts on violations |
| token-tracking.sh | After Agent tool completion | Logs subagent token usage, cost, and duration to JSONL for /cmetrics analysis |
| auto-format.sh | After Edit/Write/MultiEdit | Runs project formatter (Prettier, Black, gofmt, etc.) with allowlist validation |
| statusline.sh | Continuously | Shows phase, QA round, cost, context %, lines delta |
| workflow-advance.sh | On command | State machine — validates transitions, enforces gates |
The statusline shows your workflow state at a glance:
project/ feature/auth Opus 34% RED QA:R0 $0.42 +87/-12
Phase (color-coded), QA round count, session cost, feature cost (background-cached), lines delta, context usage with warnings at 70%.
The audit trail hook monitors every modification and alerts immediately:
tdd-qa: Source file modified — middleware.ts (this phase should be read-only)GREEN: Test file edited — auth.test.ts (logged in test-edit-log)QA: Read middleware.ts (3 of 7 modified files reviewed)(high+ intensity)
/cmetrics reads Claude Code's session data for exact token costs, outcome rates, and a Correctless vs Freeform comparison table.
Postmortem findings, conventions, and audit learnings append to CLAUDE.md and load into every future session automatically. The spec agent just knows that "auth features in this project need middleware ordering checks" without being told.
- Git trailers in commit messages:
Spec:,Rules-covered:,Verified-by: - Git notes attaching verification summaries to commits
- Git bisect in
/cdebugfor automated regression finding
/csetup offers to configure two MCP servers that improve analysis across all skills:
- Serena — symbol-level code queries (call graphs, references, symbol lookup). 15 skills use it for precise analysis with 40-60% token savings on larger projects.
- Context7 — current library documentation on demand. The
/cspecresearch agent gets real docs for real versions.
Both are free, open source, run locally, and fall back silently when unavailable.
External-facing skills (/cpr-review, /ccontribute, /cmaintain) automatically redact paths, credentials, hostnames, and session IDs before posting.
/csetup runs 19 checks across 6 categories on first run:
| Category | Checks |
|---|---|
| Security | Hardcoded secrets, API keys in source, .env committed, missing .gitignore patterns |
| Code Quality | Linter configured, formatter configured, dependency audit |
| Testing | Test runner detected, test files exist, coverage configured |
| CI/CD | CI pipeline exists, runs tests, runs linter |
| Documentation | README exists, ARCHITECTURE.md, CONTRIBUTING.md |
| Git Hygiene | .gitignore present, no large binaries, branch protection |
For existing projects, setup also mines your codebase for conventions (commit message style, test patterns, import ordering) and bootstraps architecture documentation.
Check your workflow with /cstatus or the statusline. For advanced debugging:
.correctless/hooks/workflow-advance.sh diagnose "file" # Why a file is blocked
.correctless/hooks/workflow-advance.sh override "why" # Temporary gate bypass (10 tool calls)
.correctless/hooks/workflow-advance.sh spec-update "why" # Spec was wrong mid-TDD
.correctless/hooks/workflow-advance.sh reset # Remove all state for current branchIf the gate is blocking a typo fix:
.correctless/hooks/workflow-advance.sh override "quick bugfix: fixing typo in error message"Bypasses the gate for 10 tool calls. When no workflow is active, the gate allows all edits freely.
| Language | Test Runner | Mutation Tool | PBT Library |
|---|---|---|---|
| Go | go test |
go-mutesting | rapid |
| TypeScript | jest/vitest | Stryker | fast-check |
| Python | pytest | mutmut | hypothesis |
| Rust | cargo test | cargo-mutants | proptest |
Mutation testing and PBT helpers are available at high+ intensity. Standard intensity works with any language that has a test runner.
- Claude Code CLI
- A Claude Max subscription ($100/mo or $200/mo plan). Correctless spawns multiple agents per feature — $200/mo is recommended at high+ intensity.
- A project with a test runner
- jq (JSON processor) — required for all hooks. Install:
brew install jq(macOS),apt install jq(Ubuntu) - Bash 4+ — required for hooks. macOS ships Bash 3.2 by default; install modern bash:
brew install bash
Optional (high+/critical):
- Alloy Analyzer for formal modeling
- Mutation testing tool for your language
- Isolated environment (Docker/VPS) for red team assessments
Opt-in per feature. Correctless is passive when no workflow is active. Start a workflow with /cspec on a feature branch — skip it on branches where you don't need it. All normal Claude Code behavior is unchanged outside active workflows.
CI unchanged. Correctless runs entirely inside Claude Code sessions via local hooks. It does not modify your CI pipeline, add CI steps, or require any CI changes.
After merge. Delete the feature branch and start fresh with a new branch + /cspec. Workflow state files in .correctless/artifacts/ are branch-scoped and harmless — they provide history for /cmetrics and /csummary.
To fully remove Correctless from a project:
# Remove the plugin
/plugin uninstall correctless
# Clean up project files
rm -rf .correctless/
# Remove hook entries from .claude/settings.json (edit manually or delete the file)
# Remove the "## Correctless" section from CLAUDE.md if present| Term | Meaning |
|---|---|
| Agent separation | Each workflow phase runs in a fresh Claude session. The test writer doesn't know the implementation plan; the QA agent didn't write the tests. Prevents confirmation bias. |
| Instance fix | Fix the one bug here and now. |
| Class fix | Fix the entire category of this bug — add a structural test that prevents recurrence. |
| Convergence | Run multiple audit rounds until findings stabilize (no new critical/high issues). |
| Drift | Code that no longer matches documented architecture. Detected by /cverify, tracked in drift-debt.json. |
| Antipattern | A known bug class from your project's history. Stored in .correctless/antipatterns.md, checked by every future spec and review. |
| Spec | A document defining what "correct" means for a feature: testable rules, edge cases, security assumptions. A spec that can't be tested is incomplete. |
| Invariant | A rule that must always be true: "auth tokens expire after 24 hours." Specs are lists of invariants. |
| Intensity | The configured rigor level: standard, high, or critical. Higher intensity unlocks more skills but costs more tokens and time. |
| Mini-audit | After QA clears, four adversarial specialist agents (cross-component interaction, hostile input, resource bounds, upgrade compatibility) hunt for issues the QA lens misses. Runs at the end of /ctdd. |
| Mutation testing | Introduce small bugs into code and check if tests catch them. If a test passes with a mutation, that test is weak. Available at high+ intensity. |
| STRIDE | Threat modeling framework: Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege. |
| RED / GREEN | TDD phases. RED = write tests that fail. GREEN = write code to make tests pass. |
Correctless 3.0.0 — Early release. 28 skills, 3 intensity levels, ~5,000 automated tests, 8 hooks. Real-world usage ongoing — file issues as you find them.
MIT
