From caa93409363956ab63733f766d12089db73c7770 Mon Sep 17 00:00:00 2001 From: Martin Stransky Date: Mon, 9 Mar 2026 19:05:08 +0100 Subject: [PATCH 1/4] feat: Add workflow skills (/sync, /design, /done) replacing rigid QSP classification Three entry-point skills replace mandatory upfront QSP classification with auto-detection at completion time. /sync checks workspace readiness, /design crystallizes brainstorming into plans, and /done validates + ships + documents with scope auto-detected from branch, diff size, and plan state. The /ship command is absorbed into /done Phase 2 (3-tier checklist preserved). Co-Authored-By: Claude Opus 4.6 --- .claude/commands/ship.md | 90 ---------------- .claude/skills/design/SKILL.md | 87 +++++++++++++++ .claude/skills/done/SKILL.md | 126 ++++++++++++++++++++++ .claude/skills/sync/SKILL.md | 53 ++++++++++ CLAUDE.md | 4 +- docs/CHANGELOG.md | 10 ++ docs/DECISIONS.md | 11 ++ docs/DEVELOPMENT_PROCESS.md | 37 ++++--- docs/WORKFLOW_SKILLS_PLAN.md | 2 +- tests/test_commands.py | 19 ---- tests/test_skills.py | 187 +++++++++++++++++++++++++++++++++ 11 files changed, 500 insertions(+), 126 deletions(-) delete mode 100644 .claude/commands/ship.md create mode 100644 .claude/skills/design/SKILL.md create mode 100644 .claude/skills/done/SKILL.md create mode 100644 .claude/skills/sync/SKILL.md create mode 100644 tests/test_skills.py diff --git a/.claude/commands/ship.md b/.claude/commands/ship.md deleted file mode 100644 index 4cc4c9b..0000000 --- a/.claude/commands/ship.md +++ /dev/null @@ -1,90 +0,0 @@ ---- -allowed-tools: Read, Glob, Grep, Bash -description: Pre-deployment verification with 3 tiers -- Blockers, High Priority, and Recommended checks. ---- - -# Ship Checklist - -Run pre-deployment verification before merging or releasing. - -## Tier 1: Blockers (must ALL pass) - -These will prevent shipping if any fail: - -1. **Tests pass** - - Run `uv run pytest -x -q` - - All tests must pass with zero failures - -2. **Lint clean** - - Run `uv run ruff check .` - - Zero lint errors - -3. **Type check clean** - - Run `uv run pyright` - - Zero type errors - -4. **No secrets in codebase** - - Search for: `sk-`, `AKIA`, `ghp_`, `password=`, `secret=`, `-----BEGIN.*PRIVATE KEY` - - Zero matches in tracked files (exclude `.env.example`) - -5. **No debug code** - - Search for: `breakpoint()`, `pdb.set_trace()`, `print(` in non-test source files - - Flag any found (prints may be intentional -- list for review) - -## Tier 2: High Priority (should pass) - -These are important but may have valid exceptions: - -1. **Test coverage** - - Run `uv run pytest --cov --cov-report=term-missing -q` - - Flag any source files with 0% coverage - -2. **No TODOs in shipping code** - - Search for `TODO`, `FIXME`, `HACK`, `XXX` in source files - - List all found with file:line - -3. **Documentation current** - - Check `docs/CHANGELOG.md` has entries for current work - - Check `docs/IMPLEMENTATION_PLAN.md` status is up to date - -4. **No unused imports** - - Run `uv run ruff check --select F401 .` - - List any unused imports found - -## Tier 3: Recommended (nice to have) - -1. **Git history clean** - - Run `git log --oneline -10` and check for WIP/fixup commits - - Suggest squashing if messy - -2. **Branch up to date** - - Run `git fetch origin` and check if behind base branch - - Suggest rebase if diverged - -3. **Version consistent** - - Run `python scripts/check_versions.py` if available - - Check all packages have synchronized MAJOR.MINOR - -## Report Format - -``` -# Ship Checklist Results - -## Blockers -- [PASS] Tests: X passed, 0 failed -- [PASS] Lint: 0 errors -- [PASS] Types: 0 errors -- [PASS] Secrets: none found -- [PASS] Debug code: none found - -## High Priority -- [PASS] Coverage: X% average -- [WARN] TODOs: 3 found (list below) -- [PASS] Docs: up to date - -## Recommended -- [INFO] Git history: clean -- [WARN] Branch: 2 commits behind origin/master - -## Verdict: [READY TO SHIP / BLOCKERS FOUND / REVIEW NEEDED] -``` diff --git a/.claude/skills/design/SKILL.md b/.claude/skills/design/SKILL.md new file mode 100644 index 0000000..083db70 --- /dev/null +++ b/.claude/skills/design/SKILL.md @@ -0,0 +1,87 @@ +--- +name: design +description: Crystallize brainstorming into a structured implementation plan. Reads DECISIONS.md for conflicts, auto-classifies scope (Q/S/P), and outputs an actionable plan. +argument-hint: "[topic or summary of what to plan]" +allowed-tools: Read, Glob, Grep, Bash, Edit +--- + +# Design + +Crystallize brainstorming into a structured implementation plan. Use at the start or end of brainstorming to formalize an approach. + +## Steps + +### 1. Check for Conflicts + +- Read `docs/DECISIONS.md` -- scan for entries that conflict with or overlap the proposed work +- Read `docs/IMPLEMENTATION_PLAN.md` -- check for active phases or overlapping planned work +- If conflicts found: present the contradiction to the user before proceeding + +### 2. Auto-Classify Scope + +Based on conversation context, classify as: + +| Scope | Criteria | +|-------|----------| +| **Q** (Quick) | Trivial, obvious, single-location change (typo, config tweak, one-liner) | +| **S** (Standard) | Fits in one session, clear scope (new feature, multi-file refactor, investigation) | +| **P** (Project) | Needs phased execution across sessions (multi-phase feature, architecture change) | + +### 3. Output Structured Plan + +The plan format varies by scope: + +#### Q (Quick) +``` +## Plan (Quick) +**Fix**: +**File**: +**Recommendation**: Proceed directly -- this is a single-location change. +``` + +#### S (Standard) +``` +## Plan (Standard) +**Scope**: <1-2 sentence summary> +**Branch**: `/` + +### Files to Modify +- -- + +### Approach + + +### Test Strategy + + +### Risks +- +``` + +#### P (Project) +``` +## Plan (Project) +**Scope**: <1-2 sentence summary> + +### Phase 1: +**Acceptance Criteria**: +- [ ] +**Files**: +**Approach**: + +### Phase 2: +... +``` + +For P-scoped plans: write the phase breakdown to `docs/IMPLEMENTATION_PLAN.md`. + +### 4. Decision Candidates + +List any decisions that should be recorded in `docs/DECISIONS.md`: +- Architectural choices made during planning +- Alternatives considered and rejected +- Constraints or assumptions + +### 5. User Confirmation + +Present the plan and wait for user approval before implementation begins. diff --git a/.claude/skills/done/SKILL.md b/.claude/skills/done/SKILL.md new file mode 100644 index 0000000..3cc6039 --- /dev/null +++ b/.claude/skills/done/SKILL.md @@ -0,0 +1,126 @@ +--- +name: done +description: Universal completion command. Auto-detects scope (Q/S/P), validates code quality, ships/lands/delivers changes, and updates documentation. +allowed-tools: Read, Glob, Grep, Bash, Edit, Write +disable-model-invocation: true +--- + +# Done + +Universal completion command. Call this when work is finished to validate, ship, and document. + +## Phase 1: Detect Scope + +Determine scope from workspace signals: + +| Signal | Q (ship) | S (land) | P (deliver) | +|--------|----------|----------|-------------| +| Branch | main/master | feature branch | feature branch | +| Files changed | <=3 | >3 | any | +| IMPLEMENTATION_PLAN.md | no active phases | no active phases | has unchecked phases | +| Diff size | <100 lines | >=100 lines | any | + +**Decision logic** (first match wins): +1. If `docs/IMPLEMENTATION_PLAN.md` has active/unchecked phases -> **P** (deliver) +2. If on a feature branch -> **S** (land) +3. If on main/master AND small scope (<=3 files, <100 lines changed) -> **Q** (ship) +4. If on main/master AND large scope -> warn user, suggest creating a feature branch + +**Always report the detected scope and accept user override.** + +## Phase 2: Validate + +Absorbs the former `/ship` checklist. Three tiers of checks: + +### Blockers (must ALL pass -- stop if any fail) + +1. **No secrets in codebase** + - Search for: `sk-`, `AKIA`, `ghp_`, `password=`, `secret=`, `-----BEGIN.*PRIVATE KEY` + - Zero matches in tracked files (exclude `.env.example`) + +2. **No debug code** + - Search for: `breakpoint()`, `pdb.set_trace()` in non-test source files + - Flag `print(` statements for review (may be intentional) + +3. **Pre-commit hygiene** + - Search for leftover `TODO`, `FIXME`, `HACK` markers in changed files + - List all found with file:line for review + +### High Priority (run via agents in parallel) + +Launch two agents simultaneously: + +| Agent | File | What it checks | +|-------|------|---------------| +| Code Quality | `.claude/agents/code-quality-validator.md` | Lint (`ruff check`), format (`ruff format --check`), type check (`pyright`) | +| Test Coverage | `.claude/agents/test-coverage-validator.md` | All tests pass, coverage report | + +All agents use `subagent_type: "general-purpose"`. + +### Recommended + +1. **Git history** -- check for WIP/fixup commits that should be squashed +2. **Branch up to date** -- check if behind base branch + +### Skip Conditions + +- If ALL changed files are `.md` files: skip Python tooling (lint, format, types, tests) +- Report skipped checks and why + +### Blocker Found + +If any Blocker fails: **STOP**. Report all findings and do not proceed to Phase 3. + +## Phase 3: Ship / Land / Deliver + +Actions depend on detected scope: + +### Q (Ship) + +1. `git add` changed files +2. `git commit` with descriptive message +3. `git push` to main/master +4. `gh run watch` to verify CI passes + +If branch protection is enabled: push to a short-lived branch, create PR with `gh pr create --fill`, watch checks, merge, delete branch. + +### S (Land) + +1. `git add` changed files +2. `git commit` with descriptive message +3. `git push -u origin ` +4. Create PR using `.claude/agents/pr-writer.md` agent for description +5. `gh pr checks --watch` to verify CI +6. When automated reviewer comments arrive, use `.claude/agents/review-responder.md` to triage and fix +7. Run `.claude/agents/code-reviewer.md` for independent code review +8. Fix Critical issues before merge + +### P (Deliver) + +All of S (Land), plus: + +1. Verify acceptance criteria using `.claude/agents/acceptance-criteria-validator.md` +2. Update `docs/IMPLEMENTATION_PLAN.md` using `.claude/agents/implementation-tracker.md` +3. Write phase handoff note (2-5 sentences: what completed, deviations, risks, dependencies, intentional debt) +4. If this is the final phase: version bump and changelog consolidation + +## Phase 4: Document + +### Q (Ship) +- Update `docs/CHANGELOG.md` only if the change is user-facing + +### S (Land) and P (Deliver) +- Always update `docs/CHANGELOG.md` with user-facing changes +- Always update `docs/DECISIONS.md` with decisions made during the work +- Use `.claude/agents/docs-updater.md` to verify documentation is complete + +## Failure Protocol + +| Failure | Action | +|---------|--------| +| Validation (Phase 2) fails | Fix issues, re-run from Phase 2 | +| CI (Phase 3) fails | Fix, push, re-run from Phase 3 CI step | +| CI fails on pre-existing issue | Document separately, do not block current work | +| Code review flags architectural concern | Pause. Evaluate rework vs. follow-up issue | +| Acceptance criteria (P) reveals regression | File as separate issue. Fix only if direct regression | +| Multiple steps fail repeatedly | Stop. Reassess scope -- may need to split work | diff --git a/.claude/skills/sync/SKILL.md b/.claude/skills/sync/SKILL.md new file mode 100644 index 0000000..39ee650 --- /dev/null +++ b/.claude/skills/sync/SKILL.md @@ -0,0 +1,53 @@ +--- +name: sync +description: Pre-flight workspace sync. Run before starting any work to check branch state, remote tracking, dirty files, and recent commits. +allowed-tools: Read, Bash, Grep +disable-model-invocation: true +--- + +# Sync + +Pre-flight workspace sync. Run this before starting any work. + +## Steps + +1. **Fetch remote refs** + - Run `git fetch origin` + +2. **Check workspace state** + - Run `git status` to see dirty files, staged changes, untracked files + - Run `git branch -vv` to see current branch, tracking info, ahead/behind counts + +3. **Warn on problems** + - If behind remote: warn and suggest `git pull --rebase` or `git rebase origin/` + - If on main/master with dirty working tree: warn that uncommitted changes exist on the base branch + - If no upstream tracking branch: note that the branch is local-only + +4. **Show recent context** + - Run `git log --oneline -3` to show the last 3 commits + +5. **Output structured report** + +``` +# Workspace Sync + +Branch: (tracking: /) +Status: +Remote: + +## Warnings +- + +## Recent Commits +- +- +- +``` + +## What This Skill Does NOT Do + +- Does NOT classify the task as Q/S/P +- Does NOT read DECISIONS.md or IMPLEMENTATION_PLAN.md +- Does NOT modify any files + +This is purely a workspace readiness check. diff --git a/CLAUDE.md b/CLAUDE.md index af8b1cb..7244d67 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -1,8 +1,8 @@ # CLAUDE.md -## Development Process (QSP classification) +## Development Process -Before starting ANY task, classify it as **Q** (Quick), **S** (Standard), or **P** (Project) and follow the corresponding path in `docs/DEVELOPMENT_PROCESS.md`. This classification is mandatory -- do not skip it. +Use `/sync` before starting work, `/design` to formalize a plan, and `/done` when finished. Scope (Q/S/P) is auto-detected by `/done` -- do not classify manually. See `docs/DEVELOPMENT_PROCESS.md` for the full workflow. ## Security diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md index 82a985d..e83f898 100644 --- a/docs/CHANGELOG.md +++ b/docs/CHANGELOG.md @@ -7,6 +7,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] +### Added +- 3 workflow skills (`/sync`, `/design`, `/done`) replace rigid upfront QSP classification -- `/sync` checks workspace readiness before work, `/design` crystallizes brainstorming into structured plans with conflict detection, `/done` auto-detects scope (Q/S/P) and runs the full validate-ship-document pipeline including the former `/ship` checklist + +### Changed +- QSP scope classification is now auto-detected by `/done` based on branch, diff size, and IMPLEMENTATION_PLAN.md state -- users no longer classify manually before starting work +- PCC shorthand now triggers `/done` instead of manually executing S.5-S.7 + +### Removed +- `/ship` slash command -- its 3-tier validation checklist (Blockers, High Priority, Recommended) is preserved in `/done` Phase 2 + ### Added - Three graduated permission tiers (Assisted, Autonomous, Full Trust) for devcontainer environments -- container isolation (firewall, non-root, hooks) enables safely expanding Claude Code permissions, reducing unnecessary prompts from dozens per session to zero in Tier 2/3 while blocking tool installation, package publishing, and container escape vectors via curated deny lists and a policy-enforcement hook - 5 hook scripts in `.claude/hooks/` run automatically during Claude Code sessions -- 3 security hooks block destructive commands, secret leaks, and invisible Unicode attacks in real time; 2 productivity hooks auto-format Python files and auto-run associated tests after every edit diff --git a/docs/DECISIONS.md b/docs/DECISIONS.md index 51a298e..c2bca56 100644 --- a/docs/DECISIONS.md +++ b/docs/DECISIONS.md @@ -102,6 +102,17 @@ When a decision is superseded or obsolete, delete it (git history preserves the - DROP policies added after all ACCEPT rules -- reordering prevents a partial-failure scenario where DROP is installed before the ACCEPT rules complete, locking out the container - Replaced registry.npmjs.org with claude.ai in the firewall allowlist -- npmjs.org is no longer contacted now that the native installer is used; claude.ai is required for Claude Code authentication +## 2026-03-09: Workflow Skills (/sync, /design, /done) + +**Request**: Replace rigid QSP upfront classification with three entry-point skills that auto-detect scope at completion time. + +**Decisions**: +- Three skills (`/sync`, `/design`, `/done`) replace mandatory upfront QSP classification -- scope is now auto-detected by `/done` based on branch, diff size, and plan state +- `/plan` renamed to `/design` because `/plan` is a built-in Claude Code command (enters read-only plan mode) -- `/design` is distinct and forms a natural arc: sync -> design -> done +- `/ship` command absorbed into `/done` Phase 2 -- the 3-tier checklist (Blockers/High Priority/Recommended) is preserved in `/done`'s validate phase +- `/sync` and `/done` have `disable-model-invocation: true` (side effects: git fetch, git commit/push, PR creation); `/design` is intentionally model-invocable so Claude can suggest it during brainstorming +- QSP paths (Q/S/P) and their step descriptions preserved in DEVELOPMENT_PROCESS.md -- skills orchestrate the paths, they don't replace them + ## 2026-03-04: Devcontainer Permission Tiers **Request**: Expand Claude Code permissions for devcontainer usage, taking advantage of container isolation (firewall, non-root user, hooks) to reduce unnecessary permission prompts. diff --git a/docs/DEVELOPMENT_PROCESS.md b/docs/DEVELOPMENT_PROCESS.md index 687c862..41dcb87 100644 --- a/docs/DEVELOPMENT_PROCESS.md +++ b/docs/DEVELOPMENT_PROCESS.md @@ -19,22 +19,19 @@ This ensures continuity and prevents duplicated or missed work. ## Task Classification -Task complexity determines process depth. Classify each task, then follow the matching path. Within an activated path, execute all steps -- do not cherry-pick. +Scope (Q/S/P) is **auto-detected** by `/done` based on branch, diff size, and plan state. Do not classify manually upfront. -| Path | When to use | Examples | -|------|-------------|---------| -| **Q** (Quick) | Trivial, obvious, single-location change | Typo fix, config tweak, one-liner bug fix | -| **S** (Standard) | Fits in one session, clear scope | New feature, multi-file refactor, bug requiring investigation | -| **P** (Project) | Needs phased execution across sessions | Multi-phase feature, architectural change, large migration | +| Scope | When detected | Examples | +|-------|---------------|---------| +| **Q** (Quick) | On main/master, <=3 files, <100 lines | Typo fix, config tweak, one-liner bug fix | +| **S** (Standard) | On feature branch, no active plan phases | New feature, multi-file refactor, bug requiring investigation | +| **P** (Project) | IMPLEMENTATION_PLAN.md has unchecked phases | Multi-phase feature, architectural change, large migration | --- -## Pre-flight (all paths) +## Pre-flight -Before making any changes: - -1. **Sync** -- `git fetch origin && git status` to confirm you are on the correct branch and up to date with remote. If behind, pull or rebase before proceeding -2. **Classify** -- state the QSP classification to the user before proceeding +Run `/sync` before starting any work. It fetches remote refs, reports branch state, dirty files, ahead/behind counts, and recent commits. --- @@ -163,13 +160,25 @@ All hooks require `jq` for JSON parsing and degrade gracefully if jq is missing. ## Commands -3 slash commands in `.claude/commands/`: +2 slash commands in `.claude/commands/`: | Command | Purpose | |---------|---------| | `/catchup` | Context restoration after `/clear`. Reads IMPLEMENTATION_PLAN.md, CHANGELOG.md, git history; recommends next steps. | | `/security-audit` | 6-phase Python security scan (deps, secrets, code patterns, input validation, config, scoring). Outputs A-F grade. | -| `/ship` | Pre-deployment checklist with 3 tiers: Blockers (tests, lint, types, secrets), High Priority (coverage, TODOs, docs), Recommended (git history, branch sync). | + +--- + +## Skills + +4 skills in `.claude/skills/`: + +| Skill | Purpose | +|-------|---------| +| `/sync` | Pre-flight workspace sync. Fetches remote, reports branch state, dirty files, ahead/behind, recent commits. | +| `/design` | Crystallize brainstorming into a structured plan. Reads DECISIONS.md for conflicts, auto-classifies scope, outputs actionable plan. | +| `/done` | Universal completion. Auto-detects scope (Q/S/P), validates (3-tier checklist), ships/lands/delivers, updates docs. Absorbs former `/ship`. | +| `/edit-permissions` | Manage Claude Code permission rules in settings.json. Pattern syntax reference and safety guardrails. | --- @@ -202,4 +211,4 @@ Update changelog for every MINOR or MAJOR version bump. Patch updates are option ## PCC Shorthand -When the user says **"PCC"** or **"PCC now"**, execute S.5 through S.7 in order (Validate, Ship, Document). +When the user says **"PCC"** or **"PCC now"**, run `/done` (which executes Validate, Ship/Land/Deliver, and Document). diff --git a/docs/WORKFLOW_SKILLS_PLAN.md b/docs/WORKFLOW_SKILLS_PLAN.md index 6a17a59..8afac1a 100644 --- a/docs/WORKFLOW_SKILLS_PLAN.md +++ b/docs/WORKFLOW_SKILLS_PLAN.md @@ -2,7 +2,7 @@ Replace rigid QSP upfront classification with three entry-point skills that meet the user where they are. -**Status**: Planned (not yet implemented) +**Status**: Implemented (2026-03-09). Note: `/plan` renamed to `/design` because `/plan` is a built-in Claude Code command. --- diff --git a/tests/test_commands.py b/tests/test_commands.py index 3b77133..de00ff1 100644 --- a/tests/test_commands.py +++ b/tests/test_commands.py @@ -9,7 +9,6 @@ ALL_COMMANDS = [ "catchup.md", "security-audit.md", - "ship.md", ] @@ -96,21 +95,3 @@ def test_security_audit_checks_code_patterns(self) -> None: content = (COMMANDS_DIR / "security-audit.md").read_text(encoding="utf-8") for pattern in ["eval", "exec", "pickle", "subprocess"]: assert pattern in content, f"security-audit missing code pattern: {pattern}" - - def test_ship_has_three_tiers(self) -> None: - content = (COMMANDS_DIR / "ship.md").read_text(encoding="utf-8") - assert "Blocker" in content, "ship should have Blockers tier" - assert "High Priority" in content, "ship should have High Priority tier" - assert "Recommended" in content, "ship should have Recommended tier" - - def test_ship_checks_tests(self) -> None: - content = (COMMANDS_DIR / "ship.md").read_text(encoding="utf-8") - assert "pytest" in content, "ship should run tests" - - def test_ship_checks_lint(self) -> None: - content = (COMMANDS_DIR / "ship.md").read_text(encoding="utf-8") - assert "ruff" in content, "ship should check linting" - - def test_ship_checks_types(self) -> None: - content = (COMMANDS_DIR / "ship.md").read_text(encoding="utf-8") - assert "pyright" in content, "ship should check types" diff --git a/tests/test_skills.py b/tests/test_skills.py new file mode 100644 index 0000000..0fe0ab8 --- /dev/null +++ b/tests/test_skills.py @@ -0,0 +1,187 @@ +"""Tests for .claude/skills/ -- validates skill files exist and have correct structure.""" + +from pathlib import Path + +import pytest + +SKILLS_DIR = Path(__file__).parent.parent / ".claude" / "skills" + +ALL_SKILLS = [ + "edit-permissions", + "sync", + "design", + "done", +] + + +class TestSkillExistence: + """Verify all expected skill directories and files exist.""" + + def test_skills_directory_exists(self) -> None: + assert SKILLS_DIR.exists(), f"{SKILLS_DIR} does not exist" + assert SKILLS_DIR.is_dir(), f"{SKILLS_DIR} is not a directory" + + @pytest.mark.parametrize("skill_name", ALL_SKILLS) + def test_skill_directory_exists(self, skill_name: str) -> None: + skill_dir = SKILLS_DIR / skill_name + assert skill_dir.exists(), f"Skill directory missing: {skill_name}" + assert skill_dir.is_dir(), f"{skill_name} is not a directory" + + @pytest.mark.parametrize("skill_name", ALL_SKILLS) + def test_skill_file_exists(self, skill_name: str) -> None: + skill_path = SKILLS_DIR / skill_name / "SKILL.md" + assert skill_path.exists(), f"SKILL.md missing for: {skill_name}" + + +class TestSkillFrontmatter: + """Verify skill files have correct frontmatter.""" + + @pytest.mark.parametrize("skill_name", ALL_SKILLS) + def test_skill_has_frontmatter(self, skill_name: str) -> None: + content = (SKILLS_DIR / skill_name / "SKILL.md").read_text(encoding="utf-8") + assert content.startswith("---"), f"{skill_name} missing YAML frontmatter" + parts = content.split("---", 2) + assert len(parts) >= 3, f"{skill_name} has unclosed frontmatter" + + @pytest.mark.parametrize("skill_name", ALL_SKILLS) + def test_skill_has_name(self, skill_name: str) -> None: + content = (SKILLS_DIR / skill_name / "SKILL.md").read_text(encoding="utf-8") + assert "name:" in content, f"{skill_name} missing name in frontmatter" + + @pytest.mark.parametrize("skill_name", ALL_SKILLS) + def test_skill_has_description(self, skill_name: str) -> None: + content = (SKILLS_DIR / skill_name / "SKILL.md").read_text(encoding="utf-8") + assert "description:" in content, f"{skill_name} missing description in frontmatter" + + @pytest.mark.parametrize("skill_name", ALL_SKILLS) + def test_skill_has_allowed_tools(self, skill_name: str) -> None: + content = (SKILLS_DIR / skill_name / "SKILL.md").read_text(encoding="utf-8") + assert "allowed-tools:" in content, f"{skill_name} missing allowed-tools in frontmatter" + + +class TestSkillBody: + """Verify skill files have meaningful body content.""" + + @pytest.mark.parametrize("skill_name", ALL_SKILLS) + def test_skill_body_not_empty(self, skill_name: str) -> None: + content = (SKILLS_DIR / skill_name / "SKILL.md").read_text(encoding="utf-8") + parts = content.split("---", 2) + body = parts[2].strip() if len(parts) >= 3 else "" + assert len(body) > 100, f"{skill_name} body is too short ({len(body)} chars)" + + @pytest.mark.parametrize("skill_name", ALL_SKILLS) + def test_skill_has_markdown_heading(self, skill_name: str) -> None: + content = (SKILLS_DIR / skill_name / "SKILL.md").read_text(encoding="utf-8") + parts = content.split("---", 2) + body = parts[2] if len(parts) >= 3 else "" + assert "# " in body, f"{skill_name} missing markdown heading in body" + + +class TestSkillSideEffects: + """Verify side-effect declarations are correct.""" + + @pytest.mark.parametrize("skill_name", ["sync", "done"]) + def test_side_effect_skills_disable_model_invocation(self, skill_name: str) -> None: + content = (SKILLS_DIR / skill_name / "SKILL.md").read_text(encoding="utf-8") + parts = content.split("---", 2) + frontmatter = parts[1] if len(parts) >= 3 else "" + assert "disable-model-invocation: true" in frontmatter, ( + f"{skill_name} should have disable-model-invocation: true (has side effects)" + ) + + def test_design_allows_model_invocation(self) -> None: + content = (SKILLS_DIR / "design" / "SKILL.md").read_text(encoding="utf-8") + parts = content.split("---", 2) + frontmatter = parts[1] if len(parts) >= 3 else "" + assert "disable-model-invocation" not in frontmatter, ( + "design should NOT have disable-model-invocation (intentionally model-invocable)" + ) + + +class TestSkillContent: + """Verify specific content per skill.""" + + # /sync + def test_sync_runs_git_fetch(self) -> None: + content = (SKILLS_DIR / "sync" / "SKILL.md").read_text(encoding="utf-8") + assert "git fetch" in content, "sync should run git fetch" + + def test_sync_checks_git_status(self) -> None: + content = (SKILLS_DIR / "sync" / "SKILL.md").read_text(encoding="utf-8") + assert "git status" in content, "sync should check git status" + + def test_sync_shows_recent_commits(self) -> None: + content = (SKILLS_DIR / "sync" / "SKILL.md").read_text(encoding="utf-8") + assert "git log" in content, "sync should show recent commits" + + def test_sync_does_not_classify(self) -> None: + content = (SKILLS_DIR / "sync" / "SKILL.md").read_text(encoding="utf-8") + assert "does NOT classify" in content.lower() or "Does NOT classify" in content, ( + "sync should explicitly state it does not classify tasks" + ) + + # /design + def test_design_reads_decisions(self) -> None: + content = (SKILLS_DIR / "design" / "SKILL.md").read_text(encoding="utf-8") + assert "DECISIONS.md" in content, "design should read DECISIONS.md" + + def test_design_reads_implementation_plan(self) -> None: + content = (SKILLS_DIR / "design" / "SKILL.md").read_text(encoding="utf-8") + assert "IMPLEMENTATION_PLAN.md" in content, "design should read IMPLEMENTATION_PLAN.md" + + def test_design_classifies_scope(self) -> None: + content = (SKILLS_DIR / "design" / "SKILL.md").read_text(encoding="utf-8") + assert "Q" in content and "S" in content and "P" in content, "design should classify scope as Q/S/P" + + def test_design_has_argument_hint(self) -> None: + content = (SKILLS_DIR / "design" / "SKILL.md").read_text(encoding="utf-8") + assert "argument-hint:" in content, "design should have argument-hint in frontmatter" + + # /done + def test_done_has_four_phases(self) -> None: + content = (SKILLS_DIR / "done" / "SKILL.md").read_text(encoding="utf-8") + assert "Phase 1" in content, "done should have Phase 1 (Detect)" + assert "Phase 2" in content, "done should have Phase 2 (Validate)" + assert "Phase 3" in content, "done should have Phase 3 (Ship/Land/Deliver)" + assert "Phase 4" in content, "done should have Phase 4 (Document)" + + def test_done_references_agents(self) -> None: + content = (SKILLS_DIR / "done" / "SKILL.md").read_text(encoding="utf-8") + assert "code-quality-validator" in content, "done should reference code-quality-validator agent" + assert "test-coverage-validator" in content, "done should reference test-coverage-validator agent" + assert "pr-writer" in content, "done should reference pr-writer agent" + + def test_done_has_blocker_tier(self) -> None: + content = (SKILLS_DIR / "done" / "SKILL.md").read_text(encoding="utf-8") + assert "Blocker" in content, "done should have Blockers validation tier" + + def test_done_has_high_priority_tier(self) -> None: + content = (SKILLS_DIR / "done" / "SKILL.md").read_text(encoding="utf-8") + assert "High Priority" in content, "done should have High Priority validation tier" + + def test_done_has_recommended_tier(self) -> None: + content = (SKILLS_DIR / "done" / "SKILL.md").read_text(encoding="utf-8") + assert "Recommended" in content, "done should have Recommended validation tier" + + def test_done_checks_secrets(self) -> None: + content = (SKILLS_DIR / "done" / "SKILL.md").read_text(encoding="utf-8") + assert "secret" in content.lower(), "done should scan for secrets" + + def test_done_checks_debug_code(self) -> None: + content = (SKILLS_DIR / "done" / "SKILL.md").read_text(encoding="utf-8") + assert "breakpoint()" in content, "done should check for breakpoint()" + assert "pdb" in content, "done should check for pdb" + + def test_done_updates_changelog(self) -> None: + content = (SKILLS_DIR / "done" / "SKILL.md").read_text(encoding="utf-8") + assert "CHANGELOG.md" in content, "done should update CHANGELOG.md" + + def test_done_updates_decisions(self) -> None: + content = (SKILLS_DIR / "done" / "SKILL.md").read_text(encoding="utf-8") + assert "DECISIONS.md" in content, "done should update DECISIONS.md" + + def test_done_has_scope_detection(self) -> None: + content = (SKILLS_DIR / "done" / "SKILL.md").read_text(encoding="utf-8") + assert "ship" in content.lower(), "done should describe Q=ship" + assert "land" in content.lower(), "done should describe S=land" + assert "deliver" in content.lower(), "done should describe P=deliver" From 7d4ecb0afeaf3620a584c7b4f406998e66f1d05c Mon Sep 17 00:00:00 2001 From: Martin Stransky Date: Mon, 9 Mar 2026 19:43:16 +0100 Subject: [PATCH 2/4] fix: Address CodeRabbit review findings on workflow skills - Fix inconsistent terminology: use "unchecked phases" consistently in done/SKILL.md - Move print() check out of Blockers (breakpoint/pdb remain hard blockers) - Broaden skip condition: "no .py files changed" instead of "all files are .md" - Q+branch protection now re-detects as S instead of ad-hoc PR workflow - Clarify /design is planning-time estimate vs /done's workspace-signal detection - Reference implementation-tracker agent for IMPLEMENTATION_PLAN.md format - Fix logic bug in test_sync_does_not_classify (case-insensitive check) - Split long CHANGELOG bullet into per-skill entries - Update all stale /plan references to /design in WORKFLOW_SKILLS_PLAN.md - Add CLAUDE.md note: plans must include QSP and read DEVELOPMENT_PROCESS.md first Co-Authored-By: Claude Opus 4.6 --- .claude/skills/design/SKILL.md | 4 ++-- .claude/skills/done/SKILL.md | 10 +++++----- CLAUDE.md | 2 +- docs/CHANGELOG.md | 4 +++- docs/WORKFLOW_SKILLS_PLAN.md | 12 ++++++------ tests/test_skills.py | 4 +--- 6 files changed, 18 insertions(+), 18 deletions(-) diff --git a/.claude/skills/design/SKILL.md b/.claude/skills/design/SKILL.md index 083db70..30d2805 100644 --- a/.claude/skills/design/SKILL.md +++ b/.claude/skills/design/SKILL.md @@ -19,7 +19,7 @@ Crystallize brainstorming into a structured implementation plan. Use at the star ### 2. Auto-Classify Scope -Based on conversation context, classify as: +This is a planning-time estimate based on conversation context. `/done` will later auto-detect actual scope from workspace signals (branch, files changed, diff size, plan state) at completion time. | Scope | Criteria | |-------|----------| @@ -73,7 +73,7 @@ The plan format varies by scope: ... ``` -For P-scoped plans: write the phase breakdown to `docs/IMPLEMENTATION_PLAN.md`. +For P-scoped plans: write the phase breakdown to `docs/IMPLEMENTATION_PLAN.md` using the same structure shown above (phase name, acceptance criteria, files, approach). The `.claude/agents/implementation-tracker.md` agent validates this format. ### 4. Decision Candidates diff --git a/.claude/skills/done/SKILL.md b/.claude/skills/done/SKILL.md index 3cc6039..8ac0758 100644 --- a/.claude/skills/done/SKILL.md +++ b/.claude/skills/done/SKILL.md @@ -17,11 +17,11 @@ Determine scope from workspace signals: |--------|----------|----------|-------------| | Branch | main/master | feature branch | feature branch | | Files changed | <=3 | >3 | any | -| IMPLEMENTATION_PLAN.md | no active phases | no active phases | has unchecked phases | +| IMPLEMENTATION_PLAN.md | no unchecked phases | no unchecked phases | has unchecked phases | | Diff size | <100 lines | >=100 lines | any | **Decision logic** (first match wins): -1. If `docs/IMPLEMENTATION_PLAN.md` has active/unchecked phases -> **P** (deliver) +1. If `docs/IMPLEMENTATION_PLAN.md` has unchecked phases -> **P** (deliver) 2. If on a feature branch -> **S** (land) 3. If on main/master AND small scope (<=3 files, <100 lines changed) -> **Q** (ship) 4. If on main/master AND large scope -> warn user, suggest creating a feature branch @@ -40,7 +40,7 @@ Absorbs the former `/ship` checklist. Three tiers of checks: 2. **No debug code** - Search for: `breakpoint()`, `pdb.set_trace()` in non-test source files - - Flag `print(` statements for review (may be intentional) + - These are hard blockers -- any match stops the process 3. **Pre-commit hygiene** - Search for leftover `TODO`, `FIXME`, `HACK` markers in changed files @@ -64,7 +64,7 @@ All agents use `subagent_type: "general-purpose"`. ### Skip Conditions -- If ALL changed files are `.md` files: skip Python tooling (lint, format, types, tests) +- If no `.py` files are changed: skip Python tooling (lint, format, types, tests) - Report skipped checks and why ### Blocker Found @@ -82,7 +82,7 @@ Actions depend on detected scope: 3. `git push` to main/master 4. `gh run watch` to verify CI passes -If branch protection is enabled: push to a short-lived branch, create PR with `gh pr create --fill`, watch checks, merge, delete branch. +Note: If a direct push to main fails due to branch protection, re-detect scope as **S (Land)** and follow the S path instead. ### S (Land) diff --git a/CLAUDE.md b/CLAUDE.md index 7244d67..f712a4b 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -2,7 +2,7 @@ ## Development Process -Use `/sync` before starting work, `/design` to formalize a plan, and `/done` when finished. Scope (Q/S/P) is auto-detected by `/done` -- do not classify manually. See `docs/DEVELOPMENT_PROCESS.md` for the full workflow. +Use `/sync` before starting work, `/design` to formalize a plan, and `/done` when finished. Scope (Q/S/P) is auto-detected by `/done` -- do not classify manually. Before creating any plan, read `docs/DEVELOPMENT_PROCESS.md` first and include the Q/S/P classification in the plan where applicable. ## Security diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md index e83f898..b47b1f6 100644 --- a/docs/CHANGELOG.md +++ b/docs/CHANGELOG.md @@ -8,7 +8,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] ### Added -- 3 workflow skills (`/sync`, `/design`, `/done`) replace rigid upfront QSP classification -- `/sync` checks workspace readiness before work, `/design` crystallizes brainstorming into structured plans with conflict detection, `/done` auto-detects scope (Q/S/P) and runs the full validate-ship-document pipeline including the former `/ship` checklist +- Workflow skill `/sync` checks workspace readiness before starting work (git fetch, status, branch info, warnings) +- Workflow skill `/design` crystallizes brainstorming into structured plans with conflict detection against DECISIONS.md +- Workflow skill `/done` auto-detects scope (Q/S/P) and runs the full validate-ship-document pipeline, including the former `/ship` checklist ### Changed - QSP scope classification is now auto-detected by `/done` based on branch, diff size, and IMPLEMENTATION_PLAN.md state -- users no longer classify manually before starting work diff --git a/docs/WORKFLOW_SKILLS_PLAN.md b/docs/WORKFLOW_SKILLS_PLAN.md index 8afac1a..355a5f4 100644 --- a/docs/WORKFLOW_SKILLS_PLAN.md +++ b/docs/WORKFLOW_SKILLS_PLAN.md @@ -22,7 +22,7 @@ Three skills replace the rigid classify-then-follow model: | Skill | Purpose | When to use | |-------|---------|-------------| | `/sync` | Pre-flight workspace sync | Before starting any work | -| `/plan` | Crystallize brainstorm into structured plan | At start or end of brainstorming | +| `/design` | Crystallize brainstorm into structured plan | At start or end of brainstorming | | `/done` | Validate + ship/land/deliver + document | When work is finished | QSP classification moves from a gate to an auto-detected property of `/done`. @@ -35,8 +35,8 @@ Interactive workflows (user calls /done): Debugging: /sync -> debug with Claude -> /done Delegated workflows (Claude handles /done automatically): - Feature: /sync -> brainstorm -> /plan -> "implement this" -> [build + /done] - Project: /sync -> brainstorm -> /plan -> "implement phase N" -> [build + /done] + Feature: /sync -> brainstorm -> /design -> "implement this" -> [build + /done] + Project: /sync -> brainstorm -> /design -> "implement phase N" -> [build + /done] Read-only: no skills needed -- explore, analyze, review freely ``` @@ -78,14 +78,14 @@ Pre-flight workspace sync. Lightweight, fast. Does NOT classify the task. Does NOT read DECISIONS.md. Just "is my workspace ready?" -### /plan +### /design Crystallize brainstorming into a structured plan. Works at BOTH start and end: - **At start**: Guides brainstorming by reading codebase, checking DECISIONS.md - **At end**: Structures the discussion into an actionable plan with scope classification -Auto-trigger: Claude should suggest `/plan` when brainstorming seems ready to formalize. +Auto-trigger: Claude should suggest `/design` when brainstorming seems ready to formalize. Steps: 1. Read `docs/DECISIONS.md` -- check for conflicts @@ -144,7 +144,7 @@ Universal "I'm finished" command. Four phases: | Action | File | |--------|------| | CREATE | `.claude/skills/sync/SKILL.md` | -| CREATE | `.claude/skills/plan/SKILL.md` | +| CREATE | `.claude/skills/design/SKILL.md` | | CREATE | `.claude/skills/done/SKILL.md` | | DELETE | `.claude/commands/ship.md` | | MODIFY | `CLAUDE.md` | diff --git a/tests/test_skills.py b/tests/test_skills.py index 0fe0ab8..c7f8808 100644 --- a/tests/test_skills.py +++ b/tests/test_skills.py @@ -116,9 +116,7 @@ def test_sync_shows_recent_commits(self) -> None: def test_sync_does_not_classify(self) -> None: content = (SKILLS_DIR / "sync" / "SKILL.md").read_text(encoding="utf-8") - assert "does NOT classify" in content.lower() or "Does NOT classify" in content, ( - "sync should explicitly state it does not classify tasks" - ) + assert "does not classify" in content.lower(), "sync should explicitly state it does not classify tasks" # /design def test_design_reads_decisions(self) -> None: From 52b743bc4974472d206486a4d37d78516df52507 Mon Sep 17 00:00:00 2001 From: Martin Stransky Date: Mon, 9 Mar 2026 20:10:20 +0100 Subject: [PATCH 3/4] fix: Address CodeRabbit round 2 review findings - Clarify CLAUDE.md: /design estimates scope, /done auto-detects actual scope - Strengthen test_design_classifies_scope to check full labels not single letters - Remove stale /ship reference from older CHANGELOG Added entry Co-Authored-By: Claude Opus 4.6 --- CLAUDE.md | 2 +- docs/CHANGELOG.md | 2 +- tests/test_skills.py | 4 +++- 3 files changed, 5 insertions(+), 3 deletions(-) diff --git a/CLAUDE.md b/CLAUDE.md index f712a4b..db72ed1 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -2,7 +2,7 @@ ## Development Process -Use `/sync` before starting work, `/design` to formalize a plan, and `/done` when finished. Scope (Q/S/P) is auto-detected by `/done` -- do not classify manually. Before creating any plan, read `docs/DEVELOPMENT_PROCESS.md` first and include the Q/S/P classification in the plan where applicable. +Use `/sync` before starting work, `/design` to formalize a plan, and `/done` when finished. `/design` estimates scope (Q/S/P) during planning; `/done` auto-detects actual scope at completion based on workspace signals. Before creating any plan, read `docs/DEVELOPMENT_PROCESS.md` first. ## Security diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md index b47b1f6..4868c58 100644 --- a/docs/CHANGELOG.md +++ b/docs/CHANGELOG.md @@ -22,7 +22,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Added - Three graduated permission tiers (Assisted, Autonomous, Full Trust) for devcontainer environments -- container isolation (firewall, non-root, hooks) enables safely expanding Claude Code permissions, reducing unnecessary prompts from dozens per session to zero in Tier 2/3 while blocking tool installation, package publishing, and container escape vectors via curated deny lists and a policy-enforcement hook - 5 hook scripts in `.claude/hooks/` run automatically during Claude Code sessions -- 3 security hooks block destructive commands, secret leaks, and invisible Unicode attacks in real time; 2 productivity hooks auto-format Python files and auto-run associated tests after every edit -- 3 slash commands (`/catchup`, `/security-audit`, `/ship`) provide one-command context restoration after `/clear`, a 6-phase security posture scan with A-F grading, and a 3-tier pre-deployment checklist +- 2 slash commands (`/catchup`, `/security-audit`) provide one-command context restoration after `/clear` and a 6-phase security posture scan with A-F grading - 3 new specialized agents: `security-auditor` (OWASP-based vulnerability analysis, read-only), `refactoring-specialist` (SOLID/code smell detection, read-only), `output-evaluator` (LLM-as-Judge quality scoring for automated pipelines) - 4 review rules in `.claude/rules/` auto-loaded as project context -- cover architecture, code quality, performance, and test quality concerns that linters cannot catch - AI-powered PR review via GitHub Actions (`claude-code-review.yml`) using `anthropics/claude-code-action@v1` -- automatically reviews PRs with read-only tools on open/sync/ready_for_review diff --git a/tests/test_skills.py b/tests/test_skills.py index c7f8808..cebb31e 100644 --- a/tests/test_skills.py +++ b/tests/test_skills.py @@ -129,7 +129,9 @@ def test_design_reads_implementation_plan(self) -> None: def test_design_classifies_scope(self) -> None: content = (SKILLS_DIR / "design" / "SKILL.md").read_text(encoding="utf-8") - assert "Q" in content and "S" in content and "P" in content, "design should classify scope as Q/S/P" + assert "**Q** (Quick)" in content and "**S** (Standard)" in content and "**P** (Project)" in content, ( + "design should classify scope as Q/S/P with descriptive labels" + ) def test_design_has_argument_hint(self) -> None: content = (SKILLS_DIR / "design" / "SKILL.md").read_text(encoding="utf-8") From 7df7c7ad569ac644db23afb756fef8d1085111a6 Mon Sep 17 00:00:00 2001 From: Martin Stransky Date: Mon, 9 Mar 2026 20:17:54 +0100 Subject: [PATCH 4/4] fix: Consolidate duplicate Added headings in CHANGELOG Merge the two ### Added sections under [Unreleased] into one per Keep a Changelog format. Move workflow skills Changed/Removed entries into the existing Changed/Removed sections. Co-Authored-By: Claude Opus 4.6 --- docs/CHANGELOG.md | 12 +++--------- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md index 4868c58..14b3128 100644 --- a/docs/CHANGELOG.md +++ b/docs/CHANGELOG.md @@ -11,15 +11,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Workflow skill `/sync` checks workspace readiness before starting work (git fetch, status, branch info, warnings) - Workflow skill `/design` crystallizes brainstorming into structured plans with conflict detection against DECISIONS.md - Workflow skill `/done` auto-detects scope (Q/S/P) and runs the full validate-ship-document pipeline, including the former `/ship` checklist - -### Changed -- QSP scope classification is now auto-detected by `/done` based on branch, diff size, and IMPLEMENTATION_PLAN.md state -- users no longer classify manually before starting work -- PCC shorthand now triggers `/done` instead of manually executing S.5-S.7 - -### Removed -- `/ship` slash command -- its 3-tier validation checklist (Blockers, High Priority, Recommended) is preserved in `/done` Phase 2 - -### Added - Three graduated permission tiers (Assisted, Autonomous, Full Trust) for devcontainer environments -- container isolation (firewall, non-root, hooks) enables safely expanding Claude Code permissions, reducing unnecessary prompts from dozens per session to zero in Tier 2/3 while blocking tool installation, package publishing, and container escape vectors via curated deny lists and a policy-enforcement hook - 5 hook scripts in `.claude/hooks/` run automatically during Claude Code sessions -- 3 security hooks block destructive commands, secret leaks, and invisible Unicode attacks in real time; 2 productivity hooks auto-format Python files and auto-run associated tests after every edit - 2 slash commands (`/catchup`, `/security-audit`) provide one-command context restoration after `/clear` and a 6-phase security posture scan with A-F grading @@ -40,6 +31,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Changed +- QSP scope classification is now auto-detected by `/done` based on branch, diff size, and IMPLEMENTATION_PLAN.md state -- users no longer classify manually before starting work +- PCC shorthand now triggers `/done` instead of manually executing S.5-S.7 - Setup script now makes `.claude/hooks/*.sh` files executable after placeholder substitution -- hook scripts work immediately after project setup without manual `chmod` - Agent count increased from 9 to 12 with security-auditor, refactoring-specialist, and output-evaluator - `docs/DEVELOPMENT_PROCESS.md` expanded with hooks, commands, rules reference tables and 4 new agent entries @@ -52,6 +45,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Removed +- `/ship` slash command -- its 3-tier validation checklist (Blockers, High Priority, Recommended) is preserved in `/done` Phase 2 - Shell Command Style and Allowed Operations sections from CLAUDE.md -- absolute path preferences and read-only command lists are now handled by settings.json permission rules rather than prose instructions ### Fixed