diff --git a/design-review/references/validation-log.md b/design-review/references/validation-log.md new file mode 100644 index 0000000000..e75a2e8496 --- /dev/null +++ b/design-review/references/validation-log.md @@ -0,0 +1,45 @@ +# Validation log + +Moved from SKILL.md to keep the primary skill body patchable. Future dated validation logs belong here, not in SKILL.md. + +## 2026-06-03 archive from `gstack-design-review` + +### 1. Prior Learnings + +## Prior Learnings + +Search for relevant learnings from previous sessions: + +```bash +_CROSS_PROJ=$($GSTACK_BIN/gstack-config get cross_project_learnings 2>/dev/null || echo "unset") +echo "CROSS_PROJECT: $_CROSS_PROJ" +if [ "$_CROSS_PROJ" = "true" ]; then + $GSTACK_BIN/gstack-learnings-search --limit 10 --cross-project 2>/dev/null || true +else + $GSTACK_BIN/gstack-learnings-search --limit 10 2>/dev/null || true +fi +``` + +If `CROSS_PROJECT` is `unset` (first time): Use AskUserQuestion: + +> gstack can search learnings from your other projects on this machine to find +> patterns that might apply here. This stays local (no data leaves your machine). +> Recommended for solo developers. Skip if you work on multiple client codebases +> where cross-contamination would be a concern. + +Options: +- A) Enable cross-project learnings (recommended) +- B) Keep learnings project-scoped only + +If A: run `$GSTACK_BIN/gstack-config set cross_project_learnings true` +If B: run `$GSTACK_BIN/gstack-config set cross_project_learnings false` + +Then re-run the search with the appropriate flag. + +If learnings are found, incorporate them into your analysis. When a review finding +matches a past learning, display: + +**"Prior learning applied: [key] (confidence N/10, from [date])"** + +This makes the compounding visible. The user should see that gstack is getting +smarter on their codebase over time. diff --git a/office-hours/references/validation-log.md b/office-hours/references/validation-log.md new file mode 100644 index 0000000000..fe3558646e --- /dev/null +++ b/office-hours/references/validation-log.md @@ -0,0 +1,71 @@ +# Validation log + +Moved from SKILL.md to keep the primary skill body patchable. Future dated validation logs belong here, not in SKILL.md. + +## 2026-06-03 archive from `gstack-office-hours` + +### 1. Prior Learnings + +## Prior Learnings + +Search for relevant learnings from previous sessions: + +```bash +_CROSS_PROJ=$($GSTACK_BIN/gstack-config get cross_project_learnings 2>/dev/null || echo "unset") +echo "CROSS_PROJECT: $_CROSS_PROJ" +if [ "$_CROSS_PROJ" = "true" ]; then + $GSTACK_BIN/gstack-learnings-search --limit 10 --cross-project 2>/dev/null || true +else + $GSTACK_BIN/gstack-learnings-search --limit 10 2>/dev/null || true +fi +``` + +If `CROSS_PROJECT` is `unset` (first time): Use AskUserQuestion: + +> gstack can search learnings from your other projects on this machine to find +> patterns that might apply here. This stays local (no data leaves your machine). +> Recommended for solo developers. Skip if you work on multiple client codebases +> where cross-contamination would be a concern. + +Options: +- A) Enable cross-project learnings (recommended) +- B) Keep learnings project-scoped only + +If A: run `$GSTACK_BIN/gstack-config set cross_project_learnings true` +If B: run `$GSTACK_BIN/gstack-config set cross_project_learnings false` + +Then re-run the search with the appropriate flag. + +If learnings are found, incorporate them into your analysis. When a review finding +matches a past learning, display: + +**"Prior learning applied: [key] (confidence N/10, from [date])"** + +This makes the compounding visible. The user should see that gstack is getting +smarter on their codebase over time. + +5. **Ask: what's your goal with this?** This is a real question, not a formality. The answer determines everything about how the session runs. + + Via AskUserQuestion, ask: + + > Before we dig in — what's your goal with this? + > + > - **Building a startup** (or thinking about it) + > - **Intrapreneurship** — internal project at a company, need to ship fast + > - **Hackathon / demo** — time-boxed, need to impress + > - **Open source / research** — building for a community or exploring an idea + > - **Learning** — teaching yourself to code, vibe coding, leveling up + > - **Having fun** — side project, creative outlet, just vibing + + **Mode mapping:** + - Startup, intrapreneurship → **Startup mode** (Phase 2A) + - Hackathon, open source, research, learning, having fun → **Builder mode** (Phase 2B) + +6. **Assess product stage** (only for startup/intrapreneurship modes): + - Pre-product (idea stage, no users yet) + - Has users (people using it, not yet paying) + - Has paying customers + +Output: "Here's what I understand about this project and the area you want to change: ..." + +--- diff --git a/plan-ceo-review/references/validation-log.md b/plan-ceo-review/references/validation-log.md new file mode 100644 index 0000000000..a88ddc0dcb --- /dev/null +++ b/plan-ceo-review/references/validation-log.md @@ -0,0 +1,125 @@ +# Validation log + +Moved from SKILL.md to keep the primary skill body patchable. Future dated validation logs belong here, not in SKILL.md. + +## 2026-06-03 archive from `gstack-plan-ceo-review` + +### 1. Prior Learnings + +## Prior Learnings + +Search for relevant learnings from previous sessions: + +```bash +_CROSS_PROJ=$($GSTACK_BIN/gstack-config get cross_project_learnings 2>/dev/null || echo "unset") +echo "CROSS_PROJECT: $_CROSS_PROJ" +if [ "$_CROSS_PROJ" = "true" ]; then + $GSTACK_BIN/gstack-learnings-search --limit 10 --cross-project 2>/dev/null || true +else + $GSTACK_BIN/gstack-learnings-search --limit 10 2>/dev/null || true +fi +``` + +If `CROSS_PROJECT` is `unset` (first time): Use AskUserQuestion: + +> gstack can search learnings from your other projects on this machine to find +> patterns that might apply here. This stays local (no data leaves your machine). +> Recommended for solo developers. Skip if you work on multiple client codebases +> where cross-contamination would be a concern. + +Options: +- A) Enable cross-project learnings (recommended) +- B) Keep learnings project-scoped only + +If A: run `$GSTACK_BIN/gstack-config set cross_project_learnings true` +If B: run `$GSTACK_BIN/gstack-config set cross_project_learnings false` + +Then re-run the search with the appropriate flag. + +If learnings are found, incorporate them into your analysis. When a review finding +matches a past learning, display: + +**"Prior learning applied: [key] (confidence N/10, from [date])"** + +This makes the compounding visible. The user should see that gstack is getting +smarter on their codebase over time. + +### 2. Review Log + +## Review Log + +After producing the Completion Summary above, persist the review result. + +**PLAN MODE EXCEPTION — ALWAYS RUN:** This command writes review metadata to +`~/.gstack/` (user config directory, not project files). The skill preamble +already writes to `~/.gstack/sessions/` and `~/.gstack/analytics/` — this is +the same pattern. The review dashboard depends on this data. Skipping this +command breaks the review readiness dashboard in /ship. + +```bash +~/.hermes/skills/gstack/bin/gstack-review-log '{"skill":"plan-ceo-review","timestamp":"TIMESTAMP","status":"STATUS","unresolved":N,"critical_gaps":N,"mode":"MODE","scope_proposed":N,"scope_accepted":N,"scope_deferred":N,"commit":"COMMIT"}' +``` + +Before running this command, substitute the placeholder values from the Completion Summary you just produced: +- **TIMESTAMP**: current ISO 8601 datetime (e.g., 2026-03-16T14:30:00) +- **STATUS**: "clean" if 0 unresolved decisions AND 0 critical gaps; otherwise "issues_open" +- **unresolved**: number from "Unresolved decisions" in the summary +- **critical_gaps**: number from "Failure modes: ___ CRITICAL GAPS" in the summary +- **MODE**: the mode the user selected (SCOPE_EXPANSION / SELECTIVE_EXPANSION / HOLD_SCOPE / SCOPE_REDUCTION) +- **scope_proposed**: number from "Scope proposals: ___ proposed" in the summary (0 for HOLD/REDUCTION) +- **scope_accepted**: number from "Scope proposals: ___ accepted" in the summary (0 for HOLD/REDUCTION) +- **scope_deferred**: number of items deferred to TODOS.md from scope decisions (0 for HOLD/REDUCTION) +- **COMMIT**: output of `git rev-parse --short HEAD` + +### 3. Review Readiness Dashboard + +## Review Readiness Dashboard + +After completing the review, read the review log and config to display the dashboard. + +```bash +~/.hermes/skills/gstack/bin/gstack-review-read +``` + +Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Eng Review row, show whichever is more recent between `review` (diff-scoped pre-landing review) and `plan-eng-review` (plan-stage architecture review). Append "(DIFF)" or "(PLAN)" to the status to distinguish. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. For the Outside Voice row, show the most recent `codex-plan-review` entry — this captures outside voices from both /plan-ceo-review and /plan-eng-review. + +**Source attribution:** If the most recent entry for a skill has a \`"via"\` field, append it to the status label in parentheses. Examples: `plan-eng-review` with `via:"autoplan"` shows as "CLEAR (PLAN via /autoplan)". `review` with `via:"ship"` shows as "CLEAR (DIFF via /ship)". Entries without a `via` field show as "CLEAR (PLAN)" or "CLEAR (DIFF)" as before. + +Note: `autoplan-voices` and `design-outside-voices` entries are audit-trail-only (forensic data for cross-model consensus analysis). They do not appear in the dashboard and are not checked by any consumer. + +Display: + +``` ++====================================================================+ +| REVIEW READINESS DASHBOARD | ++====================================================================+ +| Review | Runs | Last Run | Status | Required | +|-----------------|------|---------------------|-----------|----------| +| Eng Review | 1 | 2026-03-16 15:00 | CLEAR | YES | +| CEO Review | 0 | — | — | no | +| Design Review | 0 | — | — | no | +| Adversarial | 0 | — | — | no | +| Outside Voice | 0 | — | — | no | ++--------------------------------------------------------------------+ +| VERDICT: CLEARED — Eng Review passed | ++====================================================================+ +``` + +**Review tiers:** +- **Eng Review (required by default):** The only review that gates shipping. Covers architecture, code quality, tests, performance. Can be disabled globally with \`gstack-config set skip_eng_review true\` (the "don't bother me" setting). +- **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup. +- **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes. +- **Adversarial Review (automatic):** Always-on for every review. Every diff gets both Claude adversarial subagent and Codex adversarial challenge. Large diffs (200+ lines) additionally get Codex structured review with P1 gate. No configuration needed. +- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Falls back to Claude subagent if Codex is unavailable. Never gates shipping. + +**Verdict logic:** +- **CLEARED**: Eng Review has >= 1 entry within 7 days from either \`review\` or \`plan-eng-review\` with status "clean" (or \`skip_eng_review\` is \`true\`) +- **NOT CLEARED**: Eng Review missing, stale (>7 days), or has open issues +- CEO, Design, and Codex reviews are shown for context but never block shipping +- If \`skip_eng_review\` config is \`true\`, Eng Review shows "SKIPPED (global)" and verdict is CLEARED + +**Staleness detection:** After displaying the dashboard, check if any existing reviews may be stale: +- Parse the \`---HEAD---\` section from the bash output to get the current HEAD commit hash +- For each review entry that has a \`commit\` field: compare it against the current HEAD. If different, count elapsed commits: \`git rev-list --count STORED_COMMIT..HEAD\`. Display: "Note: {skill} review from {date} may be stale — {N} commits since review" +- For entries without a \`commit\` field (legacy entries): display "Note: {skill} review from {date} has no commit tracking — consider re-running for accurate staleness detection" +- If all reviews match the current HEAD, do not display any staleness notes diff --git a/plan-design-review/references/validation-log.md b/plan-design-review/references/validation-log.md new file mode 100644 index 0000000000..6fe5af7967 --- /dev/null +++ b/plan-design-review/references/validation-log.md @@ -0,0 +1,267 @@ +# Validation log + +Moved from SKILL.md to keep the primary skill body patchable. Future dated validation logs belong here, not in SKILL.md. + +## 2026-06-03 archive from `gstack-plan-design-review` + +### 1. Prior Learnings + +## Prior Learnings + +Search for relevant learnings from previous sessions: + +```bash +_CROSS_PROJ=$($GSTACK_BIN/gstack-config get cross_project_learnings 2>/dev/null || echo "unset") +echo "CROSS_PROJECT: $_CROSS_PROJ" +if [ "$_CROSS_PROJ" = "true" ]; then + $GSTACK_BIN/gstack-learnings-search --limit 10 --cross-project 2>/dev/null || true +else + $GSTACK_BIN/gstack-learnings-search --limit 10 2>/dev/null || true +fi +``` + +If `CROSS_PROJECT` is `unset` (first time): Use AskUserQuestion: + +> gstack can search learnings from your other projects on this machine to find +> patterns that might apply here. This stays local (no data leaves your machine). +> Recommended for solo developers. Skip if you work on multiple client codebases +> where cross-contamination would be a concern. + +Options: +- A) Enable cross-project learnings (recommended) +- B) Keep learnings project-scoped only + +If A: run `$GSTACK_BIN/gstack-config set cross_project_learnings true` +If B: run `$GSTACK_BIN/gstack-config set cross_project_learnings false` + +Then re-run the search with the appropriate flag. + +If learnings are found, incorporate them into your analysis. When a review finding +matches a past learning, display: + +**"Prior learning applied: [key] (confidence N/10, from [date])"** + +This makes the compounding visible. The user should see that gstack is getting +smarter on their codebase over time. + +### Pass 1: Information Architecture +Rate 0-10: Does the plan define what the user sees first, second, third? +FIX TO 10: Add information hierarchy to the plan. Include ASCII diagram of screen/page structure and navigation flow. Apply "constraint worship" — if you can only show 3 things, which 3? +**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. If no issues, say so and move on. Do NOT proceed until user responds. + +### Pass 2: Interaction State Coverage +Rate 0-10: Does the plan specify loading, empty, error, success, partial states? +FIX TO 10: Add interaction state table to the plan: +``` + FEATURE | LOADING | EMPTY | ERROR | SUCCESS | PARTIAL + ---------------------|---------|-------|-------|---------|-------- + [each UI feature] | [spec] | [spec]| [spec]| [spec] | [spec] +``` +For each state: describe what the user SEES, not backend behavior. +Empty states are features — specify warmth, primary action, context. +**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. + +### Pass 3: User Journey & Emotional Arc +Rate 0-10: Does the plan consider the user's emotional experience? +FIX TO 10: Add user journey storyboard: +``` + STEP | USER DOES | USER FEELS | PLAN SPECIFIES? + -----|------------------|-----------------|---------------- + 1 | Lands on page | [what emotion?] | [what supports it?] + ... +``` +Apply time-horizon design: 5-sec visceral, 5-min behavioral, 5-year reflective. +**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. + +### Pass 4: AI Slop Risk +Rate 0-10: Does the plan describe specific, intentional UI — or generic patterns? +FIX TO 10: Rewrite vague UI descriptions with specific alternatives. + +### Design Hard Rules + +**Classifier — determine rule set before evaluating:** +- **MARKETING/LANDING PAGE** (hero-driven, brand-forward, conversion-focused) → apply Landing Page Rules +- **APP UI** (workspace-driven, data-dense, task-focused: dashboards, admin, settings) → apply App UI Rules +- **HYBRID** (marketing shell with app-like sections) → apply Landing Page Rules to hero/marketing sections, App UI Rules to functional sections + +**Hard rejection criteria** (instant-fail patterns — flag if ANY apply): +1. Generic SaaS card grid as first impression +2. Beautiful image with weak brand +3. Strong headline with no clear action +4. Busy imagery behind text +5. Sections repeating same mood statement +6. Carousel with no narrative purpose +7. App UI made of stacked cards instead of layout + +**Litmus checks** (answer YES/NO for each — used for cross-model consensus scoring): +1. Brand/product unmistakable in first screen? +2. One strong visual anchor present? +3. Page understandable by scanning headlines only? +4. Each section has one job? +5. Are cards actually necessary? +6. Does motion improve hierarchy or atmosphere? +7. Would design feel premium with all decorative shadows removed? + +**Landing page rules** (apply when classifier = MARKETING/LANDING): +- First viewport reads as one composition, not a dashboard +- Brand-first hierarchy: brand > headline > body > CTA +- Typography: expressive, purposeful — no default stacks (Inter, Roboto, Arial, system) +- No flat single-color backgrounds — use gradients, images, subtle patterns +- Hero: full-bleed, edge-to-edge, no inset/tiled/rounded variants +- Hero budget: brand, one headline, one supporting sentence, one CTA group, one image +- No cards in hero. Cards only when card IS the interaction +- One job per section: one purpose, one headline, one short supporting sentence +- Motion: 2-3 intentional motions minimum (entrance, scroll-linked, hover/reveal) +- Color: define CSS variables, avoid purple-on-white defaults, one accent color default +- Copy: product language not design commentary. "If deleting 30% improves it, keep deleting" +- Beautiful defaults: composition-first, brand as loudest text, two typefaces max, cardless by default, first viewport as poster not document + +**App UI rules** (apply when classifier = APP UI): +- Calm surface hierarchy, strong typography, few colors +- Dense but readable, minimal chrome +- Organize: primary workspace, navigation, secondary context, one accent +- Avoid: dashboard-card mosaics, thick borders, decorative gradients, ornamental icons +- Copy: utility language — orientation, status, action. Not mood/brand/aspiration +- Cards only when card IS the interaction +- Section headings state what area is or what user can do ("Selected KPIs", "Plan status") + +**Universal rules** (apply to ALL types): +- Define CSS variables for color system +- No default font stacks (Inter, Roboto, Arial, system) +- One job per section +- "If deleting 30% of the copy improves it, keep deleting" +- Cards earn their existence — no decorative card grids +- NEVER use small, low-contrast type (body text < 16px or contrast ratio < 4.5:1 on body text) +- NEVER put labels inside form fields as the only label (placeholder-as-label pattern — labels must be visible when the field has content) +- ALWAYS preserve visited vs unvisited link distinction (visited links must have a different color) +- NEVER float headings between paragraphs (heading must be visually closer to the section it introduces than to the preceding section) + +**AI Slop blacklist** (the 10 patterns that scream "AI-generated"): +1. Purple/violet/indigo gradient backgrounds or blue-to-purple color schemes +2. **The 3-column feature grid:** icon-in-colored-circle + bold title + 2-line description, repeated 3x symmetrically. THE most recognizable AI layout. +3. Icons in colored circles as section decoration (SaaS starter template look) +4. Centered everything (`text-align: center` on all headings, descriptions, cards) +5. Uniform bubbly border-radius on every element (same large radius on everything) +6. Decorative blobs, floating circles, wavy SVG dividers (if a section feels empty, it needs better content, not decoration) +7. Emoji as design elements (rockets in headings, emoji as bullet points) +8. Colored left-border on cards (`border-left: 3px solid `) +9. Generic hero copy ("Welcome to [X]", "Unlock the power of...", "Your all-in-one solution for...") +10. Cookie-cutter section rhythm (hero → 3 features → testimonials → pricing → CTA, every section same height) +11. system-ui or `-apple-system` as the PRIMARY display/body font — the "I gave up on typography" signal. Pick a real typeface. + +Source: [OpenAI "Designing Delightful Frontends with GPT-5.4"](https://developers.openai.com/blog/designing-delightful-frontends-with-gpt-5-4) (Mar 2026) + gstack design methodology. +- "Cards with icons" → what differentiates these from every SaaS template? +- "Hero section" → what makes this hero feel like THIS product? +- "Clean, modern UI" → meaningless. Replace with actual design decisions. +- "Dashboard with widgets" → what makes this NOT every other dashboard? +If visual mockups were generated in Step 0.5, evaluate them against the AI slop blacklist above. Read each mockup image using the read_file tool. Does the mockup fall into generic patterns (3-column grid, centered hero, stock-photo feel)? If so, flag it and offer to regenerate with more specific direction via `$D iterate --feedback "..."`. +**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. + +### Pass 5: Design System Alignment +Rate 0-10: Does the plan align with DESIGN.md? +FIX TO 10: If DESIGN.md exists, annotate with specific tokens/components. If no DESIGN.md, flag the gap and recommend `/design-consultation`. +Flag any new component — does it fit the existing vocabulary? +**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. + +### Pass 6: Responsive & Accessibility +Rate 0-10: Does the plan specify mobile/tablet, keyboard nav, screen readers? +FIX TO 10: Add responsive specs per viewport — not "stacked on mobile" but intentional layout changes. Add a11y: keyboard nav patterns, ARIA landmarks, touch target sizes (44px min), color contrast requirements. +**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. + +### Pass 7: Unresolved Design Decisions +Surface ambiguities that will haunt implementation: +``` + DECISION NEEDED | IF DEFERRED, WHAT HAPPENS + -----------------------------|--------------------------- + What does empty state look like? | Engineer ships "No items found." + Mobile nav pattern? | Desktop nav hides behind hamburger + ... +``` +If visual mockups were generated in Step 0.5, reference them as evidence when surfacing unresolved decisions. A mockup makes decisions concrete — e.g., "Your approved mockup shows a sidebar nav, but the plan doesn't specify mobile behavior. What happens to this sidebar on 375px?" +Each decision = one AskUserQuestion with recommendation + WHY + alternatives. Edit the plan with each decision as it's made. + +### Post-Pass: Update Mockups (if generated) + +If mockups were generated in Step 0.5 and review passes changed significant design decisions (information architecture restructure, new states, layout changes), offer to regenerate (one-shot, not a loop): + +AskUserQuestion: "The review passes changed [list major design changes]. Want me to regenerate mockups to reflect the updated plan? This ensures the visual reference matches what we're actually building." + +If yes, use `$D iterate` with feedback summarizing the changes, or `$D variants` with an updated brief. Save to the same `$_DESIGN_DIR` directory. + +### 2. Review Log + +## Review Log + +After producing the Completion Summary above, persist the review result. + +**PLAN MODE EXCEPTION — ALWAYS RUN:** This command writes review metadata to +`~/.gstack/` (user config directory, not project files). The skill preamble +already writes to `~/.gstack/sessions/` and `~/.gstack/analytics/` — this is +the same pattern. The review dashboard depends on this data. Skipping this +command breaks the review readiness dashboard in /ship. + +```bash +~/.hermes/skills/gstack/bin/gstack-review-log '{"skill":"plan-design-review","timestamp":"TIMESTAMP","status":"STATUS","initial_score":N,"overall_score":N,"unresolved":N,"decisions_made":N,"commit":"COMMIT"}' +``` + +Substitute values from the Completion Summary: +- **TIMESTAMP**: current ISO 8601 datetime +- **STATUS**: "clean" if overall score 8+ AND 0 unresolved; otherwise "issues_open" +- **initial_score**: initial overall design score before fixes (0-10) +- **overall_score**: final overall design score after fixes (0-10) +- **unresolved**: number of unresolved design decisions +- **decisions_made**: number of design decisions added to the plan +- **COMMIT**: output of `git rev-parse --short HEAD` + +### 3. Review Readiness Dashboard + +## Review Readiness Dashboard + +After completing the review, read the review log and config to display the dashboard. + +```bash +~/.hermes/skills/gstack/bin/gstack-review-read +``` + +Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Eng Review row, show whichever is more recent between `review` (diff-scoped pre-landing review) and `plan-eng-review` (plan-stage architecture review). Append "(DIFF)" or "(PLAN)" to the status to distinguish. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. For the Outside Voice row, show the most recent `codex-plan-review` entry — this captures outside voices from both /plan-ceo-review and /plan-eng-review. + +**Source attribution:** If the most recent entry for a skill has a \`"via"\` field, append it to the status label in parentheses. Examples: `plan-eng-review` with `via:"autoplan"` shows as "CLEAR (PLAN via /autoplan)". `review` with `via:"ship"` shows as "CLEAR (DIFF via /ship)". Entries without a `via` field show as "CLEAR (PLAN)" or "CLEAR (DIFF)" as before. + +Note: `autoplan-voices` and `design-outside-voices` entries are audit-trail-only (forensic data for cross-model consensus analysis). They do not appear in the dashboard and are not checked by any consumer. + +Display: + +``` ++====================================================================+ +| REVIEW READINESS DASHBOARD | ++====================================================================+ +| Review | Runs | Last Run | Status | Required | +|-----------------|------|---------------------|-----------|----------| +| Eng Review | 1 | 2026-03-16 15:00 | CLEAR | YES | +| CEO Review | 0 | — | — | no | +| Design Review | 0 | — | — | no | +| Adversarial | 0 | — | — | no | +| Outside Voice | 0 | — | — | no | ++--------------------------------------------------------------------+ +| VERDICT: CLEARED — Eng Review passed | ++====================================================================+ +``` + +**Review tiers:** +- **Eng Review (required by default):** The only review that gates shipping. Covers architecture, code quality, tests, performance. Can be disabled globally with \`gstack-config set skip_eng_review true\` (the "don't bother me" setting). +- **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup. +- **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes. +- **Adversarial Review (automatic):** Always-on for every review. Every diff gets both Claude adversarial subagent and Codex adversarial challenge. Large diffs (200+ lines) additionally get Codex structured review with P1 gate. No configuration needed. +- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Falls back to Claude subagent if Codex is unavailable. Never gates shipping. + +**Verdict logic:** +- **CLEARED**: Eng Review has >= 1 entry within 7 days from either \`review\` or \`plan-eng-review\` with status "clean" (or \`skip_eng_review\` is \`true\`) +- **NOT CLEARED**: Eng Review missing, stale (>7 days), or has open issues +- CEO, Design, and Codex reviews are shown for context but never block shipping +- If \`skip_eng_review\` config is \`true\`, Eng Review shows "SKIPPED (global)" and verdict is CLEARED + +**Staleness detection:** After displaying the dashboard, check if any existing reviews may be stale: +- Parse the \`---HEAD---\` section from the bash output to get the current HEAD commit hash +- For each review entry that has a \`commit\` field: compare it against the current HEAD. If different, count elapsed commits: \`git rev-list --count STORED_COMMIT..HEAD\`. Display: "Note: {skill} review from {date} may be stale — {N} commits since review" +- For entries without a \`commit\` field (legacy entries): display "Note: {skill} review from {date} has no commit tracking — consider re-running for accurate staleness detection" +- If all reviews match the current HEAD, do not display any staleness notes diff --git a/plan-devex-review/references/validation-log.md b/plan-devex-review/references/validation-log.md new file mode 100644 index 0000000000..48275c8981 --- /dev/null +++ b/plan-devex-review/references/validation-log.md @@ -0,0 +1,316 @@ +# Validation log + +Moved from SKILL.md to keep the primary skill body patchable. Future dated validation logs belong here, not in SKILL.md. + +## 2026-06-03 archive from `gstack-plan-devex-review` + +### 1. Prior Learnings + +## Prior Learnings + +Search for relevant learnings from previous sessions: + +```bash +_CROSS_PROJ=$($GSTACK_BIN/gstack-config get cross_project_learnings 2>/dev/null || echo "unset") +echo "CROSS_PROJECT: $_CROSS_PROJ" +if [ "$_CROSS_PROJ" = "true" ]; then + $GSTACK_BIN/gstack-learnings-search --limit 10 --cross-project 2>/dev/null || true +else + $GSTACK_BIN/gstack-learnings-search --limit 10 2>/dev/null || true +fi +``` + +If `CROSS_PROJECT` is `unset` (first time): Use AskUserQuestion: + +> gstack can search learnings from your other projects on this machine to find +> patterns that might apply here. This stays local (no data leaves your machine). +> Recommended for solo developers. Skip if you work on multiple client codebases +> where cross-contamination would be a concern. + +Options: +- A) Enable cross-project learnings (recommended) +- B) Keep learnings project-scoped only + +If A: run `$GSTACK_BIN/gstack-config set cross_project_learnings true` +If B: run `$GSTACK_BIN/gstack-config set cross_project_learnings false` + +Then re-run the search with the appropriate flag. + +If learnings are found, incorporate them into your analysis. When a review finding +matches a past learning, display: + +**"Prior learning applied: [key] (confidence N/10, from [date])"** + +This makes the compounding visible. The user should see that gstack is getting +smarter on their codebase over time. + +### DX Trend Check + +Before starting review passes, check for prior DX reviews on this project: + +```bash +eval "$(~/.hermes/skills/gstack/bin/gstack-slug 2>/dev/null)" +~/.hermes/skills/gstack/bin/gstack-review-read 2>/dev/null | grep plan-devex-review || echo "NO_PRIOR_DX_REVIEWS" +``` + +If prior reviews exist, display the trend: +``` +DX TREND (prior reviews): + Dimension | Prior Score | Notes + Getting Started | 4/10 | from 2026-03-15 + ... +``` + +### Pass 1: Getting Started Experience (Zero Friction) + +Rate 0-10: Can a developer go from zero to hello world in under 5 minutes? + +**Evidence recall:** Reference the competitive benchmark from 0C (target tier), the +magical moment from 0D (delivery vehicle), and any Install/Hello World friction +points from 0F. + +Load reference: Read the "## Pass 1" section from `~/.hermes/skills/gstack/plan-devex-review/dx-hall-of-fame.md`. + +Evaluate: +- **Installation**: One command? One click? No prerequisites? +- **First run**: Does the first command produce visible, meaningful output? +- **Sandbox/Playground**: Can developers try before installing? +- **Free tier**: No credit card, no sales call, no company email? +- **Quick start guide**: Copy-paste complete? Shows real output? +- **Auth/credential bootstrapping**: How many steps between "I want to try" and "it works"? +- **Magical moment delivery**: Is the vehicle chosen in 0D actually in the plan? +- **Competitive gap**: How far is the TTHW from the target tier chosen in 0C? + +FIX TO 10: Write the ideal getting started sequence. Specify exact commands, +expected output, and time budget per step. Target: 3 steps or fewer, under the +time chosen in 0C. + +Stripe test: Can a [persona from 0A] go from "never heard of this" to "it worked" +in one terminal session without leaving the terminal? + +**STOP.** AskUserQuestion once per issue. Recommend + WHY. Reference the persona. + +### Pass 2: API/CLI/SDK Design (Usable + Useful) + +Rate 0-10: Is the interface intuitive, consistent, and complete? + +**Evidence recall:** Does the API surface match [persona from 0A]'s mental model? +A YC founder expects `tool.do(thing)`. A platform engineer expects +`tool.configure(options).execute(thing)`. + +Load reference: Read the "## Pass 2" section from `~/.hermes/skills/gstack/plan-devex-review/dx-hall-of-fame.md`. + +Evaluate: +- **Naming**: Guessable without docs? Consistent grammar? +- **Defaults**: Every parameter has a sensible default? Simplest call gives useful result? +- **Consistency**: Same patterns across the entire API surface? +- **Completeness**: 100% coverage or do devs drop to raw HTTP for edge cases? +- **Discoverability**: Can devs explore from CLI/playground without docs? +- **Reliability/trust**: Latency, retries, rate limits, idempotency, offline behavior? +- **Progressive disclosure**: Simple case is production-ready, complexity revealed gradually? +- **Persona fit**: Does the interface match how [persona] thinks about the problem? + +Good API design test: Can a [persona] use this API correctly after seeing one example? + +**STOP.** AskUserQuestion once per issue. Recommend + WHY. + +### Pass 3: Error Messages & Debugging (Fight Uncertainty) + +Rate 0-10: When something goes wrong, does the developer know what happened, why, +and how to fix it? + +**Evidence recall:** Reference any error-related friction points from 0F and confusion +points from 0G. + +Load reference: Read the "## Pass 3" section from `~/.hermes/skills/gstack/plan-devex-review/dx-hall-of-fame.md`. + +**Trace 3 specific error paths** from the plan or codebase. For each, evaluate against +the three-tier system from the Hall of Fame: +- **Tier 1 (Elm):** Conversational, first person, exact location, suggested fix +- **Tier 2 (Rust):** Error code links to tutorial, primary + secondary labels, help section +- **Tier 3 (Stripe API):** Structured JSON with type, code, message, param, doc_url + +For each error path, show what the developer currently sees vs. what they should see. + +Also evaluate: +- **Permission/sandbox/safety model**: What can go wrong? How clear is the blast radius? +- **Debug mode**: Verbose output available? +- **Stack traces**: Useful or internal framework noise? + +**STOP.** AskUserQuestion once per issue. Recommend + WHY. + +### Pass 4: Documentation & Learning (Findable + Learn by Doing) + +Rate 0-10: Can a developer find what they need and learn by doing? + +**Evidence recall:** Does the docs architecture match [persona from 0A]'s learning +style? A YC founder needs copy-paste examples front and center. A platform engineer +needs architecture docs and API reference. + +Load reference: Read the "## Pass 4" section from `~/.hermes/skills/gstack/plan-devex-review/dx-hall-of-fame.md`. + +Evaluate: +- **Information architecture**: Find what they need in under 2 minutes? +- **Progressive disclosure**: Beginners see simple, experts find advanced? +- **Code examples**: Copy-paste complete? Work as-is? Real context? +- **Interactive elements**: Playgrounds, sandboxes, "try it" buttons? +- **Versioning**: Docs match the version dev is using? +- **Tutorials vs references**: Both exist? + +**STOP.** AskUserQuestion once per issue. Recommend + WHY. + +### Pass 5: Upgrade & Migration Path (Credible) + +Rate 0-10: Can developers upgrade without fear? + +Load reference: Read the "## Pass 5" section from `~/.hermes/skills/gstack/plan-devex-review/dx-hall-of-fame.md`. + +Evaluate: +- **Backward compatibility**: What breaks? Blast radius limited? +- **Deprecation warnings**: Advance notice? Actionable? ("use newMethod() instead") +- **Migration guides**: Step-by-step for every breaking change? +- **Codemods**: Automated migration scripts? +- **Versioning strategy**: Semantic versioning? Clear policy? + +**STOP.** AskUserQuestion once per issue. Recommend + WHY. + +### Pass 6: Developer Environment & Tooling (Valuable + Accessible) + +Rate 0-10: Does this integrate into developers' existing workflows? + +**Evidence recall:** Does local dev setup work for [persona from 0A]'s typical +environment? + +Load reference: Read the "## Pass 6" section from `~/.hermes/skills/gstack/plan-devex-review/dx-hall-of-fame.md`. + +Evaluate: +- **Editor integration**: Language server? Autocomplete? Inline docs? +- **CI/CD**: Works in GitHub Actions, GitLab CI? Non-interactive mode? +- **TypeScript support**: Types included? Good IntelliSense? +- **Testing support**: Easy to mock? Test utilities? +- **Local development**: Hot reload? Watch mode? Fast feedback? +- **Cross-platform**: Mac, Linux, Windows? Docker? ARM/x86? +- **Local env reproducibility**: Works across OS, package managers, containers, proxies? +- **Observability/testability**: Dry-run mode? Verbose output? Sample apps? Fixtures? + +**STOP.** AskUserQuestion once per issue. Recommend + WHY. + +### Pass 7: Community & Ecosystem (Findable + Desirable) + +Rate 0-10: Is there a community, and does the plan invest in ecosystem health? + +Load reference: Read the "## Pass 7" section from `~/.hermes/skills/gstack/plan-devex-review/dx-hall-of-fame.md`. + +Evaluate: +- **Open source**: Code open? Permissive license? +- **Community channels**: Where do devs ask questions? Someone answering? +- **Examples**: Real-world, runnable? Not just hello world? +- **Plugin/extension ecosystem**: Can devs extend it? +- **Contributing guide**: Process clear? +- **Pricing transparency**: No surprise bills? + +**STOP.** AskUserQuestion once per issue. Recommend + WHY. + +### Pass 8: DX Measurement & Feedback Loops (Implement + Refine) + +Rate 0-10: Does the plan include ways to measure and improve DX over time? + +Load reference: Read the "## Pass 8" section from `~/.hermes/skills/gstack/plan-devex-review/dx-hall-of-fame.md`. + +Evaluate: +- **TTHW tracking**: Can you measure getting started time? Is it instrumented? +- **Journey analytics**: Where do devs drop off? +- **Feedback mechanisms**: Bug reports? NPS? Feedback button? +- **Friction audits**: Periodic reviews planned? +- **Boomerang readiness**: Will /devex-review be able to measure reality vs. plan? + +**STOP.** AskUserQuestion once per issue. Recommend + WHY. + +### Appendix: Claude Code Skill DX Checklist + +**Conditional: only run when product type includes "Claude Code skill".** + +This is NOT a scored pass. It's a checklist of proven patterns from gstack's own DX. + +Load reference: Read the "## Claude Code Skill DX Checklist" section from +`~/.hermes/skills/gstack/plan-devex-review/dx-hall-of-fame.md`. + +Check each item. For any unchecked item, explain what's missing and suggest the fix. + +**STOP.** AskUserQuestion for any item that requires a design decision. + + + +When constructing the outside voice prompt, include the Developer Persona from Step 0A +and the Competitive Benchmark from Step 0C. The outside voice should critique the plan +in the context of who is using it and what they're competing against. + +### 2. Review Log + +## Review Log + +After producing the DX Scorecard above, persist the review result. + +**PLAN MODE EXCEPTION — ALWAYS RUN:** This command writes review metadata to +`~/.gstack/` (user config directory, not project files). + +```bash +~/.hermes/skills/gstack/bin/gstack-review-log '{"skill":"plan-devex-review","timestamp":"TIMESTAMP","status":"STATUS","initial_score":N,"overall_score":N,"product_type":"TYPE","tthw_current":"TTHW_CURRENT","tthw_target":"TTHW_TARGET","mode":"MODE","persona":"PERSONA","competitive_tier":"TIER","pass_scores":{"getting_started":N,"api_design":N,"errors":N,"docs":N,"upgrade":N,"dev_env":N,"community":N,"measurement":N},"unresolved":N,"commit":"COMMIT"}' +``` + +Substitute values from the DX Scorecard. MODE is EXPANSION/POLISH/TRIAGE. +PERSONA is a short label (e.g., "yc-founder", "platform-eng"). +TIER is Champion/Competitive/NeedsWork/RedFlag. + +### 3. Review Readiness Dashboard + +## Review Readiness Dashboard + +After completing the review, read the review log and config to display the dashboard. + +```bash +~/.hermes/skills/gstack/bin/gstack-review-read +``` + +Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Eng Review row, show whichever is more recent between `review` (diff-scoped pre-landing review) and `plan-eng-review` (plan-stage architecture review). Append "(DIFF)" or "(PLAN)" to the status to distinguish. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. For the Outside Voice row, show the most recent `codex-plan-review` entry — this captures outside voices from both /plan-ceo-review and /plan-eng-review. + +**Source attribution:** If the most recent entry for a skill has a \`"via"\` field, append it to the status label in parentheses. Examples: `plan-eng-review` with `via:"autoplan"` shows as "CLEAR (PLAN via /autoplan)". `review` with `via:"ship"` shows as "CLEAR (DIFF via /ship)". Entries without a `via` field show as "CLEAR (PLAN)" or "CLEAR (DIFF)" as before. + +Note: `autoplan-voices` and `design-outside-voices` entries are audit-trail-only (forensic data for cross-model consensus analysis). They do not appear in the dashboard and are not checked by any consumer. + +Display: + +``` ++====================================================================+ +| REVIEW READINESS DASHBOARD | ++====================================================================+ +| Review | Runs | Last Run | Status | Required | +|-----------------|------|---------------------|-----------|----------| +| Eng Review | 1 | 2026-03-16 15:00 | CLEAR | YES | +| CEO Review | 0 | — | — | no | +| Design Review | 0 | — | — | no | +| Adversarial | 0 | — | — | no | +| Outside Voice | 0 | — | — | no | ++--------------------------------------------------------------------+ +| VERDICT: CLEARED — Eng Review passed | ++====================================================================+ +``` + +**Review tiers:** +- **Eng Review (required by default):** The only review that gates shipping. Covers architecture, code quality, tests, performance. Can be disabled globally with \`gstack-config set skip_eng_review true\` (the "don't bother me" setting). +- **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup. +- **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes. +- **Adversarial Review (automatic):** Always-on for every review. Every diff gets both Claude adversarial subagent and Codex adversarial challenge. Large diffs (200+ lines) additionally get Codex structured review with P1 gate. No configuration needed. +- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Falls back to Claude subagent if Codex is unavailable. Never gates shipping. + +**Verdict logic:** +- **CLEARED**: Eng Review has >= 1 entry within 7 days from either \`review\` or \`plan-eng-review\` with status "clean" (or \`skip_eng_review\` is \`true\`) +- **NOT CLEARED**: Eng Review missing, stale (>7 days), or has open issues +- CEO, Design, and Codex reviews are shown for context but never block shipping +- If \`skip_eng_review\` config is \`true\`, Eng Review shows "SKIPPED (global)" and verdict is CLEARED + +**Staleness detection:** After displaying the dashboard, check if any existing reviews may be stale: +- Parse the \`---HEAD---\` section from the bash output to get the current HEAD commit hash +- For each review entry that has a \`commit\` field: compare it against the current HEAD. If different, count elapsed commits: \`git rev-list --count STORED_COMMIT..HEAD\`. Display: "Note: {skill} review from {date} may be stale — {N} commits since review" +- For entries without a \`commit\` field (legacy entries): display "Note: {skill} review from {date} has no commit tracking — consider re-running for accurate staleness detection" +- If all reviews match the current HEAD, do not display any staleness notes diff --git a/ship/references/validation-log.md b/ship/references/validation-log.md new file mode 100644 index 0000000000..5e2f1c8348 --- /dev/null +++ b/ship/references/validation-log.md @@ -0,0 +1,112 @@ +# Validation log + +Moved from SKILL.md to keep the primary skill body patchable. Future dated validation logs belong here, not in SKILL.md. + +## 2026-06-03 archive from `gstack-ship` + +### 1. Review Readiness Dashboard + +## Review Readiness Dashboard + +After completing the review, read the review log and config to display the dashboard. + +```bash +~/.hermes/skills/gstack/bin/gstack-review-read +``` + +Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Eng Review row, show whichever is more recent between `review` (diff-scoped pre-landing review) and `plan-eng-review` (plan-stage architecture review). Append "(DIFF)" or "(PLAN)" to the status to distinguish. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. For the Outside Voice row, show the most recent `codex-plan-review` entry — this captures outside voices from both /plan-ceo-review and /plan-eng-review. + +**Source attribution:** If the most recent entry for a skill has a \`"via"\` field, append it to the status label in parentheses. Examples: `plan-eng-review` with `via:"autoplan"` shows as "CLEAR (PLAN via /autoplan)". `review` with `via:"ship"` shows as "CLEAR (DIFF via /ship)". Entries without a `via` field show as "CLEAR (PLAN)" or "CLEAR (DIFF)" as before. + +Note: `autoplan-voices` and `design-outside-voices` entries are audit-trail-only (forensic data for cross-model consensus analysis). They do not appear in the dashboard and are not checked by any consumer. + +Display: + +``` ++====================================================================+ +| REVIEW READINESS DASHBOARD | ++====================================================================+ +| Review | Runs | Last Run | Status | Required | +|-----------------|------|---------------------|-----------|----------| +| Eng Review | 1 | 2026-03-16 15:00 | CLEAR | YES | +| CEO Review | 0 | — | — | no | +| Design Review | 0 | — | — | no | +| Adversarial | 0 | — | — | no | +| Outside Voice | 0 | — | — | no | ++--------------------------------------------------------------------+ +| VERDICT: CLEARED — Eng Review passed | ++====================================================================+ +``` + +**Review tiers:** +- **Eng Review (required by default):** The only review that gates shipping. Covers architecture, code quality, tests, performance. Can be disabled globally with \`gstack-config set skip_eng_review true\` (the "don't bother me" setting). +- **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup. +- **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes. +- **Adversarial Review (automatic):** Always-on for every review. Every diff gets both Claude adversarial subagent and Codex adversarial challenge. Large diffs (200+ lines) additionally get Codex structured review with P1 gate. No configuration needed. +- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Falls back to Claude subagent if Codex is unavailable. Never gates shipping. + +**Verdict logic:** +- **CLEARED**: Eng Review has >= 1 entry within 7 days from either \`review\` or \`plan-eng-review\` with status "clean" (or \`skip_eng_review\` is \`true\`) +- **NOT CLEARED**: Eng Review missing, stale (>7 days), or has open issues +- CEO, Design, and Codex reviews are shown for context but never block shipping +- If \`skip_eng_review\` config is \`true\`, Eng Review shows "SKIPPED (global)" and verdict is CLEARED + +**Staleness detection:** After displaying the dashboard, check if any existing reviews may be stale: +- Parse the \`---HEAD---\` section from the bash output to get the current HEAD commit hash +- For each review entry that has a \`commit\` field: compare it against the current HEAD. If different, count elapsed commits: \`git rev-list --count STORED_COMMIT..HEAD\`. Display: "Note: {skill} review from {date} may be stale — {N} commits since review" +- For entries without a \`commit\` field (legacy entries): display "Note: {skill} review from {date} has no commit tracking — consider re-running for accurate staleness detection" +- If all reviews match the current HEAD, do not display any staleness notes + +If the Eng Review is NOT "CLEAR": + +Print: "No prior eng review found — ship will run its own pre-landing review in Step 9." + +Check diff size: `git diff ...HEAD --stat | tail -1`. If the diff is >200 lines, add: "Note: This is a large diff. Consider running `/plan-eng-review` or `/autoplan` for architecture-level review before shipping." + +If CEO Review is missing, mention as informational ("CEO Review not run — recommended for product changes") but do NOT block. + +For Design Review: run `source <(~/.hermes/skills/gstack/bin/gstack-diff-scope 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 9, but consider running /design-review for a full visual audit post-implementation." Still never block. + +Continue to Step 2 — do NOT block or ask. Ship runs its own review in Step 9. + +--- + +### 2. Prior Learnings + +## Prior Learnings + +Search for relevant learnings from previous sessions: + +```bash +_CROSS_PROJ=$($GSTACK_BIN/gstack-config get cross_project_learnings 2>/dev/null || echo "unset") +echo "CROSS_PROJECT: $_CROSS_PROJ" +if [ "$_CROSS_PROJ" = "true" ]; then + $GSTACK_BIN/gstack-learnings-search --limit 10 --cross-project 2>/dev/null || true +else + $GSTACK_BIN/gstack-learnings-search --limit 10 2>/dev/null || true +fi +``` + +If `CROSS_PROJECT` is `unset` (first time): Use AskUserQuestion: + +> gstack can search learnings from your other projects on this machine to find +> patterns that might apply here. This stays local (no data leaves your machine). +> Recommended for solo developers. Skip if you work on multiple client codebases +> where cross-contamination would be a concern. + +Options: +- A) Enable cross-project learnings (recommended) +- B) Keep learnings project-scoped only + +If A: run `$GSTACK_BIN/gstack-config set cross_project_learnings true` +If B: run `$GSTACK_BIN/gstack-config set cross_project_learnings false` + +Then re-run the search with the appropriate flag. + +If learnings are found, incorporate them into your analysis. When a review finding +matches a past learning, display: + +**"Prior learning applied: [key] (confidence N/10, from [date])"** + +This makes the compounding visible. The user should see that gstack is getting +smarter on their codebase over time.