Skip to content

feat: /titan-run orchestrator with diff review, semantic assertions, arch snapshots#557

Merged
carlos-alm merged 113 commits intomainfrom
feat/release-skill-auto-semver
Mar 24, 2026
Merged

feat: /titan-run orchestrator with diff review, semantic assertions, arch snapshots#557
carlos-alm merged 113 commits intomainfrom
feat/release-skill-auto-semver

Conversation

@carlos-alm
Copy link
Copy Markdown
Contributor

Summary

  • Add /titan-run orchestrator skill that dispatches the full Titan pipeline (recon → gauntlet → sync → forge) to sub-agents with fresh context windows, enabling hands-free end-to-end execution
  • Add Diff Review Agent to /titan-forge (Step 9) — verifies each diff matches the gauntlet recommendation and sync plan intent before gate runs (scope check, intent match, commit message accuracy, deletion audit, leftover check)
  • Add Semantic Assertions to /titan-gate (Step 5) — export signature stability, import resolution integrity, dependency direction assertions, re-export chain validation
  • Add Architectural Snapshot Comparator to /titan-gate (Step 5.5) — captures pre-forge architectural baseline, compares community stability, cross-domain dependency direction, cohesion delta, and drift after each commit
  • Add hardening to /titan-run: Pre-Agent Gate (G1-G4), post-phase validation (V1-V15), stall detection, state file backup/recovery, NDJSON integrity checks, mandatory human checkpoint before forge

Test plan

  • Run /titan-run on a test codebase in a worktree — verify full pipeline completes
  • Verify gauntlet loop resumes correctly across sub-agent boundaries
  • Verify diff review catches a scope violation (touch file outside sync plan)
  • Verify semantic assertions catch a removed export with active callers
  • Verify architectural snapshot comparison catches a new upward dependency
  • Verify stall detection stops the loop after 3 consecutive no-progress iterations
  • Verify state file backup/recovery works when titan-state.json is corrupted
  • Verify --start-from forge skips analysis phases but validates their artifacts exist

…in backlog

These two items deliver the highest immediate impact on agent experience
and graph accuracy without requiring Rust porting or TypeScript migration.
They should be implemented before any Phase 4+ roadmap work.

- #83: hook-optimized `codegraph brief` enriches passively-injected context
- #71: basic type inference closes the biggest resolution gap for TS/Java
Impact: 14 functions changed, 0 affected
Add new Phase 4 covering the port of JS-only build phases to Rust:
- 4.1-4.3: AST nodes, CFG, dataflow visitor ports (~587ms savings)
- 4.4: Batch SQLite inserts (~143ms)
- 4.5: Role classification & structure (~42ms)
- 4.6: Complete complexity pre-computation
- 4.7: Fix incremental rebuild data loss on native engine
- 4.8: Incremental rebuild performance (target sub-100ms)

Bump old Phases 4-10 to 5-11 with all cross-references updated.
Benchmark evidence shows ~50% of native build time is spent in
JS visitors that run identically on both engines.
Take main's corrected #57 section anchors; keep HEAD's v2.7.0 version reference.

Impact: 10 functions changed, 11 affected
…ative-acceleration

Impact: 25 functions changed, 46 affected
- Add COMMITS=0 guard in publish.yml to return clean version when HEAD
  is exactly at a tag (mirrors bench-version.js early return)
- Change bench-version.js to use PATCH+1-dev.COMMITS format instead of
  PATCH+COMMITS-dev.SHA (mirrors publish.yml's new scheme)
- Fix fallback in bench-version.js to use dev.1 matching publish.yml's
  no-tags COMMITS=1 default

Impact: 1 functions changed, 0 affected
The release skill now scans commit history using conventional commit
rules to determine major/minor/patch automatically. Explicit version
argument still works as before.
…ns, and architectural snapshot

Add /titan-run skill that dispatches the full Titan pipeline (recon → gauntlet →
sync → forge) to sub-agents with fresh context windows, enabling end-to-end
autonomous execution.

Hardening layers added across the pipeline:
- Pre-Agent Gate (G1-G4): git health, worktree validity, state integrity, backups
- Post-phase validation (V1-V15): artifact structure, coverage, consistency checks
- Stall detection with per-phase thresholds and no-progress abort
- Mandatory human checkpoint before forge (unless --yes)

New validation tools integrated into forge and gate:
- Diff Review Agent (forge Step 9): verifies each diff matches the gauntlet
  recommendation and sync plan intent before gate runs
- Semantic Assertions (gate Step 5): export signature stability, import resolution
  integrity, dependency direction, re-export chain validation
- Architectural Snapshot Comparator (gate Step 5.5): community stability, cross-domain
  dependency direction, cohesion delta, drift detection vs pre-forge baseline
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 21, 2026

Greptile Summary

This PR adds three major capabilities to the Titan Paradigm pipeline:

  1. /titan-run orchestrator — a new top-level skill that dispatches the full recon → gauntlet → sync → forge pipeline to sub-agents with fresh context windows. Includes Pre-Agent Gate (G1-G4), artifact pre-validation (Step 0.5), stall detection, state file backup/recovery, and 15 post-phase V-checks.

  2. Diff Review Agent in /titan-forge (Step 9) — five checks (D1 scope, D2 intent, D3 commit message, D4 deletion audit, D5 leftover) run after staging but before gate, catching intent drift early. Staging is correctly moved to Step 8 before the review.

  3. Semantic Assertions (Step 5: export stability, import resolution, boundary rules, barrel re-exports) and Architectural Snapshot Comparator (Step 5.5: community stability, dependency direction, cohesion delta, drift resolution) in /titan-gate.

The PR includes an extensive series of fixes (50+ reviewer threads addressed) and is remarkably thorough — covering shell portability, temp-file lifecycle, stale variable persistence, off-by-one array accesses, non-deterministic Louvain IDs, and many edge cases. Two remaining issues were found:

  • Step 5a in titan-gate: codegraph exports is positioned before the new-file guard in the instruction layout, causing unnecessary invocations on newly-added files. The code block should appear inside the loop, after the guard.
  • Final report in titan-run: The Diff review rejections counter is a subset of Targets failed, but they appear as sibling rows without annotation. Users will misread the total as their sum, and agents have no explicit guidance on how to filter failedTargets by diff-review reason prefix.

Both issues also apply to the docs/examples/ mirror copies.

Confidence Score: 4/5

  • Safe to merge — no runtime-breaking issues remain. Two minor P1/P2 clarity issues in skill pseudocode are worth fixing before the next full pipeline run.
  • The PR is a very large and carefully engineered addition. After an extensive prior review cycle (50+ threads, all addressed), only two residual issues were found: a misplaced code block in gate Step 5a that causes unnecessary codegraph exports calls on new files, and a report template ambiguity where Diff review rejections and Targets failed counts overlap without annotation. Neither is a data-loss or crash risk. All rollback paths, stall detection, temp-file lifecycle, shell portability, and variable persistence patterns are correct.
  • titan-gate/SKILL.md (Step 5a code block placement) and titan-run/SKILL.md (final report count annotation) — both affect their docs/examples/ mirrors as well.

Important Files Changed

Filename Overview
.claude/skills/titan-forge/SKILL.md Adds Diff Review Agent (Step 9) with D1-D5 checks, re-numbers commit/rollback steps, adds diffWarnings tracking, and fixes test runner detection. Staging now occurs before diff review. Rollback paths use dynamic file discovery. Minor issues remain around report count clarity.
.claude/skills/titan-gate/SKILL.md Adds Semantic Assertions (Step 5, sub-checks 5a–5d) and Architectural Snapshot Comparator (Step 5.5 with A1–A4). Removes old drift-detection and issue-tracking steps. Package-manager-aware tool detection added. step5cViolations deduplication mechanism is well-designed but has an initialization ambiguity.
.claude/skills/titan-run/SKILL.md New file adding the full Titan pipeline orchestrator with sub-agent dispatch, G1-G4 pre-agent gates, artifact pre-validation (Step 0.5), gauntlet and forge loops with stall detection, architectural snapshot capture (Step 3.5a), forge checkpoint, V1-V15 post-phase validations, and final report. Very thorough handling of many edge cases from prior review cycles.
docs/examples/claude-code-skills/README.md Updates skill table, usage guide, artifact table, and command table to reflect /titan-run orchestrator addition. Adds arch-snapshot.json, RUN column entries, new command rows. Mirrors .claude/skills/ layout. No issues found.
vitest.config.js Adds hookTimeout: 30000 (matching testTimeout) and fixes .claude glob pattern to /.claude/ for consistent cross-repo matching.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["/titan-run"] --> B[Step 0: Pre-flight\nArgs, worktree, main sync]
    B --> B1[Step 0.5: Artifact pre-validation\nif --start-from or --skip-*]
    B1 --> C[Step 1: RECON sub-agent\nV1-V4 post-validation]
    C --> D[Step 2: GAUNTLET loop\nSub-agents until complete\nV5-V7 + NDJSON integrity]
    D --> E[Step 3: SYNC sub-agent\nV8-V10 post-validation]
    E --> F[Step 3.5: Pre-forge\nArch snapshot + Human checkpoint]
    F --> G[Step 4: FORGE loop\nG1-G4 pre-agent gate each iteration]

    G --> H{Forge sub-agent}
    H --> I[Step 8: Stage files\ngit add]
    I --> J[Step 9: Diff Review\nD1 scope, D2 intent, D3 msg,\nD4 deletion, D5 leftover]
    J -->|DIFF FAIL| K[Rollback + continue\nadd to failedTargets]
    J -->|DIFF PASS/WARN| L[Step 10: Run tests\nPackage-manager aware]
    L -->|fail| K
    L -->|pass| M[Step 11: /titan-gate\nStep 5 Semantic + Step 5.5 Arch]
    M -->|cycle/test/lint/build FAIL| N[Step 13: Rollback\ngit reset + checkout]
    M -->|structural/semantic FAIL| O[Manual cleanup + continue]
    M -->|PASS| P[Step 12: Commit\nRecord SHA + diffWarnings]

    G --> Q[V11-V13 post-agent checks\nCommit audit + test suite]
    Q --> R{All phases done?}
    R -->|no| G
    R -->|yes| S[V14-V15 Final validation\n+ Step 5 Report]
Loading

Comments Outside Diff (2)

  1. .claude/skills/titan-gate/SKILL.md, line 129-135 (link)

    codegraph exports called before the new-file guard — unnecessary work and potential confusion

    The code block at line 134 (codegraph exports <changed-file> -T --json) appears before the "For each changed file" loop that contains the new-file guard. An AI agent following the instruction sequentially would:

    1. Run codegraph exports <changed-file> for each changed file (described as "Get the list")
    2. Then check if the file is new and skip if so

    This means codegraph exports is called on newly-added files before the guard has a chance to skip them. For a new file, the exports command returns the current state with nothing to compare against — the results are thrown away anyway — but the wasted call could also confuse agents into thinking the export check partially ran.

    The code block should be placed inside the loop, after the new-file guard:

    For each changed file from diff-impact (Step 1):
    - First, check if the file existed before this commit:
      ```bash
      git show HEAD:<changed-file> -- 2>/dev/null | head -1

    If the file is new (command fails or returns nothing), skip 5a for this file — new exports cannot break existing callers.

    • Otherwise, run:
      codegraph exports <changed-file> -T --json
      For each exported symbol in the file: [...]
    
    The same structural issue exists in `docs/examples/claude-code-skills/titan-gate/SKILL.md` at the corresponding lines.
    
    
    
  2. .claude/skills/titan-run/SKILL.md, line 605-611 (link)

    Diff review rejections is a subset of Targets failed — overlapping counts mislead users

    The final report template shows both fields side by side:

      Targets failed: <N>
      Diff review rejections: <N>
    

    Diff review rejections is a subset of Targets failed (entries in execution.failedTargets where reason starts with "diff-review:"). A user reading the report will naturally add these together, concluding there were N_failed + N_diff_reject total failures when in reality the total is just N_failed.

    Additionally, the template doesn't tell the orchestrator how to derive each count, so an agent may:

    • Report all of execution.failedTargets for both rows (double-counting)
    • OR report Targets failed excluding diff-review entries and Diff review rejections separately — which also doesn't add up to the true total

    Consider making the relationship explicit:

      Targets failed: <N>         ← total (test + gate + diff-review)
        of which diff review rejections: <N>
      Diff review warnings: <N>
    

    Or alternatively clarify the derivation in the template with (included in above):

      Targets failed: <N>
      Diff review rejections (included in above): <N>
      Diff review warnings: <N>
    

    The same template exists in docs/examples/claude-code-skills/titan-run/SKILL.md at the corresponding lines.

Reviews (48): Last reviewed commit: "fix(skill): update forge Rules rollback ..." | Re-trigger Greptile

Comment on lines +200 to +205

if currentAuditedCount == previousAuditedCount:
stallCount += 1
Print: "WARNING: Gauntlet iteration <iteration> made no progress (stall <stallCount>/<maxStalls>)"
if stallCount >= maxStalls:
Stop: "Gauntlet stalled for <maxStalls> consecutive iterations at <currentAuditedCount>/<expectedTargetCount> targets. Likely stuck on a problematic target. Check gauntlet.ndjson for the last successful entry and investigate the next target in the batch."
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Undefined variable previousAuditedCountBeforeAgent in gauntlet efficiency check

The efficiency check references previousAuditedCountBeforeAgent, which is never defined in the pseudocode. By the time this line is reached, previousAuditedCount has already been updated to currentAuditedCount on the line just above, so using it there would always yield 0.

To correctly compute how many targets this iteration processed, you need to capture the pre-agent count before the update. For example:

    previousAuditedCount = currentAuditedCount   # update for next iteration's stall check

    # Efficiency check: if progress is very slow (< 2 targets per iteration), warn
    targetsThisIteration = currentAuditedCount - previousAuditedCountBeforeAgent  # ← undefined

Should be:

    # Save count before update for efficiency check
    countBeforeUpdate = previousAuditedCount
    previousAuditedCount = currentAuditedCount   # update for next iteration's stall check

    # Efficiency check
    targetsThisIteration = currentAuditedCount - countBeforeUpdate
    if targetsThisIteration == 1 and iteration > 3:
        Print: "WARNING: Only 1 target per iteration..."

This same issue exists in the identical copy at docs/examples/claude-code-skills/titan-run/SKILL.md at the same line.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — saved pre-update count to countBeforeUpdate before reassignment. Applied to both .claude/skills/ and docs/examples/ copies.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

Resolved merge conflicts with main and fixed the undefined previousAuditedCountBeforeAgent variable in the gauntlet loop efficiency check.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

If any SHA doesn't exist → **WARN**: "Commit <sha> recorded in state but not found in git history. State may be out of sync."

**V15. Gate log consistency:**
If `.codegraph/titan/gate-log.ndjson` exists:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Incomplete merge conflict detection — misses AU, UA, DU, UD markers

The check only lists UU, AA, DD as conflict indicators in git status --porcelain output. Git also produces AU (added by us, modified by them), UA (modified by us, added by them), DU (deleted by us, modified by them), and UD (modified by us, deleted by them) in the porcelain format for merge conflicts. Missing these cases could allow the pipeline to proceed with unresolved conflicts into a forge sub-agent that commits.

Suggested change
If `.codegraph/titan/gate-log.ndjson` exists:
- **Merge conflicts** (lines starting with `UU`, `AA`, `DD`, `AU`, `UA`, `DU`, `UD`): Stop immediately: "Unresolved merge conflict detected. Resolve before continuing."

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added AU, UA, DU, UD to the merge conflict detection markers in both .claude/skills/ and docs/examples/ copies.

Gate now includes semantic assertions (Step 5) and architectural snapshot comparison (Step 5.5).
The arch snapshot is at .codegraph/titan/arch-snapshot.json.
6. Commit on success, rollback on failure

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 --start-from forge silently disables architectural comparison without warning

When using --start-from forge, Step 3.5a (architectural snapshot capture) is skipped entirely since it lives inside Step 3.5, which comes after sync. The gate's Step 5.5 gracefully handles a missing arch-snapshot.json by skipping the comparison, but the user receives no warning that the architectural drift check will be inactive for the entire forge run.

Consider adding an explicit notice at the start of the forge loop when arch-snapshot.json is absent and the start phase was forge:

if startPhase == "forge" and not exists(".codegraph/titan/arch-snapshot.json"):
    Print: "NOTE: No arch-snapshot.json found. Architectural comparison in /titan-gate (Step 5.5) will be skipped for this run.
           To enable it, run '/titan-run --start-from sync' to re-capture the pre-forge snapshot."

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added a pre-loop check in Step 4a that warns when arch-snapshot.json is missing, advising the user to run --start-from sync to re-capture it.

- Fix undefined previousAuditedCountBeforeAgent variable in gauntlet
  efficiency check (save pre-update count before reassignment)
- Add AU, UA, DU, UD to merge conflict detection markers
- Add warning when --start-from forge runs without arch-snapshot.json
- Fix undefined previousAuditedCountBeforeAgent variable in gauntlet
  efficiency check (save pre-update count before reassignment)
- Add AU, UA, DU, UD to merge conflict detection markers
- Add warning when --start-from forge runs without arch-snapshot.json
@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

Comment on lines 206 to 208
13. **On failure (test, gate, or diff-review):**
```bash
git checkout -- <changed files>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Incomplete rollback — staged files not unstaged in Step 13

The new flow stages files in Step 8 (git add) before running tests (Step 10) and gate (Step 11). When a test or gate failure routes to Step 13, only the working-tree restoration is performed:

git checkout -- <changed files>

This restores the file contents on disk but does not remove the files from the staging area. The changes remain indexed in git. If the orchestrator loops back and the next target's git add picks up different files, the leftover staged changes from the failed target will be silently included in the next commit.

The rollback should unstage first, then restore the working tree:

Suggested change
13. **On failure (test, gate, or diff-review):**
```bash
git checkout -- <changed files>
13. **On failure (test, gate, or diff-review):**
```bash
git reset HEAD <changed files>
git checkout -- <changed files>
```

Note: the DIFF FAIL path in Step 9 already says "Unstage and revert changes" correctly — this gap only affects Step 13. The same fix is needed in the docs mirror at docs/examples/claude-code-skills/titan-forge/SKILL.md.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added unstaging step before working tree restore in Step 13 rollback, so staged files are properly cleared before checkout. Applied to both .claude/skills/ and docs/examples/ copies.

Both commands are called in titan-forge's new diff review step (Step 9):
fn-impact for deletion audit (D4) and exports for re-export chain checks.
@carlos-alm
Copy link
Copy Markdown
Contributor Author

Addressed the remaining open item from the summary:

README command table (P2): Restored FORGE to both codegraph fn-impact and codegraph exports in the tool reference table. Both commands are called in titan-forge's new diff review Step 9 — fn-impact for the deletion audit (D4) and exports for re-export chain checks (D3). The two P1 issues (shell substitution bug in arch-snapshot.json and incomplete git rollback in Step 13) were addressed in an earlier commit.

The --yes passthrough was also checked — titan-run passes --yes to forge and forge still handles it at Step 8 ($ARGUMENTS contains --yes), so that is working as intended.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

Comment on lines 197 to 199
| `codegraph exports` | GAUNTLET | Per-symbol export consumers |
| `codegraph fn-impact` | GAUNTLET, SYNC | Blast radius |
| `codegraph search` | GAUNTLET | Duplicate code detection (needs embeddings) |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Command table stale after new diff-review and semantic-assertion additions

Three commands were removed from FORGE/GATE in the table, but the new steps introduced in this PR use them:

  1. codegraph fn-impact — listed as "GAUNTLET, SYNC" only, but is still called in titan-forge Step 4 (pre-touch understanding) and in the new Step 9 D4 (deletion audit). It's also now used in titan-gate Step 5a. All three users (FORGE, GATE) are missing.

  2. codegraph context — listed as "SYNC" only, but titan-forge Step 4 still calls codegraph context <target> -T --json (this code is unchanged and confirmed in the current file). FORGE should be restored.

  3. codegraph exports — listed as "GAUNTLET" only, but the new titan-gate Step 5a explicitly runs codegraph exports <changed-file> -T --json. GATE should be added.

The same table appears in .claude/skills/ (mirrored) so the same correction applies there.

Suggested correction:

Suggested change
| `codegraph exports` | GAUNTLET | Per-symbol export consumers |
| `codegraph fn-impact` | GAUNTLET, SYNC | Blast radius |
| `codegraph search` | GAUNTLET | Duplicate code detection (needs embeddings) |
| `codegraph exports` | GAUNTLET, GATE | Per-symbol export consumers |
| `codegraph fn-impact` | GAUNTLET, SYNC, FORGE, GATE | Blast radius |

And separately restore FORGE to the codegraph context row:

| `codegraph context` | SYNC, FORGE | Full function context |

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added GATE to exports and fn-impact rows, added FORGE to context row in the command table. Verified against actual usage in titan-gate Step 5a and titan-forge Steps 4 and 9.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added GATE to codegraph communities, codegraph structure, and codegraph deps in the command table. These are all used by GATE Step 5.5 (architectural snapshot comparison) but were missing from the "Used by" column.

5. Run /titan-gate — read .claude/skills/titan-gate/SKILL.md and follow it exactly.
Gate now includes semantic assertions (Step 5) and architectural snapshot comparison (Step 5.5).
The arch snapshot is at .codegraph/titan/arch-snapshot.json.
6. Commit on success, rollback on failure
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 --yes description inaccurate — forge no longer accepts this flag

The argument hint says --yes is "passed through to forge", but this PR removes --yes from titan-forge's supported arguments list (the --yes → skip confirmation prompt line was deleted from titan-forge). The flag is now passed to forge's dispatch invocation (line ~835 of this file: yesFlag = "--yes" if autoConfirm else ""), but forge silently ignores it since it no longer handles that argument.

--yes still has meaningful effects within the orchestrator itself (auto-resume on existing state, skip the pre-pipeline and forge-checkpoint confirmations), so it should stay — but the description should be updated to reflect its actual scope.

Suggested change
6. Commit on success, rollback on failure
- `--yes` → skip all confirmation prompts in the orchestrator (pre-pipeline, forge checkpoint, and resume prompts)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — updated the --yes description to reflect its actual scope: orchestrator-level prompts only (pre-pipeline, forge checkpoint, and resume). Applied to both .claude/skills/ and docs/examples/ copies.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — updated the Rules section description of --yes to clarify both purposes: (1) passed through to forge so it skips its per-phase confirmation prompt, and (2) within the orchestrator itself, skips pre-pipeline, forge checkpoint, and resume prompts. Updated in both .claude/skills/titan-run/SKILL.md and the example copy.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

Comment on lines +325 to +326
For structural-only and semantic **advisory** checks (Steps 1, 3, 5, 5.5, 6-8), do NOT auto-rollback — report and let user decide.
For Step 2 (new cycle) and Step 4 (lint/build/test), trigger auto-rollback.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 "Advisory" label misleading for semantic/arch FAILs

Steps 5 (semantic assertions) and 5.5 (architectural snapshot) can produce hard FAILs — e.g., "Removed export parseConfig still imported by 3 files" or "Significant community restructuring detected." These are gating failures, not suggestions. However, the label "advisory" in this line groups them with genuinely informational checks like Step 7 (sync alignment) and Step 6 (branch structural diff), which could cause an AI agent to treat semantic FAILs with less urgency than intended.

The "advisory" wording describes the rollback behaviour (preserve staged changes for in-place fix), but an agent reading this may interpret it as "severity advisory" and downgrade the reported FAIL to a WARN.

Consider changing the label to something that distinguishes rollback behavior from verdict severity:

Suggested change
For structural-only and semantic **advisory** checks (Steps 1, 3, 5, 5.5, 6-8), do NOT auto-rollback — report and let user decide.
For Step 2 (new cycle) and Step 4 (lint/build/test), trigger auto-rollback.
For structural and semantic failures (Steps 1, 3, 5, 5.5, 6-8), do NOT auto-rollback — report the FAIL and let the user decide whether to fix in place or unstage manually.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — replaced the misleading "advisory" label with explicit "structural and semantic failures" wording. The line now reads: "For structural and semantic failures (Steps 1, 3, 5, 5.5, 6-8), do NOT auto-rollback — report the FAIL and let the user decide whether to fix in place or unstage manually." This makes clear these are gating failures, not suggestions. Applied to both .claude/skills/ and docs/examples/ copies.

Comment on lines +222 to +223
- If FAIL on **cycle/test/lint/build** (gate auto-rolls back staged changes) → go to rollback (step 13) to also revert working tree.
- If FAIL on **semantic/structural** (gate does not auto-rollback its staging area, but forge must clean up for the next target) → unstage with `git reset HEAD <files> && git checkout -- <files>`, add to `execution.failedTargets` with reason, log the gate report, and continue to the next target. Do NOT go to step 13 — that step is for cycle/test/gate failures where gate already unstaged; going there again would attempt a duplicate rollback.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Complexity FAILs (gate Step 3) not covered by either named path

The two forge paths are "cycle/test/lint/build" (→ step 13) and "semantic/structural" (→ manual cleanup). Gate Step 3 (complexity delta) can produce hard FAILs — cognitive > 30, halstead.bugs > 1.0, mi < 20 — but it belongs to neither label.

An AI agent reading this must infer "complexity FAIL → not cycle/test/lint/build → therefore semantic/structural path." That deduction works because of the "(gate auto-rolls back staged changes)" hint in the cycle/test path, but an agent that pattern-matches on the label "semantic/structural" may not classify complexity failures there and route them incorrectly.

The comment "Do NOT go to step 13 — that step is for cycle/test/gate failures where gate already unstaged" also uses "gate" ambiguously — it could mean "any gate failure" rather than just the auto-rollback subset.

Consider being explicit:

Suggested change
- If FAIL on **cycle/test/lint/build** (gate auto-rolls back staged changes) → go to rollback (step 13) to also revert working tree.
- If FAIL on **semantic/structural** (gate does not auto-rollback its staging area, but forge must clean up for the next target) → unstage with `git reset HEAD <files> && git checkout -- <files>`, add to `execution.failedTargets` with reason, log the gate report, and continue to the next target. Do NOT go to step 13 — that step is for cycle/test/gate failures where gate already unstaged; going there again would attempt a duplicate rollback.
- If FAIL on **cycle/test/lint/build** (gate auto-rolls back staged changes, gate Step 2/4) → go to rollback (step 13) to also revert working tree.
- If FAIL on **any other check** — complexity (Step 3), semantic (Step 5), structural/arch (Steps 1, 5.5, 6-8) — gate does NOT auto-rollback its staging area; forge must clean up for the next target → unstage with `git reset HEAD <files> && git checkout -- <files>`, add to `execution.failedTargets` with reason, log the gate report, and continue to the next target. Do NOT go to step 13 — that step is for Step 2/4 failures where gate already unstaged; going there again would attempt a duplicate rollback.

The same pattern is present in docs/examples/claude-code-skills/titan-forge/SKILL.md at the corresponding line.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — replaced the "semantic/structural" catch-all with an explicit "any other check" listing that names all non-rollback failure types: complexity (Step 3), semantic (Step 5), structural/arch (Steps 1, 5.5, 6-8). Also clarified the "Do NOT go to step 13" reason with "Step 2/4 failures" instead of the ambiguous "cycle/test/gate failures." The rollback command now uses git diff --cached --name-only and git diff --name-only for portable file discovery instead of relying on <files> placeholders. Applied to both .claude/skills/ and docs/examples/ copies.

Comment on lines +429 to +431
git fetch origin main 2>/dev/null || true
mergeBase=$(git merge-base HEAD origin/main)
mainAdvance=$(git rev-list --count $mergeBase..origin/main)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Stale origin/main silently produces wrong divergence count on fetch failure

The 2>/dev/null || true makes git fetch non-fatal, but git merge-base HEAD origin/main on the next line still uses origin/main — which may now point to an arbitrarily old commit from the last successful fetch (days or weeks ago). The result of git rev-list --count $mergeBase..origin/main would then be a completely wrong number (could be hundreds or thousands of commits), potentially alarming users into re-running the full recon when the codebase hasn't actually diverged much.

There's no signal to the user that the divergence count is based on stale data.

Consider checking whether the fetch succeeded and annotating the warning accordingly:

if git fetch origin main 2>/dev/null; then
  mergeBase=$(git merge-base HEAD origin/main)
  mainAdvance=$(git rev-list --count $mergeBase..origin/main)
  # warn if mainAdvance > 0
else
  Print: "NOTE: Could not fetch origin/main — skipping divergence check."
fi

The same applies to docs/examples/claude-code-skills/titan-run/SKILL.md at the corresponding lines.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — the divergence check now detects fetch failure instead of silently falling through with stale data. The structure is now:

if git fetch origin main 2>/dev/null; then
  mergeBase=$(git merge-base HEAD origin/main)
  mainAdvance=$(git rev-list --count $mergeBase..origin/main)
else
  mainAdvance="unknown"
fi

When fetch fails, the checkpoint prints "NOTE: Could not fetch origin/main — skipping divergence check." instead of a potentially misleading commit count. Applied to both .claude/skills/ and docs/examples/ copies.

carlos-alm added a commit that referenced this pull request Mar 24, 2026
…dle stale fetch (#557)

- Gate Step 9: replace misleading "advisory" label with explicit "structural and semantic failures" wording
- Forge Step 11: replace "semantic/structural" catch-all with explicit "any other check" listing complexity (Step 3), semantic (Step 5), structural/arch (Steps 1, 5.5, 6-8)
- Run Step 3.5b: divergence check now detects fetch failure and skips instead of computing against stale origin/main
@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

- Gate Step 9: replace misleading "advisory" label with
  explicit "structural and semantic failures" wording
- Forge Step 11: replace "semantic/structural" catch-all
  with explicit listing of all non-rollback failure types
- Run Step 3.5b: divergence check now detects fetch failure
  and skips instead of computing against stale origin/main
@carlos-alm carlos-alm force-pushed the feat/release-skill-auto-semver branch from 5b16ba8 to c83683f Compare March 24, 2026 21:30
Comment on lines +427 to +444
**Divergence check** — before asking for confirmation, check how far main has advanced since the initial sync:
```bash
if git fetch origin main 2>/dev/null; then
mergeBase=$(git merge-base HEAD origin/main)
mainAdvance=$(git rev-list --count $mergeBase..origin/main)
else
mainAdvance="unknown"
fi
```
If `mainAdvance == "unknown"`, append to the checkpoint output:
```
NOTE: Could not fetch origin/main — skipping divergence check.
```
If `mainAdvance > 0` (numeric), append to the checkpoint output:
```
WARNING: main has advanced <mainAdvance> commits since initial sync.
If significant, consider re-running: /titan-run --start-from recon
```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Divergence check runs after the confirmation print block, causing agents to show [y/n] before computing the warning

The **Divergence check** section says "before asking for confirmation, check how far main has advanced" but it is placed after the Print: block (lines 400–425), which already shows the full formatted checkpoint including Proceed with /titan-forge? [y/n] and the closing ================================================================. An AI agent executing these instructions sequentially will:

  1. Print the formatted checkpoint box (including the [y/n] prompt)
  2. Then run git fetch origin main and compute mainAdvance
  3. "Append" the divergence warning or note outside the formatted box

This means either the agent shows the confirmation prompt before presenting the divergence warning, or the warning appears as disconnected text below the closing border — both are confusing and undermine the intent of the divergence check.

The snapshotStatus fix from a prior thread set the correct precedent: gather data before the Print: block. The divergence check should follow the same pattern.

The fix is to move the git fetch computation before the Print: block, bind the result to a variable (e.g., mainAdvanceNote), and include it inline in the template:

### 3.5b. Human checkpoint

First, determine the snapshot status from Step 3.5a:

snapshotStatus = file_exists('.codegraph/titan/arch-snapshot.json') ? "captured" : "FAILED — gate A1/A3/A4 will be skipped"


**Divergence check** — run before printing the checkpoint:
```bash
if git fetch origin main 2>/dev/null; then
  mergeBase=$(git merge-base HEAD origin/main)
  mainAdvance=$(git rev-list --count $mergeBase..origin/main)
else
  mainAdvance="unknown"
fi
mainAdvanceNote = mainAdvance == "unknown"
  ? "NOTE: Could not fetch origin/main — skipping divergence check."
  : mainAdvance > 0
    ? "WARNING: main has advanced <mainAdvance> commits since initial sync.\nIf significant, consider re-running: /titan-run --start-from recon"
    : ""

Print:

================================================================
  ANALYSIS COMPLETE — FORGE CHECKPOINT
================================================================
...
Architectural snapshot: <snapshotStatus>
<mainAdvanceNote>
Proceed with /titan-forge? [y/n]
================================================================

The same fix is needed in `docs/examples/claude-code-skills/titan-run/SKILL.md` at the corresponding lines.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — moved the divergence check (git fetch + merge-base computation) before the Print block, binding the result to a mainAdvanceNote variable that is included inline in the checkpoint template. The [y/n] prompt now appears after the divergence warning, not before. Applied to both .claude/skills/ and docs/examples/ copies.

- Add target to `execution.completedTargets`
- Record any diff-review warnings by **appending** to `execution.diffWarnings` (if any). Each entry must follow this schema:
```json
{ "target": "<target-name>", "check": "D3|D5", "message": "<warning text>", "phase": N }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 "check": "D3|D5" schema notation is ambiguous — agent may emit a literal pipe-separated value

The diffWarnings entry schema shows:

{ "target": "<target-name>", "check": "D3|D5", "message": "<warning text>", "phase": N }

The D3|D5 notation is used as a type union (meaning "one of D3 or D5"), but it looks syntactically like the literal string value "D3|D5". An AI agent reading this schema might emit:

{ "target": "parseConfig", "check": "D3|D5", "message": "...", "phase": 2 }

instead of the correct:

{ "target": "parseConfig", "check": "D3", "message": "...", "phase": 2 }

Any downstream filtering of execution.diffWarnings by check type (e.g., counting only D5 warnings in the final report) would silently miss entries with the malformed "D3|D5" value.

Suggested change
{ "target": "<target-name>", "check": "D3|D5", "message": "<warning text>", "phase": N }
```json
{ "target": "<target-name>", "check": "<D3 or D5>", "message": "<warning text>", "phase": N }
```

The same ambiguity exists in docs/examples/claude-code-skills/titan-forge/SKILL.md at the corresponding line.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — changed "check": "D3|D5" to "check": "<D3 or D5>" to make clear this is a type union, not a literal pipe-separated value. An agent reading the schema will now produce "D3" or "D5" as individual values. Applied to both .claude/skills/ and docs/examples/ copies.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

Comment on lines +127 to +134
For each **exported** symbol in changed files:
- Check if the symbol existed before this change: `git show HEAD:<file>` and compare function signatures
- If a function's **parameter list changed** (added required params, removed params, changed types):
```bash
codegraph fn-impact <symbol> -T --json
```
Count callers. If callers > 0 and callers are NOT also staged → **FAIL**: "Signature change in `<symbol>` breaks <N> callers not updated in this commit: <caller list>"
- If an **export was removed entirely** and callers exist → **FAIL**: "Removed export `<symbol>` still imported by <N> files"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Step 5a silently no-ops for entirely new files

codegraph exports <changed-file> -T --json followed by git show HEAD:<file> fails with a fatal error for files that are new in this commit (they don't exist in HEAD). The step gives no guidance on this case, so an agent that hits the error may either:

  1. Abort the entire semantic assertion step, silently skipping 5b–5d as well.
  2. Treat the failure as "no prior state" and skip the caller check for that file — the desired behavior, but not documented.

Consider adding an explicit guard:

### 5a. Export signature stability

For each changed file from diff-impact (Step 1):
- First, check if the file existed before this commit:
  ```bash
  git show HEAD:<changed-file> -- 2>/dev/null | head -1

If the file is new (command fails or returns nothing), skip 5a for this file — new exports cannot break existing callers.

  • Otherwise, run codegraph exports <changed-file> -T --json and compare signatures against HEAD…

The same guard should be applied in the docs mirror at `docs/examples/claude-code-skills/titan-gate/SKILL.md`.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added an explicit new-file guard at the top of Step 5a. Before checking exported symbols, it now runs git show HEAD:<changed-file> -- 2>/dev/null | head -1 and skips the file if it's new (command fails). New exports cannot break existing callers. Applied to both .claude/skills/ and docs/examples/ copies.

Comment on lines +144 to +154
### 5c. Dependency direction assertions (codegraph boundary rules)

Check the Step 1 `codegraph check --staged --boundaries` results for boundary violations (already collected — do not re-run). This covers `.codegraphrc.json` onion-architecture rules and any custom boundary predicates.

For each boundary violation reported by `codegraph check`:
- New dependency that violates a configured boundary rule → **FAIL**: "New upward dependency: `<source>` → `<target>` violates layer boundary"

Additionally, from the diff-impact results already collected in Step 1, extract any **new** edges (imports that didn't exist before):
- New dependency on a module flagged in sync.json as "to be removed" or "to be split" → **WARN**: "New dependency on `<module>` which is scheduled for decomposition"

> **Note:** Step 5c relies exclusively on `codegraph check --boundaries` results. Domain-direction checks against `GLOBAL_ARCH.md` are handled by Step 5.5 A2 — do not duplicate them here.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Step 5c and A2 deduplication has no explicit tracking mechanism

A2 says to "skip any edge already flagged by Step 5c's codegraph check --boundaries results to avoid duplicate verdicts," and Step 5c's note says "Domain-direction checks against GLOBAL_ARCH.md are handled by Step 5.5 A2." However, Step 5c doesn't produce a named data structure of "flagged edges" — it emits FAIL verdicts in narrative form. An agent executing A2 later must somehow recall which specific (source, target) edges were already flagged by Step 5c.

Without an explicit de-duplication mechanism, an agent may:

  1. Re-flag edges already covered by Step 5c (duplicate FAIL messages in the gate report for the same edge).
  2. Incorrectly skip a valid A2 violation because it loosely resembles a Step 5c violation.

Consider making the deduplication explicit by naming the artifact:

**Step 5c** stores flagged edges in `step5cViolations` — a list of `{ source, target }` pairs.

**A2** then checks: for each new diff-impact edge violating domain direction — if `{ source, target }` is already in `step5cViolations`, skip it. Otherwise, emit FAIL.

The same applies to docs/examples/claude-code-skills/titan-gate/SKILL.md at the corresponding lines.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — Step 5c now explicitly builds a step5cViolations list of { source, target } pairs as it processes boundary violations. A2 then checks each new edge against this list: if the edge is already in step5cViolations, it skips it. This replaces the vague prose instruction with a concrete deduplication mechanism. Applied to both .claude/skills/ and docs/examples/ copies.

Comment on lines +515 to +521
newFailedTargets = execution.failedTargets (or [])

if newCompletedPhases == previousCompletedPhases and len(newCompletedTargets) == len(previousCompletedTargets):
stallCount += 1
Print: "WARNING: Forge iteration <iteration> made no progress (stall <stallCount>/<maxStalls>)"
if stallCount >= maxStalls:
Stop: "Forge stalled on phase <nextPhase> for <maxStalls> consecutive iterations. Check titan-state.json → execution.failedTargets for details."
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Forge stall fires on all-fail phases even though the agent made real progress

The stall condition is:

if newCompletedPhases == previousCompletedPhases and len(newCompletedTargets) == len(previousCompletedTargets):
    stallCount += 1

execution.completedTargets only includes successfully committed targets. If a forge sub-agent processes 5 targets and all 5 fail (adding to execution.failedTargets instead), len(newCompletedTargets) stays unchanged. Two such iterations → stallCount == maxStalls (2) → pipeline aborts with "Forge stalled."

The pipeline aborts even though 10 targets were attempted and failed — the agent was not stalled, it was failing. The abort message ("Forge stalled") is therefore misleading, and the user loses the chance to manually fix the remaining targets.

The fix from the previous review cycle (tracking both completedPhases and completedTargets) already handles the case where a phase spans multiple sub-agent invocations. It doesn't handle the case where every target fails:

# Count total targets processed (succeeded + failed) to distinguish true stalls from all-fail phases
newProcessedTargets = len(newCompletedTargets) + len(newFailedTargets)
previousProcessedTargets = len(previousCompletedTargets) + len(previousFailedTargets)

if newCompletedPhases == previousCompletedPhases and newProcessedTargets == previousProcessedTargets:
    stallCount += 1
    Print: "WARNING: Forge iteration <iteration> made no progress…"
else:
    stallCount = 0

The same fix is needed in docs/examples/claude-code-skills/titan-run/SKILL.md.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — stall detection now counts total processed targets (completed + failed) instead of only completed targets. If a forge sub-agent attempts 5 targets and all 5 fail, that's progress (not a stall). Also initialized previousFailedTargets at loop start and tracks it across iterations. Applied to both .claude/skills/ and docs/examples/ copies.

Comment on lines +207 to +212
10. **Run tests** — detect the project's test command from `package.json` (same detection as gate Step 4):
```bash
testCmd=$(node -e "const p=require('./package.json');const s=p.scripts||{};const cmd=s.test?'npm test':s['test:ci']?'npm run test:ci':null;console.log(cmd||'NO_TEST_SCRIPT');")
```
- If `testCmd == "NO_TEST_SCRIPT"` → skip pre-gate test run (no test script configured).
- Otherwise:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 testCmd detection doesn't handle yarn/pnpm workspaces where npm test may not be available

The detection script:

testCmd=$(node -e "const p=require('./package.json');const s=p.scripts||{};const cmd=s.test?'npm test':s['test:ci']?'npm run test:ci':null;console.log(cmd||'NO_TEST_SCRIPT');")

always returns npm test or npm run test:ci regardless of whether the project uses yarn, pnpm, or bun. On a pnpm workspaces project where npm is not available (e.g., CI image installs only pnpm), npm test would fail with command not found, triggering a rollback as though tests failed — even though the test suite was never run.

Gate Step 4 avoids this problem by only detecting the SCRIPT NAME and letting the agent choose the runner. Consider aligning the detection:

testCmd=$(node -e "
  const p = require('./package.json');
  const s = p.scripts || {};
  const script = s.test ? 'test' : s['test:ci'] ? 'test:ci' : null;
  if (!script) { console.log('NO_TEST_SCRIPT'); process.exit(0); }
  // Detect runner from lock file
  const fs = require('fs');
  const runner = fs.existsSync('yarn.lock') ? 'yarn' : fs.existsSync('pnpm-lock.yaml') ? 'pnpm' : fs.existsSync('bun.lockb') ? 'bun' : 'npm';
  console.log(runner + (script === 'test' ? ' test' : ' run ' + script));
")

The same applies to titan-run/SKILL.md V13 and both docs/examples/ mirrors.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — test command detection now detects the package manager from lock files (yarn.lock, pnpm-lock.yaml, bun.lockb) and uses the appropriate runner instead of always defaulting to npm. Applied to both forge Step 10 and titan-run V13, and their docs/examples mirrors.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

Comment on lines +107 to +108
# Detect test command from package.json scripts (npm test, yarn test, pnpm test, etc.)
<test-runner> test 2>&1 || echo "TEST_FAILED"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 <test-runner> placeholder not resolvable — test step will always error

The PR changed npm test to <test-runner> test with a comment, but no detection block was added to resolve the placeholder. A shell will interpret <test-runner> as a redirect-from-file operator (or simply fail with "command not found"), not as the test runner. Every gate invocation will produce a TEST_FAILED output due to the malformed command, even on projects with passing tests — which then triggers the auto-rollback (Step 9) on every commit.

Forge Step 10 and titan-run V13 were both fixed with an explicit node -e detection block in prior review cycles. Gate Step 4 needs the same treatment:

Suggested change
# Detect test command from package.json scripts (npm test, yarn test, pnpm test, etc.)
<test-runner> test 2>&1 || echo "TEST_FAILED"
# Detect test command from package.json scripts
testCmd=$(node -e "const p=require('./package.json');const s=p.scripts||{};const script=s.test?'test':s['test:ci']?'test:ci':null;if(!script){console.log('NO_TEST_SCRIPT');process.exit(0);}const fs=require('fs');const runner=fs.existsSync('yarn.lock')?'yarn':fs.existsSync('pnpm-lock.yaml')?'pnpm':fs.existsSync('bun.lockb')?'bun':'npm';console.log(runner+(script==='test'?' test':' run '+script));")
if [ "$testCmd" != "NO_TEST_SCRIPT" ]; then
$testCmd 2>&1 || echo "TEST_FAILED"
fi

The same fix is needed at docs/examples/claude-code-skills/titan-gate/SKILL.md line 108.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — replaced the unresolvable <test-runner> placeholder with the same lock-file-based detection block used in forge Step 10 and titan-run V13. Gate Step 4 now detects the package manager from lock files (yarn.lock, pnpm-lock.yaml, bun.lockb) and skips tests if no test script is configured. Applied to both .claude/skills/ and docs/examples/ copies.

Comment on lines +200 to +203
```bash
git reset HEAD <changed files>
git checkout -- <changed files>
```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 DIFF FAIL rollback uses unresolved <changed files> placeholder — staged changes may persist

Step 9's DIFF FAIL rollback uses the static placeholder <changed files>:

git reset HEAD <changed files>
git checkout -- <changed files>

Unlike Step 13, which was fixed in a prior review cycle to use dynamic discovery ($(git diff --cached --name-only) and $(git diff --name-only)), this path carries no equivalent pattern. If an agent fails to substitute <changed files> with the actual file list, git reset HEAD and git checkout -- receive no arguments — the staged changes from Step 8 remain in the index and the working tree is left dirty. The next target's git add will then silently bundle the previous target's uncommitted, DIFF-FAILed changes into the next commit.

Apply the same portable pattern that Step 13 already uses:

Suggested change
```bash
git reset HEAD <changed files>
git checkout -- <changed files>
```
**On DIFF FAIL:**
```bash
git reset HEAD -- $(git diff --cached --name-only)
git checkout -- $(git diff --name-only)

The same fix is needed at `docs/examples/claude-code-skills/titan-forge/SKILL.md` lines 200–203.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — replaced the static <changed files> placeholder with dynamic file discovery using $(git diff --cached --name-only) and $(git diff --name-only), matching the portable pattern already used in Step 13. Applied to both .claude/skills/ and docs/examples/ copies.

carlos-alm added a commit that referenced this pull request Mar 24, 2026
carlos-alm added a commit that referenced this pull request Mar 24, 2026
@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm carlos-alm force-pushed the feat/release-skill-auto-semver branch from 3635edf to 317f79c Compare March 24, 2026 22:17
Comment on lines +315 to +336
@@ -228,7 +332,8 @@ Aggregate all checks:

> "GATE FAIL: [reason]. Graph restored, changes unstaged but preserved. Fix and re-stage."

For structural-only failures (Steps 2-4, 6-8), do NOT auto-rollback — report and let user decide.
For structural and semantic failures (Steps 1, 3, 5, 5.5, 6-8), do NOT auto-rollback — report the FAIL and let the user decide whether to fix in place or unstage manually.
For Step 2 (new cycle) and Step 4 (lint/build/test), trigger auto-rollback.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Step 1's --cycles flag conflicts with the no-rollback exception

Step 1 is listed in the no-rollback exception: "For structural and semantic failures (Steps 1, 3, 5, 5.5, 6-8), do NOT auto-rollback". However, Step 1 runs codegraph check --staged --cycles --blast-radius 30 --boundaries — the --cycles flag means Step 1 can produce a cycle FAIL verdict.

If an AI agent stops at Step 1 due to a cycle failure (which is legitimate — agents may short-circuit on FAIL), the no-rollback rule applies, leaving staged changes in the index. Step 2's auto-rollback trigger is never reached.

This contradicts the explicit rule two lines below: "For Step 2 (new cycle) → trigger auto-rollback." A cycle caught by Step 1 and a cycle caught by Step 2 are the same violation but would produce opposite rollback behavior.

The safe fix is to explicitly carve out the cycle sub-check from Step 1's no-rollback exception:

Suggested change
For structural and semantic failures (Steps 1 [manifesto/blast-radius/boundary only — not cycles], 3, 5, 5.5, 6-8), do NOT auto-rollback — report the FAIL and let the user decide whether to fix in place or unstage manually.
For Step 2 (new cycle), Step 1 cycle violations (from `--cycles` flag), and Step 4 (lint/build/test), trigger auto-rollback.

The same clarification is needed in docs/examples/claude-code-skills/titan-gate/SKILL.md at the corresponding lines.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — carved out cycle sub-check from Step 1's no-rollback exception. The exception now reads "Steps 1 [manifesto/blast-radius/boundary only — not cycles]" and the auto-rollback rule explicitly includes "Step 1 cycle violations (from --cycles flag)" alongside Step 2 and Step 4. Applied to both .claude/skills/ and docs/examples/ copies.

Comment on lines +220 to +223
11. **Run /titan-gate:**
Use the Skill tool to invoke `titan-gate`.
- If FAIL on **cycle/test/lint/build** (gate auto-rolls back staged changes, gate Step 2/4) → go to rollback (step 13) to also revert working tree.
- If FAIL on **any other check** — complexity (Step 3), semantic (Step 5), structural/arch (Steps 1, 5.5, 6-8) — gate does NOT auto-rollback its staging area; forge must clean up for the next target → unstage with `git reset HEAD -- $(git diff --cached --name-only) && git checkout -- $(git diff --name-only)`, add to `execution.failedTargets` with reason, log the gate report, and continue to the next target. Do NOT go to step 13 — that step is for Step 2/4 failures where gate already unstaged; going there again would attempt a duplicate rollback.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Forge Step 11 cycle routing gap mirrors gate's Step 1 ambiguity

Forge Step 11 routes based on the gate failure type: "cycle/test/lint/build" → Step 13 rollback, "any other check" → manual cleanup.

However, gate Step 1 (codegraph check --staged --cycles) can now produce a cycle FAIL that is classified as "structural failure (Step 1)" by gate's own no-rollback exception. When gate reports this failure, forge receives a FAIL report with "Step 1" framing — which forge routes to the "any other check" (manual cleanup) path, not to the cycle rollback path.

A cycle violation should always trigger Step 13 rollback. The routing condition in forge should explicitly include cycle violations from Step 1:

Suggested change
11. **Run /titan-gate:**
Use the Skill tool to invoke `titan-gate`.
- If FAIL on **cycle/test/lint/build** (gate auto-rolls back staged changes, gate Step 2/4) → go to rollback (step 13) to also revert working tree.
- If FAIL on **any other check** — complexity (Step 3), semantic (Step 5), structural/arch (Steps 1, 5.5, 6-8) — gate does NOT auto-rollback its staging area; forge must clean up for the next target → unstage with `git reset HEAD -- $(git diff --cached --name-only) && git checkout -- $(git diff --name-only)`, add to `execution.failedTargets` with reason, log the gate report, and continue to the next target. Do NOT go to step 13 — that step is for Step 2/4 failures where gate already unstaged; going there again would attempt a duplicate rollback.
- If FAIL on **cycle/test/lint/build** (gate auto-rolls back staged changes on Step 2/4 cycle; Step 1 may also report cycle violation via `--cycles` flag — treat that the same as Step 2) → go to rollback (step 13) to also revert working tree.

The same fix is needed in docs/examples/claude-code-skills/titan-forge/SKILL.md.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — forge Step 11 now explicitly notes that Step 1 may also report cycle violations via the --cycles flag and that those should be treated the same as Step 2 cycle failures (routed to Step 13 rollback). Applied to both .claude/skills/ and docs/examples/ copies.

git log --oneline -5
# Record the HEAD sha before dispatching

headBefore = $(git rev-parse HEAD)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Shell command substitution inside pseudocode block not consistently marked

headBefore = $(git rev-parse HEAD) uses bash command substitution syntax inside an indented pseudocode block that is otherwise described in prose. Line 510 correctly shows headAfter = $(git rev-parse HEAD) as a similar operation. However, the git log --oneline -5 on line 478 is inside the same pseudocode block and lacks any output-capture syntax — it reads as a fire-and-forget command with no assignment, which means an agent executing this literally would run git log and discard its output, providing no context for V12.

If the intent is for the output to be printed (context-gathering), a Print: prefix like the one used elsewhere in this file would make the intent explicit:

Suggested change
headBefore = $(git rev-parse HEAD)
# Context: show recent commits so agent has context for V12 commit audit
Print: git log --oneline -5
# Record the HEAD sha before dispatching

This pattern is already used elsewhere in the file (e.g., line 539: Print: "Forge phase..."). Same in docs/examples/claude-code-skills/titan-run/SKILL.md.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added Print: prefix to the git log --oneline -5 line so the intent (display recent commits for V12 context) is explicit and consistent with the pattern used elsewhere in the file. Applied to both .claude/skills/ and docs/examples/ copies.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm
Copy link
Copy Markdown
Contributor Author

Addressed the 2 remaining items from the Greptile summary:

  1. Lint/build commands hardcoded to npm run (P2) — Replaced hardcoded npm run lint and npm run build in gate Step 4 with package-manager-aware detection (reads yarn.lock, pnpm-lock.yaml, bun.lockb), matching the pattern already used for test commands. Skips gracefully if no lint/build script exists. Applied to both .claude/skills/ and docs/examples/ copies.

  2. Stale rollback description in forge Rules (P2) — Updated the Rules bullet to include both the git reset HEAD unstage step and the git checkout -- working tree revert, matching Step 13's actual behavior. Applied to both copies.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant