feat: /titan-run orchestrator with diff review, semantic assertions, arch snapshots#557
feat: /titan-run orchestrator with diff review, semantic assertions, arch snapshots#557carlos-alm merged 113 commits intomainfrom
Conversation
…in backlog These two items deliver the highest immediate impact on agent experience and graph accuracy without requiring Rust porting or TypeScript migration. They should be implemented before any Phase 4+ roadmap work. - #83: hook-optimized `codegraph brief` enriches passively-injected context - #71: basic type inference closes the biggest resolution gap for TS/Java
Impact: 14 functions changed, 0 affected
Add new Phase 4 covering the port of JS-only build phases to Rust: - 4.1-4.3: AST nodes, CFG, dataflow visitor ports (~587ms savings) - 4.4: Batch SQLite inserts (~143ms) - 4.5: Role classification & structure (~42ms) - 4.6: Complete complexity pre-computation - 4.7: Fix incremental rebuild data loss on native engine - 4.8: Incremental rebuild performance (target sub-100ms) Bump old Phases 4-10 to 5-11 with all cross-references updated. Benchmark evidence shows ~50% of native build time is spent in JS visitors that run identically on both engines.
Take main's corrected #57 section anchors; keep HEAD's v2.7.0 version reference. Impact: 10 functions changed, 11 affected
…ative-acceleration Impact: 25 functions changed, 46 affected
- Add COMMITS=0 guard in publish.yml to return clean version when HEAD is exactly at a tag (mirrors bench-version.js early return) - Change bench-version.js to use PATCH+1-dev.COMMITS format instead of PATCH+COMMITS-dev.SHA (mirrors publish.yml's new scheme) - Fix fallback in bench-version.js to use dev.1 matching publish.yml's no-tags COMMITS=1 default Impact: 1 functions changed, 0 affected
The release skill now scans commit history using conventional commit rules to determine major/minor/patch automatically. Explicit version argument still works as before.
…ns, and architectural snapshot Add /titan-run skill that dispatches the full Titan pipeline (recon → gauntlet → sync → forge) to sub-agents with fresh context windows, enabling end-to-end autonomous execution. Hardening layers added across the pipeline: - Pre-Agent Gate (G1-G4): git health, worktree validity, state integrity, backups - Post-phase validation (V1-V15): artifact structure, coverage, consistency checks - Stall detection with per-phase thresholds and no-progress abort - Mandatory human checkpoint before forge (unless --yes) New validation tools integrated into forge and gate: - Diff Review Agent (forge Step 9): verifies each diff matches the gauntlet recommendation and sync plan intent before gate runs - Semantic Assertions (gate Step 5): export signature stability, import resolution integrity, dependency direction, re-export chain validation - Architectural Snapshot Comparator (gate Step 5.5): community stability, cross-domain dependency direction, cohesion delta, drift detection vs pre-forge baseline
Greptile SummaryThis PR adds three major capabilities to the Titan Paradigm pipeline:
The PR includes an extensive series of fixes (50+ reviewer threads addressed) and is remarkably thorough — covering shell portability, temp-file lifecycle, stale variable persistence, off-by-one array accesses, non-deterministic Louvain IDs, and many edge cases. Two remaining issues were found:
Both issues also apply to the Confidence Score: 4/5
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["/titan-run"] --> B[Step 0: Pre-flight\nArgs, worktree, main sync]
B --> B1[Step 0.5: Artifact pre-validation\nif --start-from or --skip-*]
B1 --> C[Step 1: RECON sub-agent\nV1-V4 post-validation]
C --> D[Step 2: GAUNTLET loop\nSub-agents until complete\nV5-V7 + NDJSON integrity]
D --> E[Step 3: SYNC sub-agent\nV8-V10 post-validation]
E --> F[Step 3.5: Pre-forge\nArch snapshot + Human checkpoint]
F --> G[Step 4: FORGE loop\nG1-G4 pre-agent gate each iteration]
G --> H{Forge sub-agent}
H --> I[Step 8: Stage files\ngit add]
I --> J[Step 9: Diff Review\nD1 scope, D2 intent, D3 msg,\nD4 deletion, D5 leftover]
J -->|DIFF FAIL| K[Rollback + continue\nadd to failedTargets]
J -->|DIFF PASS/WARN| L[Step 10: Run tests\nPackage-manager aware]
L -->|fail| K
L -->|pass| M[Step 11: /titan-gate\nStep 5 Semantic + Step 5.5 Arch]
M -->|cycle/test/lint/build FAIL| N[Step 13: Rollback\ngit reset + checkout]
M -->|structural/semantic FAIL| O[Manual cleanup + continue]
M -->|PASS| P[Step 12: Commit\nRecord SHA + diffWarnings]
G --> Q[V11-V13 post-agent checks\nCommit audit + test suite]
Q --> R{All phases done?}
R -->|no| G
R -->|yes| S[V14-V15 Final validation\n+ Step 5 Report]
|
|
|
||
| if currentAuditedCount == previousAuditedCount: | ||
| stallCount += 1 | ||
| Print: "WARNING: Gauntlet iteration <iteration> made no progress (stall <stallCount>/<maxStalls>)" | ||
| if stallCount >= maxStalls: | ||
| Stop: "Gauntlet stalled for <maxStalls> consecutive iterations at <currentAuditedCount>/<expectedTargetCount> targets. Likely stuck on a problematic target. Check gauntlet.ndjson for the last successful entry and investigate the next target in the batch." |
There was a problem hiding this comment.
Undefined variable
previousAuditedCountBeforeAgent in gauntlet efficiency check
The efficiency check references previousAuditedCountBeforeAgent, which is never defined in the pseudocode. By the time this line is reached, previousAuditedCount has already been updated to currentAuditedCount on the line just above, so using it there would always yield 0.
To correctly compute how many targets this iteration processed, you need to capture the pre-agent count before the update. For example:
previousAuditedCount = currentAuditedCount # update for next iteration's stall check
# Efficiency check: if progress is very slow (< 2 targets per iteration), warn
targetsThisIteration = currentAuditedCount - previousAuditedCountBeforeAgent # ← undefined
Should be:
# Save count before update for efficiency check
countBeforeUpdate = previousAuditedCount
previousAuditedCount = currentAuditedCount # update for next iteration's stall check
# Efficiency check
targetsThisIteration = currentAuditedCount - countBeforeUpdate
if targetsThisIteration == 1 and iteration > 3:
Print: "WARNING: Only 1 target per iteration..."
This same issue exists in the identical copy at docs/examples/claude-code-skills/titan-run/SKILL.md at the same line.
There was a problem hiding this comment.
Fixed — saved pre-update count to countBeforeUpdate before reassignment. Applied to both .claude/skills/ and docs/examples/ copies.
Impact: 1662 functions changed, 760 affected
|
Resolved merge conflicts with main and fixed the undefined previousAuditedCountBeforeAgent variable in the gauntlet loop efficiency check. |
| If any SHA doesn't exist → **WARN**: "Commit <sha> recorded in state but not found in git history. State may be out of sync." | ||
|
|
||
| **V15. Gate log consistency:** | ||
| If `.codegraph/titan/gate-log.ndjson` exists: |
There was a problem hiding this comment.
Incomplete merge conflict detection — misses
AU, UA, DU, UD markers
The check only lists UU, AA, DD as conflict indicators in git status --porcelain output. Git also produces AU (added by us, modified by them), UA (modified by us, added by them), DU (deleted by us, modified by them), and UD (modified by us, deleted by them) in the porcelain format for merge conflicts. Missing these cases could allow the pipeline to proceed with unresolved conflicts into a forge sub-agent that commits.
| If `.codegraph/titan/gate-log.ndjson` exists: | |
| - **Merge conflicts** (lines starting with `UU`, `AA`, `DD`, `AU`, `UA`, `DU`, `UD`): Stop immediately: "Unresolved merge conflict detected. Resolve before continuing." |
There was a problem hiding this comment.
Fixed — added AU, UA, DU, UD to the merge conflict detection markers in both .claude/skills/ and docs/examples/ copies.
| Gate now includes semantic assertions (Step 5) and architectural snapshot comparison (Step 5.5). | ||
| The arch snapshot is at .codegraph/titan/arch-snapshot.json. | ||
| 6. Commit on success, rollback on failure | ||
|
|
There was a problem hiding this comment.
--start-from forge silently disables architectural comparison without warning
When using --start-from forge, Step 3.5a (architectural snapshot capture) is skipped entirely since it lives inside Step 3.5, which comes after sync. The gate's Step 5.5 gracefully handles a missing arch-snapshot.json by skipping the comparison, but the user receives no warning that the architectural drift check will be inactive for the entire forge run.
Consider adding an explicit notice at the start of the forge loop when arch-snapshot.json is absent and the start phase was forge:
if startPhase == "forge" and not exists(".codegraph/titan/arch-snapshot.json"):
Print: "NOTE: No arch-snapshot.json found. Architectural comparison in /titan-gate (Step 5.5) will be skipped for this run.
To enable it, run '/titan-run --start-from sync' to re-capture the pre-forge snapshot."
There was a problem hiding this comment.
Fixed — added a pre-loop check in Step 4a that warns when arch-snapshot.json is missing, advising the user to run --start-from sync to re-capture it.
- Fix undefined previousAuditedCountBeforeAgent variable in gauntlet efficiency check (save pre-update count before reassignment) - Add AU, UA, DU, UD to merge conflict detection markers - Add warning when --start-from forge runs without arch-snapshot.json
- Fix undefined previousAuditedCountBeforeAgent variable in gauntlet efficiency check (save pre-update count before reassignment) - Add AU, UA, DU, UD to merge conflict detection markers - Add warning when --start-from forge runs without arch-snapshot.json
.claude/skills/titan-forge/SKILL.md
Outdated
| 13. **On failure (test, gate, or diff-review):** | ||
| ```bash | ||
| git checkout -- <changed files> |
There was a problem hiding this comment.
Incomplete rollback — staged files not unstaged in Step 13
The new flow stages files in Step 8 (git add) before running tests (Step 10) and gate (Step 11). When a test or gate failure routes to Step 13, only the working-tree restoration is performed:
git checkout -- <changed files>This restores the file contents on disk but does not remove the files from the staging area. The changes remain indexed in git. If the orchestrator loops back and the next target's git add picks up different files, the leftover staged changes from the failed target will be silently included in the next commit.
The rollback should unstage first, then restore the working tree:
| 13. **On failure (test, gate, or diff-review):** | |
| ```bash | |
| git checkout -- <changed files> | |
| 13. **On failure (test, gate, or diff-review):** | |
| ```bash | |
| git reset HEAD <changed files> | |
| git checkout -- <changed files> | |
| ``` |
Note: the DIFF FAIL path in Step 9 already says "Unstage and revert changes" correctly — this gap only affects Step 13. The same fix is needed in the docs mirror at docs/examples/claude-code-skills/titan-forge/SKILL.md.
There was a problem hiding this comment.
Fixed — added unstaging step before working tree restore in Step 13 rollback, so staged files are properly cleared before checkout. Applied to both .claude/skills/ and docs/examples/ copies.
Both commands are called in titan-forge's new diff review step (Step 9): fn-impact for deletion audit (D4) and exports for re-export chain checks.
|
Addressed the remaining open item from the summary: README command table (P2): Restored The |
| | `codegraph exports` | GAUNTLET | Per-symbol export consumers | | ||
| | `codegraph fn-impact` | GAUNTLET, SYNC | Blast radius | | ||
| | `codegraph search` | GAUNTLET | Duplicate code detection (needs embeddings) | |
There was a problem hiding this comment.
Command table stale after new diff-review and semantic-assertion additions
Three commands were removed from FORGE/GATE in the table, but the new steps introduced in this PR use them:
-
codegraph fn-impact— listed as "GAUNTLET, SYNC" only, but is still called in titan-forge Step 4 (pre-touch understanding) and in the new Step 9 D4 (deletion audit). It's also now used in titan-gate Step 5a. All three users (FORGE, GATE) are missing. -
codegraph context— listed as "SYNC" only, but titan-forge Step 4 still callscodegraph context <target> -T --json(this code is unchanged and confirmed in the current file). FORGE should be restored. -
codegraph exports— listed as "GAUNTLET" only, but the new titan-gate Step 5a explicitly runscodegraph exports <changed-file> -T --json. GATE should be added.
The same table appears in .claude/skills/ (mirrored) so the same correction applies there.
Suggested correction:
| | `codegraph exports` | GAUNTLET | Per-symbol export consumers | | |
| | `codegraph fn-impact` | GAUNTLET, SYNC | Blast radius | | |
| | `codegraph search` | GAUNTLET | Duplicate code detection (needs embeddings) | | |
| | `codegraph exports` | GAUNTLET, GATE | Per-symbol export consumers | | |
| | `codegraph fn-impact` | GAUNTLET, SYNC, FORGE, GATE | Blast radius | |
And separately restore FORGE to the codegraph context row:
| `codegraph context` | SYNC, FORGE | Full function context |
There was a problem hiding this comment.
Fixed — added GATE to exports and fn-impact rows, added FORGE to context row in the command table. Verified against actual usage in titan-gate Step 5a and titan-forge Steps 4 and 9.
There was a problem hiding this comment.
Fixed — added GATE to codegraph communities, codegraph structure, and codegraph deps in the command table. These are all used by GATE Step 5.5 (architectural snapshot comparison) but were missing from the "Used by" column.
| 5. Run /titan-gate — read .claude/skills/titan-gate/SKILL.md and follow it exactly. | ||
| Gate now includes semantic assertions (Step 5) and architectural snapshot comparison (Step 5.5). | ||
| The arch snapshot is at .codegraph/titan/arch-snapshot.json. | ||
| 6. Commit on success, rollback on failure |
There was a problem hiding this comment.
--yes description inaccurate — forge no longer accepts this flag
The argument hint says --yes is "passed through to forge", but this PR removes --yes from titan-forge's supported arguments list (the --yes → skip confirmation prompt line was deleted from titan-forge). The flag is now passed to forge's dispatch invocation (line ~835 of this file: yesFlag = "--yes" if autoConfirm else ""), but forge silently ignores it since it no longer handles that argument.
--yes still has meaningful effects within the orchestrator itself (auto-resume on existing state, skip the pre-pipeline and forge-checkpoint confirmations), so it should stay — but the description should be updated to reflect its actual scope.
| 6. Commit on success, rollback on failure | |
| - `--yes` → skip all confirmation prompts in the orchestrator (pre-pipeline, forge checkpoint, and resume prompts) |
There was a problem hiding this comment.
Fixed — updated the --yes description to reflect its actual scope: orchestrator-level prompts only (pre-pipeline, forge checkpoint, and resume). Applied to both .claude/skills/ and docs/examples/ copies.
There was a problem hiding this comment.
Fixed — updated the Rules section description of --yes to clarify both purposes: (1) passed through to forge so it skips its per-phase confirmation prompt, and (2) within the orchestrator itself, skips pre-pipeline, forge checkpoint, and resume prompts. Updated in both .claude/skills/titan-run/SKILL.md and the example copy.
.claude/skills/titan-gate/SKILL.md
Outdated
| For structural-only and semantic **advisory** checks (Steps 1, 3, 5, 5.5, 6-8), do NOT auto-rollback — report and let user decide. | ||
| For Step 2 (new cycle) and Step 4 (lint/build/test), trigger auto-rollback. |
There was a problem hiding this comment.
"Advisory" label misleading for semantic/arch FAILs
Steps 5 (semantic assertions) and 5.5 (architectural snapshot) can produce hard FAILs — e.g., "Removed export parseConfig still imported by 3 files" or "Significant community restructuring detected." These are gating failures, not suggestions. However, the label "advisory" in this line groups them with genuinely informational checks like Step 7 (sync alignment) and Step 6 (branch structural diff), which could cause an AI agent to treat semantic FAILs with less urgency than intended.
The "advisory" wording describes the rollback behaviour (preserve staged changes for in-place fix), but an agent reading this may interpret it as "severity advisory" and downgrade the reported FAIL to a WARN.
Consider changing the label to something that distinguishes rollback behavior from verdict severity:
| For structural-only and semantic **advisory** checks (Steps 1, 3, 5, 5.5, 6-8), do NOT auto-rollback — report and let user decide. | |
| For Step 2 (new cycle) and Step 4 (lint/build/test), trigger auto-rollback. | |
| For structural and semantic failures (Steps 1, 3, 5, 5.5, 6-8), do NOT auto-rollback — report the FAIL and let the user decide whether to fix in place or unstage manually. |
There was a problem hiding this comment.
Fixed — replaced the misleading "advisory" label with explicit "structural and semantic failures" wording. The line now reads: "For structural and semantic failures (Steps 1, 3, 5, 5.5, 6-8), do NOT auto-rollback — report the FAIL and let the user decide whether to fix in place or unstage manually." This makes clear these are gating failures, not suggestions. Applied to both .claude/skills/ and docs/examples/ copies.
.claude/skills/titan-forge/SKILL.md
Outdated
| - If FAIL on **cycle/test/lint/build** (gate auto-rolls back staged changes) → go to rollback (step 13) to also revert working tree. | ||
| - If FAIL on **semantic/structural** (gate does not auto-rollback its staging area, but forge must clean up for the next target) → unstage with `git reset HEAD <files> && git checkout -- <files>`, add to `execution.failedTargets` with reason, log the gate report, and continue to the next target. Do NOT go to step 13 — that step is for cycle/test/gate failures where gate already unstaged; going there again would attempt a duplicate rollback. |
There was a problem hiding this comment.
Complexity FAILs (gate Step 3) not covered by either named path
The two forge paths are "cycle/test/lint/build" (→ step 13) and "semantic/structural" (→ manual cleanup). Gate Step 3 (complexity delta) can produce hard FAILs — cognitive > 30, halstead.bugs > 1.0, mi < 20 — but it belongs to neither label.
An AI agent reading this must infer "complexity FAIL → not cycle/test/lint/build → therefore semantic/structural path." That deduction works because of the "(gate auto-rolls back staged changes)" hint in the cycle/test path, but an agent that pattern-matches on the label "semantic/structural" may not classify complexity failures there and route them incorrectly.
The comment "Do NOT go to step 13 — that step is for cycle/test/gate failures where gate already unstaged" also uses "gate" ambiguously — it could mean "any gate failure" rather than just the auto-rollback subset.
Consider being explicit:
| - If FAIL on **cycle/test/lint/build** (gate auto-rolls back staged changes) → go to rollback (step 13) to also revert working tree. | |
| - If FAIL on **semantic/structural** (gate does not auto-rollback its staging area, but forge must clean up for the next target) → unstage with `git reset HEAD <files> && git checkout -- <files>`, add to `execution.failedTargets` with reason, log the gate report, and continue to the next target. Do NOT go to step 13 — that step is for cycle/test/gate failures where gate already unstaged; going there again would attempt a duplicate rollback. | |
| - If FAIL on **cycle/test/lint/build** (gate auto-rolls back staged changes, gate Step 2/4) → go to rollback (step 13) to also revert working tree. | |
| - If FAIL on **any other check** — complexity (Step 3), semantic (Step 5), structural/arch (Steps 1, 5.5, 6-8) — gate does NOT auto-rollback its staging area; forge must clean up for the next target → unstage with `git reset HEAD <files> && git checkout -- <files>`, add to `execution.failedTargets` with reason, log the gate report, and continue to the next target. Do NOT go to step 13 — that step is for Step 2/4 failures where gate already unstaged; going there again would attempt a duplicate rollback. |
The same pattern is present in docs/examples/claude-code-skills/titan-forge/SKILL.md at the corresponding line.
There was a problem hiding this comment.
Fixed — replaced the "semantic/structural" catch-all with an explicit "any other check" listing that names all non-rollback failure types: complexity (Step 3), semantic (Step 5), structural/arch (Steps 1, 5.5, 6-8). Also clarified the "Do NOT go to step 13" reason with "Step 2/4 failures" instead of the ambiguous "cycle/test/gate failures." The rollback command now uses git diff --cached --name-only and git diff --name-only for portable file discovery instead of relying on <files> placeholders. Applied to both .claude/skills/ and docs/examples/ copies.
.claude/skills/titan-run/SKILL.md
Outdated
| git fetch origin main 2>/dev/null || true | ||
| mergeBase=$(git merge-base HEAD origin/main) | ||
| mainAdvance=$(git rev-list --count $mergeBase..origin/main) |
There was a problem hiding this comment.
Stale
origin/main silently produces wrong divergence count on fetch failure
The 2>/dev/null || true makes git fetch non-fatal, but git merge-base HEAD origin/main on the next line still uses origin/main — which may now point to an arbitrarily old commit from the last successful fetch (days or weeks ago). The result of git rev-list --count $mergeBase..origin/main would then be a completely wrong number (could be hundreds or thousands of commits), potentially alarming users into re-running the full recon when the codebase hasn't actually diverged much.
There's no signal to the user that the divergence count is based on stale data.
Consider checking whether the fetch succeeded and annotating the warning accordingly:
if git fetch origin main 2>/dev/null; then
mergeBase=$(git merge-base HEAD origin/main)
mainAdvance=$(git rev-list --count $mergeBase..origin/main)
# warn if mainAdvance > 0
else
Print: "NOTE: Could not fetch origin/main — skipping divergence check."
fiThe same applies to docs/examples/claude-code-skills/titan-run/SKILL.md at the corresponding lines.
There was a problem hiding this comment.
Fixed — the divergence check now detects fetch failure instead of silently falling through with stale data. The structure is now:
if git fetch origin main 2>/dev/null; then
mergeBase=$(git merge-base HEAD origin/main)
mainAdvance=$(git rev-list --count $mergeBase..origin/main)
else
mainAdvance="unknown"
fiWhen fetch fails, the checkpoint prints "NOTE: Could not fetch origin/main — skipping divergence check." instead of a potentially misleading commit count. Applied to both .claude/skills/ and docs/examples/ copies.
…dle stale fetch (#557) - Gate Step 9: replace misleading "advisory" label with explicit "structural and semantic failures" wording - Forge Step 11: replace "semantic/structural" catch-all with explicit "any other check" listing complexity (Step 3), semantic (Step 5), structural/arch (Steps 1, 5.5, 6-8) - Run Step 3.5b: divergence check now detects fetch failure and skips instead of computing against stale origin/main
- Gate Step 9: replace misleading "advisory" label with explicit "structural and semantic failures" wording - Forge Step 11: replace "semantic/structural" catch-all with explicit listing of all non-rollback failure types - Run Step 3.5b: divergence check now detects fetch failure and skips instead of computing against stale origin/main
5b16ba8 to
c83683f
Compare
.claude/skills/titan-run/SKILL.md
Outdated
| **Divergence check** — before asking for confirmation, check how far main has advanced since the initial sync: | ||
| ```bash | ||
| if git fetch origin main 2>/dev/null; then | ||
| mergeBase=$(git merge-base HEAD origin/main) | ||
| mainAdvance=$(git rev-list --count $mergeBase..origin/main) | ||
| else | ||
| mainAdvance="unknown" | ||
| fi | ||
| ``` | ||
| If `mainAdvance == "unknown"`, append to the checkpoint output: | ||
| ``` | ||
| NOTE: Could not fetch origin/main — skipping divergence check. | ||
| ``` | ||
| If `mainAdvance > 0` (numeric), append to the checkpoint output: | ||
| ``` | ||
| WARNING: main has advanced <mainAdvance> commits since initial sync. | ||
| If significant, consider re-running: /titan-run --start-from recon | ||
| ``` |
There was a problem hiding this comment.
Divergence check runs after the confirmation print block, causing agents to show
[y/n] before computing the warning
The **Divergence check** section says "before asking for confirmation, check how far main has advanced" but it is placed after the Print: block (lines 400–425), which already shows the full formatted checkpoint including Proceed with /titan-forge? [y/n] and the closing ================================================================. An AI agent executing these instructions sequentially will:
- Print the formatted checkpoint box (including the
[y/n]prompt) - Then run
git fetch origin mainand computemainAdvance - "Append" the divergence warning or note outside the formatted box
This means either the agent shows the confirmation prompt before presenting the divergence warning, or the warning appears as disconnected text below the closing border — both are confusing and undermine the intent of the divergence check.
The snapshotStatus fix from a prior thread set the correct precedent: gather data before the Print: block. The divergence check should follow the same pattern.
The fix is to move the git fetch computation before the Print: block, bind the result to a variable (e.g., mainAdvanceNote), and include it inline in the template:
### 3.5b. Human checkpoint
First, determine the snapshot status from Step 3.5a:snapshotStatus = file_exists('.codegraph/titan/arch-snapshot.json') ? "captured" : "FAILED — gate A1/A3/A4 will be skipped"
**Divergence check** — run before printing the checkpoint:
```bash
if git fetch origin main 2>/dev/null; then
mergeBase=$(git merge-base HEAD origin/main)
mainAdvance=$(git rev-list --count $mergeBase..origin/main)
else
mainAdvance="unknown"
fi
mainAdvanceNote = mainAdvance == "unknown"
? "NOTE: Could not fetch origin/main — skipping divergence check."
: mainAdvance > 0
? "WARNING: main has advanced <mainAdvance> commits since initial sync.\nIf significant, consider re-running: /titan-run --start-from recon"
: ""
Print:
================================================================
ANALYSIS COMPLETE — FORGE CHECKPOINT
================================================================
...
Architectural snapshot: <snapshotStatus>
<mainAdvanceNote>
Proceed with /titan-forge? [y/n]
================================================================
The same fix is needed in `docs/examples/claude-code-skills/titan-run/SKILL.md` at the corresponding lines.
There was a problem hiding this comment.
Fixed — moved the divergence check (git fetch + merge-base computation) before the Print block, binding the result to a mainAdvanceNote variable that is included inline in the checkpoint template. The [y/n] prompt now appears after the divergence warning, not before. Applied to both .claude/skills/ and docs/examples/ copies.
.claude/skills/titan-forge/SKILL.md
Outdated
| - Add target to `execution.completedTargets` | ||
| - Record any diff-review warnings by **appending** to `execution.diffWarnings` (if any). Each entry must follow this schema: | ||
| ```json | ||
| { "target": "<target-name>", "check": "D3|D5", "message": "<warning text>", "phase": N } |
There was a problem hiding this comment.
"check": "D3|D5" schema notation is ambiguous — agent may emit a literal pipe-separated value
The diffWarnings entry schema shows:
{ "target": "<target-name>", "check": "D3|D5", "message": "<warning text>", "phase": N }The D3|D5 notation is used as a type union (meaning "one of D3 or D5"), but it looks syntactically like the literal string value "D3|D5". An AI agent reading this schema might emit:
{ "target": "parseConfig", "check": "D3|D5", "message": "...", "phase": 2 }instead of the correct:
{ "target": "parseConfig", "check": "D3", "message": "...", "phase": 2 }Any downstream filtering of execution.diffWarnings by check type (e.g., counting only D5 warnings in the final report) would silently miss entries with the malformed "D3|D5" value.
| { "target": "<target-name>", "check": "D3|D5", "message": "<warning text>", "phase": N } | |
| ```json | |
| { "target": "<target-name>", "check": "<D3 or D5>", "message": "<warning text>", "phase": N } | |
| ``` |
The same ambiguity exists in docs/examples/claude-code-skills/titan-forge/SKILL.md at the corresponding line.
There was a problem hiding this comment.
Fixed — changed "check": "D3|D5" to "check": "<D3 or D5>" to make clear this is a type union, not a literal pipe-separated value. An agent reading the schema will now produce "D3" or "D5" as individual values. Applied to both .claude/skills/ and docs/examples/ copies.
.claude/skills/titan-gate/SKILL.md
Outdated
| For each **exported** symbol in changed files: | ||
| - Check if the symbol existed before this change: `git show HEAD:<file>` and compare function signatures | ||
| - If a function's **parameter list changed** (added required params, removed params, changed types): | ||
| ```bash | ||
| codegraph fn-impact <symbol> -T --json | ||
| ``` | ||
| Count callers. If callers > 0 and callers are NOT also staged → **FAIL**: "Signature change in `<symbol>` breaks <N> callers not updated in this commit: <caller list>" | ||
| - If an **export was removed entirely** and callers exist → **FAIL**: "Removed export `<symbol>` still imported by <N> files" |
There was a problem hiding this comment.
Step 5a silently no-ops for entirely new files
codegraph exports <changed-file> -T --json followed by git show HEAD:<file> fails with a fatal error for files that are new in this commit (they don't exist in HEAD). The step gives no guidance on this case, so an agent that hits the error may either:
- Abort the entire semantic assertion step, silently skipping 5b–5d as well.
- Treat the failure as "no prior state" and skip the caller check for that file — the desired behavior, but not documented.
Consider adding an explicit guard:
### 5a. Export signature stability
For each changed file from diff-impact (Step 1):
- First, check if the file existed before this commit:
```bash
git show HEAD:<changed-file> -- 2>/dev/null | head -1If the file is new (command fails or returns nothing), skip 5a for this file — new exports cannot break existing callers.
- Otherwise, run
codegraph exports <changed-file> -T --jsonand compare signatures against HEAD…
The same guard should be applied in the docs mirror at `docs/examples/claude-code-skills/titan-gate/SKILL.md`.
There was a problem hiding this comment.
Fixed — added an explicit new-file guard at the top of Step 5a. Before checking exported symbols, it now runs git show HEAD:<changed-file> -- 2>/dev/null | head -1 and skips the file if it's new (command fails). New exports cannot break existing callers. Applied to both .claude/skills/ and docs/examples/ copies.
.claude/skills/titan-gate/SKILL.md
Outdated
| ### 5c. Dependency direction assertions (codegraph boundary rules) | ||
|
|
||
| Check the Step 1 `codegraph check --staged --boundaries` results for boundary violations (already collected — do not re-run). This covers `.codegraphrc.json` onion-architecture rules and any custom boundary predicates. | ||
|
|
||
| For each boundary violation reported by `codegraph check`: | ||
| - New dependency that violates a configured boundary rule → **FAIL**: "New upward dependency: `<source>` → `<target>` violates layer boundary" | ||
|
|
||
| Additionally, from the diff-impact results already collected in Step 1, extract any **new** edges (imports that didn't exist before): | ||
| - New dependency on a module flagged in sync.json as "to be removed" or "to be split" → **WARN**: "New dependency on `<module>` which is scheduled for decomposition" | ||
|
|
||
| > **Note:** Step 5c relies exclusively on `codegraph check --boundaries` results. Domain-direction checks against `GLOBAL_ARCH.md` are handled by Step 5.5 A2 — do not duplicate them here. |
There was a problem hiding this comment.
Step 5c and A2 deduplication has no explicit tracking mechanism
A2 says to "skip any edge already flagged by Step 5c's codegraph check --boundaries results to avoid duplicate verdicts," and Step 5c's note says "Domain-direction checks against GLOBAL_ARCH.md are handled by Step 5.5 A2." However, Step 5c doesn't produce a named data structure of "flagged edges" — it emits FAIL verdicts in narrative form. An agent executing A2 later must somehow recall which specific (source, target) edges were already flagged by Step 5c.
Without an explicit de-duplication mechanism, an agent may:
- Re-flag edges already covered by Step 5c (duplicate FAIL messages in the gate report for the same edge).
- Incorrectly skip a valid A2 violation because it loosely resembles a Step 5c violation.
Consider making the deduplication explicit by naming the artifact:
**Step 5c** stores flagged edges in `step5cViolations` — a list of `{ source, target }` pairs.
**A2** then checks: for each new diff-impact edge violating domain direction — if `{ source, target }` is already in `step5cViolations`, skip it. Otherwise, emit FAIL.The same applies to docs/examples/claude-code-skills/titan-gate/SKILL.md at the corresponding lines.
There was a problem hiding this comment.
Fixed — Step 5c now explicitly builds a step5cViolations list of { source, target } pairs as it processes boundary violations. A2 then checks each new edge against this list: if the edge is already in step5cViolations, it skips it. This replaces the vague prose instruction with a concrete deduplication mechanism. Applied to both .claude/skills/ and docs/examples/ copies.
| newFailedTargets = execution.failedTargets (or []) | ||
|
|
||
| if newCompletedPhases == previousCompletedPhases and len(newCompletedTargets) == len(previousCompletedTargets): | ||
| stallCount += 1 | ||
| Print: "WARNING: Forge iteration <iteration> made no progress (stall <stallCount>/<maxStalls>)" | ||
| if stallCount >= maxStalls: | ||
| Stop: "Forge stalled on phase <nextPhase> for <maxStalls> consecutive iterations. Check titan-state.json → execution.failedTargets for details." |
There was a problem hiding this comment.
Forge stall fires on all-fail phases even though the agent made real progress
The stall condition is:
if newCompletedPhases == previousCompletedPhases and len(newCompletedTargets) == len(previousCompletedTargets):
stallCount += 1
execution.completedTargets only includes successfully committed targets. If a forge sub-agent processes 5 targets and all 5 fail (adding to execution.failedTargets instead), len(newCompletedTargets) stays unchanged. Two such iterations → stallCount == maxStalls (2) → pipeline aborts with "Forge stalled."
The pipeline aborts even though 10 targets were attempted and failed — the agent was not stalled, it was failing. The abort message ("Forge stalled") is therefore misleading, and the user loses the chance to manually fix the remaining targets.
The fix from the previous review cycle (tracking both completedPhases and completedTargets) already handles the case where a phase spans multiple sub-agent invocations. It doesn't handle the case where every target fails:
# Count total targets processed (succeeded + failed) to distinguish true stalls from all-fail phases
newProcessedTargets = len(newCompletedTargets) + len(newFailedTargets)
previousProcessedTargets = len(previousCompletedTargets) + len(previousFailedTargets)
if newCompletedPhases == previousCompletedPhases and newProcessedTargets == previousProcessedTargets:
stallCount += 1
Print: "WARNING: Forge iteration <iteration> made no progress…"
else:
stallCount = 0
The same fix is needed in docs/examples/claude-code-skills/titan-run/SKILL.md.
There was a problem hiding this comment.
Fixed — stall detection now counts total processed targets (completed + failed) instead of only completed targets. If a forge sub-agent attempts 5 targets and all 5 fail, that's progress (not a stall). Also initialized previousFailedTargets at loop start and tracks it across iterations. Applied to both .claude/skills/ and docs/examples/ copies.
| 10. **Run tests** — detect the project's test command from `package.json` (same detection as gate Step 4): | ||
| ```bash | ||
| testCmd=$(node -e "const p=require('./package.json');const s=p.scripts||{};const cmd=s.test?'npm test':s['test:ci']?'npm run test:ci':null;console.log(cmd||'NO_TEST_SCRIPT');") | ||
| ``` | ||
| - If `testCmd == "NO_TEST_SCRIPT"` → skip pre-gate test run (no test script configured). | ||
| - Otherwise: |
There was a problem hiding this comment.
testCmd detection doesn't handle yarn/pnpm workspaces where npm test may not be available
The detection script:
testCmd=$(node -e "const p=require('./package.json');const s=p.scripts||{};const cmd=s.test?'npm test':s['test:ci']?'npm run test:ci':null;console.log(cmd||'NO_TEST_SCRIPT');")always returns npm test or npm run test:ci regardless of whether the project uses yarn, pnpm, or bun. On a pnpm workspaces project where npm is not available (e.g., CI image installs only pnpm), npm test would fail with command not found, triggering a rollback as though tests failed — even though the test suite was never run.
Gate Step 4 avoids this problem by only detecting the SCRIPT NAME and letting the agent choose the runner. Consider aligning the detection:
testCmd=$(node -e "
const p = require('./package.json');
const s = p.scripts || {};
const script = s.test ? 'test' : s['test:ci'] ? 'test:ci' : null;
if (!script) { console.log('NO_TEST_SCRIPT'); process.exit(0); }
// Detect runner from lock file
const fs = require('fs');
const runner = fs.existsSync('yarn.lock') ? 'yarn' : fs.existsSync('pnpm-lock.yaml') ? 'pnpm' : fs.existsSync('bun.lockb') ? 'bun' : 'npm';
console.log(runner + (script === 'test' ? ' test' : ' run ' + script));
")The same applies to titan-run/SKILL.md V13 and both docs/examples/ mirrors.
There was a problem hiding this comment.
Fixed — test command detection now detects the package manager from lock files (yarn.lock, pnpm-lock.yaml, bun.lockb) and uses the appropriate runner instead of always defaulting to npm. Applied to both forge Step 10 and titan-run V13, and their docs/examples mirrors.
.claude/skills/titan-gate/SKILL.md
Outdated
| # Detect test command from package.json scripts (npm test, yarn test, pnpm test, etc.) | ||
| <test-runner> test 2>&1 || echo "TEST_FAILED" |
There was a problem hiding this comment.
<test-runner> placeholder not resolvable — test step will always error
The PR changed npm test to <test-runner> test with a comment, but no detection block was added to resolve the placeholder. A shell will interpret <test-runner> as a redirect-from-file operator (or simply fail with "command not found"), not as the test runner. Every gate invocation will produce a TEST_FAILED output due to the malformed command, even on projects with passing tests — which then triggers the auto-rollback (Step 9) on every commit.
Forge Step 10 and titan-run V13 were both fixed with an explicit node -e detection block in prior review cycles. Gate Step 4 needs the same treatment:
| # Detect test command from package.json scripts (npm test, yarn test, pnpm test, etc.) | |
| <test-runner> test 2>&1 || echo "TEST_FAILED" | |
| # Detect test command from package.json scripts | |
| testCmd=$(node -e "const p=require('./package.json');const s=p.scripts||{};const script=s.test?'test':s['test:ci']?'test:ci':null;if(!script){console.log('NO_TEST_SCRIPT');process.exit(0);}const fs=require('fs');const runner=fs.existsSync('yarn.lock')?'yarn':fs.existsSync('pnpm-lock.yaml')?'pnpm':fs.existsSync('bun.lockb')?'bun':'npm';console.log(runner+(script==='test'?' test':' run '+script));") | |
| if [ "$testCmd" != "NO_TEST_SCRIPT" ]; then | |
| $testCmd 2>&1 || echo "TEST_FAILED" | |
| fi |
The same fix is needed at docs/examples/claude-code-skills/titan-gate/SKILL.md line 108.
There was a problem hiding this comment.
Fixed — replaced the unresolvable <test-runner> placeholder with the same lock-file-based detection block used in forge Step 10 and titan-run V13. Gate Step 4 now detects the package manager from lock files (yarn.lock, pnpm-lock.yaml, bun.lockb) and skips tests if no test script is configured. Applied to both .claude/skills/ and docs/examples/ copies.
| ```bash | ||
| git reset HEAD <changed files> | ||
| git checkout -- <changed files> | ||
| ``` |
There was a problem hiding this comment.
DIFF FAIL rollback uses unresolved
<changed files> placeholder — staged changes may persist
Step 9's DIFF FAIL rollback uses the static placeholder <changed files>:
git reset HEAD <changed files>
git checkout -- <changed files>Unlike Step 13, which was fixed in a prior review cycle to use dynamic discovery ($(git diff --cached --name-only) and $(git diff --name-only)), this path carries no equivalent pattern. If an agent fails to substitute <changed files> with the actual file list, git reset HEAD and git checkout -- receive no arguments — the staged changes from Step 8 remain in the index and the working tree is left dirty. The next target's git add will then silently bundle the previous target's uncommitted, DIFF-FAILed changes into the next commit.
Apply the same portable pattern that Step 13 already uses:
| ```bash | |
| git reset HEAD <changed files> | |
| git checkout -- <changed files> | |
| ``` | |
| **On DIFF FAIL:** | |
| ```bash | |
| git reset HEAD -- $(git diff --cached --name-only) | |
| git checkout -- $(git diff --name-only) |
The same fix is needed at `docs/examples/claude-code-skills/titan-forge/SKILL.md` lines 200–203.
There was a problem hiding this comment.
Fixed — replaced the static <changed files> placeholder with dynamic file discovery using $(git diff --cached --name-only) and $(git diff --name-only), matching the portable pattern already used in Step 13. Applied to both .claude/skills/ and docs/examples/ copies.
…file detection in gate Step 4 (#557)
3635edf to
317f79c
Compare
.claude/skills/titan-gate/SKILL.md
Outdated
| @@ -228,7 +332,8 @@ Aggregate all checks: | |||
|
|
|||
| > "GATE FAIL: [reason]. Graph restored, changes unstaged but preserved. Fix and re-stage." | |||
|
|
|||
| For structural-only failures (Steps 2-4, 6-8), do NOT auto-rollback — report and let user decide. | |||
| For structural and semantic failures (Steps 1, 3, 5, 5.5, 6-8), do NOT auto-rollback — report the FAIL and let the user decide whether to fix in place or unstage manually. | |||
| For Step 2 (new cycle) and Step 4 (lint/build/test), trigger auto-rollback. | |||
There was a problem hiding this comment.
Step 1's
--cycles flag conflicts with the no-rollback exception
Step 1 is listed in the no-rollback exception: "For structural and semantic failures (Steps 1, 3, 5, 5.5, 6-8), do NOT auto-rollback". However, Step 1 runs codegraph check --staged --cycles --blast-radius 30 --boundaries — the --cycles flag means Step 1 can produce a cycle FAIL verdict.
If an AI agent stops at Step 1 due to a cycle failure (which is legitimate — agents may short-circuit on FAIL), the no-rollback rule applies, leaving staged changes in the index. Step 2's auto-rollback trigger is never reached.
This contradicts the explicit rule two lines below: "For Step 2 (new cycle) → trigger auto-rollback." A cycle caught by Step 1 and a cycle caught by Step 2 are the same violation but would produce opposite rollback behavior.
The safe fix is to explicitly carve out the cycle sub-check from Step 1's no-rollback exception:
| For structural and semantic failures (Steps 1 [manifesto/blast-radius/boundary only — not cycles], 3, 5, 5.5, 6-8), do NOT auto-rollback — report the FAIL and let the user decide whether to fix in place or unstage manually. | |
| For Step 2 (new cycle), Step 1 cycle violations (from `--cycles` flag), and Step 4 (lint/build/test), trigger auto-rollback. |
The same clarification is needed in docs/examples/claude-code-skills/titan-gate/SKILL.md at the corresponding lines.
There was a problem hiding this comment.
Fixed — carved out cycle sub-check from Step 1's no-rollback exception. The exception now reads "Steps 1 [manifesto/blast-radius/boundary only — not cycles]" and the auto-rollback rule explicitly includes "Step 1 cycle violations (from --cycles flag)" alongside Step 2 and Step 4. Applied to both .claude/skills/ and docs/examples/ copies.
| 11. **Run /titan-gate:** | ||
| Use the Skill tool to invoke `titan-gate`. | ||
| - If FAIL on **cycle/test/lint/build** (gate auto-rolls back staged changes, gate Step 2/4) → go to rollback (step 13) to also revert working tree. | ||
| - If FAIL on **any other check** — complexity (Step 3), semantic (Step 5), structural/arch (Steps 1, 5.5, 6-8) — gate does NOT auto-rollback its staging area; forge must clean up for the next target → unstage with `git reset HEAD -- $(git diff --cached --name-only) && git checkout -- $(git diff --name-only)`, add to `execution.failedTargets` with reason, log the gate report, and continue to the next target. Do NOT go to step 13 — that step is for Step 2/4 failures where gate already unstaged; going there again would attempt a duplicate rollback. |
There was a problem hiding this comment.
Forge Step 11 cycle routing gap mirrors gate's Step 1 ambiguity
Forge Step 11 routes based on the gate failure type: "cycle/test/lint/build" → Step 13 rollback, "any other check" → manual cleanup.
However, gate Step 1 (codegraph check --staged --cycles) can now produce a cycle FAIL that is classified as "structural failure (Step 1)" by gate's own no-rollback exception. When gate reports this failure, forge receives a FAIL report with "Step 1" framing — which forge routes to the "any other check" (manual cleanup) path, not to the cycle rollback path.
A cycle violation should always trigger Step 13 rollback. The routing condition in forge should explicitly include cycle violations from Step 1:
| 11. **Run /titan-gate:** | |
| Use the Skill tool to invoke `titan-gate`. | |
| - If FAIL on **cycle/test/lint/build** (gate auto-rolls back staged changes, gate Step 2/4) → go to rollback (step 13) to also revert working tree. | |
| - If FAIL on **any other check** — complexity (Step 3), semantic (Step 5), structural/arch (Steps 1, 5.5, 6-8) — gate does NOT auto-rollback its staging area; forge must clean up for the next target → unstage with `git reset HEAD -- $(git diff --cached --name-only) && git checkout -- $(git diff --name-only)`, add to `execution.failedTargets` with reason, log the gate report, and continue to the next target. Do NOT go to step 13 — that step is for Step 2/4 failures where gate already unstaged; going there again would attempt a duplicate rollback. | |
| - If FAIL on **cycle/test/lint/build** (gate auto-rolls back staged changes on Step 2/4 cycle; Step 1 may also report cycle violation via `--cycles` flag — treat that the same as Step 2) → go to rollback (step 13) to also revert working tree. |
The same fix is needed in docs/examples/claude-code-skills/titan-forge/SKILL.md.
There was a problem hiding this comment.
Fixed — forge Step 11 now explicitly notes that Step 1 may also report cycle violations via the --cycles flag and that those should be treated the same as Step 2 cycle failures (routed to Step 13 rollback). Applied to both .claude/skills/ and docs/examples/ copies.
| git log --oneline -5 | ||
| # Record the HEAD sha before dispatching | ||
|
|
||
| headBefore = $(git rev-parse HEAD) |
There was a problem hiding this comment.
Shell command substitution inside pseudocode block not consistently marked
headBefore = $(git rev-parse HEAD) uses bash command substitution syntax inside an indented pseudocode block that is otherwise described in prose. Line 510 correctly shows headAfter = $(git rev-parse HEAD) as a similar operation. However, the git log --oneline -5 on line 478 is inside the same pseudocode block and lacks any output-capture syntax — it reads as a fire-and-forget command with no assignment, which means an agent executing this literally would run git log and discard its output, providing no context for V12.
If the intent is for the output to be printed (context-gathering), a Print: prefix like the one used elsewhere in this file would make the intent explicit:
| headBefore = $(git rev-parse HEAD) | |
| # Context: show recent commits so agent has context for V12 commit audit | |
| Print: git log --oneline -5 | |
| # Record the HEAD sha before dispatching |
This pattern is already used elsewhere in the file (e.g., line 539: Print: "Forge phase..."). Same in docs/examples/claude-code-skills/titan-run/SKILL.md.
There was a problem hiding this comment.
Fixed — added Print: prefix to the git log --oneline -5 line so the intent (display recent commits for V12 context) is explicit and consistent with the pattern used elsewhere in the file. Applied to both .claude/skills/ and docs/examples/ copies.
|
Addressed the 2 remaining items from the Greptile summary:
|
Summary
/titan-runorchestrator skill that dispatches the full Titan pipeline (recon → gauntlet → sync → forge) to sub-agents with fresh context windows, enabling hands-free end-to-end execution/titan-forge(Step 9) — verifies each diff matches the gauntlet recommendation and sync plan intent before gate runs (scope check, intent match, commit message accuracy, deletion audit, leftover check)/titan-gate(Step 5) — export signature stability, import resolution integrity, dependency direction assertions, re-export chain validation/titan-gate(Step 5.5) — captures pre-forge architectural baseline, compares community stability, cross-domain dependency direction, cohesion delta, and drift after each commit/titan-run: Pre-Agent Gate (G1-G4), post-phase validation (V1-V15), stall detection, state file backup/recovery, NDJSON integrity checks, mandatory human checkpoint before forgeTest plan
/titan-runon a test codebase in a worktree — verify full pipeline completes--start-from forgeskips analysis phases but validates their artifacts exist