
[Spec 746] Baked architectural decisions in SPIR issue body#756

Merged
waleedkadous merged 39 commits into main from builder/spir-746
May 18, 2026

@waleedkadous
Contributor

Summary

Adds a structured channel for architects to pin architectural decisions in SPIR / ASPIR / AIR issue bodies, so builders and CMAP reviewers honor those decisions instead of re-litigating them. Pure prompt-and-documentation change — zero new code surface, zero runtime impact.

Closes #746

Architects who file an issue with a ## Baked Decisions section see those decisions:

  • Surfaced explicitly to the builder by the builder-prompt
  • Honored during spec drafting (SPIR/ASPIR) or implementation (AIR) via the drafting prompts
  • Protected from relitigation by all six CMAP reviewer prompts (with explicit COMMENT-vs-REQUEST_CHANGES distinction + contradiction handling)
  • Discoverable via a new sub-section in each protocol.md

When the section is absent, behavior is unchanged from today — the no-op default.

Changes

  • 6 builder-prompts (SPIR/ASPIR/AIR + codev-skeleton mirrors): added "Baked Decisions" instruction paragraph after ## Protocol
  • 6 drafting prompts (SPIR/ASPIR specify.md, AIR implement.md + skeleton mirrors): added clause directing the builder to copy the section verbatim into Constraints / treat as fixed during implementation
  • 12 reviewer prompts (spec-review, plan-review, impl-review, pr-review + skeleton mirrors): added anti-relitigation clause with COMMENT/REQUEST_CHANGES distinction and contradiction handling
  • 6 protocol.md files (SPIR/ASPIR/AIR + skeleton mirrors): added discoverability sub-section
  • 1 new test file with 193 tests (grep regression, pure-addition diff, mirror parity, end-to-end smoke)
  • 12 baseline snapshots under __tests__/fixtures/baselines/ for the no-regression assertion
  • 3 new entries added to codev/resources/lessons-learned.md

All prompt language uses the architect-override carveout framing ("do not autonomously override / relitigate") per spec Resolved Decision #12.

Testing

  • 193 new tests in packages/codev/src/agent-farm/__tests__/baked-decisions.test.ts — all passing
  • Full pre-existing suite (2,632 tests across 130 files) — all passing
  • Programmatic end-to-end smoke verifies SPIR/ASPIR/AIR builder-prompts render correctly against fixture issues with and without Baked Decisions sections

Spec

codev/specs/746-spir-architect-s-baked-archite.md

Plan

codev/plans/746-spir-architect-s-baked-archite.md

Review

codev/reviews/746-spir-architect-s-baked-archite.md (includes full consultation history across spec, plan, and all 4 implement phases)

iter-3 revision:
- Drop .github/ISSUE_TEMPLATE/ scope (Codev is CLI-driven)
- Replace with discoverability paragraph in each protocol.md
- Tighten end-to-end transcript criterion to concrete snapshot diff
- Drop moot 'template count' open question
- Add architect-override carveout framing (Decision #12)
iter-2 plan revision addressing CMAP feedback:
- Phase 1: clarify spawn.ts as read-only verification touchpoint
- Phase 2/3: add explicit contradiction-handling language + grep tests
- Phase 5: expand no-regression coverage to all 12 static prompt files (consult-types + drafting + protocol.md)
iter-3 revision per architect feedback:
- Drop parser + TemplateContext field + handlebars block entirely (pure prompt/docs change)
- Reduce from 5 phases to 4 (parser phase + e2e fixture phase removed)
- Reduce from 15 baselines to 12 (no template-rendering snapshots needed)
- Test infrastructure simplifies to grep + pure-addition diff only
- More robust to variant section names (LLM recognition vs regex)
…t files

Snapshot of the 12 prompt files touched across Phases 1-3 (3 builder-prompts
+ 3 drafting prompts + 6 reviewer prompts), captured before any edits.
Subsequent phases assert the post-change file is a pure-addition diff
vs. these baselines.
Adds a 'Baked Decisions' section to SPIR/ASPIR/AIR builder-prompts (and
their codev-skeleton mirrors) telling the builder to recognize the section
in the issue body and treat its contents as fixed. Uses architect-override
carveout framing ('do not autonomously...') per spec Resolved Decision #12
and includes contradiction-handling per Codex iter-1 feedback.

Tests verify each touched file contains the required language and is a
pure-addition diff of its pre-change baseline.
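
The pure-addition assertion described above can be sketched as a line-subsequence test. This is only an illustration, not the code in baked-decisions.test.ts: the name `isPureAddition` and the line-by-line comparison are assumptions about how such a check might be written.

```typescript
// Hypothetical helper (not taken from the PR): returns true when every line of
// the baseline still appears, in the same order, in the post-change text.
// In other words, the change only inserted lines and never edited or removed one.
function isPureAddition(baseline: string, current: string): boolean {
  const baseLines = baseline.split("\n");
  const currentLines = current.split("\n");
  let next = 0; // index of the next baseline line still to be matched
  for (const line of currentLines) {
    if (next < baseLines.length && line === baseLines[next]) {
      next++;
    }
  }
  return next === baseLines.length;
}

// Illustrative fixtures; the real baselines live under __tests__/fixtures/baselines/.
const baselinePrompt = "# Title\n\n## Protocol\nFollow SPIR.\n";
const additivePrompt =
  "# Title\n\n## Protocol\nFollow SPIR.\n\n## Baked Decisions\nTreat as fixed.\n";
const rewrittenPrompt = "# Title\n\n## Protocol\nFollow ASPIR.\n";

const additiveOk = isPureAddition(baselinePrompt, additivePrompt);
const rewriteOk = isPureAddition(baselinePrompt, rewrittenPrompt);
```

A greedy in-order match is enough here because the question is only whether the baseline is a subsequence of the new file; it does not try to produce a minimal diff.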
…ross codev/skeleton

Addresses Codex iter-1 feedback (REQUEST_CHANGES #3): the original test
suite only checked codev/ files against baselines; never asserted that
the codev/ and skeleton copies of the new paragraph match each other.

Adds a focused parity check that extracts the Baked Decisions section
from each builder-prompt (codev/ + skeleton, for SPIR/ASPIR/AIR) and
asserts byte-identical content. Scope kept to the section this phase
owns — the pre-existing Multi-PR/Verify divergence in the surrounding
file is intentionally NOT in scope for this work.
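
A section-scoped parity check of this kind might look like the sketch below. The helper names, the assumption that the section starts at a level-2 `## Baked Decisions` heading, and the fixture text are all hypothetical; only the idea (compare the extracted section, ignore the rest of the file) comes from the commit message.

```typescript
// Hypothetical helper: slice from the "## Baked Decisions" heading up to the
// next level-2 heading (or end of file). Returns null when the heading is absent.
function bakedSection(markdown: string): string | null {
  const start = markdown.indexOf("## Baked Decisions");
  if (start === -1) return null;
  const next = markdown.indexOf("\n## ", start + 1);
  return next === -1 ? markdown.slice(start) : markdown.slice(start, next);
}

interface MirrorPair {
  protocol: string; // e.g. "spir", "aspir", "air"
  codev: string;    // contents of the codev/ copy
  skeleton: string; // contents of the codev-skeleton copy
}

// Returns the protocols whose Baked Decisions sections are missing or differ.
function findParityMismatches(pairs: MirrorPair[]): string[] {
  const mismatches: string[] = [];
  for (const pair of pairs) {
    const a = bakedSection(pair.codev);
    const b = bakedSection(pair.skeleton);
    if (a === null || b === null || a !== b) mismatches.push(pair.protocol);
  }
  return mismatches;
}

// Illustrative fixtures: the surrounding sections diverge, the baked section does not.
const section = "## Baked Decisions\nDo not autonomously override.\n";
const parityPairs: MirrorPair[] = [
  {
    protocol: "spir",
    codev: "## Protocol\nSPIR\n\n" + section + "\n## Multi-PR\nold text\n",
    skeleton: "## Protocol\nSPIR\n\n" + section + "\n## Verify\nnew text\n",
  },
];
const parityMismatches = findParityMismatches(parityPairs);
```

Scoping the comparison to the extracted section is what lets the check pass despite unrelated divergence elsewhere in the two files.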
Adds a baked-decisions clause to SPIR/ASPIR specify.md and AIR
implement.md (+ skeleton mirrors), instructing the builder to look for
the section first and treat its content as fixed during drafting/
implementation. Uses carveout phrasing and includes contradiction-handling.

Tests extend baked-decisions.test.ts with grep, pure-addition diff, and
mirror-parity assertions for the 6 new files. Total: 62 tests pass.
Adds anti-relitigation language to all 6 reviewer prompts (SPIR/ASPIR
spec-review + plan-review, AIR impl-review + pr-review) and their
codev-skeleton mirrors. Each clause: (1) carveout ('do not autonomously
challenge'), (2) COMMENT-vs-REQUEST_CHANGES distinction, (3) contradiction
handling ('clarify before proceeding'). plan-review supplements the
existing 'don't re-litigate spec decisions' line; the new paragraph names
baked decisions explicitly.

Tests extend baked-decisions.test.ts with grep, pure-addition diff, and
mirror-parity assertions for the 12 new files. Total: 123 tests pass.
…racted Baked Decisions section

Addresses Codex Phase 3 iter-1 feedback. The original assertion checked
'expect(content).toContain("COMMENT")' against the whole file, but the
pre-existing '## Verdict Format' section already contains those tokens,
so the test would pass even if the new Baked Decisions paragraph lost
the distinction.

Fix: lift extractBakedSection helper to the top of the Phase 3 describe,
rewrite per-file blocks to assert against the extracted section. A
regression where COMMENT/REQUEST_CHANGES disappears from the new
paragraph will now fail loudly. 123 tests still pass.
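
The false positive and its fix can be illustrated with a small sketch. The commit names an `extractBakedSection` helper but does not show its body, so the implementation below is a guess, and the prompt text is a made-up fixture; plain boolean checks stand in for the test framework's `expect(...).toContain(...)`.

```typescript
// Guessed implementation of the helper named in the commit message: returns the
// Baked Decisions heading plus everything up to the next markdown heading,
// or an empty string when the section is absent.
function extractBakedSection(content: string): string {
  const lines = content.split("\n");
  const start = lines.findIndex((line) => /^#+\s.*Baked Decisions/.test(line));
  if (start === -1) return "";
  const rest = lines.slice(start + 1);
  const stop = rest.findIndex((line) => /^#+\s/.test(line));
  const body = stop === -1 ? rest : rest.slice(0, stop);
  return [lines[start], ...body].join("\n");
}

function mentionsDistinction(text: string): boolean {
  return text.includes("COMMENT") && text.includes("REQUEST_CHANGES");
}

// Fixture: a reviewer prompt whose Baked Decisions paragraph has LOST the
// distinction, while the pre-existing Verdict Format section still has the tokens.
const regressedPrompt = [
  "## Baked Decisions",
  "Do not autonomously challenge baked decisions.",
  "",
  "## Verdict Format",
  "VERDICT: APPROVE | COMMENT | REQUEST_CHANGES",
].join("\n");

// Whole-file check: passes even though the paragraph regressed (the bug).
const wholeFilePasses = mentionsDistinction(regressedPrompt);
// Section-scoped check: fails, which is the desired behavior.
const sectionPasses = mentionsDistinction(extractBakedSection(regressedPrompt));
```

With this fixture the whole-file check returns true while the section-scoped check returns false, which is the gap the commit closes.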
… cross-phase sweep

Adds 'Baked Decisions (Optional)' sub-section to SPIR/ASPIR/AIR
protocol.md (+ skeleton mirrors). This is the primary discoverability
surface — architects reading the protocol they're about to invoke learn
the convention here. Each paragraph mentions category hints (language /
framework / dependencies / deferred) and the amend/rescind escape hatch.

Tests extend baked-decisions.test.ts with:
- Phase 4 grep + mirror parity for the 6 docs files
- Final cross-phase sweep asserting all 30 touched files (Phases 1-4)
  contain 'Baked Decisions'

Manual smoke verified: spawning SPIR with a fixture issue containing
a '## Baked Decisions' section produces a rendered prompt containing
both the instruction paragraph and the issue's section content.

Total: 181 tests pass.
…o-end test

Addresses Codex Phase 4 iter-1 feedback: the manual smoke deliverable was
run during implementation but never committed as evidence future readers
could verify.

Upgraded to a programmatic end-to-end smoke that runs on every test
invocation. For each of SPIR/ASPIR/AIR builder-prompts, renders against
two fixture issue bodies (with + without a Baked Decisions section)
and asserts (a) the Phase 1 instruction paragraph is present in both,
(b) baked-decisions content reaches the builder verbatim when present,
(c) no handlebars residue. Strictly stronger than a one-time check.
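
The three assertions (a) to (c) can be sketched roughly as follows. Everything here is an assumption for illustration: `renderPrompt` is a toy stand-in for the project's real template rendering, `INSTRUCTION_MARKER` is a placeholder for the actual instruction wording, and the template and issue bodies are invented fixtures.

```typescript
interface Issue {
  title: string;
  body: string;
}

// Placeholder phrase standing in for the real instruction paragraph's wording.
const INSTRUCTION_MARKER = "do not autonomously override";

// Toy renderer (hypothetical): substitutes only {{issue.title}} and {{issue.body}}.
// Function replacers avoid special handling of "$" in the inserted text.
function renderPrompt(template: string, issue: Issue): string {
  return template
    .replace(/\{\{\s*issue\.title\s*\}\}/g, () => issue.title)
    .replace(/\{\{\s*issue\.body\s*\}\}/g, () => issue.body);
}

// Returns a list of failed checks; an empty list means the smoke test passed.
function smokeCheck(template: string, issue: Issue, bakedContent: string | null): string[] {
  const rendered = renderPrompt(template, issue);
  const failures: string[] = [];
  // (a) the instruction paragraph is always present
  if (!rendered.includes(INSTRUCTION_MARKER)) failures.push("instruction paragraph missing");
  // (b) baked content reaches the builder verbatim when the issue has it
  if (bakedContent !== null && !rendered.includes(bakedContent)) failures.push("baked content not verbatim");
  // (c) no unrendered template placeholders remain
  if (/\{\{|\}\}/.test(rendered)) failures.push("handlebars residue");
  return failures;
}

// Invented fixtures: one issue with a Baked Decisions section, one without.
const promptTemplate = [
  "## Protocol",
  "SPIR",
  "",
  "## Baked Decisions",
  "If the issue has a Baked Decisions section, do not autonomously override it.",
  "",
  "## Issue",
  "{{issue.title}}",
  "{{issue.body}}",
].join("\n");
const bakedContent = "- Language: TypeScript\n- Deferred: caching";
const issueWithBaked: Issue = {
  title: "Example issue",
  body: "Problem statement.\n\n## Baked Decisions\n" + bakedContent,
};
const issueWithoutBaked: Issue = { title: "Example issue", body: "Problem statement." };

const failuresWith = smokeCheck(promptTemplate, issueWithBaked, bakedContent);
const failuresWithout = smokeCheck(promptTemplate, issueWithoutBaked, null);
```

Because the instruction paragraph is unconditional, the same template is expected to pass for both fixtures; only check (b) depends on the issue body.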

Also cleaned up a stale stream-of-consciousness comment in the inventory
list. Total: 193 tests pass.
Captures the project's spec compliance, deviations (notably the iter-3
drop-the-parser pivot), CMAP consultation history across all 4 implement
phases, and three generalizable lessons added to lessons-learned.md:
prompt-driven design over regex parsing, section-scoped parity criteria,
and programmatic-over-manual smoke.
…spec/plan

Addresses Codex PR-level iter-1 feedback.

1. Spec amendment (Amendment 1): the spec text described the parser-based
   conditional-rendering design ('no Baked Decisions block when section
   absent'), but the architect's iter-3 plan-approval feedback directed
   an unconditional instruction paragraph. The plan was rewritten; the
   spec text was not. This commit brings the spec text into alignment
   with the architect-directed design. The implementation was correct
   all along — only the spec text was stale.

2. YAML frontmatter (approved: 2026-05-17, validated: gemini/codex/claude)
   added to both spec and plan per CLAUDE.md repo convention for approved
   artifacts on main.

Zero code/test changes; all 193 tests still pass.
@waleedkadous
Contributor Author

Architect approval: diff scanned. The prompt edits use exactly the architect-override carveout phrasing we settled on over 3 design iterations, and the COMMENT-vs-REQUEST_CHANGES distinction in the reviewer prompts is the right shape. codev/ ↔ skeleton/ parity confirmed via file hashes. Pinned by the 808-line test file. Approved; please merge with gh pr merge 756 --merge --admin.

@waleedkadous waleedkadous merged commit 0fe2457 into main May 18, 2026
6 checks passed


Development

Successfully merging this pull request may close these issues.

SPIR: architect's baked architectural decisions surface only at iter-2 CMAP, wasting iters
