Summary
12 of 160 vitest E2E tests fail in packages/codev/src/commands/porch/__tests__/e2e/, all in two files. All failures share the same shape: the test harness expects a structured __PORCH_RESULT__ payload from porch's stdout, but receives null.
Discovered while running pnpm --filter @cluesmith/codev test:e2e during the v3.0.1 release on 2026-04-30. Not a release blocker (143/160 = 89% pass rate, all failures in one functional area), but worth fixing.
Failing tests
src/commands/porch/__tests__/e2e/scenarios/happy-path.test.ts
- Happy Path > continues to plan phase after spec approval: expected null to be 'plan-approval'
- Full Lifecycle > completes full SPIR protocol with auto-approve: expected false to be true
src/commands/porch/__tests__/e2e/scenarios/single-phase.test.ts
- Single Phase Mode > runs specify phase and returns structured result: expected null not to be null
- Single Phase Mode > can resume with another --single-phase after gate approval: expected null not to be null
- Single Phase Result Format > includes artifact path in result for specify phase: expected null not to be null
- (plus 7 additional cases in these files with the same shape)
Total: 12 failed | 143 passed | 5 skipped (160). Test files: 6 failed | 11 passed (17).
Hypotheses
The __PORCH_RESULT__ JSON sentinel in porch's stdout isn't reaching the test's parser. The likely cause is one of:
- Test harness regression: the parser for __PORCH_RESULT__ (in the test fixtures) broke after some output buffering or line-splitting change.
- Porch output regression: porch no longer emits the result sentinel (e.g., after a recent change to the result-output codepath).
- Build artifact staleness: pnpm test:e2e runs pnpm build && vitest run. If the build is incremental and a stale dist file masks the latest source, the tests would run against old porch behavior. Worth ruling out with pnpm clean && pnpm test:e2e.
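To make the first hypothesis concrete, here is a minimal sketch of a sentinel parser that tolerates chunked output and CRLF line endings. It is not the harness's actual parser: the wire format (a single stdout line of the form `__PORCH_RESULT__` followed by JSON), the helper names `extractPorchResult` and `extractFromChunks`, and the `gate` field used in the usage note are all assumptions made for illustration.

```typescript
// Assumed wire format (not confirmed by this issue): one stdout line of the
// form `__PORCH_RESULT__<json>`.
const SENTINEL = "__PORCH_RESULT__";

// Hypothetical helper: returns the parsed payload from the last sentinel line
// that carries valid JSON, or null when there is none.
function extractPorchResult(stdout: string): unknown {
  // Split on both \n and \r\n so a trailing \r never ends up inside the JSON.
  const lines = stdout.split(/\r?\n/);
  for (const line of [...lines].reverse()) {
    const at = line.indexOf(SENTINEL);
    if (at === -1) continue;
    const raw = line.slice(at + SENTINEL.length).trim();
    try {
      return JSON.parse(raw);
    } catch {
      // Malformed or truncated payload: keep scanning earlier lines.
    }
  }
  return null;
}

// Hypothetical helper: stdout usually arrives in arbitrary chunks, so the
// sentinel or its payload can be split across chunk boundaries. Joining all
// chunks before parsing avoids matching against a partial line.
function extractFromChunks(chunks: string[]): unknown {
  return extractPorchResult(chunks.join(""));
}
```

Under these assumptions, `extractPorchResult('log\n__PORCH_RESULT__{"gate":"plan-approval"}\n')` yields an object whose `gate` is `'plan-approval'`, and output with no sentinel yields `null`, which matches the shape of the failures above. A parser that instead inspects each chunk as it arrives would return `null` whenever a buffering change splits the sentinel line.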
Repro
pnpm --filter @cluesmith/codev test:e2e
The failures are deterministic across runs (not flakes).
Impact
- Unit tests (2,564 of them) all pass; only the porch E2E scenarios listed above are affected
- The --single-phase mode is for programmatic/planner-style use, not the human-driven SPIR workflow
- Released 3.0.1 ships with this issue; not surfaced in the published release notes
Suggested approach
Bugfix protocol. First step is figuring out whether porch is the regressed party or the test harness is. A 5-minute investigation will probably reveal which side broke.
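The first triage step can be sketched as a small classifier over porch's raw stdout. This is only an illustration under assumptions: the `Verdict` labels and the `triage` function are hypothetical, and the harness's parser is passed in as a parameter because its real name and location are not given in this issue.

```typescript
// Hypothetical verdict labels for the three outcomes of the first check.
type Verdict = "porch-or-stale-build" | "harness-parser" | "ok";

// stdout: porch's raw output, captured independently of the test harness.
// parse:  the harness's sentinel parser (assumed to return null on failure).
function triage(stdout: string, parse: (s: string) => unknown): Verdict {
  // If the sentinel never appears in the raw output, the parser cannot be at
  // fault: either porch stopped emitting it or a stale build is running.
  if (!stdout.includes("__PORCH_RESULT__")) return "porch-or-stale-build";
  // The sentinel is present, so a null result points at the parser.
  return parse(stdout) === null ? "harness-parser" : "ok";
}
```

A "porch-or-stale-build" verdict still leaves two of the three hypotheses open; rerunning after pnpm clean separates them.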