[codex] Centralize session status policy#67
Conversation
Change-Id: Ieb854aba647511d3797d6c5319cc31d06eb20446 Signed-off-by: Thomas Kosiewski <tk@coder.com>
|
/coder-agents-review |
There was a problem hiding this comment.
The policy module is well-designed: satisfies Record<SessionStatus, SessionStatusPolicy> catches table completeness at compile time, module-load invariants enforce structural relationships, and the test suite verifies every predicate against every status including the destroying split (active + offline-replay, not terminal or collectable). Behavioral equivalence verified for all 11 migrated files.
2 P2, 5 P3, 2 P4, 5 Nit.
The two P2s are convergent: six reviewers independently flagged the rendering-path invariant gap, three flagged the active/terminal gap. The old code enforced both at compile time via exhaustive switches with never defaults. The policy table replaces those guarantees with runtime invariants, but these two are missing. The fix is two invariant() calls in the existing loop.
"The host-side messages don't even include the session ID or current status. 'Session is not running.' with no details. That is a doctor saying 'you're sick.'" (Leorio)
Process: the PR description mentions "one run integration status mismatch" during local testing and defers to CI without identifying the test, the assertion, or the status values. In a PR that changes status classification across 11 files, a status mismatch merits investigation, not deference.
The sessionState.ts status comparisons (6 direct comparisons for state-machine transition guards) were correctly excluded from migration. These define the state machine itself, not policy consumption. Consider documenting that boundary.
src/cli/commands/snapshot.ts:315
P3 [DEREM-5] snapshot.ts:315 and screenshot.ts:258 retain manifest.status === 'running' for the live-host-vs-offline-replay decision. The policy table classifies exiting as liveHostEligible: true, and inspect.ts follows via isLiveHostEligibleSessionStatus. These two files silently disagree.
The divergence predates this PR. Migrating would change exiting behavior, which violates the behavior-preserving claim. But the policy module is now the canonical authority, and these files are not documented as intentional exclusions. A developer adding a new live-host-eligible status would update the policy table, see inspect pick it up, and miss snapshot and screenshot. At minimum, add a comment explaining the intentional narrower check. (Kite P3, Pariston P3, Robin P3, Mafu-san P2)
🤖
test/unit/commands/mark.test.ts:52
P3 [DEREM-7] The PR rewrote the destroyed-session guard in mark.ts, type.ts, and wait.ts from === 'destroyed' to isDestroyedSessionStatus. None of these commands' test files exercise the destroyed status (createSessionRecord restricts status to 'running' | 'exiting' | 'exited'). run.test.ts already has this test (line 220). The gap was pre-existing, but the indirection through the policy table means a wiring mistake (e.g., swapping isDestroyedSessionStatus with isTerminalSessionStatus) would go undetected at the command level. (Bisky)
🤖
src/cli/commands/inspect.ts:40
Nit [DEREM-9] computeUptime retains session.status === 'running' in a file that otherwise imports from the policy module. Whether exiting and destroying sessions should count elapsed time as live uptime is a legitimate semantic question, but having an unmigrated inline check alongside migrated predicates is a maintenance trap. (Chopper, Robin, Pariston, Meruem)
🤖
🤖 This review was automatically generated with Coder Agents.
|
Addressed the remaining review-body items in d0b9297:
|
|
/coder-agents-review |
There was a problem hiding this comment.
All 14 Round 1 findings addressed in d0b9297. The fixes are thorough: the author added both missing invariants (DEREM-2, DEREM-3), extracted the shared guard (DEREM-8), restructured the test table from verbatim copy to classification-set mapping (DEREM-16), and documented every intentional exclusion (DEREM-5 snapshot/screenshot, DEREM-6 legacy error text, DEREM-10/11 destroyed convenience flag). Netero verified all fixes; the panel confirmed.
1 P3, 1 P4, 3 Nit new in this round.
The P3 is a false comment introduced in the fix: assertSessionCommandable claims its details: { status } reaches callers, but buildCliErrorResponse only serializes code and message, and RpcErrorSchema is .strict(). The details are dead data and the comment is factually incorrect about the wire format.
"The truth table held up under inspection. Solid opponent." (Hisoka)
Process note: commit d0b9297 subject "refactor: address status policy review" conveys mechanism, not intent. If this survives as a non-squashed entry, a reader six months from now learns nothing about what changed. Something like "refactor: add policy invariants and destroyed-session guard tests" names the changes.
test/unit/commands/wait.test.ts:350
P4 [DEREM-19] No test for wait --exit on an already-terminal session. The early-return path at wait.ts:350 (isTerminalSessionStatus(manifestStatus)) emits success with the exit code and returns without contacting the host. Every waitForExit: true test uses a running session. A wrong predicate would silently fall through to RPC against a dead host instead of returning immediately. Risk is low (predicate well-tested in sessionStatusPolicy.test.ts), but the path itself is uncovered.
Sketch: feed createSessionRecord('exited', 42) + waitForExit: true, assert emitSuccess called with { timedOut: false, exitCode: 42 } and sendRpc never called. (Bisky)
🤖
🤖 This review was automatically generated with Coder Agents.
d0b9297 to
9a5e318
Compare
|
Addressed the remaining review-body items in 9a5e318:
Local validation for this round:
|
|
/coder-agents-review |
There was a problem hiding this comment.
All 19 findings from Rounds 1-2 addressed. The fixes are thorough and demonstrate strong engineering judgment: the author generalized from corrections (DEREM-8 guard extraction), restructured rather than parroted (DEREM-16 test rewrite), and documented intentional exclusions rather than silently skipping them (DEREM-5 snapshot/screenshot comments).
1 P4 new. 9 of 11 reviewers report no findings. Netero clean.
The P4 is a narrow test gap in the DEREM-19 fix: the wait terminal early-return test covers exitCode: 42 but not the exitCode === null branch, where the conditional spread at wait.ts:353 omits the field entirely. A wrong spread would change the public JSON contract from {timedOut: false} to {timedOut: false, exitCode: null}. Low risk (the conditional spread is straightforward), but worth adding a null case.
"The truth table held up under inspection. Solid opponent." (Hisoka)
🤖 This review was automatically generated with Coder Agents.
Change-Id: I317c32b10c7bd62ac82784bfb9d3d6c791d373b9 Signed-off-by: Thomas Kosiewski <tk@coder.com>
9a5e318 to
999d42c
Compare
|
Addressed DEREM-22 in 999d42c:
Local validation for this round:
|
|
/coder-agents-review |
There was a problem hiding this comment.
All 20 posted findings across 3 rounds addressed. DEREM-22 (null exitCode test gap) verified fixed in 999d42c. 11 reviewers, 0 new findings. Netero clean.
This PR is ready to merge. The session status policy module centralizes scattered inline checks into a single truth table with compile-time exhaustiveness (satisfies Record<SessionStatus, SessionStatusPolicy>), import-time structural invariants (active XOR terminal, collectable implies terminal, commandable implies active, exactly one renderer source), and named predicates that make call sites self-documenting. Behavioral equivalence verified across all substitution sites by multiple reviewers across 4 rounds.
🤖 This review was automatically generated with Coder Agents.
Closes #58
What changed
SessionStatusas active, terminal, commandable, live-host eligible, offline-replay eligible, collectable, and destroyed.CONTEXT.md, including thedestroyingsplit: active and offline-replay eligible, but not terminal or collectable.Impact
This is intended to be behavior-preserving. The public status values, protocol schemas, event log behavior, and artifact formats are unchanged.
Validation
Passed:
npx vitest run test/unit/protocol/sessionStatusPolicy.test.tsnpx vitest run test/unit/commands/type.test.tsnpx vitest run test/unit/commands/mark.test.tsnpx vitest run test/unit/commands/wait.test.tsnpx vitest run test/unit/host/hostMain.test.tsnpx vitest run --maxWorkers=1 test/integration/lifecycle.test.tsnpx vitest run test/unit/protocol/sessionStatusPolicy.test.ts test/unit/commands/inspect.test.ts test/unit/commands/gc.test.ts test/unit/commands/run.test.ts test/unit/commands/type.test.ts test/unit/commands/mark.test.ts test/unit/commands/wait.test.ts test/unit/host/hostMain.test.tsnpx vitest run --maxWorkers=1 test/integration/lifecycle.test.ts test/integration/gc.test.tsnpx vitest run --maxWorkers=1 test/integration/run.test.ts test/integration/wait-render.test.tsnpx vitest run test/unit/commands/golden-envelopes.test.tsnpm run typechecknpm run format:checkgit diff --checknpx vitest run --maxWorkers=1 test/integration/backend-selection.test.ts test/integration/run.test.tsNote:
mise run cipassed format-check, workflow-lint, lint, and typecheck, then the local full Vitest run hit two 5s backend-selection timeouts and one run integration status mismatch. The failing suites passed when rerun serially with--maxWorkers=1, so this looks like local full-suite contention rather than a status-policy regression. Remote PR checks should give the canonical signal.