Context
Raw SSE logs from session `2026-03-02-1772489470` revealed two observability gaps:
- 3 code execution timeouts: `run_python_analysis` hit the 300s wall-clock limit because neither the system prompt nor tool description told the model about the time budget. The model wrote expensive Monte Carlo simulations that exceeded it.
- Subagent start/stop mismatch (33 starts vs 40 stops): the `agentRegistry` in hookSSEBridge.js was fire-and-forget (deleted entries on stop), so there was no session-end audit. Stops without matching starts (compaction recovery, gate re-invocations) went undetected.
Both were observability/guidance gaps, not functional bugs — the session completed successfully.
Changes (v3.10.3, commit 8f1f2cf)
Issue 1: Code Execution Timeout Awareness
- System prompt: Execution budget paragraph added (interpolates `OVERALL_TIMEOUT_MS / 1000` = 300s)
- Tool description: `EXECUTION BUDGET` block with iteration limits and vectorization guidance
- Structured error: Timeout path now sets `timeout: true` + `retry_hint` on the result object
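A minimal sketch of how the budget paragraph and the structured timeout result could fit together. Only `OVERALL_TIMEOUT_MS`, `timeout`, and `retry_hint` come from the notes above; the function names, the `success` and `error` fields, and all message wording are assumptions for illustration, not the actual contents of codeExecutionBridge.js.

```javascript
// Hypothetical sketch; not the real codeExecutionBridge.js implementation.
const OVERALL_TIMEOUT_MS = 300000; // 300s wall-clock limit described above

// Builds the paragraph interpolated into the system prompt (wording assumed).
function executionBudgetParagraph(timeoutMs = OVERALL_TIMEOUT_MS) {
  const seconds = timeoutMs / 1000;
  return `Execution budget: each run_python_analysis call must finish within ${seconds}s of wall-clock time.`;
}

// Builds the result object returned on the timeout path.
// `success` and `error` are assumed fields; `timeout` and `retry_hint`
// are the two flags named in the change list.
function buildTimeoutResult(timeoutMs = OVERALL_TIMEOUT_MS) {
  return {
    success: false,
    error: `Execution exceeded the ${timeoutMs / 1000}s wall-clock limit`,
    timeout: true,
    retry_hint: 'Reduce iteration counts or vectorize the computation, then retry.',
  };
}
```

With a structured flag, a caller can branch on `result.timeout === true` rather than parsing an error string.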
Issue 2: Subagent Lifecycle Ledger
- `createSSEBridge()`: New factory replacing `wrapHooksForSSE()` as server's bridge entry point
- `agentLedger` (Map): Persistent ledger tracking all start/stop events (never deletes entries)
- Orphan detection: Stops without matching starts logged to console + flagged in summary
- `getAgentSummary()`: Returns `{ total_starts, total_stops, orphan_stops[], unmatched_starts[], has_mismatch }`
- SSE event: `agent_lifecycle_summary` emitted before `manifest.finalize()` when mismatches exist
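The ledger idea can be sketched roughly as follows. The `agentLedger` Map, `getAgentSummary()`, and the summary field names are taken from the list above; `createAgentLedger`, `recordStart`, `recordStop`, and the per-agent entry shape are hypothetical, and the real hookSSEBridge.js may track events differently.

```javascript
// Hypothetical sketch of a persistent lifecycle ledger; not the real
// hookSSEBridge.js code. Entries are never deleted, so a session-end
// audit can compare starts against stops.
function createAgentLedger() {
  const agentLedger = new Map(); // agentId -> { starts, stops }

  function entryFor(agentId) {
    if (!agentLedger.has(agentId)) {
      agentLedger.set(agentId, { starts: 0, stops: 0 });
    }
    return agentLedger.get(agentId);
  }

  return {
    recordStart(agentId) {
      entryFor(agentId).starts += 1;
    },
    recordStop(agentId) {
      const entry = entryFor(agentId);
      entry.stops += 1;
      // A stop with no matching start is an orphan: log it immediately.
      if (entry.stops > entry.starts) {
        console.warn(`orphan stop for agent ${agentId}`);
      }
    },
    getAgentSummary() {
      let total_starts = 0;
      let total_stops = 0;
      const orphan_stops = [];
      const unmatched_starts = [];
      for (const [agentId, { starts, stops }] of agentLedger) {
        total_starts += starts;
        total_stops += stops;
        if (stops > starts) orphan_stops.push(agentId);
        if (starts > stops) unmatched_starts.push(agentId);
      }
      return {
        total_starts,
        total_stops,
        orphan_stops,
        unmatched_starts,
        has_mismatch: orphan_stops.length > 0 || unmatched_starts.length > 0,
      };
    },
  };
}
```

Under this sketch, the server would call `getAgentSummary()` at session end and emit `agent_lifecycle_summary` only when `has_mismatch` is true. Equal totals do not imply a clean session: one orphan stop and one unmatched start cancel in the counts but still set `has_mismatch`.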
Backward Compat
- `wrapHooksForSSE()` remains exported and functional (passes `null` for ledger, so all guards are no-ops)
- Zero behavioral change to any existing code path
Test Results
- `test/sdk/code-execution-bridge.test.js`: 337/337 pass
- Inline hookSSEBridge verification: 8/8 scenarios pass (normal, orphan, unmatched, bail-out)
Files Changed
| File | Lines |
| --- | --- |
| src/tools/codeExecutionBridge.js | +18 -6 |
| src/utils/hookSSEBridge.js | +108 -2 |
| src/server/claude-sdk-server.js | +13 -2 |
| CHANGELOG.md | +29 |
Future Work
- Surface the `agent_lifecycle_summary` SSE event in the dashboard (e.g., warning badge on mismatch)
- Revisit `OVERALL_TIMEOUT_MS` if Anthropic changes the ~4.5 min container idle expiry
- Decide whether `retry_hint` should trigger automatic retry with simplified constraints