Observability: code execution timeout awareness + subagent lifecycle ledger

## Context

Raw SSE logs from session `2026-03-02-1772489470` revealed two observability gaps:

1. **Three code execution timeouts**: `run_python_analysis` hit the 300s wall-clock limit because neither the system prompt nor the tool description told the model about the time budget. The model wrote expensive Monte Carlo simulations that exceeded it.
2. **Subagent start/stop mismatch** (33 starts vs 40 stops): the `agentRegistry` in `hookSSEBridge.js` was fire-and-forget (it deleted each entry on stop), so nothing remained to audit at session end. Stops without matching starts (compaction recovery, gate re-invocations) went undetected.

Both were observability/guidance gaps, not functional bugs — the session completed successfully.

## Changes (v3.10.3, commit `8f1f2cf`)

### Issue 1: Code Execution Timeout Awareness
- **System prompt**: Execution budget paragraph added (interpolates `OVERALL_TIMEOUT_MS / 1000` = 300s)
- **Tool description**: `EXECUTION BUDGET` block with iteration limits and vectorization guidance
- **Structured error**: Timeout path now sets `timeout: true` + `retry_hint` on the result object
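
A minimal sketch of how the budget text and the structured timeout result could fit together. Only `OVERALL_TIMEOUT_MS`, `timeout: true`, and `retry_hint` come from this issue; the helper names (`executionBudgetParagraph`, `buildTimeoutResult`, `runWithBudget`), the `error` field, and all message wording are assumptions for illustration and are not the code in `codeExecutionBridge.js`.

```javascript
// Constant name taken from the issue text; 300 s wall-clock limit.
const OVERALL_TIMEOUT_MS = 300_000;

// Hypothetical helper: budget paragraph interpolated into the system prompt.
function executionBudgetParagraph(timeoutMs = OVERALL_TIMEOUT_MS) {
  return `Code execution is limited to ${timeoutMs / 1000}s of wall-clock time. ` +
    'Prefer vectorized operations and cap simulation iterations.';
}

// Hypothetical helper: structured result for the timeout path.
// `timeout` and `retry_hint` are the fields named in the issue; `error` is illustrative.
function buildTimeoutResult(timeoutMs = OVERALL_TIMEOUT_MS) {
  return {
    timeout: true,
    error: `Execution exceeded ${timeoutMs / 1000}s`,
    retry_hint: 'Reduce iteration counts or vectorize the computation, then retry.',
  };
}

// Races a task against the budget. On timeout it resolves with the structured
// result instead of rejecting, so the caller can hand it back to the model.
function runWithBudget(task, timeoutMs = OVERALL_TIMEOUT_MS) {
  let timer;
  const deadline = new Promise((resolve) => {
    timer = setTimeout(() => resolve(buildTimeoutResult(timeoutMs)), timeoutMs);
  });
  return Promise.race([Promise.resolve().then(task), deadline])
    .finally(() => clearTimeout(timer));
}
```

Resolving rather than rejecting on timeout keeps the failure on the normal result path, which is where a `retry_hint` is most likely to be read.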

### Issue 2: Subagent Lifecycle Ledger
- **`createSSEBridge()`**: New factory replacing `wrapHooksForSSE()` as the server's bridge entry point
- **`agentLedger`** (Map): Persistent ledger tracking all start/stop events (never deletes entries)
- **Orphan detection**: Stops without matching starts logged to console + flagged in summary
- **`getAgentSummary()`**: Returns `{ total_starts, total_stops, orphan_stops[], unmatched_starts[], has_mismatch }`
- **SSE event**: `agent_lifecycle_summary` emitted before `manifest.finalize()` when mismatches exist
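
The ledger behavior listed above can be sketched roughly as follows. This is an illustration under assumptions, not the contents of `hookSSEBridge.js`: the factory name `createAgentLedger`, the `recordStart`/`recordStop` methods, the per-agent counter layout, and the log message are hypothetical. Only the `agentLedger` Map, the never-delete rule, and the `getAgentSummary()` return shape come from this issue.

```javascript
// Sketch of a persistent lifecycle ledger; entries are never deleted.
function createAgentLedger() {
  const agentLedger = new Map(); // agentId -> { starts, stops }

  function entryFor(agentId) {
    if (!agentLedger.has(agentId)) {
      agentLedger.set(agentId, { starts: 0, stops: 0 });
    }
    return agentLedger.get(agentId);
  }

  return {
    recordStart(agentId) {
      entryFor(agentId).starts += 1;
    },
    recordStop(agentId) {
      const entry = entryFor(agentId);
      entry.stops += 1;
      // Orphan detection: a stop with no matching start is logged immediately.
      if (entry.stops > entry.starts) {
        console.warn(`[agentLedger] orphan stop for ${agentId}`);
      }
    },
    getAgentSummary() {
      let total_starts = 0;
      let total_stops = 0;
      const orphan_stops = [];
      const unmatched_starts = [];
      for (const [agentId, { starts, stops }] of agentLedger) {
        total_starts += starts;
        total_stops += stops;
        if (stops > starts) orphan_stops.push(agentId);
        if (starts > stops) unmatched_starts.push(agentId);
      }
      return {
        total_starts,
        total_stops,
        orphan_stops,
        unmatched_starts,
        has_mismatch: orphan_stops.length > 0 || unmatched_starts.length > 0,
      };
    },
  };
}

// Usage: one clean pair, one orphan stop, one start that never stops.
const ledger = createAgentLedger();
ledger.recordStart('a1');
ledger.recordStop('a1');
ledger.recordStop('a2');
ledger.recordStart('a3');
const summary = ledger.getAgentSummary();
console.log(summary);
```

In this usage the totals are equal (2 starts, 2 stops) yet `has_mismatch` is still true, which is why the sketch compares counts per agent rather than only the session totals.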

### Backward Compatibility
- `wrapHooksForSSE()` remains exported and functional (it passes `null` for the ledger, so all ledger guards are no-ops)
- Zero behavioral change to any existing code path
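
One way the null-ledger guard could keep both entry points on a single code path is sketched below. The hook names (`onAgentStart`, `onAgentStop`), the event names, the `emit` callback, the shared `bridgeHooks` helper, and the array-based ledger are all hypothetical; only the two exported function names and the idea of passing `null` for the ledger come from this issue.

```javascript
// Shared wrapper. `ledger` may be null, in which case every ledger guard is skipped.
function bridgeHooks(hooks, emit, ledger) {
  return {
    onAgentStart(agentId) {
      if (ledger) ledger.starts.push(agentId); // no-op for the legacy path
      emit('agent_start', { agentId });
      if (hooks.onAgentStart) hooks.onAgentStart(agentId);
    },
    onAgentStop(agentId) {
      if (ledger) ledger.stops.push(agentId); // no-op for the legacy path
      emit('agent_stop', { agentId });
      if (hooks.onAgentStop) hooks.onAgentStop(agentId);
    },
  };
}

// Legacy entry point: same wrapping, no ledger.
function wrapHooksForSSE(hooks, emit) {
  return bridgeHooks(hooks, emit, null);
}

// New entry point: same wrapping plus a ledger the server can inspect.
function createSSEBridge(hooks, emit) {
  const ledger = { starts: [], stops: [] };
  return { hooks: bridgeHooks(hooks, emit, ledger), ledger };
}
```

Because both functions delegate to the same wrapper, the events a caller observes through `wrapHooksForSSE()` are the same ones `createSSEBridge()` emits; the ledger is purely additive.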

## Test Results
- `test/sdk/code-execution-bridge.test.js`: **337/337 pass**
- Inline `hookSSEBridge` verification: **8/8 scenarios pass** (normal, orphan, unmatched, bail-out)

## Files Changed
| File | Lines |
|------|-------|
| `src/tools/codeExecutionBridge.js` | +18 -6 |
| `src/utils/hookSSEBridge.js` | +108 -2 |
| `src/server/claude-sdk-server.js` | +13 -2 |
| `CHANGELOG.md` | +29 |

## Future Work
- [ ] Frontend: render `agent_lifecycle_summary` SSE event in the dashboard (e.g., warning badge on mismatch)
- [ ] Metrics: track timeout frequency and retry success rate in `hookDBBridge`
- [ ] Consider adjusting `OVERALL_TIMEOUT_MS` if Anthropic changes the ~4.5 min container idle expiry
- [ ] Evaluate whether `retry_hint` should trigger automatic retry with simplified constraints
