Skip to content

Wrapped-subagent turn budget: raised 25→50 + env-tunable + exhaustion salvage (canary-pending) #230

Description

@Number531

Summary

The wrapped-subagent runner capped every agent at a hardcoded MAX_TURNS = 25 — the only turn constant in runner.js with no env override — which doubled as both a work budget and a runaway backstop and fired on legitimate tool-dense work. Fix committed on branch harness-deterministic-orchestration (commit d948d457).

Diagnosis (session 2026-06-16-1781627690, Project Roku — Fox/Roku)

  • 6 of 8 specialists finished naturally (stop_reason: end_turn) at 6–13 turns.
  • financial-analyst and regulatory-rulemaking-analyst terminated with stop_reason: tool_use at exactly 25 turns — i.e. severed by the cap mid-work.
  • financial-analyst's report shipped with §VI/VII/VIII as [To be populated] scaffold stubs; its 25th-turn transcript was the placeholder self-check (Grep for placeholder|populated) — it was actively finishing the report when the cap cut it.
  • The session's halt was a separate, later, intentional user interruption at the orchestrator level (open parent mcp__subagents__run_financial_analyst dispatch). It did not cause the truncation — the cap did. is_final: true, errors: [] confirm the subagent finalized cleanly at the cap.

Change (committed, branch harness-deterministic-orchestration @ d948d457)

  • runner.js:572const MAX_TURNS = Number(process.env.WRAPPED_MAX_TURNS) || 50 (was 25). Default 50 ≈ 4× observed natural max (13); operator-tunable via WRAPPED_MAX_TURNS.
  • Exhaustion salvage: on loop exit mid-tool_use, the runner now returns recovered assistant text + stop_reason: 'max_turns_exhausted' / truncated: true instead of '(no text content)'. Guarded on finalContent.length === 0 && (stopReason === 'tool_use' || stopReason === 'compaction') so it never rewrites refusal/max_tokens/end_turn.
  • New test/sdk/wrappedSubagents/runner-turn-budget.test.js (env-override + default-50 + salvage guard).

Audit — no new binding constraint at 50

Parent MCP dispatch awaits runWrappedAgent directly (in-process tool(), no timeout); no overall wall-clock deadline; heartbeat is persistence-only; input context plateaus under the 30-msg history cap (so 50 turns is no riskier than 25 for the 1M ceiling); orchestrator budget maxTurns: 500. MAX_TURNS has zero coupling outside runner.js; no test asserts 25.

Status / next

  • Fix + test committed to branch
  • CHANGELOG updated ([Unreleased])
  • Live canary — confirm financial-analyst / regulatory-rulemaking-analyst reach end_turn ≤ 50 (also validates the never-run downstream phases)
  • PR → main after canary is green

Follow-ups (tracked here)

  • Deferred "ideal": progress/liveness-based runaway detection + a near-fuse "finalize now" reminder (the model currently gets zero turn-budget signal) — the principled form that lets completion drive termination.
  • Coupled quality lever: MAX_WRAPPED_TURN_HISTORY (default 30) bounds an agent's conversational window to ~15 turns at a 50-turn budget; raise to ~40 if more context retention is wanted.
  • Discovered, unrelated: test/sdk/wrappedSubagents/phase4-12-envelope-cleaner.test.js fails on main (cleaner call positioned before persistChartsToDisk in codeExecutionBridge.js) — pre-existing, deserves its own issue.

🤖 Generated with Claude Code

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions