feat(agents): route plan + build through task tool for cheap-mid-tier execution #127
Merged
… execution

PRIME now delegates Phase 2 (plan authoring) to @plan AND Phase 3 (plan execution) to @build via the task tool, moving the highest-volume token consumers off Opus and onto the mid tier. Users running Kimi K2, GLM-4.6, Haiku, or any other cheap mid-tier model for the `mid` tier see a significant cost reduction on substantial work.

Two coupled fixes:

1. Task-tool routing. Both @plan and @build were registered as mode:"primary", which made them invisible to other agents' task-tool picker. PRIME's Phase 2 'delegate to @plan' and Phase 3 'delegate to @build' instructions silently fell through to pilot-planner (whose description also led with 'Interactive planner…') or general. Per OpenCode's docs (opencode.ai/docs/agents/#mode), mode:"all" means the agent is BOTH primary (Tab-cycleable, top-level @-mention works) AND subagent (visible to other agents' task-tool picker). That's the correct setting for both agents: full dispatchability with no user-visible regression. Also rewrite pilot-planner's description lead phrase to 'Pilot-subsystem YAML plan generator.' to eliminate the description-prefix collision that misled the picker.

2. Phase 3 delegation. PRIME's Phase 3 is rewritten to delegate execution to @build via the task tool, validate diff on return, handle STOP payloads (cosmetic/approach/scope-expansion classes), and proceed to Phase 4. @build's prompt is reshaped for dual invocation: Section 4 no longer runs full test suite / lint (PRIME's Phase 4 owns QA), Section 5 becomes a structured Return payload contract, and 'How to ask the user' is scoped (question tool for top-level only, STOP-to-PRIME for subagent invocations). Context Firewall gains a mandatory Phase 3 delegation row. autopilot.md Phase 3 bullet aligned.

Five regression tests lock the fix:
- plan agent is task-tool-dispatchable (not mode:primary)
- build agent is task-tool-dispatchable (not mode:primary)
- pilot-planner description does not collide with plan description
- prime prompt delegates Phase 3 to @build
- build prompt does not re-run full test suite (PRIME's Phase 4 owns it)

Mode-split tests restructured: primary-capable accepts mode:"primary" or "all"; subagent-capable accepts mode:"subagent" or "all".

Plan: ~/.glorious/opencode/harness-opencode/plans/prime-delegates-phase3-to-build.md
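The mode semantics in the description can be sketched as two predicates. This is a minimal illustration, not the repository's code: the `AgentEntry` type, the helper names, and the four-agent sample registry are hypothetical, and only the three mode strings and the agent names come from the PR text.

```typescript
// Hypothetical types and helpers; only the mode strings are taken from the PR.
type AgentMode = "primary" | "subagent" | "all";

interface AgentEntry {
  name: string;
  mode: AgentMode;
}

// Primary-capable: Tab-cycleable and reachable by a top-level @-mention.
function isPrimaryCapable(agent: AgentEntry): boolean {
  return agent.mode === "primary" || agent.mode === "all";
}

// Subagent-capable: visible to other agents' task-tool picker.
function isTaskToolDispatchable(agent: AgentEntry): boolean {
  return agent.mode === "subagent" || agent.mode === "all";
}

// Illustrative subset of the registry, not the full 14-agent list.
const agents: AgentEntry[] = [
  { name: "prime", mode: "primary" },
  { name: "plan", mode: "all" },
  { name: "build", mode: "all" },
  { name: "qa-reviewer", mode: "subagent" },
];

const primaryCapable = agents.filter(isPrimaryCapable).map((a) => a.name);
const dispatchable = agents.filter(isTaskToolDispatchable).map((a) => a.name);

console.log(primaryCapable.join(", ")); // prime, plan, build
console.log(dispatchable.join(", ")); // plan, build, qa-reviewer
```

With `mode: "all"`, `plan` and `build` land in both lists, which is why the mode-split tests accept two mode values per list rather than one.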
PRIME delegates Phase 3 execution to @build subagent
Goal
Move PRIME's Phase 3 (plan execution: file edits, per-change lint/tests, acceptance-box checking) off Opus and onto the `@build` subagent (currently Sonnet `mid` tier, swappable to Kimi K2 / GLM-4.6 / Haiku via the user's tier mapping). This is the single biggest cost win available in the five-phase workflow: Phase 3 is the highest-volume token consumer, and its work is mechanical, so Sonnet-class models handle it at parity with Opus.

To make delegation actually work, `@build` must gain task-tool visibility. OpenCode supports three agent modes (`primary`, `subagent`, `all`). `"all"` is the correct setting, not `"subagent"`: it keeps Tab-cycling + top-level `@` invocation intact AND adds task-tool picker visibility. The local commit `75448b9` used `mode: "subagent"` for `@plan`, which unnecessarily removed Tab and top-level access. This plan amends that commit to use `mode: "all"` for `@plan`, and applies `mode: "all"` to `@build` as part of the new Phase 3 delegation work. The commit is local-only and hasn't been pushed, so the amend is safe per PRIME's amend rules (user explicitly requested; HEAD is mine; not pushed).

Two commits total after this plan:

- `75448b9`: `@plan` mode `"subagent"` → `"all"`; `plan.md` wording reverted back to acknowledge user can still invoke directly AND via task tool; test assertions updated from `=== "subagent"` to `!== "primary"` (or explicit `=== "all"`).
- `@build` mode → `"all"` + `@build` prompt reshape + Context Firewall row + autopilot.md line 76 update + new regression tests.

PRIME keeps ownership of Phase 0 bootstrap, Phase 1 intent triage, Phase 1.5 framing + user interaction, Phase 2 grounding + `@plan` delegation, Phase 4 QA-reviewer selection/dispatch, and Phase 5 handoff.

Constraints
- `src/`, `test/`. No writes to `~/.config/opencode/` or `~/.claude/`.
- `readFileSync`; never add static `.md` imports (root rule 7).
- No forbidden paths (`~/.claude`, `home/.claude`, `~/.config/opencode`, `home/.config/opencode`) may appear in any edited prompt; the CI guard at `test/prompts-no-dangling-paths.test.ts` fails the build otherwise (root rule 8).
- `BUILD_PERMISSIONS` bash rules untouched: `@build` still needs general bash + `git commit` for Phase 3 checkpointing.
- `mode: "all"` preserves BOTH top-level invocation (Tab + `@`) AND task-tool dispatchability for `@plan` and `@build`. No user-visible regression (unlike the `mode: "subagent"` approach the amendment is correcting).
- `@build` stays at `mid` (Sonnet-class); nothing in tier fixtures changes.
- Amending `75448b9` is legal per PRIME's amend rules: (a) user explicitly requested, (b) HEAD is mine (Austin Hess <austin@kayn.ai>), (c) not pushed to remote (verified via `git status` showing "ahead of origin/main by 1 commit").

Acceptance criteria
Part 1: amend commit 75448b9 (@plan fix correction)

- `src/agents/index.ts` `plan` entry: `mode: "all"` (was `"subagent"` in 75448b9). Other fields unchanged.
- `src/agents/prompts/plan.md` line 3: wording revised to reflect that @plan can be invoked directly by the user OR via the task tool; both work now. Suggested new first sentence: "You can be invoked directly by the user (Tab / `@plan`) or delegated to by PRIME via the `task` tool. Either way, your output contract is identical: a written plan in the repo-shared plan directory." Rest of line 3 stays (the "when PRIME delegates, trust the brief..." portion).
- `test/agents.test.ts` regression tests updated for `mode: "all"`:
  - `plan agent is mode:subagent (task-tool-dispatchable)` → renamed `plan agent is task-tool-dispatchable (not mode:primary)`. Assertion changes from `mode === "subagent"` to `mode !== "primary"` (accepts either `"subagent"` or `"all"`).
  - `pilot-planner description does not collide with plan description`: unchanged, still valid.
- `test/agents.test.ts` mode-split tests (`has 2 primary` / `has the 12 subagents`) need restructuring: with `mode: "all"`, agents aren't cleanly primary-or-subagent. Two options: (a) change to "is task-tool-dispatchable" and "is top-level-invocable" checks, or (b) keep the mode-split tests but branch on `mode === "primary" || mode === "all"` for primary-capable, and `mode === "subagent" || mode === "all"` for subagent-capable. Pick (b): minimal churn, explicit about dual-capability.
- `.changeset/fix-prime-dispatch-to-plan-agent.md`: update the description to reflect `mode: "all"` (not `"subagent"`), and mention that top-level `@plan` invocation is preserved (no user-visible regression).

Part 2: new commit (Phase 3 delegation + @build → mode:all)
- `src/agents/index.ts` line 694: `build` entry has `mode: "all"` (was `"primary"`). Nothing else in the `build` entry or `BUILD_PERMISSIONS` (lines 329-353) changes.
- `src/agents/prompts/prime.md` Phase 3 (currently lines 261-285) rewritten so PRIME delegates execution to `@build` instead of running the file-edit loop itself. The exact phrase `Delegate to @build` appears in the new Phase 3 body. The old line `For each item in the plan's ## File-level changes:` is removed.
- `src/agents/prompts/prime.md` Context Firewall table (lines 347-353) gains a new row mandating delegation of Phase 3 execution to `@build`. Existing rows stay verbatim. The new row reads: `| Phase 3 plan execution (any multi-file edit against a plan) | @build | Phase 3 is mechanical — Sonnet/Kimi/GLM can do it; Opus time is expensive |`.
- `src/agents/prompts/prime.md` Subagent reference (lines 364-372) gains a `@build` bullet describing the new Phase 3 execution role. Suggested wording: `- @build — executes a written plan file-by-file. Runs per-file lint/tests inline, checks acceptance boxes, commits locally. PRIME delegates Phase 3 execution here.`
- `src/agents/prompts/build.md` section 2 "Confirm understanding" (lines 32-39) no longer writes user-facing narration; it is rewritten to describe preparing a return-payload summary that PRIME relays.
- `src/agents/prompts/build.md` section 4 "Final verification" (lines 58-64) no longer runs the full test suite or full lint. It is tightened to: all acceptance boxes `[x]`, `tsc_check` clean on each edited file, `git diff --stat` matches plan's `## File-level changes`. The phrase `Run the full test suite. It must pass.` is removed.
- `src/agents/prompts/build.md` section 5 "QA review" (lines 66-72) is removed entirely. PRIME owns QA-reviewer delegation.
- `src/agents/prompts/build.md` ends with a new "Return payload" section listing what `@build` returns to PRIME: (a) plan path, (b) commit SHAs from `git log --oneline <base>..HEAD`, (c) any plan mutations (threshold bumps, scope expansions), (d) anything unusual (pre-existing failures logged to plan's `## Open questions`, files touched outside `## File-level changes` with justification).
- `src/agents/prompts/build.md` "How to ask the user" section (lines 3-7): KEEP the `question` tool path. Rationale: with `mode: "all"`, @build may be invoked directly by the user top-level (e.g., `@build <plan-path>`); in that context, it CAN ask the user. When invoked as a subagent by PRIME, it should prefer STOP-to-PRIME, but the `question` tool is still allowed as a fallback. Add a sentence: "If invoked as a subagent by PRIME (via the task tool), prefer STOP-to-PRIME with a structured blocker payload over direct `question` tool calls — PRIME owns user interaction in that context."
- `src/agents/index.ts` `BUILD_PERMISSIONS.question` STAYS `"allow"`; this reverts the decision from the original plan. Rationale: same as above; top-level `@build` invocation needs the `question` tool.
- `src/commands/prompts/autopilot.md` line 76 updated. Old: `- **Phase 3 (Execute).** File-by-file. Check off acceptance criteria as you go — boxes are the signal both to the user and to the autopilot plugin.` New: `- **Phase 3 (Execute).** Delegate to @build. @build executes file-by-file and returns a summary; PRIME relays progress. Acceptance boxes get checked during @build's execution.`
- `test/agents.test.ts` line 12 comment updated to reflect dual-mode agents. Suggested: `// 14 agents total: prime (primary), plan+build (both — mode:all), 9 pure subagents, 2 pilot subagents.`
- `test/agents.test.ts` mode-split tests restructured per option (b) above: "primary-capable" accepts `mode: "primary" || "all"`; "subagent-capable" accepts `mode: "subagent" || "all"`. With that change, `plan` and `build` (both `mode: "all"`) show up in BOTH lists. Counts: 3 primary-capable (prime, plan, build); 13 subagent-capable (plan, build, qa-reviewer, qa-thorough, plan-reviewer, code-searcher, gap-analyzer, architecture-advisor, docs-maintainer, lib-reader, agents-md-writer, pilot-builder, pilot-planner), that is, the original 11 subagents (excluding plan) plus plan and build, both of which are also subagent-capable under `mode: "all"`.
- `test/agents.test.ts` gains a new regression guard `build agent is task-tool-dispatchable (not mode:primary)` asserting `agents["build"]!.mode !== "primary"`.
- `test/agents.test.ts` gains a new regression guard `prime prompt delegates Phase 3 to @build` asserting PRIME's prompt contains the exact string `Delegate to @build` AND does NOT contain the string `For each item in the plan's ## File-level changes:`.
- `test/agents.test.ts` gains a new regression guard `build prompt does not re-run full test suite (PRIME's Phase 4 owns it)` asserting `@build`'s prompt does NOT contain the exact string `Run the full test suite. It must pass.` It fails if a future edit re-embeds the old Section 4 wording.
- New changeset in `.changeset/` for the Phase 3 delegation work (separate from the amended one). Bump: `patch` (prompt/config change, not a breaking API change since the agent key names stay the same).

Part 3: verification (both parts together)
- `bun test` passes (all existing tests plus the new guards).
- `bun run typecheck` passes.
- `bun run build` succeeds (dist output refreshed).
- `git log --oneline origin/main..HEAD` shows exactly 2 commits: the amended 75448b9 (different SHA after amend) and the new Phase 3 commit.

File-level changes
src/agents/index.ts
- `plan` entry mode flipped `"primary"` → `"all"` (was `"subagent"` in the original 75448b9; corrected during amend). Other fields unchanged.
- `build` entry mode flipped `"primary"` → `"all"`. `BUILD_PERMISSIONS` block otherwise untouched; `question: "allow"` permission stays (reversed from the original plan's "deny" decision; rationale in AC items 50 and in the Open questions resolution below).
- `mode: "all"` is how OpenCode surfaces an agent to BOTH primary-mode user invocation (Tab + `@name`) AND the task-tool subagent picker. That's the correct fix for both @plan and @build: it preserves top-level invocation while adding task-tool visibility. The earlier `mode: "subagent"` approach would have removed Tab/top-level access unnecessarily.
- `mode: "primary"` specifically. `BUILD_PERMISSIONS` and `PLAN_PERMISSIONS` blocks unchanged.

src/agents/prompts/pilot-planner.md
- `description:` rewritten to remove the "Interactive planner for the pilot subsystem." prefix that collided with @plan's "Interactive planner. Orchestrates gap analysis...". The collision was misleading PRIME's task-tool picker into dispatching to pilot-planner when users wanted a normal plan. New description starts with "Pilot-subsystem YAML plan generator."
- Even with `mode: "all"`, the description collision could still mislead the picker. Sharpening pilot-planner's description makes the disambiguation unambiguous.

src/agents/prompts/plan.md
- Line 3 wording changed from "...`task` tool..." (the stale wording from the earlier `mode: "subagent"` version) to "You can be invoked directly by the user (Tab / `@plan`) or delegated to by PRIME via the `task` tool." This reflects the dual invocation shape enabled by `mode: "all"`.
- With `mode: "all"`, that wording became incorrect. This revision matches the actual behavior.

src/agents/prompts/prime.md
- Phase 3 rewritten to delegate execution to `@build` via the task tool. New Phase 3 body instructs PRIME to pass the absolute plan path to @build, verify `git diff --stat` on return matches the plan's `## File-level changes`, handle STOP payloads (three classes: cosmetic/approach/scope-expansion), verify pre-existing-failure logging, then proceed to Phase 4. Trivial-work carve-out (no-plan requests): PRIME still edits directly.
- Context Firewall table gains the row `| Phase 3 plan execution (any multi-file edit against a plan) | @build | Phase 3 is mechanical — Sonnet/Kimi/GLM can do it; Opus time is expensive |`. Existing rows unchanged.
- Subagent reference gains a `@build` bullet directly after the `@plan` bullet: `- @build — executes a written plan file-by-file. Runs per-file lint/tests inline, checks acceptance boxes, commits locally. Returns a structured payload with commit SHAs, plan mutations, and any STOP conditions. PRIME delegates Phase 3 execution here.`
- `git diff --stat` on @build's return so silent scope drift is caught; (3) regression guard `prime prompt delegates Phase 3 to @build` fails the build if the old inline-execution text sneaks back.

src/agents/prompts/build.md
- For subagent invocations, the `question` tool is forbidden: @build STOPs with a structured blocker payload PRIME relays. When @build is invoked top-level by the user (`@build <plan-path>`), the `question` tool is allowed (same rules as other primaries). Workflow-mechanics exception preserved verbatim.
- Section 4 tightened to: all acceptance boxes `[x]`, `tsc_check` clean on edited files, `git diff --stat` matches plan's File-level changes. Per-file tests during section 3 are expected; full-suite run belongs to PRIME's Phase 4.
- Regression guard `build prompt does not re-run full test suite` flags any revert to the old shape.

src/commands/prompts/autopilot.md
- Line 76 updated. Old: `- **Phase 3 (Execute).** File-by-file. Check off acceptance criteria as you go — boxes are the signal both to the user and to the autopilot plugin.` New: `- **Phase 3 (Execute).** Delegate to @build. @build executes file-by-file and returns a summary; PRIME relays progress. Acceptance boxes get checked during @build's execution.`

test/agents.test.ts
- `has 2 primary agents` test → `has 2 primary-capable agents besides plan (prime, build; mode=primary or mode=all)`: accepts `["primary", "all"]` via `toContain`.
- `has the 12 subagents` test → `has 13 subagent-capable agents (mode=subagent or mode=all)` with `plan` and `build` added to the list (both `mode: "all"`); accepts `["subagent", "all"]`.
- Existing regression test renamed from `plan agent is mode:subagent (task-tool-dispatchable)` → `plan agent is task-tool-dispatchable (not mode:primary)`; assertion changed from `.toBe("subagent")` to `.not.toBe("primary")` (accepts `"subagent"` or `"all"`).
- `build agent is task-tool-dispatchable (not mode:primary)`: asserts `agents["build"]!.mode !== "primary"`. Comment cites ccd1761 precedent and explains that mode:primary silently kills the Phase 3 delegation path.
- `prime prompt delegates Phase 3 to @build`: asserts `prime` prompt contains "Delegate to `@build`" (matches the backtick-wrapped code reference in the rewritten Phase 3) AND does NOT contain "For each item in the plan's `## File-level changes`:" (the old inline-execution loop header). Would fail if the old inline Phase 3 sneaks back.
- `build prompt does not re-run full test suite (PRIME's Phase 4 owns it)`: asserts `build` prompt does NOT contain `"Run the full test suite. It must pass."` nor `"Run lint. It must pass."`. Catches a revert that re-embeds the duplicate full-suite run.
- `mode: "all"` means plan and build aren't cleanly primary-or-subagent. The three new guards lock in the behavioral claims the prompt edits make: task-tool-dispatchability (catches index.ts reversion), Phase 3 delegation wording (catches prime.md reversion), no-duplicate-testing (catches build.md reversion).
- `.not.toBe("primary")` accepts more valid modes than `.toBe("subagent")`, so no false failures. The three new guards match the pattern of the existing `plan agent is task-tool-dispatchable` guard.

.changeset/fix-prime-dispatch-to-plan-agent.md (Part 1)
- Description updated to reflect the `mode: "all"` decision (no user-visible regression); it references the dispatch-to-pilot-planner bug + description collision fix.

.changeset/prime-delegates-phase3-to-build.md (Part 2)
Test plan
Execute in order:
1. `bun run typecheck`: must pass. Catches any type-surface regression from the `mode` string-literal change on `plan` and `build`.
2. `bun test test/agents.test.ts`: must pass. Specifically:
   - `returns exactly 14 agents` (unchanged count).
   - `has 2 primary-capable agents besides plan (prime, build; mode=primary or mode=all)` (restructured).
   - `has 13 subagent-capable agents (mode=subagent or mode=all)` (restructured, includes `plan` and `build`).
   - `plan agent is task-tool-dispatchable (not mode:primary)` (existing guard, renamed).
   - `build agent is task-tool-dispatchable (not mode:primary)` (new guard).
   - `prime prompt delegates Phase 3 to @build` (new guard).
   - `build prompt does not re-run full test suite (PRIME's Phase 4 owns it)` (new guard).
   - `pilot-planner description does not collide with plan description` (existing guard, unchanged).
   - `build has correct model and temperature`: must still pass; model/temperature unchanged.
   - `build bash object-form includes enumerated allow-list`: must still pass; `BUILD_PERMISSIONS` bash block untouched.
   - `build bash object-form keeps destructive denies`: must still pass.
   - `build agent keeps object-form destructive denies`: must still pass.
3. `bun test test/harness-models.test.ts`: must pass. `build` stays `mid` tier; no change expected.
4. `bun test test/prompts-no-dangling-paths.test.ts`: must pass. No forbidden paths introduced.
5. `bun test`: full suite must pass. Catches any cross-file regression.
6. `bun run build`: must succeed. Refreshes `dist/` output including the edited prompts so downstream users pick up the new behavior on next `bun update`.

Manual smoke checks (optional but recommended before merge)
- After a local install (`bunx @glrs-dev/harness-opencode install` pointing at the local `dist/`), open a real task and confirm PRIME invokes the task tool with `agent: "build"` during Phase 3.
- Confirm `@build <plan-path>` at the top level of an OpenCode session STILL works (expected: `mode: "all"` preserves top-level invocation). Same for `@plan`.

Out of scope
- `BUILD_PERMISSIONS` bash rules (`CORE_BASH_ALLOW_LIST`, `CORE_DESTRUCTIVE_BASH_DENIES`, `"git clean *": "deny"`, `"git reset --hard*": "ask"`). Permissions work identically for primary and subagent. `@build` still needs general bash + `git commit` for Phase 3 checkpointing. `question: "allow"` also stays; see Open questions resolution below.
- `@plan`'s prompt body beyond the Part 1 amend (line 3 dual-invocation revision). Its workflow (Interview / Ground / Gap analysis / Write / Adversarial review / Report) is out of scope here.
- `@qa-reviewer` / `@qa-thorough` / `@plan-reviewer` / `@code-searcher` / `@gap-analyzer` prompts. Their references to "the build agent" stay semantically correct: `@build` still executes plans, `@qa-reviewer` still reviews its output.
- `src/agents/prompts/architecture-advisor.md` line 13 ("The build agent has failed at the same task twice"): trigger condition doesn't depend on `@build`'s mode.
- `docs/archive/claude-code-fallbacks.md`: archived per root rule guidance.
- `src/agents/AGENTS.md` tier comment at line 25 (`mid — Sonnet-class (@qa-reviewer, @plan-reviewer, @build)`): tier isn't changing.
- The pilot subsystem (`pilot-builder`, `pilot-plugin.ts`, anything under `src/pilot/`). The `pilot-planner.md` description rewrite in Part 1 is a user-facing prompt disambiguation (not a pilot-subsystem behavior change) and is explicitly in scope for Part 1.
- `src/harness-models.ts` tier mapping. `@build` stays `mid`.

Open questions
- The open question (whether the `question` tool path should stay on `@build`) was resolved during execution: with `@build` at `mode: "all"` (invocable top-level), the `question` tool is scoped via prompt-level guidance rather than permission-level deny. Subagent invocations STOP with a structured blocker payload PRIME relays; top-level invocations may use the `question` tool as primaries do. `BUILD_PERMISSIONS.question` stays `"allow"` to enable the top-level path.
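
The resolution above amounts to a branch on how `@build` was invoked. The sketch below is illustrative only: the plan expresses this as prompt-level guidance, not code, and the `Invocation`, `StopPayload`, `UserQuestion`, and `escalate` names are hypothetical. Only the three STOP classes and the top-level/subagent split come from the plan text.

```typescript
// Hypothetical model of the prompt-level rule; none of these names exist in the repo.
type Invocation = "top-level" | "subagent";
type StopClass = "cosmetic" | "approach" | "scope-expansion";

interface StopPayload {
  kind: "stop";
  stopClass: StopClass;
  blocker: string; // relayed to the user by PRIME
}

interface UserQuestion {
  kind: "question";
  text: string; // asked directly through the question tool
}

// Subagent invocations hand the blocker back to PRIME;
// top-level invocations may ask the user directly.
function escalate(
  invocation: Invocation,
  stopClass: StopClass,
  blocker: string,
): StopPayload | UserQuestion {
  if (invocation === "subagent") {
    return { kind: "stop", stopClass, blocker };
  }
  return { kind: "question", text: blocker };
}

const viaPrime = escalate("subagent", "scope-expansion", "Plan omits a file that must change.");
const direct = escalate("top-level", "scope-expansion", "Plan omits a file that must change.");

console.log(viaPrime.kind); // stop
console.log(direct.kind); // question
```

Keeping the permission at `"allow"` and putting the branch in the prompt means both paths stay available; a permission-level deny would have removed the top-level path as well.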