diff --git a/docs/AGENTS.md b/docs/AGENTS.md index 2d05b31fe6..6e7e1e6ab0 100644 --- a/docs/AGENTS.md +++ b/docs/AGENTS.md @@ -69,8 +69,6 @@ Core workflow: - If a PR has Codex review comments, address + resolve them, then re-request review by commenting `@codex review` on the PR. - Prefer `gh` CLI for GitHub interactions over manual web/curl flows. -- In Orchestrator mode, delegate implementation/verification commands to `exec` or `explore` sub-agents and integrate their patches; do not bypass delegation with direct local edits. -- In Orchestrator mode, route higher-complexity implementation tasks to `plan` sub-agents so they can research and produce a precise plan before auto-handoff to implementation. - User preference: when work is already on an open PR, push branch updates at the end of each completed change set so the PR stays current. - **PR creation gate:** Do **not** open/create a pull request unless the user explicitly asks (e.g., "open a PR", "create PR", "submit this"). By default, complete local validation, commit/push branch updates as requested, and let the user review before deciding whether to open a PR. @@ -82,11 +80,11 @@ Core workflow: When a PR exists, you MUST remain in this loop until the PR is fully ready: 1. Push your latest fixes. -2. Run local validation (`make static-check` and targeted tests as needed); in Orchestrator mode, delegate command execution to sub-agents. +2. Run local validation (`make static-check` and targeted tests as needed). 3. Request review with `@codex review`. 4. Run `./scripts/wait_pr_ready.sh ` (which must execute `./scripts/wait_pr_checks.sh --once` while checks are pending). -5. If Codex leaves comments, address them (delegate fixes in Orchestrator mode), resolve threads with `./scripts/resolve_pr_comment.sh `, push, and repeat. -6. If checks/mergeability fail, fix issues locally (delegate fixes in Orchestrator mode), push, and repeat. +5. 
If Codex leaves comments, address them, resolve threads with `./scripts/resolve_pr_comment.sh `, push, and repeat. +6. If checks/mergeability fail, fix issues locally, push, and repeat. The only early-stop exception is when the reviewer is clearly misunderstanding the intended change and further churn would be counterproductive. In that case, leave a clarifying PR comment and pause for human direction. diff --git a/docs/agents/index.mdx b/docs/agents/index.mdx index 6cdfa9914a..f5f7a75610 100644 --- a/docs/agents/index.mdx +++ b/docs/agents/index.mdx @@ -315,181 +315,6 @@ When a task involves repeated screenshot/action/verify loops for desktop GUI int -### Orchestrator - -**Coordinate sub-agent implementation and apply patches** - - - -```md ---- -name: Orchestrator -description: Coordinate sub-agent implementation and apply patches -base: exec -subagent: - runnable: false - append_prompt: | - You are running as a sub-agent orchestrator in a child workspace. - - - Your parent workspace handles all PR management. - Do NOT create pull requests, push to remote branches, or run any - `gh pr` / `git push` commands. This applies even if AGENTS.md or - other instructions say otherwise — those PR instructions target the - top-level workspace only. - - Orchestrate your delegated subtasks (spawn, await, apply patches, - verify locally), then call `agent_report` exactly once with: - - What changed (paths / key details) - - What you ran (tests, typecheck, lint) - - Any follow-ups / risks - - Do not expand scope beyond the delegated task. -tools: - add: - - ask_user_question - remove: - - propose_plan - # Keep Orchestrator focused on coordination: no direct file edits. - - file_edit_.* ---- - -You are an internal Orchestrator agent running in Exec mode. - -**Mission:** coordinate implementation by delegating investigation + coding to sub-agents, then integrating their patches into this workspace. 
- -When a plan is present (default): - -- Treat the accepted plan as the source of truth. Its file paths, symbols, and structure were validated during planning — do not routinely spawn `explore` to re-confirm them. Exception: if the plan references stale paths or appears to have been authored/edited by the user without planner validation, a single targeted `explore` to sanity-check critical paths is acceptable. -- Spawning `explore` to gather _additional_ context beyond what the plan provides is encouraged (e.g., checking whether a helper already exists, locating test files not mentioned in the plan, discovering existing patterns to match). This produces better implementation task briefs. -- Do not spawn `explore` just to verify that a planner-generated plan is correct — that is the planner's job, and the plan was accepted by the user. -- Convert the plan into concrete implementation subtasks and start delegation (`exec` for low complexity, `plan` for higher complexity). - -What you are allowed to do directly in this workspace: - -- Spawn/await/manage sub-agent tasks (`task`, `task_await`, `task_list`, `task_terminate`). -- Apply patches (`task_apply_git_patch`). -- Use `bash` for orchestration workflows: repo coordination via `git`/`gh`, targeted post-apply verification runs, and waiting on review/CI completion after PR updates (for example: `git push`, `gh pr comment`, `gh pr view`, `gh pr checks --watch`). Only run `gh pr create` when the user explicitly asks you to open a PR. -- Ask clarifying questions with `ask_user_question` when blocked. -- Coordinate targeted verification after integrating patches by running focused checks directly (when appropriate) or delegating runs to `explore`/`exec`. -- Delegate patch-conflict reconciliation to `exec` sub-agents. - -Hard rules (delegate-first): - -- Trust `explore` sub-agent reports as authoritative for repo facts (paths/symbols/callsites). 
Do not redo the same investigation yourself; only re-check if the report is ambiguous or contradicts other evidence. -- For correctness claims, an `explore` sub-agent report counts as having read the referenced files. -- **Do not do broad repo investigation here.** If you need context, spawn an `explore` sub-agent with a narrow prompt (keeps this agent focused on coordination). -- **Do not implement features/bugfixes directly here.** Spawn `exec` (simple) or `plan` (complex) sub-agents and have them complete the work end-to-end. -- **Do not use `bash` for file reads/writes, manual code editing, or broad repo exploration.** `bash` in this workspace is for orchestration-only operations: `git`/`gh` repo management, targeted post-apply verification checks, and waiting for PR review/CI outcomes. If direct checks fail due to code issues, delegate fixes to `exec`/`plan` sub-agents instead of implementing changes here. -- **Never read or scan session storage.** This includes `~/.mux/sessions/**` and `~/.mux/sessions/subagent-patches/**`. Treat session storage as an internal implementation detail; do not shell out to locate patch artifacts on disk. Only use `task_apply_git_patch` to access patches. - -Delegation guide: - -- Use `explore` for narrowly-scoped read-only questions (confirm an assumption, locate a symbol/callsite, find relevant tests). Avoid "scan the repo" prompts. -- Use `exec` for straightforward, low-complexity work where the implementation path is obvious from the task brief. - - Good fit: single-file edits, localized wiring to existing helpers, straightforward command execution, or narrowly scoped follow-ups with clear acceptance. 
- - Provide a compact task brief (so the sub-agent can act without reading the full plan) with: - - Task: one sentence - - Background (why this matters): 1–3 bullets - - Scope / non-goals: what to change, and what not to change - - Starting points: relevant files/symbols/paths (from prior exploration) - - Acceptance: bullets / checks - - Deliverables: commits + verification commands to run - - Constraints: - - Do not expand scope. - - Prefer `explore` tasks for repo investigation (paths/symbols/tests/patterns) to preserve your context window for implementation. - Trust Explore reports as authoritative; do not re-verify unless ambiguous/contradictory. - If starting points + acceptance are already clear, skip initial explore and only explore when blocked. - - Create one or more git commits before `agent_report`. -- Use `plan` for higher-complexity subtasks that touch multiple files/locations, require non-trivial investigation, or have an unclear implementation approach. - - Default to `plan` when a subtask needs coordinated updates across multiple locations, unless the edits are mechanical and already fully specified. - - For higher-complexity implementation work, prefer `plan` over `exec` so the sub-agent can do targeted research and produce a precise plan before implementation begins. - - Good fit: multi-file refactors, cross-module behavior changes, unfamiliar subsystems, or work where sequencing/dependencies need discovery. - - Plan subtasks automatically hand off to implementation after a successful `propose_plan`; expect the usual task completion output once implementation finishes. - - For `plan` briefs, prioritize goal + constraints + acceptance criteria over file-by-file diff instructions. -- Use `desktop` for GUI-heavy desktop automation that requires repeated screenshot → act → verify loops (for example, interacting with application windows, clicking through UI flows, or visual verification). 
The desktop agent enforces a grounding discipline that keeps visual context local. - -Recommended Orchestrator → Exec task brief template: - -- Task: -- Background (why this matters): - - -- Scope / non-goals: - - Scope: - - Non-goals: -- Starting points: -- Dependencies / assumptions: - - Assumes: - - If unmet: stop and report back; do not expand scope to create prerequisites. -- Acceptance: -- Deliverables: - - Commits: - - Verification: -- Constraints: - - Do not expand scope. - - Prefer `explore` tasks for repo investigation (paths/symbols/tests/patterns) to preserve your context window for implementation. - Trust Explore reports as authoritative; do not re-verify unless ambiguous/contradictory. - If starting points + acceptance are already clear, skip initial explore and only explore when blocked. - - Create one or more git commits before `agent_report`. - -Dependency analysis (required before spawning implementation tasks — `exec` or `plan`): - -- For each candidate subtask, write: - - Outputs: files/targets/artifacts introduced/renamed/generated - - Inputs / prerequisites (including for verification): what must already exist -- A subtask is "independent" only if its patch can be applied + verified on the current parent workspace HEAD, without any other pending patch. -- Parallelism is the default: maximize the size of each independent batch and run it in parallel. - Use the sequential protocol only when a subtask has a concrete prerequisite on another subtask's outputs. -- If task B depends on outputs from task A: - - Do not spawn B until A has completed and A's patch is applied in the parent workspace. - - If the dependency chain is tight (download → generate → wire-up), prefer one `exec` task rather than splitting. - -Example dependency chain (schema download → generation): - -- Task A outputs: a new download target + new schema files. -- Task B inputs: those schema files; verifies by running generation. 
-- Therefore: run Task A (await + apply patch) before spawning Task B. - -Patch integration loop (default): - -1. Identify a batch of independent subtasks. -2. Spawn one implementation sub-agent task per subtask with `run_in_background: true` (`exec` for low complexity, `plan` for higher complexity). -3. Await the batch via `task_await`. -4. For each successful implementation task (`exec` directly, or `plan` after auto-handoff to implementation), integrate patches one at a time: - - Treat every successful child task with a `taskId` as pending patch integration, whether the completion arrived inline from `task` or later from `task_await`. - - Complete each dry-run + real-apply pair before starting the next patch. Applying one patch changes `HEAD`, which can invalidate later dry-run results. - - Dry-run apply: `task_apply_git_patch` with `dry_run: true`. - - If dry-run succeeds, immediately apply for real: `task_apply_git_patch` with `dry_run: false`. - - Do not assume an inline `status: completed` result means the child changes are already present in this workspace. - - If dry-run fails, treat it as a patch conflict and delegate reconciliation: - 1. Do not attempt a real apply for that patch in this workspace. - 2. Spawn a dedicated `exec` task. In the brief, include the original failing `task_id` and instruct the sub-agent to replay that patch via `task_apply_git_patch`, resolve conflicts in its own workspace, run `git am --continue`, commit the resolved result, and report back with a new patch to apply cleanly. - - If real apply fails unexpectedly: - 1. Restore a clean working tree before delegating: run `git am --abort` via `bash` only when a git-am session is in progress; if abort reports no operation in progress, continue. - 2. Then follow the same delegated reconciliation flow above. -5. 
Verify + review: - - Run focused verification directly with `bash` when practical (for example: targeted tests or the repo's standard full-validation command), or delegate verification to `explore`/`exec` when investigation/fixes are likely. - - Use `git`/`gh` directly for PR orchestration when a PR already exists (pushes, review-request comments, replies to review remarks, and CI/check-status waiting loops). Create a new PR only when the user explicitly asks. - - PASS: summary-only (no long logs). - - FAIL: include the failing command + key error lines; then delegate a fix to `exec`/`plan` and re-verify. - -Sequential protocol (only for dependency chains): - -1. Spawn the prerequisite implementation task (`exec` or `plan`, based on complexity) with `run_in_background: false`. -2. If step 1 returns `queued`/`running` without a completed report, call `task_await` with the returned `taskId` before attempting any patch apply. If step 1 returns `status: completed` inline, that same `taskId` still requires patch application. -3. Dry-run apply its patch (`dry_run: true`); then apply for real (`dry_run: false`). If either step fails, follow the conflict playbook above (including `git am --abort` only when a real apply leaves a git-am session in progress). -4. Only after the patch is applied, spawn the dependent implementation task. -5. Repeat until the dependency chain is complete. - -Note: child workspaces are created at spawn time. Spawning dependents too early means they work from the wrong repo snapshot and get forced into scope expansion. - -Keep context minimal: - -- Do not request, paste, or restate large plans. -- Prefer short, actionable prompts, but include enough context that the sub-agent does not need your plan file. - - Child workspaces do not automatically have access to the parent's plan file; summarize just the relevant slice or provide file pointers. -- Prefer file paths/symbols over long prose. 
-``` - - - ### Plan **Create a plan before coding** diff --git a/docs/img/orchestrate-agents.webp b/docs/img/orchestrate-agents.webp deleted file mode 100644 index 1f8a18503a..0000000000 Binary files a/docs/img/orchestrate-agents.webp and /dev/null differ diff --git a/scripts/capture-readme-screenshots.ts b/scripts/capture-readme-screenshots.ts index 2ce3068535..a70fe60785 100644 --- a/scripts/capture-readme-screenshots.ts +++ b/scripts/capture-readme-screenshots.ts @@ -155,14 +155,6 @@ const STORIES: StoryDef[] = [ postProcess: async (pngBuffer: Buffer) => sharp(pngBuffer).webp({ quality: WEBP_QUALITY }).toBuffer(), }, - { - exportName: "OrchestrateAgents", - storyId: `${STORY_ID_PREFIX}orchestrate-agents`, - outputFile: "orchestrate-agents.webp", - // Narrower viewport makes the plan card + "Start Orchestrator" button more prominent - viewport: { width: 1200, height: 1188 }, - clip: { x: 0, y: 0, width: 1200, height: 1000 }, - }, ]; // --------------------------------------------------------------------------- diff --git a/src/browser/components/AgentModePicker/AgentModePicker.tsx b/src/browser/components/AgentModePicker/AgentModePicker.tsx index 46cb151d76..834c9b8635 100644 --- a/src/browser/components/AgentModePicker/AgentModePicker.tsx +++ b/src/browser/components/AgentModePicker/AgentModePicker.tsx @@ -1,5 +1,5 @@ import React, { useCallback, useEffect, useMemo, useRef, useState } from "react"; -import { Bot, ChevronDown, Route, SquareCode, Workflow } from "lucide-react"; +import { Bot, ChevronDown, Route, SquareCode } from "lucide-react"; import type { LucideIcon } from "lucide-react"; import { useAgent } from "@/browser/contexts/AgentContext"; @@ -47,7 +47,6 @@ interface AgentOption { const AGENT_ICONS: Record = { plan: Route, exec: SquareCode, - orchestrator: Workflow, }; const DEFAULT_AGENT_ICON: LucideIcon = Bot; diff --git a/src/browser/features/Settings/Sections/TasksSection.agents.ts b/src/browser/features/Settings/Sections/TasksSection.agents.ts 
index 23f08e372f..b7c6217fa8 100644 --- a/src/browser/features/Settings/Sections/TasksSection.agents.ts +++ b/src/browser/features/Settings/Sections/TasksSection.agents.ts @@ -91,16 +91,6 @@ export const FALLBACK_AGENTS: AgentDefinitionDescriptor[] = [ require: ["propose_name"], }, }, - { - id: "orchestrator", - scope: "built-in", - name: "Orchestrator", - description: "Coordinate sub-agent implementation and apply patches", - uiSelectable: true, - uiRoutable: true, - subagentRunnable: false, - base: "exec", - }, ]; function compareAgentsByName(a: AgentDefinitionDescriptor, b: AgentDefinitionDescriptor): number { diff --git a/src/browser/features/Settings/Sections/TasksSection.tsx b/src/browser/features/Settings/Sections/TasksSection.tsx index 770ee7d0b4..0f04d19f8b 100644 --- a/src/browser/features/Settings/Sections/TasksSection.tsx +++ b/src/browser/features/Settings/Sections/TasksSection.tsx @@ -34,11 +34,9 @@ import { import { DEFAULT_TASK_SETTINGS, TASK_SETTINGS_LIMITS, - isPlanSubagentExecutorRouting, normalizeSubagentAiDefaults, shouldMirrorAgentDefaultToLegacySubagent, normalizeTaskSettings, - type PlanSubagentExecutorRouting, type SubagentAiDefaults, type SubagentAiDefaultsEntry, type TaskSettings, @@ -280,9 +278,7 @@ function areTaskSettingsEqual(a: TaskSettings, b: TaskSettings): boolean { a.maxParallelAgentTasks === b.maxParallelAgentTasks && a.maxTaskNestingDepth === b.maxTaskNestingDepth && a.proposePlanImplementReplacesChatHistory === b.proposePlanImplementReplacesChatHistory && - a.preserveSubagentsUntilArchive === b.preserveSubagentsUntilArchive && - a.planSubagentExecutorRouting === b.planSubagentExecutorRouting && - a.planSubagentDefaultsToOrchestrator === b.planSubagentDefaultsToOrchestrator + a.preserveSubagentsUntilArchive === b.preserveSubagentsUntilArchive ); } @@ -742,25 +738,10 @@ export function TasksSection() { ); }; - const setPlanSubagentExecutorRouting = (value: string) => { - if (!isPlanSubagentExecutorRouting(value)) { - return; - } 
- - setTaskSettings((prev) => - normalizeTaskSettings({ - ...prev, - planSubagentExecutorRouting: value, - }) - ); - }; const setNewWorkspaceDefaultAgentId = (agentId: string) => { setGlobalDefaultAgentIdRaw(coerceAgentId(agentId)); }; - const planSubagentExecutorRouting: PlanSubagentExecutorRouting = - taskSettings.planSubagentExecutorRouting ?? "exec"; - const setAgentModel = (agentId: string, value: string) => { setAgentAiDefaults((prev) => updateAgentDefaultEntry(prev, agentId, (updated) => { @@ -1279,28 +1260,6 @@ export function TasksSection() { aria-label="Toggle preserve subagents until archive" /> - -
-
- Plan sub-agents: executor routing
-
- Choose how plan sub-agent tasks route after propose_plan. -
-
- -
{saveError ?
{saveError}
: null} diff --git a/src/browser/features/Tools/ProposePlan/ProposePlanToolCall.stories.tsx b/src/browser/features/Tools/ProposePlan/ProposePlanToolCall.stories.tsx index 1406b811da..be85905d3d 100644 --- a/src/browser/features/Tools/ProposePlan/ProposePlanToolCall.stories.tsx +++ b/src/browser/features/Tools/ProposePlan/ProposePlanToolCall.stories.tsx @@ -2,9 +2,8 @@ import type { AppStory } from "@/browser/stories/meta.js"; import { appMeta, AppWithMocks } from "@/browser/stories/meta.js"; import { setupSimpleChatStory } from "@/browser/stories/helpers/chatSetup"; import { createAssistantMessage, createUserMessage } from "@/browser/stories/mocks/messages"; -import { createProposePlanTool, createTodoWriteTool } from "@/browser/stories/mocks/tools"; +import { createProposePlanTool } from "@/browser/stories/mocks/tools"; import { STABLE_TIMESTAMP } from "@/browser/stories/mocks/workspaces"; -import { PLAN_AUTO_ROUTING_STATUS_MESSAGE } from "@/common/constants/planAutoRoutingStatus"; const meta = { ...appMeta, title: "App/Chat/Tools/ProposePlan" }; export default meta; @@ -86,7 +85,7 @@ graph TD /** * Same as ProposePlan but with agent mode set to "plan". - * Shows Implement + Start Orchestrator buttons (no Continue in Auto). + * Shows the Implement button (no Continue in Auto). */ export const ProposePlanInPlanMode: AppStory = { render: () => ( @@ -154,79 +153,7 @@ graph TD description: { story: 'Same as ProposePlan but with agent mode set to "plan". ' + - "Shows Implement and Start Orchestrator buttons instead of Continue in Auto.", - }, - }, - }, -}; - -/** - * Captures the handoff pause after a plan is presented and before the executor stream starts. - * - * This reproduces the visual state where the sidebar shows "Deciding execution strategy…" - * while the proposed plan remains visible in the conversation. 
- */ -export const ProposePlanAutoRoutingDecisionGap: AppStory = { - render: () => ( - - setupSimpleChatStory({ - workspaceId: "ws-plan-auto-routing-gap", - workspaceName: "feature/plan-auto-routing", - messages: [ - createUserMessage( - "msg-1", - "Plan and implement a safe migration rollout for auth tokens.", - { - historySequence: 1, - timestamp: STABLE_TIMESTAMP - 240000, - } - ), - createAssistantMessage("msg-2", "Here is the implementation plan.", { - historySequence: 2, - timestamp: STABLE_TIMESTAMP - 230000, - toolCalls: [ - createProposePlanTool( - "call-plan-1", - `# Auth Token Migration Rollout - -## Goals - -- Migrate token validation to the new signing service. -- Maintain compatibility during rollout. -- Keep rollback simple and low risk. - -## Steps - -1. Add dual-read token validation behind a feature flag. -2. Ship telemetry for token verification outcomes. -3. Enable new validator for 10% of traffic. -4. Ramp to 100% after stability checks. -5. Remove legacy validator once metrics stay healthy. - -## Rollback - -- Disable the rollout flag to return to legacy validation immediately. -- Keep telemetry running to confirm recovery.` - ), - ], - }), - createAssistantMessage("msg-3", "Selecting the right executor for this plan.", { - historySequence: 3, - timestamp: STABLE_TIMESTAMP - 220000, - toolCalls: [createTodoWriteTool("call-status-1", PLAN_AUTO_ROUTING_STATUS_MESSAGE)], - }), - ], - }) - } - /> - ), - parameters: { - docs: { - description: { - story: - "Chromatic regression story for the plan auto-routing gap: after `propose_plan` succeeds, " + - "the sidebar stays in a working state with a 'Deciding execution strategy…' status before executor kickoff.", + "Shows the Implement button instead of Continue in Auto.", }, }, }, @@ -235,7 +162,7 @@ export const ProposePlanAutoRoutingDecisionGap: AppStory = { /** * Mobile viewport version of ProposePlan. 
* - * Verifies that on narrow screens the primary plan actions (Implement / Start Orchestrator) + * Verifies that on narrow screens the primary plan actions (Implement / Continue in Auto) * render as shortcut icons in the left action row (instead of right-aligned buttons). */ export const ProposePlanMobile: AppStory = { @@ -246,7 +173,7 @@ export const ProposePlanMobile: AppStory = { docs: { description: { story: - "Renders ProposePlan at an iPhone-sized viewport to verify that Implement / Start Orchestrator " + + "Renders ProposePlan at an iPhone-sized viewport to verify that Implement / Continue in Auto " + "appear as shortcut icons in the left action row (preventing right-side overflow on small screens).", }, }, diff --git a/src/browser/features/Tools/ProposePlanToolCall.test.tsx b/src/browser/features/Tools/ProposePlanToolCall.test.tsx index d91df275b2..3d95e36519 100644 --- a/src/browser/features/Tools/ProposePlanToolCall.test.tsx +++ b/src/browser/features/Tools/ProposePlanToolCall.test.tsx @@ -180,19 +180,6 @@ const TEST_AGENTS: AgentDefinitionDescriptor[] = [ thinkingLevel: "high", }, }, - { - id: "orchestrator", - scope: "built-in", - name: "Orchestrator", - uiSelectable: true, - uiRoutable: true, - subagentRunnable: true, - base: "exec", - aiDefaults: { - model: "openai:gpt-5.2-pro", - thinkingLevel: "medium", - }, - }, ]; const noop = () => { @@ -762,185 +749,4 @@ describe("ProposePlanToolCall", () => { expect(summaryMessage.parts?.[0]?.text).toContain("*Plan file preserved at:*"); expect(summaryMessage.parts?.[0]?.text).toContain(planPath); }); - - test("switches to orchestrator and sends a message when clicking Start Orchestrator", async () => { - const workspaceId = "ws-123"; - const planPath = "~/.mux/plans/demo/ws-123.md"; - const planModel = "anthropic:claude-sonnet-4-5"; - const planThinking = "high"; - const orchestratorModel = "openai:gpt-5.2-pro"; - const orchestratorThinking = "medium"; - - // Start in plan mode. 
- window.localStorage.setItem(getAgentIdKey(workspaceId), JSON.stringify("plan")); - updatePersistedState(getModelKey(workspaceId), planModel); - updatePersistedState(getThinkingLevelKey(workspaceId), planThinking); - updatePersistedState(AGENT_AI_DEFAULTS_KEY, { - orchestrator: { modelString: orchestratorModel, thinkingLevel: orchestratorThinking }, - }); - - const replaceChatHistoryCalls: unknown[] = []; - const sendMessageCalls: SendMessageArgs[] = []; - - mockApi = { - config: { - getConfig: () => - Promise.resolve({ - taskSettings: { maxParallelAgentTasks: 3, maxTaskNestingDepth: 3 }, - agentAiDefaults: {}, - subagentAiDefaults: {}, - }), - }, - workspace: { - getPlanContent: () => - Promise.resolve({ - success: true, - data: { content: "# My Plan\n\nDo the thing.", path: planPath }, - }), - replaceChatHistory: (args) => { - replaceChatHistoryCalls.push(args); - return Promise.resolve({ success: true, data: undefined }); - }, - sendMessage: (args: SendMessageArgs) => { - sendMessageCalls.push(args); - return Promise.resolve({ success: true, data: undefined }); - }, - }, - }; - - const view = renderToolCall( - - ); - - fireEvent.click(view.getByRole("button", { name: "Start Orchestrator" })); - - await waitFor(() => expect(sendMessageCalls.length).toBe(1)); - expect(sendMessageCalls[0]?.message).toBe( - "Start orchestrating the implementation of this plan." - ); - expect(sendMessageCalls[0]?.options.agentId).toBe("orchestrator"); - expect(sendMessageCalls[0]?.options.model).toBe(orchestratorModel); - expect(sendMessageCalls[0]?.options.thinkingLevel).toBe(orchestratorThinking); - expect(replaceChatHistoryCalls.length).toBe(0); - - // Clicking Start Orchestrator should switch the workspace agent to orchestrator. 
- const agentKey = getAgentIdKey(workspaceId); - const modelKey = getModelKey(workspaceId); - const thinkingKey = getThinkingLevelKey(workspaceId); - const updatePersistedStateMaybeMock = updatePersistedState as unknown as { - mock?: { calls: unknown[][] }; - }; - if (updatePersistedStateMaybeMock.mock) { - expect(updatePersistedState).toHaveBeenCalledWith(agentKey, "orchestrator"); - expect(updatePersistedState).toHaveBeenCalledWith(modelKey, orchestratorModel); - expect(updatePersistedState).toHaveBeenCalledWith(thinkingKey, orchestratorThinking); - } else { - expect(JSON.parse(window.localStorage.getItem(agentKey)!)).toBe("orchestrator"); - expect(JSON.parse(window.localStorage.getItem(modelKey)!)).toBe(orchestratorModel); - expect(JSON.parse(window.localStorage.getItem(thinkingKey)!)).toBe(orchestratorThinking); - } - }); - - test("replaces chat history before starting orchestrator when setting enabled", async () => { - const workspaceId = "ws-123"; - const planPath = "~/.mux/plans/demo/ws-123.md"; - - // Start in plan mode. 
- window.localStorage.setItem(getAgentIdKey(workspaceId), JSON.stringify("plan")); - - const calls: Array<"replaceChatHistory" | "sendMessage"> = []; - const replaceChatHistoryCalls: Array<{ - workspaceId: string; - summaryMessage: unknown; - mode?: "destructive" | "append-compaction-boundary" | null; - deletePlanFile?: boolean; - }> = []; - const sendMessageCalls: SendMessageArgs[] = []; - - mockApi = { - config: { - getConfig: () => - Promise.resolve({ - taskSettings: { - maxParallelAgentTasks: 3, - maxTaskNestingDepth: 3, - proposePlanImplementReplacesChatHistory: true, - }, - agentAiDefaults: {}, - subagentAiDefaults: {}, - }), - }, - workspace: { - getPlanContent: () => - Promise.resolve({ - success: true, - data: { content: "# My Plan\n\nDo the thing.", path: planPath }, - }), - replaceChatHistory: (args) => { - calls.push("replaceChatHistory"); - replaceChatHistoryCalls.push(args); - return Promise.resolve({ success: true, data: undefined }); - }, - sendMessage: (args: SendMessageArgs) => { - calls.push("sendMessage"); - sendMessageCalls.push(args); - return Promise.resolve({ success: true, data: undefined }); - }, - }, - }; - - const view = renderToolCall( - - ); - - fireEvent.click(view.getByRole("button", { name: "Start Orchestrator" })); - - await waitFor(() => expect(sendMessageCalls.length).toBe(1)); - expect(sendMessageCalls[0]?.message).toBe( - "Start orchestrating the implementation of this plan." 
- ); - expect(sendMessageCalls[0]?.options.agentId).toBe("orchestrator"); - - expect(replaceChatHistoryCalls.length).toBe(1); - expect(calls).toEqual(["replaceChatHistory", "sendMessage"]); - - const replaceArgs = replaceChatHistoryCalls[0]; - expect(replaceArgs?.deletePlanFile).toBe(false); - expect(replaceArgs?.mode).toBe("append-compaction-boundary"); - - const summaryMessage = replaceArgs?.summaryMessage as { - role?: string; - metadata?: { agentId?: string }; - parts?: Array<{ type?: string; text?: string }>; - }; - - expect(summaryMessage.role).toBe("assistant"); - expect(summaryMessage.parts?.[0]?.text).toContain( - "Note: This chat already contains the full plan" - ); - expect(summaryMessage.parts?.[0]?.text).not.toContain("Orchestrator mode"); - expect(summaryMessage.metadata?.agentId).toBe("plan"); - expect(summaryMessage.parts?.[0]?.text).toContain("*Plan file preserved at:*"); - expect(summaryMessage.parts?.[0]?.text).toContain(planPath); - }); }); diff --git a/src/browser/features/Tools/ProposePlanToolCall.tsx b/src/browser/features/Tools/ProposePlanToolCall.tsx index 524a480213..e4c1765e69 100644 --- a/src/browser/features/Tools/ProposePlanToolCall.tsx +++ b/src/browser/features/Tools/ProposePlanToolCall.tsx @@ -69,7 +69,6 @@ import { Pencil, Play, Sparkles, - Workflow, X, } from "lucide-react"; import { ShareMessagePopover } from "@/browser/components/ShareMessagePopover/ShareMessagePopover"; @@ -165,20 +164,17 @@ export const ProposePlanToolCall: React.FC = (props) = const { expanded, toggleExpanded } = useToolExpansion(true); // Expand by default const [showRaw, setShowRaw] = useState(false); const [annotateMode, setAnnotateMode] = useState(false); - const [isStartingOrchestrator, setIsStartingOrchestrator] = useState(false); const [isImplementing, setIsImplementing] = useState(false); const [isContinuingInAuto, setIsContinuingInAuto] = useState(false); const [implementReplacesChatHistory, setImplementReplacesChatHistory] = useState(false); - // On 
small screens, render the primary plan actions (Implement / Start Orchestrator / - // Continue in Auto) as shortcut icons alongside the other action buttons to avoid - // right-side overflow. + // On small screens, render the primary plan actions (Implement / Continue in Auto) as + // shortcut icons alongside the other action buttons to avoid right-side overflow. const [isNarrowScreen, setIsNarrowScreen] = useState(() => { if (typeof window === "undefined") return false; return window.innerWidth <= 768; }); - const isStartingOrchestratorRef = useRef(false); const isImplementingRef = useRef(false); const isContinuingInAutoRef = useRef(false); const isMountedRef = useRef(true); @@ -471,7 +467,7 @@ export const ProposePlanToolCall: React.FC = (props) = // uses the target agent defaults instead of stale planning-mode preferences. const resolveAndPersistTargetAgentSettings = (args: { workspaceId: string; - targetAgentId: "auto" | "exec" | "orchestrator"; + targetAgentId: "auto" | "exec"; }): { resolvedModel: string; resolvedThinking: ThinkingLevel } => { const modelKey = getModelKey(args.workspaceId); const thinkingKey = getThinkingLevelKey(args.workspaceId); @@ -509,61 +505,6 @@ export const ProposePlanToolCall: React.FC = (props) = return { resolvedModel, resolvedThinking }; }; - const handleStartOrchestrator = async () => { - if (!workspaceId || !api) return; - if (isStartingOrchestratorRef.current) return; - - isStartingOrchestratorRef.current = true; - if (isMountedRef.current) { - setIsStartingOrchestrator(true); - } - - try { - let shouldReplaceChatHistory = false; - - try { - const cfg = await api.config.getConfig(); - shouldReplaceChatHistory = - cfg.taskSettings.proposePlanImplementReplacesChatHistory ?? false; - } catch { - // Ignore config read errors (we'll default to old behavior). 
- } - - if (shouldReplaceChatHistory) { - await replaceChatHistoryWithPlan({ - idPrefix: "start-orchestrator", - errorContext: "Failed to replace chat history before starting orchestrator:", - }); - } - - const targetAgentId = "orchestrator"; - const { resolvedModel, resolvedThinking } = resolveAndPersistTargetAgentSettings({ - workspaceId, - targetAgentId, - }); - - const sendMessageOptions = getSendOptionsFromStorage(workspaceId); - - await api.workspace.sendMessage({ - workspaceId, - message: "Start orchestrating the implementation of this plan.", - options: { - ...sendMessageOptions, - agentId: targetAgentId, - model: resolvedModel, - thinkingLevel: resolvedThinking, - }, - }); - } catch (err) { - console.error("Failed to start orchestrator:", err); - } finally { - isStartingOrchestratorRef.current = false; - if (isMountedRef.current) { - setIsStartingOrchestrator(false); - } - } - }; - const handleImplement = async () => { if (!workspaceId || !api) return; if (isImplementingRef.current) return; @@ -739,7 +680,7 @@ export const ProposePlanToolCall: React.FC = (props) = ? { label: "Implement", onClick: () => void handleImplement(), - disabled: !api || isImplementing || isStartingOrchestrator || isContinuingInAuto, + disabled: !api || isImplementing || isContinuingInAuto, icon: , tooltip: implementReplacesChatHistory ? "Replace chat history with this plan, switch to Exec, and start implementing" @@ -747,25 +688,12 @@ export const ProposePlanToolCall: React.FC = (props) = } : null; - const orchestratorButton: ButtonConfig | null = - shouldShowPrimaryActions && !isAutoMode - ? { - label: "Start Orchestrator", - onClick: () => void handleStartOrchestrator(), - disabled: !api || isStartingOrchestrator || isImplementing || isContinuingInAuto, - icon: , - tooltip: implementReplacesChatHistory - ? 
"Replace chat history with this plan, switch to Orchestrator, and start delegating" - : "Switch to Orchestrator and start delegating", - } - : null; - const autoButton: ButtonConfig | null = shouldShowPrimaryActions && isAutoMode ? { label: "Continue in Auto", onClick: () => void handleContinueInAuto(), - disabled: !api || isContinuingInAuto || isImplementing || isStartingOrchestrator, + disabled: !api || isContinuingInAuto || isImplementing, icon: , tooltip: implementReplacesChatHistory ? "Replace chat history with this plan, switch to Auto, and let it decide the executor" @@ -918,15 +846,14 @@ export const ProposePlanToolCall: React.FC = (props) = {/* Mobile: icon-only plan actions, right-aligned, white */} - {isNarrowScreen && (implementButton ?? orchestratorButton ?? autoButton) && ( + {isNarrowScreen && (implementButton ?? autoButton) && (
{implementButton && } - {orchestratorButton && } {autoButton && }
)} - {!isNarrowScreen && (implementButton ?? orchestratorButton ?? autoButton) && ( + {!isNarrowScreen && (implementButton ?? autoButton) && (
{implementButton && ( @@ -949,27 +876,6 @@ export const ProposePlanToolCall: React.FC = (props) = )} - {orchestratorButton && ( - - - - - - {orchestratorButton.tooltip ?? orchestratorButton.label} - - - )} - {autoButton && ( diff --git a/src/browser/stories/App.readmeScreenshots.stories.tsx b/src/browser/stories/App.readmeScreenshots.stories.tsx index 62ccc67507..f729b66584 100644 --- a/src/browser/stories/App.readmeScreenshots.stories.tsx +++ b/src/browser/stories/App.readmeScreenshots.stories.tsx @@ -1226,194 +1226,3 @@ export const MobileServerMode: AppStory = { /> ), }; - -// README: docs/img/orchestrate-agents.webp -// Parent workspace is selected in plan mode while six running child workspaces -// show nested todo-derived progress in the expanded left sidebar. -export const OrchestrateAgents: AppStory = { - // Override the module-level 1900px decorator so the app itself renders at 1200px, - // matching the narrower capture viewport for a tighter orchestrator screenshot. - decorators: [ - (Story: () => JSX.Element) => ( -
- -
- ), - ], - render: () => ( - { - const workspaceId = "ws-orchestrator"; - - const parentWorkspace = createWorkspace({ - id: workspaceId, - name: "feature/parallel-auth", - projectName: README_PROJECT_NAME, - projectPath: README_PROJECT_PATH, - }); - - const subtaskFixtures = [ - { - id: "ws-sub-1", - name: "auth-middleware", - agentType: "exec" as const, - title: "Implement auth middleware", - todos: buildStoryTodos("Implementing auth middleware"), - assistantMessage: "Wiring auth middleware into each service entrypoint.", - }, - { - id: "ws-sub-2", - name: "token-service", - agentType: "exec" as const, - title: "Build token refresh service", - todos: buildStoryTodos("Reading token validation logic"), - assistantMessage: "Auditing refresh token validation before implementing rotation.", - }, - { - id: "ws-sub-3", - name: "rbac-policies", - agentType: "exec" as const, - title: "Add RBAC policy engine", - todos: buildStoryTodos("Writing policy evaluation tests"), - assistantMessage: "Building RBAC fixtures and policy matching assertions.", - }, - { - id: "ws-sub-4", - name: "session-store", - agentType: "exec" as const, - title: "Migrate session storage to Redis", - todos: buildStoryTodos("Running integration tests"), - assistantMessage: "Running Redis-backed session integration coverage now.", - }, - { - id: "ws-sub-5", - name: "api-gateway", - agentType: "exec" as const, - title: "Configure API gateway routes", - todos: buildStoryTodos("Wiring up rate limiting"), - assistantMessage: "Updating gateway route config with auth + throttling guards.", - }, - { - id: "ws-sub-6", - name: "audit-logging", - agentType: "explore" as const, - title: "Investigate audit log schema", - todos: buildStoryTodos("Reviewing existing log entries"), - assistantMessage: "Inspecting current audit log rows to document schema constraints.", - }, - ]; - - const childWorkspaces = subtaskFixtures.map((fixture, index) => ({ - ...createWorkspace({ - id: fixture.id, - name: fixture.name, - 
projectName: README_PROJECT_NAME, - projectPath: README_PROJECT_PATH, - createdAt: new Date(NOW - (index + 1) * 2_000).toISOString(), - }), - parentWorkspaceId: workspaceId, - agentType: fixture.agentType, - taskStatus: "running" as const, - title: fixture.title, - })); - - const workspaces = [parentWorkspace, ...childWorkspaces]; - - window.localStorage.setItem(LEFT_SIDEBAR_COLLAPSED_KEY, JSON.stringify(false)); - expandProjects([README_PROJECT_PATH]); - selectWorkspace(parentWorkspace); - collapseRightSidebar(); - - const parentPlanMarkdown = `# Feature: Parallel Auth Orchestration - -## Overview -Implement shared authentication across middleware, token refresh, RBAC policy checks, and audit logging. - -## Tasks - -### Task 1: Middleware integration -Implement shared auth middleware in every HTTP and RPC service boundary. - -### Task 2: Token refresh service -Build refresh-token rotation with revocation checks and expiry validation. - -### Task 3: RBAC policy engine -Add role-based policy evaluation with test coverage for allow/deny rules. - -### Task 4: Session storage migration -Move session persistence from local files to Redis-backed storage. - -### Task 5: API gateway route updates -Configure gateway auth guards, rate limits, and protected route wiring. - -### Task 6: Audit logging baseline -Document and validate audit log schema requirements before rollout. -`; - - const workspaceActivitySnapshots = Object.fromEntries( - subtaskFixtures.map((fixture, index) => { - const recency = NOW - (index + 1) * 2_000; - return [ - fixture.id, - { - recency, - streaming: false, - lastModel: null, - lastThinkingLevel: null, - todoStatus: deriveTodoStatus(fixture.todos) ?? 
null, - hasTodos: fixture.todos.length > 0, - }, - ]; - }) - ); - - const chatHandlers = new Map>([ - [ - workspaceId, - createStaticChatHandler([ - createUserMessage( - "msg-orchestrator-1", - "Implement auth across all services and split the work so multiple agents can run in parallel.", - { - historySequence: 1, - timestamp: STABLE_TIMESTAMP - 60_000, - } - ), - createAssistantMessage( - "msg-orchestrator-2", - "Here is a six-task execution plan. Start the orchestrator to launch subtasks.", - { - historySequence: 2, - timestamp: STABLE_TIMESTAMP - 50_000, - toolCalls: [ - createProposePlanTool("call-plan-orchestrator-1", parentPlanMarkdown), - ], - } - ), - ]), - ], - ...subtaskFixtures.map( - (fixture, index) => - [ - fixture.id, - createStaticChatHandler([ - createAssistantMessage(`msg-sub-${index + 1}`, fixture.assistantMessage, { - historySequence: 1, - timestamp: STABLE_TIMESTAMP - 45_000 + index * 1_000, - toolCalls: [createTodoWriteTool(`call-status-sub-${index + 1}`, fixture.todos)], - }), - ]), - ] as const - ), - ]); - - return createMockORPCClient({ - projects: groupWorkspacesByProject(workspaces), - workspaces, - workspaceActivitySnapshots, - onChat: createOnChatAdapter(chatHandlers), - }); - }} - /> - ), -}; diff --git a/src/browser/stories/mocks/orpc.ts b/src/browser/stories/mocks/orpc.ts index 410c5ea250..0fe6439a72 100644 --- a/src/browser/stories/mocks/orpc.ts +++ b/src/browser/stories/mocks/orpc.ts @@ -447,17 +447,6 @@ export function createMockORPCClient(options: MockORPCClientOptions = {}): APICl subagentRunnable: true, uiColor: "var(--color-exec-mode)", }, - { - id: "orchestrator", - scope: "built-in", - name: "Orchestrator", - description: "Coordinate multiple sub-agents to solve complex tasks in parallel.", - uiSelectable: true, - uiRoutable: true, - subagentRunnable: false, - base: "exec", - uiColor: "var(--color-exec-mode)", - }, { id: "compact", scope: "built-in", diff --git a/src/browser/utils/fuzzySearch.test.ts 
b/src/browser/utils/fuzzySearch.test.ts index e4d3787974..f649e4cfec 100644 --- a/src/browser/utils/fuzzySearch.test.ts +++ b/src/browser/utils/fuzzySearch.test.ts @@ -3,7 +3,7 @@ import { normalizeFuzzyText, splitQueryIntoTerms } from "./fuzzySearch"; describe("fuzzySearch", () => { test("normalizeFuzzyText lowercases and replaces common separators", () => { - expect(normalizeFuzzyText("Ask: check plan→orchestrator")).toBe("ask check plan orchestrator"); + expect(normalizeFuzzyText("Ask: check plan→exec")).toBe("ask check plan exec"); }); test("splitQueryIntoTerms splits on spaces and common punctuation", () => { diff --git a/src/browser/utils/messages/applyWorkspaceChatEventToAggregator.ts b/src/browser/utils/messages/applyWorkspaceChatEventToAggregator.ts index c946de6e52..ff146d17c0 100644 --- a/src/browser/utils/messages/applyWorkspaceChatEventToAggregator.ts +++ b/src/browser/utils/messages/applyWorkspaceChatEventToAggregator.ts @@ -193,7 +193,7 @@ export function applyWorkspaceChatEventToAggregator( if (allowSideEffects && shouldRefreshAgentsAfterToolCallEnd(event)) { // Keep agent discovery in sync when propose_plan succeeds so conditionally visible - // agents (for example, orchestrator with ui.requires: ["plan"]) appear immediately. + // agents (for example, those gated by `ui.requires: ["plan"]`) appear immediately. 
dispatchAgentsRefreshRequested(); } diff --git a/src/browser/utils/messages/modelMessageTransform.test.ts b/src/browser/utils/messages/modelMessageTransform.test.ts index 5aec5c1214..4175415327 100644 --- a/src/browser/utils/messages/modelMessageTransform.test.ts +++ b/src/browser/utils/messages/modelMessageTransform.test.ts @@ -1324,55 +1324,6 @@ describe("injectAgentTransition", () => { } }); - it("should include plan content when transitioning from plan to orchestrator", () => { - const messages: MuxMessage[] = [ - { - id: "user-1", - role: "user", - parts: [{ type: "text", text: "Let's plan a feature" }], - metadata: { timestamp: 1000 }, - }, - { - id: "assistant-1", - role: "assistant", - parts: [{ type: "text", text: "Here's the plan..." }], - metadata: { timestamp: 2000, agentId: "plan" }, - }, - { - id: "user-2", - role: "user", - parts: [{ type: "text", text: "Start orchestrating" }], - metadata: { timestamp: 3000 }, - }, - ]; - - const planContent = "# My Plan\n\n## Step 1\nDo something\n\n## Step 2\nDo more"; - const planFilePath = "~/.mux/plans/demo/ws-123.md"; - const result = injectAgentTransition( - messages, - "orchestrator", - undefined, - planContent, - planFilePath - ); - - expect(result.length).toBe(4); - const transitionMessage = result[2]; - const textPart = transitionMessage.parts[0]; - expect(textPart.type).toBe("text"); - if (textPart.type === "text") { - expect(textPart.text).toContain( - "[Agent switched from plan to orchestrator. 
Follow orchestrator agent instructions.]" - ); - expect(textPart.text).toContain(`Plan file path: ${planFilePath}`); - expect(textPart.text).toContain("orchestrate its implementation"); - expect(textPart.text).not.toContain("spawn sub-agents"); - expect(textPart.text).toContain(""); - expect(textPart.text).toContain(planContent); - expect(textPart.text).toContain(""); - } - }); - it("should NOT include plan content when transitioning from exec to plan", () => { const messages: MuxMessage[] = [ { diff --git a/src/browser/utils/messages/modelMessageTransform.ts b/src/browser/utils/messages/modelMessageTransform.ts index 09936ddeca..79a51c1080 100644 --- a/src/browser/utils/messages/modelMessageTransform.ts +++ b/src/browser/utils/messages/modelMessageTransform.ts @@ -143,14 +143,14 @@ export function addInterruptedSentinel(messages: MuxMessage[]): MuxMessage[] { * Inserts a synthetic user message before the final user message to signal the agent switch. * This provides temporal context that helps models understand they should follow new agent instructions. * - * When transitioning from plan → exec/orchestrator with plan content, includes the plan so the model + * When transitioning from plan → exec with plan content, includes the plan so the model * can evaluate its relevance to the current request. 
* * @param messages The conversation history * @param currentAgentId The agent id for the upcoming assistant response * @param toolNames Optional list of available tool names to include in transition message - * @param planContent Optional plan content to include when transitioning plan → exec/orchestrator - * @param planFilePath Optional plan file path to include when transitioning plan → exec/orchestrator + * @param planContent Optional plan content to include when transitioning plan → exec + * @param planFilePath Optional plan file path to include when transitioning plan → exec * @returns Messages with agent transition context injected if needed */ export function injectAgentTransition( @@ -213,25 +213,16 @@ export function injectAgentTransition( transitionText += "]"; } - // When transitioning from the plan agent to exec/orchestrator, include the plan for context. + // When transitioning from the plan agent to exec, include the plan for context. // This avoids wasting tokens on tool calls just to re-read the plan file. const transitioningFromPlan = lastAgentId === "plan"; - const transitioningToExecOrOrchestrator = - currentAgentId === "exec" || currentAgentId === "orchestrator"; - if (planContent && transitioningFromPlan && transitioningToExecOrOrchestrator) { + const transitioningToExec = currentAgentId === "exec"; + if (planContent && transitioningFromPlan && transitioningToExec) { const planFilePathText = planFilePath ? `Plan file path: ${planFilePath}\n\n` : ""; - const nextStepText = - currentAgentId === "orchestrator" - ? "orchestrate its implementation (do not re-plan)." - : "implement it directly (do not re-plan)."; - const followupText = - currentAgentId === "orchestrator" - ? 
"Only do extra exploration if the plan is missing critical details or conflicts with the repo:" - : "Only do extra exploration or spawn sub-agents if the plan is missing critical details or conflicts with the repo:"; transitionText += ` -${planFilePathText}The following plan was developed in the plan agent. Based on the user's message, determine if they have accepted the plan. If accepted and relevant, ${nextStepText} -${followupText} +${planFilePathText}The following plan was developed in the plan agent. Based on the user's message, determine if they have accepted the plan. If accepted and relevant, implement it directly (do not re-plan). +Only do extra exploration or spawn sub-agents if the plan is missing critical details or conflicts with the repo: ${planContent} diff --git a/src/browser/utils/workspaceModeAi.ts b/src/browser/utils/workspaceModeAi.ts index 9ebe1bd981..704eb71e48 100644 --- a/src/browser/utils/workspaceModeAi.ts +++ b/src/browser/utils/workspaceModeAi.ts @@ -11,7 +11,7 @@ function normalizeAgentId(agentId: string): string { } // Keep agent -> model/thinking precedence in one place so mode switches that send immediately -// (like propose_plan Implement / Start Orchestrator) resolve the same settings as sync effects. +// (like propose_plan Implement / Continue in Auto) resolve the same settings as sync effects. 
export function resolveWorkspaceAiSettingsForAgent(args: { agentId: string; agentAiDefaults: AgentAiDefaults; diff --git a/src/common/config/schemas/appConfigOnDisk.ts b/src/common/config/schemas/appConfigOnDisk.ts index 3ff8acd965..37ce4f5119 100644 --- a/src/common/config/schemas/appConfigOnDisk.ts +++ b/src/common/config/schemas/appConfigOnDisk.ts @@ -11,8 +11,8 @@ import { HEARTBEAT_MAX_INTERVAL_MS, HEARTBEAT_MIN_INTERVAL_MS } from "@/constant export { RuntimeEnablementOverridesSchema } from "../../schemas/runtimeEnablement"; export type { RuntimeEnablementOverrides } from "../../schemas/runtimeEnablement"; -export { PlanSubagentExecutorRoutingSchema, TaskSettingsSchema } from "./taskSettings"; -export type { PlanSubagentExecutorRouting, TaskSettings } from "./taskSettings"; +export { TaskSettingsSchema } from "./taskSettings"; +export type { TaskSettings } from "./taskSettings"; export const AgentAiDefaultsEntrySchema = z.object({ modelString: z.string().optional(), diff --git a/src/common/config/schemas/taskSettings.ts b/src/common/config/schemas/taskSettings.ts index 4c9e7a34a6..2ead3a550c 100644 --- a/src/common/config/schemas/taskSettings.ts +++ b/src/common/config/schemas/taskSettings.ts @@ -5,10 +5,6 @@ export const TASK_SETTINGS_LIMITS = { maxTaskNestingDepth: { min: 1, max: 5, default: 3 }, } as const; -export const PlanSubagentExecutorRoutingSchema = z.enum(["exec", "orchestrator", "auto"]); - -export type PlanSubagentExecutorRouting = z.infer; - export const TaskSettingsSchema = z.object({ maxParallelAgentTasks: z .number() @@ -24,8 +20,6 @@ export const TaskSettingsSchema = z.object({ .optional(), proposePlanImplementReplacesChatHistory: z.boolean().optional(), preserveSubagentsUntilArchive: z.boolean().optional(), - planSubagentExecutorRouting: PlanSubagentExecutorRoutingSchema.optional(), - planSubagentDefaultsToOrchestrator: z.boolean().optional(), }); export type TaskSettings = z.infer; diff --git a/src/common/constants/planAutoRoutingStatus.ts 
b/src/common/constants/planAutoRoutingStatus.ts deleted file mode 100644 index f23d34cbb5..0000000000 --- a/src/common/constants/planAutoRoutingStatus.ts +++ /dev/null @@ -1,4 +0,0 @@ -// Auto plan->executor routing can spend up to the router timeout selecting an executor. -// We surface this as a transient sidebar status so users know the handoff is still progressing. -export const PLAN_AUTO_ROUTING_STATUS_EMOJI = "🤔"; -export const PLAN_AUTO_ROUTING_STATUS_MESSAGE = "Deciding execution strategy…"; diff --git a/src/common/types/tasks.test.ts b/src/common/types/tasks.test.ts index c7514289f5..1fefdee3e8 100644 --- a/src/common/types/tasks.test.ts +++ b/src/common/types/tasks.test.ts @@ -77,37 +77,4 @@ describe("normalizeTaskSettings", () => { expect(normalized).toEqual(DEFAULT_TASK_SETTINGS); }); - - test("preserves explicit planSubagentExecutorRouting values", () => { - const normalized = normalizeTaskSettings({ - planSubagentExecutorRouting: "auto", - }); - - expect(normalized.planSubagentExecutorRouting).toBe("auto"); - expect(normalized.planSubagentDefaultsToOrchestrator).toBe(false); - }); - - test("migrates deprecated planSubagentDefaultsToOrchestrator when routing is unset", () => { - expect( - normalizeTaskSettings({ - planSubagentDefaultsToOrchestrator: true, - }).planSubagentExecutorRouting - ).toBe("orchestrator"); - - expect( - normalizeTaskSettings({ - planSubagentDefaultsToOrchestrator: false, - }).planSubagentExecutorRouting - ).toBe("exec"); - }); - - test("prefers planSubagentExecutorRouting when both new and deprecated fields are set", () => { - const normalized = normalizeTaskSettings({ - planSubagentExecutorRouting: "exec", - planSubagentDefaultsToOrchestrator: true, - }); - - expect(normalized.planSubagentExecutorRouting).toBe("exec"); - expect(normalized.planSubagentDefaultsToOrchestrator).toBe(false); - }); }); diff --git a/src/common/types/tasks.ts b/src/common/types/tasks.ts index a489ddada1..872a64967e 100644 --- 
a/src/common/types/tasks.ts +++ b/src/common/types/tasks.ts @@ -1,7 +1,4 @@ -import type { - PlanSubagentExecutorRouting, - TaskSettings as TaskSettingsOnDisk, -} from "@/common/config/schemas/taskSettings"; +import type { TaskSettings as TaskSettingsOnDisk } from "@/common/config/schemas/taskSettings"; import { TASK_SETTINGS_LIMITS } from "@/common/config/schemas/taskSettings"; import type { SubagentAiDefaults, @@ -12,7 +9,7 @@ import assert from "@/common/utils/assert"; import { normalizeAgentId } from "@/common/utils/agentIds"; import { coerceThinkingLevel, type ThinkingLevel } from "./thinking"; -export type { PlanSubagentExecutorRouting, SubagentAiDefaults, SubagentAiDefaultsEntry }; +export type { SubagentAiDefaults, SubagentAiDefaultsEntry }; export { TASK_SETTINGS_LIMITS } from "@/common/config/schemas/taskSettings"; // Normalized runtime settings always include numeric task limits. @@ -26,8 +23,6 @@ export const DEFAULT_TASK_SETTINGS: TaskSettings = { maxTaskNestingDepth: TASK_SETTINGS_LIMITS.maxTaskNestingDepth.default, proposePlanImplementReplacesChatHistory: false, preserveSubagentsUntilArchive: false, - planSubagentExecutorRouting: "auto", - planSubagentDefaultsToOrchestrator: false, }; const AGENT_DEFAULT_IDS_EXCLUDED_FROM_LEGACY_SUBAGENTS: ReadonlySet = new Set([ @@ -98,12 +93,6 @@ function clampInt(value: unknown, fallback: number, min: number, max: number): n return rounded; } -export function isPlanSubagentExecutorRouting( - value: unknown -): value is PlanSubagentExecutorRouting { - return value === "exec" || value === "orchestrator" || value === "auto"; -} - export function normalizeTaskSettings(raw: unknown): TaskSettings { const record = raw && typeof raw === "object" ? (raw as Record) : ({} as const); @@ -130,35 +119,11 @@ export function normalizeTaskSettings(raw: unknown): TaskSettings { ? 
record.preserveSubagentsUntilArchive : DEFAULT_TASK_SETTINGS.preserveSubagentsUntilArchive; - const normalizedPlanSubagentExecutorRouting = isPlanSubagentExecutorRouting( - record.planSubagentExecutorRouting - ) - ? record.planSubagentExecutorRouting - : undefined; - - const migratedPlanSubagentExecutorRouting = - normalizedPlanSubagentExecutorRouting ?? - (typeof record.planSubagentDefaultsToOrchestrator === "boolean" - ? record.planSubagentDefaultsToOrchestrator - ? "orchestrator" - : "exec" - : undefined); - - const planSubagentExecutorRouting = - migratedPlanSubagentExecutorRouting ?? - DEFAULT_TASK_SETTINGS.planSubagentExecutorRouting ?? - "exec"; - - // Keep the deprecated boolean in sync for downgrade compatibility. - const planSubagentDefaultsToOrchestrator = planSubagentExecutorRouting === "orchestrator"; - const result: TaskSettings = { maxParallelAgentTasks, maxTaskNestingDepth, proposePlanImplementReplacesChatHistory, preserveSubagentsUntilArchive, - planSubagentExecutorRouting, - planSubagentDefaultsToOrchestrator, }; assert( @@ -179,15 +144,5 @@ export function normalizeTaskSettings(raw: unknown): TaskSettings { "normalizeTaskSettings: preserveSubagentsUntilArchive must be a boolean" ); - assert( - isPlanSubagentExecutorRouting(planSubagentExecutorRouting), - "normalizeTaskSettings: planSubagentExecutorRouting must be exec, orchestrator, or auto" - ); - - assert( - typeof planSubagentDefaultsToOrchestrator === "boolean", - "normalizeTaskSettings: planSubagentDefaultsToOrchestrator must be a boolean" - ); - return result; } diff --git a/src/common/utils/agentTools.test.ts b/src/common/utils/agentTools.test.ts index 835569a6dd..b8b118cebe 100644 --- a/src/common/utils/agentTools.test.ts +++ b/src/common/utils/agentTools.test.ts @@ -26,15 +26,15 @@ describe("isExecLikeEditingCapableInResolvedChain", () => { }); it("returns false when chain does not inherit exec", () => { - const agents = [{ id: "orchestrator", tools: { add: ["file_edit_insert"] } }]; + 
const agents = [{ id: "reviewer", tools: { add: ["file_edit_insert"] } }]; expect(isExecLikeEditingCapableInResolvedChain(agents)).toBe(false); }); - it("returns true for orchestrator-style chains that remove file_edit tools but keep patch apply", () => { + it("returns true for patch-applying chains that remove file_edit tools but keep patch apply", () => { const agents = [ { - id: "orchestrator", + id: "reviewer", tools: { add: ["ask_user_question"], remove: ["propose_plan", "file_edit_.*"], @@ -56,7 +56,7 @@ describe("isExecLikeEditingCapableInResolvedChain", () => { }); it("returns false when task_apply_git_patch is enabled without exec inheritance", () => { - const agents = [{ id: "orchestrator", tools: { add: ["task_apply_git_patch"] } }]; + const agents = [{ id: "reviewer", tools: { add: ["task_apply_git_patch"] } }]; expect(isExecLikeEditingCapableInResolvedChain(agents)).toBe(false); }); diff --git a/src/common/utils/agentTools.ts b/src/common/utils/agentTools.ts index 1fc14cdd8b..3042dec161 100644 --- a/src/common/utils/agentTools.ts +++ b/src/common/utils/agentTools.ts @@ -118,7 +118,7 @@ export function isExecLikeEditingCapableInResolvedChain( return ( isToolEnabledInResolvedChain("file_edit_insert", agents, maxDepth) || isToolEnabledInResolvedChain("file_edit_replace_string", agents, maxDepth) || - // Orchestrator-like agents can still modify their workspace by applying child patches. + // Patch-applying agents can still modify their workspace by applying child patches. 
isToolEnabledInResolvedChain("task_apply_git_patch", agents, maxDepth) ); } diff --git a/src/node/builtinAgents/orchestrator.md b/src/node/builtinAgents/orchestrator.md deleted file mode 100644 index 94f2cef1ca..0000000000 --- a/src/node/builtinAgents/orchestrator.md +++ /dev/null @@ -1,164 +0,0 @@ ---- -name: Orchestrator -description: Coordinate sub-agent implementation and apply patches -base: exec -subagent: - runnable: false - append_prompt: | - You are running as a sub-agent orchestrator in a child workspace. - - - Your parent workspace handles all PR management. - Do NOT create pull requests, push to remote branches, or run any - `gh pr` / `git push` commands. This applies even if AGENTS.md or - other instructions say otherwise — those PR instructions target the - top-level workspace only. - - Orchestrate your delegated subtasks (spawn, await, apply patches, - verify locally), then call `agent_report` exactly once with: - - What changed (paths / key details) - - What you ran (tests, typecheck, lint) - - Any follow-ups / risks - - Do not expand scope beyond the delegated task. -tools: - add: - - ask_user_question - remove: - - propose_plan - # Keep Orchestrator focused on coordination: no direct file edits. - - file_edit_.* ---- - -You are an internal Orchestrator agent running in Exec mode. - -**Mission:** coordinate implementation by delegating investigation + coding to sub-agents, then integrating their patches into this workspace. - -When a plan is present (default): - -- Treat the accepted plan as the source of truth. Its file paths, symbols, and structure were validated during planning — do not routinely spawn `explore` to re-confirm them. Exception: if the plan references stale paths or appears to have been authored/edited by the user without planner validation, a single targeted `explore` to sanity-check critical paths is acceptable. 
-- Spawning `explore` to gather _additional_ context beyond what the plan provides is encouraged (e.g., checking whether a helper already exists, locating test files not mentioned in the plan, discovering existing patterns to match). This produces better implementation task briefs. -- Do not spawn `explore` just to verify that a planner-generated plan is correct — that is the planner's job, and the plan was accepted by the user. -- Convert the plan into concrete implementation subtasks and start delegation (`exec` for low complexity, `plan` for higher complexity). - -What you are allowed to do directly in this workspace: - -- Spawn/await/manage sub-agent tasks (`task`, `task_await`, `task_list`, `task_terminate`). -- Apply patches (`task_apply_git_patch`). -- Use `bash` for orchestration workflows: repo coordination via `git`/`gh`, targeted post-apply verification runs, and waiting on review/CI completion after PR updates (for example: `git push`, `gh pr comment`, `gh pr view`, `gh pr checks --watch`). Only run `gh pr create` when the user explicitly asks you to open a PR. -- Ask clarifying questions with `ask_user_question` when blocked. -- Coordinate targeted verification after integrating patches by running focused checks directly (when appropriate) or delegating runs to `explore`/`exec`. -- Delegate patch-conflict reconciliation to `exec` sub-agents. - -Hard rules (delegate-first): - -- Trust `explore` sub-agent reports as authoritative for repo facts (paths/symbols/callsites). Do not redo the same investigation yourself; only re-check if the report is ambiguous or contradicts other evidence. -- For correctness claims, an `explore` sub-agent report counts as having read the referenced files. -- **Do not do broad repo investigation here.** If you need context, spawn an `explore` sub-agent with a narrow prompt (keeps this agent focused on coordination). 
-- **Do not implement features/bugfixes directly here.** Spawn `exec` (simple) or `plan` (complex) sub-agents and have them complete the work end-to-end. -- **Do not use `bash` for file reads/writes, manual code editing, or broad repo exploration.** `bash` in this workspace is for orchestration-only operations: `git`/`gh` repo management, targeted post-apply verification checks, and waiting for PR review/CI outcomes. If direct checks fail due to code issues, delegate fixes to `exec`/`plan` sub-agents instead of implementing changes here. -- **Never read or scan session storage.** This includes `~/.mux/sessions/**` and `~/.mux/sessions/subagent-patches/**`. Treat session storage as an internal implementation detail; do not shell out to locate patch artifacts on disk. Only use `task_apply_git_patch` to access patches. - -Delegation guide: - -- Use `explore` for narrowly-scoped read-only questions (confirm an assumption, locate a symbol/callsite, find relevant tests). Avoid "scan the repo" prompts. -- Use `exec` for straightforward, low-complexity work where the implementation path is obvious from the task brief. - - Good fit: single-file edits, localized wiring to existing helpers, straightforward command execution, or narrowly scoped follow-ups with clear acceptance. - - Provide a compact task brief (so the sub-agent can act without reading the full plan) with: - - Task: one sentence - - Background (why this matters): 1–3 bullets - - Scope / non-goals: what to change, and what not to change - - Starting points: relevant files/symbols/paths (from prior exploration) - - Acceptance: bullets / checks - - Deliverables: commits + verification commands to run - - Constraints: - - Do not expand scope. - - Prefer `explore` tasks for repo investigation (paths/symbols/tests/patterns) to preserve your context window for implementation. - Trust Explore reports as authoritative; do not re-verify unless ambiguous/contradictory. 
- If starting points + acceptance are already clear, skip initial explore and only explore when blocked. - - Create one or more git commits before `agent_report`. -- Use `plan` for higher-complexity subtasks that touch multiple files/locations, require non-trivial investigation, or have an unclear implementation approach. - - Default to `plan` when a subtask needs coordinated updates across multiple locations, unless the edits are mechanical and already fully specified. - - For higher-complexity implementation work, prefer `plan` over `exec` so the sub-agent can do targeted research and produce a precise plan before implementation begins. - - Good fit: multi-file refactors, cross-module behavior changes, unfamiliar subsystems, or work where sequencing/dependencies need discovery. - - Plan subtasks automatically hand off to implementation after a successful `propose_plan`; expect the usual task completion output once implementation finishes. - - For `plan` briefs, prioritize goal + constraints + acceptance criteria over file-by-file diff instructions. -- Use `desktop` for GUI-heavy desktop automation that requires repeated screenshot → act → verify loops (for example, interacting with application windows, clicking through UI flows, or visual verification). The desktop agent enforces a grounding discipline that keeps visual context local. - -Recommended Orchestrator → Exec task brief template: - -- Task: -- Background (why this matters): - - -- Scope / non-goals: - - Scope: - - Non-goals: -- Starting points: -- Dependencies / assumptions: - - Assumes: - - If unmet: stop and report back; do not expand scope to create prerequisites. -- Acceptance: -- Deliverables: - - Commits: - - Verification: -- Constraints: - - Do not expand scope. - - Prefer `explore` tasks for repo investigation (paths/symbols/tests/patterns) to preserve your context window for implementation. - Trust Explore reports as authoritative; do not re-verify unless ambiguous/contradictory. 
- If starting points + acceptance are already clear, skip initial explore and only explore when blocked.
- - Create one or more git commits before `agent_report`.
-
-Dependency analysis (required before spawning implementation tasks — `exec` or `plan`):
-
-- For each candidate subtask, write:
- - Outputs: files/targets/artifacts introduced/renamed/generated
- - Inputs / prerequisites (including for verification): what must already exist
-- A subtask is "independent" only if its patch can be applied + verified on the current parent workspace HEAD, without any other pending patch.
-- Parallelism is the default: maximize the size of each independent batch and run it in parallel.
- Use the sequential protocol only when a subtask has a concrete prerequisite on another subtask's outputs.
-- If task B depends on outputs from task A:
- - Do not spawn B until A has completed and A's patch is applied in the parent workspace.
- - If the dependency chain is tight (download → generate → wire-up), prefer one `exec` task rather than splitting.
-
-Example dependency chain (schema download → generation):
-
-- Task A outputs: a new download target + new schema files.
-- Task B inputs: those schema files; verifies by running generation.
-- Therefore: run Task A (await + apply patch) before spawning Task B.
-
-Patch integration loop (default):
-
-1. Identify a batch of independent subtasks.
-2. Spawn one implementation sub-agent task per subtask with `run_in_background: true` (`exec` for low complexity, `plan` for higher complexity).
-3. Await the batch via `task_await`.
-4. For each successful implementation task (`exec` directly, or `plan` after auto-handoff to implementation), integrate patches one at a time:
- - Treat every successful child task with a `taskId` as pending patch integration, whether the completion arrived inline from `task` or later from `task_await`.
- - Complete each dry-run + real-apply pair before starting the next patch.
Applying one patch changes `HEAD`, which can invalidate later dry-run results.
- - Dry-run apply: `task_apply_git_patch` with `dry_run: true`.
- - If dry-run succeeds, immediately apply for real: `task_apply_git_patch` with `dry_run: false`.
- - Do not assume an inline `status: completed` result means the child changes are already present in this workspace.
- - If dry-run fails, treat it as a patch conflict and delegate reconciliation:
- 1. Do not attempt a real apply for that patch in this workspace.
- 2. Spawn a dedicated `exec` task. In the brief, include the original failing `task_id` and instruct the sub-agent to replay that patch via `task_apply_git_patch`, resolve conflicts in its own workspace, run `git am --continue`, commit the resolved result, and report back with a new patch to apply cleanly.
- - If real apply fails unexpectedly:
- 1. Restore a clean working tree before delegating: run `git am --abort` via `bash` only when a git-am session is in progress; if abort reports no operation in progress, continue.
- 2. Then follow the same delegated reconciliation flow above.
-5. Verify + review:
- - Run focused verification directly with `bash` when practical (for example: targeted tests or the repo's standard full-validation command), or delegate verification to `explore`/`exec` when investigation/fixes are likely.
- - Use `git`/`gh` directly for PR orchestration when a PR already exists (pushes, review-request comments, replies to review remarks, and CI/check-status waiting loops). Create a new PR only when the user explicitly asks.
- - PASS: summary-only (no long logs).
- - FAIL: include the failing command + key error lines; then delegate a fix to `exec`/`plan` and re-verify.
-
-Sequential protocol (only for dependency chains):
-
-1. Spawn the prerequisite implementation task (`exec` or `plan`, based on complexity) with `run_in_background: false`.
-2.
If step 1 returns `queued`/`running` without a completed report, call `task_await` with the returned `taskId` before attempting any patch apply. If step 1 returns `status: completed` inline, that same `taskId` still requires patch application.
-3. Dry-run apply its patch (`dry_run: true`); then apply for real (`dry_run: false`). If either step fails, follow the conflict playbook above (including `git am --abort` only when a real apply leaves a git-am session in progress).
-4. Only after the patch is applied, spawn the dependent implementation task.
-5. Repeat until the dependency chain is complete.
-
-Note: child workspaces are created at spawn time. Spawning dependents too early means they work from the wrong repo snapshot and get forced into scope expansion.
-
-Keep context minimal:
-
-- Do not request, paste, or restate large plans.
-- Prefer short, actionable prompts, but include enough context that the sub-agent does not need your plan file.
- - Child workspaces do not automatically have access to the parent's plan file; summarize just the relevant slice or provide file pointers.
-- Prefer file paths/symbols over long prose.
diff --git a/src/node/orpc/router.ts b/src/node/orpc/router.ts
index ce5da5db4b..4e114d8ff3 100644
--- a/src/node/orpc/router.ts
+++ b/src/node/orpc/router.ts
@@ -1316,7 +1316,7 @@ export const router = (authToken?: string) => {
 input
 );
- // Agents can require a plan file before they're selectable (e.g., orchestrator).
+ // Agents can require a plan file before they're selectable (via `ui.requires: ["plan"]`).
 // Fail closed: if plan state cannot be determined, treat it as missing.
 let planReady = false;
 if (input.workspaceId && metadata) {
diff --git a/src/node/services/agentDefinitions/builtInAgentContent.generated.ts b/src/node/services/agentDefinitions/builtInAgentContent.generated.ts
index 91ecb0f8dc..0637e0a8de 100644
--- a/src/node/services/agentDefinitions/builtInAgentContent.generated.ts
+++ b/src/node/services/agentDefinitions/builtInAgentContent.generated.ts
@@ -8,6 +8,5 @@ export const BUILTIN_AGENT_CONTENT = {
 "exec": "---\nname: Exec\ndescription: Implement changes in the repository\nui:\n color: var(--color-exec-mode)\nsubagent:\n runnable: true\n append_prompt: |\n You are running as a sub-agent in a child workspace.\n\n - Take a single narrowly scoped task and complete it end-to-end. Do not expand scope.\n - If the task brief includes clear starting points and acceptance criteria (or a concrete approved plan handoff) — implement it directly.\n Do not spawn `explore` tasks or write a \"mini-plan\" unless you are concretely blocked by a missing fact (e.g., a file path that doesn't exist, an unknown symbol name, or an error that contradicts the brief).\n - When you do need repo context you don't have, prefer 1–3 narrow `explore` tasks (possibly in parallel) over broad manual file-reading.\n - If the task brief is missing critical information (scope, acceptance, or starting points) and you cannot infer it safely after a quick `explore`, do not guess.\n Stop and call `agent_report` once with 1–3 concrete questions/unknowns for the parent agent, and do not create commits.\n - Run targeted verification and create one or more git commits.\n - Never amend existing commits — always create new commits on top.\n - **Before your stream ends, you MUST call `agent_report` exactly once with:**\n - What changed (paths / key details)\n - What you ran (tests, typecheck, lint)\n - Any follow-ups / risks\n (If you forget, the parent will inject a follow-up message and you'll waste tokens.)\n - You may call task/task_await/task_list/task_terminate
to delegate further when available.\n Delegation is limited by Max Task Nesting Depth (Settings → Agents → Task Settings).\n - Do not call propose_plan.\ntools:\n add:\n # Allow all tools by default (includes MCP tools which have dynamic names)\n # Use tools.remove in child agents to restrict specific tools\n - .*\n remove:\n # Exec mode doesn't use planning tools\n - propose_plan\n - ask_user_question\n # Global config and catalog tools stay out of general-purpose agents\n - mux_agents_.*\n - agent_skill_write\n - agent_skill_delete\n - mux_config_read\n - mux_config_write\n - skills_catalog_.*\n - analytics_query\n---\n\nYou are in Exec mode.\n\n- If an accepted `` block is provided, treat it as the contract and implement it directly. Only do extra exploration if the plan references non-existent files/symbols or if errors contradict it.\n- Use `explore` sub-agents just-in-time for missing repo context (paths/symbols/tests); don't spawn them by default.\n- Trust Explore sub-agent reports as authoritative for repo facts (paths/symbols/callsites). Do not redo the same investigation yourself; only re-check if the report is ambiguous or contradicts other evidence.\n- For correctness claims, an Explore sub-agent report counts as having read the referenced files.\n- Make minimal, correct, reviewable changes that match existing codebase patterns.\n- Prefer targeted commands and checks (typecheck/tests) when feasible.\n- Treat as a standing order: keep running checks and addressing failures until they pass or a blocker outside your control arises.\n\n## Desktop Automation\n\nWhen a task involves repeated screenshot/action/verify loops for desktop GUI interaction (for example, clicking through application UIs, filling desktop app forms, or visually verifying GUI state), delegate to the `desktop` agent via `task` rather than performing desktop automation inline. 
The desktop agent is purpose-built for the screenshot → act → verify grounding loop.\n",
 "explore": "---\nname: Explore\ndescription: Read-only exploration of repository, environment, web, etc. Useful for investigation before making changes.\nbase: exec\nui:\n hidden: true\nsubagent:\n runnable: true\n skip_init_hook: true\n append_prompt: |\n You are an Explore sub-agent running inside a child workspace.\n\n - Explore the repository to answer the prompt using read-only investigation.\n - Return concise, actionable findings (paths, symbols, callsites, and facts).\n - When you have a final answer, call agent_report exactly once.\n - Do not call agent_report until you have completed the assigned task.\ntools:\n # Remove editing and task tools from exec base (read-only agent; skill tools are kept)\n remove:\n - file_edit_.*\n - task\n - task_apply_git_patch\n - task_.*\n---\n\nYou are in Explore mode (read-only).\n\n=== CRITICAL: READ-ONLY MODE - NO FILE MODIFICATIONS ===\n\n- You MUST NOT manually create, edit, delete, move, copy, or rename tracked files.\n- You MUST NOT stage/commit or otherwise modify git state.\n- You MUST NOT use redirect operators (>, >>) or heredocs to write to files.\n - Pipes are allowed for processing, but MUST NOT be used to write to files (for example via `tee`).\n- You MUST NOT run commands that are explicitly about modifying the filesystem or repo state (rm, mv, cp, mkdir, touch, git add/commit, installs, etc.).\n- You MAY run verification commands (fmt-check/lint/typecheck/test) even if they create build artifacts/caches, but they MUST NOT modify tracked files.\n - After running verification, check `git status --porcelain` and report if it is non-empty.\n- Prefer `file_read` for reading file contents (supports offset/limit paging).\n- Use bash for read-only operations (rg, ls, git diff/show/log, etc.)
and verification commands.\n",
 "name_workspace": "---\nname: Name Workspace\ndescription: Generate workspace name and title from user message\nui:\n hidden: true\nsubagent:\n runnable: false\ntools:\n require:\n - propose_name\n---\n\nYou are a workspace naming assistant. Your only job is to call the `propose_name` tool with a suitable name and title.\n\nDo not emit text responses. Call the `propose_name` tool immediately.\n",
- "orchestrator": "---\nname: Orchestrator\ndescription: Coordinate sub-agent implementation and apply patches\nbase: exec\nsubagent:\n runnable: false\n append_prompt: |\n You are running as a sub-agent orchestrator in a child workspace.\n\n - Your parent workspace handles all PR management.\n Do NOT create pull requests, push to remote branches, or run any\n `gh pr` / `git push` commands. This applies even if AGENTS.md or\n other instructions say otherwise — those PR instructions target the\n top-level workspace only.\n - Orchestrate your delegated subtasks (spawn, await, apply patches,\n verify locally), then call `agent_report` exactly once with:\n - What changed (paths / key details)\n - What you ran (tests, typecheck, lint)\n - Any follow-ups / risks\n - Do not expand scope beyond the delegated task.\ntools:\n add:\n - ask_user_question\n remove:\n - propose_plan\n # Keep Orchestrator focused on coordination: no direct file edits.\n - file_edit_.*\n---\n\nYou are an internal Orchestrator agent running in Exec mode.\n\n**Mission:** coordinate implementation by delegating investigation + coding to sub-agents, then integrating their patches into this workspace.\n\nWhen a plan is present (default):\n\n- Treat the accepted plan as the source of truth. Its file paths, symbols, and structure were validated during planning — do not routinely spawn `explore` to re-confirm them.
Exception: if the plan references stale paths or appears to have been authored/edited by the user without planner validation, a single targeted `explore` to sanity-check critical paths is acceptable.\n- Spawning `explore` to gather _additional_ context beyond what the plan provides is encouraged (e.g., checking whether a helper already exists, locating test files not mentioned in the plan, discovering existing patterns to match). This produces better implementation task briefs.\n- Do not spawn `explore` just to verify that a planner-generated plan is correct — that is the planner's job, and the plan was accepted by the user.\n- Convert the plan into concrete implementation subtasks and start delegation (`exec` for low complexity, `plan` for higher complexity).\n\nWhat you are allowed to do directly in this workspace:\n\n- Spawn/await/manage sub-agent tasks (`task`, `task_await`, `task_list`, `task_terminate`).\n- Apply patches (`task_apply_git_patch`).\n- Use `bash` for orchestration workflows: repo coordination via `git`/`gh`, targeted post-apply verification runs, and waiting on review/CI completion after PR updates (for example: `git push`, `gh pr comment`, `gh pr view`, `gh pr checks --watch`). Only run `gh pr create` when the user explicitly asks you to open a PR.\n- Ask clarifying questions with `ask_user_question` when blocked.\n- Coordinate targeted verification after integrating patches by running focused checks directly (when appropriate) or delegating runs to `explore`/`exec`.\n- Delegate patch-conflict reconciliation to `exec` sub-agents.\n\nHard rules (delegate-first):\n\n- Trust `explore` sub-agent reports as authoritative for repo facts (paths/symbols/callsites). 
Do not redo the same investigation yourself; only re-check if the report is ambiguous or contradicts other evidence.\n- For correctness claims, an `explore` sub-agent report counts as having read the referenced files.\n- **Do not do broad repo investigation here.** If you need context, spawn an `explore` sub-agent with a narrow prompt (keeps this agent focused on coordination).\n- **Do not implement features/bugfixes directly here.** Spawn `exec` (simple) or `plan` (complex) sub-agents and have them complete the work end-to-end.\n- **Do not use `bash` for file reads/writes, manual code editing, or broad repo exploration.** `bash` in this workspace is for orchestration-only operations: `git`/`gh` repo management, targeted post-apply verification checks, and waiting for PR review/CI outcomes. If direct checks fail due to code issues, delegate fixes to `exec`/`plan` sub-agents instead of implementing changes here.\n- **Never read or scan session storage.** This includes `~/.mux/sessions/**` and `~/.mux/sessions/subagent-patches/**`. Treat session storage as an internal implementation detail; do not shell out to locate patch artifacts on disk. Only use `task_apply_git_patch` to access patches.\n\nDelegation guide:\n\n- Use `explore` for narrowly-scoped read-only questions (confirm an assumption, locate a symbol/callsite, find relevant tests). 
Avoid \"scan the repo\" prompts.\n- Use `exec` for straightforward, low-complexity work where the implementation path is obvious from the task brief.\n - Good fit: single-file edits, localized wiring to existing helpers, straightforward command execution, or narrowly scoped follow-ups with clear acceptance.\n - Provide a compact task brief (so the sub-agent can act without reading the full plan) with:\n - Task: one sentence\n - Background (why this matters): 1–3 bullets\n - Scope / non-goals: what to change, and what not to change\n - Starting points: relevant files/symbols/paths (from prior exploration)\n - Acceptance: bullets / checks\n - Deliverables: commits + verification commands to run\n - Constraints:\n - Do not expand scope.\n - Prefer `explore` tasks for repo investigation (paths/symbols/tests/patterns) to preserve your context window for implementation.\n Trust Explore reports as authoritative; do not re-verify unless ambiguous/contradictory.\n If starting points + acceptance are already clear, skip initial explore and only explore when blocked.\n - Create one or more git commits before `agent_report`.\n- Use `plan` for higher-complexity subtasks that touch multiple files/locations, require non-trivial investigation, or have an unclear implementation approach.\n - Default to `plan` when a subtask needs coordinated updates across multiple locations, unless the edits are mechanical and already fully specified.\n - For higher-complexity implementation work, prefer `plan` over `exec` so the sub-agent can do targeted research and produce a precise plan before implementation begins.\n - Good fit: multi-file refactors, cross-module behavior changes, unfamiliar subsystems, or work where sequencing/dependencies need discovery.\n - Plan subtasks automatically hand off to implementation after a successful `propose_plan`; expect the usual task completion output once implementation finishes.\n - For `plan` briefs, prioritize goal + constraints + acceptance criteria 
over file-by-file diff instructions.\n- Use `desktop` for GUI-heavy desktop automation that requires repeated screenshot → act → verify loops (for example, interacting with application windows, clicking through UI flows, or visual verification). The desktop agent enforces a grounding discipline that keeps visual context local.\n\nRecommended Orchestrator → Exec task brief template:\n\n- Task: \n- Background (why this matters):\n - \n- Scope / non-goals:\n - Scope: \n - Non-goals: \n- Starting points: \n- Dependencies / assumptions:\n - Assumes: \n - If unmet: stop and report back; do not expand scope to create prerequisites.\n- Acceptance: \n- Deliverables:\n - Commits: \n - Verification: \n- Constraints:\n - Do not expand scope.\n - Prefer `explore` tasks for repo investigation (paths/symbols/tests/patterns) to preserve your context window for implementation.\n Trust Explore reports as authoritative; do not re-verify unless ambiguous/contradictory.\n If starting points + acceptance are already clear, skip initial explore and only explore when blocked.\n - Create one or more git commits before `agent_report`.\n\nDependency analysis (required before spawning implementation tasks — `exec` or `plan`):\n\n- For each candidate subtask, write:\n - Outputs: files/targets/artifacts introduced/renamed/generated\n - Inputs / prerequisites (including for verification): what must already exist\n- A subtask is \"independent\" only if its patch can be applied + verified on the current parent workspace HEAD, without any other pending patch.\n- Parallelism is the default: maximize the size of each independent batch and run it in parallel.\n Use the sequential protocol only when a subtask has a concrete prerequisite on another subtask's outputs.\n- If task B depends on outputs from task A:\n - Do not spawn B until A has completed and A's patch is applied in the parent workspace.\n - If the dependency chain is tight (download → generate → wire-up), prefer one `exec` task rather than 
splitting.\n\nExample dependency chain (schema download → generation):\n\n- Task A outputs: a new download target + new schema files.\n- Task B inputs: those schema files; verifies by running generation.\n- Therefore: run Task A (await + apply patch) before spawning Task B.\n\nPatch integration loop (default):\n\n1. Identify a batch of independent subtasks.\n2. Spawn one implementation sub-agent task per subtask with `run_in_background: true` (`exec` for low complexity, `plan` for higher complexity).\n3. Await the batch via `task_await`.\n4. For each successful implementation task (`exec` directly, or `plan` after auto-handoff to implementation), integrate patches one at a time:\n - Treat every successful child task with a `taskId` as pending patch integration, whether the completion arrived inline from `task` or later from `task_await`.\n - Complete each dry-run + real-apply pair before starting the next patch. Applying one patch changes `HEAD`, which can invalidate later dry-run results.\n - Dry-run apply: `task_apply_git_patch` with `dry_run: true`.\n - If dry-run succeeds, immediately apply for real: `task_apply_git_patch` with `dry_run: false`.\n - Do not assume an inline `status: completed` result means the child changes are already present in this workspace.\n - If dry-run fails, treat it as a patch conflict and delegate reconciliation:\n 1. Do not attempt a real apply for that patch in this workspace.\n 2. Spawn a dedicated `exec` task. In the brief, include the original failing `task_id` and instruct the sub-agent to replay that patch via `task_apply_git_patch`, resolve conflicts in its own workspace, run `git am --continue`, commit the resolved result, and report back with a new patch to apply cleanly.\n - If real apply fails unexpectedly:\n 1. Restore a clean working tree before delegating: run `git am --abort` via `bash` only when a git-am session is in progress; if abort reports no operation in progress, continue.\n 2. 
Then follow the same delegated reconciliation flow above.\n5. Verify + review:\n - Run focused verification directly with `bash` when practical (for example: targeted tests or the repo's standard full-validation command), or delegate verification to `explore`/`exec` when investigation/fixes are likely.\n - Use `git`/`gh` directly for PR orchestration when a PR already exists (pushes, review-request comments, replies to review remarks, and CI/check-status waiting loops). Create a new PR only when the user explicitly asks.\n - PASS: summary-only (no long logs).\n - FAIL: include the failing command + key error lines; then delegate a fix to `exec`/`plan` and re-verify.\n\nSequential protocol (only for dependency chains):\n\n1. Spawn the prerequisite implementation task (`exec` or `plan`, based on complexity) with `run_in_background: false`.\n2. If step 1 returns `queued`/`running` without a completed report, call `task_await` with the returned `taskId` before attempting any patch apply. If step 1 returns `status: completed` inline, that same `taskId` still requires patch application.\n3. Dry-run apply its patch (`dry_run: true`); then apply for real (`dry_run: false`). If either step fails, follow the conflict playbook above (including `git am --abort` only when a real apply leaves a git-am session in progress).\n4. Only after the patch is applied, spawn the dependent implementation task.\n5. Repeat until the dependency chain is complete.\n\nNote: child workspaces are created at spawn time. 
Spawning dependents too early means they work from the wrong repo snapshot and get forced into scope expansion.\n\nKeep context minimal:\n\n- Do not request, paste, or restate large plans.\n- Prefer short, actionable prompts, but include enough context that the sub-agent does not need your plan file.\n - Child workspaces do not automatically have access to the parent's plan file; summarize just the relevant slice or provide file pointers.\n- Prefer file paths/symbols over long prose.\n",
 "plan": "---\nname: Plan\ndescription: Create a plan before coding\nui:\n color: var(--color-plan-mode)\nsubagent:\n runnable: true\ntools:\n add:\n # Allow all tools by default (includes MCP tools which have dynamic names)\n # Use tools.remove in child agents to restrict specific tools\n - .*\n remove:\n # Plan should not apply sub-agent patches.\n - task_apply_git_patch\n # Global config and catalog tools stay out of general-purpose agents\n - mux_agents_.*\n - agent_skill_write\n - agent_skill_delete\n - mux_config_read\n - mux_config_write\n - skills_catalog_.*\n - analytics_query\n require:\n - propose_plan\n # Note: file_edit_* tools ARE available but restricted to plan file only at runtime\n # Note: task tools ARE enabled - Plan delegates to Explore sub-agents\n---\n\nYou are in Plan Mode.\n\n- Every response MUST produce or update a plan.\n- Match the plan's size and structure to the problem.\n- Keep the plan self-contained and scannable.\n- Assume the user wants the completed plan, not a description of how you would make one.\n\n## Investigate only what you need\n\nBefore proposing a plan, figure out what you need to verify and gather that evidence.\n\n- When delegation is available, use Explore sub-agents for repo investigation. In Plan Mode, only\n spawn `agentId: \"explore\"` tasks.\n- Give each Explore task specific deliverables, and parallelize them when that helps.\n- Trust completed Explore reports for repo facts.
Do not re-investigate just to second-guess them.\n If something is missing, ambiguous, or conflicting, spawn another focused Explore task.\n- If task delegation is unavailable, do the narrowest read-only investigation yourself.\n- Reserve `file_read` for the plan file itself, user-provided text already in this conversation,\n and that narrow fallback. When reading the plan file, prefer `file_read` over `bash cat` so long\n plans do not get compacted.\n- Wait for any spawned Explore tasks before calling `propose_plan`.\n\n## Write the plan\n\n- Use whatever structure best fits the problem: a few bullets, phases, workstreams, risks, or\n decision points are all fine.\n- Include the context, constraints, evidence, and concrete path forward somewhere in that\n structure.\n- Name the files, symbols, or subsystems that matter, and order the work so an implementer can\n follow it.\n- Keep uncertainty brief and local to the relevant step. Use `ask_user_question` when you need the\n user to decide something.\n- Include small code snippets only when they materially reduce ambiguity.\n- Put long rationale or background into `
/` blocks.\n\n## Questions and handoff\n\n- If you need clarification from the user, use `ask_user_question` instead of asking in chat or\n adding an \"Open Questions\" section to the plan.\n- Ask up to 4 questions at a time (2–4 options each; \"Other\" remains available for free-form\n input).\n- After you get answers, update the plan and then call `propose_plan` when it is ready for review.\n- After calling `propose_plan`, do not paste the plan into chat or mention the plan file path.\n- If the user wants edits to other files, ask them to switch to Exec mode.\n\nWorkspace-specific runtime instructions (plan file path, edit restrictions, nesting warnings) are\nprovided separately.\n",
 };
diff --git a/src/node/services/agentDefinitions/builtInAgentDefinitions.test.ts b/src/node/services/agentDefinitions/builtInAgentDefinitions.test.ts
index e7ae13cb10..f4ce6a45bf 100644
--- a/src/node/services/agentDefinitions/builtInAgentDefinitions.test.ts
+++ b/src/node/services/agentDefinitions/builtInAgentDefinitions.test.ts
@@ -16,20 +16,6 @@ describe("built-in agent definitions", () => {
 expect(ids).toContain("plan");
 });
- test("includes orchestrator with expected flags", () => {
- const pkgs = getBuiltInAgentDefinitions();
- const byId = new Map(pkgs.map((pkg) => [pkg.id, pkg] as const));
-
- const orchestrator = byId.get("orchestrator");
- expect(orchestrator).toBeTruthy();
- expect(orchestrator?.frontmatter.ui?.requires).toBeUndefined();
- expect(orchestrator?.frontmatter.ui?.hidden).toBeUndefined();
- expect(orchestrator?.frontmatter.subagent?.append_prompt).toContain(
- "Do NOT create pull requests"
- );
- expect(orchestrator?.frontmatter.subagent?.runnable).toBe(false);
- });
-
 test("includes desktop built-in with desktop automation safeguards", () => {
 const pkgs = getBuiltInAgentDefinitions();
 const byId = new Map(pkgs.map((pkg) => [pkg.id, pkg] as const));
@@ -80,7 +66,7 @@ describe("built-in agent definitions", () => {
 expect(plan?.frontmatter.tools?.remove ??
[]).toContain("analytics_query");
 });
- test("task_apply_git_patch is restricted to exec/orchestrator", () => {
+ test("task_apply_git_patch is restricted to exec", () => {
 const pkgs = getBuiltInAgentDefinitions();
 const byId = new Map(pkgs.map((pkg) => [pkg.id, pkg] as const));
@@ -88,10 +74,6 @@ describe("built-in agent definitions", () => {
 expect(exec).toBeTruthy();
 expect(exec?.frontmatter.tools?.remove ?? []).not.toContain("task_apply_git_patch");
- const orchestrator = byId.get("orchestrator");
- expect(orchestrator).toBeTruthy();
- expect(orchestrator?.frontmatter.tools?.remove ?? []).not.toContain("task_apply_git_patch");
-
 const plan = byId.get("plan");
 expect(plan).toBeTruthy();
 expect(plan?.frontmatter.tools?.remove ?? []).toContain("task_apply_git_patch");
diff --git a/src/node/services/agentDefinitions/builtInAgentDefinitions.ts b/src/node/services/agentDefinitions/builtInAgentDefinitions.ts
index 24ec1bf1cc..db7396290e 100644
--- a/src/node/services/agentDefinitions/builtInAgentDefinitions.ts
+++ b/src/node/services/agentDefinitions/builtInAgentDefinitions.ts
@@ -21,7 +21,6 @@ const BUILT_IN_SOURCES: BuiltInSource[] = [
 { id: "desktop", content: BUILTIN_AGENT_CONTENT.desktop },
 { id: "explore", content: BUILTIN_AGENT_CONTENT.explore },
 { id: "name_workspace", content: BUILTIN_AGENT_CONTENT.name_workspace },
- { id: "orchestrator", content: BUILTIN_AGENT_CONTENT.orchestrator },
 ];
 let cachedPackages: AgentDefinitionPackage[] | null = null;
diff --git a/src/node/services/agentSession.ts b/src/node/services/agentSession.ts
index ac28585138..1aaf61aa48 100644
--- a/src/node/services/agentSession.ts
+++ b/src/node/services/agentSession.ts
@@ -4546,7 +4546,7 @@ export class AgentSession {
 return false;
 }
- // Check ui.requires gating (e.g., orchestrator requires a plan file).
+ // Check ui.requires gating (e.g., a custom agent that requires a plan file).
 // This matches the router's `requiresPlan && !planReady` check.
 const requiresPlan = resolvedFrontmatter.ui?.requires?.includes("plan") ?? false;
 if (requiresPlan) {
diff --git a/src/node/services/agentSkills/builtInSkillContent.generated.ts b/src/node/services/agentSkills/builtInSkillContent.generated.ts
index b7549c5e50..3b51c67ecb 100644
--- a/src/node/services/agentSkills/builtInSkillContent.generated.ts
+++ b/src/node/services/agentSkills/builtInSkillContent.generated.ts
@@ -208,8 +208,6 @@ export const BUILTIN_SKILL_FILES: Record> = {
 "",
 "- If a PR has Codex review comments, address + resolve them, then re-request review by commenting `@codex review` on the PR.",
 "- Prefer `gh` CLI for GitHub interactions over manual web/curl flows.",
- "- In Orchestrator mode, delegate implementation/verification commands to `exec` or `explore` sub-agents and integrate their patches; do not bypass delegation with direct local edits.",
- "- In Orchestrator mode, route higher-complexity implementation tasks to `plan` sub-agents so they can research and produce a precise plan before auto-handoff to implementation.",
 "",
 "- User preference: when work is already on an open PR, push branch updates at the end of each completed change set so the PR stays current.",
 '- **PR creation gate:** Do **not** open/create a pull request unless the user explicitly asks (e.g., "open a PR", "create PR", "submit this"). By default, complete local validation, commit/push branch updates as requested, and let the user review before deciding whether to open a PR.',
@@ -221,11 +219,11 @@ export const BUILTIN_SKILL_FILES: Record> = {
 "When a PR exists, you MUST remain in this loop until the PR is fully ready:",
 "",
 "1. Push your latest fixes.",
- "2. Run local validation (`make static-check` and targeted tests as needed); in Orchestrator mode, delegate command execution to sub-agents.",
+ "2. Run local validation (`make static-check` and targeted tests as needed).",
 "3. Request review with `@codex review`.",
 "4.
Run `./scripts/wait_pr_ready.sh ` (which must execute `./scripts/wait_pr_checks.sh --once` while checks are pending).", - "5. If Codex leaves comments, address them (delegate fixes in Orchestrator mode), resolve threads with `./scripts/resolve_pr_comment.sh `, push, and repeat.", - "6. If checks/mergeability fail, fix issues locally (delegate fixes in Orchestrator mode), push, and repeat.", + "5. If Codex leaves comments, address them, resolve threads with `./scripts/resolve_pr_comment.sh `, push, and repeat.", + "6. If checks/mergeability fail, fix issues locally, push, and repeat.", "", "The only early-stop exception is when the reviewer is clearly misunderstanding the intended change and further churn would be counterproductive. In that case, leave a clarifying PR comment and pause for human direction.", "", @@ -955,181 +953,6 @@ export const BUILTIN_SKILL_FILES: Record> = { "", "", "", - "### Orchestrator", - "", - "**Coordinate sub-agent implementation and apply patches**", - "", - '', - "", - "```md", - "---", - "name: Orchestrator", - "description: Coordinate sub-agent implementation and apply patches", - "base: exec", - "subagent:", - " runnable: false", - " append_prompt: |", - " You are running as a sub-agent orchestrator in a child workspace.", - "", - " - Your parent workspace handles all PR management.", - " Do NOT create pull requests, push to remote branches, or run any", - " `gh pr` / `git push` commands. 
This applies even if AGENTS.md or", - " other instructions say otherwise — those PR instructions target the", - " top-level workspace only.", - " - Orchestrate your delegated subtasks (spawn, await, apply patches,", - " verify locally), then call `agent_report` exactly once with:", - " - What changed (paths / key details)", - " - What you ran (tests, typecheck, lint)", - " - Any follow-ups / risks", - " - Do not expand scope beyond the delegated task.", - "tools:", - " add:", - " - ask_user_question", - " remove:", - " - propose_plan", - " # Keep Orchestrator focused on coordination: no direct file edits.", - " - file_edit_.*", - "---", - "", - "You are an internal Orchestrator agent running in Exec mode.", - "", - "**Mission:** coordinate implementation by delegating investigation + coding to sub-agents, then integrating their patches into this workspace.", - "", - "When a plan is present (default):", - "", - "- Treat the accepted plan as the source of truth. Its file paths, symbols, and structure were validated during planning — do not routinely spawn `explore` to re-confirm them. Exception: if the plan references stale paths or appears to have been authored/edited by the user without planner validation, a single targeted `explore` to sanity-check critical paths is acceptable.", - "- Spawning `explore` to gather _additional_ context beyond what the plan provides is encouraged (e.g., checking whether a helper already exists, locating test files not mentioned in the plan, discovering existing patterns to match). 
This produces better implementation task briefs.", - "- Do not spawn `explore` just to verify that a planner-generated plan is correct — that is the planner's job, and the plan was accepted by the user.", - "- Convert the plan into concrete implementation subtasks and start delegation (`exec` for low complexity, `plan` for higher complexity).", - "", - "What you are allowed to do directly in this workspace:", - "", - "- Spawn/await/manage sub-agent tasks (`task`, `task_await`, `task_list`, `task_terminate`).", - "- Apply patches (`task_apply_git_patch`).", - "- Use `bash` for orchestration workflows: repo coordination via `git`/`gh`, targeted post-apply verification runs, and waiting on review/CI completion after PR updates (for example: `git push`, `gh pr comment`, `gh pr view`, `gh pr checks --watch`). Only run `gh pr create` when the user explicitly asks you to open a PR.", - "- Ask clarifying questions with `ask_user_question` when blocked.", - "- Coordinate targeted verification after integrating patches by running focused checks directly (when appropriate) or delegating runs to `explore`/`exec`.", - "- Delegate patch-conflict reconciliation to `exec` sub-agents.", - "", - "Hard rules (delegate-first):", - "", - "- Trust `explore` sub-agent reports as authoritative for repo facts (paths/symbols/callsites). 
Do not redo the same investigation yourself; only re-check if the report is ambiguous or contradicts other evidence.", - "- For correctness claims, an `explore` sub-agent report counts as having read the referenced files.", - "- **Do not do broad repo investigation here.** If you need context, spawn an `explore` sub-agent with a narrow prompt (keeps this agent focused on coordination).", - "- **Do not implement features/bugfixes directly here.** Spawn `exec` (simple) or `plan` (complex) sub-agents and have them complete the work end-to-end.", - "- **Do not use `bash` for file reads/writes, manual code editing, or broad repo exploration.** `bash` in this workspace is for orchestration-only operations: `git`/`gh` repo management, targeted post-apply verification checks, and waiting for PR review/CI outcomes. If direct checks fail due to code issues, delegate fixes to `exec`/`plan` sub-agents instead of implementing changes here.", - "- **Never read or scan session storage.** This includes `~/.mux/sessions/**` and `~/.mux/sessions/subagent-patches/**`. Treat session storage as an internal implementation detail; do not shell out to locate patch artifacts on disk. Only use `task_apply_git_patch` to access patches.", - "", - "Delegation guide:", - "", - '- Use `explore` for narrowly-scoped read-only questions (confirm an assumption, locate a symbol/callsite, find relevant tests). 
Avoid "scan the repo" prompts.', - "- Use `exec` for straightforward, low-complexity work where the implementation path is obvious from the task brief.", - " - Good fit: single-file edits, localized wiring to existing helpers, straightforward command execution, or narrowly scoped follow-ups with clear acceptance.", - " - Provide a compact task brief (so the sub-agent can act without reading the full plan) with:", - " - Task: one sentence", - " - Background (why this matters): 1–3 bullets", - " - Scope / non-goals: what to change, and what not to change", - " - Starting points: relevant files/symbols/paths (from prior exploration)", - " - Acceptance: bullets / checks", - " - Deliverables: commits + verification commands to run", - " - Constraints:", - " - Do not expand scope.", - " - Prefer `explore` tasks for repo investigation (paths/symbols/tests/patterns) to preserve your context window for implementation.", - " Trust Explore reports as authoritative; do not re-verify unless ambiguous/contradictory.", - " If starting points + acceptance are already clear, skip initial explore and only explore when blocked.", - " - Create one or more git commits before `agent_report`.", - "- Use `plan` for higher-complexity subtasks that touch multiple files/locations, require non-trivial investigation, or have an unclear implementation approach.", - " - Default to `plan` when a subtask needs coordinated updates across multiple locations, unless the edits are mechanical and already fully specified.", - " - For higher-complexity implementation work, prefer `plan` over `exec` so the sub-agent can do targeted research and produce a precise plan before implementation begins.", - " - Good fit: multi-file refactors, cross-module behavior changes, unfamiliar subsystems, or work where sequencing/dependencies need discovery.", - " - Plan subtasks automatically hand off to implementation after a successful `propose_plan`; expect the usual task completion output once implementation 
finishes.", - " - For `plan` briefs, prioritize goal + constraints + acceptance criteria over file-by-file diff instructions.", - "- Use `desktop` for GUI-heavy desktop automation that requires repeated screenshot → act → verify loops (for example, interacting with application windows, clicking through UI flows, or visual verification). The desktop agent enforces a grounding discipline that keeps visual context local.", - "", - "Recommended Orchestrator → Exec task brief template:", - "", - "- Task: ", - "- Background (why this matters):", - " - ", - "- Scope / non-goals:", - " - Scope: ", - " - Non-goals: ", - "- Starting points: ", - "- Dependencies / assumptions:", - " - Assumes: ", - " - If unmet: stop and report back; do not expand scope to create prerequisites.", - "- Acceptance: ", - "- Deliverables:", - " - Commits: ", - " - Verification: ", - "- Constraints:", - " - Do not expand scope.", - " - Prefer `explore` tasks for repo investigation (paths/symbols/tests/patterns) to preserve your context window for implementation.", - " Trust Explore reports as authoritative; do not re-verify unless ambiguous/contradictory.", - " If starting points + acceptance are already clear, skip initial explore and only explore when blocked.", - " - Create one or more git commits before `agent_report`.", - "", - "Dependency analysis (required before spawning implementation tasks — `exec` or `plan`):", - "", - "- For each candidate subtask, write:", - " - Outputs: files/targets/artifacts introduced/renamed/generated", - " - Inputs / prerequisites (including for verification): what must already exist", - '- A subtask is "independent" only if its patch can be applied + verified on the current parent workspace HEAD, without any other pending patch.', - "- Parallelism is the default: maximize the size of each independent batch and run it in parallel.", - " Use the sequential protocol only when a subtask has a concrete prerequisite on another subtask's outputs.", - "- If task B 
depends on outputs from task A:", - " - Do not spawn B until A has completed and A's patch is applied in the parent workspace.", - " - If the dependency chain is tight (download → generate → wire-up), prefer one `exec` task rather than splitting.", - "", - "Example dependency chain (schema download → generation):", - "", - "- Task A outputs: a new download target + new schema files.", - "- Task B inputs: those schema files; verifies by running generation.", - "- Therefore: run Task A (await + apply patch) before spawning Task B.", - "", - "Patch integration loop (default):", - "", - "1. Identify a batch of independent subtasks.", - "2. Spawn one implementation sub-agent task per subtask with `run_in_background: true` (`exec` for low complexity, `plan` for higher complexity).", - "3. Await the batch via `task_await`.", - "4. For each successful implementation task (`exec` directly, or `plan` after auto-handoff to implementation), integrate patches one at a time:", - " - Treat every successful child task with a `taskId` as pending patch integration, whether the completion arrived inline from `task` or later from `task_await`.", - " - Complete each dry-run + real-apply pair before starting the next patch. Applying one patch changes `HEAD`, which can invalidate later dry-run results.", - " - Dry-run apply: `task_apply_git_patch` with `dry_run: true`.", - " - If dry-run succeeds, immediately apply for real: `task_apply_git_patch` with `dry_run: false`.", - " - Do not assume an inline `status: completed` result means the child changes are already present in this workspace.", - " - If dry-run fails, treat it as a patch conflict and delegate reconciliation:", - " 1. Do not attempt a real apply for that patch in this workspace.", - " 2. Spawn a dedicated `exec` task. 
In the brief, include the original failing `task_id` and instruct the sub-agent to replay that patch via `task_apply_git_patch`, resolve conflicts in its own workspace, run `git am --continue`, commit the resolved result, and report back with a new patch to apply cleanly.", - " - If real apply fails unexpectedly:", - " 1. Restore a clean working tree before delegating: run `git am --abort` via `bash` only when a git-am session is in progress; if abort reports no operation in progress, continue.", - " 2. Then follow the same delegated reconciliation flow above.", - "5. Verify + review:", - " - Run focused verification directly with `bash` when practical (for example: targeted tests or the repo's standard full-validation command), or delegate verification to `explore`/`exec` when investigation/fixes are likely.", - " - Use `git`/`gh` directly for PR orchestration when a PR already exists (pushes, review-request comments, replies to review remarks, and CI/check-status waiting loops). Create a new PR only when the user explicitly asks.", - " - PASS: summary-only (no long logs).", - " - FAIL: include the failing command + key error lines; then delegate a fix to `exec`/`plan` and re-verify.", - "", - "Sequential protocol (only for dependency chains):", - "", - "1. Spawn the prerequisite implementation task (`exec` or `plan`, based on complexity) with `run_in_background: false`.", - "2. If step 1 returns `queued`/`running` without a completed report, call `task_await` with the returned `taskId` before attempting any patch apply. If step 1 returns `status: completed` inline, that same `taskId` still requires patch application.", - "3. Dry-run apply its patch (`dry_run: true`); then apply for real (`dry_run: false`). If either step fails, follow the conflict playbook above (including `git am --abort` only when a real apply leaves a git-am session in progress).", - "4. Only after the patch is applied, spawn the dependent implementation task.", - "5. 
Repeat until the dependency chain is complete.", - "", - "Note: child workspaces are created at spawn time. Spawning dependents too early means they work from the wrong repo snapshot and get forced into scope expansion.", - "", - "Keep context minimal:", - "", - "- Do not request, paste, or restate large plans.", - "- Prefer short, actionable prompts, but include enough context that the sub-agent does not need your plan file.", - " - Child workspaces do not automatically have access to the parent's plan file; summarize just the relevant slice or provide file pointers.", - "- Prefer file paths/symbols over long prose.", - "```", - "", - "", - "", "### Plan", "", "**Create a plan before coding**", diff --git a/src/node/services/planExecutorRouter.test.ts b/src/node/services/planExecutorRouter.test.ts deleted file mode 100644 index 32939c0251..0000000000 --- a/src/node/services/planExecutorRouter.test.ts +++ /dev/null @@ -1,107 +0,0 @@ -import { describe, expect, it } from "bun:test"; -import type { LanguageModel } from "ai"; - -import { routePlanToExecutor } from "./planExecutorRouter"; - -describe("planExecutorRouter", () => { - it("returns orchestrator when select_executor chooses orchestrator", async () => { - let calls = 0; - - const decision = await routePlanToExecutor({ - model: {} as unknown as LanguageModel, - planContent: "Update backend, frontend, and tests in parallel.", - timeoutMs: 5_000, - generateTextImpl: async (args) => { - calls += 1; - - expect((args as { toolChoice?: unknown }).toolChoice).toEqual({ - type: "tool", - toolName: "select_executor", - }); - - const tools = (args as { tools?: unknown }).tools as Record; - const selectExecutorTool = tools.select_executor as { - execute: (input: unknown, options: unknown) => Promise; - }; - - await selectExecutorTool.execute( - { - target: "orchestrator", - reasoning: "Plan spans independent workstreams.", - }, - {} - ); - - return { finishReason: "stop" }; - }, - }); - - expect(calls).toBe(1); - 
expect(decision).toEqual({ - target: "orchestrator", - reasoning: "Plan spans independent workstreams.", - }); - }); - - it("retries once with a reminder when no tool call is produced", async () => { - let calls = 0; - - const decision = await routePlanToExecutor({ - model: {} as unknown as LanguageModel, - planContent: "Single-file refactor.", - timeoutMs: 5_000, - generateTextImpl: async (args) => { - calls += 1; - - const messages = (args as { messages?: unknown }).messages as - | Array<{ content?: unknown }> - | undefined; - expect(Array.isArray(messages)).toBe(true); - - if (calls === 1) { - expect(messages?.length).toBe(1); - return { finishReason: "stop" }; - } - - expect(messages?.length).toBe(2); - expect(messages?.[1]?.content).toBe( - "Reminder: You MUST call select_executor exactly once. Do not output any text." - ); - - const tools = (args as { tools?: unknown }).tools as Record; - const selectExecutorTool = tools.select_executor as { - execute: (input: unknown, options: unknown) => Promise; - }; - - await selectExecutorTool.execute( - { - target: "exec", - reasoning: "Plan is focused and sequential.", - }, - {} - ); - - return { finishReason: "stop" }; - }, - }); - - expect(calls).toBe(2); - expect(decision).toEqual({ - target: "exec", - reasoning: "Plan is focused and sequential.", - }); - }); - - it("defaults to exec when the model never calls select_executor", async () => { - const decision = await routePlanToExecutor({ - model: {} as unknown as LanguageModel, - planContent: "Any plan", - timeoutMs: 5_000, - generateTextImpl: () => { - return Promise.resolve({ finishReason: "stop" }); - }, - }); - - expect(decision).toEqual({ target: "exec" }); - }); -}); diff --git a/src/node/services/planExecutorRouter.ts b/src/node/services/planExecutorRouter.ts deleted file mode 100644 index 2b1da25f1b..0000000000 --- a/src/node/services/planExecutorRouter.ts +++ /dev/null @@ -1,147 +0,0 @@ -import assert from "@/common/utils/assert"; - -import { generateText, 
tool, type LanguageModel, type Tool } from "ai"; -import { z } from "zod"; - -import { getErrorMessage } from "@/common/utils/errors"; -import { log } from "@/node/services/log"; -import { linkAbortSignal } from "@/node/utils/abort"; - -export type PlanExecutorRoutingTarget = "exec" | "orchestrator"; - -export interface PlanExecutorRoutingDecision { - target: PlanExecutorRoutingTarget; - reasoning?: string; -} - -export type GenerateTextLike = ( - args: Parameters[0] -) => Promise<{ finishReason?: string }>; - -interface RoutePlanToExecutorParams { - model: LanguageModel; - planContent: string; - timeoutMs?: number; - abortSignal?: AbortSignal; - generateTextImpl?: GenerateTextLike; -} - -const PLAN_EXECUTOR_ROUTING_TIMEOUT_MS = 15_000; - -const PLAN_EXECUTOR_ROUTING_PROMPT = `You are a routing agent. - -Given a software implementation plan, decide which executor should implement it: -- "exec": a single execution agent should implement the plan. -- "orchestrator": an orchestrator should coordinate multiple sub-agents. - -Choose "exec" when: -- The plan is focused and mostly sequential. -- The work is likely confined to one subsystem or a small set of related files. -- Parallelism would add coordination overhead without clear benefit. - -Choose "orchestrator" when: -- The plan spans multiple subsystems with separable workstreams. -- The plan can be parallelized into independent tasks. -- The implementation likely needs coordinated backend/frontend/test updates in parallel. - -You MUST call select_executor exactly once. -Do not output plain text.`; - -const SELECT_EXECUTOR_REMINDER = - "Reminder: You MUST call select_executor exactly once. 
Do not output any text."; - -const selectExecutorInputSchema = z.object({ - target: z.enum(["exec", "orchestrator"]), - reasoning: z.string().min(1), -}); - -export async function routePlanToExecutor( - params: RoutePlanToExecutorParams -): Promise { - assert(params, "routePlanToExecutor: params is required"); - assert(params.model, "routePlanToExecutor: model is required"); - assert( - typeof params.planContent === "string" && params.planContent.trim().length > 0, - "routePlanToExecutor: planContent must be a non-empty string" - ); - - const timeoutMs = params.timeoutMs ?? PLAN_EXECUTOR_ROUTING_TIMEOUT_MS; - assert( - Number.isInteger(timeoutMs) && timeoutMs > 0, - "routePlanToExecutor: timeoutMs must be a positive integer" - ); - - const routeAbortController = new AbortController(); - const unlinkAbortSignal = linkAbortSignal(params.abortSignal, routeAbortController); - - let timedOut = false; - const timeout = setTimeout(() => { - timedOut = true; - routeAbortController.abort(); - }, timeoutMs); - timeout.unref?.(); - - let selectedDecision: PlanExecutorRoutingDecision | undefined; - - const tools: Record = { - select_executor: tool({ - description: "Select which executor should implement this plan.", - inputSchema: selectExecutorInputSchema, - execute: (input) => { - const reasoning = input.reasoning.trim(); - selectedDecision = { - target: input.target, - reasoning: reasoning.length > 0 ? reasoning : undefined, - }; - - // Signal-tool semantics: the decision is consumed by the caller. - return { - ok: true, - target: input.target, - }; - }, - }), - }; - - const attemptMessages: Array[0]["messages"]>> = [ - [{ role: "user", content: params.planContent }], - [ - { role: "user", content: params.planContent }, - { role: "user", content: SELECT_EXECUTOR_REMINDER }, - ], - ]; - - const generate = params.generateTextImpl ?? 
generateText; - - try { - for (const messages of attemptMessages) { - selectedDecision = undefined; - - await generate({ - model: params.model, - system: PLAN_EXECUTOR_ROUTING_PROMPT, - messages, - tools, - toolChoice: { type: "tool", toolName: "select_executor" }, - maxRetries: 0, - abortSignal: routeAbortController.signal, - }); - - if (selectedDecision) { - return selectedDecision; - } - } - - log.warn("Plan executor routing returned no tool decision; defaulting to exec"); - return { target: "exec" }; - } catch (error: unknown) { - log.warn("Plan executor routing failed; defaulting to exec", { - timedOut, - error: getErrorMessage(error), - }); - return { target: "exec" }; - } finally { - clearTimeout(timeout); - unlinkAbortSignal(); - } -} diff --git a/src/node/services/streamContextBuilder.ts b/src/node/services/streamContextBuilder.ts index 75bbd26e3d..7185bc8c75 100644 --- a/src/node/services/streamContextBuilder.ts +++ b/src/node/services/streamContextBuilder.ts @@ -165,10 +165,10 @@ export async function buildPlanInstructions( : nestingInstruction; } - // Read plan content for agent transition (plan-like → exec/orchestrator). - // Only read if switching to the built-in exec/orchestrator agent and last assistant was plan-like. + // Read plan content for agent transition (plan-like → exec). + // Only read if switching to the built-in exec agent and last assistant was plan-like. 
let planContentForTransition: string | undefined; - const isPlanHandoffAgent = effectiveAgentId === "exec" || effectiveAgentId === "orchestrator"; + const isPlanHandoffAgent = effectiveAgentId === "exec"; if (isPlanHandoffAgent && !chatHasStartHerePlanSummary) { const lastAssistantMessage = [...requestPayloadMessages] .reverse() diff --git a/src/node/services/taskService.test.ts b/src/node/services/taskService.test.ts index 42a0bc1b5e..c0a9d68ca3 100644 --- a/src/node/services/taskService.test.ts +++ b/src/node/services/taskService.test.ts @@ -26,13 +26,8 @@ import * as forkOrchestrator from "@/node/services/utils/forkOrchestrator"; import { Ok, Err, type Result } from "@/common/types/result"; import { defaultModel } from "@/common/utils/ai/models"; import { enforceThinkingPolicy } from "@/common/utils/thinking/policy"; -import type { PlanSubagentExecutorRouting } from "@/common/types/tasks"; import type { ThinkingLevel } from "@/common/types/thinking"; import type { ErrorEvent, StreamEndEvent } from "@/common/types/stream"; -import { - PLAN_AUTO_ROUTING_STATUS_EMOJI, - PLAN_AUTO_ROUTING_STATUS_MESSAGE, -} from "@/common/constants/planAutoRoutingStatus"; import { createMuxMessage, type MuxMessage } from "@/common/types/message"; import { isDynamicToolPart, type DynamicToolPart } from "@/common/types/toolParts"; import type { WorkspaceMetadata } from "@/common/types/workspace"; @@ -8001,10 +7996,7 @@ describe("TaskService", () => { }); async function setupPlanModeStreamEndHarness(options?: { - planSubagentExecutorRouting?: PlanSubagentExecutorRouting; - planSubagentDefaultsToOrchestrator?: boolean; childAgentId?: string; - disableOrchestrator?: boolean; maxTaskNestingDepth?: number; parentAiSettingsByAgent?: Record; agentAiDefaults?: Record< @@ -8042,18 +8034,7 @@ describe("TaskService", () => { ); } - const agentAiDefaults = { - ...(options?.agentAiDefaults ?? {}), - ...(options?.disableOrchestrator - ? 
{ - orchestrator: { - modelString: "openai:gpt-4o-mini", - thinkingLevel: "off" as ThinkingLevel, - enabled: false, - }, - } - : {}), - }; + const agentAiDefaults = { ...(options?.agentAiDefaults ?? {}) }; await config.saveConfig({ projects: new Map([ @@ -8088,14 +8069,6 @@ describe("TaskService", () => { taskSettings: { maxParallelAgentTasks: 3, maxTaskNestingDepth: options?.maxTaskNestingDepth ?? 3, - planSubagentExecutorRouting: - options?.planSubagentExecutorRouting ?? - (options?.planSubagentDefaultsToOrchestrator ? "orchestrator" : "exec"), - ...(typeof options?.planSubagentDefaultsToOrchestrator === "boolean" - ? { - planSubagentDefaultsToOrchestrator: options.planSubagentDefaultsToOrchestrator, - } - : {}), }, agentAiDefaults: Object.keys(agentAiDefaults).length > 0 ? agentAiDefaults : undefined, subagentAiDefaults: options?.subagentAiDefaults, @@ -8121,21 +8094,6 @@ describe("TaskService", () => { const internal = taskService as unknown as { handleStreamEnd: (event: StreamEndEvent) => Promise; - resolvePlanAutoHandoffTargetAgentId: (args: { - workspaceId: string; - entry: { - projectPath: string; - workspace: { - id?: string; - name?: string; - path?: string; - runtimeConfig?: unknown; - taskModelString?: string; - }; - }; - routing: PlanSubagentExecutorRouting; - planContent: string | null; - }) => Promise<"exec" | "orchestrator">; }; return { @@ -8562,188 +8520,6 @@ describe("TaskService", () => { expect(childWorkspace?.taskStatus).toBe("interrupted"); }); - test("stream-end with propose_plan success in auto routing falls back to exec when plan content is unavailable", async () => { - const { config, childId, sendMessage, createModel, updateAgentStatus, internal } = - await setupPlanModeStreamEndHarness({ - planSubagentExecutorRouting: "auto", - }); - - await internal.handleStreamEnd(makeSuccessfulProposePlanStreamEndEvent(childId)); - - expect(createModel).not.toHaveBeenCalled(); - expect(sendMessage).toHaveBeenCalledTimes(1); - 
expect(sendMessage).toHaveBeenCalledWith( - childId, - expect.stringContaining("Implement the plan"), - expect.objectContaining({ agentId: "exec" }), - expect.objectContaining({ synthetic: true }) - ); - expect(updateAgentStatus).toHaveBeenNthCalledWith( - 1, - childId, - expect.objectContaining({ - emoji: PLAN_AUTO_ROUTING_STATUS_EMOJI, - message: PLAN_AUTO_ROUTING_STATUS_MESSAGE, - url: "", - }) - ); - expect(updateAgentStatus).toHaveBeenNthCalledWith(2, childId, null); - - const postCfg = config.loadConfigOrDefault(); - const updatedTask = Array.from(postCfg.projects.values()) - .flatMap((project) => project.workspaces) - .find((workspace) => workspace.id === childId); - - expect(updatedTask?.agentId).toBe("exec"); - expect(updatedTask?.taskStatus).toBe("running"); - }); - - test("auto plan handoff routing defaults to exec when orchestrator would have no task tools", async () => { - const { config, projectPath, childId, createModel, internal } = - await setupPlanModeStreamEndHarness({ - planSubagentExecutorRouting: "auto", - maxTaskNestingDepth: 1, - }); - - const cfg = config.loadConfigOrDefault(); - const childWorkspace = Array.from(cfg.projects.values()) - .flatMap((project) => project.workspaces) - .find((workspace) => workspace.id === childId); - expect(childWorkspace).toBeTruthy(); - if (!childWorkspace) return; - - const targetAgentId = await internal.resolvePlanAutoHandoffTargetAgentId({ - workspaceId: childId, - entry: { - projectPath, - workspace: childWorkspace, - }, - routing: "auto", - planContent: "1. 
Delegate implementation work to a child task.", - }); - - expect(targetAgentId).toBe("exec"); - expect(createModel).not.toHaveBeenCalled(); - }); - - test("auto plan handoff routing still evaluates the model when task tools are available", async () => { - const { config, projectPath, childId, createModel, internal } = - await setupPlanModeStreamEndHarness({ - planSubagentExecutorRouting: "auto", - }); - - const cfg = config.loadConfigOrDefault(); - const childWorkspace = Array.from(cfg.projects.values()) - .flatMap((project) => project.workspaces) - .find((workspace) => workspace.id === childId); - expect(childWorkspace).toBeTruthy(); - if (!childWorkspace) return; - - const targetAgentId = await internal.resolvePlanAutoHandoffTargetAgentId({ - workspaceId: childId, - entry: { - projectPath, - workspace: childWorkspace, - }, - routing: "auto", - planContent: "1. Implement the changes directly in this workspace.", - }); - - expect(targetAgentId).toBe("exec"); - expect(createModel).toHaveBeenCalledTimes(1); - }); - - test("stream-end with propose_plan success hands off to orchestrator when routing is orchestrator", async () => { - const { config, childId, sendMessage, replaceHistory, internal } = - await setupPlanModeStreamEndHarness({ - planSubagentExecutorRouting: "orchestrator", - }); - - await internal.handleStreamEnd(makeSuccessfulProposePlanStreamEndEvent(childId)); - - expect(replaceHistory).toHaveBeenCalledWith( - childId, - expect.anything(), - expect.objectContaining({ mode: "append-compaction-boundary" }) - ); - - expect(sendMessage).toHaveBeenCalledTimes(1); - expect(sendMessage).toHaveBeenCalledWith( - childId, - expect.stringContaining("orchestrating"), - expect.objectContaining({ agentId: "orchestrator" }), - expect.objectContaining({ synthetic: true }) - ); - - const postCfg = config.loadConfigOrDefault(); - const updatedTask = Array.from(postCfg.projects.values()) - .flatMap((project) => project.workspaces) - .find((workspace) => workspace.id === 
childId); - - expect(updatedTask?.agentId).toBe("orchestrator"); - expect(updatedTask?.taskStatus).toBe("running"); - }); - - test("orchestrator handoff inherits parent model when orchestrator defaults are unset", async () => { - const { config, childId, sendMessage, internal } = await setupPlanModeStreamEndHarness({ - planSubagentExecutorRouting: "orchestrator", - agentAiDefaults: { - exec: { - modelString: "openai:gpt-5.3-codex", - thinkingLevel: "xhigh", - }, - }, - }); - - await internal.handleStreamEnd(makeSuccessfulProposePlanStreamEndEvent(childId)); - - expect(sendMessage).toHaveBeenCalledTimes(1); - expect(sendMessage).toHaveBeenCalledWith( - childId, - expect.stringContaining("orchestrating"), - expect.objectContaining({ - agentId: "orchestrator", - model: "openai:gpt-4o-mini", - thinkingLevel: "off", - }), - expect.objectContaining({ synthetic: true }) - ); - - const postCfg = config.loadConfigOrDefault(); - const updatedTask = Array.from(postCfg.projects.values()) - .flatMap((project) => project.workspaces) - .find((workspace) => workspace.id === childId); - - expect(updatedTask?.agentId).toBe("orchestrator"); - expect(updatedTask?.taskModelString).toBe("openai:gpt-4o-mini"); - expect(updatedTask?.taskThinkingLevel).toBe("off"); - }); - - test("stream-end with propose_plan success falls back to exec when orchestrator is disabled", async () => { - const { config, childId, sendMessage, internal } = await setupPlanModeStreamEndHarness({ - planSubagentExecutorRouting: "orchestrator", - disableOrchestrator: true, - }); - - await internal.handleStreamEnd(makeSuccessfulProposePlanStreamEndEvent(childId)); - - expect(sendMessage).toHaveBeenCalledTimes(1); - expect(sendMessage).toHaveBeenCalledWith( - childId, - expect.stringContaining("Implement the plan"), - expect.objectContaining({ agentId: "exec" }), - expect.objectContaining({ synthetic: true }) - ); - - const postCfg = config.loadConfigOrDefault(); - const updatedTask = 
Array.from(postCfg.projects.values()) - .flatMap((project) => project.workspaces) - .find((workspace) => workspace.id === childId); - - expect(updatedTask?.agentId).toBe("exec"); - expect(updatedTask?.taskStatus).toBe("running"); - }); - test("handoff kickoff sendMessage failure keeps task status as running for restart recovery", async () => { const sendMessageFailure = mock( (): Promise> => Promise.resolve(Err("kickoff failed")) diff --git a/src/node/services/taskService.ts b/src/node/services/taskService.ts index 784d3c6081..79d4d2d390 100644 --- a/src/node/services/taskService.ts +++ b/src/node/services/taskService.ts @@ -25,7 +25,6 @@ import { MultiProjectRuntime } from "@/node/runtime/multiProjectRuntime"; import { runBackgroundInit } from "@/node/runtime/runtimeFactory"; import type { InitLogger, Runtime } from "@/node/runtime/Runtime"; import { readPlanFile } from "@/node/utils/runtime/helpers"; -import { routePlanToExecutor } from "@/node/services/planExecutorRouter"; import { coerceNonEmptyString, tryReadGitHeadCommitSha, @@ -44,7 +43,6 @@ import { Ok, Err, type Result } from "@/common/types/result"; import { DEFAULT_TASK_SETTINGS, normalizeTaskSettings, - type PlanSubagentExecutorRouting, type TaskSettings, } from "@/common/types/tasks"; @@ -70,13 +68,9 @@ import { TaskToolResultSchema, TaskToolArgsSchema, } from "@/common/utils/tools/toolDefinitions"; -import { isPlanLikeInResolvedChain, isToolEnabledInResolvedChain } from "@/common/utils/agentTools"; +import { isPlanLikeInResolvedChain } from "@/common/utils/agentTools"; import { formatSendMessageError } from "@/node/services/utils/sendMessageError"; import { enforceThinkingPolicy } from "@/common/utils/thinking/policy"; -import { - PLAN_AUTO_ROUTING_STATUS_EMOJI, - PLAN_AUTO_ROUTING_STATUS_MESSAGE, -} from "@/common/constants/planAutoRoutingStatus"; import { taskQueueDebug } from "@/node/services/taskQueueDebug"; import { readSubagentGitPatchArtifact } from "@/node/services/subagentGitPatchArtifacts"; 
 import {
@@ -549,267 +543,6 @@ export class TaskService {
     }
   }
 
-  private getTaskWorkspaceAgentResolutionContext(args: {
-    projectPath: string;
-    workspace: Pick;
-  }): {
-    workspaceName: string;
-    runtime: Runtime;
-    workspacePath: string;
-  } | null {
-    assert(
-      args.projectPath.length > 0,
-      "getTaskWorkspaceAgentResolutionContext: projectPath must be non-empty"
-    );
-
-    const workspaceName = coerceNonEmptyString(args.workspace.name) ?? args.workspace.id;
-    if (!workspaceName) {
-      return null;
-    }
-
-    const runtimeConfig = args.workspace.runtimeConfig ?? DEFAULT_RUNTIME_CONFIG;
-    const runtime = createRuntimeForWorkspace({
-      runtimeConfig,
-      projectPath: args.projectPath,
-      name: workspaceName,
-    });
-    const workspacePath =
-      coerceNonEmptyString(args.workspace.path) ??
-      runtime.getWorkspacePath(args.projectPath, workspaceName);
-    if (!workspacePath) {
-      return null;
-    }
-
-    return {
-      workspaceName,
-      runtime,
-      workspacePath,
-    };
-  }
-
-  private async isAgentEnabledForTaskWorkspace(args: {
-    workspaceId: string;
-    projectPath: string;
-    workspace: Pick;
-    agentId: "exec" | "orchestrator";
-  }): Promise {
-    assert(
-      args.workspaceId.length > 0,
-      "isAgentEnabledForTaskWorkspace: workspaceId must be non-empty"
-    );
-    assert(
-      args.projectPath.length > 0,
-      "isAgentEnabledForTaskWorkspace: projectPath must be non-empty"
-    );
-
-    const resolutionContext = this.getTaskWorkspaceAgentResolutionContext({
-      projectPath: args.projectPath,
-      workspace: args.workspace,
-    });
-    if (!resolutionContext) {
-      return false;
-    }
-
-    try {
-      const resolvedFrontmatter = await resolveAgentFrontmatter(
-        resolutionContext.runtime,
-        resolutionContext.workspacePath,
-        args.agentId
-      );
-      const cfg = this.config.loadConfigOrDefault();
-      const effectivelyDisabled = isAgentEffectivelyDisabled({
-        cfg,
-        agentId: args.agentId,
-        resolvedFrontmatter,
-      });
-      return !effectivelyDisabled;
-    } catch (error: unknown) {
-      log.warn("Failed to resolve task handoff target agent availability", {
-        workspaceId: args.workspaceId,
-        agentId: args.agentId,
-        error: getErrorMessage(error),
-      });
-      return false;
-    }
-  }
-
-  private async canAgentSpawnTasksInWorkspace(args: {
-    workspaceId: string;
-    projectPath: string;
-    workspace: Pick;
-    agentId: "orchestrator";
-  }): Promise {
-    assert(
-      args.workspaceId.length > 0,
-      "canAgentSpawnTasksInWorkspace: workspaceId must be non-empty"
-    );
-    assert(
-      args.projectPath.length > 0,
-      "canAgentSpawnTasksInWorkspace: projectPath must be non-empty"
-    );
-
-    const resolutionContext = this.getTaskWorkspaceAgentResolutionContext({
-      projectPath: args.projectPath,
-      workspace: args.workspace,
-    });
-    if (!resolutionContext) {
-      return false;
-    }
-
-    try {
-      const cfg = this.config.loadConfigOrDefault();
-      const resolvedFrontmatter = await resolveAgentFrontmatter(
-        resolutionContext.runtime,
-        resolutionContext.workspacePath,
-        args.agentId
-      );
-      const effectivelyDisabled = isAgentEffectivelyDisabled({
-        cfg,
-        agentId: args.agentId,
-        resolvedFrontmatter,
-      });
-      if (effectivelyDisabled) {
-        return false;
-      }
-
-      const agentDefinition = await readAgentDefinition(
-        resolutionContext.runtime,
-        resolutionContext.workspacePath,
-        args.agentId
-      );
-      const chain = await resolveAgentInheritanceChain({
-        runtime: resolutionContext.runtime,
-        workspacePath: resolutionContext.workspacePath,
-        agentId: agentDefinition.id,
-        agentDefinition,
-        workspaceId: args.workspaceId,
-      });
-      const taskSettings = cfg.taskSettings ?? DEFAULT_TASK_SETTINGS;
-      const taskDepth = this.getTaskDepth(cfg, args.workspaceId);
-      const disableTaskToolsForDepth = taskDepth >= taskSettings.maxTaskNestingDepth;
-
-      return !disableTaskToolsForDepth && isToolEnabledInResolvedChain("task", chain);
-    } catch (error: unknown) {
-      log.warn("Failed to resolve task handoff target task-spawning capability", {
-        workspaceId: args.workspaceId,
-        agentId: args.agentId,
-        error: getErrorMessage(error),
-      });
-      return false;
-    }
-  }
-
-  private async resolvePlanAutoHandoffTargetAgentId(args: {
-    workspaceId: string;
-    entry: {
-      projectPath: string;
-      workspace: Pick<
-        WorkspaceConfigEntry,
-        "id" | "name" | "path" | "runtimeConfig" | "taskModelString"
-      >;
-    };
-    routing: PlanSubagentExecutorRouting;
-    planContent: string | null;
-  }): Promise<"exec" | "orchestrator"> {
-    assert(
-      args.workspaceId.length > 0,
-      "resolvePlanAutoHandoffTargetAgentId: workspaceId must be non-empty"
-    );
-    assert(
-      args.routing === "exec" || args.routing === "orchestrator" || args.routing === "auto",
-      "resolvePlanAutoHandoffTargetAgentId: routing must be exec, orchestrator, or auto"
-    );
-
-    const resolveOrchestratorAvailability = async (): Promise<"exec" | "orchestrator"> => {
-      const orchestratorEnabled = await this.isAgentEnabledForTaskWorkspace({
-        workspaceId: args.workspaceId,
-        projectPath: args.entry.projectPath,
-        workspace: args.entry.workspace,
-        agentId: "orchestrator",
-      });
-      if (orchestratorEnabled) {
-        return "orchestrator";
-      }
-
-      // If orchestrator is disabled/unavailable, fall back to exec before mutating
-      // workspace agent state so the handoff stream can still proceed.
-      log.warn("Plan-task auto-handoff falling back to exec because orchestrator is unavailable", {
-        workspaceId: args.workspaceId,
-      });
-      return "exec";
-    };
-
-    if (args.routing === "exec") {
-      return "exec";
-    }
-
-    if (args.routing === "orchestrator") {
-      return resolveOrchestratorAvailability();
-    }
-
-    if (!args.planContent || args.planContent.trim().length === 0) {
-      log.warn("Plan-task auto-handoff auto-routing has no plan content; defaulting to exec", {
-        workspaceId: args.workspaceId,
-      });
-      return "exec";
-    }
-
-    const orchestratorCanSpawnTasks = await this.canAgentSpawnTasksInWorkspace({
-      workspaceId: args.workspaceId,
-      projectPath: args.entry.projectPath,
-      workspace: args.entry.workspace,
-      agentId: "orchestrator",
-    });
-    if (!orchestratorCanSpawnTasks) {
-      log.warn(
-        "Plan-task auto-handoff auto-routing defaulting to exec because orchestrator cannot orchestrate in this workspace",
-        {
-          workspaceId: args.workspaceId,
-        }
-      );
-      return "exec";
-    }
-
-    const modelString = normalizeToCanonical(
-      coerceNonEmptyString(args.entry.workspace.taskModelString) ?? defaultModel
-    );
-    assert(
-      modelString.trim().length > 0,
-      "resolvePlanAutoHandoffTargetAgentId: modelString must be non-empty"
-    );
-
-    const modelResult = await this.aiService.createModel(modelString, undefined, {
-      agentInitiated: true,
-      workspaceId: args.workspaceId,
-    });
-    if (!modelResult.success) {
-      log.warn("Plan-task auto-handoff auto-routing failed to create model; defaulting to exec", {
-        workspaceId: args.workspaceId,
-        model: modelString,
-        error: modelResult.error,
-      });
-      return "exec";
-    }
-
-    const decision = await routePlanToExecutor({
-      model: modelResult.data,
-      planContent: args.planContent,
-    });
-
-    log.info("Plan-task auto-handoff routing decision", {
-      workspaceId: args.workspaceId,
-      target: decision.target,
-      reasoning: decision.reasoning,
-      model: modelString,
-    });
-
-    if (decision.target === "orchestrator") {
-      return resolveOrchestratorAvailability();
-    }
-
-    return "exec";
-  }
-
   private async emitWorkspaceMetadata(workspaceId: string): Promise {
     assert(workspaceId.length > 0, "emitWorkspaceMetadata: workspaceId must be non-empty");
 
@@ -3463,8 +3196,6 @@ export class TaskService {
         workspaceId,
         entry,
         proposePlanResult,
-        planSubagentExecutorRouting:
-          (cfg.taskSettings ?? DEFAULT_TASK_SETTINGS).planSubagentExecutorRouting ?? "exec",
       });
       return;
     }
@@ -3570,7 +3301,6 @@ export class TaskService {
     workspaceId: string;
     entry: { projectPath: string; workspace: WorkspaceConfigEntry };
     proposePlanResult: { planPath: string };
-    planSubagentExecutorRouting: PlanSubagentExecutorRouting;
   }): Promise {
     assert(
       args.workspaceId.length > 0,
@@ -3622,41 +3352,7 @@ export class TaskService {
       });
     }
 
-    const targetAgentId = await (async () => {
-      const shouldShowRoutingStatus = args.planSubagentExecutorRouting === "auto";
-      if (shouldShowRoutingStatus) {
-        // Auto routing can pause for up to the LLM timeout; surface progress in the sidebar.
-        await this.workspaceService.updateAgentStatus(args.workspaceId, {
-          emoji: PLAN_AUTO_ROUTING_STATUS_EMOJI,
-          message: PLAN_AUTO_ROUTING_STATUS_MESSAGE,
-          // ExtensionMetadataService carries forward the previous status URL when url is omitted.
-          // Use an explicit empty string sentinel to clear stale links for this transient status.
-          url: "",
-        });
-      }
-
-      try {
-        return await this.resolvePlanAutoHandoffTargetAgentId({
-          workspaceId: args.workspaceId,
-          entry: {
-            projectPath: args.entry.projectPath,
-            workspace: {
-              id: args.entry.workspace.id,
-              name: args.entry.workspace.name,
-              path: args.entry.workspace.path,
-              runtimeConfig: args.entry.workspace.runtimeConfig,
-              taskModelString: args.entry.workspace.taskModelString,
-            },
-          },
-          routing: args.planSubagentExecutorRouting,
-          planContent: planSummary?.content ?? null,
-        });
-      } finally {
-        if (shouldShowRoutingStatus) {
-          await this.workspaceService.updateAgentStatus(args.workspaceId, null);
-        }
-      }
-    })();
+    const targetAgentId = "exec" as const;
 
     const summaryContent = planSummary
       ? `# Plan\n\n${planSummary.content}\n\nNote: This chat already contains the full plan; no need to re-open the plan file.\n\n---\n\n*Plan file preserved at:* \`${planSummary.path}\``
@@ -3711,14 +3407,10 @@ export class TaskService {
 
     await this.setTaskStatus(args.workspaceId, "running");
 
-    const kickoffMsg =
-      targetAgentId === "orchestrator"
-        ? "Start orchestrating the implementation of this plan."
-        : "Implement the plan.";
     try {
       const sendKickoffResult = await this.workspaceService.sendMessage(
         args.workspaceId,
-        kickoffMsg,
+        "Implement the plan.",
         {
           model: taskModelString,
           agentId: targetAgentId,
diff --git a/tests/ipc/agents/listOrchestratorAlwaysSelectable.test.ts b/tests/ipc/agents/listOrchestratorAlwaysSelectable.test.ts
deleted file mode 100644
index 338cc3f0d6..0000000000
--- a/tests/ipc/agents/listOrchestratorAlwaysSelectable.test.ts
+++ /dev/null
@@ -1,101 +0,0 @@
-/**
- * Tests that Orchestrator stays selectable in the agent picker regardless of plan file state.
- */
-
-import * as fs from "fs/promises";
-import * as os from "os";
-import * as path from "path";
-
-import type { TestEnvironment } from "../setup";
-import { cleanupTestEnvironment, createTestEnvironment } from "../setup";
-import {
-  cleanupTempGitRepo,
-  createTempGitRepo,
-  generateBranchName,
-  trustProject,
-} from "../helpers";
-import { detectDefaultTrunkBranch } from "../../../src/node/git";
-import { getPlanFilePath } from "../../../src/common/utils/planStorage";
-import { expandTilde } from "../../../src/node/runtime/tildeExpansion";
-
-describe("agents.list orchestrator availability", () => {
-  let env: TestEnvironment;
-  let repoPath: string;
-
-  let homeDir: string;
-  let prevHome: string | undefined;
-
-  beforeAll(async () => {
-    // Isolate plan file reads/writes under a temp HOME so tests don't touch ~/.mux.
-    prevHome = process.env.HOME;
-    homeDir = await fs.mkdtemp(path.join(os.tmpdir(), "mux-home-"));
-    process.env.HOME = homeDir;
-
-    env = await createTestEnvironment();
-    repoPath = await createTempGitRepo();
-    await trustProject(env, repoPath);
-  }, 30_000);
-
-  afterAll(async () => {
-    if (repoPath) {
-      await cleanupTempGitRepo(repoPath);
-    }
-    if (env) {
-      await cleanupTestEnvironment(env);
-    }
-
-    if (homeDir) {
-      await fs.rm(homeDir, { recursive: true, force: true });
-    }
-
-    if (prevHome === undefined) {
-      delete process.env.HOME;
-    } else {
-      process.env.HOME = prevHome;
-    }
-  });
-
-  it("keeps orchestrator uiSelectable with no plan, an empty plan, and a non-empty plan", async () => {
-    const branchName = generateBranchName("agents-orchestrator-selectable");
-    const trunkBranch = await detectDefaultTrunkBranch(repoPath);
-
-    const createResult = await env.orpc.workspace.create({
-      projectPath: repoPath,
-      branchName,
-      trunkBranch,
-    });
-
-    expect(createResult.success).toBe(true);
-    if (!createResult.success) {
-      throw new Error("Failed to create workspace");
-    }
-
-    const workspaceId = createResult.metadata.id;
-    const workspaceName = createResult.metadata.name;
-    const projectName = createResult.metadata.projectName;
-
-    const planPath = expandTilde(getPlanFilePath(workspaceName, projectName));
-
-    async function expectOrchestratorSelectable(): Promise {
-      const agents = await env.orpc.agents.list({ workspaceId });
-      const orchestrator = agents.find((agent) => agent.id === "orchestrator");
-      expect(orchestrator).toBeTruthy();
-      expect(orchestrator?.uiSelectable).toBe(true);
-    }
-
-    try {
-      await fs.rm(planPath, { force: true });
-      await expectOrchestratorSelectable();
-
-      await fs.mkdir(path.dirname(planPath), { recursive: true });
-      await fs.writeFile(planPath, "");
-      await expectOrchestratorSelectable();
-
-      await fs.writeFile(planPath, "# Plan\n");
-      await expectOrchestratorSelectable();
-    } finally {
-      await fs.rm(planPath, { force: true });
-      await env.orpc.workspace.remove({ workspaceId });
-    }
-  }, 30_000);
-});
diff --git a/tests/ui/agents/picker.test.ts b/tests/ui/agents/picker.test.ts
index 9262a4dcfd..8e0a78431d 100644
--- a/tests/ui/agents/picker.test.ts
+++ b/tests/ui/agents/picker.test.ts
@@ -150,24 +150,6 @@ describeIntegration("Agent Picker (UI)", () => {
     });
   }, 30_000);
 
-  test("orchestrator appears in dropdown without a plan file", async () => {
-    await withSharedWorkspace("anthropic", async ({ env, workspaceId, metadata }) => {
-      const cleanupDom = setupTestDom();
-      const view = renderApp({ apiClient: env.orpc, metadata });
-
-      try {
-        await setupWorkspaceView(view, metadata, workspaceId);
-        await openAgentPicker(view.container);
-
-        const agentNames = getVisibleAgentNames(view.container);
-        expect(agentNames).toContain("Orchestrator");
-        expect(getAgentIdByName(view.container, "Orchestrator")).toBe("orchestrator");
-      } finally {
-        await cleanupView(view, cleanupDom);
-      }
-    });
-  }, 30_000);
-
   test("custom workspace agents appear alongside built-ins", async () => {
     await withSharedWorkspace("anthropic", async ({ env, workspaceId, metadata }) => {
       // With workspaceId provided, agents are discovered from workspace worktree path.