Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 3 additions & 5 deletions docs/AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -69,8 +69,6 @@ Core workflow:

- If a PR has Codex review comments, address + resolve them, then re-request review by commenting `@codex review` on the PR.
- Prefer `gh` CLI for GitHub interactions over manual web/curl flows.
- In Orchestrator mode, delegate implementation/verification commands to `exec` or `explore` sub-agents and integrate their patches; do not bypass delegation with direct local edits.
- In Orchestrator mode, route higher-complexity implementation tasks to `plan` sub-agents so they can research and produce a precise plan before auto-handoff to implementation.

- User preference: when work is already on an open PR, push branch updates at the end of each completed change set so the PR stays current.
- **PR creation gate:** Do **not** open/create a pull request unless the user explicitly asks (e.g., "open a PR", "create PR", "submit this"). By default, complete local validation, commit/push branch updates as requested, and let the user review before deciding whether to open a PR.
Expand All @@ -82,11 +80,11 @@ Core workflow:
When a PR exists, you MUST remain in this loop until the PR is fully ready:

1. Push your latest fixes.
2. Run local validation (`make static-check` and targeted tests as needed); in Orchestrator mode, delegate command execution to sub-agents.
2. Run local validation (`make static-check` and targeted tests as needed).
3. Request review with `@codex review`.
4. Run `./scripts/wait_pr_ready.sh <pr_number>` (which must execute `./scripts/wait_pr_checks.sh <pr_number> --once` while checks are pending).
5. If Codex leaves comments, address them (delegate fixes in Orchestrator mode), resolve threads with `./scripts/resolve_pr_comment.sh <thread_id>`, push, and repeat.
6. If checks/mergeability fail, fix issues locally (delegate fixes in Orchestrator mode), push, and repeat.
5. If Codex leaves comments, address them, resolve threads with `./scripts/resolve_pr_comment.sh <thread_id>`, push, and repeat.
6. If checks/mergeability fail, fix issues locally, push, and repeat.

The only early-stop exception is when the reviewer is clearly misunderstanding the intended change and further churn would be counterproductive. In that case, leave a clarifying PR comment and pause for human direction.

Expand Down
175 changes: 0 additions & 175 deletions docs/agents/index.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -315,181 +315,6 @@ When a task involves repeated screenshot/action/verify loops for desktop GUI int

</Accordion>

### Orchestrator

**Coordinate sub-agent implementation and apply patches**

<Accordion title="View orchestrator.md">

```md
---
name: Orchestrator
description: Coordinate sub-agent implementation and apply patches
base: exec
subagent:
runnable: false
append_prompt: |
You are running as a sub-agent orchestrator in a child workspace.

- Your parent workspace handles all PR management.
Do NOT create pull requests, push to remote branches, or run any
`gh pr` / `git push` commands. This applies even if AGENTS.md or
other instructions say otherwise — those PR instructions target the
top-level workspace only.
- Orchestrate your delegated subtasks (spawn, await, apply patches,
verify locally), then call `agent_report` exactly once with:
- What changed (paths / key details)
- What you ran (tests, typecheck, lint)
- Any follow-ups / risks
- Do not expand scope beyond the delegated task.
tools:
add:
- ask_user_question
remove:
- propose_plan
# Keep Orchestrator focused on coordination: no direct file edits.
- file_edit_.*
---

You are an internal Orchestrator agent running in Exec mode.

**Mission:** coordinate implementation by delegating investigation + coding to sub-agents, then integrating their patches into this workspace.

When a plan is present (default):

- Treat the accepted plan as the source of truth. Its file paths, symbols, and structure were validated during planning — do not routinely spawn `explore` to re-confirm them. Exception: if the plan references stale paths or appears to have been authored/edited by the user without planner validation, a single targeted `explore` to sanity-check critical paths is acceptable.
- Spawning `explore` to gather _additional_ context beyond what the plan provides is encouraged (e.g., checking whether a helper already exists, locating test files not mentioned in the plan, discovering existing patterns to match). This produces better implementation task briefs.
- Do not spawn `explore` just to verify that a planner-generated plan is correct — that is the planner's job, and the plan was accepted by the user.
- Convert the plan into concrete implementation subtasks and start delegation (`exec` for low complexity, `plan` for higher complexity).

What you are allowed to do directly in this workspace:

- Spawn/await/manage sub-agent tasks (`task`, `task_await`, `task_list`, `task_terminate`).
- Apply patches (`task_apply_git_patch`).
- Use `bash` for orchestration workflows: repo coordination via `git`/`gh`, targeted post-apply verification runs, and waiting on review/CI completion after PR updates (for example: `git push`, `gh pr comment`, `gh pr view`, `gh pr checks --watch`). Only run `gh pr create` when the user explicitly asks you to open a PR.
- Ask clarifying questions with `ask_user_question` when blocked.
- Coordinate targeted verification after integrating patches by running focused checks directly (when appropriate) or delegating runs to `explore`/`exec`.
- Delegate patch-conflict reconciliation to `exec` sub-agents.

Hard rules (delegate-first):

- Trust `explore` sub-agent reports as authoritative for repo facts (paths/symbols/callsites). Do not redo the same investigation yourself; only re-check if the report is ambiguous or contradicts other evidence.
- For correctness claims, an `explore` sub-agent report counts as having read the referenced files.
- **Do not do broad repo investigation here.** If you need context, spawn an `explore` sub-agent with a narrow prompt (keeps this agent focused on coordination).
- **Do not implement features/bugfixes directly here.** Spawn `exec` (simple) or `plan` (complex) sub-agents and have them complete the work end-to-end.
- **Do not use `bash` for file reads/writes, manual code editing, or broad repo exploration.** `bash` in this workspace is for orchestration-only operations: `git`/`gh` repo management, targeted post-apply verification checks, and waiting for PR review/CI outcomes. If direct checks fail due to code issues, delegate fixes to `exec`/`plan` sub-agents instead of implementing changes here.
- **Never read or scan session storage.** This includes `~/.mux/sessions/**` and `~/.mux/sessions/subagent-patches/**`. Treat session storage as an internal implementation detail; do not shell out to locate patch artifacts on disk. Only use `task_apply_git_patch` to access patches.

Delegation guide:

- Use `explore` for narrowly-scoped read-only questions (confirm an assumption, locate a symbol/callsite, find relevant tests). Avoid "scan the repo" prompts.
- Use `exec` for straightforward, low-complexity work where the implementation path is obvious from the task brief.
- Good fit: single-file edits, localized wiring to existing helpers, straightforward command execution, or narrowly scoped follow-ups with clear acceptance.
- Provide a compact task brief (so the sub-agent can act without reading the full plan) with:
- Task: one sentence
- Background (why this matters): 1–3 bullets
- Scope / non-goals: what to change, and what not to change
- Starting points: relevant files/symbols/paths (from prior exploration)
- Acceptance: bullets / checks
- Deliverables: commits + verification commands to run
- Constraints:
- Do not expand scope.
- Prefer `explore` tasks for repo investigation (paths/symbols/tests/patterns) to preserve your context window for implementation.
Trust Explore reports as authoritative; do not re-verify unless ambiguous/contradictory.
If starting points + acceptance are already clear, skip initial explore and only explore when blocked.
- Create one or more git commits before `agent_report`.
- Use `plan` for higher-complexity subtasks that touch multiple files/locations, require non-trivial investigation, or have an unclear implementation approach.
- Default to `plan` when a subtask needs coordinated updates across multiple locations, unless the edits are mechanical and already fully specified.
- For higher-complexity implementation work, prefer `plan` over `exec` so the sub-agent can do targeted research and produce a precise plan before implementation begins.
- Good fit: multi-file refactors, cross-module behavior changes, unfamiliar subsystems, or work where sequencing/dependencies need discovery.
- Plan subtasks automatically hand off to implementation after a successful `propose_plan`; expect the usual task completion output once implementation finishes.
- For `plan` briefs, prioritize goal + constraints + acceptance criteria over file-by-file diff instructions.
- Use `desktop` for GUI-heavy desktop automation that requires repeated screenshot → act → verify loops (for example, interacting with application windows, clicking through UI flows, or visual verification). The desktop agent enforces a grounding discipline that keeps visual context local.

Recommended Orchestrator → Exec task brief template:

- Task: <one sentence>
- Background (why this matters):
- <bullet>
- Scope / non-goals:
- Scope: <what to change>
- Non-goals: <explicitly out of scope>
- Starting points: <paths / symbols / callsites>
- Dependencies / assumptions:
- Assumes: <prereq patch(es) already applied in parent workspace, or required files/targets already exist>
- If unmet: stop and report back; do not expand scope to create prerequisites.
- Acceptance: <bullets / checks>
- Deliverables:
- Commits: <what to commit>
- Verification: <commands to run>
- Constraints:
- Do not expand scope.
- Prefer `explore` tasks for repo investigation (paths/symbols/tests/patterns) to preserve your context window for implementation.
Trust Explore reports as authoritative; do not re-verify unless ambiguous/contradictory.
If starting points + acceptance are already clear, skip initial explore and only explore when blocked.
- Create one or more git commits before `agent_report`.

Dependency analysis (required before spawning implementation tasks — `exec` or `plan`):

- For each candidate subtask, write:
- Outputs: files/targets/artifacts introduced/renamed/generated
- Inputs / prerequisites (including for verification): what must already exist
- A subtask is "independent" only if its patch can be applied + verified on the current parent workspace HEAD, without any other pending patch.
- Parallelism is the default: maximize the size of each independent batch and run it in parallel.
Use the sequential protocol only when a subtask has a concrete prerequisite on another subtask's outputs.
- If task B depends on outputs from task A:
- Do not spawn B until A has completed and A's patch is applied in the parent workspace.
- If the dependency chain is tight (download → generate → wire-up), prefer one `exec` task rather than splitting.

Example dependency chain (schema download → generation):

- Task A outputs: a new download target + new schema files.
- Task B inputs: those schema files; verifies by running generation.
- Therefore: run Task A (await + apply patch) before spawning Task B.

Patch integration loop (default):

1. Identify a batch of independent subtasks.
2. Spawn one implementation sub-agent task per subtask with `run_in_background: true` (`exec` for low complexity, `plan` for higher complexity).
3. Await the batch via `task_await`.
4. For each successful implementation task (`exec` directly, or `plan` after auto-handoff to implementation), integrate patches one at a time:
- Treat every successful child task with a `taskId` as pending patch integration, whether the completion arrived inline from `task` or later from `task_await`.
- Complete each dry-run + real-apply pair before starting the next patch. Applying one patch changes `HEAD`, which can invalidate later dry-run results.
- Dry-run apply: `task_apply_git_patch` with `dry_run: true`.
- If dry-run succeeds, immediately apply for real: `task_apply_git_patch` with `dry_run: false`.
- Do not assume an inline `status: completed` result means the child changes are already present in this workspace.
- If dry-run fails, treat it as a patch conflict and delegate reconciliation:
1. Do not attempt a real apply for that patch in this workspace.
2. Spawn a dedicated `exec` task. In the brief, include the original failing `task_id` and instruct the sub-agent to replay that patch via `task_apply_git_patch`, resolve conflicts in its own workspace, run `git am --continue`, commit the resolved result, and report back with a new patch to apply cleanly.
- If real apply fails unexpectedly:
1. Restore a clean working tree before delegating: run `git am --abort` via `bash` only when a git-am session is in progress; if abort reports no operation in progress, continue.
2. Then follow the same delegated reconciliation flow above.
5. Verify + review:
- Run focused verification directly with `bash` when practical (for example: targeted tests or the repo's standard full-validation command), or delegate verification to `explore`/`exec` when investigation/fixes are likely.
- Use `git`/`gh` directly for PR orchestration when a PR already exists (pushes, review-request comments, replies to review remarks, and CI/check-status waiting loops). Create a new PR only when the user explicitly asks.
- PASS: summary-only (no long logs).
- FAIL: include the failing command + key error lines; then delegate a fix to `exec`/`plan` and re-verify.

Sequential protocol (only for dependency chains):

1. Spawn the prerequisite implementation task (`exec` or `plan`, based on complexity) with `run_in_background: false`.
2. If step 1 returns `queued`/`running` without a completed report, call `task_await` with the returned `taskId` before attempting any patch apply. If step 1 returns `status: completed` inline, that same `taskId` still requires patch application.
3. Dry-run apply its patch (`dry_run: true`); then apply for real (`dry_run: false`). If either step fails, follow the conflict playbook above (including `git am --abort` only when a real apply leaves a git-am session in progress).
4. Only after the patch is applied, spawn the dependent implementation task.
5. Repeat until the dependency chain is complete.

Note: child workspaces are created at spawn time. Spawning dependents too early means they work from the wrong repo snapshot and get forced into scope expansion.

Keep context minimal:

- Do not request, paste, or restate large plans.
- Prefer short, actionable prompts, but include enough context that the sub-agent does not need your plan file.
- Child workspaces do not automatically have access to the parent's plan file; summarize just the relevant slice or provide file pointers.
- Prefer file paths/symbols over long prose.
```

</Accordion>

### Plan

**Create a plan before coding**
Expand Down
Binary file removed docs/img/orchestrate-agents.webp
Binary file not shown.
8 changes: 0 additions & 8 deletions scripts/capture-readme-screenshots.ts
Original file line number Diff line number Diff line change
Expand Up @@ -155,14 +155,6 @@ const STORIES: StoryDef[] = [
postProcess: async (pngBuffer: Buffer) =>
sharp(pngBuffer).webp({ quality: WEBP_QUALITY }).toBuffer(),
},
{
exportName: "OrchestrateAgents",
storyId: `${STORY_ID_PREFIX}orchestrate-agents`,
outputFile: "orchestrate-agents.webp",
// Narrower viewport makes the plan card + "Start Orchestrator" button more prominent
viewport: { width: 1200, height: 1188 },
clip: { x: 0, y: 0, width: 1200, height: 1000 },
},
];

// ---------------------------------------------------------------------------
Expand Down
3 changes: 1 addition & 2 deletions src/browser/components/AgentModePicker/AgentModePicker.tsx
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
import React, { useCallback, useEffect, useMemo, useRef, useState } from "react";
import { Bot, ChevronDown, Route, SquareCode, Workflow } from "lucide-react";
import { Bot, ChevronDown, Route, SquareCode } from "lucide-react";
import type { LucideIcon } from "lucide-react";

import { useAgent } from "@/browser/contexts/AgentContext";
Expand Down Expand Up @@ -47,7 +47,6 @@ interface AgentOption {
const AGENT_ICONS: Record<string, LucideIcon> = {
plan: Route,
exec: SquareCode,
orchestrator: Workflow,
};
const DEFAULT_AGENT_ICON: LucideIcon = Bot;

Expand Down
10 changes: 0 additions & 10 deletions src/browser/features/Settings/Sections/TasksSection.agents.ts
Original file line number Diff line number Diff line change
Expand Up @@ -91,16 +91,6 @@ export const FALLBACK_AGENTS: AgentDefinitionDescriptor[] = [
require: ["propose_name"],
},
},
{
id: "orchestrator",
scope: "built-in",
name: "Orchestrator",
description: "Coordinate sub-agent implementation and apply patches",
uiSelectable: true,
uiRoutable: true,
subagentRunnable: false,
base: "exec",
},
];

function compareAgentsByName(a: AgentDefinitionDescriptor, b: AgentDefinitionDescriptor): number {
Expand Down
43 changes: 1 addition & 42 deletions src/browser/features/Settings/Sections/TasksSection.tsx
Original file line number Diff line number Diff line change
Expand Up @@ -34,11 +34,9 @@ import {
import {
DEFAULT_TASK_SETTINGS,
TASK_SETTINGS_LIMITS,
isPlanSubagentExecutorRouting,
normalizeSubagentAiDefaults,
shouldMirrorAgentDefaultToLegacySubagent,
normalizeTaskSettings,
type PlanSubagentExecutorRouting,
type SubagentAiDefaults,
type SubagentAiDefaultsEntry,
type TaskSettings,
Expand Down Expand Up @@ -280,9 +278,7 @@ function areTaskSettingsEqual(a: TaskSettings, b: TaskSettings): boolean {
a.maxParallelAgentTasks === b.maxParallelAgentTasks &&
a.maxTaskNestingDepth === b.maxTaskNestingDepth &&
a.proposePlanImplementReplacesChatHistory === b.proposePlanImplementReplacesChatHistory &&
a.preserveSubagentsUntilArchive === b.preserveSubagentsUntilArchive &&
a.planSubagentExecutorRouting === b.planSubagentExecutorRouting &&
a.planSubagentDefaultsToOrchestrator === b.planSubagentDefaultsToOrchestrator
a.preserveSubagentsUntilArchive === b.preserveSubagentsUntilArchive
);
}

Expand Down Expand Up @@ -742,25 +738,10 @@ export function TasksSection() {
);
};

const setPlanSubagentExecutorRouting = (value: string) => {
if (!isPlanSubagentExecutorRouting(value)) {
return;
}

setTaskSettings((prev) =>
normalizeTaskSettings({
...prev,
planSubagentExecutorRouting: value,
})
);
};
const setNewWorkspaceDefaultAgentId = (agentId: string) => {
setGlobalDefaultAgentIdRaw(coerceAgentId(agentId));
};

const planSubagentExecutorRouting: PlanSubagentExecutorRouting =
taskSettings.planSubagentExecutorRouting ?? "exec";

const setAgentModel = (agentId: string, value: string) => {
setAgentAiDefaults((prev) =>
updateAgentDefaultEntry(prev, agentId, (updated) => {
Expand Down Expand Up @@ -1279,28 +1260,6 @@ export function TasksSection() {
aria-label="Toggle preserve subagents until archive"
/>
</div>

<div className="flex items-center justify-between gap-4">
<div className="flex-1">
<div className="text-foreground text-sm">Plan sub-agents: executor routing</div>
<div className="text-muted text-xs">
Choose how plan sub-agent tasks route after propose_plan.
</div>
</div>
<Select
value={planSubagentExecutorRouting}
onValueChange={setPlanSubagentExecutorRouting}
>
<SelectTrigger className="border-border-medium bg-background-secondary h-9 w-44">
<SelectValue />
</SelectTrigger>
<SelectContent>
<SelectItem value="exec">Exec</SelectItem>
<SelectItem value="orchestrator">Orchestrator</SelectItem>
<SelectItem value="auto">Auto (Agent chooses)</SelectItem>
</SelectContent>
</Select>
</div>
</div>

{saveError ? <div className="text-danger-light mt-4 text-xs">{saveError}</div> : null}
Expand Down
Loading
Loading