diff --git a/.agents/skills/writing-agent-relay-workflows/SKILL.md b/.agents/skills/writing-agent-relay-workflows/SKILL.md index 7787cfc67..5c89d7f34 100644 --- a/.agents/skills/writing-agent-relay-workflows/SKILL.md +++ b/.agents/skills/writing-agent-relay-workflows/SKILL.md @@ -1,6 +1,6 @@ --- name: writing-agent-relay-workflows -description: Use when building multi-agent workflows with the relay broker-sdk - covers the WorkflowBuilder API, DAG step dependencies, agent definitions, step output chaining via {{steps.X.output}}, verification gates, evidence-based completion, owner decisions, dedicated channels, dynamic channel management (subscribe/unsubscribe/mute/unmute), swarm patterns, error handling, event listeners, step sizing rules, authoring best practices, and the lead+workers team pattern for complex steps +description: Use when building multi-agent workflows with the relay broker-sdk - covers conversation-shape vs pipeline-shape coordination, the WorkflowBuilder API, DAG step dependencies, agent definitions, step output chaining via {{steps.X.output}}, verification gates, evidence-based completion, owner decisions, dedicated channels, dynamic channel management (subscribe/unsubscribe/mute/unmute), swarm patterns, chat-native coordination recipes (Q/A, broadcast-ack, peer review, standup, hand-off), error handling, event listeners, step sizing rules, authoring best practices, and the lead+workers team pattern for complex steps --- ### Overview @@ -19,9 +19,24 @@ The relay broker-sdk workflow system orchestrates multiple AI agents (Claude, Co - Needing verification gates, retries, or step output chaining - Dynamic channel management: agents joining/leaving/muting channels mid-workflow -### Quick Reference +### Choose Your Coordination Style — Conversation vs Pipeline -#### > **Note:** this Quick Reference assumes an **ESM** workflow file (the host `package.json` has `"type": "module"`). 
For CJS repos, see rule #1 in **Critical TypeScript rules** below: convert `import { workflow } from '@agent-relay/sdk/workflows'` to `const { workflow } = require('@agent-relay/sdk/workflows')` and wrap the workflow in `async function main() { ... } main().catch(console.error)`, since CJS does not support top-level `await`. **Always check `package.json` before copy-pasting the snippet.**
+Before writing the workflow, decide _how the agents will coordinate_. The relay primitive supports two very different shapes, and picking the wrong one wastes the most valuable thing the SDK gives you.
+
+| Shape | What it is | Use when |
+| --- | --- | --- |
+| **Conversation** (chat-native) | Interactive agents share a channel; messages, `@-mentions`, and ambient awareness drive coordination. Lead and workers spawn in parallel and self-organize. The relay is the coordination layer, not just transport. | Multi-file work, peer review loops, cross-agent feedback, dynamic re-planning, multi-PR coordination, anything with a human-in-the-loop escape, swarms where workers pick up each other's output. |
+| **Pipeline** (one-shot DAG) | Each step runs as a one-shot subprocess (`claude -p`, `codex exec`); steps hand off via `{{steps.X.output}}` text injection. No agents are alive at the same time; no chat happens. | Linear, well-specified transformations; deterministic data passing; no review loop expected; the work could be expressed as a `bash \| bash \| bash` pipe. |
+
+**Default to Conversation for any non-trivial work.** Pipeline DAGs are simpler to reason about, but they do not exercise the relay primitive; they are a Unix pipe with extra steps. If you would happily write the same task as a single shell pipeline, pipeline-shape is fine. Otherwise, you almost certainly want a Conversation shape.
+
+The two shapes can mix within one workflow: pipeline-style deterministic preflight → conversation in the middle → pipeline-style commit-and-PR at the end. See **Quick Reference (Conversation shape)** below and **[Common Patterns → Interactive Team](#interactive-team-lead--workers-on-shared-channel)** for the canonical recipe.
+
+> **A blunt rule of thumb:** if your workflow only uses `agent` steps with `preset: 'worker'` chained by `{{steps.X.output}}`, you are not using the relay: you are using `claude -p | codex exec`. That may still be the right answer; just make it a deliberate choice.
+
+### Quick Reference (Pipeline shape)
+
+#### > Use this when steps are linear, well-specified, and need no agent-to-agent feedback. For anything with iteration, review, or coordination, jump to **Quick Reference (Conversation shape)** below.
 ```typescript
 import { workflow } from '@agent-relay/sdk/workflows';
@@ -55,6 +70,82 @@ const result = await workflow('my-workflow')
 console.log('Result:', result.status);
 ```
+### Quick Reference (Conversation shape)
+
+#### > Use this for any non-trivial work: peer review, multi-file edits, cross-agent feedback, dynamic re-planning. Lead and workers spawn **in parallel** on a shared channel and self-organize via messages. The relay primitive does the coordinating; verification gates downstream of the lead close the workflow.
+ +```typescript +import { workflow } from '@agent-relay/sdk/workflows'; +import { ClaudeModels, CodexModels } from '@agent-relay/config'; + +const result = await workflow('my-workflow') + .description('Multi-file change with peer review') + .pattern('dag') + .channel('wf-my-feature') // dedicated channel — agents share it + .maxConcurrency(4) + .timeout(3_600_000) + + // Interactive agents — no preset, they live on the channel + .agent('lead', { + cli: 'claude', + model: ClaudeModels.OPUS, + role: 'Architect + reviewer. Plans, assigns, reviews, posts feedback.', + retries: 1, + }) + .agent('impl-a', { + cli: 'codex', + model: CodexModels.GPT_5_4, + role: 'Implementer. Listens on channel for assignments and feedback.', + retries: 2, + }) + .agent('impl-b', { + cli: 'codex', + model: CodexModels.GPT_5_4, + role: 'Implementer. Listens on channel for assignments and feedback.', + retries: 2, + }) + + // Deterministic context — pre-reads files once, posts to the channel for everyone + .step('context', { + type: 'deterministic', + command: 'git ls-files src/', + captureOutput: true, + }) + + // Lead and workers all depend on `context` — they start CONCURRENTLY. + // They coordinate over #wf-my-feature, not via {{steps.X.output}}. + .step('lead-coordinate', { + agent: 'lead', + dependsOn: ['context'], + task: `You are the lead on #wf-my-feature. Workers: impl-a, impl-b. +Post the plan. Assign files. Review their PRs/diffs. Post feedback in-channel. +Workers iterate based on your feedback. Exit when both files pass review.`, + }) + .step('impl-a-work', { + agent: 'impl-a', + dependsOn: ['context'], // SAME dep as lead → starts in parallel, no deadlock + task: `You are impl-a on #wf-my-feature. Wait for the lead's plan. +Implement your assigned file. Post a completion message. Address feedback.`, + }) + .step('impl-b-work', { + agent: 'impl-b', + dependsOn: ['context'], // SAME dep as lead + task: `You are impl-b on #wf-my-feature. Wait for the lead's plan. 
+Implement your assigned file. Post a completion message. Address feedback.`, + }) + + // Downstream gates on the lead — lead exits when satisfied. + .step('verify', { + type: 'deterministic', + dependsOn: ['lead-coordinate'], + command: 'npm run typecheck && npm test', + failOnError: true, + }) + + .onError('retry', { maxRetries: 2, retryDelayMs: 10_000 }) + .run({ cwd: process.cwd() }); +``` + ### ⚡ Parallelism — Design for Speed #### Cross-Workflow Parallelism: Wave Planning @@ -214,6 +305,10 @@ export function applyCloudRepoSetup(wf: T, opts: CloudRepoSetupOptions): T { - Show that the original failure no longer occurs - **Record residual risks** - Call out what was not covered +- **Ship the result as a PR** +- Open the pull request from the workflow itself with `createGitHubStep` +- See [Shipping the Result — Open a PR via `createGitHubStep`](#shipping-the-result--open-a-pr-via-creategithubstep) below +- A workflow that fixes a bug and stops short of the PR has only done half the loop - disposable sandbox / cloud workspace - Docker / containerized environment - fresh local shell with isolated paths @@ -222,6 +317,72 @@ export function applyCloudRepoSetup(wf: T, opts: CloudRepoSetupOptions): T { - chooses the best swarm pattern - then authors the final fix/validation workflow +### Shipping the Result — Open a PR via `createGitHubStep` + +#### The minimal "open a PR" recipe + +```typescript +import { workflow } from '@agent-relay/sdk/workflows'; +import { createGitHubStep } from '@agent-relay/sdk/github'; + +const REPO = 'AgentWorkforce/cloud'; +const BRANCH = `agent-relay/run-${Date.now()}`; + +await workflow('feature-x') + // ... your real steps that produce code changes ... + .step('write-marker', { + type: 'deterministic', + command: `echo "fix landed at $(date -u)" >> CHANGELOG.md`, + }) + + // Branch off main on the remote. 
+ .step( + 'create-branch', + createGitHubStep({ + dependsOn: ['write-marker'], + action: 'createBranch', + repo: REPO, + params: { branch: BRANCH, source: 'main' }, + }) + ) + + // Commit the change to the branch via Contents API. + .step( + 'commit-change', + createGitHubStep({ + dependsOn: ['create-branch'], + action: 'createFile', + repo: REPO, + params: { + path: 'CHANGELOG.md', + branch: BRANCH, + content: '', + message: 'chore: changelog entry', + }, + }) + ) + + // Open the PR. This is the load-bearing step. + .step( + 'open-pr', + createGitHubStep({ + dependsOn: ['commit-change'], + action: 'createPR', + repo: REPO, + params: { + title: 'feat: ship feature X', + head: BRANCH, + base: 'main', + body: '## Summary\n\n- ...\n\n## Test plan\n\n- [x] ...', + draft: false, + }, + output: { mode: 'data', format: 'json', path: 'html_url' }, + }) + ) + + .run({ cwd: process.cwd() }); +``` + ### Key Concepts #### Verification Gates @@ -376,6 +537,78 @@ Edit files as assigned. Report completion. Fix issues from feedback.`, .step('verify', { type: 'deterministic', dependsOn: ['lead-coordinate'], ... }) ``` +#### 1. Question / Answer (blocking ask) + +```typescript +.step('integrate', { + agent: 'integrator', + dependsOn: ['context'], + task: `You are the integrator on #wf-feature. +Before writing code, post a direct question to @schema-owner asking which +table owns the new field. Do NOT proceed until @schema-owner replies in +channel. If no reply arrives in 5 minutes, @-mention the lead.`, +}) +``` + +#### 2. Broadcast / Ack + +```typescript +.step('lead-coordinate', { + agent: 'lead', + dependsOn: ['context'], + task: `Post the plan to #wf-feature, then @impl-a @impl-b @impl-c. +Wait for each to reply with "ACK " before issuing assignments. +If any worker hasn't acked in 3 minutes, re-post and ping again. +Only after all three have acked, post per-worker assignments.`, +}) +``` + +#### 3. 
Peer Review Handoff + +```typescript +.step('impl-a-work', { + agent: 'impl-a', + dependsOn: ['context'], + task: `Implement src/foo.ts per the lead's assignment. +When done, post to #wf-feature: "@reviewer ready: src/foo.ts" — include the +commit SHA. Then wait for @reviewer's verdict in channel. +- If "APPROVED", you're done. +- If "CHANGES_REQUESTED ", apply the notes and re-post. +- If no verdict in 5 min, @-mention the lead.`, +}) +``` + +#### 4. Standup / Status Probe + +```typescript +.step('lead-coordinate', { + agent: 'lead', + task: `... coordinate the team ... + +Every 10 minutes, post a status probe: "@impl-a @impl-b status?" +Each worker should reply with one of: + - "RUNNING " (still working) + - "BLOCKED " (@-mention the lead with the blocker) + - "DONE " (ready for review) + +If a worker is silent for two probes in a row, mark them stalled and +reassign their work to a peer.`, +}) +``` + +#### 5. Hand-Off with Context + +```typescript +.step('impl-a-work', { + agent: 'impl-a', + task: `... finish your part ... + +When done, post a handoff to #wf-feature targeting the next worker: +"@impl-b HANDOFF: src/foo.ts ready. Touched: . Open question: . +Tests: . 
Commit: ."`, +}) +``` + #### Pipeline (sequential handoff) ```typescript @@ -546,48 +779,50 @@ When you set `.pattern('supervisor')` (or `hub-spoke`, `fan-out`), the runner au ### Common Mistakes -| Mistake | Fix | -| --------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| All workflows run sequentially | Group independent workflows into parallel waves (4-7x speedup) | -| Every step depends on the previous one | Only add `dependsOn` when there's a real data dependency | -| Self-review step with no timeout | Set `timeout: 300_000` (5 min) — Codex hangs in non-interactive review | -| One giant workflow per feature | Split into smaller workflows that can run in parallel waves | -| Adding exit instructions to tasks | Runner handles self-termination automatically | -| Setting `timeoutMs` on agents/steps | Use global `.timeout()` only | -| Using `general` channel | Set `.channel('wf-name')` for isolation | -| `{{steps.X.output}}` without `dependsOn: ['X']` | Output won't be available yet | -| Requiring exact sentinel as only completion gate | Use `exit_code` or `file_exists` verification | -| Writing 100-line task prompts | Split into lead + workers on a channel | -| `maxConcurrency: 16` with many parallel steps | Cap at 5-6 | -| Non-interactive agent reading large files via tools | Pre-read in deterministic step, inject via `{{steps.X.output}}` | -| Workers depending on lead step (deadlock) | Both depend on shared context step | -| `fan-out`/`hub-spoke` for simple parallel workers | Use `dag` instead | -| `pipeline` but expecting auto-supervisor | Only hub patterns auto-harden. 
Use `.pattern('supervisor')` | -| Workers without `preset: 'worker'` in one-shot DAG lead+worker flows | Add preset for clean stdout when chaining `{{steps.X.output}}` (not needed for interactive team patterns) | -| Using `_` in YAML numbers (`timeoutMs: 1_200_000`) | YAML doesn't support `_` separators | -| Workflow timeout under 30 min for complex workflows | Use `3600000` (1 hour) as default | -| Using `require()` in ESM projects | Check `package.json` for `"type": "module"` — use `import` if ESM | -| Wrapping in `async function main()` in ESM | ESM supports top-level `await` — no wrapper needed | -| Using `createWorkflowRenderer` | Does not exist. Use `.run({ cwd: process.cwd() })` | -| `export default workflow(...)...build()` | No `.build()`. Chain ends with `.run()` — the file must call `.run()`, not just export config | -| Relative import `'../workflows/builder.js'` | Use `import { workflow } from '@agent-relay/sdk/workflows'` | -| Hardcoded model strings (`model: 'opus'`) | Use constants: `import { ClaudeModels } from '@agent-relay/config'` → `model: ClaudeModels.OPUS` | -| Thinking `agent-relay run` inspects exports | It executes the file as a subprocess. Only `.run()` invocations trigger steps | -| `pattern('single')` on cloud runner | Not supported — use `dag` | -| `pattern('supervisor')` with one agent | Same agent is owner + specialist. Use `dag` | -| Invalid verification type (`type: 'deterministic'`) | Only `exit_code`, `output_contains`, `file_exists`, `custom` are valid | -| Chaining `{{steps.X.output}}` from interactive agents | PTY output is garbled. Use deterministic steps or `preset: 'worker'` | -| Single step editing 4+ files | Agents modify 1-2 then exit. Split to one file per step with verify gates | -| Relying on agents to `git commit` | Agents emit markers without running git. 
Use deterministic commit step |
-| File-writing steps without `file_exists` verification | `exit_code` auto-passes even if no file written |
-| Manual peer fanout in `handleChannelMessage()` | Use broker-managed channel subscriptions; the broker fans out to all subscribers automatically |
-| Client-side `personaNames.has(from)` filtering | Use `relay.subscribe()`/`relay.unsubscribe()`; only subscribed agents receive messages |
-| Agents receiving noisy cross-channel messages during focused work | Use `relay.mute({ agent, channel })` to silence non-primary channels without leaving them |
-| Hardcoding all channels at spawn time | Use `agent.subscribe()` / `agent.unsubscribe()` for dynamic channel membership post-spawn |
-| Using `preset: 'worker'` for Codex in _interactive team_ patterns when coordination is needed | Codex interactive mode works fine with PTY channel injection. Drop the preset for interactive team patterns (keep it for one-shot DAG workers where clean stdout matters) |
-| Separate reviewer agent from lead in interactive team | Merge lead + reviewer into one interactive Claude agent: reviews between rounds, fewer agents |
-| Not printing PR URL after `gh pr create` | Add a final deterministic step: `echo "PR: $(cat pr-url.txt)"` or capture in the `gh pr create` command |
-| Workflow ending without worktree + PR for cross-repo changes | Add `setup-worktree` at start and `push-and-pr` + `cleanup-worktree` at end |
+| Mistake | Fix |
+| --- | --- |
+| Treating relay as transport, not as a coordination layer (every step is `preset: 'worker'`, every handoff is `{{steps.X.output}}`) | Default to **Conversation shape** for non-trivial work: interactive agents on a shared channel. Pipeline-shape is only correct when the work could be expressed as a `bash \| bash \| bash` pipe. |
+| Interactive agents on a channel whose task strings don't tell them to talk to each other | Pick a [Chat-Native Coordination Recipe](#chat-native-coordination-recipes) (Q/A, Broadcast/Ack, Peer Review, Standup, Hand-Off) and bake it into the task prompt; otherwise you're paying for a chat substrate you're not using |
+| All workflows run sequentially | Group independent workflows into parallel waves (4-7x speedup) |
+| Every step depends on the previous one | Only add `dependsOn` when there's a real data dependency |
+| Self-review step with no timeout | Set `timeout: 300_000` (5 min); Codex hangs in non-interactive review |
+| One giant workflow per feature | Split into smaller workflows that can run in parallel waves |
+| Adding exit instructions to tasks | Runner handles self-termination automatically |
+| Setting `timeoutMs` on agents/steps | Use global `.timeout()` only |
+| Using `general` channel | Set `.channel('wf-name')` for isolation |
+| `{{steps.X.output}}` without `dependsOn: ['X']` | Output won't be available yet |
+| Requiring exact sentinel as only completion gate | Use `exit_code` or `file_exists` verification |
+| Writing 100-line task prompts | Split into lead + workers on a channel |
+| `maxConcurrency: 16` with many parallel steps | Cap at 5-6 |
+| Non-interactive agent reading large files via tools | Pre-read in deterministic step, inject via `{{steps.X.output}}` |
+| Workers depending on lead step (deadlock) | Both depend on shared context step |
+| `fan-out`/`hub-spoke` for simple parallel workers | Use `dag` instead |
+| `pipeline` but expecting auto-supervisor | Only hub patterns auto-harden.
Use `.pattern('supervisor')` | +| Workers without `preset: 'worker'` in one-shot DAG lead+worker flows | Add preset for clean stdout when chaining `{{steps.X.output}}` (not needed for interactive team patterns) | +| Using `_` in YAML numbers (`timeoutMs: 1_200_000`) | YAML doesn't support `_` separators | +| Workflow timeout under 30 min for complex workflows | Use `3600000` (1 hour) as default | +| Using `require()` in ESM projects | Check `package.json` for `"type": "module"` — use `import` if ESM | +| Wrapping in `async function main()` in ESM | ESM supports top-level `await` — no wrapper needed | +| Using `createWorkflowRenderer` | Does not exist. Use `.run({ cwd: process.cwd() })` | +| `export default workflow(...)...build()` | No `.build()`. Chain ends with `.run()` — the file must call `.run()`, not just export config | +| Relative import `'../workflows/builder.js'` | Use `import { workflow } from '@agent-relay/sdk/workflows'` | +| Hardcoded model strings (`model: 'opus'`) | Use constants: `import { ClaudeModels } from '@agent-relay/config'` → `model: ClaudeModels.OPUS` | +| Thinking `agent-relay run` inspects exports | It executes the file as a subprocess. Only `.run()` invocations trigger steps | +| `pattern('single')` on cloud runner | Not supported — use `dag` | +| `pattern('supervisor')` with one agent | Same agent is owner + specialist. Use `dag` | +| Invalid verification type (`type: 'deterministic'`) | Only `exit_code`, `output_contains`, `file_exists`, `custom` are valid | +| Chaining `{{steps.X.output}}` from interactive agents | PTY output is garbled. Use deterministic steps or `preset: 'worker'` | +| Single step editing 4+ files | Agents modify 1-2 then exit. Split to one file per step with verify gates | +| Relying on agents to `git commit` | Agents emit markers without running git. 
Use deterministic commit step | +| File-writing steps without `file_exists` verification | `exit_code` auto-passes even if no file written | +| Manual peer fanout in `handleChannelMessage()` | Use broker-managed channel subscriptions — broker fans out to all subscribers automatically | +| Client-side `personaNames.has(from)` filtering | Use `relay.subscribe()`/`relay.unsubscribe()` — only subscribed agents receive messages | +| Agents receiving noisy cross-channel messages during focused work | Use `relay.mute({ agent, channel })` to silence non-primary channels without leaving them | +| Hardcoding all channels at spawn time | Use `agent.subscribe()` / `agent.unsubscribe()` for dynamic channel membership post-spawn | +| Using `preset: 'worker'` for Codex in _interactive team_ patterns when coordination is needed | Codex interactive mode works fine with PTY channel injection. Drop the preset for interactive team patterns (keep it for one-shot DAG workers where clean stdout matters) | +| Separate reviewer agent from lead in interactive team | Merge lead + reviewer into one interactive Claude agent — reviews between rounds, fewer agents | +| Not printing PR URL after `createGitHubStep({ action: 'createPR' })` | Capture `html_url` with `output: { mode: 'data', format: 'json', path: 'html_url' }` and echo or write it in a final deterministic step | +| Workflow ending without worktree + PR for cross-repo changes | Add `setup-worktree` at start and `push-and-pr` + `cleanup-worktree` at end | ### YAML Alternative diff --git a/.claude/skills/writing-agent-relay-workflows/SKILL.md b/.claude/skills/writing-agent-relay-workflows/SKILL.md index 73762176e..8f67c3f91 100644 --- a/.claude/skills/writing-agent-relay-workflows/SKILL.md +++ b/.claude/skills/writing-agent-relay-workflows/SKILL.md @@ -1,6 +1,6 @@ --- name: writing-agent-relay-workflows -description: Use when building multi-agent workflows with the relay broker-sdk - covers the WorkflowBuilder API, DAG step 
dependencies, agent definitions, step output chaining via {{steps.X.output}}, verification gates, evidence-based completion, owner decisions, dedicated channels, dynamic channel management (subscribe/unsubscribe/mute/unmute), swarm patterns, error handling, event listeners, step sizing rules, authoring best practices, and the lead+workers team pattern for complex steps +description: Use when building multi-agent workflows with the relay broker-sdk - covers conversation-shape vs pipeline-shape coordination, the WorkflowBuilder API, DAG step dependencies, agent definitions, step output chaining via {{steps.X.output}}, verification gates, evidence-based completion, owner decisions, dedicated channels, dynamic channel management (subscribe/unsubscribe/mute/unmute), swarm patterns, chat-native coordination recipes (Q/A, broadcast-ack, peer review, standup, hand-off), error handling, event listeners, step sizing rules, authoring best practices, and the lead+workers team pattern for complex steps --- # Writing Agent Relay Workflows @@ -21,8 +21,25 @@ The relay broker-sdk workflow system orchestrates multiple AI agents (Claude, Co - Needing verification gates, retries, or step output chaining - Dynamic channel management: agents joining/leaving/muting channels mid-workflow -## Quick Reference +## Choose Your Coordination Style — Conversation vs Pipeline +Before writing the workflow, decide _how the agents will coordinate_. The relay primitive supports two very different shapes, and picking the wrong one wastes the most valuable thing the SDK gives you. 
+
+| Shape | What it is | Use when |
+| --- | --- | --- |
+| **Conversation** (chat-native) | Interactive agents share a channel; messages, `@-mentions`, and ambient awareness drive coordination. Lead and workers spawn in parallel and self-organize. The relay is the coordination layer, not just transport. | Multi-file work, peer review loops, cross-agent feedback, dynamic re-planning, multi-PR coordination, anything with a human-in-the-loop escape, swarms where workers pick up each other's output. |
+| **Pipeline** (one-shot DAG) | Each step runs as a one-shot subprocess (`claude -p`, `codex exec`); steps hand off via `{{steps.X.output}}` text injection. No agents are alive at the same time; no chat happens. | Linear, well-specified transformations; deterministic data passing; no review loop expected; the work could be expressed as a `bash \| bash \| bash` pipe. |
+
+**Default to Conversation for any non-trivial work.** Pipeline DAGs are simpler to reason about, but they do not exercise the relay primitive; they are a Unix pipe with extra steps. If you would happily write the same task as a single shell pipeline, pipeline-shape is fine. Otherwise, you almost certainly want a Conversation shape.
+
+The two shapes can mix within one workflow: pipeline-style deterministic preflight → conversation in the middle → pipeline-style commit-and-PR at the end. See **Quick Reference (Conversation shape)** below and **[Common Patterns → Interactive Team](#interactive-team-lead--workers-on-shared-channel)** for the canonical recipe.
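The ordering implied by a mixed workflow can be sketched without the SDK. This is a minimal illustration only: `StepSketch` and `planWaves` are hypothetical names, not part of `@agent-relay/sdk`, and the real runner's scheduling may differ. It groups steps into waves by `dependsOn`, so the deterministic preflight runs first, the lead and workers start together, and the deterministic verify and PR steps come last.

```typescript
// Hypothetical step shape for illustration; not the SDK's real type.
interface StepSketch {
  name: string;
  dependsOn: string[];
}

// Groups steps into waves: a step lands in the first wave that starts after
// all of its dependencies were placed in earlier waves.
// Throws when dependencies are unknown or cyclic.
function planWaves(steps: StepSketch[]): string[][] {
  const placed = new Set<string>();
  const waves: string[][] = [];
  let remaining = [...steps];

  while (remaining.length > 0) {
    const ready = remaining.filter((s) => s.dependsOn.every((d) => placed.has(d)));
    if (ready.length === 0) {
      const names = remaining.map((s) => s.name).join(', ');
      throw new Error(`Unresolvable dependencies: ${names}`);
    }
    waves.push(ready.map((s) => s.name));
    for (const s of ready) {
      placed.add(s.name);
    }
    remaining = remaining.filter((s) => !placed.has(s.name));
  }
  return waves;
}

// Pipeline-style preflight, conversation in the middle, pipeline-style tail.
const mixed: StepSketch[] = [
  { name: 'preflight', dependsOn: [] },
  { name: 'lead-coordinate', dependsOn: ['preflight'] },
  { name: 'impl-a-work', dependsOn: ['preflight'] },
  { name: 'impl-b-work', dependsOn: ['preflight'] },
  { name: 'verify', dependsOn: ['lead-coordinate'] },
  { name: 'open-pr', dependsOn: ['verify'] },
];

const waves = planWaves(mixed);
console.log(JSON.stringify(waves));
```

The middle wave holds all three interactive steps because they share one dependency, which is what lets them coordinate over the channel instead of waiting on each other's step output.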
+ +> **A blunt rule of thumb:** if your workflow only uses `agent` steps with `preset: 'worker'` chained by `{{steps.X.output}}`, you are not using the relay — you are using `claude -p | codex exec`. That may still be the right answer; just make it a deliberate choice. + +## Quick Reference (Pipeline shape) + +> Use this when steps are linear, well-specified, and need no agent-to-agent feedback. For anything with iteration, review, or coordination, jump to **Quick Reference (Conversation shape)** below. +> > **Note:** this Quick Reference assumes an **ESM** workflow file (the host `package.json` has `"type": "module"`). For CJS repos, see rule #1 in **Critical TypeScript rules** below — convert `import { workflow } from '@agent-relay/sdk/workflows'` to `const { workflow } = require('@agent-relay/sdk/workflows')` and wrap the workflow in `async function main() { ... } main().catch(console.error)` since CJS does not support top-level `await`. **Always check `package.json` before copy-pasting the snippet.** ```typescript @@ -57,6 +74,99 @@ const result = await workflow('my-workflow') console.log('Result:', result.status); ``` +## Quick Reference (Conversation shape) + +> Use this for any non-trivial work — peer review, multi-file edits, cross-agent feedback, dynamic re-planning. Lead and workers spawn **in parallel** on a shared channel and self-organize via messages. The relay primitive does the coordinating; verification gates downstream of the lead close the workflow. 
+ +```typescript +import { workflow } from '@agent-relay/sdk/workflows'; +import { ClaudeModels, CodexModels } from '@agent-relay/config'; + +const result = await workflow('my-workflow') + .description('Multi-file change with peer review') + .pattern('dag') + .channel('wf-my-feature') // dedicated channel — agents share it + .maxConcurrency(4) + .timeout(3_600_000) + + // Interactive agents — no preset, they live on the channel + .agent('lead', { + cli: 'claude', + model: ClaudeModels.OPUS, + role: 'Architect + reviewer. Plans, assigns, reviews, posts feedback.', + retries: 1, + }) + .agent('impl-a', { + cli: 'codex', + model: CodexModels.GPT_5_4, + role: 'Implementer. Listens on channel for assignments and feedback.', + retries: 2, + }) + .agent('impl-b', { + cli: 'codex', + model: CodexModels.GPT_5_4, + role: 'Implementer. Listens on channel for assignments and feedback.', + retries: 2, + }) + + // Deterministic context — pre-reads files once, posts to the channel for everyone + .step('context', { + type: 'deterministic', + command: 'git ls-files src/', + captureOutput: true, + }) + + // Lead and workers all depend on `context` — they start CONCURRENTLY. + // They coordinate over #wf-my-feature, not via {{steps.X.output}}. + .step('lead-coordinate', { + agent: 'lead', + dependsOn: ['context'], + task: `You are the lead on #wf-my-feature. Workers: impl-a, impl-b. +Post the plan. Assign files. Review their PRs/diffs. Post feedback in-channel. +Workers iterate based on your feedback. Exit when both files pass review.`, + }) + .step('impl-a-work', { + agent: 'impl-a', + dependsOn: ['context'], // SAME dep as lead → starts in parallel, no deadlock + task: `You are impl-a on #wf-my-feature. Wait for the lead's plan. +Implement your assigned file. Post a completion message. Address feedback.`, + }) + .step('impl-b-work', { + agent: 'impl-b', + dependsOn: ['context'], // SAME dep as lead + task: `You are impl-b on #wf-my-feature. Wait for the lead's plan. 
+Implement your assigned file. Post a completion message. Address feedback.`, + }) + + // Downstream gates on the lead — lead exits when satisfied. + .step('verify', { + type: 'deterministic', + dependsOn: ['lead-coordinate'], + command: 'npm run typecheck && npm test', + failOnError: true, + }) + + .onError('retry', { maxRetries: 2, retryDelayMs: 10_000 }) + .run({ cwd: process.cwd() }); +``` + +**What this exercises that pipeline-shape does not:** + +- **Ambient awareness** — workers see each other's completion messages and start dependent work without the lead relaying. +- **Lead-as-reviewer** — the lead reads actual files between rounds and posts diff-aware feedback in chat. One agent does coordination + review; no separate reviewer step. +- **Iterative correction** — when the lead pings _"impl-a, the type on line 42 is wrong"_, impl-a fixes and re-posts. No new step, no re-spawn, no `{{output}}` chaining. + +**Critical workflow rules for this shape:** + +1. Lead and workers MUST share the same `dependsOn` (e.g., both depend on `context`). If a worker depends on the lead, you have a deadlock — the lead is waiting for worker output, the worker is waiting for the lead step to "complete." +2. Drop `preset: 'worker'` on the implementer agents — interactive mode is what lets them receive channel messages via PTY injection. +3. Downstream gates depend on the **lead step**, not the workers. The lead exits when it's satisfied; that's the workflow's signal of completion. +4. Use a dedicated `.channel('wf-...')` so the team is isolated from other workflows and the global `general` channel. + +See [Common Patterns → Interactive Team](#interactive-team-lead--workers-on-shared-channel) for production notes from real runs and decision criteria for picking this shape over one-shot DAG. + +--- + **Critical TypeScript rules:** 1. Check the project's `package.json` for `"type": "module"` — if ESM, use `import` and top-level `await`. 
If CJS, use `require()` and wrap in `async function main()`. @@ -173,6 +283,25 @@ runWorkflow().catch((error) => { Do not end workflow files with bare top-level `await workflow(...).run(...)`. +### 1b. Make commit and PR boundaries explicit + +Workflows do **not** get a PR for free just because they pass validation. If the intended deliverable is a branch, commit, push, or GitHub PR, the workflow itself must own that boundary explicitly and document the expected file scope. + +Use this pattern only when the workflow is supposed to own repository delivery: + +1. Preflight the git state and fail on unexpected staged changes. +2. Create or verify the intended branch. +3. Run implementation, review, soft validation, fix, and hard validation gates. +4. Stage only the declared target files and review/signoff artifacts. +5. Commit with a deterministic message. +6. Push the branch. +7. Use `createGitHubStep({ action: 'createPR', ... })` from `@agent-relay/sdk/github` to open the PR. +8. Verify the PR URL/state deterministically and write it into the final signoff artifact. + +Do not hide commit/PR work in agent prose. Model it as deterministic steps whenever possible. For PR creation, issue updates, file reads, or any GitHub operation, prefer `createGitHubStep` over shelling out to `gh`; it is bundled with `@agent-relay/sdk`. The downstream hard gate must still verify the PR exists before signoff. + +If commit or PR creation is intentionally outside the workflow, say that directly in the workflow description and signoff so the operator knows to do it after completion. + ### 2. Avoid raw fenced code blocks inside workflow task template literals Raw triple-backtick code fences inside large inline ``task: `...` `` template strings are fragile and can break outer TypeScript parsing, especially when they contain language tags like `swift` or `diff`.
@@ -224,7 +353,7 @@ The battle-tested template: - **Use `grep -vE "^(...)$"` for full-line match.** Substring matches bleed across unrelated files (e.g., `setup.ts` would also match `packages/core/src/bootstrap/setup.ts`). - **Append `|| true` to the grep.** Without it, an empty result triggers `set -e` and the whole preflight fails before the `if` can even run. - **Check the staging area separately.** A dirty index is different from a dirty working tree and both must be clean (modulo allow-list). -- **Check `gh auth status` early** if any downstream step uses `gh pr create` or similar. Failing on auth at the end of a long DAG is painful. +- **Check `gh auth status` early** if downstream GitHub operations will use the local transport. Failing on auth at the end of a long DAG is painful. **Never use `git diff --quiet` alone as your "clean tree" check.** It fails on any dirty file, including the ones the workflow is expected to rewrite, which causes false failures on every resume / re-run. @@ -267,7 +396,7 @@ command: [ Results in `set -e && cat > /tmp/f < **Phase C interaction (cloud only):** `agent-relay cloud run` already auto-pushes per-`paths[]` diffs as separate PRs after the workflow callback when the repos are allowlisted (see `pushedTo` in the run record). Phase C is the _catch-all_ — if your workflow does nothing else, you still get one PR per declared path. Use `createGitHubStep` **on top of** that when you need PRs the catch-all can't produce: cross-cutting issues, follow-up tracking issues, opening one PR that spans multiple paths, draft PRs you want labeled/assigned in specific ways, or PRs against a repo you didn't `paths[]` in. + +### The minimal "open a PR" recipe + +```typescript +import { workflow } from '@agent-relay/sdk/workflows'; +import { createGitHubStep } from '@agent-relay/sdk/github'; + +const REPO = 'AgentWorkforce/cloud'; +const BRANCH = `agent-relay/run-${Date.now()}`; + +await workflow('feature-x') + // ... 
your real steps that produce code changes ... + .step('write-marker', { + type: 'deterministic', + command: `echo "fix landed at $(date -u)" >> CHANGELOG.md`, + }) + + // Branch off main on the remote. + .step( + 'create-branch', + createGitHubStep({ + dependsOn: ['write-marker'], + action: 'createBranch', + repo: REPO, + params: { branch: BRANCH, source: 'main' }, + }) + ) + + // Commit the change to the branch via Contents API. + .step( + 'commit-change', + createGitHubStep({ + dependsOn: ['create-branch'], + action: 'createFile', + repo: REPO, + params: { + path: 'CHANGELOG.md', + branch: BRANCH, + content: '', + message: 'chore: changelog entry', + }, + }) + ) + + // Open the PR. This is the load-bearing step. + .step( + 'open-pr', + createGitHubStep({ + dependsOn: ['commit-change'], + action: 'createPR', + repo: REPO, + params: { + title: 'feat: ship feature X', + head: BRANCH, + base: 'main', + body: '## Summary\n\n- ...\n\n## Test plan\n\n- [x] ...', + draft: false, + }, + output: { mode: 'data', format: 'json', path: 'html_url' }, + }) + ) + + .run({ cwd: process.cwd() }); +``` + +`createGitHubStep` is bundled with `@agent-relay/sdk`; do not add a separate install. Its actions are stable across runtimes: `getRepo`, `createBranch`, `createFile`, `updateFile`, `createPR`, `updatePR`, `getPR`, `listPRs`, `mergePR`, `createIssue`, etc. See the SDK GitHub primitive docs for the full enum. + +### Authoring rules for PR-shipping workflows + +1. **Open the PR from the workflow, not from the operator's shell.** "Tell the user to run `gh pr create`" is a regression to a manual step the workflow could have done. The whole point of running this in cloud is that there is no operator's shell. +2. **One PR per workflow, by default.** A workflow that opens five PRs from one run is almost always wrong — humans review one PR at a time. If you genuinely need multiple, prefer a tracking issue + linked PRs, or split into separate workflows. +3. 
**Branch name encodes the run.** `agent-relay/run-${runId}` or `agent-relay/${workflow-name}-${timestamp}` so reviewers can tell the PR apart from other automation, and so reruns don't clash. +4. **`draft: true` while iterating.** Once the workflow is stable end-to-end, flip to `draft: false`. +5. **Body is a real PR description.** Summary + Test plan, generated from the workflow's own evidence (verification step output, diff stats, test run output). If you find yourself writing a placeholder body, the workflow isn't done — capture the real evidence in an earlier step and template it in. +6. **Don't use `createGitHubStep` to substitute for `paths[]` push-back in cloud.** If the diff lives in a tarballed `paths[]` mount, let cloud's Phase C push-back open that PR (it handles the patch generation, branch lifecycle, and per-repo allowlist). Use `createGitHubStep` when you need a PR against a repo or branch outside the `paths[]` set, or when you want to add an extra PR (e.g. a tracking issue, a follow-up against a sibling repo, a docs-only PR). +7. **Failure is a real failure.** If `createPR` errors (auth, permissions, branch conflict), the workflow should fail the step, not warn-and-continue. A "successful" workflow that silently failed to open the PR is the worst-case outcome — the human thinks the work shipped. + +### Where this fits in the bug-fix phases + +[End-to-End Bug Fix Workflows](#end-to-end-bug-fix-workflows) lists "Ship the result as a PR" as phase 9. Concretely that means: after phase 7 (compare before/after evidence) succeeds, the workflow's next step is `createPR` with that evidence templated into the body. The PR opening **is** the ship — there is no further manual step. + ## Key Concepts ### Step Output Chaining Use `{{steps.STEP_NAME.output}}` in a downstream step's task to inject the prior step's terminal output. +> **Mental model:** this is a **Unix pipe**, not agent communication. 
`{{steps.A.output}}` flowing into step B is `A | B` — A is dead by the time B reads its stdout. There is no chat, no feedback, no addressing. If your workflow's coordination story is _only_ output chaining, you're using the relay as transport, not as a coordination layer. See **[Choose Your Coordination Style](#choose-your-coordination-style--conversation-vs-pipeline)** before defaulting to this. + **Only chain output from clean sources:** - Deterministic steps (shell commands — always clean) - Non-interactive agents (`preset: 'worker'` — clean stdout) -**Never chain from interactive agents** (`cli: 'claude'` without preset) — PTY output includes spinners, ANSI codes, and TUI chrome. Instead, have the agent write to a file, then read it in a deterministic step. +**Never chain from interactive agents** (`cli: 'claude'` without preset) — PTY output includes spinners, ANSI codes, and TUI chrome. Instead, have the agent write to a file, then read it in a deterministic step. (Or: don't use chaining at all — let the agents coordinate over the channel.) ### Verification Gates @@ -764,12 +997,119 @@ Edit files as assigned. Report completion. 
Fix issues from feedback.`, **When to use interactive team vs one-shot DAG:** -| Scenario | Pattern | -| ---------------------------------- | ------------------------------------ | -| 4+ files, likely needs iteration | Interactive team | -| Simple edits, well-specified | One-shot DAG with `preset: 'worker'` | -| Cross-agent review feedback loop | Interactive team | -| Independent tasks, no coordination | Fan-out with non-interactive workers | +| Scenario | Pattern | +| ------------------------------------------------- | ------------------------------------ | +| 4+ files, likely needs iteration | Interactive team | +| Simple edits, well-specified | One-shot DAG with `preset: 'worker'` | +| Cross-agent review feedback loop | Interactive team | +| Independent tasks, no coordination | Fan-out with non-interactive workers | +| Anything where the answer to "could this be `cmd1 \| cmd2`?" is _no_ | Interactive team | + +### Chat-Native Coordination Recipes + +Once you're in the Interactive Team shape, the channel is your coordination medium. These are recipes for using it well — they are _prompt-authoring patterns_, not new SDK surface. All of them assume interactive agents (no `preset`) sharing a `.channel('wf-...')`. + +#### 1. Question / Answer (blocking ask) + +When agent A needs information only agent B has, instruct A to **post a direct question and wait for a reply** rather than guessing or proceeding. + +```typescript +.step('integrate', { + agent: 'integrator', + dependsOn: ['context'], + task: `You are the integrator on #wf-feature. +Before writing code, post a direct question to @schema-owner asking which +table owns the new field. Do NOT proceed until @schema-owner replies in +channel. If no reply arrives in 5 minutes, @-mention the lead.`, +}) +``` + +**Why it beats `{{steps.X.output}}`:** the answer depends on something only an agent (or human) can decide at runtime; encoding it as a prior step's stdout is wrong. + +#### 2.
Broadcast / Ack + +When a lead needs _N workers to confirm receipt_ before proceeding (e.g., to make sure the plan was actually read), require explicit acks. + +```typescript +.step('lead-coordinate', { + agent: 'lead', + dependsOn: ['context'], + task: `Post the plan to #wf-feature, then @impl-a @impl-b @impl-c. +Wait for each to reply with "ACK " before issuing assignments. +If any worker hasn't acked in 3 minutes, re-post and ping again. +Only after all three have acked, post per-worker assignments.`, +}) +``` + +**Why it matters:** in the Codex history, the most common silent failure is a worker step that started but never read the channel. An ack gate makes "did you actually receive this?" deterministic without a separate verification step. + +#### 3. Peer Review Handoff + +The substantive form of "review my work." Worker pings reviewer in-channel with a concrete artifact reference; reviewer reads the actual files (not the chat); reviewer replies with a verdict. + +```typescript +.step('impl-a-work', { + agent: 'impl-a', + dependsOn: ['context'], + task: `Implement src/foo.ts per the lead's assignment. +When done, post to #wf-feature: "@reviewer ready: src/foo.ts" — include the +commit SHA. Then wait for @reviewer's verdict in channel. +- If "APPROVED", you're done. +- If "CHANGES_REQUESTED ", apply the notes and re-post. +- If no verdict in 5 min, @-mention the lead.`, +}) +``` + +**Pattern note:** the reviewer must read the files themselves — never let the worker paste the diff into chat. Channel messages are for _coordination_, not _content_. That's also what keeps you under output-token limits. + +#### 4. Standup / Status Probe + +For long-running workflows, have the lead post periodic `@-mention` probes so silently-stuck workers surface fast. + +```typescript +.step('lead-coordinate', { + agent: 'lead', + task: `... coordinate the team ... + +Every 10 minutes, post a status probe: "@impl-a @impl-b status?" 
+Each worker should reply with one of: + - "RUNNING " (still working) + - "BLOCKED " (@-mention the lead with the blocker) + - "DONE " (ready for review) + +If a worker is silent for two probes in a row, mark them stalled and +reassign their work to a peer.`, +}) +``` + +#### 5. Hand-Off with Context + +When work flows from agent A to agent B _during_ a workflow (not just between steps), have A post a structured handoff message so B doesn't re-derive context. + +```typescript +.step('impl-a-work', { + agent: 'impl-a', + task: `... finish your part ... + +When done, post a handoff to #wf-feature targeting the next worker: +"@impl-b HANDOFF: src/foo.ts ready. Touched: . Open question: . +Tests: . Commit: ."`, +}) +``` + +**Vs `{{steps.X.output}}`:** an output-chain forces B to parse A's entire stdout. A handoff message is a curated summary A writes for B — much higher signal, no PTY/ANSI noise. + +#### Picking a recipe + +| Need | Recipe | +| ------------------------------------------------- | ------------------------- | +| One agent needs an answer from another at runtime | **Q/A** | +| Lead needs to confirm workers received the plan | **Broadcast/Ack** | +| Agent-to-agent code review | **Peer Review Handoff** | +| Long-running team, want stalled-worker visibility | **Standup/Probe** | +| Sequential agent work that needs context curation | **Hand-Off with Context** | + +> **Authoring rule:** if your workflow has interactive agents on a channel but their task strings don't _instruct them to talk to each other_, you're not using the chat primitive — you've just paid the overhead of starting it. Either add an explicit recipe above, or drop to `preset: 'worker'` and pipeline-shape. 
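
The authoring rule above can be checked mechanically before a run. The sketch below is a hypothetical authoring-time lint, not part of `@agent-relay/sdk`: the `StepSpec` shape, the `findSilentSteps` name, and the heuristic itself (a task counts as chat-aware if it contains an `@`-mention or a `#wf-` channel reference) are all assumptions made for illustration.

```typescript
// Hypothetical lint helper; NOT an @agent-relay/sdk API.
// StepSpec mirrors only the fields this check needs.
interface StepSpec {
  name: string;
  agent: string;
  task: string;
}

// "@impl-a", "@reviewer": an @-mention at the start of the text or after
// whitespace, a quote, or an opening parenthesis (so "user@host" is ignored).
const MENTION = /(^|[\s"'(])@[a-z][\w-]*/i;

// "#wf-feature": a dedicated workflow channel reference (assumed "wf-" prefix).
const CHANNEL = /(^|[\s"'(])#wf-[\w-]+/i;

// Returns the names of steps whose task text never addresses another agent
// or the shared channel, i.e. candidates for `preset: 'worker'`.
function findSilentSteps(steps: StepSpec[]): string[] {
  return steps
    .filter((s) => !MENTION.test(s.task) && !CHANNEL.test(s.task))
    .map((s) => s.name);
}

const silentSteps = findSilentSteps([
  { name: 'impl-a-work', agent: 'impl-a', task: 'Implement src/foo.ts.' },
  {
    name: 'impl-b-work',
    agent: 'impl-b',
    task: 'When done, post "@reviewer ready: src/bar.ts" and wait for a verdict.',
  },
  {
    name: 'lead-coordinate',
    agent: 'lead',
    task: 'You are the lead on #wf-feature. Post the plan and assign files.',
  },
]);

console.log(silentSteps);
```

A hit is a prompt to either add one of the recipes above to that task or move the agent to `preset: 'worker'`. The heuristic misses tasks that describe coordination in plain words, so it is better treated as a nudge than as a gate.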
### Pipeline (sequential handoff) @@ -961,48 +1301,50 @@ When you set `.pattern('supervisor')` (or `hub-spoke`, `fan-out`), the runner au ## Common Mistakes -| Mistake | Fix | -| --------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| All workflows run sequentially | Group independent workflows into parallel waves (4-7x speedup) | -| Every step depends on the previous one | Only add `dependsOn` when there's a real data dependency | -| Self-review step with no timeout | Set `timeout: 300_000` (5 min) — Codex hangs in non-interactive review | -| One giant workflow per feature | Split into smaller workflows that can run in parallel waves | -| Adding exit instructions to tasks | Runner handles self-termination automatically | -| Setting `timeoutMs` on agents/steps | Use global `.timeout()` only | -| Using `general` channel | Set `.channel('wf-name')` for isolation | -| `{{steps.X.output}}` without `dependsOn: ['X']` | Output won't be available yet | -| Requiring exact sentinel as only completion gate | Use `exit_code` or `file_exists` verification | -| Writing 100-line task prompts | Split into lead + workers on a channel | -| `maxConcurrency: 16` with many parallel steps | Cap at 5-6 | -| Non-interactive agent reading large files via tools | Pre-read in deterministic step, inject via `{{steps.X.output}}` | -| Workers depending on lead step (deadlock) | Both depend on shared context step | -| `fan-out`/`hub-spoke` for simple parallel workers | Use `dag` instead | -| `pipeline` but expecting auto-supervisor | Only hub patterns auto-harden. 
Use `.pattern('supervisor')` | -| Workers without `preset: 'worker'` in one-shot DAG lead+worker flows | Add preset for clean stdout when chaining `{{steps.X.output}}` (not needed for interactive team patterns) | -| Using `_` in YAML numbers (`timeoutMs: 1_200_000`) | YAML doesn't support `_` separators | -| Workflow timeout under 30 min for complex workflows | Use `3600000` (1 hour) as default | -| Using `require()` in ESM projects | Check `package.json` for `"type": "module"` — use `import` if ESM | -| Wrapping in `async function main()` in ESM | ESM supports top-level `await` — no wrapper needed | -| Using `createWorkflowRenderer` | Does not exist. Use `.run({ cwd: process.cwd() })` | -| `export default workflow(...)...build()` | No `.build()`. Chain ends with `.run()` — the file must call `.run()`, not just export config | -| Relative import `'../workflows/builder.js'` | Use `import { workflow } from '@agent-relay/sdk/workflows'` | -| Hardcoded model strings (`model: 'opus'`) | Use constants: `import { ClaudeModels } from '@agent-relay/config'` → `model: ClaudeModels.OPUS` | -| Thinking `agent-relay run` inspects exports | It executes the file as a subprocess. Only `.run()` invocations trigger steps | -| `pattern('single')` on cloud runner | Not supported — use `dag` | -| `pattern('supervisor')` with one agent | Same agent is owner + specialist. Use `dag` | -| Invalid verification type (`type: 'deterministic'`) | Only `exit_code`, `output_contains`, `file_exists`, `custom` are valid | -| Chaining `{{steps.X.output}}` from interactive agents | PTY output is garbled. Use deterministic steps or `preset: 'worker'` | -| Single step editing 4+ files | Agents modify 1-2 then exit. Split to one file per step with verify gates | -| Relying on agents to `git commit` | Agents emit markers without running git. 
Use deterministic commit step | -| File-writing steps without `file_exists` verification | `exit_code` auto-passes even if no file written | -| Manual peer fanout in `handleChannelMessage()` | Use broker-managed channel subscriptions — broker fans out to all subscribers automatically | -| Client-side `personaNames.has(from)` filtering | Use `relay.subscribe()`/`relay.unsubscribe()` — only subscribed agents receive messages | -| Agents receiving noisy cross-channel messages during focused work | Use `relay.mute({ agent, channel })` to silence non-primary channels without leaving them | -| Hardcoding all channels at spawn time | Use `agent.subscribe()` / `agent.unsubscribe()` for dynamic channel membership post-spawn | -| Using `preset: 'worker'` for Codex in _interactive team_ patterns when coordination is needed | Codex interactive mode works fine with PTY channel injection. Drop the preset for interactive team patterns (keep it for one-shot DAG workers where clean stdout matters) | -| Separate reviewer agent from lead in interactive team | Merge lead + reviewer into one interactive Claude agent — reviews between rounds, fewer agents | -| Not printing PR URL after `gh pr create` | Add a final deterministic step: `echo "PR: $(cat pr-url.txt)"` or capture in the `gh pr create` command | -| Workflow ending without worktree + PR for cross-repo changes | Add `setup-worktree` at start and `push-and-pr` + `cleanup-worktree` at end | +| Mistake | Fix | +| ---------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Treating relay as transport, not as a coordination layer (every step is `preset: 'worker'`, every handoff is `{{steps.X.output}}`) | Default
to **Conversation shape** for non-trivial work — interactive agents on a shared channel. Pipeline-shape is only correct when the work could be expressed as a `bash \| bash \| bash` pipe. | +| Interactive agents on a channel whose task strings don't tell them to talk to each other | Pick a [Chat-Native Coordination Recipe](#chat-native-coordination-recipes) (Q/A, Broadcast/Ack, Peer Review, Standup, Hand-Off) and bake it into the task prompt — otherwise you're paying for a chat substrate you're not using | +| All workflows run sequentially | Group independent workflows into parallel waves (4-7x speedup) | +| Every step depends on the previous one | Only add `dependsOn` when there's a real data dependency | +| Self-review step with no timeout | Set `timeout: 300_000` (5 min) — Codex hangs in non-interactive review | +| One giant workflow per feature | Split into smaller workflows that can run in parallel waves | +| Adding exit instructions to tasks | Runner handles self-termination automatically | +| Setting `timeoutMs` on agents/steps | Use global `.timeout()` only | +| Using `general` channel | Set `.channel('wf-name')` for isolation | +| `{{steps.X.output}}` without `dependsOn: ['X']` | Output won't be available yet | +| Requiring exact sentinel as only completion gate | Use `exit_code` or `file_exists` verification | +| Writing 100-line task prompts | Split into lead + workers on a channel | +| `maxConcurrency: 16` with many parallel steps | Cap at 5-6 | +| Non-interactive agent reading large files via tools | Pre-read in deterministic step, inject via `{{steps.X.output}}` | +| Workers depending on lead step (deadlock) | Both depend on shared context step | +| `fan-out`/`hub-spoke` for simple parallel workers | Use `dag` instead | +| `pipeline` but expecting auto-supervisor | Only hub patterns auto-harden.
Use `.pattern('supervisor')` | +| Workers without `preset: 'worker'` in one-shot DAG lead+worker flows | Add preset for clean stdout when chaining `{{steps.X.output}}` (not needed for interactive team patterns) | +| Using `_` in YAML numbers (`timeoutMs: 1_200_000`) | YAML doesn't support `_` separators | +| Workflow timeout under 30 min for complex workflows | Use `3600000` (1 hour) as default | +| Using `require()` in ESM projects | Check `package.json` for `"type": "module"` — use `import` if ESM | +| Wrapping in `async function main()` in ESM | ESM supports top-level `await` — no wrapper needed | +| Using `createWorkflowRenderer` | Does not exist. Use `.run({ cwd: process.cwd() })` | +| `export default workflow(...)...build()` | No `.build()`. Chain ends with `.run()` — the file must call `.run()`, not just export config | +| Relative import `'../workflows/builder.js'` | Use `import { workflow } from '@agent-relay/sdk/workflows'` | +| Hardcoded model strings (`model: 'opus'`) | Use constants: `import { ClaudeModels } from '@agent-relay/config'` → `model: ClaudeModels.OPUS` | +| Thinking `agent-relay run` inspects exports | It executes the file as a subprocess. Only `.run()` invocations trigger steps | +| `pattern('single')` on cloud runner | Not supported — use `dag` | +| `pattern('supervisor')` with one agent | Same agent is owner + specialist. Use `dag` | +| Invalid verification type (`type: 'deterministic'`) | Only `exit_code`, `output_contains`, `file_exists`, `custom` are valid | +| Chaining `{{steps.X.output}}` from interactive agents | PTY output is garbled. Use deterministic steps or `preset: 'worker'` | +| Single step editing 4+ files | Agents modify 1-2 then exit. Split to one file per step with verify gates | +| Relying on agents to `git commit` | Agents emit markers without running git. 
Use deterministic commit step | +| File-writing steps without `file_exists` verification | `exit_code` auto-passes even if no file written | +| Manual peer fanout in `handleChannelMessage()` | Use broker-managed channel subscriptions — broker fans out to all subscribers automatically | +| Client-side `personaNames.has(from)` filtering | Use `relay.subscribe()`/`relay.unsubscribe()` — only subscribed agents receive messages | +| Agents receiving noisy cross-channel messages during focused work | Use `relay.mute({ agent, channel })` to silence non-primary channels without leaving them | +| Hardcoding all channels at spawn time | Use `agent.subscribe()` / `agent.unsubscribe()` for dynamic channel membership post-spawn | +| Using `preset: 'worker'` for Codex in _interactive team_ patterns when coordination is needed | Codex interactive mode works fine with PTY channel injection. Drop the preset for interactive team patterns (keep it for one-shot DAG workers where clean stdout matters) | +| Separate reviewer agent from lead in interactive team | Merge lead + reviewer into one interactive Claude agent — reviews between rounds, fewer agents | +| Not printing PR URL after `createGitHubStep({ action: 'createPR' })` | Capture `html_url` with `output: { mode: 'data', format: 'json', path: 'html_url' }` and echo or write it in a final deterministic step | +| Workflow ending without worktree + PR for cross-repo changes | Add `setup-worktree` at start and `push-and-pr` + `cleanup-worktree` at end | ## YAML Alternative diff --git a/.github/workflows/verify-publish-sdk.yml b/.github/workflows/verify-publish-sdk.yml index d78ee790d..8d7b3a2bb 100644 --- a/.github/workflows/verify-publish-sdk.yml +++ b/.github/workflows/verify-publish-sdk.yml @@ -85,12 +85,20 @@ jobs: # The SDK pins internal runtime deps to the exact same release version; # wait for those too so npm/Bun installs cannot resolve a half-published # SDK that later fails at runtime inside @agent-relay/sdk/workflows. 
- - name: Wait for SDK and internal deps on registry + # + # The matching broker package is included because npm silently skips + # optionalDependencies that 404 — if the broker tarball hasn't reached + # the CDN edge serving this runner yet, install would "succeed" with + # no broker installed and the next step would fail confusingly. + # `npm view version` and the actual tarball fetch can hit + # different caches, so additionally probe the tarball URL directly. + - name: Wait for SDK, internal deps, and matching broker on registry shell: bash run: | set -euo pipefail SPEC="${{ steps.spec.outputs.spec }}" VERSION="${{ steps.spec.outputs.version }}" + EXPECTED_BROKER="${{ matrix.expected_pkg }}" if [ "$VERSION" = "latest" ]; then echo "version=latest — no wait needed" exit 0 @@ -102,9 +110,10 @@ jobs: "@agent-relay/github-primitive@$VERSION" "@agent-relay/slack-primitive@$VERSION" "@agent-relay/workflow-types@$VERSION" + "${EXPECTED_BROKER}@$VERSION" ) - for i in 1 2 3 4 5 6; do + for i in 1 2 3 4 5 6 7; do missing=() for pkg in "${PACKAGES[@]}"; do if ! npm view "$pkg" version >/dev/null 2>&1; then @@ -112,8 +121,24 @@ jobs: fi done + # Even when `npm view` succeeds, the tarball may still 404 from + # the CDN edge that npm install hits. Probe the tarball URL of + # the matching broker directly. 
if [ "${#missing[@]}" -eq 0 ]; then - echo "registry has SDK and internal deps for $VERSION" + TARBALL_URL=$(npm view "${EXPECTED_BROKER}@$VERSION" dist.tarball 2>/dev/null || true) + if [ -z "$TARBALL_URL" ]; then + missing+=("${EXPECTED_BROKER}@$VERSION (no tarball url)") + else + STATUS=$(curl -sS -o /dev/null -w "%{http_code}" -I \ + --connect-timeout 5 --max-time 15 "$TARBALL_URL" || echo "000") + if [ "$STATUS" != "200" ]; then + missing+=("${EXPECTED_BROKER}@$VERSION (tarball HTTP $STATUS)") + fi + fi + fi + + if [ "${#missing[@]}" -eq 0 ]; then + echo "registry has SDK, internal deps, and ${EXPECTED_BROKER} for $VERSION" exit 0 fi diff --git a/.gitignore b/.gitignore index 87909f171..f2f71f14a 100644 --- a/.gitignore +++ b/.gitignore @@ -87,6 +87,7 @@ web/waitlist.json web/.open-next .msd/ /workflows/* +.workflow-artifacts/* !/workflows/ci/ !/workflows/refactor/ !/workflows/relayauth-integration/ diff --git a/package-lock.json b/package-lock.json index 21defb95b..82da6ac92 100644 --- a/package-lock.json +++ b/package-lock.json @@ -35,7 +35,7 @@ "@relayfile/local-mount": "^0.2.2", "@relayfile/sdk": "^0.6.0", "@sinclair/typebox": "^0.34.14", - "agent-trajectories": "^0.5.7", + "agent-trajectories": "^0.5.8", "chalk": "^4.1.2", "chokidar": "^5.0.0", "commander": "^12.1.0", @@ -6342,9 +6342,9 @@ } }, "node_modules/agent-trajectories": { - "version": "0.5.7", - "resolved": "https://registry.npmjs.org/agent-trajectories/-/agent-trajectories-0.5.7.tgz", - "integrity": "sha512-XOD35i7h6JftQytnG0F1mI0FndqR8QuhO73xEsqXa+bI3dyK+sQ3+s4zhKtZxhR9pQZRkRV3nclkSmp6EjQfPQ==", + "version": "0.5.8", + "resolved": "https://registry.npmjs.org/agent-trajectories/-/agent-trajectories-0.5.8.tgz", + "integrity": "sha512-Cu/+uyxAy+eNSlpzuOhk62kM/i0BdlfG8Z4avyzfbHbQ3I9EQLqiUikl3WcG75m3v+4MwTbJq9e6YTG8/ykKPw==", "license": "MIT", "dependencies": { "@clack/prompts": "^0.7.0", diff --git a/package.json b/package.json index 78a0edbb7..3cdcc4727 100644 --- a/package.json +++ b/package.json 
@@ -147,7 +147,7 @@ "@relayfile/local-mount": "^0.2.2", "@relayfile/sdk": "^0.6.0", "@sinclair/typebox": "^0.34.14", - "agent-trajectories": "^0.5.7", + "agent-trajectories": "^0.5.8", "chalk": "^4.1.2", "chokidar": "^5.0.0", "commander": "^12.1.0", diff --git a/prpm.lock b/prpm.lock index 844cbb1ea..5f4f9d8f2 100644 --- a/prpm.lock +++ b/prpm.lock @@ -154,9 +154,9 @@ "installedPath": ".claude/skills/choosing-swarm-patterns/SKILL.md" }, "@agent-relay/writing-agent-relay-workflows#claude": { - "version": "1.5.1", - "resolved": "https://registry.prpm.dev/api/v1/packages/%40agent-relay%2Fwriting-agent-relay-workflows/1.5.1.tar.gz", - "integrity": "sha256-745f43edd930c5f930065e8f92a3f2fae6ef902dd768576d574f9c3a50837042", + "version": "1.6.2", + "resolved": "https://registry.prpm.dev/api/v1/packages/%40agent-relay%2Fwriting-agent-relay-workflows/1.6.2.tar.gz", + "integrity": "sha256-2394f6b0cd6ef65e871720646d23b9406a2193a314bd82e1f9a6eb0416c672ed", "format": "claude", "subtype": "skill", "sourceFormat": "claude", @@ -194,9 +194,9 @@ "installedPath": ".agents/skills/choosing-swarm-patterns/SKILL.md" }, "@agent-relay/writing-agent-relay-workflows#codex": { - "version": "1.5.1", - "resolved": "https://registry.prpm.dev/api/v1/packages/%40agent-relay%2Fwriting-agent-relay-workflows/1.5.1.tar.gz", - "integrity": "sha256-745f43edd930c5f930065e8f92a3f2fae6ef902dd768576d574f9c3a50837042", + "version": "1.6.2", + "resolved": "https://registry.prpm.dev/api/v1/packages/%40agent-relay%2Fwriting-agent-relay-workflows/1.6.2.tar.gz", + "integrity": "sha256-2394f6b0cd6ef65e871720646d23b9406a2193a314bd82e1f9a6eb0416c672ed", "format": "codex", "subtype": "skill", "sourceFormat": "claude", @@ -244,5 +244,5 @@ "installedPath": ".agents/skills/relay-80-100-workflow/SKILL.md" } }, - "generated": "2026-05-04T08:36:28.666Z" + "generated": "2026-05-09T09:15:37.156Z" } \ No newline at end of file