fix(orchestrating-agent-relay): correct ACK target + broker-lifecycle troubleshooting #52
Across multiple fresh runs, orchestrators told workers to DM `broker` for
ACK, but `broker` is the broker's internal routing self-name, not a
DM-able agent (workers hit `Agent "broker" not found`). Make ACK/DONE
target explicitly `orchestrator`/`#general`. Add Common Mistakes rows
for: half-started broker (process alive, status STOPPED, missing
connection metadata), the status-vs-who stale-connection contradiction,
and unresolved ${RELAY_API_KEY} template auth failures — each pointing
at `agent-relay doctor`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
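The ACK/DONE targeting rule described in the commit message can be sketched as a small guard. This is only an illustration: `check_ack_target`, `VALID_ACK_TARGETS`, and `RESERVED_NAMES` are hypothetical names, not part of agent-relay or the skill. Only the target names `orchestrator`, `#general`, and `broker` come from this PR.

```python
# Hypothetical helper: illustrative only, not an agent-relay API.
VALID_ACK_TARGETS = {"orchestrator", "#general"}
RESERVED_NAMES = {"broker"}  # the broker's internal routing self-name, per this PR


def check_ack_target(target: str) -> str:
    """Return the cleaned target if ACK/DONE may be sent to it, else raise ValueError."""
    name = target.strip()
    if name in RESERVED_NAMES:
        # Mirrors the failure workers saw: Agent "broker" not found.
        raise ValueError(f'"{name}" is not a DM-able agent; use orchestrator or #general')
    if name not in VALID_ACK_TARGETS:
        raise ValueError(f'unexpected ACK target "{name}"')
    return name
```

An orchestrator prompt builder could call such a guard before telling workers where to send ACK, so a `broker` target fails early instead of at send time.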
CodeRabbit walkthrough: This PR updates the orchestrating-agent-relay SKILL.md documentation to clarify the correct protocol targets for worker ACK/DONE responses (`orchestrator` or …).
Changes: Orchestrator Relay Protocol and Troubleshooting Guide
Summary
Hardens the `orchestrating-agent-relay` skill against friction observed repeatedly in fresh runs.

1. Wrong ACK target (the recurring one)

The skill led orchestrators to tell workers to DM `broker` for ACK. `broker` is the broker's internal routing self-name (`crates/broker` `workspace.rs`/`routing.rs` in the relay repo), not a spawnable/DM-able agent. Every codex/claude worker hit `Agent "broker" not found` and recovered via `#general`. Protocol now states ACK/DONE go to `orchestrator` (the auto-registered spawning identity) or `#general`, never `broker`.

2. Broker-lifecycle troubleshooting (Common Mistakes rows)

- `status` says STOPPED and `Failed to read broker connection metadata`: `up` spawned a broker whose readiness timed out and was never reaped; retrying `up` doesn't clean the orphan. Documents the `pkill`/`down --force` + clean `.agent-relay/` recovery.
- `status` vs `who` contradiction: `status` (state file) says RUNNING/Agents:N while `who`/`send`/`replies` (live RPC) fail, caused by a stale `.agent-relay/connection.json`, a wrong port, or a second broker. Recovery + `agent-relay doctor`.
- `Invalid agent token` from the orchestrator CLI while broker+workers keep working: an unresolved `${RELAY_API_KEY}` template used as a literal key.

Docs only. (A relay-side PR #914 hardens the corresponding broker/`doctor` behavior; this is the skill-guidance counterpart.)

🤖 Generated with Claude Code
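The three Common Mistakes rows above amount to a small decision table, sketched here as a classifier. Everything in it is an assumption made for illustration: the function `diagnose`, its parameters, and the returned labels are hypothetical and do not reflect how `agent-relay doctor` works. Only the symptoms (STOPPED with unreadable connection metadata, RUNNING with failing live RPC, a literal `${RELAY_API_KEY}`) come from this PR.

```python
import re

# Matches a shell-style placeholder such as ${RELAY_API_KEY} left unexpanded in a value.
_PLACEHOLDER = re.compile(r"\$\{[A-Za-z_][A-Za-z0-9_]*\}")


def diagnose(status_state: str, process_alive: bool, metadata_readable: bool,
             live_rpc_ok: bool, api_key: str) -> str:
    """Map observed symptoms to one of the troubleshooting rows (hypothetical labels)."""
    if _PLACEHOLDER.search(api_key):
        # Template was never substituted, so the literal text is sent as the key.
        return "unresolved-template-key"
    if status_state == "STOPPED" and process_alive and not metadata_readable:
        # Broker process exists but never became ready: the half-started case.
        return "half-started-broker"
    if status_state == "RUNNING" and not live_rpc_ok:
        # State file and live RPC disagree: stale connection file, wrong port, or second broker.
        return "stale-connection"
    return "ok"
```

In each non-`ok` case the skill guidance points at `agent-relay doctor`; a sketch like this only helps decide which row of the table to read first.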