sdk: typed action result predicates, zod agentResultSchema, orchestration docs #1082
Closes three DX gaps found while building an orchestrator against the v8 SDK:
- @agent-relay/sdk: relay.registerAction(...) now returns a TypedActionHandle
whose completed()/failed()/invoked()/denied() predicates carry the
registration's input/output types into addListener handlers, so orchestrators
consume their own action results without string keys or `as` casts. The
string path (relay.action(name)) and ActionHandle.unregister() are unchanged.
- @agent-relay/harness-driver: SpawnedAgentHandle.waitForResult() resolves with
the structured result submitted through the submit_result MCP tool (replays
from broker event history; resolves on exit or timeout), and agentResultSchema
now accepts zod-style validators, coerced to JSON Schema via the SDK's
actionSchemaToJsonSchema before the spawn request reaches the broker.
- CHANGELOG: the structured-result entry now describes what actually shipped
(it advertised agent.waitForResult(), result.onResult, and
addListener('agentResult') which did not exist).
- docs: new orchestrating-with-actions page covering the request →
typed-callback pipeline pattern, with typed-handle examples and cross-links
in actions and event-handlers.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
📝 Walkthrough
This PR adds `SpawnedAgentHandle.waitForResult` and `AgentResultInfo`, widens and coerces `agentResultSchema` inputs to JSON Schema, introduces typed action-event predicates and `TypedActionHandle` in the SDK (with wiring and tests), and updates docs, changelog, and navigation for orchestrating with typed actions.
Changes: Structured results and typed action listeners
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: baef66a3c8
      schema: SpawnPtyInput['agentResultSchema']
    ): Record<string, unknown> | boolean | undefined {
      if (schema === undefined || typeof schema === 'boolean') return schema;
      return actionSchemaToJsonSchema(schema as ActionSchema);
Preserve validator schemas for agent results
When callers pass a zod v3 schema or any custom safeParse validator without toJSONSchema, this helper delegates to actionSchemaToJsonSchema, whose fallback is a permissive { type: 'object' } descriptor. Unlike actions, agent results have no local validation fallback after submission, so the broker either accepts malformed object results or rejects valid non-object results even though the new AgentResultSchema type says these validators are supported. Please either perform a real conversion for the supported validators or reject/omit unconvertible validators instead of silently changing their contract.
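The "reject unconvertible validators" option from this comment could look roughly like the sketch below. It is a self-contained model, not the SDK's code: `ValidatorLike`, `ResultSchemaInput`, and `coerceResultSchemaStrict` are hypothetical names, and the optional `toJSONSchema()` hook is an assumption taken from the wording of the comment rather than from the real `ActionSchema` type.

```typescript
// Hypothetical shapes for illustration; the real SDK types may differ.
type JsonSchema = Record<string, unknown>;

interface ValidatorLike {
  safeParse(value: unknown): { success: boolean };
  // Assumed optional conversion hook, as implied by the review comment.
  toJSONSchema?: () => JsonSchema;
}

type ResultSchemaInput = JsonSchema | boolean | ValidatorLike | undefined;

// Strict variant: never degrade a validator to a permissive { type: 'object' }.
function coerceResultSchemaStrict(
  schema: ResultSchemaInput,
): JsonSchema | boolean | undefined {
  // undefined and boolean schemas pass through unchanged.
  if (schema === undefined || typeof schema === "boolean") return schema;

  const candidate = schema as Partial<ValidatorLike>;
  // No safeParse method: treat the value as raw JSON Schema.
  if (typeof candidate.safeParse !== "function") return schema as JsonSchema;

  // A validator that can describe itself is converted.
  if (typeof candidate.toJSONSchema === "function") return candidate.toJSONSchema();

  // A validator that cannot be converted is rejected instead of silently loosened.
  throw new TypeError(
    "agentResultSchema: validator cannot be converted to JSON Schema",
  );
}
```

Throwing at spawn time surfaces the contract mismatch to the caller before the broker sees the request; omitting the schema instead would be the quieter alternative the comment also mentions.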
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/harness-driver/src/agent-result.test.ts (1)
37-81: ⚡ Quick win: Consider adding test coverage for multiple results in history.
The current replay test (lines 54-60) only covers a single result in history. Given that agents can stream interim results (per the doc comment's mention of checking the `final` flag), consider adding a test case that pre-populates history with multiple `agent_result` events to verify which one is replayed.
🧪 Suggested additional test case
+  it('replays the last result when multiple exist in history', async () => {
+    const stub = createStubClient([
+      resultEvent('worker', { interim: 1 }, false),
+      resultEvent('worker', { interim: 2 }, false),
+      resultEvent('worker', { final: 3 }, true),
+    ]);
+    const handle = createHandle(stub);
+
+    const info = await handle.waitForResult();
+    // Current implementation uses getLastEvent, so expects the last result
+    expect(info).toMatchObject({ reason: 'result', data: { final: 3 }, final: true });
+  });
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/harness-driver/src/agent-handle.ts`:
- Around line 162-201: The replay branch in waitForResult uses
this.client.getLastEvent('agent_result', this.name) which returns the most
recent agent_result, but the live path resolves on the first subsequent
agent_result; make the replay behavior symmetric by finding the first matching
agent_result in the broker history instead of the last: after calling
this.client.connectEvents() iterate the client's event history (e.g. eventBuffer
or the public API that exposes past events) from the start to find the first
event with kind === 'agent_result' and name === this.name and then return
Promise.resolve(toResultInfo<T>(thatEvent)); keep the rest of waitForResult
(timer, onEvent subscriber, matchExit handling) unchanged so live behavior still
resolves on the first new result.
---
Nitpick comments:
In `@packages/harness-driver/src/agent-result.test.ts`:
- Around line 37-81: Add a test that pre-populates the stub broker history with
multiple agent_result events for the same agent to ensure waitForResult replays
the correct event: use createStubClient([...resultEvent('worker', {score: 1,
final: false}), resultEvent('worker', {score: 2, final: false}),
resultEvent('worker', {score: 3, final: true})]) then createHandle(stub) and
await handle.waitForResult() and assert the returned info matches the final
result (e.g., data.score === 3 and final === true); reference createStubClient,
resultEvent, createHandle and SpawnedAgentHandle.waitForResult in the new test.
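The two comments above pull in different directions: the inline comment asks replay to return the first matching `agent_result`, while the suggested test asserts the last one. The difference can be seen in a small self-contained model. `HistoryEvent`, `replayFirst`, and `replayLast` are hypothetical names for illustration; the real event type and history API in harness-driver may differ.

```typescript
// Hypothetical event shape; field names follow the review text, not the real types.
interface HistoryEvent {
  kind: string;
  name: string;
  data?: unknown;
  final?: boolean;
}

const isResultFor = (agentName: string) => (e: HistoryEvent): boolean =>
  e.kind === "agent_result" && e.name === agentName;

// Policy proposed by the inline comment: replay the earliest matching result.
function replayFirst(events: HistoryEvent[], agentName: string): HistoryEvent | undefined {
  return events.find(isResultFor(agentName));
}

// Policy the review attributes to the current code (getLastEvent): replay the latest.
function replayLast(events: HistoryEvent[], agentName: string): HistoryEvent | undefined {
  const matches = events.filter(isResultFor(agentName));
  return matches[matches.length - 1];
}

const brokerHistory: HistoryEvent[] = [
  { kind: "agent_result", name: "worker", data: { score: 1 }, final: false },
  { kind: "agent_result", name: "other", data: { score: 9 }, final: true },
  { kind: "agent_result", name: "worker", data: { score: 2 }, final: false },
  { kind: "agent_result", name: "worker", data: { score: 3 }, final: true },
];

const firstReplayed = replayFirst(brokerHistory, "worker");
const lastReplayed = replayLast(brokerHistory, "worker");
console.log(firstReplayed?.final, lastReplayed?.final); // false true
```

With interim results in history, first-match replay hands the caller an interim payload and last-match replay hands back the final one, so whichever policy is chosen, the new test should assert it explicitly.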
📒 Files selected for processing (14)
- CHANGELOG.md
- packages/harness-driver/src/agent-handle.ts
- packages/harness-driver/src/agent-result.test.ts
- packages/harness-driver/src/client.ts
- packages/harness-driver/src/types.ts
- packages/sdk/package.json
- packages/sdk/src/__tests__/typed-action-handle.test.ts
- packages/sdk/src/agent-relay.ts
- packages/sdk/src/facade.ts
- packages/sdk/src/listeners.ts
- web/content/docs/actions.mdx
- web/content/docs/event-handlers.mdx
- web/content/docs/orchestrating-with-actions.mdx
- web/lib/docs-nav.ts
Motivation
Building an orchestrator (relay-scout) against `@agent-relay/sdk` v8 surfaced three DX gaps in the actions/results surface:
- Action results could only be consumed through the string path (`relay.action('name').completed()`) with a cast of the event's `output: unknown`; the generics passed to `registerAction` never reached the subscription.
- The CHANGELOG advertised a structured-result API (`agent.waitForResult()`, per-spawn `result.onResult`, `relay.addListener('agentResult', ...)`) that does not exist anywhere in `packages/*`; only the `submit_result` MCP tool, the broker's `agent_result` event, and the JsonSchema-only `agentResultSchema` spawn field had shipped.
- The docs had no page covering the orchestration pattern (request → typed-callback pipeline).
What changed
Gap 1: typed action handles (`@agent-relay/sdk`)
- `relay.registerAction(...)` now returns a `TypedActionHandle<TInput, TOutput>`: `handle.invoked()` / `completed()` / `failed()` / `denied()` build phase-bound predicates (with `calledBy(...)`, matching the existing predicate-builder style) whose events carry the registration's `input`/`output` types; events use the public shape (`agent` for the caller), consistent with the documented `addListener` surface.
- `addListener` gains a generic overload, `addListener<TEvent>(selector: ListenerPredicate<TEvent>, handler: (e: TEvent) => void)`, on `ListenerHub`, `AgentRelay`, and `AgentRelayAgent`; the existing string/predicate overload is kept verbatim as a fallback, so this is semver-minor.
- `relay.action(name)` and `ActionHandle.unregister()` are unchanged; the handle adds members, removes none.
- Tests: `packages/sdk/src/__tests__/typed-action-handle.test.ts` (9 tests: runtime delivery through the hub for all four phases, caller filtering, action isolation, unregister, legacy string-path coexistence, plus inline type-level assertions checked by `tsc`).
Gap 2: structured-result contract (`@agent-relay/harness-driver` + CHANGELOG)
Investigation: the broker plumbing IS real end-to-end. The `submit_result` MCP tool in the CLI posts to the broker's listen API (`crates/broker/src/listen_api.rs`, `SubmitAgentResult`), the broker emits `BrokerEvent::AgentResult`, and it arrives on the harness-driver WS stream as `kind: 'agent_result'`. What was missing was any consumer API: nothing bridges broker events onto the harness-driver event bus, so the advertised `addListener('agentResult')` never fires, and `waitForResult` / `result.onResult` existed nowhere.
- `waitForResult()` (the small-lift branch): `SpawnedAgentHandle.waitForResult<T>(timeoutMs?)` sits beside `waitForExit` / `waitForIdle`, resolves with `{ reason: 'result', resultId, data, final, metadata }` from the `agent_result` broker event, replays from event history when the result already landed, and resolves `{ reason: 'exited' }` / `{ reason: 'timeout' }` so it can never hang on a dead worker.
- `agentResultSchema` accepts zod: the spawn inputs' field is now `JsonSchema | ZodLikeSchema | SafeParseSchema`; validators are coerced with the SDK's existing `actionSchemaToJsonSchema` before the spawn body reaches the broker (raw JSON Schema and boolean schemas pass through unchanged).
- CHANGELOG: the structured-result entry now describes what shipped (`agentResultSchema` incl. zod, `submit_result`, `waitForResult()`), and a new bullet covers the typed handles.
- Tests: `packages/harness-driver/src/agent-result.test.ts` (8 tests: result/replay/exit/timeout for `waitForResult`, and spawn-body schema coercion for pty + cli spawns via a mocked fetch).
Gap 3: orchestration docs (web only, no changelog entry)
- `web/content/docs/orchestrating-with-actions.mdx` (+ nav entry in the Automation group) walks through the lead-scoring-crew pattern end-to-end: zod input = the worker's typed result shape, task assignment over messaging, validated action reporting (with the DM-the-caller note since actions are fire-and-forget), pipeline advancement via `handle.completed()`, and failure/stall handling via `handle.failed()` + a timeout guard.
- `actions.mdx` ("React to completion") and `event-handlers.mdx` ("Action listeners") now lead with the typed-handle form and cross-link the new page.
Deliberately not done
- `result.onResult` or `relay.addListener('agentResult', ...)`: the harness-driver event bus has no broker-event bridge at all today (none of the declared broker events, `agentSpawned`, `agentIdle`, `agentResult`, …, are ever emitted onto it); building that bridge is a separate, larger piece of work. The changelog no longer claims these exist.
- `waitForResult()` resolves on the first submitted result; interim (`final: false`) streaming results are surfaced via the `final` flag rather than a multi-shot API, keeping this minimal.
- `package-lock.json` churn from `npm install` (no dependency changes in this PR).
Test evidence
- `npm --prefix packages/sdk run build` ✓, `npm --prefix packages/sdk test` ✓ (10 files, 85 tests), `npm --prefix packages/sdk run check` ✓
- `npm --prefix packages/harness-driver run build` ✓ and `run check` ✓ (no `test` script exists; its co-located tests run through the root vitest suite)
- `npx vitest run` ✓: 61 files passed, 806 tests passed (includes the new harness-driver tests)
- `npm --prefix web test` ✓ (6 files, 15 tests)
- `npm run typecheck` ✓ (config, cloud, utils, policy, sdk, harness-driver, harnesses, cli)
🤖 Generated with Claude Code
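For readers who have not seen the SDK change, the idea behind Gap 1 (phase-bound predicates that carry the registration's types into the handler) can be modelled in a few lines. This is an illustrative sketch only: `MiniHub`, `typedHandle`, `isActionEvent`, and the `matches` shape of `ListenerPredicate` are assumptions made for the example, and only the `addListener<TEvent>(selector, handler)` signature is taken from the description above.

```typescript
// Hypothetical model of phase-bound, typed predicates. Not the SDK's code.
type Phase = "invoked" | "completed" | "failed" | "denied";

interface ActionEvent<TInput, TOutput> {
  action: string;
  phase: Phase;
  agent: string; // the caller, per the public event shape
  input: TInput;
  output?: TOutput;
}

// Assumed predicate shape: a type guard that carries the event type.
interface ListenerPredicate<TEvent> {
  matches(event: unknown): event is TEvent;
}

class MiniHub {
  private listeners: Array<(event: unknown) => void> = [];

  addListener<TEvent>(
    selector: ListenerPredicate<TEvent>,
    handler: (e: TEvent) => void,
  ): () => void {
    const listener = (event: unknown): void => {
      if (selector.matches(event)) handler(event);
    };
    this.listeners.push(listener);
    // Returns an unsubscribe function.
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  emit(event: unknown): void {
    for (const listener of [...this.listeners]) listener(event);
  }
}

function isActionEvent(e: unknown): e is ActionEvent<unknown, unknown> {
  return typeof e === "object" && e !== null && "action" in e && "phase" in e;
}

// Builds predicates bound to one action name and one phase each.
function typedHandle<TInput, TOutput>(actionName: string) {
  const phase = (p: Phase): ListenerPredicate<ActionEvent<TInput, TOutput>> => ({
    matches: (e: unknown): e is ActionEvent<TInput, TOutput> =>
      isActionEvent(e) && e.action === actionName && e.phase === p,
  });
  return {
    invoked: () => phase("invoked"),
    completed: () => phase("completed"),
    failed: () => phase("failed"),
    denied: () => phase("denied"),
  };
}

const hub = new MiniHub();
const scoreLead = typedHandle<{ leadId: string }, { score: number }>("score-lead");
const scores: number[] = [];

hub.addListener(scoreLead.completed(), (e) => {
  // e.output is typed as { score: number } | undefined, so no cast is needed.
  if (e.output) scores.push(e.output.score);
});

hub.emit({ action: "score-lead", phase: "completed", agent: "worker-1", input: { leadId: "a1" }, output: { score: 87 } });
hub.emit({ action: "score-lead", phase: "failed", agent: "worker-1", input: { leadId: "a2" } });
hub.emit({ action: "enrich-lead", phase: "completed", agent: "worker-2", input: {}, output: { score: 5 } });
console.log(scores); // [ 87 ]
```

In this model the guard checks only the action name and phase, so the payload types are a compile-time promise from the registration; runtime validation of `input` and `output` would have to happen elsewhere.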