sdk: typed action result predicates, zod agentResultSchema, orchestration docs #1082

Merged
willwashburn merged 6 commits into main from feat/typed-action-results on Jun 11, 2026
Conversation

@willwashburn

Member

Motivation

Building an orchestrator (relay-scout) against @agent-relay/sdk v8 surfaced three DX gaps in the actions/results surface:

  1. To consume the results of an action you registered yourself, you had to re-key it by string (relay.action('name').completed()) and cast the event's output: unknown — the generics passed to registerAction never reached the subscription.
  2. The root CHANGELOG advertised a structured-result contract (agent.waitForResult(), per-spawn result.onResult, relay.addListener('agentResult', ...)) that does not exist anywhere in packages/* — only the submit_result MCP tool, the broker's agent_result event, and the JsonSchema-only agentResultSchema spawn field had shipped.
  3. There was no doc describing the canonical request → typed-callback orchestration pattern.

What changed

Gap 1 — typed action handles (@agent-relay/sdk)

relay.registerAction(...) now returns a TypedActionHandle<TInput, TOutput>:

const handle = relay.registerAction({ name: 'x', input: InZod, output: OutZod, handler });
relay.addListener(handle.completed(), (event) => event.output.saved); // typed TOutput, no cast
relay.addListener(handle.failed(), (event) => event.error);
  • handle.invoked() / completed() / failed() / denied() build phase-bound predicates (with calledBy(...), matching the existing predicate-builder style) whose events carry the registration's input/output types; events use the public shape (agent for the caller), consistent with the documented addListener surface.
  • addListener gains a generic overload — addListener<TEvent>(selector: ListenerPredicate<TEvent>, handler: (e: TEvent) => void) — on ListenerHub, AgentRelay, and AgentRelayAgent; the existing string/predicate overload is kept verbatim as a fallback, so this is semver-minor.
  • Backward compatible: relay.action(name) and ActionHandle.unregister() are unchanged; the handle adds members, removes none.
  • Tests: packages/sdk/src/__tests__/typed-action-handle.test.ts (9 tests — runtime delivery through the hub for all four phases, caller filtering, action isolation, unregister, legacy string-path coexistence, plus inline type-level assertions checked by tsc).
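
To make the mechanism concrete, here is a minimal, self-contained sketch of how a predicate that carries its event type lets a generic addListener overload infer the handler's parameter. Every name in it (MiniHub, Predicate, typedHandle, emit, the RawEvent shape) is a hypothetical stand-in written for illustration and is not the SDK's implementation; only the idea of phase-bound predicates plus a generic overload comes from the description above.

```typescript
// Hypothetical stand-ins for illustration; not the @agent-relay/sdk implementation.
type Phase = 'invoked' | 'completed' | 'failed' | 'denied';

interface RawEvent {
  action: string;
  phase: Phase;
  agent: string;
  output?: unknown;
  error?: string;
}

// A predicate is a runtime filter plus a phantom field that only carries a type.
interface Predicate<TEvent> {
  matches(raw: RawEvent): boolean;
  readonly __event?: TEvent; // never assigned; exists so TEvent can be inferred
}

interface CompletedEvent<TOutput> { agent: string; output: TOutput }
interface FailedEvent { agent: string; error: string }

class MiniHub {
  private listeners: Array<{ p: Predicate<unknown>; h: (e: unknown) => void }> = [];

  // Generic signature: the handler's parameter type is inferred from the predicate.
  addListener<TEvent>(p: Predicate<TEvent>, h: (e: TEvent) => void): () => void {
    const entry = { p: p as Predicate<unknown>, h: h as (e: unknown) => void };
    this.listeners.push(entry);
    return () => { this.listeners = this.listeners.filter((l) => l !== entry); };
  }

  emit(raw: RawEvent): void {
    for (const l of this.listeners) if (l.p.matches(raw)) l.h(raw);
  }
}

// Builds phase-bound predicates for one action name.
function typedHandle<TOutput>(actionName: string) {
  const bind = <TEvent>(phase: Phase): Predicate<TEvent> => ({
    matches: (raw) => raw.action === actionName && raw.phase === phase,
  });
  return {
    completed: () => bind<CompletedEvent<TOutput>>('completed'),
    failed: () => bind<FailedEvent>('failed'),
  };
}

const hub = new MiniHub();
const saveLead = typedHandle<{ saved: boolean }>('save-lead');
const seen: boolean[] = [];
// event.output is typed as { saved: boolean } here, with no cast.
hub.addListener(saveLead.completed(), (event) => seen.push(event.output.saved));
hub.emit({ action: 'save-lead', phase: 'completed', agent: 'worker', output: { saved: true } });
hub.emit({ action: 'other', phase: 'completed', agent: 'worker', output: { saved: false } });
```

The second emit is filtered out by the action name, so the handler runs once. The sketch does no runtime validation of output; it only shows where the type comes from.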

Gap 2 — structured-result contract (@agent-relay/harness-driver + CHANGELOG)

Investigation: the broker plumbing IS real end-to-end — submit_result MCP tool in the CLI posts to the broker's listen API (crates/broker/src/listen_api.rs SubmitAgentResult), the broker emits BrokerEvent::AgentResult, and it arrives on the harness-driver WS stream as kind: 'agent_result'. What was missing was any consumer API: nothing bridges broker events onto the harness-driver event bus, so the advertised addListener('agentResult') never fires, and waitForResult / result.onResult existed nowhere.

  • Implemented waitForResult() (the small-lift branch): SpawnedAgentHandle.waitForResult<T>(timeoutMs?) sits beside waitForExit/waitForIdle, resolves with { reason: 'result', resultId, data, final, metadata } from the agent_result broker event, replays from event history when the result already landed, and resolves { reason: 'exited' } / { reason: 'timeout' } so it can never hang on a dead worker.
  • agentResultSchema accepts zod: the spawn inputs' field is now JsonSchema | ZodLikeSchema | SafeParseSchema; validators are coerced with the SDK's existing actionSchemaToJsonSchema before the spawn body reaches the broker (raw JSON Schema and boolean schemas pass through unchanged).
  • CHANGELOG corrected: the entry now describes exactly what ships (agentResultSchema incl. zod, submit_result, waitForResult()), and a new bullet covers the typed handles.
  • Tests: packages/harness-driver/src/agent-result.test.ts (8 tests — result/replay/exit/timeout for waitForResult, and spawn-body schema coercion for pty + cli spawns via a mocked fetch).
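
The settle rules for waitForResult() (replay, live result, exit, timeout) can be modelled in a few lines. This is a sketch under stated assumptions, not the harness-driver code: EventLog, the 'agent_exited' event kind, and the free-function signature are hypothetical, the metadata field is omitted for brevity, and replaying the first matching history entry is a choice made for this sketch.

```typescript
// Hypothetical model of the settle rules; not the harness-driver implementation.
type BrokerEvent =
  | { kind: 'agent_result'; name: string; resultId: string; data: unknown; final: boolean }
  | { kind: 'agent_exited'; name: string }; // assumed name for the exit event

type ResultInfo<T> =
  | { reason: 'result'; resultId: string; data: T; final: boolean }
  | { reason: 'exited' }
  | { reason: 'timeout' };

class EventLog {
  readonly past: BrokerEvent[] = [];
  private subs = new Set<(e: BrokerEvent) => void>();

  subscribe(fn: (e: BrokerEvent) => void): () => void {
    this.subs.add(fn);
    return () => { this.subs.delete(fn); };
  }

  push(e: BrokerEvent): void {
    this.past.push(e);
    for (const fn of Array.from(this.subs)) fn(e);
  }
}

function waitForResult<T>(log: EventLog, agent: string, timeoutMs?: number): Promise<ResultInfo<T>> {
  // Replay: a result that already landed resolves immediately.
  const landed = log.past.find((e) => e.kind === 'agent_result' && e.name === agent);
  if (landed && landed.kind === 'agent_result') {
    const info: ResultInfo<T> = {
      reason: 'result', resultId: landed.resultId, data: landed.data as T, final: landed.final,
    };
    return Promise.resolve(info);
  }
  return new Promise<ResultInfo<T>>((resolve) => {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const settle = (info: ResultInfo<T>): void => {
      if (timer !== undefined) clearTimeout(timer);
      unsubscribe();
      resolve(info);
    };
    const unsubscribe = log.subscribe((e) => {
      if (e.name !== agent) return;
      if (e.kind === 'agent_result') {
        settle({ reason: 'result', resultId: e.resultId, data: e.data as T, final: e.final });
      } else {
        settle({ reason: 'exited' }); // a dead worker settles the promise instead of hanging it
      }
    });
    if (timeoutMs !== undefined) {
      timer = setTimeout(() => settle({ reason: 'timeout' }), timeoutMs);
    }
  });
}
```

Callers branch on reason before touching data, which is why the three outcomes are modelled as a discriminated union rather than a rejected promise.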

Gap 3 — orchestration docs (web only, no changelog entry)

  • New page web/content/docs/orchestrating-with-actions.mdx (+ nav entry in the Automation group): the lead-scoring-crew pattern end-to-end — zod input = the worker's typed result shape, task assignment over messaging, validated action reporting (with the DM-the-caller note since actions are fire-and-forget), pipeline advancement via handle.completed(), and failure/stall handling via handle.failed() + a timeout guard.
  • actions.mdx ("React to completion") and event-handlers.mdx ("Action listeners") now lead with the typed-handle form and cross-link the new page.
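
As a rough illustration of the "timeout guard" half of the stall handling mentioned above, the sketch below tracks one deadline per assigned task. StallGuard and its method names are hypothetical and do not come from the new docs page; the clock is passed in explicitly so the behaviour is deterministic.

```typescript
// Hypothetical stall guard; names and shape are assumptions for illustration.
class StallGuard {
  private readonly timeoutMs: number;
  private deadlines = new Map<string, number>();

  constructor(timeoutMs: number) {
    this.timeoutMs = timeoutMs;
  }

  // Call when a task is handed to a worker.
  assigned(taskId: string, now: number): void {
    this.deadlines.set(taskId, now + this.timeoutMs);
  }

  // Call from the completed/failed listeners so settled tasks stop being tracked.
  settled(taskId: string): void {
    this.deadlines.delete(taskId);
  }

  // Returns the tasks whose deadline has passed without a completion or failure.
  stalled(now: number): string[] {
    return Array.from(this.deadlines.entries())
      .filter(([, due]) => now >= due)
      .map(([id]) => id);
  }
}

const guard = new StallGuard(1000);
guard.assigned('lead-a', 0);
guard.assigned('lead-b', 500);
const beforeDeadline = guard.stalled(999);
const atDeadline = guard.stalled(1000);
guard.settled('lead-a');
const afterSettle = guard.stalled(2000);
```

An orchestrator would typically poll stalled() on an interval and reassign or escalate whatever it returns.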

Deliberately not done

  • Did not implement result.onResult or relay.addListener('agentResult', ...): the harness-driver event bus has no broker-event bridge at all today (none of the declared broker events — agentSpawned, agentIdle, agentResult, … — are ever emitted onto it); building that bridge is a separate, larger piece of work. The changelog no longer claims these exist.
  • waitForResult() resolves on the first submitted result; interim (final: false) streaming results are surfaced via the final flag rather than a multi-shot API, keeping this minimal.
  • Reverted incidental package-lock.json churn from npm install (no dependency changes in this PR).

Test evidence

  • npm --prefix packages/sdk run build ✓, npm --prefix packages/sdk test ✓ (10 files, 85 tests), npm --prefix packages/sdk run check
  • npm --prefix packages/harness-driver run build ✓ and run check ✓ (no test script exists; its co-located tests run through the root vitest suite)
  • Root npx vitest run ✓ — 61 files passed, 806 tests passed (includes the new harness-driver tests)
  • npm --prefix web test ✓ (6 files, 15 tests)
  • Root npm run typecheck ✓ (config, cloud, utils, policy, sdk, harness-driver, harnesses, cli)

🤖 Generated with Claude Code

…tion docs

Closes three DX gaps found while building an orchestrator against the v8 SDK:

- @agent-relay/sdk: relay.registerAction(...) now returns a TypedActionHandle
  whose completed()/failed()/invoked()/denied() predicates carry the
  registration's input/output types into addListener handlers, so orchestrators
  consume their own action results without string keys or `as` casts. The
  string path (relay.action(name)) and ActionHandle.unregister() are unchanged.
- @agent-relay/harness-driver: SpawnedAgentHandle.waitForResult() resolves with
  the structured result submitted through the submit_result MCP tool (replays
  from broker event history; resolves on exit or timeout), and agentResultSchema
  now accepts zod-style validators, coerced to JSON Schema via the SDK's
  actionSchemaToJsonSchema before the spawn request reaches the broker.
- CHANGELOG: the structured-result entry now describes what actually shipped
  (it advertised agent.waitForResult(), result.onResult, and
  addListener('agentResult') which did not exist).
- docs: new orchestrating-with-actions page covering the request →
  typed-callback pipeline pattern, with typed-handle examples and cross-links
  in actions and event-handlers.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@willwashburn willwashburn requested a review from khaliqgant as a code owner June 10, 2026 16:49
@gemini-code-assist


Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 5cc7bd9c-2299-488b-88ac-4ae72c08b636

📥 Commits

Reviewing files that changed from the base of the PR and between d4c3157 and b039208.

📒 Files selected for processing (1)
  • packages/harness-driver/src/agent-result.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/harness-driver/src/agent-result.test.ts

📝 Walkthrough

This PR adds SpawnedAgentHandle.waitForResult and AgentResultInfo, widens and coerces agentResultSchema inputs to JSON Schema, introduces typed action-event predicates and TypedActionHandle in the SDK (with wiring and tests), and updates docs, changelog, and navigation for orchestrating with typed actions.

Changes

Structured results and typed action listeners

| Layer / File(s) | Summary |
| --- | --- |
| Harness driver result lifecycle: packages/harness-driver/src/agent-handle.ts, packages/harness-driver/src/agent-result.test.ts | SpawnedAgentHandle.waitForResult<T>() awaits first agent_result event from broker with ordered-history replay, live-listen fallback, timeout, or exit settlement; AgentResultInfo<T> represents result/exit/timeout outcomes; tests cover replay, live resolution, exit, and timeout. |
| Harness driver schema coercion: packages/harness-driver/src/types.ts, packages/harness-driver/src/client.ts, packages/harness-driver/src/agent-result.test.ts | Adds AgentResultSchema union (JSON Schema |
| SDK typed action listener API: packages/sdk/src/listeners.ts | Introduces typed action event interfaces (ActionInvokedEvent, ActionCompletedEvent, ActionFailedEvent, ActionDeniedEvent), TypedActionPredicate mapping/filtering, and TypedActionHandle with phase-bound predicates (invoked(), completed(), failed(), denied()); ListenerHub.addListener gains an overload accepting ListenerPredicate<TEvent>. |
| SDK relay and facade typed handle returns: packages/sdk/src/agent-relay.ts, packages/sdk/src/facade.ts | AgentRelay.registerAction() and registerFacadeAction() now return TypedActionHandle<TInput, TOutput>; agentRelayAgent casts addListener to the new typed overload; facade returns a createTypedActionHandle that wires unregister to relay unsubscribe and local handle cleanup. |
| SDK typed handle test coverage: packages/sdk/package.json, packages/sdk/src/__tests__/typed-action-handle.test.ts | Adds comprehensive Vitest suite verifying typed handle metadata, typed completed/failed/invoked/denied payloads and ordering, action/caller filtering, authorization denial, unregister behavior, and backward compatibility; test entry added to package.json. |
| Documentation and navigation updates: CHANGELOG.md, web/content/docs/actions.mdx, web/content/docs/event-handlers.mdx, web/content/docs/orchestrating-with-actions.mdx, web/lib/docs-nav.ts | CHANGELOG updated to mention waitForResult() and typed handle predicates; docs revised to show subscribing via handles and typed predicates, new orchestration guide added with examples for registration, messaging, completion/failure/stall handling; docs navigation updated with the new page. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • khaliqgant

Poem

A rabbit sips on typed delight,
Awaiting results under broker light.
Predicates hum, events align,
From invoke to completion, typed and fine.
🐇✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically summarizes the three main changes: typed action result predicates, zod support for agentResultSchema, and new orchestration documentation. |
| Description check | ✅ Passed | The description comprehensively covers all required sections: motivation (three DX gaps), what changed (Gap 1-3 with implementation details), deliberately not done, and test evidence with passing results. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: baef66a3c8


schema: SpawnPtyInput['agentResultSchema']
): Record<string, unknown> | boolean | undefined {
if (schema === undefined || typeof schema === 'boolean') return schema;
return actionSchemaToJsonSchema(schema as ActionSchema);


P2 Badge Preserve validator schemas for agent results

When callers pass a zod v3 schema or any custom safeParse validator without toJSONSchema, this helper delegates to actionSchemaToJsonSchema, whose fallback is a permissive { type: 'object' } descriptor. Unlike actions, agent results have no local validation fallback after submission, so the broker either accepts malformed object results or rejects valid non-object results even though the new AgentResultSchema type says these validators are supported. Please either perform a real conversion for the supported validators or reject/omit unconvertible validators instead of silently changing their contract.
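
One way to read the "reject unconvertible validators" option is sketched below. It is an illustration under assumptions, not the helper in client.ts: the ValidatorLike shape, the optional toJSONSchema instance method, and coerceResultSchema itself are hypothetical, and the real AgentResultSchema union and actionSchemaToJsonSchema may differ.

```typescript
// Hypothetical shapes for illustration; the real types in harness-driver may differ.
type JsonSchema = Record<string, unknown> | boolean;

interface ValidatorLike {
  safeParse(value: unknown): { success: boolean };
  toJSONSchema?: () => Record<string, unknown>; // assumed conversion hook
}

type ResultSchemaInput = JsonSchema | ValidatorLike;

function isValidator(schema: ResultSchemaInput): schema is ValidatorLike {
  return typeof schema === 'object' && schema !== null
    && typeof (schema as { safeParse?: unknown }).safeParse === 'function';
}

function coerceResultSchema(schema: ResultSchemaInput | undefined): JsonSchema | undefined {
  // undefined and boolean schemas pass through unchanged.
  if (schema === undefined || typeof schema === 'boolean') return schema;
  // Raw JSON Schema objects pass through unchanged.
  if (!isValidator(schema)) return schema;
  // Validators with a conversion hook are converted faithfully.
  if (typeof schema.toJSONSchema === 'function') return schema.toJSONSchema();
  // No faithful conversion: fail loudly rather than send a permissive { type: 'object' }.
  throw new TypeError('agentResultSchema validator cannot be converted to JSON Schema');
}
```

Throwing at spawn time keeps the caller's contract visible; silently omitting the schema would be the other option the comment mentions, at the cost of the broker validating nothing.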


@github-actions

Contributor

Preview deployed!

Environment URL
Web https://djcowmf1837o8.cloudfront.net

This preview will be cleaned up when the PR is merged or closed.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/harness-driver/src/agent-result.test.ts (1)

37-81: ⚡ Quick win

Consider adding test coverage for multiple results in history.

The current replay test (lines 54-60) only covers a single result in history. Given that agents can stream interim results (per the doc comment's mention of checking the final flag), consider adding a test case that pre-populates history with multiple agent_result events to verify which one is replayed.

🧪 Suggested additional test case
+  it('replays the last result when multiple exist in history', async () => {
+    const stub = createStubClient([
+      resultEvent('worker', { interim: 1 }, false),
+      resultEvent('worker', { interim: 2 }, false),
+      resultEvent('worker', { final: 3 }, true),
+    ]);
+    const handle = createHandle(stub);
+
+    const info = await handle.waitForResult();
+    // Current implementation uses getLastEvent, so expects the last result
+    expect(info).toMatchObject({ reason: 'result', data: { final: 3 }, final: true });
+  });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/harness-driver/src/agent-result.test.ts` around lines 37 - 81, Add a
test that pre-populates the stub broker history with multiple agent_result
events for the same agent to ensure waitForResult replays the correct event: use
createStubClient([...resultEvent('worker', {score: 1, final: false}),
resultEvent('worker', {score: 2, final: false}), resultEvent('worker', {score:
3, final: true})]) then createHandle(stub) and await handle.waitForResult() and
assert the returned info matches the final result (e.g., data.score === 3 and
final === true); reference createStubClient, resultEvent, createHandle and
SpawnedAgentHandle.waitForResult in the new test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/harness-driver/src/agent-handle.ts`:
- Around line 162-201: The replay branch in waitForResult uses
this.client.getLastEvent('agent_result', this.name) which returns the most
recent agent_result, but the live path resolves on the first subsequent
agent_result; make the replay behavior symmetric by finding the first matching
agent_result in the broker history instead of the last: after calling
this.client.connectEvents() iterate the client's event history (e.g. eventBuffer
or the public API that exposes past events) from the start to find the first
event with kind === 'agent_result' and name === this.name and then return
Promise.resolve(toResultInfo<T>(thatEvent)); keep the rest of waitForResult
(timer, onEvent subscriber, matchExit handling) unchanged so live behavior still
resolves on the first new result.
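
The asymmetry is easiest to see on a small history. The sketch below is a toy model, assuming a plain array of past events and a hypothetical 'agent_idle' kind for the non-result entry; firstResult and lastResult are illustrative helpers, not client APIs.

```typescript
// Toy model of the replay choice; helper names and event shapes are assumptions.
interface ResultEvent { kind: 'agent_result'; name: string; data: unknown; final: boolean }
interface IdleEvent { kind: 'agent_idle'; name: string }
type PastEvent = ResultEvent | IdleEvent;

const isResultFor = (agent: string) => (e: PastEvent): e is ResultEvent =>
  e.kind === 'agent_result' && e.name === agent;

// Symmetric with the live path, which settles on the first result it sees.
function firstResult(log: readonly PastEvent[], agent: string): ResultEvent | undefined {
  return log.find(isResultFor(agent));
}

// What a "most recent event" lookup would return instead.
function lastResult(log: readonly PastEvent[], agent: string): ResultEvent | undefined {
  const matches = log.filter(isResultFor(agent));
  return matches[matches.length - 1];
}

const sampleLog: PastEvent[] = [
  { kind: 'agent_idle', name: 'worker' },
  { kind: 'agent_result', name: 'worker', data: { interim: 1 }, final: false },
  { kind: 'agent_result', name: 'worker', data: { interim: 2 }, final: false },
  { kind: 'agent_result', name: 'worker', data: { final: 3 }, final: true },
];

const replayedFirst = firstResult(sampleLog, 'worker'); // data: { interim: 1 }, final: false
const replayedLast = lastResult(sampleLog, 'worker');   // data: { final: 3 }, final: true
```

With interim results in history the two lookups disagree, so a caller that attaches late sees a different payload than one that was already listening unless the replay path picks the first match.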


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 8ce6e348-8690-4fec-949b-3ab5d4400603

📥 Commits

Reviewing files that changed from the base of the PR and between 040da37 and 885d40a.

📒 Files selected for processing (14)
  • CHANGELOG.md
  • packages/harness-driver/src/agent-handle.ts
  • packages/harness-driver/src/agent-result.test.ts
  • packages/harness-driver/src/client.ts
  • packages/harness-driver/src/types.ts
  • packages/sdk/package.json
  • packages/sdk/src/__tests__/typed-action-handle.test.ts
  • packages/sdk/src/agent-relay.ts
  • packages/sdk/src/facade.ts
  • packages/sdk/src/listeners.ts
  • web/content/docs/actions.mdx
  • web/content/docs/event-handlers.mdx
  • web/content/docs/orchestrating-with-actions.mdx
  • web/lib/docs-nav.ts
