Improve harness tool-call wait UX#128

Open
minpeter wants to merge 2 commits into main from codex/harness-stream-hooks-pretool

@minpeter (Owner) commented May 15, 2026

Summary

  • expose public runAgentLoop hooks for every stream part and buffered pre-tool assistant text
  • add Agent.generate() as the non-streaming generateText counterpart to Agent.stream()
  • document the new DX and add a patch changeset for @ai-sdk-tool/harness
  • ignore packages/minimal-agent/.minimal-agent local runtime/smoke artifacts

Verification

  • pnpm -F @ai-sdk-tool/harness test
  • pnpm -F @ai-sdk-tool/harness typecheck
  • pnpm -F @ai-sdk-tool/harness build
  • pnpm run typecheck:root
  • pnpm exec ultracite check packages/harness
  • live smoke: LIVE_AI_BASE_URL=https://llm.vhh.sh/v1 LIVE_AI_MODEL=minpeter/gpt-5.4-mini node --conditions=@ai-sdk-tool/source --import tsx packages/minimal-agent/.minimal-agent/harness-pretool-live.ts
    • observed preToolText: 광화문 현재 날씨를 확인해볼게요. (English: "I'll check the current weather at Gwanghwamun.")
    • sawPreToolText=true, preToolBeforeToolExecute=true, preToolBeforeToolCallHook=true

Notes

  • generic harness-level behavior only; no Bori/MGW/weather-specific acknowledgement fallback
  • pre-tool acknowledgement remains model-text driven via stream deltas and observer hooks
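
For readers skimming the description, the buffered pre-tool text behavior can be pictured roughly as follows. This is a minimal, synchronous sketch and not the harness implementation: the `StreamPart` shapes, the `ObserverHooks` interface, and `observeParts` are hypothetical names invented for illustration, and the real loop consumes an async stream with richer part types.

```typescript
// Hypothetical part shapes for illustration only; real stream parts carry more fields.
type StreamPart =
  | { type: "text-delta"; text: string }
  | { type: "tool-call"; toolCallId: string; toolName: string }
  | { type: "finish" };

// Hypothetical observer hooks, loosely modeled on the two hooks described above.
interface ObserverHooks {
  // Called for every part, in order.
  onStreamPart?: (part: StreamPart) => void;
  // Called at a tool boundary with the assistant text buffered so far.
  onTextBeforeToolCall?: (text: string, toolName: string) => void;
}

function observeParts(parts: Iterable<StreamPart>, hooks: ObserverHooks): void {
  let buffered = "";
  for (const part of parts) {
    hooks.onStreamPart?.(part);
    if (part.type === "text-delta") {
      buffered += part.text;
    } else if (part.type === "tool-call") {
      // Flush only non-empty text, then reset so the next tool call starts clean.
      if (buffered.trim().length > 0) {
        hooks.onTextBeforeToolCall?.(buffered, part.toolName);
      }
      buffered = "";
    }
  }
}

// Usage: collect the acknowledgement text that precedes the tool call.
const seen: string[] = [];
observeParts(
  [
    { type: "text-delta", text: "Checking the " },
    { type: "text-delta", text: "weather." },
    { type: "tool-call", toolCallId: "call-1", toolName: "getWeather" },
    { type: "finish" },
  ],
  {
    onTextBeforeToolCall: (text, toolName) => {
      seen.push(`${toolName}: ${text}`);
    },
  },
);
// seen is now ["getWeather: Checking the weather."]
```

The point of the sketch is that the acknowledgement comes from the model's own text deltas; the observer only decides when to surface it.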

@coderabbitai (Bot) commented May 15, 2026

Important

Review skipped

Auto reviews are disabled on this repository. To trigger a review, include @crb review in the PR description. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

@gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request introduces a non-streaming generation path via Agent.generate() and adds new observation hooks to runAgentLoop, including onStreamPart and onTextBeforeToolCall for buffering assistant text before tool boundaries. Feedback highlights that Agent.generate() incorrectly attempts to use the stopWhen option, which is unsupported by generateText, and fails to resolve functional instructions. Additionally, the getTextDelta utility and associated test mocks should be updated to use the standard textDelta property from the AI SDK to ensure compatibility.
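
One defensive way to address the text-delta compatibility concern is to read the text from a part without committing to a single field name. The sketch below is hypothetical and is not the PR's `getTextDelta`: the helper name `readTextDelta` and the candidate field names (`textDelta`, `text`, `delta`) are assumptions, and which field the AI SDK actually uses depends on the SDK version in play.

```typescript
// Hypothetical helper, not the PR's getTextDelta. It accepts an unknown stream part
// and returns its text payload when one of the assumed field names holds a string.
function readTextDelta(part: unknown): string | undefined {
  if (typeof part !== "object" || part === null) return undefined;
  const record = part as Record<string, unknown>;
  if (record["type"] !== "text-delta") return undefined;
  // Assumed candidate field names, checked in priority order.
  for (const key of ["textDelta", "text", "delta"]) {
    const value = record[key];
    if (typeof value === "string") return value;
  }
  return undefined;
}

// Usage: both assumed shapes yield the same text; non-text parts yield undefined.
const fromTextDelta = readTextDelta({ type: "text-delta", textDelta: "Hello" });
const fromText = readTextDelta({ type: "text-delta", text: "Hello" });
const fromToolCall = readTextDelta({ type: "tool-call", toolName: "getWeather" });
```

If the project pins one SDK version, reading only the field that version documents is simpler and matches the reviewer's suggestion more directly.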

Comment thread packages/harness/src/agent.ts
Comment thread packages/harness/src/loop.ts
Comment thread packages/harness/src/agent.ts Outdated
Comment thread packages/harness/src/loop.test.ts Outdated

@cubic-dev-ai (Bot) left a comment

No issues found across 7 files

@minpeter force-pushed the codex/harness-stream-hooks-pretool branch from e2c7619 to c942605 on May 15, 2026 19:17

The harness now exposes generic stream observation hooks and a non-streaming generate path so chat surfaces can acknowledge work before tools run without hard-coded domain branches. The loop keeps observer hooks isolated from stream control while documentation and regression tests cover the public API. The minimal-agent local state directory is ignored so live smoke artifacts do not leak into future diffs.

Constraint: Upstream change must stay generic and package-local; no Bori/MGW/weather-specific behavior or new dependencies.
Rejected: Deterministic fallback acknowledgement generation | caller-specific behavior belongs outside the upstream harness.
Rejected: Leaving pre-tool hooks as private loop casts | consumers could not use the DX through exported RunAgentLoopOptions types.
Confidence: high
Scope-risk: moderate
Directive: Keep pre-tool acknowledgements driven by model text deltas and public hooks, not package-level domain heuristics.
Tested: pnpm -F @ai-sdk-tool/harness test
Tested: pnpm -F @ai-sdk-tool/harness typecheck
Tested: pnpm -F @ai-sdk-tool/harness build
Tested: pnpm run typecheck:root
Tested: pnpm exec ultracite check packages/harness
Tested: LIVE_AI_BASE_URL=https://llm.vhh.sh/v1 LIVE_AI_MODEL=minpeter/gpt-5.4-mini node --conditions=@ai-sdk-tool/source --import tsx packages/minimal-agent/.minimal-agent/harness-pretool-live.ts
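
The commit message does not say how observer hooks are kept isolated from stream control. One possible reading is that a hook's return value is ignored and a throwing hook cannot abort the loop. The sketch below illustrates that reading only; `notifyObserver` and its `onError` parameter are hypothetical names, not part of the harness API.

```typescript
type Observer<T> = (value: T) => void;

// Invokes an observer without letting it affect the caller: the return value is
// ignored and thrown errors go to onError instead of propagating into the loop.
function notifyObserver<T>(
  observer: Observer<T> | undefined,
  value: T,
  onError: (error: unknown) => void = () => {},
): void {
  if (!observer) return;
  try {
    observer(value);
  } catch (error) {
    onError(error);
  }
}

// Usage: an observer that throws on one part does not stop the processing loop.
const errors: unknown[] = [];
const processed: string[] = [];
for (const part of ["a", "b", "c"]) {
  notifyObserver<string>(
    (p) => {
      if (p === "b") throw new Error("observer failed");
    },
    part,
    (error) => {
      errors.push(error);
    },
  );
  processed.push(part); // runs for every part, even when the observer threw
}
// processed is ["a", "b", "c"]; errors holds the single observer failure
```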

@minpeter force-pushed the codex/harness-stream-hooks-pretool branch from c942605 to 8f1ca79 on May 15, 2026 19:30

The review found that the pre-tool text hook exposed raw AI SDK boundary parts too directly, including a boundary without a guaranteed toolName. This keeps the generic hook but normalizes its callback payload, narrows loop-only consumers to LoopAgent, and documents the timing/backpressure and async-instruction boundaries.

Constraint: Preserve the existing model-agnostic harness direction and avoid new dependencies.
Rejected: Drop tool-input-end as a boundary | it would hide a useful early flush point rather than making the contract explicit.
Rejected: Keep Agent as the only loop-facing type | it would force generate() onto loop-only test doubles and custom agents.
Confidence: high
Scope-risk: narrow
Directive: Keep onTextBeforeToolCall payload harness-normalized when AI SDK stream part shapes drift.
Tested: pnpm -F @ai-sdk-tool/harness test
Tested: pnpm -F @ai-sdk-tool/harness typecheck
Tested: pnpm -F @ai-sdk-tool/harness build
Tested: pnpm run check
Tested: pnpm run typecheck
Tested: pnpm run build
Tested: pnpm run test
Tested: final code-review APPROVE and architecture CLEAR via delegated reviewers
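
To make the "harness-normalized payload" idea concrete, here is a minimal sketch of mapping different tool boundary parts onto one callback payload in which the tool name is explicitly optional. Everything here is hypothetical: `RawBoundaryPart`, `PreToolTextEvent`, `normalizeBoundary`, and the field names on the raw parts are assumptions for illustration and may not match the shapes the harness or the AI SDK actually use.

```typescript
// Hypothetical raw boundary parts; only some of them carry a tool name.
type RawBoundaryPart =
  | { type: "tool-input-start"; id: string; toolName: string }
  | { type: "tool-input-end"; id: string }
  | { type: "tool-call"; toolCallId: string; toolName: string };

// Hypothetical normalized payload handed to the pre-tool text hook.
interface PreToolTextEvent {
  text: string;
  boundary: RawBoundaryPart["type"];
  toolCallId: string;
  toolName?: string; // omitted when the boundary does not identify the tool
}

function normalizeBoundary(text: string, part: RawBoundaryPart): PreToolTextEvent {
  switch (part.type) {
    case "tool-input-start":
      return { text, boundary: part.type, toolCallId: part.id, toolName: part.toolName };
    case "tool-input-end":
      // No tool name on this boundary, so the field is left out entirely.
      return { text, boundary: part.type, toolCallId: part.id };
    case "tool-call":
      return { text, boundary: part.type, toolCallId: part.toolCallId, toolName: part.toolName };
  }
}

// Usage: an early flush point without a tool name and a full tool-call boundary.
const early = normalizeBoundary("Let me check.", { type: "tool-input-end", id: "call-1" });
const full = normalizeBoundary("Let me check.", {
  type: "tool-call",
  toolCallId: "call-1",
  toolName: "getWeather",
});
```

Callers then branch on `toolName` being present instead of depending on raw part shapes, which is one way a consumer could stay stable if those shapes drift.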