Skip to content

Apply LLM middleware and hooks to final ToolCallLoop call#19

Merged
eanzhao merged 1 commit into
devfrom
pr/ai-toolcallloop-final-call-devbase
Mar 2, 2026
Merged

Apply LLM middleware and hooks to final ToolCallLoop call#19
eanzhao merged 1 commit into
devfrom
pr/ai-toolcallloop-final-call-devbase

Conversation

@loning
Copy link
Copy Markdown
Contributor

@loning loning commented Feb 26, 2026

This pull request refactors the LLM invocation logic in ToolCallLoop to improve code reuse and clarity, and updates the loop's behavior when the maximum number of tool call rounds is reached. It also enhances test coverage to ensure the new logic is properly exercised. The most important changes are grouped below.

Refactoring and Code Reuse

  • Extracted the LLM invocation logic (including hooks and middleware) from the main loop into a new private method InvokeLlmAsync in ToolCallLoop, reducing duplication and improving maintainability. (src/Aevatar.AI.Core/Tools/ToolCallLoop.cs [1] [2]

Behavior Change: Final LLM Call

  • When the maximum number of tool call rounds is exhausted, the loop now makes a final LLM call without tools to ensure a textual response is produced, and appends it to the messages if present. (src/Aevatar.AI.Core/Tools/ToolCallLoop.cs src/Aevatar.AI.Core/Tools/ToolCallLoop.csL135-R157)

Testing and Middleware

  • Updated the test ExecuteAsync_WhenMaxRoundsReachedWithoutTerminalContent_Should to verify that the final LLM call is made without tools, and that hooks and middleware are invoked the correct number of times. (test/Aevatar.AI.Tests/ToolCallLoopTests.cs [1] [2]
  • Added a DelegateLlmCallMiddleware helper class to facilitate testing of LLM call middleware. (test/Aevatar.AI.Tests/ToolCallLoopTests.cs test/Aevatar.AI.Tests/ToolCallLoopTests.csR284-R289)

@loning loning requested a review from eanzhao February 27, 2026 07:16
Copy link
Copy Markdown
Contributor

@eanzhao eanzhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这次改动的价值比较明确:

  • ToolCallLoop 里的 LLM 调用逻辑抽到 InvokeLlmAsync,减少重复代码,hook/middleware 生命周期也更一致;
  • maxRounds 用尽后新增一次 不带 tools 的 final LLM call,能提高拿到最终文本回复的概率;
  • 测试已覆盖 final call 的关键行为(Tools == null,以及 LLM middleware / hook 被调用次数)。

有两个非阻塞的小建议(低优先级):

  1. 增加一个正向用例:final no-tools call 返回文本时,断言返回值和 messages 中 assistant 文本都正确;
  2. 增加一个 terminate 用例:验证 final call 在 middleware Terminate = true 时的短路行为(尤其是 provider 调用次数和返回内容)。

结论:LGTM,可合并(建议后续补上上述两条测试,回归风险会更低)。

@eanzhao eanzhao merged commit 2dc2e3a into dev Mar 2, 2026
8 checks passed
eanzhao added a commit that referenced this pull request Apr 30, 2026
19 inline comments arrived after de82e0a; verified each. Three of them
(#13, #14, #16) point at the HttpClient captive bug already fixed in
de82e0a — those will be answered with a reply. Three are NyxID-side
contract gaps (#15, #18, #19) verified against ~/Code/NyxID HEAD cdfef0a;
those need separate NyxID PRs and will be tracked. The rest are fixed
here:

- /model list (codex MAJOR #11): read owner default from
  context.RegistrationScopeId, not the ambient queryPort overload —
  channel inbound has no Studio HTTP request behind it, so the ambient
  resolver returned `default`/unrelated state. Falls back to ambient
  only when the scope is empty (defensive). Tests pinned.
- StateTokenCodec.TryDecodeAsync (consensus MINOR #10): map
  AevatarOAuthClientNotProvisionedException to a distinct
  state_client_not_provisioned code instead of state_signature_invalid.
  IdentityOAuthEndpoints surfaces a "正在初始化, 30 秒后重试" detail
  for that code, matching the /init handler's cold-start message.
- AevatarOAuthClientBootstrapService (#8, #9):
  - wrap RunWithRetryAsync in RunSafelyAsync that logs any escape so
    the unobserved-task exception sink is no longer the only safety net.
  - StopAsync now catches TimeoutException too: when the host shutdown
    deadline fires before the bootstrap task observes its own
    _stoppingCts cancellation, log + return cleanly instead of leaking
    a noisy trace.
- AevatarOAuthClientGAgent.HandleEnsureProvisioned (#6, #7): document
  why CancellationToken.None is the contract — the framework's
  EventHandlerDiscoverer requires single-parameter handlers, so a
  turn-scoped CT cannot be surfaced. The named HTTP client's per-
  request timeout bounds the worst case during silo shutdown.
- NyxIdRedirectUriResolver (#4): emit a warning when all URL sources
  are unset and the environment is not developer-shaped, parity with
  NyxIdAuthorityResolver's existing fallback warning. Wired through
  bootstrap + broker call sites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
eanzhao added a commit that referenced this pull request May 8, 2026
Address review batch on PR #562 (10 inline comments). All in files I have
recent ownership of and require no architectural shifts:

- #16 (blocker, security): ssh_exec is now opt-in via NyxIdToolOptions.
  EnableSshExecTool. Hosts that haven't wired the approval middleware no
  longer see the tool by default. Mainnet host opts in (Lark bot needs it).
- #21 (major, bug): code_execute keeps the modern /execute + {language,
  script} contract, but on a NyxID-proxy upstream 404 it retries the legacy
  /run + {language, code} contract so deployments still pinned to old
  chrono-sandbox-service builds keep working.
- #22 (major, bug): SkillRegistry.IsFresh now exempts SkillSource != Remote
  from TTL — local skills are baked in at registration and don't need
  expiring; prior behavior dropped them from use_skill after the first 5min.
- #18 (major, bug): TurnRunner.TryResolveSenderBindingAsync narrows the
  catch to transient infra errors (Http/Timeout/IO/JSON) and surfaces
  non-transient (logic, NRE, serialization) at Error level so ops can
  distinguish "sender unbound" from "binding store broken".
- #19 (major, bug): ConversationReplyGenerator narrows the
  sender-route-fallback catch to transient errors via
  IsRetryableSenderRouteFailure. Programmer errors no longer cost an LLM
  round on retry.
- #29 + #30 (minor): inbox runtime gives metadata enrichment its own 15s
  budget separate from the LLM run, surfacing
  errorCode=llm_reply_metadata_timeout when scope/UserConfig lookup is
  slow. ResolveFallbackTimeout treats ResponseTimeoutSeconds<=0 as "no
  timeout" rather than silently snapping back to 120s.
- #12 (minor): ConversationGAgent's stream-chunk and final-stream-chunk
  edits run under a 10s CTS now; the failure path already uses one. A hung
  relay can no longer pin the actor turn forever.
- #27 (minor, security): ConstantTimeEquals docstring tightened — removed
  the "for future callers" line and added a SCOPE comment that this helper
  is rebuild-admin-only and shouldn't be promoted to internal/public
  without replacing its length-leak with a length-padding scheme.
- #23 (major, bug): CLI ornn skills slug default → ornn-api (matches the
  registered slug; bare "ornn" is the SPA frontend that returns HTML).

Build clean (NyxId / Skills / NyxidChat / Mainnet hosts), 30 AI tests +
15 inbox runtime tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants