[codex] Add tool-driven Aevatar core invocation sources by eanzhao · Pull Request #830 · aevatarAI/aevatar

eanzhao · 2026-05-22T17:09:51Z

背景

这个 PR 是 ADR-0026 的 Stage 1：把 aevatar 的核心能力重新定位为 LLM tool source，让模型通过 function call 主动选择何时使用 workflow、GAgent、team、readmodel observation 等能力，而不是继续在入口层维护 ForwardToGAgent / ForwardToTeam 这类并行路由方言。

改动

新增 ADR：docs/adr/0026-tool-first-chat-ingress.md，明确 tool-first chat ingress 的目标、边界和后续阶段。
新增 Aevatar.AI.ToolProviders.AevatarInvocation，提供 5 个 invocation tools：
- aevatar_invoke_gagent
- aevatar_invoke_team
- aevatar_start_workflow
- aevatar_observe_run
- aevatar_query_readmodel
新增共享 AevatarInvocationDispatcher，统一做 proto 参数解析、caller scope 注入、调度、readmodel 查询与结构化错误返回。
通过 proto descriptor 生成严格 JSON schema，避免把核心语义塞进无约束 JSON bag。
接入 Mainnet Host DI，让这些 tool sources 能进入现有 IAgentToolSource 发现链路。
补 Lark caller-scope 回归测试，证明 Lark send tool 使用可信的 AgentToolRequestContext.NyxIdAccessToken，payload/外部 metadata 不能覆盖调用者凭据。
补 /v1/responses E2E 测试，证明模型发出的 aevatar_invoke_gagent additive tool call 会走 tool loop 并通过 IActorDispatchPort 投递 actor envelope，而不是走 legacy ForwardToGAgent 静态调用链路。

影响

这是 tool-driven core loop 的第一步，不删除现有 legacy forward path。
GAgent / workflow 的 wait=complete 仍返回结构化 wait_complete_unavailable；当前阶段支持 ack / stream，后续由 session actor/观察链路承接长任务 continuation。
aevatar_query_readmodel 只允许查询封闭集合 readmodel，不开放任意 document collection。
没有修改 NyxID、chrono-* 等外部仓库。

验证

dotnet test test/Aevatar.AI.ToolProviders.AevatarInvocation.Tests/Aevatar.AI.ToolProviders.AevatarInvocation.Tests.csproj --nologo：通过，21 passed。
dotnet test test/Aevatar.AI.ToolProviders.Lark.Tests/Aevatar.AI.ToolProviders.Lark.Tests.csproj --nologo：通过，61 passed。
dotnet test test/Aevatar.Hosting.Tests/Aevatar.Hosting.Tests.csproj --filter FullyQualifiedName~PostResponses_StreamWithAevatarInvokeGAgentAdditiveTool_ShouldDispatchActorEnvelope --nologo：通过，1 passed。
bash tools/ci/test_stability_guards.sh：通过。
bash tools/ci/architecture_guards.sh：通过。
git diff --check origin/dev..HEAD：通过。

备注：本地测试仍有既有 NuGet source mapping / analyzer warnings，没有测试失败。

Records the architectural decision to collapse ChatRouteAction to Reject + ForwardToModel, exposing GAgent/Team/Workflow invocation as IAgentToolSource tools through the existing ToolCallLoop. Supersedes ADR-0024 §D5 (v1 action set) and ADR-0025 (voice v1 ForwardToGAgent); ADR-0024 D1/D2/D3/D4/D6 stand. Tracked end-to-end in epic #808; voice GA prerequisite in #809. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Implements ADR-0026 Stage 1 unit-1 (epic #808). New project src/Aevatar.AI.ToolProviders.AevatarInvocation/ exposes aevatar_invoke_gagent / _invoke_team / _start_workflow / _observe_run / _query_readmodel as IAgentToolSource, so the LLM can drive orchestration through the existing ToolCallLoop instead of parallel router branches. Design: - Tool payloads are proto-derived strict JSON-Schema (no map<string,string> bags) - wait=ack|stream|complete supported; stream is default for long-running tools; GAgent/workflow wait=complete returns wait_complete_unavailable until Stage 2 session actor lands - Caller scope flows through AgentToolRequestContext only; protected caller-scope keys (LLMRequestMetadataKeys.*) are stripped from LLM-supplied payload.headers before server values are stamped, so the LLM cannot inject overrides for nyxid.access_token / scope_id / owner_subject etc. - query_readmodel is bounded to a closed registered set - Dispatch reuses existing surfaces (IActorDispatchPort, ITeamEntryMemberResolver + IStaticGAgentStreamInvocationPort<AGUIEvent>, ICommandDispatchService<WorkflowChatRunRequest,...>); no new dispatch chain 21 tests pass (4 credential-injection regression + 1 ObserveRun fast-fail added in post-review hardening); arch_guards + test_stability + docs lint all PASS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Implements ADR-0026 Stage 1 unit-2 (D7 prerequisite) for the Lark outbound caller-scope guarantee. After auditing the existing path (LarkMessagesSendTool → LarkNyxClient → NyxIdApiClient) no production refactor was required: the tool already reads AgentToolRequestContext.NyxIdAccessToken (no credential parameters) and forwards the caller bearer through NyxID's api-lark-bot proxy, which exchanges to a Lark tenant_access_token without seeing the caller's authorization header. The metadata-bag credential-injection surface that unit-1 had to harden is structurally absent here (no headers/metadata bag at the dispatch boundary). Added 2 regression tests: - Asserts the dispatched NyxID call carries AgentToolRequestContext's trusted typed NyxIdAccessToken - Asserts a malicious LLM payload (smuggled nyx_id_access_token, fake headers, ExternalMetadata overriding LLMRequestMetadataKeys.NyxIdAccessToken) cannot override the trusted caller token at dispatch NyxID investigation summary (verified via gh against ChronoAIProject/NyxID backend source): /api/v1/proxy/s/api-lark-bot/open-apis/im/v1/messages accepts only the caller's NyxID bearer; NyxID resolves caller's api-lark-bot binding, exchanges {app_id, app_secret} → tenant_access_token per channel_adapters/lark.rs::lark_family_token_exchange_config, strips the inbound authorization, and injects bearer for outbound to Lark. Semantic: messages post as the caller's bound Lark bot (NyxID-mediated), not as the human user's OAuth identity and not as Aevatar's service-level identity. This satisfies ADR-0026 §D7's "lands in the caller's Lark account" use case. 61/61 tests pass; arch_guards + test_stability + docs lint all PASS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Closes ADR-0026 Stage 1 (epic #808). Integration test demonstrates the new tool-first ingress path works end-to-end after units 1+2 landed, without touching any production code. Test: MainnetResponsesEndpointsTests.PostResponses_StreamWithAevatarInvokeGAgentAdditiveTool_ShouldDispatchActorEnvelope Scenario: - /v1/responses streamed request with real DI registration of unit-1's AddAevatarInvocationTools (5 production IAgentToolSource instances) - Stubbed LLM emits aevatar_invoke_gagent tool call with a malicious payload that smuggles nyxid.access_token + aevatar.scope_id overrides - ResponsesCompletionApplicationService executes the local tool call inline (not as function_call SSE output — verified against production StreamAsync behavior) - AevatarInvocationDispatcher dispatches through IActorDispatchPort (captured by RecordingActorDispatchPort) - LLM round 2 continues after tool result, SSE lifecycle completes Assertions: - Dispatched envelope's Route.PublisherActorId == DirectGAgentPublisherId - Dispatched ChatRequestEvent.Headers carry the trusted bearer/scope (caller-scope protection from unit-1 verified end-to-end) - ThrowingStaticGAgentStreamInvocationPort.InvocationCount == 0 (the legacy ForwardToGAgent/ForwardToTeam path in ResponsesEndpoints.cs:779-927 is NOT entered) 202/202 tests pass in Aevatar.Hosting.Tests; arch_guards + test_stability_guards + docs lint all PASS. No production code changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codecov · 2026-05-22T17:27:28Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.07%. Comparing base (fccb80d) to head (853a8a1).

@@            Coverage Diff             @@
##              dev     #830      +/-   ##
==========================================
+ Coverage   83.06%   83.07%   +0.01%     
==========================================
  Files         981      981              
  Lines       61936    61936              
  Branches     8069     8069              
==========================================
+ Hits        51447    51454       +7     
+ Misses       7009     6996      -13     
- Partials     3480     3486       +6

Flag	Coverage Δ
ci	`83.07% <ø> (+0.01%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.
see 2 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

eanzhao and others added 4 commits May 22, 2026 16:41

eanzhao marked this pull request as ready for review May 22, 2026 17:19

eanzhao requested review from jason-aelf and louis4li as code owners May 22, 2026 17:19

Merge branch 'dev' of aelf:aevatarAI/aevatar into feature/core-loop

c4aedcb

Fix responses forward team status probes

853a8a1

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[codex] Add tool-driven Aevatar core invocation sources#830

[codex] Add tool-driven Aevatar core invocation sources#830
eanzhao wants to merge 6 commits into
devfrom
feature/core-loop

eanzhao commented May 22, 2026

Uh oh!

codecov Bot commented May 22, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

eanzhao commented May 22, 2026

背景

改动

影响

验证

Uh oh!

codecov Bot commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

codecov Bot commented May 22, 2026 •

edited

Loading