
test: strengthen durable correctness coverage for responses state #742

Open
YueZh127 wants to merge 2 commits into dev from test/2026-05-19_issue-667-llmsession-durable-correctness

@YueZh127
Collaborator

Closes #667

Problem

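The Responses durable main path already has basic coverage, but tests are still thin on several of the boundaries that #667 targets:

  • LlmSession's authoritative state advancement for forwarded tool calls lacks fuller durable contract coverage
  • ResponsesAgentToolState has insufficient coverage of stable reads and recovery semantics for terminal / expired state
  • Host /v1/responses lacks finer-grained boundary tests for how errors are reported for expired / cancelled / terminal state
  • Some continue / tool result / terminal rejection paths previously covered mostly the happy path, without locking down the "must not advance further" cases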

Solution

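This PR only adds tests; it does not change production semantics or public interfaces. It adds durable correctness coverage around #667:

  • Add forwarded tool call durable lifecycle coverage in LlmSessionGAgentTests
  • Add coverage for terminal state as an absorbing state and for idempotent duplicate resolve
  • Add stable query and recovery semantics for terminal / expired state in ResponsesAgentToolStateGAgentTests
  • Add protection for how the current-state projector exposes terminal state externally in ResponsesAgentToolStateCurrentStateProjectorTests
  • Add host-level durable rejection coverage in MainnetResponsesEndpointsTests, ensuring that cancelled / expired / terminal state is not incorrectly advanced and does not incorrectly invoke the provider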
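The host-level rejection behaviour in the last bullet can be pictured with a small language-neutral model. This is only a sketch in Python, not the project's C# test code: `CountingProvider`, `submit_tool_result`, the string status values, and the shape of the error payload are all hypothetical names chosen for illustration. The one detail taken from this PR is the `tool_call_not_available` error code listed under Key Coverage Added. Treating an unknown call id like an unavailable one is also an assumption made to keep the model small.

```python
from typing import Dict


class CountingProvider:
    """Hypothetical provider stub that records how often it is invoked."""

    def __init__(self) -> None:
        self.calls = 0

    def complete(self, call_id: str, output: str) -> dict:
        self.calls += 1
        return {"status": "completed", "call_id": call_id}


# Statuses for which a tool result must be rejected (assumed string values).
UNAVAILABLE = frozenset({"cancelled", "expired"})


def submit_tool_result(
    states: Dict[str, str],
    results: Dict[str, str],
    provider: CountingProvider,
    call_id: str,
    output: str,
) -> dict:
    """Reject results for unavailable tool calls before any side effect."""
    status = states.get(call_id)
    if status is None or status in UNAVAILABLE:
        # Structured error: nothing is written and the provider is not called.
        return {"error": {"code": "tool_call_not_available", "call_id": call_id}}
    results[call_id] = output
    return provider.complete(call_id, output)


if __name__ == "__main__":
    provider = CountingProvider()
    results: Dict[str, str] = {}
    states = {"call-a": "cancelled", "call-b": "pending"}
    print(submit_tool_result(states, results, provider, "call-a", "42"))
    print(provider.calls, results)
```

A test in this style asserts on three things at once: the error code, the untouched result store, and a provider call count of zero. Checking only the error code would let a regression that still calls the provider slip through.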

Scope

This PR contains test changes only; no production code is modified.

Files touched:

  • test/Aevatar.GAgentService.Tests/Core/LlmSessionGAgentTests.cs
  • test/Aevatar.GAgentService.Tests/Core/ResponsesAgentToolStateGAgentTests.cs
  • test/Aevatar.GAgentService.Tests/Projection/ResponsesAgentToolStateCurrentStateProjectorTests.cs
  • test/Aevatar.Hosting.Tests/MainnetResponsesEndpointsTests.cs

Key Coverage Added

  • Durable state loop for an LlmSession forwarded tool call from Pending -> Received -> Resolved
  • After the session expires, a forwarded tool call stays expired and is not revived by later signals
  • A cancelled forwarded tool call stays terminal and no longer accepts a result
  • A duplicate resolve does not advance version and does not overwrite the first ResolvedAt
  • ResponsesAgentToolState can be queried stably in terminal states such as expired / cancelled / resolved
  • When previous_response_id is expired / cancelled, the host returns a structured error instead of continuing down the provider path
  • For a cancelled / expired tool call, the host returns tool_call_not_available, writes no tool result, and does not call the provider
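The lifecycle rules in this list amount to a small state machine with absorbing terminal states. The following is a minimal Python model of those rules, written as a sketch rather than a rendering of the actual C# LlmSession implementation. The class `ForwardedToolCall`, its method names, the float timestamp, and the choice to bump `version` once per accepted transition are assumptions made for illustration; only the state names and the invariants come from the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    PENDING = "pending"
    RECEIVED = "received"
    RESOLVED = "resolved"
    CANCELLED = "cancelled"
    EXPIRED = "expired"


# Terminal states are absorbing: no signal may move a call out of them.
TERMINAL = frozenset({Status.RESOLVED, Status.CANCELLED, Status.EXPIRED})


@dataclass
class ForwardedToolCall:
    status: Status = Status.PENDING
    version: int = 0
    resolved_at: Optional[float] = None

    def _move(self, target: Status) -> bool:
        # Every accepted transition bumps the version exactly once (assumed).
        self.status = target
        self.version += 1
        return True

    def receive(self) -> bool:
        # Only a pending call can be marked as received.
        return self.status is Status.PENDING and self._move(Status.RECEIVED)

    def resolve(self, now: float) -> bool:
        # A duplicate or late resolve is a no-op: version and resolved_at stay put.
        if self.status is not Status.RECEIVED:
            return False
        self.resolved_at = now
        return self._move(Status.RESOLVED)

    def cancel(self) -> bool:
        return self.status not in TERMINAL and self._move(Status.CANCELLED)

    def expire(self) -> bool:
        return self.status not in TERMINAL and self._move(Status.EXPIRED)


if __name__ == "__main__":
    call = ForwardedToolCall()
    call.receive()
    call.resolve(now=1.0)
    call.resolve(now=2.0)  # duplicate: rejected, nothing changes
    print(call.status, call.version, call.resolved_at)
```

Returning a boolean from each transition keeps rejected signals observable without raising, which makes the "must not advance" cases easy to assert on: the status, the version, and the first resolution timestamp should all be unchanged after a rejected call.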

Validation

Commands run:

dotnet test test/Aevatar.GAgentService.Tests/Aevatar.GAgentService.Tests.csproj --nologo --filter "LlmSession|ResponsesAgentToolState|ResponsesCompletion"
dotnet test test/Aevatar.Hosting.Tests/Aevatar.Hosting.Tests.csproj --nologo --filter "MainnetResponsesEndpointsTests"
bash tools/ci/test_stability_guards.sh
bash tools/ci/query_projection_priming_guard.sh
bash tools/ci/projection_state_version_guard.sh
bash tools/ci/projection_state_mirror_current_state_guard.sh

Results:

  • Aevatar.GAgentService.Tests passes with the target filter
  • MainnetResponsesEndpointsTests passes
  • test stability guard passes
  • projection / query boundary guards pass

@codecov

codecov Bot commented May 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.51%. Comparing base (5d7162f) to head (0df3352).
⚠️ Report is 115 commits behind head on dev.

@@            Coverage Diff             @@
##              dev     #742      +/-   ##
==========================================
+ Coverage   82.32%   82.51%   +0.19%     
==========================================
  Files         932      941       +9     
  Lines       59485    60101     +616     
  Branches     7805     7872      +67     
==========================================
+ Hits        48970    49594     +624     
+ Misses       7155     7114      -41     
- Partials     3360     3393      +33     
Flag Coverage Δ
ci 82.51% <ø> (+0.19%) ⬆️

see 53 files with indirect coverage changes



Development

Successfully merging this pull request may close these issues.

[test] Fill in durable correctness regression tests for LlmSession / ResponsesAgentToolState

1 participant