MAF-Inspired Framework Improvements by eanzhao · Pull Request #3 · aevatarAI/aevatar

eanzhao · 2026-02-16T13:13:18Z

MAF-Inspired Framework Improvements

基于 Microsoft Agent Framework (MAF) 调研，对 Aevatar 框架进行 6 项改进，涵盖中间件管线、可观测性、人机交互、工具审批、编排模式和表达式语言。

变更概览

改进项	优先级	新增文件	修改文件	涉及层
Agent Middleware Pipeline	HIGH	4	3	AI.Abstractions / AI.Core
OpenTelemetry GenAI Conventions	HIGH	4	1	AI.Core / Foundation.Runtime
Human-in-the-Loop	MEDIUM	3	5	Workflow.Core / Projection / AGUI
Tool Approval Mechanism	MEDIUM	1	1	AI.Abstractions
Orchestration Patterns	LOW	5	0	Foundation.Core
Expression Language	LOW	1	1	Workflow.Core

1. Agent Middleware Pipeline

问题

原有 Hook 系统 (IAIGAgentExecutionHook) 是 best-effort 执行，无法修改或拦截 Agent 输入输出。MAF 提供三层中间件（Agent Run / Function Calling / IChatClient），支持安全过滤、限流、内容审查等横切关注点。

方案

新增三层 Middleware 接口，与 Hook 系统共存（Hook 做观测，Middleware 做拦截）：

IAgentRunMiddleware — 包裹整个 Agent Chat 执行，可修改用户输入、短路返回
IToolCallMiddleware — 包裹每次 Tool 执行，可验证参数、覆盖结果、阻止调用
ILLMCallMiddleware — 包裹每次 LLM 调用，可修改 messages、注入 system prompt、变换响应

中间件通过 DI 注册，遵循 ASP.NET Core 中间件模式（next() 递归链）。

补充约定：

ChatAsync 与 ChatStreamAsync 均会进入 IAgentRunMiddleware 链。
流式路径中，真实的 provider.ChatStreamAsync(...) 会在 ILLMCallMiddleware 链的 next() 内执行，确保观测/限流/缓存等横切逻辑对流式调用同样生效。

新增文件

src/Aevatar.AI.Abstractions/Middleware/IAgentRunMiddleware.cs
src/Aevatar.AI.Abstractions/Middleware/IToolCallMiddleware.cs
src/Aevatar.AI.Abstractions/Middleware/ILLMCallMiddleware.cs
src/Aevatar.AI.Core/Middleware/MiddlewarePipeline.cs

修改文件

src/Aevatar.AI.Core/Tools/ToolCallLoop.cs — 集成 Tool + LLM 中间件
src/Aevatar.AI.Core/Chat/ChatRuntime.cs — 集成 Agent Run + LLM 中间件
src/Aevatar.AI.Core/AIGAgentBase.cs — 从 DI 收集中间件并注入 Runtime

使用示例

// 注册安全过滤中间件
services.AddSingleton<IAgentRunMiddleware, ContentSafetyMiddleware>();
services.AddSingleton<IToolCallMiddleware, ToolAuditMiddleware>();
services.AddSingleton<ILLMCallMiddleware, PromptInjectionGuardMiddleware>();

2. OpenTelemetry GenAI Semantic Conventions

问题

原有 ActivitySource 仅在 LocalActor 层记录事件处理 span，缺少 LLM 调用和 Tool 执行的标准化 span 和 metrics。MAF 遵循 OpenTelemetry GenAI Semantic Conventions 发射 invoke_agent / chat / execute_tool 标准 span。

方案

以内置 Middleware 形式实现可观测性（GenAIObservabilityMiddleware），同时实现三个 Middleware 接口：

Span 类型（GenAI 语义规范）：

invoke_agent <name> — 每次 Agent Run，包含 gen_ai.agent.id / gen_ai.agent.name / gen_ai.provider.name
chat <model> — 每次 LLM 调用，包含 gen_ai.request.model / gen_ai.provider.name / gen_ai.usage.*_tokens
execute_tool <name> — 每次 Tool 调用，包含 gen_ai.tool.name / gen_ai.tool.call_id（span kind 为 INTERNAL）

Metrics（直方图）：

gen_ai.client.token.usage — Token 消耗
gen_ai.client.operation.duration — LLM 调用耗时
aevatar.tool.invocation.duration — Tool 调用耗时

敏感数据控制：通过 GenAIActivitySource.EnableSensitiveData 控制是否在 span 中包含 prompt/response 内容。

新增文件

src/Aevatar.AI.Core/Observability/GenAIActivitySource.cs
src/Aevatar.AI.Core/Observability/GenAIObservabilityMiddleware.cs
src/Aevatar.Foundation.Runtime/Observability/AevatarObservabilityOptions.cs
src/Aevatar.Foundation.Runtime/Observability/GenAIMetrics.cs

修改文件

src/Aevatar.Foundation.Runtime/Observability/AevatarActivitySource.cs — 增加 GenAI span 工厂方法

使用示例

// 启用全链路 GenAI 可观测性
var mw = new GenAIObservabilityMiddleware();
services.AddSingleton<IAgentRunMiddleware>(mw);
services.AddSingleton<IToolCallMiddleware>(mw);
services.AddSingleton<ILLMCallMiddleware>(mw);

3. Human-in-the-Loop Workflow Steps

问题

MAF 提供 Question / Confirmation / WaitForInput / RequestExternalInput 四种 Human-in-the-Loop 操作。Aevatar 工作流引擎没有暂停执行等待人工输入的机制。

方案

新增两个工作流步骤模块和暂停/恢复事件：

HumanApprovalModule — 处理 type: human_approval，暂停工作流等待 yes/no 审批
HumanInputModule — 处理 type: human_input，暂停工作流等待自由文本输入

通过 WorkflowSuspendedEvent / WorkflowResumedEvent 实现暂停恢复，AGUI 层映射为 HUMAN_INPUT_REQUEST 事件供前端渲染。

YAML DSL 扩展

steps:
  - id: approve_report
    type: human_approval
    parameters:
      prompt: "审批生成的报告？"
      timeout: "3600"
      on_reject: fail

  - id: get_context
    type: human_input
    parameters:
      prompt: "请提供补充信息"
      variable: user_context
      timeout: "1800"
      on_timeout: fail

参数约定（最小集）：

human_approval：prompt / timeout（秒）/ on_reject（默认 fail）
human_input：prompt / variable / timeout（秒）/ on_timeout（默认 fail）

新增文件

src/workflow/Aevatar.Workflow.Core/Modules/HumanApprovalModule.cs
src/workflow/Aevatar.Workflow.Core/Modules/HumanInputModule.cs
src/workflow/Aevatar.Workflow.Projection/Reducers/WorkflowSuspendedEventReducer.cs

修改文件

src/workflow/Aevatar.Workflow.Core/cognitive_messages.proto — 新增 WorkflowSuspendedEvent / WorkflowResumedEvent
src/Aevatar.Presentation.AGUI/AGUIEvents.cs — 新增 HumanInputRequestEvent / HumanInputResponseEvent
src/workflow/Aevatar.Workflow.Presentation.AGUIAdapter/EventEnvelopeToAGUIEventMapper.cs — 新增映射 Handler
src/workflow/Aevatar.Workflow.Presentation.AGUIAdapter/DependencyInjection/ServiceCollectionExtensions.cs — 注册
src/workflow/Aevatar.Workflow.Core/ServiceCollectionExtensions.cs — 默认注册 human_input / human_approval 步骤模块

4. Tool Approval Mechanism

问题

MAF 的 @tool(approval_mode="always_require") 可阻止 Agent 自主执行敏感工具。Aevatar 的工具系统无审批机制。

方案

在 IAgentTool 接口增加 ApprovalMode 属性（默认 NeverRequire，向后兼容）：

public enum ToolApprovalMode
{
    NeverRequire = 0,   // 立即执行（默认）
    AlwaysRequire = 1,  // 始终需要审批
    Auto = 2,           // 由 Middleware 决定
}

与 Middleware Pipeline 自然组合：IToolCallMiddleware 可检查 context.Tool.ApprovalMode 实现审批逻辑。

新增文件

src/Aevatar.AI.Abstractions/ToolProviders/ToolApprovalMode.cs

修改文件

src/Aevatar.AI.Abstractions/ToolProviders/IAgentTool.cs — 增加 ApprovalMode 默认接口属性

5. Reusable Multi-Agent Orchestration Patterns

问题

MAF 将 5 种编排模式打包为可复用组件。Aevatar 通过 Workflow 模块实现类似能力，但与 YAML 引擎紧耦合，无法在无 YAML 场景下程序化使用。

方案

抽取 4 种编排模式为 Aevatar.Foundation.Core.Orchestration 命名空间下的一等公民：

模式	类名	行为
Sequential	`SequentialOrchestration`	链式：A → B → C
Concurrent	`ConcurrentOrchestration`	并行扇出 + 合并
Vote	`VoteOrchestration`	并行执行 + 投票选最佳
Handoff	`HandoffOrchestration`	动态 Agent 控制转移

所有模式基于 IOrchestration<TInput, TOutput> 接口，接受 Func<string, string, CancellationToken, Task<string>> 作为 Agent 执行委托，与具体 Agent 实现解耦。

新增文件

src/Aevatar.Foundation.Core/Orchestration/IOrchestration.cs
src/Aevatar.Foundation.Core/Orchestration/SequentialOrchestration.cs
src/Aevatar.Foundation.Core/Orchestration/ConcurrentOrchestration.cs
src/Aevatar.Foundation.Core/Orchestration/VoteOrchestration.cs
src/Aevatar.Foundation.Core/Orchestration/HandoffOrchestration.cs

6. Declarative Workflow Expression Language

问题

MAF 的声明式工作流使用 PowerFx 表达式语言。Aevatar YAML DSL 有变量替换但缺少运行时表达式求值。

方案

新增轻量级 ${...} 表达式求值器，支持：

语法	功能	示例
`${name}`	变量引用	`${variables.user}`
`${if(...)}`	条件	`${if(variables.age, 'adult', 'minor')}`
`${concat(...)}`	拼接	`${concat('Hello, ', variables.name)}`
`${isBlank(...)}`	空值检测	`${isBlank(variables.input)}`
`${length(...)}`	长度	`${length(variables.text)}`
`${and/or/not(...)}`	布尔逻辑	`${and(variables.a, variables.b)}`
`${upper/lower/trim(...)}`	字符串函数	`${upper(variables.name)}`

支持嵌套函数调用和引号字面量，参数分隔正确处理括号嵌套。

新增文件

src/workflow/Aevatar.Workflow.Core/Expressions/WorkflowExpressionEvaluator.cs

修改文件

src/workflow/Aevatar.Workflow.Core/Modules/WorkflowLoopModule.cs — 在派发 StepRequestEvent 前对步骤参数执行 ${...} 求值

运行时变量与求值时机

求值时机：WorkflowLoopModule.DispatchStep(...) 发布 StepRequestEvent 前，对 step.parameters 的 value 做一次模板替换（${...}）。
变量来源：每个 run_id 维护独立变量字典，默认包含：
- input：当前步骤的输入（上一步输出）
- <stepId>：已完成步骤的输出（同名 key），可用 ${first_step} 或 ${variables.first_step} 引用

验证

dotnet build aevatar.slnx --nologo    # 0 错误
dotnet test aevatar.slnx --nologo     # 所有新增测试通过

…ServiceId; share contract math Addresses PR #457 review. ## Functional fix (the inline review): InvokePath / invoke handler mismatch The contract returned by the new `GET /members/.../endpoints/.../contract` was telling the frontend to call `/members/{memberId}/invoke/...`, but the existing platform handler for that path resolves the member through `IMemberPublishedServiceResolver` which today returns `publishedServiceId == memberId`. Studio's bind path persists `publishedServiceId == "member-{memberId}"`. So the contract was built for `member-{memberId}` while invoke would target `{memberId}` → 404. Fix: register `StudioAwareMemberPublishedServiceResolver` from Studio's DI. It first asks `IStudioMemberQueryPort` for the member's stored `publishedServiceId`; if no Studio member exists, falls back to the legacy deterministic mapping (`memberId == publishedServiceId`) so direct platform binds keep working unchanged. Now contract / activate / retire / invoke / runs all resolve to the same identity. ## Refactors per the PR review - **#1 Duplicated contract-building logic**: extracted the pure helpers (`ResolveCurrentContractRevision`, `EnumeratePreferredContractRevisionIds`, `RevisionContainsEndpoint`, `IsChatEndpoint`, `ResolveStreamFrameFormat`, `BuildBase64PayloadPlaceholder`, `BuildTypedInvokeRequestExampleBody`) into `Aevatar.GAgentService.Abstractions.Services.ServiceEndpointContractMath`. Both `ScopeServiceEndpoints.cs` (legacy) and `StudioMemberService.cs` (member-first) funnel through it. A bug fix in one helper now propagates to both paths automatically. - **#3 / #4 Repeated resolve+verify pattern**: introduced `ResolveBoundServiceContextAsync` returning `(ScopeId, MemberId, PublishedServiceId, Identity, Service, Revisions)`. The three new methods now all share one query path; activate / retire dropped from 4 platform queries to 2. - **#2 Non-atomic activate**: documented with a `NOTE:` comment that `SetDefaultServingRevision` then `ActivateServiceRevision` is intentionally non-transactional, mirroring the legacy scope-default behavior, and that both commands are platform-side idempotent. - **#7 Hardcoded "retired" string**: introduced `MemberRevisionLifecycleStatusNames.Retired` next to the existing `MemberLifecycleStageNames` so future lifecycle verbs declare themselves alongside it instead of as scattered magic strings. - **#6 / #8 Input trimming**: collapsed the four ad-hoc trimming sites into a single `NormalizeRequired(value, fieldName)` helper applied at the service entry of every public method. Trimming now happens at exactly one boundary per call. ## Tests - 13 new tests pin the resolver's contract (Studio member → stored publishedServiceId; non-Studio member → legacy fallback; trim; reject malformed input; empty publishedServiceId degrades safely). - Existing tests unchanged: 327 Studio + 281 platform integration passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

P1 — workflow termination on publish failure (reviewer's critical): `PublishFailureAsync` emits `StepCompletedEvent { Success = false }`, which the kernel routes through `TryRetryAsync` → `TryOnErrorAsync` → fail. With no retry/on_error policy on `publish_to_twitter`, a Twitter 401/403/429/5xx terminated the entire workflow run as failed. Add `on_error: { strategy: skip, default_output: "twitter_publish_failed" }` to the YAML so the run advances to `done` cleanly; the module already surfaces categorized errors to Lark independently. #2 — Twitter v2 native error shape: `ClassifyTwitterResponse` now recognizes the third response shape NyxID can forward verbatim: `{ "title": "...", "detail": "...", "errors": [...] }` (Twitter's native problem-details for content-policy / duplicate-tweet rejections). Falls through to `twitter_publish_rejected` with the Twitter `message` text in the Lark surfacing so users read the actual rejection reason. #1 — Duplicate tweet risk: documented in code comment that `POST /2/tweets` has no server-side dedup; the social_media template intentionally has no `retry` policy on this step, and `on_error: skip` advances rather than retrying. Authors customizing the YAML must keep this invariant. #3 — Removed redundant `nyxClient!` null-forgiving (no-op cleanup). #4 — Renamed `ChannelMetadataKeys.LarkProxySlug` → `LarkOutboundProxySlug` (`channel.lark.outbound_proxy_slug`) to disambiguate "Lark API surface" from "NyxID provider routing". #5 — Added xml-doc on `TrySendLarkAsync` documenting the dual-scope api-key dependency (key must carry both api-twitter AND api-lark-bot entitlements) so future callers don't silently break Lark surfacing when narrowing the key's scope. #6 — Added `RequiredServiceSlugs` field to `SocialMediaTemplateSpec` for parity with `DailyReportTemplateSpec`; `CreateSocialMediaAgentAsync` now reads slugs from the spec instead of inlining the list. Tests: - 3 new `ClassifyTwitterResponse` tests for the Twitter native error shapes (errors-array, RFC-7807 title/detail-only, empty-object unexpected-shape). - Existing social_media test now also asserts `strategy: skip` lands in the upserted YAML. - 482 channel-runtime + 236 workflow.core tests pass; full solution builds with 0 errors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…elper Resolves three followup architecture-review points on PR #451: ### Channel.Runtime drops AI / Workflow direct deps (review #2) `Channel.Runtime`'s csproj used to pull in `Aevatar.AI.Abstractions` and `Aevatar.Workflow.Application.Abstractions` because of two files that straddled the channel/AI and channel/workflow boundary: - `ChannelContextMiddleware` (an `ILLMCallMiddleware` impl) — moved to `Aevatar.GAgents.NyxidChat`, which is the only package that needs it and already references `AI.Abstractions`. NyxidChat SCE registers it for the LLM call pipeline; Channel.Runtime SCE no longer touches `ILLMCallMiddleware`. - `ChannelCardActionRouting` (builds `WorkflowResumeCommand`) — moved to `Aevatar.GAgents.NyxidChat` for the same reason. Its sole consumer (`ChannelConversationTurnRunner`) lives there too. `Channel.Runtime.csproj` now references only `Channel.Abstractions`, `Foundation.Abstractions`/`Core`, and the `CQRS.Projection.*` slice — matching the "channel-agnostic flow + projection infrastructure" charter from the RFC. Tests (`ChannelCardActionRoutingTests`) get the extra `using Aevatar.GAgents.NyxidChat;`. ### Extract Elasticsearch projection-store toggle helper (review #4) The `ResolveElasticsearchEnabled` + `BuildElasticsearchOptions` helper pair was duplicated three times (Channel.Runtime / Device / Scheduled SCEs) with slightly different log strings and `Console.Error.WriteLine` output. Centralized into `Aevatar.CQRS.Projection.Providers.Elasticsearch.DependencyInjection.ElasticsearchProjectionConfiguration` with two static helpers: - `IsEnabled(IConfiguration?, ILogger?, string? storeName)` — explicit flag → endpoints presence → false; logs a structured warning via `ILogger` (when supplied) instead of `Console.Error.WriteLine`. - `BindOptions(IConfiguration)` — typed binder for `ElasticsearchProjectionDocumentStoreOptions`. All three SCEs now call into this helper; per-package warning text is parameterized via `storeName`. Section path (`Projection:Document:Providers:Elasticsearch`) is exposed as a const so future call sites stay in sync. ### Followup points acknowledged but deferred - **Cross-package dep chain `NyxidChat → Authoring.Lark → Scheduled → Platform.Lark`** (review #1) — pre-existing arch debt that the split surfaced rather than introduced. Cleaner would be to invert via `IInboundFlowResolver` plug-ins so `ChannelConversationTurnRunner` doesn't reach into `AgentBuilderCardFlow` directly. Out of scope for the package split; tracking as a separate follow-up. - **Tombstone compactor "central coordinator"** (review #3) — `Channel.Runtime` defines `ITombstoneCompactionTarget` but does not reference `Device` / `Scheduled` at the csproj level; per-package targets register themselves through DI. The plug-in pattern is intentional and keeps the DAG one-way. - **`Scheduled` package name vs UserAgentCatalog content** (review #5) — `UserAgentCatalog` is the delivery-target registry that Scheduled agents read at execution time to route output, so co-locating it with `SkillRunnerGAgent` / `WorkflowAgentGAgent` is intentional. Renaming to `AgentCatalog` would split actors from their primary consumer; deferring. 473/473 ChannelRuntime.Tests pass; full slnx still only fails the same two pre-existing Mainnet hosting `BindAsync on IStudioMemberService` tests that reproduce on origin/dev. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ash-command Wires up all six pieces the ADR called out as parallel-capable now that NyxID#549 has shipped, so per-user binding stops requiring a follow-up PR per layer: #1 — AuthContext.external_subject (Channel.Abstractions) Move ExternalSubjectRef out of Identity.Abstractions into Channel.Abstractions where AuthContext lives. Identity is a consumer of channel concepts, not their owner; the new typed AuthContext.external_subject = 4 field is the broker-mode outbound identity carrier (ADR §Outbound Send) and the legacy string user_credential_ref keeps working for non-broker callers. Adds a new AuthContext.OnBehalfOfExternalSubject helper. #2 — Projection chain ExternalIdentityBindingDocument (proto) + Partial (IProjectionReadModel) + MetadataProvider + MaterializationContext + Projector (ICurrentStateProjectionMaterializer) + ExternalIdentityBindingProjectionQueryPort + ExternalIdentityBindingProjectionReadinessPort (polling impl, write-side completion path only — ADR §Projection Readiness explicitly allows this). AddChannelIdentityProjection registers the lot. #3 — NyxIdRemoteCapabilityBroker HTTP-backed INyxIdCapabilityBroker + INyxIdBrokerCallbackClient against the NyxID#549 wire shape: /oauth/authorize URL building, RFC 8693 token-exchange with subject_token_type=urn:nyxid:params:oauth:token-type:binding-id, /oauth/bindings/{id} delete, authorization-code -> binding_id exchange. Maps invalid_grant -> BindingRevokedException so the upper layer event-source-revokes the local binding. PkceHelper covers RFC 7636 S256; StateTokenCodec seals correlation+verifier+exp into an HMAC token (kid header, deterministic Protobuf payload — ADR §Implementation Notes #1). AddNyxIdRemoteCapabilityBroker wires HttpClient + options + codec + both interfaces. #4 — /api/oauth/nyxid-callback endpoint (IdentityOAuthEndpoints) Decodes state, exchanges code -> binding_id, dispatches CommitBindingCommand to ExternalIdentityBindingGAgent via IActorRuntime, waits on the projection readiness port, returns a friendly bind-confirmation page (id_token decoded locally for the display name — no /oauth/userinfo round-trip per ADR L61). Classifies error UX by HTTP status (ADR §Implementation Notes #3): 400 for state-token issues, 502 for broker exchange failure, 200 with a "binding pending propagation" message on projection timeout. #5 — Slash-command routing in ChannelConversationTurnRunner /init and /unbind are handled before the LLM by a new TryHandleSlashCommandAsync that resolves identity ports lazily through the existing IServiceProvider — deployments without per-user binding fall through unchanged. /init replies with the authorize URL (private DM only — ADR §Decision); /unbind resolves binding_id, calls broker.RevokeBindingAsync, replies with the unbind confirmation. The runner short-circuits via the existing SendReplyAsync, so reply delivery rides the relay outbound port like any other reply. #6 — /api/webhooks/nyxid-broker-revocation receiver (Continuous Access Evaluation) BrokerRevocationWebhookValidator verifies HMAC-SHA256 over the raw body (X-NyxID-Signature: sha256=<hex>), parses the JSON envelope into a typed BrokerRevocationNotification, and the endpoint event-source-revokes the local binding actor. Aligns with NyxID#549 V2-7's CAE channel — when a user revokes from NyxID's console the binding goes inactive in seconds rather than waiting for the 5-min access TTL. Tests: - 591 ChannelRuntime.Tests pass (39 of which are the new Identity-tagged tests covering actor, projector, broker fake, state-token codec, and ExternalSubjectRef extension). - Full solution `dotnet build aevatar.slnx` is clean. - `tools/docs/lint.sh` 33 file(s) checked, 0 errors. The ChannelConversationTurnRunner integration is feature-flag-shaped: if neither IExternalIdentityBindingQueryPort nor INyxIdCapabilityBroker is registered, the slash-command path is a no-op and existing bot-owner-shared behaviour is preserved. Production rollout is a DI toggle (AddChannelIdentityProjection + AddNyxIdRemoteCapabilityBroker + MapIdentityOAuthEndpoints) plus the Bot-Owner-Shared termination strategy choice from the ADR §Bot-Owner-Shared 模式终止策略. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Five round-2 follow-ups from the Codex consensus: #5 Orphan binding leak on actor-activation failure (resolved): The OAuth callback's already-bound branch revoked the orphan binding_id; the actor-activation-failed branch (when the local actor cannot be created post-exchange) silently returned 503 and left the freshly issued binding_id active at NyxID. Now the same TryRevokeOrphanBindingAsync helper runs before the 503 response so both leak paths cleanup symmetrically. #3 NyxIdAuthorityResolver footgun on staging clusters (resolved): Resolve() now takes an optional ILogger and emits a warning when AEVATAR_NYXID_AUTHORITY is unset AND ASPNETCORE_ENVIRONMENT / DOTNET_ENVIRONMENT indicate a non-Development environment. Local dev remains silent (env is empty / starts with "dev" / "local"); staging / qa / prod operators see the warning so a forgotten env-var doesn't silently register clients against production NyxID. #1 csproj coupling — store wiring belongs to the host (resolved): Split AddChannelIdentity into two extension methods. AddChannelIdentity now registers actors / projector / broker / slash commands but NOT document stores. AddChannelIdentityProjectionStores wires the ES vs InMemory choice and is called by the composition root (Mainnet host now invokes both methods). Tests / demos can mix and match — agent module never decides on the host's behalf which physical store to use. #2 kid rotation grace window (resolved): HMAC key rotation previously invalidated all in-flight state tokens (kid hardcoded "v1", no multi-key verification). The actor now carries current_kid + previous_kid + previous_demoted_at_unix on its state + projection. Encode signs with the snapshot's current kid; decode tries current first, then previous when demoted_at + state_token_lifetime is still in the future. A v1→v2 rotation produces deterministic kid succession (parses + increments the integer suffix) so verifiers can route signed tokens to the right key. #4 HMAC key still in projection (partial — code unchanged, ADR hardened): The projection-store-as-actor-state-mirror pattern is the established shape in this codebase (Channel.Runtime same coupling). Removing the hmac_key from the document would require a generic actor query/reply pattern that CLAUDE.md explicitly forbids ("禁止 generic actor query/reply"). The ADR-0018 §Implementation Notes #1 already documents the explicit tradeoff (state_token TTL ≤ 5 min, rotation command available, ES index access boundary equals actor event-store boundary, KMS migration path noted as a follow-up). The proto comment on AevatarOAuthClientDocument now also calls out the production access-scoping requirement so a least-privileged ES reader can't extract the key. Encryption-at-rest with an envelope key from env-var is the natural follow-up if a deployment widens ES access beyond the actor event-store boundary. Tests: - StateTokenCodec gains a kid-rotation grace test (decode succeeds with previous kid before lifetime expires; fails after). - AevatarOAuthClientGAgent gains a rotation-demotion test (verifies v1→v2 kid succession + previous-key carry-over) and a first-seed test (no previous-key fields populated on initial provision). - 800 ChannelRuntime tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three actionable items from kimi (the rest of the round were either fixed in 2f54361 / de82e0a or push-back acknowledgements): #1 (kimi MAJOR, di) — service-locator anti-pattern: ChannelSlashCommandContext exposed IServiceProvider so handlers could pull anything they wanted at request time, which CLAUDE.md forbids outside framework integration code. Removed Services from the context; ModelChannelSlashCommandHandler now takes IUserConfigQueryPort + IUserConfigCommandService directly via constructor (both nullable optional so deployments without the Studio projection wired up still register the handler and surface "not enabled" cleanly). Existing /init / /unbind / /whoami handlers were already constructor-injected so this is a /model-only refactor + abstraction-shape clean-up. #2 (kimi MAJOR, security) — webhook validator missed grace window: BrokerRevocationWebhookValidator only verified inbound NyxID-signed webhooks against snapshot.HmacKey. During a key rotation, NyxID can still send webhooks signed with the previous key (their cache / in-flight state lags the rotation event), and silently rejecting them drops real CAE revocations. Validator now mirrors StateTokenCodec.ResolveVerificationKey: try current key first, then the previous key when its demoted_at + state_token_lifetime is still in the future. Took NyxIdBrokerOptions + TimeProvider via constructor to share the same lifetime config as the codec. #3 (kimi MINOR) — three ADR-0017 → ADR-0018 reference fixes in StateTokenCodec, INyxIdBrokerCallbackClient, NyxIdBrokerOptions; the earlier renumbering missed these xmldoc comments. 6 new BrokerRevocationWebhookValidatorTests cover the current-key happy path, forged-signature rejection, previous-key acceptance within grace window, previous-key rejection after grace window, missing signature header, and unsupported scheme prefix. 807 ChannelRuntime tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- #15 (major, di): AddOrnnSkills now self-registers NyxIdApiClient and NyxIdToolOptions via TryAdd safety nets so workflow hosts that enable Ornn without AddNyxIdTools get a clean 404 path on missing BaseUrl rather than a confusing DI resolution failure. - #3 (major, di): Introduce IExternalIdentityBindingProjectionPort and inject the interface in IdentityOAuthEndpoints. The concrete class remains for runtime composition; the registration also publishes the interface as a forwarder so existing tests that hold the concrete type keep working. - #14 + #17 (major, concurrency): TurnStreamingReplySink TOCTOU concern was based on a misread — drainSignal is captured inside the same lock acquisition as _dispatchInProgress=false in the cap branch, and the throttle gate's nextIsFinal=false invariant makes _drainTcs guaranteed null on that path. Document the invariants so future readers don't re-flag this. - #20 (major, arch): Replace the singleton Dictionary cache in NyxIdLlmServiceCatalogClient with IMemoryCache. Per CLAUDE.md "中间层状态约束", per-caller state lives in a host-owned cache, not a service field. AbsoluteExpiration policy preserved (30s). - #9 (minor): /api/v1/proxy/services?per_page=100 was already extracted into NyxIdLlmCatalogRoutes.ProxyServicesPath — both call sites already use the constant. - #10 (minor): LooksLikeLlmRouteCandidate already has negative-signal filtering plus a two-distinct-weak-signals threshold. No change. - #11 (minor): ExternalIdentityBindingProjector already logs a Warning with DocumentId/EventId/Version on the empty-binding delete path. Build clean (Ornn / Identity / NyxidChat / Channel.Runtime), updated catalog client test passes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…tions Production changes: - Migrate response_sessions and tool-state protos from `string *_json` to `bytes *_payload`; actors stay Protobuf-only, JSON parsing moves to the host boundary (addresses mimo-v2.5-pro blocker findings #2/#3). - Extract `ResponsesTodoItemParser` in Abstractions so the host adapter preview and the actor's persisted view share one parser (addresses glm-5.1 finding on duplicated todo parsing). - Introduce `IWebApiClient` abstraction so `ResponsesAevatarToolProvider` depends on an interface (addresses kimi finding on DI inversion). - Add `WebFetchUrlGuard` to block loopback, private (10/8, 172.16/12, 192.168/16, 169.254/16, 127/8, 0/8), and link-local addresses before WebFetch emits an HTTP request (addresses codex blocker #1 on SSRF / bearer token exfiltration via attacker-controlled URLs). Test changes: - Update existing Response* tests to track the new proto shape. - Add `WebFetchUrlGuardTests` covering scheme rejection, IPv4/IPv6 private ranges, IPv4-mapped IPv6, hostname loopback variants, and the public-host accept path. - Add `ResponsesTodoItemParserTests` covering todos-array, single string/object, content-key fallback chain, numeric id, malformed JSON, and whitespace normalization. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

new features inspired by MAF

2c7ba9c

eanzhao merged commit a2dc54a into dev Feb 19, 2026
1 check passed

This was referenced Apr 6, 2026

feat: Skill 自动发现 — 从 Ornn registry 定期拉取 skill 列表 #118

Open

Architecture audit: 6 milestone blockers for Living with AI Demo (04-17) and NyxID M0 (04-18) #143

Closed

louis4li added a commit that referenced this pull request Apr 9, 2026

Expand agent coverage and fix scope endpoint tests (#143 #3 #4)

4630876

This was referenced Apr 13, 2026

plan: VoicePresence as EventModule — Phase 1 实现计划 #179

Closed

feat: AgentBuilder — 飞书管家 + SkillRunnerGAgent + IHumanInteractionPort (Day One 执行方案) #185

Closed

Refactor/frontend #248

Merged

eanzhao mentioned this pull request Apr 27, 2026

feat(channel): structured backfill outcome + audit + scope repair (#391) #460

Merged

6 tasks

This was referenced Apr 27, 2026

Split ChannelRuntime megamodule into 8 packages (#263) #451

Merged

feat(social_media): publish approved drafts to X via NyxID api-twitter #461

Merged

Propose ADR-0017: Studio Team as first-class aggregate under scope #470

Merged

eanzhao mentioned this pull request Apr 29, 2026

feat(channel): ADR-0017 + scaffolding for per-user NyxID binding #477

Merged

4 tasks

eanzhao mentioned this pull request Apr 30, 2026

refactor(runtime): adopt AgentKind + Kind Registry for actor identity (#498 Phase 1) #535

Merged

10 tasks

eanzhao mentioned this pull request Apr 30, 2026

refactor(skill-runner): split into SkillDefinitionGAgent + SkillExecutionGAgent #497

Open

5 tasks

eanzhao mentioned this pull request May 15, 2026

Fix GAgent durable terminal completion #657

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MAF-Inspired Framework Improvements#3

MAF-Inspired Framework Improvements#3
eanzhao merged 1 commit into
devfrom
feature/maf-inspire

eanzhao commented Feb 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant