
Aevatar context database#4

Closed
eanzhao wants to merge 4 commits into dev from feature/context-database


@eanzhao eanzhao commented Feb 16, 2026

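Context Database Architecture Document

Overview

Aevatar Context Database unifies skills, resources, memories, and sessions under an aevatar:// virtual file system and provides:

  • URI-based unified storage access (IContextStore)
  • Semantic-vector-based context retrieval (IContextRetriever)
  • LLM-based layered summaries (L0/L1) and memory extraction
  • A memory-write extension point through the Projection Pipeline

This document follows the current implementation and describes actual behavior, default parameters, integration steps, and current limitations.

Inspired by OpenViking and implemented within Aevatar's layered architecture.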

Module Overview

graph TB
    subgraph abstractions["Aevatar.Context.Abstractions"]
        aevatarUri["AevatarUri"]
        contextStore["IContextStore"]
        contextRetriever["IContextRetriever"]
    end

    subgraph core["Aevatar.Context.Core"]
        uriMapper["AevatarUriPhysicalMapper"]
        localFileStore["LocalFileContextStore"]
        inMemoryStore["InMemoryContextStore"]
    end

    subgraph extraction["Aevatar.Context.Extraction"]
        layerGenerator["LLMContextLayerGenerator"]
        semanticProcessor["SemanticProcessor"]
    end

    subgraph retrieval["Aevatar.Context.Retrieval"]
        vectorIndex["LocalVectorIndex"]
        intentAnalyzer["IntentAnalyzer"]
        hierarchicalRetriever["HierarchicalRetriever"]
        injectionMiddleware["ContextInjectionMiddleware"]
    end

    subgraph memory["Aevatar.Context.Memory"]
        memoryExtractor["LLMMemoryExtractor"]
        memoryDeduplicator["MemoryDeduplicator"]
        memoryWriter["MemoryWriter"]
        memoryProjector["MemoryExtractionProjector"]
    end

    core --> abstractions
    extraction --> abstractions
    retrieval --> abstractions
    memory --> abstractions
    memory --> retrieval
    memoryProjector --> cqrsProjection["Aevatar.CQRS.Projection.Abstractions"]

Virtual File System

URI Format

aevatar://{scope}/{path}

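Scope Mapping

| Scope | Description | Physical path | Code mapping |
| --- | --- | --- | --- |
| skills | Global skill definitions | ~/.aevatar/skills/ | AevatarPaths.Skills |
| resources | External knowledge resources | ~/.aevatar/resources/ | AevatarPaths.Resources |
| user | User data and memories | ~/.aevatar/users/ | AevatarPaths.Users |
| agent | Agent runtime data | ~/.aevatar/agents/ | AevatarPaths.AgentData |
| session | Session context | ~/.aevatar/sessions/ | AevatarPaths.Sessions |

Notes:

  • AevatarPaths.AgentData and AevatarPaths.Agents currently both map to ~/.aevatar/agents/
  • The reverse mapping FromPhysicalPath recognizes only the five scopes above; any other physical path returns null
  • An unknown scope throws ArgumentException during forward mapping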

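AevatarUri Behavior Notes

  • Scheme matching is case-insensitive; the scope is normalized to lowercase.
  • aevatar://scope and aevatar://scope/ are both treated as directories.
  • Trailing slashes are trimmed from the path; directory semantics are preserved by IsDirectory.
  • Parent returns the URI itself at the scope root.
  • Join("") returns the URI itself; Join("x/") yields a directory and Join("x") yields a file.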
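The rules above can be illustrated with a minimal sketch. `SketchUri` and its members are hypothetical stand-ins written for this example, not the actual `AevatarUri` type; only the listed behaviors (case handling, directory detection, `Parent`, `Join`) are modeled, and the real implementation may differ in details such as validation.

```csharp
using System;

// Hypothetical stand-in for AevatarUri; models only the rules listed above.
public sealed record SketchUri(string Scope, string RelativePath, bool IsDirectory)
{
    private const string Prefix = "aevatar://";

    public static SketchUri Parse(string text)
    {
        // Scheme match is case-insensitive.
        if (!text.StartsWith(Prefix, StringComparison.OrdinalIgnoreCase))
            throw new FormatException("Not an aevatar URI: " + text);

        var rest = text.Substring(Prefix.Length);
        var slash = rest.IndexOf('/');
        // Scope is normalized to lowercase.
        var scope = (slash < 0 ? rest : rest.Substring(0, slash)).ToLowerInvariant();
        var rawPath = slash < 0 ? "" : rest.Substring(slash + 1);

        // A bare scope or a trailing slash marks a directory.
        var isDirectory = rawPath.Length == 0 || rawPath.EndsWith('/');
        return new SketchUri(scope, rawPath.TrimEnd('/'), isDirectory);
    }

    public SketchUri Join(string segment)
    {
        var tail = segment.Trim('/');
        if (tail.Length == 0) return this; // Join("") returns the URI itself
        var path = RelativePath.Length == 0 ? tail : RelativePath + "/" + tail;
        return new SketchUri(Scope, path, segment.EndsWith('/'));
    }

    public SketchUri Parent()
    {
        if (RelativePath.Length == 0) return this; // scope root returns itself
        var cut = RelativePath.LastIndexOf('/');
        return new SketchUri(Scope, cut < 0 ? "" : RelativePath.Substring(0, cut), true);
    }
}
```

Keeping the directory flag separate from the trimmed path means two URIs with the same path text can still be told apart as file or directory.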

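Storage Layer Behavior

LocalFileContextStore

| Capability | Current behavior |
| --- | --- |
| ReadAsync | Throws FileNotFoundException when the file does not exist |
| WriteAsync | Creates parent directories automatically; overwrites an existing file with the same name |
| DeleteAsync | Returns silently when the target does not exist; directory deletion depends on recursive |
| ListAsync | Returns only direct children; skips hidden entries whose names start with . |
| GlobAsync | Always recursive; case-insensitive; the pattern is handled in simplified form |
| ExistsAsync | Uses Directory.Exists for directories and File.Exists for files |
| GetAbstractAsync | Reads .abstract.md in the directory; for a file URI, uses the parent directory |
| GetOverviewAsync | Reads .overview.md in the directory; for a file URI, uses the parent directory |

Additional constraints:

  • GlobAsync mainly covers simple patterns (for example **/*.md); it is not a full glob implementation, so complex combined patterns may not match as expected.
  • The path mapping layer currently has no path traversal protection; callers should avoid putting ../ in the URI path.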
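Until the mapping layer enforces this itself, a caller-side guard could look roughly like the sketch below. `PathGuard` and `ResolveUnderRoot` are hypothetical names introduced for illustration; the sketch relies only on `System.IO.Path` normalization and is not part of the PR's code.

```csharp
using System;
using System.IO;

// Hypothetical helper: resolves a relative path under a scope root and
// rejects any result that escapes that root (for example via "../").
public static class PathGuard
{
    public static string ResolveUnderRoot(string root, string relativePath)
    {
        var fullRoot = Path.TrimEndingDirectorySeparator(Path.GetFullPath(root));
        var candidate = Path.TrimEndingDirectorySeparator(
            Path.GetFullPath(Path.Combine(fullRoot, relativePath)));

        var rootWithSeparator = fullRoot.EndsWith(Path.DirectorySeparatorChar)
            ? fullRoot
            : fullRoot + Path.DirectorySeparatorChar;

        // Accept the root itself or anything strictly below it.
        if (candidate != fullRoot &&
            !candidate.StartsWith(rootWithSeparator, StringComparison.Ordinal))
        {
            throw new ArgumentException("Path escapes the scope root: " + relativePath);
        }

        return candidate;
    }
}
```

Comparing against the root plus a separator avoids treating a sibling such as `skills-old` as being inside `skills`.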

InMemoryContextStore

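  • Stores files and directories in a ConcurrentDictionary; suited to unit tests and local verification.
  • Writing a file automatically creates any missing parent directories.
  • ExistsAsync on a directory also returns true when child files or child directories exist.
  • GlobAsync uses lightweight matching logic that covers common patterns but does not implement full glob semantics.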

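Layered Information Model (L0/L1/L2)

Target Semantics

| Layer | File | Intended use |
| --- | --- | --- |
| L0 | .abstract.md | Fast filtering; summary for vector retrieval |
| L1 | .overview.md | Structured navigation; rerank reference |
| L2 | Original file | Deep reading on demand |

Current Implementation Behavior

  • L0 and L1 are currently both stored as directory-level hidden files.
  • For a file URI, GetAbstractAsync / GetOverviewAsync actually read the summary files of its parent directory.
  • SemanticProcessor walks the directory tree bottom-up; directory-level L0/L1 are generated by aggregating child summaries.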
graph BT
    leafFile["LeafFile"] --> generateFileAbstract["GenerateAbstractAsync"]
    generateFileAbstract --> writeParentAbstract["Write parent .abstract.md"]
    writeParentAbstract --> aggregateChildAbstracts["Aggregate child abstracts"]
    aggregateChildAbstracts --> generateDirectoryLayers["GenerateDirectoryLayersAsync"]
    generateDirectoryLayers --> writeDirectoryLayers["Write .abstract.md and .overview.md"]

Retrieval Pipeline

FindAsync (simple search)

query -> embedding -> vectorIndex.SearchAsync(topK=10) -> group by ContextType -> FindResult

Characteristics:

  • Does not run intent analysis.
  • Supports targetScope to restrict the retrieval range.

SearchAsync (complex search)

sequenceDiagram
    participant userQuery as UserQuery
    participant intentAnalyzer as IntentAnalyzer
    participant hierarchicalRetriever as HierarchicalRetriever
    participant vectorIndex as IContextVectorIndex

    userQuery->>intentAnalyzer: AnalyzeAsync(query, session)
    intentAnalyzer-->>hierarchicalRetriever: TypedQueryArray

    loop eachTypedQuery
        hierarchicalRetriever->>vectorIndex: SearchAsync global topK 3
        vectorIndex-->>hierarchicalRetriever: GlobalResults
        loop queueDrillDown
            hierarchicalRetriever->>vectorIndex: SearchChildrenAsync topK 5
            vectorIndex-->>hierarchicalRetriever: ChildResults
        end
    end

    hierarchicalRetriever-->>userQuery: FindResult

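Implementation notes:

  • IntentAnalyzer splits the input into 0 to 5 TypedQuery items.
  • Each TypedQuery runs a global search first, then drills down into directory children.
  • Score propagation formula: final = 0.5 * child + 0.5 * parent
  • Convergence condition: topScore changes by less than 0.001 for 3 consecutive rounds.
  • The root scope for ContextType.Memory is null, which means memory retrieval spans both user and agent memories.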
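The score propagation and convergence rules can be sketched as follows. `ScoreMath` and `ConvergenceTracker` are hypothetical names for this example. The assumption that alpha weights the child score and (1 - alpha) the parent is mine; with the default of 0.5 the two readings give the same result.

```csharp
using System;

public static class ScoreMath
{
    // final = alpha * child + (1 - alpha) * parent.
    // Assumption: alpha weights the child; at alpha = 0.5 this matches the formula above.
    public static double Propagate(double child, double parent, double alpha = 0.5)
        => alpha * child + (1 - alpha) * parent;
}

// Hypothetical tracker for "topScore changed by less than the threshold
// for N consecutive rounds".
public sealed class ConvergenceTracker
{
    private readonly int _requiredStableRounds;
    private readonly double _threshold;
    private double? _lastTop;
    private int _stableRounds;

    public ConvergenceTracker(int requiredStableRounds = 3, double threshold = 0.001)
    {
        _requiredStableRounds = requiredStableRounds;
        _threshold = threshold;
    }

    // Returns true once enough consecutive rounds were stable.
    public bool Observe(double topScore)
    {
        if (_lastTop is double last && Math.Abs(topScore - last) < _threshold)
            _stableRounds++;
        else
            _stableRounds = 0; // a larger jump resets the streak

        _lastTop = topScore;
        return _stableRounds >= _requiredStableRounds;
    }
}
```

In this sketch the first observation only establishes a baseline, so convergence is reported on the fourth identical score at the earliest; whether the real retriever counts rounds the same way is not stated in the description.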

Retrieval Default Parameters

| Component | Parameter | Current value |
| --- | --- | --- |
| IntentAnalyzer | MaxTokens | 500 |
| IntentAnalyzer | Temperature | 0.0 |
| IntentAnalyzer | RecentMessagesLimit | 5 (TakeLast(5)) |
| IntentAnalyzer | MaxTypedQueries | 5 |
| HierarchicalRetriever | GlobalSearchTopK | 3 |
| HierarchicalRetriever | SearchChildrenTopK | 5 |
| HierarchicalRetriever | FinalTake | 10 |
| HierarchicalRetriever | ScorePropagationAlpha | 0.5 |
| HierarchicalRetriever | MaxConvergenceRounds | 3 |
| HierarchicalRetriever | ConvergenceThreshold | 0.001 |
| ContextInjectionMiddleware | MaxContextTokenBudget | 3000 |
| ContextInjectionMiddleware | Effective budget unit | Estimated as 3000 * 4 characters |
| ContextInjectionMiddleware | Retrieval entry point | FindAsync |

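Context Injection Middleware

ContextInjectionMiddleware behavior:

  • Extracts the last user message from the request messages as the query.
  • After a successful retrieval, assembles a system message and inserts it into the conversation.
  • Deduplicates through a metadata key within a single call chain to avoid repeated injection.
  • On an exception, degrades to "skip injection and continue the call".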
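A rough sketch of the query selection and the character-based budget follows. `ChatMessage`, `InjectionSketch`, and the snippet-joining strategy are hypothetical; the only values taken from the description are the 3000-token budget and the 4-characters-per-token estimate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical message shape for this sketch.
public sealed record ChatMessage(string Role, string Content);

public static class InjectionSketch
{
    public const int MaxContextTokenBudget = 3000;
    public const int CharsPerToken = 4; // rough estimate, not a tokenizer

    // The last user message serves as the retrieval query.
    public static string? LastUserQuery(IReadOnlyList<ChatMessage> messages)
        => messages.LastOrDefault(m => m.Role == "user")?.Content;

    // Appends snippets in order until the character budget would be exceeded.
    public static string BuildContext(
        IEnumerable<string> snippets,
        int charBudget = MaxContextTokenBudget * CharsPerToken)
    {
        var sb = new StringBuilder();
        foreach (var snippet in snippets)
        {
            if (sb.Length + snippet.Length + 1 > charBudget) break;
            sb.Append(snippet).Append('\n');
        }
        return sb.ToString();
    }
}
```

Because the budget counts characters, text that tokenizes densely (for example CJK content) can exceed the intended token budget, which is the precision gap the limitations table mentions.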

Memory Pipeline

Six Memory Categories

| Category | Owner | Storage path | Mergeable |
| --- | --- | --- | --- |
| Profile | user | aevatar://user/{userId}/memories/ | Yes |
| Preferences | user | aevatar://user/{userId}/memories/preferences/ | Yes |
| Entities | user | aevatar://user/{userId}/memories/entities/ | Yes |
| Events | user | aevatar://user/{userId}/memories/events/ | No |
| Cases | agent | aevatar://agent/{agentId}/memories/cases/ | No |
| Patterns | agent | aevatar://agent/{agentId}/memories/patterns/ | Yes |

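Extraction and Deduplication Flow

messages -> LLMMemoryExtractor -> CandidateMemory[]
          -> embedding -> vectorIndex.SearchAsync(topK=3, scope=targetPath)
          -> decision: Create / Update / Skip
          -> MemoryWriter -> IContextStore

Deduplication decision matrix:

| Condition | Decision |
| --- | --- |
| No match, or best similarity < 0.85 | Create |
| Similarity >= 0.85 and the category is mergeable | Update |
| Similarity > 0.95 and the category is not mergeable | Skip |
| Similarity >= 0.85, the category is not mergeable, and the skip condition is not met | Create |

Mergeable categories: Profile, Preferences, Entities, Patterns
Non-mergeable categories: Events, Cases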
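The decision matrix can be expressed as one small function. The sketch below uses hypothetical type names (`MemoryCategory`, `DedupDecision`, `DedupSketch`) rather than the PR's own types, and takes the thresholds from the defaults documented for MemoryDeduplicator.

```csharp
using System;

public enum MemoryCategory { Profile, Preferences, Entities, Events, Cases, Patterns }

public enum DedupDecision { Create, Update, Skip }

public static class DedupSketch
{
    public const double SimilarityThreshold = 0.85;
    public const double SkipThreshold = 0.95;

    public static bool IsMergeable(MemoryCategory category) =>
        category == MemoryCategory.Profile ||
        category == MemoryCategory.Preferences ||
        category == MemoryCategory.Entities ||
        category == MemoryCategory.Patterns;

    // bestSimilarity is null when the vector search returned no match.
    public static DedupDecision Decide(MemoryCategory category, double? bestSimilarity)
    {
        if (bestSimilarity is null || bestSimilarity.Value < SimilarityThreshold)
            return DedupDecision.Create;

        if (IsMergeable(category))
            return DedupDecision.Update;

        // Non-mergeable: near-duplicates are skipped, everything else is a new entry.
        return bestSimilarity.Value > SkipThreshold
            ? DedupDecision.Skip
            : DedupDecision.Create;
    }
}
```

For example, `Decide(MemoryCategory.Events, 0.90)` yields `Create`, because Events is not mergeable and 0.90 does not exceed the skip threshold.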

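Memory Module Default Parameters

| Component | Parameter | Current value |
| --- | --- | --- |
| LLMMemoryExtractor | MaxTokens | 2000 |
| LLMMemoryExtractor | Temperature | 0.0 |
| LLMMemoryExtractor | Conversation truncation length | 10000 characters |
| MemoryDeduplicator | SimilarityThreshold | 0.85 |
| MemoryDeduplicator | SkipThreshold | 0.95 |
| MemoryDeduplicator | Similarity search topK | 3 |
| MemoryWriter | File name format | yyyyMMdd-HHmmss-{slug}.md |
| MemoryWriter | Merge separator | `\\n\\n---\\n\\n` |
| MemoryExtractionProjector | Order | 200 |
| MemoryExtractionProjector | Default userId/agentId | "default" |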
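As an illustration of the file name format, the sketch below builds a name from a timestamp and a title. The slug rule (lowercase ASCII letters and digits, other runs collapsed to a hyphen) is an assumption made for this example; the description does not say how MemoryWriter derives the slug.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class MemoryFileNameSketch
{
    // Hypothetical slug rule: keep lowercase ASCII letters and digits,
    // collapse every other run of characters into a single '-'.
    public static string Slugify(string title)
    {
        var sb = new StringBuilder();
        foreach (var ch in title.ToLowerInvariant())
        {
            var keep = (ch >= 'a' && ch <= 'z') || (ch >= '0' && ch <= '9');
            if (keep) sb.Append(ch);
            else if (sb.Length > 0 && sb[sb.Length - 1] != '-') sb.Append('-');
        }
        return sb.ToString().Trim('-');
    }

    // Produces yyyyMMdd-HHmmss-{slug}.md
    public static string Build(DateTime timestamp, string title)
    {
        var stamp = timestamp.ToString("yyyyMMdd-HHmmss", CultureInfo.InvariantCulture);
        return stamp + "-" + Slugify(title) + ".md";
    }
}
```

The second-resolution timestamp means two memories with the same slug written in the same second would collide on file name in this sketch.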

Projection Pipeline Integration

MemoryExtractionProjector<TContext, TTopology> is a generic projector that implements IProjectionProjector<TContext, TTopology>.

graph LR
    eventEnvelope["EventEnvelopeStream"] --> projectionCoordinator["ProjectionCoordinator"]
    projectionCoordinator --> readModelProjector["WorkflowExecutionReadModelProjector(Order=0)"]
    projectionCoordinator --> aguiEventProjector["WorkflowExecutionAGUIEventProjector(Order=100)"]
    projectionCoordinator --> memoryExtractionProjector["MemoryExtractionProjector(Order=200)"]
    memoryExtractionProjector --> extractionFlow["CompleteAsync -> Extract -> Deduplicate -> Write"]

Project Dependency Graph

graph TD
    contextAbstractions["Aevatar.Context.Abstractions"]
    contextCore["Aevatar.Context.Core"]
    contextExtraction["Aevatar.Context.Extraction"]
    contextRetrieval["Aevatar.Context.Retrieval"]
    contextMemory["Aevatar.Context.Memory"]
    aevatarConfig["Aevatar.Configuration"]
    aiAbstractions["Aevatar.AI.Abstractions"]
    projectionAbstractions["Aevatar.CQRS.Projection.Abstractions"]

    contextCore --> contextAbstractions
    contextCore --> aevatarConfig
    contextExtraction --> contextAbstractions
    contextExtraction --> aiAbstractions
    contextRetrieval --> contextAbstractions
    contextRetrieval --> aiAbstractions
    contextMemory --> contextAbstractions
    contextMemory --> contextRetrieval
    contextMemory --> aiAbstractions
    contextMemory --> projectionAbstractions

DI Registration and Enablement

Module Registration

services
    .AddContextStore()           // LocalFileContextStore + AevatarUriPhysicalMapper
    .AddContextExtraction()      // LLMContextLayerGenerator + SemanticProcessor
    .AddContextRetrieval()       // LocalVectorIndex + IntentAnalyzer + HierarchicalRetriever
    .AddContextMemory();         // LLMMemoryExtractor + MemoryDeduplicator + MemoryWriter

// Test environment
services.AddInMemoryContextStore();

AddContextStore() calls AevatarPaths.EnsureContextDirectories() to ensure the base directories exist.

Bootstrap Integration

builder.Services.AddAevatarBootstrap(builder.Configuration, options =>
{
    options.EnableContextDatabase = true;
});

When enabled, the following are additionally registered:

  • AddContextStore()
  • AddContextExtraction()
  • AddContextRetrieval()
  • AddContextMemory()
  • ILLMCallMiddleware -> ContextInjectionMiddleware

Workflow Projection Integration

builder.Services.AddWorkflowExecutionProjectionProjector<
    MemoryExtractionProjector<WorkflowExecutionProjectionContext, IReadOnlyList<WorkflowExecutionTopologyEdge>>>();

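Extension Points

| Extension point | Current implementation | Production replacement direction |
| --- | --- | --- |
| IContextStore | LocalFileContextStore | Object storage or a distributed file system |
| IContextVectorIndex | LocalVectorIndex | Qdrant / Weaviate / Pinecone |
| IContextLayerGenerator | LLMContextLayerGenerator | Hybrid summarization strategy |
| IMemoryExtractor | LLMMemoryExtractor | Rule engine + LLM hybrid |
| IEmbeddingGenerator | Injected externally | OpenAI / local model |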

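Current Limitations and Improvement Directions

| Topic | Current state | Improvement direction |
| --- | --- | --- |
| Vector index construction | The code has no automatic index-building pipeline yet; the index must be populated before retrieval | Add index building and incremental updates after resources are stored |
| URI path safety | The mapping layer does not explicitly prevent ../ traversal | Add normalization and root-path constraint checks during mapping |
| Memory ownership identifiers | MemoryExtractionProjector currently uses a hard-coded default user and agent | Resolve the real userId and agentId from the projection context |
| Injection budget precision | ContextInjectionMiddleware approximates tokens by character count | Integrate a real tokenizer for budget control |
| Glob compatibility | LocalFileContextStore covers only simplified glob patterns | Adopt a full glob matching library and add tests |
| Deduplication decision enum | DeduplicationDecision.Merge is defined, but the current deduplicator never produces it | Add merge trigger conditions according to the category strategy |
| Concurrent write consistency | MemoryWriter.Merge is read-then-write and not atomic | Add storage-level concurrency control or version checks |
| Directory summary granularity | L0/L1 are mainly directory-level files; file-level and directory-level summaries are shared | Define and separate the storage strategies for file-level and directory-level summaries |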

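Test Status

  • Covered: AevatarUri, InMemoryContextStore, the main retrieval flow, and memory extraction and deduplication decisions.
  • To be strengthened: integration tests for LocalFileContextStore against a real file system, and boundary safety tests.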

eanzhao and others added 4 commits February 16, 2026 21:12
…ation

- Introduce a comprehensive OCP architecture refactor plan to address current implementation issues, focusing on modularity, extensibility, and adherence to the Open/Closed Principle.
- Document a phased approach for refactoring, including the introduction of new modules, plugin architecture, and clear separation of concerns.
- Add a PR review audit document outlining strict architectural and coding standards, identifying key issues, and providing a risk assessment for the current implementation.

This commit enhances project documentation and sets a clear path for future architectural improvements.
@eanzhao eanzhao closed this Feb 25, 2026
eanzhao added a commit that referenced this pull request Apr 27, 2026
…ServiceId; share contract math

Addresses PR #457 review.

## Functional fix (the inline review): InvokePath / invoke handler mismatch

The contract returned by the new `GET /members/.../endpoints/.../contract`
was telling the frontend to call `/members/{memberId}/invoke/...`, but the
existing platform handler for that path resolves the member through
`IMemberPublishedServiceResolver` which today returns
`publishedServiceId == memberId`. Studio's bind path persists
`publishedServiceId == "member-{memberId}"`. So the contract was built for
`member-{memberId}` while invoke would target `{memberId}` → 404.

Fix: register `StudioAwareMemberPublishedServiceResolver` from Studio's
DI. It first asks `IStudioMemberQueryPort` for the member's stored
`publishedServiceId`; if no Studio member exists, falls back to the
legacy deterministic mapping (`memberId == publishedServiceId`) so
direct platform binds keep working unchanged. Now contract / activate /
retire / invoke / runs all resolve to the same identity.

## Refactors per the PR review

- **#1 Duplicated contract-building logic**: extracted the pure
  helpers (`ResolveCurrentContractRevision`,
  `EnumeratePreferredContractRevisionIds`, `RevisionContainsEndpoint`,
  `IsChatEndpoint`, `ResolveStreamFrameFormat`,
  `BuildBase64PayloadPlaceholder`, `BuildTypedInvokeRequestExampleBody`)
  into `Aevatar.GAgentService.Abstractions.Services.ServiceEndpointContractMath`.
  Both `ScopeServiceEndpoints.cs` (legacy) and `StudioMemberService.cs`
  (member-first) funnel through it. A bug fix in one helper now
  propagates to both paths automatically.

- **#3 / #4 Repeated resolve+verify pattern**: introduced
  `ResolveBoundServiceContextAsync` returning
  `(ScopeId, MemberId, PublishedServiceId, Identity, Service, Revisions)`.
  The three new methods now all share one query path; activate /
  retire dropped from 4 platform queries to 2.

- **#2 Non-atomic activate**: documented with a `NOTE:` comment that
  `SetDefaultServingRevision` then `ActivateServiceRevision` is
  intentionally non-transactional, mirroring the legacy scope-default
  behavior, and that both commands are platform-side idempotent.

- **#7 Hardcoded "retired" string**: introduced
  `MemberRevisionLifecycleStatusNames.Retired` next to the existing
  `MemberLifecycleStageNames` so future lifecycle verbs declare
  themselves alongside it instead of as scattered magic strings.

- **#6 / #8 Input trimming**: collapsed the four ad-hoc trimming sites
  into a single `NormalizeRequired(value, fieldName)` helper applied at
  the service entry of every public method. Trimming now happens at
  exactly one boundary per call.

## Tests

- 13 new tests pin the resolver's contract (Studio member → stored
  publishedServiceId; non-Studio member → legacy fallback; trim;
  reject malformed input; empty publishedServiceId degrades safely).
- Existing tests unchanged: 327 Studio + 281 platform integration
  passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
eanzhao added a commit that referenced this pull request Apr 27, 2026
P1 — workflow termination on publish failure (reviewer's critical):
`PublishFailureAsync` emits `StepCompletedEvent { Success = false }`,
which the kernel routes through `TryRetryAsync` → `TryOnErrorAsync` →
fail. With no retry/on_error policy on `publish_to_twitter`, a Twitter
401/403/429/5xx terminated the entire workflow run as failed. Add
`on_error: { strategy: skip, default_output: "twitter_publish_failed" }`
to the YAML so the run advances to `done` cleanly; the module already
surfaces categorized errors to Lark independently.

#2 — Twitter v2 native error shape: `ClassifyTwitterResponse` now
recognizes the third response shape NyxID can forward verbatim:
`{ "title": "...", "detail": "...", "errors": [...] }` (Twitter's
native problem-details for content-policy / duplicate-tweet rejections).
Falls through to `twitter_publish_rejected` with the Twitter `message`
text in the Lark surfacing so users read the actual rejection reason.

#1 — Duplicate tweet risk: documented in code comment that
`POST /2/tweets` has no server-side dedup; the social_media template
intentionally has no `retry` policy on this step, and `on_error: skip`
advances rather than retrying. Authors customizing the YAML must keep
this invariant.

#3 — Removed redundant `nyxClient!` null-forgiving (no-op cleanup).

#4 — Renamed `ChannelMetadataKeys.LarkProxySlug` →
`LarkOutboundProxySlug` (`channel.lark.outbound_proxy_slug`) to
disambiguate "Lark API surface" from "NyxID provider routing".

#5 — Added xml-doc on `TrySendLarkAsync` documenting the dual-scope
api-key dependency (key must carry both api-twitter AND api-lark-bot
entitlements) so future callers don't silently break Lark surfacing
when narrowing the key's scope.

#6 — Added `RequiredServiceSlugs` field to `SocialMediaTemplateSpec`
for parity with `DailyReportTemplateSpec`; `CreateSocialMediaAgentAsync`
now reads slugs from the spec instead of inlining the list.

Tests:
- 3 new `ClassifyTwitterResponse` tests for the Twitter native error
  shapes (errors-array, RFC-7807 title/detail-only, empty-object
  unexpected-shape).
- Existing social_media test now also asserts `strategy: skip` lands in
  the upserted YAML.
- 482 channel-runtime + 236 workflow.core tests pass; full solution
  builds with 0 errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
eanzhao added a commit that referenced this pull request Apr 27, 2026
…elper

Resolves three followup architecture-review points on PR #451:

### Channel.Runtime drops AI / Workflow direct deps (review #2)

`Channel.Runtime`'s csproj used to pull in `Aevatar.AI.Abstractions` and
`Aevatar.Workflow.Application.Abstractions` because of two files that
straddled the channel/AI and channel/workflow boundary:

- `ChannelContextMiddleware` (an `ILLMCallMiddleware` impl) — moved to
  `Aevatar.GAgents.NyxidChat`, which is the only package that needs it
  and already references `AI.Abstractions`. NyxidChat SCE registers it
  for the LLM call pipeline; Channel.Runtime SCE no longer touches
  `ILLMCallMiddleware`.
- `ChannelCardActionRouting` (builds `WorkflowResumeCommand`) — moved
  to `Aevatar.GAgents.NyxidChat` for the same reason. Its sole consumer
  (`ChannelConversationTurnRunner`) lives there too.

`Channel.Runtime.csproj` now references only `Channel.Abstractions`,
`Foundation.Abstractions`/`Core`, and the `CQRS.Projection.*` slice —
matching the "channel-agnostic flow + projection infrastructure"
charter from the RFC. Tests (`ChannelCardActionRoutingTests`) get the
extra `using Aevatar.GAgents.NyxidChat;`.

### Extract Elasticsearch projection-store toggle helper (review #4)

The `ResolveElasticsearchEnabled` + `BuildElasticsearchOptions` helper
pair was duplicated three times (Channel.Runtime / Device / Scheduled
SCEs) with slightly different log strings and `Console.Error.WriteLine`
output. Centralized into
`Aevatar.CQRS.Projection.Providers.Elasticsearch.DependencyInjection.ElasticsearchProjectionConfiguration`
with two static helpers:

- `IsEnabled(IConfiguration?, ILogger?, string? storeName)` — explicit
  flag → endpoints presence → false; logs a structured warning via
  `ILogger` (when supplied) instead of `Console.Error.WriteLine`.
- `BindOptions(IConfiguration)` — typed binder for
  `ElasticsearchProjectionDocumentStoreOptions`.

All three SCEs now call into this helper; per-package warning text is
parameterized via `storeName`. Section path
(`Projection:Document:Providers:Elasticsearch`) is exposed as a const
so future call sites stay in sync.

### Followup points acknowledged but deferred

- **Cross-package dep chain `NyxidChat → Authoring.Lark → Scheduled →
  Platform.Lark`** (review #1) — pre-existing arch debt that the split
  surfaced rather than introduced. Cleaner would be to invert via
  `IInboundFlowResolver` plug-ins so `ChannelConversationTurnRunner`
  doesn't reach into `AgentBuilderCardFlow` directly. Out of scope for
  the package split; tracking as a separate follow-up.
- **Tombstone compactor "central coordinator"** (review #3) —
  `Channel.Runtime` defines `ITombstoneCompactionTarget` but does not
  reference `Device` / `Scheduled` at the csproj level; per-package
  targets register themselves through DI. The plug-in pattern is
  intentional and keeps the DAG one-way.
- **`Scheduled` package name vs UserAgentCatalog content** (review #5)
  — `UserAgentCatalog` is the delivery-target registry that Scheduled
  agents read at execution time to route output, so co-locating it
  with `SkillRunnerGAgent` / `WorkflowAgentGAgent` is intentional.
  Renaming to `AgentCatalog` would split actors from their primary
  consumer; deferring.

473/473 ChannelRuntime.Tests pass; full slnx still only fails the same
two pre-existing Mainnet hosting `BindAsync on IStudioMemberService`
tests that reproduce on origin/dev.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
eanzhao added a commit that referenced this pull request Apr 29, 2026
…ash-command

Wires up all six pieces the ADR called out as parallel-capable now that
NyxID#549 has shipped, so per-user binding stops requiring a follow-up PR
per layer:

#1 — AuthContext.external_subject (Channel.Abstractions)
    Move ExternalSubjectRef out of Identity.Abstractions into
    Channel.Abstractions where AuthContext lives. Identity is a consumer
    of channel concepts, not their owner; the new typed
    AuthContext.external_subject = 4 field is the broker-mode outbound
    identity carrier (ADR §Outbound Send) and the legacy string
    user_credential_ref keeps working for non-broker callers. Adds a
    new AuthContext.OnBehalfOfExternalSubject helper.

#2 — Projection chain
    ExternalIdentityBindingDocument (proto) + Partial (IProjectionReadModel)
    + MetadataProvider + MaterializationContext + Projector
    (ICurrentStateProjectionMaterializer) + ExternalIdentityBindingProjectionQueryPort
    + ExternalIdentityBindingProjectionReadinessPort (polling impl, write-side
    completion path only — ADR §Projection Readiness explicitly allows this).
    AddChannelIdentityProjection registers the lot.

#3 — NyxIdRemoteCapabilityBroker
    HTTP-backed INyxIdCapabilityBroker + INyxIdBrokerCallbackClient against
    the NyxID#549 wire shape: /oauth/authorize URL building, RFC 8693
    token-exchange with subject_token_type=urn:nyxid:params:oauth:token-type:binding-id,
    /oauth/bindings/{id} delete, authorization-code -> binding_id exchange.
    Maps invalid_grant -> BindingRevokedException so the upper layer
    event-source-revokes the local binding. PkceHelper covers RFC 7636 S256;
    StateTokenCodec seals correlation+verifier+exp into an HMAC token (kid
    header, deterministic Protobuf payload — ADR §Implementation Notes #1).
    AddNyxIdRemoteCapabilityBroker wires HttpClient + options + codec +
    both interfaces.

#4 — /api/oauth/nyxid-callback endpoint (IdentityOAuthEndpoints)
    Decodes state, exchanges code -> binding_id, dispatches CommitBindingCommand
    to ExternalIdentityBindingGAgent via IActorRuntime, waits on the projection
    readiness port, returns a friendly bind-confirmation page (id_token decoded
    locally for the display name — no /oauth/userinfo round-trip per ADR L61).
    Classifies error UX by HTTP status (ADR §Implementation Notes #3):
    400 for state-token issues, 502 for broker exchange failure, 200 with a
    "binding pending propagation" message on projection timeout.

#5 — Slash-command routing in ChannelConversationTurnRunner
    /init and /unbind are handled before the LLM by a new
    TryHandleSlashCommandAsync that resolves identity ports lazily through
    the existing IServiceProvider — deployments without per-user binding
    fall through unchanged. /init replies with the authorize URL (private DM
    only — ADR §Decision); /unbind resolves binding_id, calls
    broker.RevokeBindingAsync, replies with the unbind confirmation. The
    runner short-circuits via the existing SendReplyAsync, so reply
    delivery rides the relay outbound port like any other reply.

#6 — /api/webhooks/nyxid-broker-revocation receiver (Continuous Access Evaluation)
    BrokerRevocationWebhookValidator verifies HMAC-SHA256 over the raw body
    (X-NyxID-Signature: sha256=<hex>), parses the JSON envelope into a
    typed BrokerRevocationNotification, and the endpoint event-source-revokes
    the local binding actor. Aligns with NyxID#549 V2-7's CAE channel — when
    a user revokes from NyxID's console the binding goes inactive in seconds
    rather than waiting for the 5-min access TTL.

Tests:
- 591 ChannelRuntime.Tests pass (39 of which are the new Identity-tagged
  tests covering actor, projector, broker fake, state-token codec, and
  ExternalSubjectRef extension).
- Full solution `dotnet build aevatar.slnx` is clean.
- `tools/docs/lint.sh` 33 file(s) checked, 0 errors.

The ChannelConversationTurnRunner integration is feature-flag-shaped: if
neither IExternalIdentityBindingQueryPort nor INyxIdCapabilityBroker is
registered, the slash-command path is a no-op and existing
bot-owner-shared behaviour is preserved. Production rollout is a DI
toggle (AddChannelIdentityProjection + AddNyxIdRemoteCapabilityBroker
+ MapIdentityOAuthEndpoints) plus the Bot-Owner-Shared termination
strategy choice from the ADR §Bot-Owner-Shared 模式终止策略.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
eanzhao added a commit that referenced this pull request Apr 29, 2026
Five round-2 follow-ups from the Codex consensus:

#5 Orphan binding leak on actor-activation failure (resolved):
  The OAuth callback's already-bound branch revoked the orphan
  binding_id; the actor-activation-failed branch (when the local actor
  cannot be created post-exchange) silently returned 503 and left the
  freshly issued binding_id active at NyxID. Now the same
  TryRevokeOrphanBindingAsync helper runs before the 503 response so
  both leak paths cleanup symmetrically.

#3 NyxIdAuthorityResolver footgun on staging clusters (resolved):
  Resolve() now takes an optional ILogger and emits a warning when
  AEVATAR_NYXID_AUTHORITY is unset AND ASPNETCORE_ENVIRONMENT /
  DOTNET_ENVIRONMENT indicate a non-Development environment. Local dev
  remains silent (env is empty / starts with "dev" / "local"); staging
  / qa / prod operators see the warning so a forgotten env-var doesn't
  silently register clients against production NyxID.

#1 csproj coupling — store wiring belongs to the host (resolved):
  Split AddChannelIdentity into two extension methods. AddChannelIdentity
  now registers actors / projector / broker / slash commands but NOT
  document stores. AddChannelIdentityProjectionStores wires the ES vs
  InMemory choice and is called by the composition root (Mainnet host
  now invokes both methods). Tests / demos can mix and match — agent
  module never decides on the host's behalf which physical store to use.

#2 kid rotation grace window (resolved):
  HMAC key rotation previously invalidated all in-flight state tokens
  (kid hardcoded "v1", no multi-key verification). The actor now carries
  current_kid + previous_kid + previous_demoted_at_unix on its state +
  projection. Encode signs with the snapshot's current kid; decode tries
  current first, then previous when demoted_at + state_token_lifetime
  is still in the future. A v1→v2 rotation produces deterministic kid
  succession (parses + increments the integer suffix) so verifiers can
  route signed tokens to the right key.

#4 HMAC key still in projection (partial — code unchanged, ADR
  hardened):
  The projection-store-as-actor-state-mirror pattern is the established
  shape in this codebase (Channel.Runtime same coupling). Removing the
  hmac_key from the document would require a generic actor query/reply
  pattern that CLAUDE.md explicitly forbids ("禁止 generic actor
  query/reply"). The ADR-0018 §Implementation Notes #1 already
  documents the explicit tradeoff (state_token TTL ≤ 5 min, rotation
  command available, ES index access boundary equals actor event-store
  boundary, KMS migration path noted as a follow-up). The proto
  comment on AevatarOAuthClientDocument now also calls out the
  production access-scoping requirement so a least-privileged ES
  reader can't extract the key. Encryption-at-rest with an envelope
  key from env-var is the natural follow-up if a deployment widens ES
  access beyond the actor event-store boundary.

Tests:
  - StateTokenCodec gains a kid-rotation grace test (decode succeeds
    with previous kid before lifetime expires; fails after).
  - AevatarOAuthClientGAgent gains a rotation-demotion test (verifies
    v1→v2 kid succession + previous-key carry-over) and a first-seed
    test (no previous-key fields populated on initial provision).
  - 800 ChannelRuntime tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
eanzhao added a commit that referenced this pull request Apr 30, 2026
19 inline comments arrived after de82e0a; verified each. Three of them
(#13, #14, #16) point at the HttpClient captive bug already fixed in
de82e0a — those will be answered with a reply. Three are NyxID-side
contract gaps (#15, #18, #19) verified against ~/Code/NyxID HEAD cdfef0a;
those need separate NyxID PRs and will be tracked. The rest are fixed
here:

- /model list (codex MAJOR #11): read owner default from
  context.RegistrationScopeId, not the ambient queryPort overload —
  channel inbound has no Studio HTTP request behind it, so the ambient
  resolver returned `default`/unrelated state. Falls back to ambient
  only when the scope is empty (defensive). Tests pinned.
- StateTokenCodec.TryDecodeAsync (consensus MINOR #10): map
  AevatarOAuthClientNotProvisionedException to a distinct
  state_client_not_provisioned code instead of state_signature_invalid.
  IdentityOAuthEndpoints surfaces a "正在初始化, 30 秒后重试" detail
  for that code, matching the /init handler's cold-start message.
- AevatarOAuthClientBootstrapService (#8, #9):
  - wrap RunWithRetryAsync in RunSafelyAsync that logs any escape so
    the unobserved-task exception sink is no longer the only safety net.
  - StopAsync now catches TimeoutException too: when the host shutdown
    deadline fires before the bootstrap task observes its own
    _stoppingCts cancellation, log + return cleanly instead of leaking
    a noisy trace.
- AevatarOAuthClientGAgent.HandleEnsureProvisioned (#6, #7): document
  why CancellationToken.None is the contract — the framework's
  EventHandlerDiscoverer requires single-parameter handlers, so a
  turn-scoped CT cannot be surfaced. The named HTTP client's per-
  request timeout bounds the worst case during silo shutdown.
- NyxIdRedirectUriResolver (#4): emit a warning when all URL sources
  are unset and the environment is not developer-shaped, parity with
  NyxIdAuthorityResolver's existing fallback warning. Wired through
  bootstrap + broker call sites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>