[refactor-design] iter46 cluster-046-workflow-file-catalog-query-port: Workflow catalog query scans files at request time + singleton caches facts #871

@loning

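Workflow catalog queries scan files at request time and cache business facts

Summary in one paragraph

Today the workflow catalog/capabilities query does not read an already materialized catalog view. Instead, the query method scans directories, reads YAML, and parses workflows on the spot, then caches the result in a singleton's memory. The same query result therefore depends on the file system and local cache that the current process happens to see, not on facts committed by an actor or projection.

In multi-instance deployments, different processes may see different directories and different cache watermarks. The API also returns a locally generated timestamp (GeneratedAtUtc = DateTimeOffset.UtcNow), which leads callers to mistake it for an authoritative refresh time. The source of truth for a workflow definition becomes "whichever process happened to handle this query".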

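Violates:

  • CLAUDE: "Queries always go through the readmodel: external queries read only the readmodel; do not expose actor internal state, state mirror payload, or event replay as the primary query path"
  • CLAUDE: "No query-time replay/priming"
  • CLAUDE: "Middle layers must not maintain in-process maps from entity/actor IDs to context/fact state (Dictionary/ConcurrentDictionary/HashSet/Queue)"
  • CLAUDE: "Single-threaded fact source: do not use lock/Monitor/ConcurrentDictionary as a concurrency patch to maintain fact state"
  • CLAUDE: "Honest queries: the readmodel must expose the authoritative source version or refresh stamp; do not imply strong consistency on weak-read results"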

Evidence

  • src/workflow/Aevatar.Workflow.Infrastructure/Workflows/FileBackedWorkflowCatalogPort.cs:78-80: _cacheLock, _workflowFileDiscoveryCache, _parsedWorkflowCache on singleton port
  • src/workflow/Aevatar.Workflow.Infrastructure/Workflows/FileBackedWorkflowCatalogPort.cs:92-151: ListWorkflowCatalog / GetWorkflowDetail / GetCapabilities discover files, read registry YAML, parse definitions, and load connector config at query time
  • src/workflow/Aevatar.Workflow.Infrastructure/Workflows/FileBackedWorkflowCatalogPort.cs:457-566: discovers files from directories and mutates process caches under lock (_cacheLock)
  • src/workflow/Aevatar.Workflow.Infrastructure/DependencyInjection/ServiceCollectionExtensions.cs:40-44: registers infrastructure singleton as both IWorkflowCatalogPort and IWorkflowCapabilitiesPort
  • src/workflow/Aevatar.Workflow.Application/Queries/WorkflowExecutionQueryApplicationService.cs:12-20: injects those query ports as application read surface

Fix Boundary

Keep scope to the workflow catalog/capabilities read path. Do NOT add new authoring features or change DSL semantics:

  • Replace FileBackedWorkflowCatalogPort as online query source with materialized workflow catalog/capabilities readmodel
  • Move file discovery/import into explicit startup activation, a catalog actor command, or a projection materializer (this decision needs maintainer input: it changes the ownership contract)
  • Query app service should read versioned/freshness-bearing document, not enumerate files or parse YAML in request path
  • Remove process-local _workflowFileDiscoveryCache, _parsedWorkflowCache, _cacheLock from online query port
  • Add guard/tests to prevent Directory.EnumerateFiles, _parser.Parse, AevatarConnectorConfig.LoadConnectors() reappearing inside workflow query ports
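
The guard in the last bullet could be sketched as a plain source scan. This is a minimal illustration, not the project's existing test infrastructure: the class name `WorkflowQueryPortGuard`, the method `FindViolations`, and the `*Port.cs` file pattern are all assumptions made for the example. Only the three banned call names come from the issue text.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical guard: reports query port source files that mention an API
// which belongs in import/materialization code, not in the read path.
public static class WorkflowQueryPortGuard
{
    // Call sites named in the fix boundary.
    private static readonly string[] BannedTokens =
    {
        "Directory.EnumerateFiles",
        "_parser.Parse",
        "AevatarConnectorConfig.LoadConnectors",
    };

    // Returns one "file: token" entry per banned token found in each file.
    // The "*Port.cs" default pattern is an assumption about file naming.
    public static IReadOnlyList<string> FindViolations(
        string rootDirectory, string filePattern = "*Port.cs")
    {
        var violations = new List<string>();
        foreach (var file in Directory.EnumerateFiles(
                     rootDirectory, filePattern, SearchOption.AllDirectories))
        {
            var text = File.ReadAllText(file);
            foreach (var token in BannedTokens)
            {
                if (text.Contains(token, StringComparison.Ordinal))
                {
                    violations.Add($"{Path.GetFileName(file)}: {token}");
                }
            }
        }
        return violations;
    }
}
```

A test would call `FindViolations` on the workflow query port directory and assert that the returned list is empty. A text scan is coarse (it also matches comments), so a Roslyn analyzer might be preferable if false positives become a problem.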

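Why a design decision is needed

The fix is not a mechanical rename; it requires deciding who owns the authoritative facts for file-backed workflows:

  1. WorkflowGAgent owns the definition facts (what the docs currently say)?
  2. A separate WorkflowCatalogActor (new actor; may hit hardcoded trigger Feature/cqrs projection suite #2)?
  3. A startup-time materializer (workflow registry → projection readmodel on app start)?
  4. The choice affects hot reload, user home workflows, repo workflows, the capabilities document version number, and multi-instance deployment consistency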
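
Whichever owner is chosen, multi-instance consistency is easier to reason about if the catalog version is derived from the imported content rather than from the importing process. The sketch below shows one possible way to do that, purely as an illustration and not as a recommendation among the options above. `WorkflowCatalogFingerprint` and its input shape (relative path → raw YAML text) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: derives a deterministic fingerprint from the set of
// workflow files, so two instances that import the same files agree on it.
public static class WorkflowCatalogFingerprint
{
    // files: relative path -> raw YAML text (assumed input shape).
    public static string Compute(IReadOnlyDictionary<string, string> files)
    {
        var builder = new StringBuilder();

        // Sort by path so enumeration order of the dictionary is irrelevant.
        foreach (var pair in files.OrderBy(p => p.Key, StringComparer.Ordinal))
        {
            builder.Append(pair.Key).Append('\n').Append(pair.Value).Append('\0');
        }

        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(builder.ToString()));
        return Convert.ToHexString(hash);
    }
}
```

Two processes that see different directories would produce different fingerprints, which makes the divergence visible to callers instead of hiding it behind a local timestamp.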

Questions to answer

Please answer before auto-loop-resume:

  • Fact owner: WorkflowGAgent per-definition / WorkflowCatalogActor aggregate (new actor warning) / startup materializer / projection-driven?
  • File discovery: one-time startup activation / catalog actor command / projection materializer?
  • Hot reload: supported / not supported (load at startup only)?
  • Multi-source: who is responsible for merging user home + repo + registry?
  • Version/watermark: should GeneratedAtUtc be replaced with a commit version / projection watermark?
  • Scope split: a single cluster, or split into N?
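
To make the version/watermark question concrete, here is a minimal sketch of a query service that copies freshness from a materialized document instead of stamping `DateTimeOffset.UtcNow` at request time. Every type name below (`WorkflowCatalogDocument`, `IWorkflowCatalogReadModel`, `WorkflowCatalogQueryService`, `FixedWorkflowCatalogReadModel`) is hypothetical, and the field set is an assumption. The actual readmodel contract depends on the answers above.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical readmodel document written by whichever owner is chosen.
public sealed record WorkflowCatalogDocument(
    long SourceVersion,               // committed version of the fact owner
    DateTimeOffset MaterializedAtUtc, // when the document was materialized
    IReadOnlyList<string> WorkflowNames);

// Hypothetical read-side abstraction over the projection store.
public interface IWorkflowCatalogReadModel
{
    WorkflowCatalogDocument? GetCurrent();
}

// Response shape: freshness fields are copied from the document,
// never generated in the request path.
public sealed record WorkflowCatalogResponse(
    long SourceVersion,
    DateTimeOffset MaterializedAtUtc,
    IReadOnlyList<string> WorkflowNames);

public sealed class WorkflowCatalogQueryService
{
    private readonly IWorkflowCatalogReadModel _readModel;

    public WorkflowCatalogQueryService(IWorkflowCatalogReadModel readModel)
        => _readModel = readModel;

    // Returns null when nothing has been materialized yet, rather than
    // falling back to scanning files or parsing YAML.
    public WorkflowCatalogResponse? ListWorkflowCatalog()
    {
        var document = _readModel.GetCurrent();
        return document is null
            ? null
            : new WorkflowCatalogResponse(
                document.SourceVersion,
                document.MaterializedAtUtc,
                document.WorkflowNames);
    }
}

// Test double standing in for the real projection store.
public sealed class FixedWorkflowCatalogReadModel : IWorkflowCatalogReadModel
{
    private readonly WorkflowCatalogDocument? _document;

    public FixedWorkflowCatalogReadModel(WorkflowCatalogDocument? document)
        => _document = document;

    public WorkflowCatalogDocument? GetCurrent() => _document;
}
```

With this shape, two instances that read the same document return the same `SourceVersion` and `MaterializedAtUtc`, so a caller can tell a stale answer from a fresh one.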

Auto-loop behavior

phase9-auto-solve; a maintainer comment RESETs the round; auto-loop-resume = the latest comment is taken as the design decision.

📢 cc original authors

@loning @eanzhao

⟦AI:AUTO-LOOP⟧

Labels

  • auto-loop: Created by codex-refactor-loop skill
  • phase9-auto-solve: Operator opted this design issue into Phase 9 auto-solve
  • refactor-design-needed: Cluster flagged requires_design by codex-refactor-loop auto audit
  • 🎉 phase:merged