Skip to content

cluster-007: v1 work-unit 契约(#3 共识)#8

Merged
loning merged 3 commits into
auto-refact-devfrom
refactor/iter2-cluster-007-work-unit-contract-schema
May 25, 2026
Merged

cluster-007: v1 work-unit 契约(#3 共识)#8
loning merged 3 commits into
auto-refact-devfrom
refactor/iter2-cluster-007-work-unit-contract-schema

Conversation

@loning
Copy link
Copy Markdown
Contributor

@loning loning commented May 25, 2026

摘要

实现 issue #3(cluster-007)Phase 9 r3 共识(structural):v1 work-unit 契约 —— clusters_* 保持权威 + 新增 WorkUnitV1 items + WORK_UNIT_ID alias,零迁移(不引入 WorkUnitEnvelopeV1),向后兼容。文档层落于 SKILL.md + REFERENCE.md。

契合 maintainer 指令「重构即开发也可以,别过度泛化」——薄契约、零迁移。

共识链路:#3 r1→r3(converge×2 → 3/3 consensus structural)。完整记录见 issue #3

🤖 Auto-loop · 共识 implement(#3)

⟦AI:AUTO-LOOP⟧

Phase 9 r3 consensus(structural):clusters_* 保持权威 + 新增 WorkUnitV1
items + WORK_UNIT_ID alias,零迁移(契合 maintainer「别过度泛化」)。
文档层引入契约于 SKILL.md + REFERENCE.md,向后兼容。

待 Phase 8 三 reviewer 共识后 merge。

⟦AI:AUTO-LOOP⟧
@loning loning added auto-loop codex-refactor-loop 管理的 issue/PR 🚀 phase:pr-open 👀 phase:reviewing labels May 25, 2026
@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 25, 2026

📊 Phase 8 三 reviewer 共识(不需要人介入)

architect / tests / quality 审 v1 work-unit 契约。三 approve → merge。

🤖 controller status banner

⟦AI:AUTO-LOOP⟧

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 25, 2026

🤖 架构审查通过: v1 WorkUnit 契约未引入新架构债

TL;DR

  • 这是什么:PR 8 的 architect 角度审查,只看 CLAUDE/AGENTS 架构合规。
  • 结论:approve,没有发现会阻塞合并的架构违规。
  • 需要 maintainer 做什么:无需架构侧修改,等待其他 reviewer verdict 后由 controller 汇总。

详细说明

这次变更把 WorkUnitV1 记录为技能文档里的队列元素契约,没有新增运行时 primitive。clusters_planned / clusters_active / clusters_done / clusters_failed 仍是 v1 权威容器,并明确禁止 work_units_* 迁移、envelope wrapper、normalizer helper 和 state-v2 migration。

我按实际三点 diff 和 cluster 共识文件核对了范围。PR 只改 skills/codex-refactor-loop/REFERENCE.mdskills/codex-refactor-loop/SKILL.md; implement summary 里写明 SCOPE_EXTEND: None,prompt 模板改动因硬 scope 被跳过。反模式 grep 没看到 actor 直调、SubscribeAsync<EventEnvelope>、state/payload JSON serializer、生产 sleep/delay([Skip]work_units_*WorkUnitEnvelopeV1


📎 完整 codex 原始输出(存档备查)

pr: 8
role: architect
verdict: approve

Verdict

Approve: no architecture-compliance blocker; the PR keeps the v1 clusters_* queues authoritative, documents WorkUnitV1 without adding a parallel state path, and stays inside the implemented cluster scope.

Evidence

  • skills/codex-refactor-loop/REFERENCE.md:5 defines WorkUnitV1 as a contract in the skill reference, not a new runtime primitive. This maps to AGENTS.md: "唯一产物是 skills/<name>/ 下的 SKILL.md 及其配套文件。"
  • skills/codex-refactor-loop/REFERENCE.md:7 through skills/codex-refactor-loop/REFERENCE.md:10 explicitly keeps clusters_planned, clusters_active, clusters_done, and clusters_failed authoritative and forbids migrated queue containers, envelope wrappers, normalizer helpers, and state-v2 migration. This matches the Phase 9 consensus source and avoids deletion-first regressions.
  • skills/codex-refactor-loop/REFERENCE.md:31 through skills/codex-refactor-loop/REFERENCE.md:39 limits compatibility to current audit-backed units and says non-audit units must not fabricate cluster_id / legacy_cluster_id; no empty forwarding interface, dead wrapper, or parallel pathway is introduced.
  • skills/codex-refactor-loop/SKILL.md:617 and skills/codex-refactor-loop/SKILL.md:632 through skills/codex-refactor-loop/SKILL.md:636 add work_unit_schema_version: 1 while preserving legacy read behavior without migration.
  • skills/codex-refactor-loop/SKILL.md:692 through skills/codex-refactor-loop/SKILL.md:702 normalizes accepted audit clusters into WorkUnitV1 and keeps existing cluster fields only for v1 compatibility.
  • skills/codex-refactor-loop/SKILL.md:780 through skills/codex-refactor-loop/SKILL.md:788 and skills/codex-refactor-loop/SKILL.md:804 through skills/codex-refactor-loop/SKILL.md:806 expose WORK_UNIT_ID=$CLUSTER_ID at implement/verify prompt boundaries without changing markers or artifact routing.
  • skills/codex-refactor-loop/SKILL.md:1582 through skills/codex-refactor-loop/SKILL.md:1585 frames Phase 9 as work-unit design while preserving cluster_id as legacy routing metadata.
  • Scope honesty: the PR changes only skills/codex-refactor-loop/REFERENCE.md and skills/codex-refactor-loop/SKILL.md; implement-cluster-007-work-unit-contract-schema.md documents SCOPE_EXTEND: None and notes prompt template edits were intentionally omitted by the hard scope.
  • Old/New pattern comment check: no refactored production type or method is present in this Markdown-only diff, so the // Refactor (iterN/cluster-XXX): Old pattern: … New principle: … code-comment rule is not triggered.
  • Anti-pattern grep over the actual diff found no matches for actor.HandleEventAsync(, SubscribeAsync<EventEnvelope>, JSON serializer state/payload changes, sleep/delay( production pacing, sync-over-async wording, middle-layer Dictionary<,>, raw HttpClient constructors, [Skip], work_units_*, or WorkUnitEnvelopeV1.
  • No .proto files are touched, no *WriteActor / *ReadActor / *Store split is introduced, and no new external repo reference appears in the diff.

REVIEW_DONE:8:architect:approve
⟦AI:AUTO-LOOP⟧

⟦AI:AUTO-LOOP⟧

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 25, 2026

🤖 tests review: 阻塞,缺少 WorkUnitV1 防回归测试

TL;DR

  • 这是什么:PR 8 的测试质量 review,范围只看测试覆盖和测试稳定性。
  • 结论:reject,因为新增 WorkUnitV1 契约和禁止迁移表面后,没有 checked-in source-regression test 固化这些规则。
  • 需要 maintainer 做什么:补一个读取 SKILL.md / REFERENCE.md 的源文本防回归测试后再重跑 review。

详细说明

这次改动是文档 / schema contract 层,没有新增 sleep / delay 测试节奏,也没有 [Skip]、manual test category、断言放松或 mock-only 伪覆盖。

阻塞点是测试契约本身。WorkUnitV1 是 state queue item 的 v1 契约,PR 明确规定 clusters_* 仍是权威队列,并禁止 work_units_* 迁移、WorkUnitEnvelopeV1、normalizer helper、state-v2 migration。这类“不准回退到另一套框架”的规则,应该进 checked-in source-regression test,而不是只在 implement summary 里跑一次 rg。建议把测试放到 skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.pyProjectRulesPromptContractTests 附近,断言 contract markers 存在、forbidden tokens 不存在。


📎 完整 codex 原始输出(存档备查)
---
pr: 8
role: tests
verdict: reject
---

## Verdict
Reject: the PR is docs/schema-contract only and adds no sleep/skip problems, but it introduces a no-regression contract without a checked-in source-regression test.

## Evidence
- `skills/codex-refactor-loop/REFERENCE.md:7` through `:10` defines the new `WorkUnitV1` contract and explicitly forbids migrated queue containers, envelope wrappers, normalizer helpers, and state-v2 migration for this contract.
- `skills/codex-refactor-loop/SKILL.md:632` through `:636` adds the operational legacy-read rule for `work_unit_schema_version` and `clusters_*` as authoritative v1 queues; `:692` through `:702` adds the audit-to-`WorkUnitV1` normalization and non-audit no-fabricated-`cluster_id` rule.
- No test file changed in the PR, and the existing source-regression test surface in `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:246` through `:269` only protects the prior `PROJECT_RULES` prompt/bootstrap contract. It does not assert the new work-unit contract is present, nor does it assert forbidden migration tokens such as `work_units_planned`, `work_units_active`, `work_units_done`, `work_units_failed`, or `WorkUnitEnvelopeV1` remain absent.
- The implementation summary reports a manual negative grep for the forbidden migration surface, but that check is not committed as a repeatable test. This is exactly the kind of no-regression rule the tests checklist expects to be represented as source-regression assertions.
- Diff scan found no added `sleep`/`delay` test pacing, no `test_polling_allowlist.txt` change, no added `[Skip]` or manual-test category, no loosened existing assertions, and no mock-only pseudo-coverage.

## What would change your verdict
Add a checked-in source-regression test, preferably alongside `ProjectRulesPromptContractTests`, that reads `skills/codex-refactor-loop/REFERENCE.md` and `SKILL.md` and asserts:
- the v1 contract markers are present: `WorkUnitV1`, `work_unit_schema_version`, `work_unit_id == id == cluster_id == legacy_cluster_id`, `WORK_UNIT_ID=$CLUSTER_ID`, and the non-audit rule that `cluster_id` / `legacy_cluster_id` must not be fabricated;
- forbidden migration/framework tokens are absent from the skill surface, including `work_units_planned`, `work_units_active`, `work_units_done`, `work_units_failed`, `WorkUnitEnvelopeV1`, `WorkUnitProducerV1`, and `work_unit_producer.py`.

REVIEW_DONE:8:tests:reject

⟦AI:AUTO-LOOP

⟦AI:AUTO-LOOP⟧

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 25, 2026

🤖 质量评审:comment

TL;DR

  • 这是什么:PR cluster-007: v1 work-unit 契约(#3 共识) #8 的 code quality 角度独立评审。
  • 结论是什么:comment,整体契约薄且聚焦,但有 2 个命名/契约清晰度问题。
  • 需要 maintainer 做什么:建议作者把 legacy 映射和 WORK_UNIT_ID prompt 消费补齐后再合并。

详细说明

WorkUnitV1 是 v1 队列项契约。当前文档在 legacy state 读取处写成 kind=producer=audit,但同一契约又定义 audit-backed unit 为 kind="audit-cluster"producer="audit"。这会让读者不知道 kind 到底是类型还是来源。

WORK_UNIT_ID 是新声明的 canonical prompt identity,也就是 prompt 里代表 work unit 的主身份。问题是 prompts/implement.mdprompts/verify.md 仍只读 ${CLUSTER_ID},现有 render_template() 也只替换 {{cluster_id}}。所以 {{work_unit_id}} 现在像新增了一个公开契约,但没有实际消费者。

建议改法:把 legacy 映射统一写成 kind="audit-cluster", producer="audit";然后二选一,要么把 WORK_UNIT_ID 真正写进 prompt 模板和 renderer,要么先删掉 {{work_unit_id}} / “canonical prompt identity” 说法,等后续 PR 再启用。


📎 完整 codex 原始输出(存档备查)

pr: 8
role: quality
verdict: comment

Verdict

Comment: the WorkUnitV1 shape is mostly readable and focused, but two contract-wording issues will confuse future maintainers.

Evidence

  • skills/codex-refactor-loop/REFERENCE.md:34 and skills/codex-refactor-loop/SKILL.md:635 say legacy state should be treated as kind=producer=audit, while the same contract defines current audit-backed units as kind="audit-cluster" and producer="audit" at REFERENCE.md:31 / SKILL.md:696. That compresses two fields into one value and makes the kind naming ambiguous.
  • skills/codex-refactor-loop/SKILL.md:780 lists {{work_unit_id}} as an implement prompt placeholder and SKILL.md:804 says WORK_UNIT_ID is the canonical prompt identity, but the actual prompt templates still only reference ${CLUSTER_ID} (skills/codex-refactor-loop/prompts/implement.md:1, prompts/verify.md:1) and the existing renderer only substitutes {{cluster_id}} (skills/codex-refactor-loop/scripts/controller_lib.sh:131). As written, the new prompt identity is public surface with no reader, which is dead/confusing contract surface.

What would change your verdict (only if comment or reject)

Spell the legacy mapping as kind="audit-cluster", producer="audit" everywhere, and either wire WORK_UNIT_ID into the prompt templates/renderer or remove {{work_unit_id}} / "canonical prompt identity" wording until a follow-up actually consumes it.

REVIEW_DONE:8:quality:comment

⟦AI:AUTO-LOOP⟧

⟦AI:AUTO-LOOP⟧

1 similar comment
@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 25, 2026

🤖 质量评审:comment

TL;DR

  • 这是什么:PR cluster-007: v1 work-unit 契约(#3 共识) #8 的 code quality 角度独立评审。
  • 结论是什么:comment,整体契约薄且聚焦,但有 2 个命名/契约清晰度问题。
  • 需要 maintainer 做什么:建议作者把 legacy 映射和 WORK_UNIT_ID prompt 消费补齐后再合并。

详细说明

WorkUnitV1 是 v1 队列项契约。当前文档在 legacy state 读取处写成 kind=producer=audit,但同一契约又定义 audit-backed unit 为 kind="audit-cluster"producer="audit"。这会让读者不知道 kind 到底是类型还是来源。

WORK_UNIT_ID 是新声明的 canonical prompt identity,也就是 prompt 里代表 work unit 的主身份。问题是 prompts/implement.mdprompts/verify.md 仍只读 ${CLUSTER_ID},现有 render_template() 也只替换 {{cluster_id}}。所以 {{work_unit_id}} 现在像新增了一个公开契约,但没有实际消费者。

建议改法:把 legacy 映射统一写成 kind="audit-cluster", producer="audit";然后二选一,要么把 WORK_UNIT_ID 真正写进 prompt 模板和 renderer,要么先删掉 {{work_unit_id}} / “canonical prompt identity” 说法,等后续 PR 再启用。


📎 完整 codex 原始输出(存档备查)

pr: 8
role: quality
verdict: comment

Verdict

Comment: the WorkUnitV1 shape is mostly readable and focused, but two contract-wording issues will confuse future maintainers.

Evidence

  • skills/codex-refactor-loop/REFERENCE.md:34 and skills/codex-refactor-loop/SKILL.md:635 say legacy state should be treated as kind=producer=audit, while the same contract defines current audit-backed units as kind="audit-cluster" and producer="audit" at REFERENCE.md:31 / SKILL.md:696. That compresses two fields into one value and makes the kind naming ambiguous.
  • skills/codex-refactor-loop/SKILL.md:780 lists {{work_unit_id}} as an implement prompt placeholder and SKILL.md:804 says WORK_UNIT_ID is the canonical prompt identity, but the actual prompt templates still only reference ${CLUSTER_ID} (skills/codex-refactor-loop/prompts/implement.md:1, prompts/verify.md:1) and the existing renderer only substitutes {{cluster_id}} (skills/codex-refactor-loop/scripts/controller_lib.sh:131). As written, the new prompt identity is public surface with no reader, which is dead/confusing contract surface.

What would change your verdict (only if comment or reject)

Spell the legacy mapping as kind="audit-cluster", producer="audit" everywhere, and either wire WORK_UNIT_ID into the prompt templates/renderer or remove {{work_unit_id}} / "canonical prompt identity" wording until a follow-up actually consumes it.

REVIEW_DONE:8:quality:comment

⟦AI:AUTO-LOOP⟧

⟦AI:AUTO-LOOP⟧

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 25, 2026

📊 Phase 8 fix r1(不需要人介入)

architect approve · tests reject · quality(见下)→ 派 fix codex r1。

🤖 controller status banner

⟦AI:AUTO-LOOP⟧

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 25, 2026

🤖 PR #8 fix-codex round 1 已完成

TL;DR

  • 这是什么: fix-codex 已处理 round 1 的 1 个 reject 和 1 个 comment。
  • 结论: 已补 WorkUnitV1 source-regression 测试,并把 WORK_UNIT_ID 接入 prompt/renderer。
  • 下一步: controller 可提交本轮修复并重新派 Phase 8 三 reviewer。

详细说明

本轮 applied 7 项, rejected 0 项, blocked 0 项。REFERENCE.mdSKILL.md 中 legacy state 的映射已统一为 kind="audit-cluster" + producer="audit",避免把两个字段压成一个值。

另外按 tests reviewer 的 reject 要求,新增 checked-in source-regression 覆盖:它断言 WorkUnitV1 / work_unit_schema_version / WORK_UNIT_ID=$CLUSTER_ID 等契约标记存在,并断言本轮禁止引入的 migration/envelope 表面不出现在 skill surface。prompt 侧也已把 implement/verify/meta-judge 的主身份切到 ${WORK_UNIT_ID},并在 renderer 中支持 {{work_unit_id}} fallback 到 $CLUSTER_ID

验证结果: git diff --check 通过; python3 skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py 通过,20 tests。


📎 完整 codex 原始输出(存档备查)
# Fix report for PR 8 round 1

## Applied
- (A) `skills/codex-refactor-loop/REFERENCE.md:34`: clarified legacy state mapping as `kind="audit-cluster"` plus `producer="audit"` instead of compressed `kind=producer=audit` wording (addresses reviewer:quality's evidence #1).
- (A) `skills/codex-refactor-loop/SKILL.md:635`: clarified the same legacy mapping in the operational bootstrap/read rule (addresses reviewer:quality's evidence #1).
- (B) `skills/codex-refactor-loop/prompts/implement.md:1`: SCOPE_EXTEND because Phase 9 consensus explicitly authorized prompt identity wiring and reviewer:quality found dead prompt identity surface; changed the primary implement heading to `${WORK_UNIT_ID}` and kept `${CLUSTER_ID}` as the v1 audit compatibility alias (addresses reviewer:quality's evidence #2).
- (B) `skills/codex-refactor-loop/prompts/verify.md:1`: SCOPE_EXTEND for the same prompt identity wiring; changed the primary verify heading to `${WORK_UNIT_ID}` and documented `${CLUSTER_ID}` as the compatibility alias for existing artifacts and markers (addresses reviewer:quality's evidence #2).
- (B) `skills/codex-refactor-loop/prompts/meta-judge.md:3`: SCOPE_EXTEND because the Phase 9 consensus plan included work-unit framing for the meta-judge prompt; added `${WORK_UNIT_ID}` while preserving `${CLUSTER_ID}` as legacy routing metadata (addresses reviewer:quality's evidence #2).
- (B) `skills/codex-refactor-loop/scripts/controller_lib.sh:127`: SCOPE_EXTEND because reviewer:quality cited the renderer as only substituting `{{cluster_id}}`; added `{{work_unit_id}}` rendering with fallback to `$CLUSTER_ID` for legacy callers (addresses reviewer:quality's evidence #2).
- (B) `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:272`: SCOPE_EXTEND because reviewer:tests rejected without checked-in source-regression coverage; added `WorkUnitV1SourceRegressionTests` for required contract markers and forbidden migration/framework tokens (addresses reviewer:tests's evidence #1).

## Rejected as false positive
- None.

## Blocked (cannot fix this round)
- None.

## Build status
- build: pass (`true`; `git diff --check`)
- tests: pass (`python3 skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py`, 20 tests)

## Recommendation for next round
- expect unanimous

⟦AI:AUTO-LOOP

⟦AI:AUTO-LOOP⟧

把 v1 work-unit 契约接入 implement/meta-judge/verify prompt + controller_lib
+ 补测试(applied-7 / rejected-0 / blocked-0)。详见 fix-pr8-round-1.md。

⟦AI:AUTO-LOOP⟧
@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 25, 2026

📊 Phase 8 r2 重评审(不需要人介入)

fix r1 applied-7(契约接入 prompt/lib + 测试,commit 95f3974)。请 reviewer 审 scope 是否必需。

🤖 controller status banner

⟦AI:AUTO-LOOP⟧

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 25, 2026

🤖 质量复审: comment

TL;DR

  • 这是什么:PR cluster-007: v1 work-unit 契约(#3 共识) #8 的 readability / simplicity 角度独立复审。
  • 结论:整体命名清晰、范围聚焦,但一个被改动的 shell helper 缺少本 cluster 强制的 Refactor Old/New 自说明块。
  • 需要 maintainer 做什么:建议补一段 # Refactor (iter2/cluster-007-work-unit-contract-schema): 注释后再合并。

详细说明

主要问题在 skills/codex-refactor-loop/scripts/controller_lib.sh:128: render_template 新增了 {{work_unit_id}} 渲染逻辑,也就是把新 work-unit 身份接进 prompt 模板的入口。但这个关键 helper 附近没有本 cluster 要求的 Refactor (iter2/cluster-007-work-unit-contract-schema) Old/New 自说明块。测试里已有对应块,所以问题不是全局缺失,而是可执行 helper 路径漏了。

其余部分质量上没看到阻塞点:WorkUnitV1work_unit_idlegacy_cluster_idkindproducersource_ref 都表达了业务意图;没有新增单实现抽象;diff 也集中在 WorkUnitV1 契约、prompt identity alias 和回归测试。


📎 完整 codex 原始输出(存档备查)
---
pr: 8
role: quality
verdict: comment
---

## Verdict
Comment: the WorkUnitV1 contract reads cleanly and the diff is focused, but the changed shell helper is missing the cluster self-doc block required for refactor work.

## Evidence
- `skills/codex-refactor-loop/scripts/controller_lib.sh:128`: `render_template` gained the `{{work_unit_id}}` substitution at line 131, but this key changed helper has no `# Refactor (iter2/cluster-007-work-unit-contract-schema):` Old/New block. The test class has a clear block at `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:273`, so the issue is isolated to the executable helper path.
- Naming looks acceptable: `WorkUnitV1`, `work_unit_id`, `legacy_cluster_id`, `kind`, `producer`, and `source_ref` describe the contract purpose directly, and the legacy `cluster_id` alias is explicitly scoped in `REFERENCE.md:29`.
- No dead-code or over-engineering smell found in this diff: the only new executable surface is the `{{work_unit_id}}` renderer branch, and it is exercised by `WorkUnitV1SourceRegressionTests.test_work_unit_v1_contract_markers_are_present`.
- No unrelated drive-by cleanup found; touched files all stay inside the WorkUnitV1 contract, prompt identity aliasing, and source-regression coverage.

## What would change your verdict (only if comment or reject)
Add a short `# Refactor (iter2/cluster-007-work-unit-contract-schema):` Old/New block next to `render_template` or immediately above the `{{work_unit_id}}` substitution, explaining why `WORK_UNIT_ID` falls back to `CLUSTER_ID` during v1 compatibility.

REVIEW_DONE:8:quality:comment

⟦AI:AUTO-LOOP

⟦AI:AUTO-LOOP⟧

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 25, 2026

🤖 架构复审: comment-only, WorkUnitV1 边界可合并

TL;DR

  • 这是什么:PR cluster-007: v1 work-unit 契约(#3 共识) #8 的 architect 角度复审,只看 CLAUDE/AGENTS 架构合规。
  • 结论:comment-only,没有发现阻塞合并的架构退化。
  • 下一步:建议补一条 scope 说明,并给 render_template 这个模板渲染 helper 加 Old/New 注释以满足 checklist 洁癖。

详细说明

WorkUnitV1 是 v1 队列元素契约:PR 保持 clusters_* 作为权威容器,没有引入 work_units_* 迁移面或 WorkUnitEnvelopeV1 包装层,这和 issue #3 r3 共识一致。WORK_UNIT_ID 也被作为主身份接入 prompt,同时保留 CLUSTER_ID 作为现有 artifact、marker、branch/worktree 的兼容 alias。

我只标 comment,不 reject。两个点需要在 PR 里说明清楚:第一,controller_lib.sh 和 source-regression test 是支持实现的合理扩展,但 implement summary 只列了 REFERENCE.md/SKILL.md 且写了 SCOPE_EXTEND: None;第二,controller_lib.sh:125-132 改了 render_template 行为,但 helper 附近没有 checklist 要求的 Old/New refactor comment。反模式扫描没有命中 actor.HandleEventAsync(SubscribeAsync<EventEnvelope>、JSON state/payload 序列化、生产 sleep/delay(、sync-over-async、raw HttpClient[Skip]work_units_*WorkUnitEnvelopeV1;本 PR 也没有 .proto 改动。


📎 完整 codex 原始输出(存档备查)
---
pr: 8
role: architect
verdict: comment
---

## Verdict
comment-only: WorkUnitV1 architecture boundary matches the Phase 9 consensus, but the PR should surface two non-blocking compliance nits before merge.

## Evidence
- `skills/codex-refactor-loop/scripts/controller_lib.sh:125-132` changes executable prompt-rendering behavior for `{{work_unit_id}}`, but the changed helper has only a generic usage comment, not the required Old/New refactor comment. Checklist clause: "Old/New pattern comment: each refactored type/method has `// Refactor (iterN/cluster-XXX): Old pattern: …  New principle: …`. Missing or vague → comment." This is not a reject because the changed behavior is a small template substitution helper and the new source-regression test class does carry a proper `Refactor (iter2/cluster-007-work-unit-contract-schema)` Old/New comment.
- `skills/codex-refactor-loop/scripts/controller_lib.sh:125-132` and `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:272-316` are net-new PR changes outside the Phase 9 winning plan's file list (`REFERENCE.md`, `SKILL.md`, `prompts/implement.md`, `prompts/verify.md`, `prompts/meta-judge.md`) and outside the implementation summary's stated modified files (`REFERENCE.md`, `SKILL.md`), while the summary says `SCOPE_EXTEND: None`. Checklist clause: "Scope honesty: diff stays within the cluster's declared `scope_paths` (or has a documented SCOPE_EXTEND in implement summary). Diff drift → comment." Architecturally the additions are aligned with AGENTS' skill rule, "重型参考拆 `REFERENCE.md`;脚本放 `scripts/`;prompt 模板放 `prompts/`。" and with the behavior-test requirement, so this is accounting drift rather than a blocking architecture violation.
- Positive checks: the WorkUnitV1 contract keeps `clusters_*` authoritative and explicitly blocks migrated queue containers or envelope wrappers at `skills/codex-refactor-loop/REFERENCE.md:5-39`; Phase 1 normalization preserves audit compatibility and forbids fabricated non-audit cluster ids at `skills/codex-refactor-loop/SKILL.md:693-703`; prompt identity has been split to `${WORK_UNIT_ID}` with `${CLUSTER_ID}` as v1 compatibility alias at `skills/codex-refactor-loop/prompts/implement.md:1-4`, `skills/codex-refactor-loop/prompts/verify.md:1-4`, and `skills/codex-refactor-loop/prompts/meta-judge.md:1-3`.
- Anti-pattern scan over the PR diff found no `actor.HandleEventAsync(`, `SubscribeAsync<EventEnvelope>`, JSON serializer state/payload surface, production `sleep/delay(`, sync-over-async, raw `HttpClient` constructor, `[Skip]`, `work_units_*`, or `WorkUnitEnvelopeV1`; there are no `.proto` changes.

## What would change your verdict
Document the scope extension in the implement summary/PR comment, or keep the code/test additions but explicitly mark them as implementation-supporting scope expansion; add a short Old/New `Refactor (iter2/cluster-007-work-unit-contract-schema)` comment above the changed `render_template` helper if the team wants strict checklist cleanliness.

REVIEW_DONE:8:architect:comment
⟦AI:AUTO-LOOP

⟦AI:AUTO-LOOP⟧

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 25, 2026

🤖 Tests review: reject, 缺少 render_template 行为测试

TL;DR

  • 这是什么:PR cluster-007: v1 work-unit 契约(#3 共识) #8 的测试质量复审,只看测试覆盖和测试质量。
  • 结论:reject,因为新增的 {{work_unit_id}} 渲染逻辑没有行为测试覆盖。
  • 需要 maintainer 做什么:补一个真正调用 render_template 的测试,覆盖 WORK_UNIT_ID 优先级和回退到 CLUSTER_ID

详细说明

render_template 是 controller 生成 actor prompt 的 shell helper,这次在 skills/codex-refactor-loop/scripts/controller_lib.sh:131 增加了 {{work_unit_id}} 替换,并要求 WORK_UNIT_ID 优先、缺失时回退到 CLUSTER_ID。这是可执行路径,不是纯文档。

当前新增测试在 skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:276 只把源码拼起来检查 marker 字符串存在。这个 source-regression guard 有价值,但抓不到实际渲染失败,比如输出里还残留 {{work_unit_id}}WORK_UNIT_ID 没有优先、或回退到 CLUSTER_ID 被破坏。test_work_unit_v1_forbidden_migration_surface_is_absent 对 forbidden token 的断言是合格的,我也没有看到新增 sleep/delay[Skip]、manual trait 或弱化断言。

建议补的最小测试:source controller_lib.sh,用临时模板包含 {{work_unit_id}},分别断言 WORK_UNIT_ID=unit-123, CLUSTER_ID=cluster-007 渲染为 unit-123;WORK_UNIT_ID 未设置时渲染为 cluster-007;输出不残留 {{work_unit_id}}


📎 完整 codex 原始输出(存档备查)
    ---
    pr: 8
    role: tests
    verdict: reject
    ---
    
    ## Verdict
    Reject: PR 8 adds net-new `render_template` behavior for `{{work_unit_id}}`, but the added tests only inspect source markers and never exercise the rendered output contract.
    
    ## Evidence
    - `skills/codex-refactor-loop/scripts/controller_lib.sh:131` adds the new `{{work_unit_id}}` substitution with fallback from `WORK_UNIT_ID` to `CLUSTER_ID`; this is executable controller behavior, not pure documentation.
    - `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:276` only concatenates source files and asserts marker strings such as the literal perl replacement are present. This would not catch runtime failures in the render pipeline, such as `{{work_unit_id}}` staying unresolved, `WORK_UNIT_ID` not taking precedence, or fallback to `CLUSTER_ID` failing after `envsubst`.
    - `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:298` is a valid source-regression guard for forbidden migration-surface tokens, and I did not find added `sleep/delay`, `[Skip]`, manual traits, or loosened existing assertions in the PR diff. The blocker is specifically missing behavior coverage for the new renderer logic.
    
    ## What would change your verdict
    Add behavior tests that source `controller_lib.sh`, render a temporary template containing `{{work_unit_id}}`, and assert:
    - when `WORK_UNIT_ID=unit-123` and `CLUSTER_ID=cluster-007`, the rendered output uses `unit-123`;
    - when `WORK_UNIT_ID` is unset and `CLUSTER_ID=cluster-007`, the rendered output falls back to `cluster-007`;
    - no unresolved `{{work_unit_id}}` remains in the rendered prompt.
    
    REVIEW_DONE:8:tests:reject
    
    ⟦AI:AUTO-LOOP⟧

⟦AI:AUTO-LOOP⟧

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 25, 2026

📊 Phase 8 fix r2(窄修,不需要人介入)

tests r2 reject(新理由,有进展)→ fix r2 窄修(只针对 tests r2,防 scope-runaway)。

🤖 controller status banner

⟦AI:AUTO-LOOP⟧

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 25, 2026

🤖 fix r2 已补 renderer 行为测试

TL;DR

  • 这是什么: PR cluster-007: v1 work-unit 契约(#3 共识) #8 round 2 fix-codex 针对 tests reviewer 的唯一 reject 做最小修复。
  • 现在结论: 已新增 render_template 真实渲染测试,覆盖 WORK_UNIT_ID 优先、CLUSTER_ID fallback、无残留 {{work_unit_id}}
  • 下一步: controller 可提交本轮改动并重新派 Phase 8 reviewers。

详细说明

本轮按“只修 tests 点名 reject”的约束处理。新增测试会 source skills/codex-refactor-loop/scripts/controller_lib.sh,用临时模板跑 render_template,直接检查渲染结果,不再只看源码 marker。验证结果: python3 skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py 通过 22 个 unittest;git diff --check 通过。

architect / quality 的 Old/New 注释和 scope accounting 都是 comment-only,不是本轮 blocker;为避免 scope-runaway,本轮没有改 controller_lib.sh 或 implement summary。


📎 完整 codex 原始输出(存档备查)

Fix report for PR 8 round 2

Applied

Rejected as false positive

  • None.

Blocked (cannot fix this round)

  • None. The architect and quality findings were comment-only demands about scope accounting and a render_template Old/New self-doc block. Per the round-2 narrow-fix constraint, they were not blockers and were not expanded in this pass.

Build status

  • build: pass (git diff --check -- skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py)
  • tests: pass (python3 skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py, 22 tests)

Recommendation for next round

  • expect unanimous

⟦AI:AUTO-LOOP⟧

⟦AI:AUTO-LOOP⟧

只针对 tests r2 点名 gap 最小修复(test_ensure_project_rules_fixed_points.py),
未扩 scope。详见 fix-pr8-round-2.md。

⟦AI:AUTO-LOOP⟧
@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 25, 2026

📊 Phase 8 r3 重评审(不需要人介入)

fix r2 applied-2(窄修,commit 91fd48f)。预期 tests approve → merge。

🤖 controller status banner

⟦AI:AUTO-LOOP⟧

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 25, 2026

🤖 tests review r3: PR #8 测试质量通过

TL;DR

  • 这是什么:PR cluster-007: v1 work-unit 契约(#3 共识) #8 的 tests 视角复审,只看测试覆盖和测试质量。
  • 结论是什么:approve,新增测试覆盖了 WorkUnitV1 契约回归和 render_template 的新行为分支。
  • 下一步:controller 可以把 tests reviewer 这一票视为通过。

详细说明

这轮实际打开了 PR diff 和新增测试文件。WorkUnitV1 是 v1 work-unit 队列记录契约,测试已经检查契约标记仍在 skill/reference/prompt/controller 中,也检查禁止提前引入 work_units_* 迁移容器、envelope、producer helper 这些回归面。

可执行逻辑方面,render_template 是 controller 模板渲染 helper,新增 {{work_unit_id}} 替换逻辑。测试分别覆盖了 WORK_UNIT_ID 优先于 CLUSTER_ID、以及未设置 WORK_UNIT_ID 时回退到 CLUSTER_ID。没有新增 [Skip]、Manual trait、sleep/delay 测试节奏,也没有 polling allowlist 改动。已在 PR head 归档副本里跑过 python3 skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py,22 个测试通过。


📎 完整 codex 原始输出(存档备查)
---
pr: 8
role: tests
verdict: approve
---

## Verdict
Approve: the PR adds focused source-regression coverage for the WorkUnitV1 contract and behavior tests for the only new executable branch in `render_template`.

## Evidence
- `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:315` `test_work_unit_v1_contract_markers_are_present` asserts the required WorkUnitV1 identity/schema markers across the skill, reference, prompts, and controller helper instead of only bumping line count.
- `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:337` `test_work_unit_v1_forbidden_migration_surface_is_absent` provides source-regression assertions for the cluster's no-premature-migration rule, including `work_units_planned`, `work_units_active`, `work_units_done`, `work_units_failed`, envelope, and producer-helper surfaces.
- `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:357` `test_render_template_prefers_work_unit_id_over_cluster_alias` covers the new `controller_lib.sh:131` branch where `{{work_unit_id}}` renders from `WORK_UNIT_ID` and keeps `{{cluster_id}}` as the legacy alias.
- `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:364` `test_render_template_falls_back_to_cluster_id_when_work_unit_id_is_unset` covers the fallback branch from `WORK_UNIT_ID` to `CLUSTER_ID`.
- Search of the PR diff found no added `[Skip]`, `[Trait("Category","Manual")]`, `Thread.Sleep`, `Task.Delay`, or test pacing sleeps; no polling allowlist changes were added.
- Verified the PR-head test file runs cleanly: `python3 skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py` ran 22 tests and passed.

REVIEW_DONE:8:tests:approve
⟦AI:AUTO-LOOP

⟦AI:AUTO-LOOP⟧

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 25, 2026

🤖 quality review: comment

TL;DR

  • 这是什么:PR 8 的代码质量复审,只看命名、可读性、复杂度、死代码和范围控制。
  • 结论:comment,整体 WorkUnitV1 契约清楚且聚焦,但有 1 个测试断言绑死了具体 Perl 实现字符串。
  • 需要 maintainer 做什么:建议把该断言改成业务语义级 marker,保留现有行为测试作为真正保护。

详细说明

test_work_unit_v1_contract_markers_are_present 现在要求 controller_lib.sh 里出现完整替换表达式 s/\{\{work_unit_id\}\}/($ENV{WORK_UNIT_ID} || $ENV{CLUSTER_ID})/ge。这不是业务契约,而是 helper 的内部实现写法。旁边已经有两个行为测试覆盖了真实意图:优先使用 WORK_UNIT_ID,缺省时回退到 CLUSTER_ID

建议删掉这个 exact string marker,或换成文档/源码里的业务级说明,例如“render_template 支持 {{work_unit_id}} 并回退到 CLUSTER_ID”。这样测试仍能防回归,但不会因为实现从 Perl 换成别的等价写法而无意义失败。


📎 完整 codex 原始输出(存档备查)
---
pr: 8
role: quality
verdict: comment
---

## Verdict
Comment: the WorkUnitV1 contract is readable and focused, but one new source-regression assertion is more brittle than the behavior it protects.

## Evidence
- `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:323`-`330`: `test_work_unit_v1_contract_markers_are_present` requires the exact implementation string `s/\{\{work_unit_id\}\}/($ENV{WORK_UNIT_ID} || $ENV{CLUSTER_ID})/ge` to appear in `controller_lib.sh`. That duplicates implementation detail instead of business intent. The same file already has behavior-level coverage at `test_render_template_prefers_work_unit_id_over_cluster_alias` and `test_render_template_falls_back_to_cluster_id_when_work_unit_id_is_unset`, so this exact-substitution marker adds brittleness without improving readability or contract clarity.

## What would change your verdict (only if comment or reject)
Remove the exact Perl substitution marker from `required_markers`, or replace it with a business-level marker such as "render_template supports {{work_unit_id}} with CLUSTER_ID fallback" in the docs/source. Keep the two render behavior tests as the real guard for the helper behavior.

REVIEW_DONE:8:quality:comment

⟦AI:AUTO-LOOP

⟦AI:AUTO-LOOP⟧

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 25, 2026

🤖 架构复审:comment-only,契约方向通过但需补两处合规说明

TL;DR

  • 这是什么: PR cluster-007: v1 work-unit 契约(#3 共识) #8 的 architect 角度复审,只看 CLAUDE/AGENTS 架构合规。
  • 结论是什么: comment-only,不阻塞;WorkUnitV1 契约与 Phase 9 共识一致。
  • 需要维护者做什么: 建议补 render_template 的 Old/New refactor 注释,并把 prompt/script/test 扩展写入实施摘要或 PR body。

详细说明

WorkUnitV1 是 v1 队列条目的标准契约。当前 diff 保持 clusters_plannedclusters_activeclusters_doneclusters_failed 为权威队列,没有引入 work_units_* 迁移、WorkUnitEnvelopeV1 包装层或 state-v2 路线,这和 issue #3 的 Phase 9 共识一致。

我标了两个 comment-only 项。第一,skills/codex-refactor-loop/scripts/controller_lib.sh:131render_template 增加了 {{work_unit_id}} 渲染,但这个被改动的函数缺少要求的 Old/New refactor 注释。第二,实际 diff 还改了 prompt、script、test 文件,而现有 implementation summary 仍只列 REFERENCE.mdSKILL.md 且写 SCOPE_EXTEND: None;这些扩展本身可由 Phase 9 共识覆盖,但需要在摘要或 PR body 里讲清楚。

已检查: 没有 WorkUnitEnvelopeV1 / work_units_* 权威迁移面,没有 disabled tests,没有 proto 改动,没有外部仓库依赖;本地跑 python3 skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py 为 18 tests OK。


📎 完整 codex 原始输出(存档备查)

pr: 8
role: architect
verdict: comment

Verdict

comment-only: the WorkUnitV1 contract matches the Phase 9 decision, but the PR should document one scope drift and one missing Old/New refactor comment before merge hygiene is ideal.

Evidence

  • skills/codex-refactor-loop/REFERENCE.md:5 and skills/codex-refactor-loop/REFERENCE.md:7 define WorkUnitV1 inside the existing clusters_planned, clusters_active, clusters_done, and clusters_failed queues; skills/codex-refactor-loop/REFERENCE.md:9 explicitly forbids migrated queue containers/envelopes/state-v2. This maps to the authorized cluster decision: "v1 继续把 clusters_plannedclusters_activeclusters_doneclusters_failed 作为权威持久队列; 队列元素定义为 WorkUnitV1". No reject.
  • skills/codex-refactor-loop/SKILL.md:693 normalizes accepted audit clusters into WorkUnitV1, and skills/codex-refactor-loop/SKILL.md:697 preserves work_unit_id == id == cluster_id == legacy_cluster_id for audit-backed units. This maps to the authorized cluster decision: "audit compatibility: for existing audit units, work_unit_id == id == cluster_id == legacy_cluster_id". No reject.
  • skills/codex-refactor-loop/scripts/controller_lib.sh:128 changes the render_template shell function by adding {{work_unit_id}} handling at skills/codex-refactor-loop/scripts/controller_lib.sh:131, but the changed function only has the generic usage comment at skills/codex-refactor-loop/scripts/controller_lib.sh:125; it lacks the expected # Refactor (iter2/cluster-007-work-unit-contract-schema): Old pattern: ... New principle: ... comment. This is a checklist comment, not a reject, because AGENTS only requires that skill edits follow the TDD discipline and does not make this exact comment syntax a repository-wide hard clause.
  • Scope honesty is slightly muddy: .refactor-loop/runs/implement-cluster-007-work-unit-contract-schema.md lists only skills/codex-refactor-loop/REFERENCE.md and skills/codex-refactor-loop/SKILL.md under modified files and says SCOPE_EXTEND: None, while the actual PR also changes skills/codex-refactor-loop/prompts/implement.md:1, skills/codex-refactor-loop/prompts/verify.md:1, skills/codex-refactor-loop/prompts/meta-judge.md:3, skills/codex-refactor-loop/scripts/controller_lib.sh:131, and skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:272. This is comment-only because the Phase 9 decision itself listed the prompt files and behavior-oriented verification, and AGENTS says "例外:经 Phase 9 deep consensus 明确授权的 host-agnostic bootstrap / policy 注入修正可先落地,但不得引入具体 host 事实,且必须配套行为测试。"
  • Anti-pattern grep over the PR diff found no actor.HandleEventAsync(, SubscribeAsync<EventEnvelope>, production sleep/delay, sync-over-async, raw HttpClient, disabled [Skip] tests, WorkUnitEnvelopeV1, or authoritative work_units_* migration surface. No proto files are touched, no external repo dependency is introduced, and there is no actor split.
  • python3 skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py passed locally: 18 tests, OK.

What would change your verdict (only if comment or reject)

Add the required Old/New refactor comment above render_template, and update the implementation summary or PR body to acknowledge the prompt/script/test file expansion as authorized support for the Phase 9 WorkUnitV1 decision.

REVIEW_DONE:8:architect:comment
⟦AI:AUTO-LOOP⟧

⟦AI:AUTO-LOOP⟧

@loning
Copy link
Copy Markdown
Contributor Author

loning commented May 25, 2026

✅ Phase 8 共识达成 — 自动合并

r3 最终:architect comment · tests approve · quality comment(0 reject)。收敛:r1/r2 tests reject → fix r1(applied-7)→ fix r2 窄修(applied-2)→ r3 tests approve。

合并到 auto-refact-dev,关联 #3 自动关闭。

🤖 controller status banner

⟦AI:AUTO-LOOP⟧

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-loop codex-refactor-loop 管理的 issue/PR 🎉 phase:merged 🚀 phase:pr-open

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant