cluster-007: v1 work-unit 契约(#3 共识)#8
Conversation
Phase 9 r3 consensus(structural):clusters_* 保持权威 + 新增 WorkUnitV1 items + WORK_UNIT_ID alias,零迁移(契合 maintainer「别过度泛化」)。 文档层引入契约于 SKILL.md + REFERENCE.md,向后兼容。 待 Phase 8 三 reviewer 共识后 merge。 ⟦AI:AUTO-LOOP⟧
📊 Phase 8 三 reviewer 共识(不需要人介入)architect / tests / quality 审 v1 work-unit 契约。三 approve → merge。 🤖 controller status banner ⟦AI:AUTO-LOOP⟧ |
🤖 架构审查通过: v1 WorkUnit 契约未引入新架构债TL;DR
详细说明这次变更把 我按实际三点 diff 和 cluster 共识文件核对了范围。PR 只改 📎 完整 codex 原始输出(存档备查)pr: 8
|
🤖 tests review: 阻塞,缺少 WorkUnitV1 防回归测试TL;DR
详细说明这次改动是文档 / schema contract 层,没有新增 阻塞点是测试契约本身。 📎 完整 codex 原始输出(存档备查)---
pr: 8
role: tests
verdict: reject
---
## Verdict
Reject: the PR is docs/schema-contract only and adds no sleep/skip problems, but it introduces a no-regression contract without a checked-in source-regression test.
## Evidence
- `skills/codex-refactor-loop/REFERENCE.md:7` through `:10` defines the new `WorkUnitV1` contract and explicitly forbids migrated queue containers, envelope wrappers, normalizer helpers, and state-v2 migration for this contract.
- `skills/codex-refactor-loop/SKILL.md:632` through `:636` adds the operational legacy-read rule for `work_unit_schema_version` and `clusters_*` as authoritative v1 queues; `:692` through `:702` adds the audit-to-`WorkUnitV1` normalization and non-audit no-fabricated-`cluster_id` rule.
- No test file changed in the PR, and the existing source-regression test surface in `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:246` through `:269` only protects the prior `PROJECT_RULES` prompt/bootstrap contract. It does not assert the new work-unit contract is present, nor does it assert forbidden migration tokens such as `work_units_planned`, `work_units_active`, `work_units_done`, `work_units_failed`, or `WorkUnitEnvelopeV1` remain absent.
- The implementation summary reports a manual negative grep for the forbidden migration surface, but that check is not committed as a repeatable test. This is exactly the kind of no-regression rule the tests checklist expects to be represented as source-regression assertions.
- Diff scan found no added `sleep`/`delay` test pacing, no `test_polling_allowlist.txt` change, no added `[Skip]` or manual-test category, no loosened existing assertions, and no mock-only pseudo-coverage.
## What would change your verdict
Add a checked-in source-regression test, preferably alongside `ProjectRulesPromptContractTests`, that reads `skills/codex-refactor-loop/REFERENCE.md` and `SKILL.md` and asserts:
- the v1 contract markers are present: `WorkUnitV1`, `work_unit_schema_version`, `work_unit_id == id == cluster_id == legacy_cluster_id`, `WORK_UNIT_ID=$CLUSTER_ID`, and the non-audit rule that `cluster_id` / `legacy_cluster_id` must not be fabricated;
- forbidden migration/framework tokens are absent from the skill surface, including `work_units_planned`, `work_units_active`, `work_units_done`, `work_units_failed`, `WorkUnitEnvelopeV1`, `WorkUnitProducerV1`, and `work_unit_producer.py`.
REVIEW_DONE:8:tests:reject
⟦AI:AUTO-LOOP⟧⟦AI:AUTO-LOOP⟧ |
🤖 质量评审:commentTL;DR
详细说明WorkUnitV1 是 v1 队列项契约。当前文档在 legacy state 读取处写成
建议改法:把 legacy 映射统一写成 📎 完整 codex 原始输出(存档备查)pr: 8
|
1 similar comment
🤖 质量评审:commentTL;DR
详细说明WorkUnitV1 是 v1 队列项契约。当前文档在 legacy state 读取处写成
建议改法:把 legacy 映射统一写成 📎 完整 codex 原始输出(存档备查)pr: 8
|
📊 Phase 8 fix r1(不需要人介入)architect approve · tests reject · quality(见下)→ 派 fix codex r1。 🤖 controller status banner ⟦AI:AUTO-LOOP⟧ |
🤖 PR #8 fix-codex round 1 已完成TL;DR
详细说明本轮 applied 7 项, rejected 0 项, blocked 0 项。 另外按 tests reviewer 的 reject 要求,新增 checked-in source-regression 覆盖:它断言 验证结果: 📎 完整 codex 原始输出(存档备查)# Fix report for PR 8 round 1
## Applied
- (A) `skills/codex-refactor-loop/REFERENCE.md:34`: clarified legacy state mapping as `kind="audit-cluster"` plus `producer="audit"` instead of compressed `kind=producer=audit` wording (addresses reviewer:quality's evidence #1).
- (A) `skills/codex-refactor-loop/SKILL.md:635`: clarified the same legacy mapping in the operational bootstrap/read rule (addresses reviewer:quality's evidence #1).
- (B) `skills/codex-refactor-loop/prompts/implement.md:1`: SCOPE_EXTEND because Phase 9 consensus explicitly authorized prompt identity wiring and reviewer:quality found dead prompt identity surface; changed the primary implement heading to `${WORK_UNIT_ID}` and kept `${CLUSTER_ID}` as the v1 audit compatibility alias (addresses reviewer:quality's evidence #2).
- (B) `skills/codex-refactor-loop/prompts/verify.md:1`: SCOPE_EXTEND for the same prompt identity wiring; changed the primary verify heading to `${WORK_UNIT_ID}` and documented `${CLUSTER_ID}` as the compatibility alias for existing artifacts and markers (addresses reviewer:quality's evidence #2).
- (B) `skills/codex-refactor-loop/prompts/meta-judge.md:3`: SCOPE_EXTEND because the Phase 9 consensus plan included work-unit framing for the meta-judge prompt; added `${WORK_UNIT_ID}` while preserving `${CLUSTER_ID}` as legacy routing metadata (addresses reviewer:quality's evidence #2).
- (B) `skills/codex-refactor-loop/scripts/controller_lib.sh:127`: SCOPE_EXTEND because reviewer:quality cited the renderer as only substituting `{{cluster_id}}`; added `{{work_unit_id}}` rendering with fallback to `$CLUSTER_ID` for legacy callers (addresses reviewer:quality's evidence #2).
- (B) `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:272`: SCOPE_EXTEND because reviewer:tests rejected without checked-in source-regression coverage; added `WorkUnitV1SourceRegressionTests` for required contract markers and forbidden migration/framework tokens (addresses reviewer:tests's evidence #1).
## Rejected as false positive
- None.
## Blocked (cannot fix this round)
- None.
## Build status
- build: pass (`true`; `git diff --check`)
- tests: pass (`python3 skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py`, 20 tests)
## Recommendation for next round
- expect unanimous
⟦AI:AUTO-LOOP⟧⟦AI:AUTO-LOOP⟧ |
把 v1 work-unit 契约接入 implement/meta-judge/verify prompt + controller_lib + 补测试(applied-7 / rejected-0 / blocked-0)。详见 fix-pr8-round-1.md。 ⟦AI:AUTO-LOOP⟧
📊 Phase 8 r2 重评审(不需要人介入)fix r1 applied-7(契约接入 prompt/lib + 测试,commit 95f3974)。请 reviewer 审 scope 是否必需。 🤖 controller status banner ⟦AI:AUTO-LOOP⟧ |
🤖 质量复审: commentTL;DR
详细说明主要问题在 其余部分质量上没看到阻塞点: 📎 完整 codex 原始输出(存档备查)---
pr: 8
role: quality
verdict: comment
---
## Verdict
Comment: the WorkUnitV1 contract reads cleanly and the diff is focused, but the changed shell helper is missing the cluster self-doc block required for refactor work.
## Evidence
- `skills/codex-refactor-loop/scripts/controller_lib.sh:128`: `render_template` gained the `{{work_unit_id}}` substitution at line 131, but this key changed helper has no `# Refactor (iter2/cluster-007-work-unit-contract-schema):` Old/New block. The test class has a clear block at `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:273`, so the issue is isolated to the executable helper path.
- Naming looks acceptable: `WorkUnitV1`, `work_unit_id`, `legacy_cluster_id`, `kind`, `producer`, and `source_ref` describe the contract purpose directly, and the legacy `cluster_id` alias is explicitly scoped in `REFERENCE.md:29`.
- No dead-code or over-engineering smell found in this diff: the only new executable surface is the `{{work_unit_id}}` renderer branch, and it is exercised by `WorkUnitV1SourceRegressionTests.test_work_unit_v1_contract_markers_are_present`.
- No unrelated drive-by cleanup found; touched files all stay inside the WorkUnitV1 contract, prompt identity aliasing, and source-regression coverage.
## What would change your verdict (only if comment or reject)
Add a short `# Refactor (iter2/cluster-007-work-unit-contract-schema):` Old/New block next to `render_template` or immediately above the `{{work_unit_id}}` substitution, explaining why `WORK_UNIT_ID` falls back to `CLUSTER_ID` during v1 compatibility.
REVIEW_DONE:8:quality:comment
⟦AI:AUTO-LOOP⟧⟦AI:AUTO-LOOP⟧ |
🤖 架构复审: comment-only, WorkUnitV1 边界可合并TL;DR
详细说明
我只标 comment,不 reject。两个点需要在 PR 里说明清楚:第一, 📎 完整 codex 原始输出(存档备查)---
pr: 8
role: architect
verdict: comment
---
## Verdict
comment-only: WorkUnitV1 architecture boundary matches the Phase 9 consensus, but the PR should surface two non-blocking compliance nits before merge.
## Evidence
- `skills/codex-refactor-loop/scripts/controller_lib.sh:125-132` changes executable prompt-rendering behavior for `{{work_unit_id}}`, but the changed helper has only a generic usage comment, not the required Old/New refactor comment. Checklist clause: "Old/New pattern comment: each refactored type/method has `// Refactor (iterN/cluster-XXX): Old pattern: … New principle: …`. Missing or vague → comment." This is not a reject because the changed behavior is a small template substitution helper and the new source-regression test class does carry a proper `Refactor (iter2/cluster-007-work-unit-contract-schema)` Old/New comment.
- `skills/codex-refactor-loop/scripts/controller_lib.sh:125-132` and `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:272-316` are net-new PR changes outside the Phase 9 winning plan's file list (`REFERENCE.md`, `SKILL.md`, `prompts/implement.md`, `prompts/verify.md`, `prompts/meta-judge.md`) and outside the implementation summary's stated modified files (`REFERENCE.md`, `SKILL.md`), while the summary says `SCOPE_EXTEND: None`. Checklist clause: "Scope honesty: diff stays within the cluster's declared `scope_paths` (or has a documented SCOPE_EXTEND in implement summary). Diff drift → comment." Architecturally the additions are aligned with AGENTS' skill rule, "重型参考拆 `REFERENCE.md`;脚本放 `scripts/`;prompt 模板放 `prompts/`。" and with the behavior-test requirement, so this is accounting drift rather than a blocking architecture violation.
- Positive checks: the WorkUnitV1 contract keeps `clusters_*` authoritative and explicitly blocks migrated queue containers or envelope wrappers at `skills/codex-refactor-loop/REFERENCE.md:5-39`; Phase 1 normalization preserves audit compatibility and forbids fabricated non-audit cluster ids at `skills/codex-refactor-loop/SKILL.md:693-703`; prompt identity has been split to `${WORK_UNIT_ID}` with `${CLUSTER_ID}` as v1 compatibility alias at `skills/codex-refactor-loop/prompts/implement.md:1-4`, `skills/codex-refactor-loop/prompts/verify.md:1-4`, and `skills/codex-refactor-loop/prompts/meta-judge.md:1-3`.
- Anti-pattern scan over the PR diff found no `actor.HandleEventAsync(`, `SubscribeAsync<EventEnvelope>`, JSON serializer state/payload surface, production `sleep/delay(`, sync-over-async, raw `HttpClient` constructor, `[Skip]`, `work_units_*`, or `WorkUnitEnvelopeV1`; there are no `.proto` changes.
## What would change your verdict
Document the scope extension in the implement summary/PR comment, or keep the code/test additions but explicitly mark them as implementation-supporting scope expansion; add a short Old/New `Refactor (iter2/cluster-007-work-unit-contract-schema)` comment above the changed `render_template` helper if the team wants strict checklist cleanliness.
REVIEW_DONE:8:architect:comment
⟦AI:AUTO-LOOP⟧⟦AI:AUTO-LOOP⟧ |
🤖 Tests review: reject, 缺少 render_template 行为测试TL;DR
详细说明
当前新增测试在 建议补的最小测试:source 📎 完整 codex 原始输出(存档备查) ---
pr: 8
role: tests
verdict: reject
---
## Verdict
Reject: PR 8 adds net-new `render_template` behavior for `{{work_unit_id}}`, but the added tests only inspect source markers and never exercise the rendered output contract.
## Evidence
- `skills/codex-refactor-loop/scripts/controller_lib.sh:131` adds the new `{{work_unit_id}}` substitution with fallback from `WORK_UNIT_ID` to `CLUSTER_ID`; this is executable controller behavior, not pure documentation.
- `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:276` only concatenates source files and asserts marker strings such as the literal perl replacement are present. This would not catch runtime failures in the render pipeline, such as `{{work_unit_id}}` staying unresolved, `WORK_UNIT_ID` not taking precedence, or fallback to `CLUSTER_ID` failing after `envsubst`.
- `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:298` is a valid source-regression guard for forbidden migration-surface tokens, and I did not find added `sleep/delay`, `[Skip]`, manual traits, or loosened existing assertions in the PR diff. The blocker is specifically missing behavior coverage for the new renderer logic.
## What would change your verdict
Add behavior tests that source `controller_lib.sh`, render a temporary template containing `{{work_unit_id}}`, and assert:
- when `WORK_UNIT_ID=unit-123` and `CLUSTER_ID=cluster-007`, the rendered output uses `unit-123`;
- when `WORK_UNIT_ID` is unset and `CLUSTER_ID=cluster-007`, the rendered output falls back to `cluster-007`;
- no unresolved `{{work_unit_id}}` remains in the rendered prompt.
REVIEW_DONE:8:tests:reject
⟦AI:AUTO-LOOP⟧⟦AI:AUTO-LOOP⟧ |
📊 Phase 8 fix r2(窄修,不需要人介入)tests r2 reject(新理由,有进展)→ fix r2 窄修(只针对 tests r2,防 scope-runaway)。 🤖 controller status banner ⟦AI:AUTO-LOOP⟧ |
🤖 fix r2 已补 renderer 行为测试TL;DR
详细说明本轮按“只修 tests 点名 reject”的约束处理。新增测试会 source architect / quality 的 Old/New 注释和 scope accounting 都是 comment-only,不是本轮 blocker;为避免 scope-runaway,本轮没有改 📎 完整 codex 原始输出(存档备查)Fix report for PR 8 round 2Applied
Rejected as false positive
Blocked (cannot fix this round)
Build status
Recommendation for next round
⟦AI:AUTO-LOOP⟧ ⟦AI:AUTO-LOOP⟧ |
只针对 tests r2 点名 gap 最小修复(test_ensure_project_rules_fixed_points.py), 未扩 scope。详见 fix-pr8-round-2.md。 ⟦AI:AUTO-LOOP⟧
📊 Phase 8 r3 重评审(不需要人介入)fix r2 applied-2(窄修,commit 91fd48f)。预期 tests approve → merge。 🤖 controller status banner ⟦AI:AUTO-LOOP⟧ |
🤖 tests review r3: PR #8 测试质量通过TL;DR
详细说明这轮实际打开了 PR diff 和新增测试文件。 可执行逻辑方面, 📎 完整 codex 原始输出(存档备查)---
pr: 8
role: tests
verdict: approve
---
## Verdict
Approve: the PR adds focused source-regression coverage for the WorkUnitV1 contract and behavior tests for the only new executable branch in `render_template`.
## Evidence
- `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:315` `test_work_unit_v1_contract_markers_are_present` asserts the required WorkUnitV1 identity/schema markers across the skill, reference, prompts, and controller helper instead of only bumping line count.
- `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:337` `test_work_unit_v1_forbidden_migration_surface_is_absent` provides source-regression assertions for the cluster's no-premature-migration rule, including `work_units_planned`, `work_units_active`, `work_units_done`, `work_units_failed`, envelope, and producer-helper surfaces.
- `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:357` `test_render_template_prefers_work_unit_id_over_cluster_alias` covers the new `controller_lib.sh:131` branch where `{{work_unit_id}}` renders from `WORK_UNIT_ID` and keeps `{{cluster_id}}` as the legacy alias.
- `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:364` `test_render_template_falls_back_to_cluster_id_when_work_unit_id_is_unset` covers the fallback branch from `WORK_UNIT_ID` to `CLUSTER_ID`.
- Search of the PR diff found no added `[Skip]`, `[Trait("Category","Manual")]`, `Thread.Sleep`, `Task.Delay`, or test pacing sleeps; no polling allowlist changes were added.
- Verified the PR-head test file runs cleanly: `python3 skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py` ran 22 tests and passed.
REVIEW_DONE:8:tests:approve
⟦AI:AUTO-LOOP⟧⟦AI:AUTO-LOOP⟧ |
🤖 quality review: commentTL;DR
详细说明
建议删掉这个 exact string marker,或换成文档/源码里的业务级说明,例如“ 📎 完整 codex 原始输出(存档备查)---
pr: 8
role: quality
verdict: comment
---
## Verdict
Comment: the WorkUnitV1 contract is readable and focused, but one new source-regression assertion is more brittle than the behavior it protects.
## Evidence
- `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:323`-`330`: `test_work_unit_v1_contract_markers_are_present` requires the exact implementation string `s/\{\{work_unit_id\}\}/($ENV{WORK_UNIT_ID} || $ENV{CLUSTER_ID})/ge` to appear in `controller_lib.sh`. That duplicates implementation detail instead of business intent. The same file already has behavior-level coverage at `test_render_template_prefers_work_unit_id_over_cluster_alias` and `test_render_template_falls_back_to_cluster_id_when_work_unit_id_is_unset`, so this exact-substitution marker adds brittleness without improving readability or contract clarity.
## What would change your verdict (only if comment or reject)
Remove the exact Perl substitution marker from `required_markers`, or replace it with a business-level marker such as "render_template supports {{work_unit_id}} with CLUSTER_ID fallback" in the docs/source. Keep the two render behavior tests as the real guard for the helper behavior.
REVIEW_DONE:8:quality:comment
⟦AI:AUTO-LOOP⟧⟦AI:AUTO-LOOP⟧ |
🤖 架构复审:comment-only,契约方向通过但需补两处合规说明TL;DR
详细说明
我标了两个 comment-only 项。第一, 已检查: 没有 📎 完整 codex 原始输出(存档备查)pr: 8
|
✅ Phase 8 共识达成 — 自动合并r3 最终:architect comment · tests approve · quality comment(0 reject)。收敛:r1/r2 tests reject → fix r1(applied-7)→ fix r2 窄修(applied-2)→ r3 tests approve。 合并到 auto-refact-dev,关联 #3 自动关闭。 🤖 controller status banner ⟦AI:AUTO-LOOP⟧ |
摘要
实现 issue #3(cluster-007)Phase 9 r3 共识(structural):v1 work-unit 契约 ——
clusters_*保持权威 + 新增WorkUnitV1items +WORK_UNIT_IDalias,零迁移(不引入 WorkUnitEnvelopeV1),向后兼容。文档层落于 SKILL.md + REFERENCE.md。契合 maintainer 指令「重构即开发也可以,别过度泛化」——薄契约、零迁移。
共识链路:#3 r1→r3(converge×2 → 3/3 consensus structural)。完整记录见 issue #3。
🤖 Auto-loop · 共识 implement(#3)
⟦AI:AUTO-LOOP⟧