
skill: Converge the Human label taxonomy to two active labels (#15 consensus) #22

Merged
loning merged 1 commit into auto-refact-dev from refactor/iter3-skill-human-label-taxonomy on May 25, 2026

loning commented May 25, 2026

🤖 skill: Converge the Human label taxonomy to two active labels (#15 r4 consensus)

This PR lands the Phase 9 r4 structural consensus. Only two active Human labels remain:

  • 🤖 human:auto-推进
  • 👤 human:需-maintainer-决策

The 🆘 human:卡死 and 🆘 human:卡死-需-rework labels are removed from the active taxonomy and from the bootstrap, producer, and waiting checks. Causes such as rework, deadlock, and CI exhaustion now go to the reason surface instead (auto-loop-stuck, phase label, banner/comment, PushNotification, marker reason).
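The two-label taxonomy described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in this PR: the label strings are taken from the description, while the constant names, `classify_human_labels`, and `escalate` are hypothetical.

```python
# Hypothetical sketch of the two-label taxonomy. Only the label strings
# come from the PR description; all names and structure are assumptions.
ACTIVE_HUMAN_LABELS = frozenset({"🤖 human:auto-推进", "👤 human:需-maintainer-决策"})
REMOVED_HUMAN_LABELS = frozenset({"🆘 human:卡死", "🆘 human:卡死-需-rework"})


def classify_human_labels(labels):
    """Split a list of labels into active and legacy Human labels."""
    active = sorted(label for label in labels if label in ACTIVE_HUMAN_LABELS)
    legacy = sorted(label for label in labels if label in REMOVED_HUMAN_LABELS)
    return {"active": active, "legacy": legacy}


def escalate(reason):
    """Escalate with the single non-auto label; the cause travels as a reason."""
    return {"label": "👤 human:需-maintainer-决策", "reason": reason}
```

Under this sketch, a deadlock and an exhausted CI budget both produce the same label and differ only in the `reason` value, which mirrors the move of causes to the reason surface.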

Five files changed: SKILL.md, concurrency_monitor.py, controller_lib.sh, peek.sh, and test_ensure_project_rules_fixed_points.py (+5 source-regression tests).

Consensus basis: .refactor-loop/runs/phase9-issue15-r4-judge.md

Closes #15

⟦AI:AUTO-LOOP⟧

Phase 9 r4 structural consensus: the only active Human labels are 🤖 human:auto-推进
and 👤 human:需-maintainer-决策. Remove 🆘 human:卡死 / 🆘 human:卡死-需-rework
from the active taxonomy/bootstrap/producer/waiting checks; move causes such as
rework/deadlock/CI exhaustion to the reason surface (auto-loop-stuck label / phase label /
banner / PushNotification / marker reason). Add 5 source-regression tests.

Closes #15

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

loning commented May 25, 2026

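📊 Status card: reviewer dispatched

| Dimension | Details |
| --- | --- |
| Stage | Dispatch codex (role=reviewer) |
| codex log | review-pr22-architect.log |
| Working directory | /Users/auric/consensus-rnd |
| no-output stall window | 5400s (~90 min window without output) |
| Context | Phase 8 review (architect angle): landing the #15 Human label taxonomy convergence consensus; 3 reviewers rule independently. |
| Next automatic steps | 1. The three reviewers finish their verdict markers 2. The controller computes consensus 3. unanimous → auto-merge / reject → fix r<N+1> |
| Human intervention needed? | ❌ No (advances automatically) |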
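The "next automatic steps" above (collect verdict markers, compute consensus, then merge or fix) can be sketched as follows. The marker shape `REVIEW_DONE:<pr>:<role>:<verdict>` matches the markers that appear later in this thread; the function names are hypothetical, and treating `comment` as non-blocking is an assumption of this sketch, not something the status card states.

```python
import re

# Marker shape as seen in this thread, e.g. "REVIEW_DONE:22:tests:approve".
VERDICT_MARKER = re.compile(r"^REVIEW_DONE:(\d+):(\w+):(approve|comment|reject)$")


def parse_verdicts(lines, pr_number):
    """Collect the latest verdict per reviewer role for one PR."""
    verdicts = {}
    for line in lines:
        match = VERDICT_MARKER.match(line.strip())
        if match and int(match.group(1)) == pr_number:
            verdicts[match.group(2)] = match.group(3)
    return verdicts


def decide(verdicts, expected_roles=("architect", "tests", "quality")):
    """Return 'wait', 'fix', or 'auto-merge' for a set of verdicts."""
    if any(role not in verdicts for role in expected_roles):
        return "wait"  # not every reviewer has written a marker yet
    if any(verdicts[role] == "reject" for role in expected_roles):
        return "fix"  # a reject sends the PR to the next fix round
    # ASSUMPTION: "comment" does not block; the real controller may differ.
    return "auto-merge"
```

Keeping parsing and decision separate means the decision rule can be tested with plain dictionaries, without any log files.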

🤖 controller status banner

⟦AI:AUTO-LOOP⟧


loning commented May 25, 2026

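📊 Status card: reviewer dispatched

| Dimension | Details |
| --- | --- |
| Stage | Dispatch codex (role=reviewer) |
| codex log | review-pr22-tests.log |
| Working directory | /Users/auric/consensus-rnd |
| no-output stall window | 5400s (~90 min window without output) |
| Context | Phase 8 review (tests angle): landing the #15 Human label taxonomy convergence consensus; 3 reviewers rule independently. |
| Next automatic steps | 1. The three reviewers finish their verdict markers 2. The controller computes consensus 3. unanimous → auto-merge / reject → fix r<N+1> |
| Human intervention needed? | ❌ No (advances automatically) |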
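The no-output stall window in the card is a plain time threshold. A minimal sketch of such a check, assuming timestamps in seconds and a hypothetical `is_stalled` helper (the status card only gives the 5400s value, not how the controller measures it):

```python
STALL_WINDOW_SECONDS = 5400  # value from the status card (~90 minutes)


def is_stalled(last_output_at, now, window=STALL_WINDOW_SECONDS):
    """Return True when no output has been seen for longer than the window.

    Both timestamps are seconds on the same clock, e.g. time.monotonic().
    """
    return (now - last_output_at) > window
```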

🤖 controller status banner

⟦AI:AUTO-LOOP⟧


loning commented May 25, 2026

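📊 Status card: reviewer dispatched

| Dimension | Details |
| --- | --- |
| Stage | Dispatch codex (role=reviewer) |
| codex log | review-pr22-quality.log |
| Working directory | /Users/auric/consensus-rnd |
| no-output stall window | 5400s (~90 min window without output) |
| Context | Phase 8 review (quality angle): landing the #15 Human label taxonomy convergence consensus; 3 reviewers rule independently. |
| Next automatic steps | 1. The three reviewers finish their verdict markers 2. The controller computes consensus 3. unanimous → auto-merge / reject → fix r<N+1> |
| Human intervention needed? | ❌ No (advances automatically) |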

🤖 controller status banner

⟦AI:AUTO-LOOP⟧


loning commented May 25, 2026

🤖 Quality review: approve

TL;DR


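Details

This change is focused on the Human label taxonomy: the active labels converge to 🤖 human:auto-推进 and 👤 human:需-maintainer-决策, and causes such as rework / deadlock / CI-stuck move into the reason banner/comment. The implementation adds no single-implementation abstraction and does not re-expose the old 🆘 human: labels on any production path.

I opened every touched file in full. The waiting predicate in concurrency_monitor.py remains one direct condition, controller_lib.sh only extends the cleanup label set, the operator hint in peek.sh is aligned with the taxonomy, and the helper names in the new source-regression tests express the business intent directly. Refactor self-doc comments are present at each main change site and explain the Old/New pattern.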
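For readers who have not opened the file, a waiting predicate that is "one direct condition" could look like the sketch below. It is not the body of `compute_expected`; the `codex_expected` name and its signature are hypothetical. The behavior it mirrors is the one the tests review in this thread describes: the maintainer-decision label suppresses the expected codex, and the removed 🆘 labels no longer do.

```python
WAITING_LABEL = "👤 human:需-maintainer-决策"


def codex_expected(labels):
    """Return True when a codex run is still expected for this item.

    Hypothetical stand-in for the waiting check: only the single
    maintainer-decision label pauses the loop.
    """
    return WAITING_LABEL not in labels
```

With this shape, a legacy 🆘 label left on an item has no effect on monitoring, so stale labels cannot silently park work.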


📎 Full raw codex output (archived for reference)
---
pr: 18
role: quality
verdict: approve
---

## Verdict
Approve: the change is focused on the Human label taxonomy, the new names express the two active label states clearly, and I found no dead code, over-engineering, avoidable duplication, or unclear refactor self-doc blocks.

## Evidence
- `skills/codex-refactor-loop/SKILL.md:2395`: the Label group now exposes exactly the two active Human states, `🤖 human:auto-推进` and `👤 human:需-maintainer-决策`, with rework/deadlock/CI-stuck moved into the reason surface instead of new label names.
- `skills/codex-refactor-loop/scripts/concurrency_monitor.py:184`: `compute_expected` remains a small single-purpose function; the changed waiting predicate is one direct condition and does not introduce a new helper or abstraction.
- `skills/codex-refactor-loop/scripts/controller_lib.sh:62` and `skills/codex-refactor-loop/scripts/controller_lib.sh:154`: the cleanup changes remain scoped to removing active and legacy Human labels; no `🆘 human:` label is produced on any new `--add-label` path.
- `skills/codex-refactor-loop/scripts/peek.sh:145`: the routing hint now names the active maintainer-decision label plus reason banner, keeping the operator-facing wording aligned with the two-label taxonomy.
- `skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:519`: the new source-regression test class uses readable, domain-specific helper names (`CANONICAL_HUMAN_LABELS`, `NON_AUTO_HUMAN_LABEL`, `REMOVED_HUMAN_LABELS`) and each helper is used by the tests.
- Refactor self-doc comments are present and understandable at each touched implementation/test/doc site I checked, including `SKILL.md:172`, `concurrency_monitor.py:186`, `controller_lib.sh:62`, `peek.sh:113`, and `test_ensure_project_rules_fixed_points.py:520`.

⟦AI:AUTO-LOOP⟧

⟦AI:AUTO-LOOP⟧


loning commented May 25, 2026

🤖 tests review: PR #22 passes

TL;DR

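  • What this is: an independent review from the tests angle, looking only at test coverage and test quality.
  • Conclusion: approve. The new tests cover the core regression points of the Human label taxonomy.
  • Next step: the controller can keep waiting for the other reviewers' verdicts; nothing needs rework on the tests side.

Details

This round of changes does not touch src/agents/, so the test/.../<TypeName>Tests.cs mapping rule does not apply. The actual tests live in skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py and cover the Human label table, bootstrap creation, escalation routing, the waiting check in concurrency_monitor.compute_expected, controller cleanup, and the peek.sh routing hint.

I also scanned for the stability red lines: no added sleep / delay test pacing, no polling allowlist change, no [Skip], no [Trait("Category","Manual")], no weakened existing assertions, and no mock-only pseudo-coverage. Running python3 -m py_compile ... && python3 skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py from an isolated archive of the PR head gave 35 tests OK.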
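A source-regression check of the kind described here scans script text rather than running it. The sketch below is an illustration under stated assumptions: `lines_producing_removed_label` and the sample script are made up, and the real tests in test_ensure_project_rules_fixed_points.py may be structured differently. `--add-label` and `--remove-label` are existing `gh issue edit` flags.

```python
REMOVED_LABEL_PREFIX = "🆘 human:"


def lines_producing_removed_label(source_text):
    """Return lines that add a removed Human label via --add-label.

    Naive line-based scan: a line carrying both --add-label and a
    --remove-label of a removed label would be flagged as well.
    """
    return [
        line.strip()
        for line in source_text.splitlines()
        if "--add-label" in line and REMOVED_LABEL_PREFIX in line
    ]


# Made-up sample script: escalation adds the active label, cleanup removes a legacy one.
SAMPLE_SCRIPT = '''
gh issue edit "$number" --add-label "👤 human:需-maintainer-决策"
gh issue edit "$number" --remove-label "🆘 human:卡死"
'''

assert lines_producing_removed_label(SAMPLE_SCRIPT) == []
```

The check allows legacy cleanup (`--remove-label`) while blocking any path that would produce a removed label again.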


📎 Full raw codex output (archived for reference)

pr: 18
role: tests
verdict: approve

Verdict

Approve: the PR adds meaningful regression coverage for the two-active-human-label taxonomy and does not weaken test quality or stability.

Evidence

  • skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:538 adds test_human_label_taxonomy_has_single_non_auto_label, which verifies the documented Human label table and bootstrap block contain exactly the two active labels and no longer create the removed 🆘 human:* labels.
  • skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:556 adds test_human_escalation_routes_use_reason_surface, which asserts escalation routes use 👤 human:需-maintainer-决策 plus reason/banner/PushNotification surface instead of the removed emergency labels.
  • skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:579 exercises the changed concurrency_monitor.compute_expected behavior directly: 👤 human:需-maintainer-决策 suppresses expected codex, while the removed 🆘 human:* labels no longer suppress no-gap monitoring.
  • skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:601 and :612 add source-regression assertions for controller cleanup and peek routing, blocking accidental reintroduction of 🆘 as a produced/recommended Human label while still allowing legacy cleanup paths.
  • No src/ or agents/ production files are touched, so the test/.../<TypeName>Tests.cs mapping rule does not apply to this markdown/script skill PR.
  • Diff scan found no added sleep / delay test pacing, no polling allowlist change, no [Skip], no [Trait("Category","Manual")], no bump-only assertion, no loosened existing assertion, and no mock-only pseudo-coverage.
  • Verified PR head from an isolated archive: python3 -m py_compile skills/codex-refactor-loop/scripts/concurrency_monitor.py skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py && python3 skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py passed, 35 tests OK.

REVIEW_DONE:22:tests:approve

⟦AI:AUTO-LOOP⟧

⟦AI:AUTO-LOOP⟧


loning commented May 25, 2026

🤖 Architect review: comment, source-comment language needs to converge

TL;DR

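  • What this is: the architecture-compliance review of PR #22 (skill: Converge the Human label taxonomy to two active labels (#15 consensus)). The verdict is comment, which does not block the merge on architecture grounds.
  • Current conclusion: the Human label taxonomy convergence itself is within the scope authorized by Phase 9, but the new # Refactor comments in .py / .sh files are written in Chinese.
  • What the maintainer needs to do: it is suggested that the author rewrite these source comments in English while keeping the Old/New pattern content unchanged.

Details

From an architecture standpoint, the diff only changes skills/codex-refactor-loop/SKILL.md and the helpers/tests under scripts/ in the same directory, which matches the scope of the #15 Phase 9 consensus. There are no .proto, version-number, or external repo reference changes, and no new red-line items such as actor/serializer/HttpClient/disabled-test.

The only thing that needs surfacing is the source language policy: the skill itself states that comments, docstrings, and log/error strings inside source files should be in English. The # Refactor comments added by this PR in concurrency_monitor.py, controller_lib.sh, peek.sh, and test_ensure_project_rules_fixed_points.py contain Chinese prose. A suggested rewrite would be something like: Old pattern: four Human labels, including two emergency labels, scattered no-gap/escalation predicates. New principle: exactly two active Human labels; causes move to reason surfaces (#15 structural consensus).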
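A policy like this one can be checked mechanically. The sketch below flags comment text that contains CJK characters; it is an assumption-laden illustration, not a lint that exists in this repo. `non_english_comment_lines` is a hypothetical name, the comment detection is naive (it treats everything after the first `#` as a comment, including a `#` inside a string literal), and it would also flag a comment that merely quotes a label name such as `human:需-maintainer-决策`.

```python
import re

# CJK Unified Ideographs basic block. Using this range as the test for
# "Chinese prose" is an assumption of this sketch.
CJK = re.compile(r"[\u4e00-\u9fff]")


def comment_text(line):
    """Return the text after the first '#', or '' when the line has none."""
    _, separator, tail = line.partition("#")
    return tail if separator else ""


def non_english_comment_lines(source_text):
    """Return 1-based line numbers whose comment part contains CJK characters."""
    return [
        number
        for number, line in enumerate(source_text.splitlines(), start=1)
        if CJK.search(comment_text(line))
    ]
```

Run over the `.py` and `.sh` files touched by a diff, a check of this shape would report the same kind of line-level evidence the architect review lists by hand.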


📎 Full raw codex output (archived for reference)

pr: 22
role: architect
verdict: comment

Verdict

comment-only: the taxonomy change stays inside the Phase 9 authorized scope, but several new source comments violate the skill's own source-English policy and should be rewritten before merge if the team wants zero lint drift.

Evidence

  • skills/codex-refactor-loop/scripts/concurrency_monitor.py:187: new # Refactor comment uses Chinese prose in a source file (四个 Human label...). The skill's hard rule states: Source files are English-only; external user-facing artifacts are Chinese by default. The detailed policy states (translated): English applies to all natural language and code elements inside source files (.rs / .lua / .sh / .py / .ts), including comments, docstrings, log.{info,warn,error} strings, error / panic text, code identifiers, and commit-body template strings constructed in code.
  • skills/codex-refactor-loop/scripts/controller_lib.sh:155: same issue in a shell source comment added for this refactor.
  • skills/codex-refactor-loop/scripts/peek.sh:114: same issue in a shell source comment added for this refactor.
  • skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:28 and skills/codex-refactor-loop/scripts/test_ensure_project_rules_fixed_points.py:521: same issue in Python test source comments added for this refactor.
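  • Scope check: changed files are limited to skills/codex-refactor-loop/SKILL.md and its scripts/ helpers/tests, matching the Phase 9 consensus file list. Root CLAUDE.md allows this exception (translated): `skills/codex-refactor-loop/` is ported **verbatim** from the host project and still carries the "refactor" shell and a few host-specific claims. See the "泛化路线" (generalization roadmap) section of `README.md` for the generalization path. Do not change its body logic before it is de-shelled / renamed; exception: host-agnostic bootstrap / policy injection fixes explicitly authorized by Phase 9 deep consensus may land first, but they must not introduce concrete host facts and must come with behavior tests.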
  • Anti-pattern grep found no new actor.HandleEventAsync(, SubscribeAsync<EventEnvelope>, JSON serializer state/payload path, production sleep/delay(, sync-over-async marker, raw HttpClient constructor, Dictionary<,> cross-request state, or [Skip]/disabled tests in the added diff. No .proto, version, or external-repo-reference changes were present.

What would change your verdict (only if comment or reject)

Rewrite the newly added # Refactor source comments in .py and .sh files to English while keeping the Old/New principle content intact, for example Old pattern: four Human labels, including two emergency labels, scattered no-gap/escalation predicates. New principle: exactly two active Human labels; causes move to reason surfaces (#15 structural consensus).

REVIEW_DONE:22:architect:comment

⟦AI:AUTO-LOOP⟧

⟦AI:AUTO-LOOP⟧


Labels

auto-loop (issue/PR managed by codex-refactor-loop), 🤖 human:auto-推进, 🎉 phase:merged
