
Auto refact dev #36

Merged
loning merged 143 commits into dev from auto-refact-dev
May 28, 2026

Conversation

@loning
Contributor

@loning loning commented May 25, 2026

No description provided.

loning and others added 30 commits May 25, 2026 16:47
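dogfood /codex-refactor-loop self-audit, round 1:
- Add skills/codex-refactor-loop/host.env.example: the enabler for
  generalization route #2. It freezes the "Host configuration" table in
  SKILL.md into a copyable template, with inline notes on the BUILD_CMD
  whitespace pitfall and the GH_REPO/gh-CLI conflict pitfall.
- Add IMPROVEMENT-BACKLOG.md: a persistent list of findings (F1-F7), the
  self-audit decision line, and the loop iteration log, so that findings
  survive context compaction.

Hygiene / incremental changes only; the SKILL.md body logic is untouched
(honoring the CLAUDE.md pre-extraction constraint).
Not pushed.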

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
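A baseline test (a fresh controller reading the current SKILL.md) exposed
three holes: on the first round the controller only bootstraps state and
dispatches audit, starting 0/5 daemons and creating no labels (daemons and
labels were written as a steady-state check that assumes they were "already
started elsewhere", and the entry section never references them).

Three fixes:
1. Quick start: the first wakeup must run the full Phase 0 mandatory
   sequence in order; the default is the full GitHub flow, and ❌ downgrading
   to a local analysis/report task is strictly forbidden.
2. Phase 0 adds a "first-wakeup mandatory sequence": host.env self-check
   (stop if missing, never invent values) → state + branch → create the full
   label set → start and attach all 5 daemons (verify pgrep = 1) → dispatch
   audit. A list of first-round anti-patterns (❌) is attached.
3. Controller section: the default worker is the codex CLI (codex exec);
   ❌ substituting a Claude Agent/Task subagent is strictly forbidden. The
   unattended invariants (harness-track / log sweep / floor count /
   task-notification) all rest on the worker being a codex process.

Body logic such as phase/routing/marker semantics is untouched; this
strengthens existing mandatory items and follows explicit maintainer
instructions (default GitHub / default start daemons / default codex in
place of subagent).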

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
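A live dogfood run exposed that comment-monitor.sh fails closed: an empty
MAINTAINER_WHITELIST causes a FATAL exit. All 5 daemons must start, so the
variable is de facto required, which contradicts its original "optional"
label in host.env.example. Relabeled it, recorded F8/F9/F10 in the backlog,
and added the iter-2 log.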

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
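Requirement: always keep ≥2 codex processes for this repository running in
parallel. Two root causes (measured while dogfooding):

1. Cross-host overcounting: the floor count matched the relative substring
   .refactor-loop/logs/. Two loops on the same machine share that substring
   and overcount each other (measured: with another host running, this
   repository reported actual=10/14 when the true count was 0-1), so the
   floor always looked "full" and this repository never got topped up.
   Fix: scope by the absolute $REPO_ROOT path (callers must pass an
   absolute --cd).

2. Double counting a single codex: each codex spawns a real supervisor
   (bash spawn-codex.sh) plus a shell -c wrapper (the harness command echo).
   Both contain spawn-codex.sh, so each real codex counted as 2 and
   CODEX_FLOOR=2 was satisfied by a single real codex.
   Fix: exclude lines containing ` -c ` and count only real supervisors.

3. Floor parameterization: CODEX_FLOOR (host.env, default 5) with a hard
   lower bound of 2 (values <2 are treated as 2); small repositories should
   set 2.

Changes: concurrency_monitor.count_in_flight_codex / peek.list_loop_codex /
the two decision scripts in SKILL + the floor section + the overcounting
section; documented in host.env.example.
Verification: the old logic reported 10 (polluted by the other host); the
new logic reports 0 (the true value for this repository).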

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
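
The three floor fixes above can be sketched in a few lines of Python. This is a minimal illustration, not the code in concurrency_monitor: the function names, the sample process lines, and the assumption that each supervisor's command line carries the absolute log path are all hypothetical.

```python
HARD_MIN_FLOOR = 2   # hard lower bound named in the commit message
DEFAULT_FLOOR = 5    # default CODEX_FLOOR named in the commit message


def effective_floor(raw):
    """Parse a CODEX_FLOOR value; unset or invalid falls back to the default,
    and anything below the hard minimum is treated as the minimum."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        value = DEFAULT_FLOOR
    return max(value, HARD_MIN_FLOOR)


def count_in_flight(ps_lines, repo_root):
    """Count real spawn-codex.sh supervisors that belong to one repository.

    Assumption: every supervisor command line contains the absolute path of
    its log file under <repo_root>/.refactor-loop/logs/.
    """
    if not repo_root.startswith("/"):
        raise ValueError("repo_root must be an absolute path")
    scope = repo_root.rstrip("/") + "/.refactor-loop/logs/"
    count = 0
    for line in ps_lines:
        if "spawn-codex.sh" not in line:
            continue
        if " -c " in line:      # shell -c wrapper that only echoes the command
            continue
        if scope not in line:   # supervisor of another repository's loop
            continue
        count += 1
    return count


# Made-up process lines: one real supervisor, its wrapper, and another repo.
sample = [
    "bash spawn-codex.sh /srv/repo-a/.refactor-loop/logs/solver.log",
    "sh -c bash spawn-codex.sh /srv/repo-a/.refactor-loop/logs/solver.log",
    "bash spawn-codex.sh /srv/repo-b/.refactor-loop/logs/audit.log",
]
print(count_in_flight(sample, "/srv/repo-a"))  # prints 1
```

Matching on the relative substring alone would have counted all three sample lines, which is the overcounting the commit describes.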
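Phase 9 r4 consensus (minimal framing):
- Add scripts/ensure_project_rules_fixed_points.py (+ tests): idempotently
  writes a sentinel-delimited managed fixed-point block into the host's
  $PROJECT_RULES. It does not overwrite existing host content, repeated
  bootstraps do not write duplicates, and the host can extend freely outside
  the block. The fixed-point entries come from the generalized theory
  distilled from the fkst/aevatar CLAUDE.md.
- SKILL.md Phase 0 "first-wakeup mandatory sequence" gains a
  ProjectRulesFixedPointEnsurer step.
- Prompt PROJECT_RULES wiring (the 7 SCOPE_EXTEND sites are declared in the
  implementation summary).
- CLAUDE.md: dogfood-run the ensurer to inject this repository's fixed-point
  block.

Merge after the three Phase 8 reviewers reach consensus; closes #2.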

⟦AI:AUTO-LOOP⟧
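
The idempotent managed-block write described above follows a common pattern, sketched here. This is an illustration under assumptions, not the contents of ensure_project_rules_fixed_points.py: the sentinel strings, the function name, and the choice to rewrite the block in place are hypothetical.

```python
# Hypothetical sentinels; the real script's markers are not shown in this PR.
BEGIN = "<!-- BEGIN managed fixed points (hypothetical sentinel) -->"
END = "<!-- END managed fixed points (hypothetical sentinel) -->"


def ensure_managed_block(text, entries):
    """Return `text` with exactly one managed block holding `entries`.

    Content outside the sentinels is never modified, so the host can keep
    extending the file around the block.
    """
    block = "\n".join([BEGIN, *entries, END])
    start = text.find(BEGIN)
    end = text.find(END)
    if start != -1 and end != -1 and end > start:
        # Block already present: rewrite only the span between the sentinels.
        return text[:start] + block + text[end + len(END):]
    if start != -1 or end != -1:
        # One sentinel without the other: refuse to guess.
        raise ValueError("unbalanced sentinels in project rules file")
    # No block yet: append it, separated from host content by a blank line.
    if text == "" or text.endswith("\n\n"):
        sep = ""
    elif text.endswith("\n"):
        sep = "\n"
    else:
        sep = "\n\n"
    return text + sep + block + "\n"


host_rules = "# Host rules\n"
once = ensure_managed_block(host_rules, ["- fixed point A"])
twice = ensure_managed_block(once, ["- fixed point A"])
print(once == twice)  # prints True
```

Running the function a second time yields the same text, which is the "repeated bootstraps do not write duplicates" property from the commit message.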
Phase 8 fix round 1: strengthen the test coverage and implementation of
ensure_project_rules_fixed_points (applied-4 / rejected-0 / blocked-0).
See fix-pr7-round-1.md for details.

⟦AI:AUTO-LOOP⟧
Phase 8 fix round 2: strengthen the ensurer tests and align the PROJECT_RULES
semantics of the audit/solver/review-fix prompts (applied-13 / rejected-0 /
blocked-0). See fix-pr7-round-2.md for details.

⟦AI:AUTO-LOOP⟧
reflector retry-fix: minimal fix for only the gaps named in tests r3, with no
scope expansion (test_ensure_project_rules_fixed_points.py only). For
blocked-1, see fix-pr7-round-3.md.

⟦AI:AUTO-LOOP⟧
…-md-fixed-points

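cluster host-claude-md-fixed-points: Phase 0 fixed-point injection (#2 consensus)
Phase 9 r3 consensus (structural): clusters_* stays authoritative; add
WorkUnitV1 items and a WORK_UNIT_ID alias, with zero migration (in line with
the maintainer's "don't over-generalize"). The contract is introduced at the
documentation layer in SKILL.md and REFERENCE.md and is backward compatible.

Merge after the three Phase 8 reviewers reach consensus.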

⟦AI:AUTO-LOOP⟧
Wire the v1 work-unit contract into the implement/meta-judge/verify prompts
and controller_lib, and add tests (applied-7 / rejected-0 / blocked-0).
See fix-pr8-round-1.md for details.

⟦AI:AUTO-LOOP⟧
Minimal fix for only the gaps named in tests r2
(test_ensure_project_rules_fixed_points.py), with no scope expansion.
See fix-pr8-round-2.md for details.

⟦AI:AUTO-LOOP⟧
…-work-unit-contract-schema

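cluster-007: v1 work-unit contract (#3 consensus)
Phase 9 consensus (structural): zero changes to audit.md; producer
normalization is documented in REFERENCE/SKILL, and triage-external-issue is
wired to manual-issue intake. Based on the #3 v1 contract.

Merge after the three Phase 8 reviewers reach consensus.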

⟦AI:AUTO-LOOP⟧
Narrow fix for the reviewer reject; 1 item judged a false positive, with
rationale (see fix-pr9-round-1.md). audit.md remains unchanged.

⟦AI:AUTO-LOOP⟧
…-audit-producer-adapter

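cluster-008: audit-as-producer adaptation (#4 consensus)
Phase 9 consensus (minimal): pure public copy/policy edits
(README/REFERENCE/SKILL) clarifying that this is "a general work-unit
consensus engine, with refactor as one legitimate metaphor"; no rename, no
alias, and no SkillIdentityV1. In line with the maintainer's "treating
refactoring as development is fine too".

Merge after the three Phase 8 reviewers reach consensus.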

⟦AI:AUTO-LOOP⟧
Fix only the items named by architect; no rename or structural change.
See fix-pr10-round-1.md for details.

⟦AI:AUTO-LOOP⟧
…0-rename-alias-strategy

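cluster-010: clarify general-purpose positioning (#6 consensus, minimal, no rename)
Phase 9 consensus (structural): minimal docs + tests pin markers (SOLVER_DONE
etc.) and GitHub labels as stable v1 operational tokens (status quo kept, no
rename); OperationalNamePolicyV1 is not introduced. This is the last of the
4 spine clusters.

Merge after the three Phase 8 reviewers reach consensus.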

⟦AI:AUTO-LOOP⟧
The architect's concern lands as process-gate wording; markers/labels are not
renamed (keeping #5 minimal).

⟦AI:AUTO-LOOP⟧
…9-marker-label-compat-migration

cluster-009: markers/labels as stable v1 tokens (#5 consensus)
Phase 9 consensus (delete): remove the misleading low-threshold path in
concurrency_monitor; CODEX_FLOOR top-up happens only at controller wakeup
step 1.5; SKILL clarifies the responsibilities.

⟦AI:AUTO-LOOP⟧
Phase 9 r4 structural consensus: the active Human labels are reduced to
🤖 human:auto-推进 (auto-advance) and 👤 human:需-maintainer-决策 (needs
maintainer decision); 🆘 human:卡死 (stuck) and 🆘 human:卡死-需-rework (stuck,
needs rework) are removed from the active taxonomy/bootstrap/producer/waiting
decisions; causes such as rework/deadlock/CI exhaustion move to the reason
surface (auto-loop-stuck label / phase label / banner / PushNotification /
marker reason). Adds 5 source-regression tests.

Closes #15

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
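* feat(skill): contract source-regression test suite + runnable TEST_CMD (#16 consensus)

Phase 9 r4 structural consensus: append SkillContractSourceRegressionTests to
the existing test_ensure_project_rules_fixed_points.py; host.env.example
documents that TEST_CMD can run the contract tests.

⚠️ SCOPE_EXTEND (6 items, declared, for Phase 8 reviewers to assess): to make
the new contract tests pass, the implement codex also fixed the violations
under test: conditional CI_GUARDS execution (test-add/verify/remote-ci-fix/
SKILL), design-issue-reply using the GH_REPO_SLUG contract, and
spawn_with_banner.py turned into a hard-fail tombstone. These overlap with the
themes of #20 (host language policy) and self-audit #13; if reviewers judge
them out of scope they should reject, and the fix then narrows back to the
test file + host.env.example.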

Closes #16

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

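* fix(skill): PR#23 review r1: tombstone behavior test + Refactor Old/New comments (#16)

Phase 8 r1: tests reject (the tombstone lacks an execution-level behavior
test) + quality reject (the new class lacks Refactor Old/New
self-documentation). applied-4:
- Add an execution-level behavior test for the spawn_with_banner.py tombstone
  (asserting hard-fail)
- Add Refactor Old/New comments to SkillContractSourceRegressionTests and the
  tombstone change
The declared SCOPE_EXTEND is not reverted (accepted by reviewers).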

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
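Phase 9 r2 structural consensus: the broad "all prompts post directly" section
of SKILL.md is narrowed to two explicit lists: direct-post prompts (containing
a ## GitHub post section that references _github-post-rules.md) vs
marker/artifact-only prompts (sentinel + marker contract only). Adds an
enumerable behavior test, test_github_post_contract_matches_prompt_roster.

SCOPE_EXTEND: triage-external-issue.md gains the _github-post-rules.md
reference (it is a member of the direct-post roster but lacked the reference,
without which the test could not go GREEN; a one-line fix following the same
logic).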

Closes #13

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(skill): 7 mechanical script robustness/portability fixes (#21, skill-improvement WU-1,4,5,6,7,8,9)

- WU-1: .codex-plugin/plugin.json + GEMINI.md descriptions changed to the Consensus R&D identity
- WU-4: REVIEW_BASE_BRANCH fallback develop → dev (dev_sync_daemon/controller_lib)
- WU-5: dev_sync merge-in-progress detection now uses git rev-parse --git-path (worktree-safe)
- WU-6: the dev-sync resolver in-flight check is scoped by absolute REPO_ROOT
- WU-7: triage-monitor switched to a claimed/spawned/failed/done state machine with retries
- WU-8: spawn-codex.sh refuses to reuse a log that lacks a terminating EXIT=
- WU-9: controller_lib eval → argv array (safe against label injection)
+ deterministic tests for WU-5/WU-9

Closes #21

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
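
WU-9 replaces a shell `eval` with an argv array. The sketch below shows the same idea in Python rather than in controller_lib's own shell: a label that contains shell metacharacters stays a single argument when the command is passed as a list and no shell is involved. The helper names and the hostile label are hypothetical; `gh issue edit --add-label` is used only as a plausible command shape and is never executed here.

```python
import subprocess
import sys


def build_label_argv(issue_number, label):
    """Build the command as an argv list; the label is one element, so no
    shell ever gets a chance to re-parse it."""
    return ["gh", "issue", "edit", str(issue_number), "--add-label", label]


def echo_args(argv):
    """Pass argv to a child Python process without a shell and return the
    arguments exactly as the child received them."""
    child = [
        sys.executable,
        "-c",
        "import sys; print('\\n'.join(sys.argv[1:]))",
        *argv,
    ]
    result = subprocess.run(child, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


# A hostile label that would run a second command if it were eval'd.
evil_label = 'bug"; touch /tmp/pwned; echo "'
received = echo_args(build_label_argv(36, evil_label))
print(received[-1] == evil_label)  # prints True
```

Because the list is handed to the process directly, quoting the label correctly stops being the caller's problem, which is the injection-safety property the commit names.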

* fix(skill): PR#24 review r1: spawn-codex log flag rationale + behavior test (#21)

Phase 8 r1: tests reject (the --overwrite-finished-log flag lacks a
caller/test/rationale). applied-4: add a rationale comment for the flag + a
deterministic behavior test (when the finished log contains EXIT=, refuse
without the flag / allow overwrite with the flag), including the quality
comment points.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
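
One way to read the log-reuse rule from WU-8 together with this review round is sketched below in Python. It is an interpretation, not spawn-codex.sh itself: the function name is hypothetical, and the sketch assumes that a finished run ends its log with a last line starting with `EXIT=`.

```python
import tempfile
from pathlib import Path


def may_write_log(log_path, overwrite_finished=False):
    """Decide whether a new run may write to `log_path`.

    - missing log: allowed
    - log without a terminating EXIT= line: refused (a run may still own it)
    - finished log: allowed only when the caller opts in, mirroring the
      --overwrite-finished-log flag
    """
    path = Path(log_path)
    if not path.exists():
        return True
    lines = path.read_text().splitlines()
    finished = bool(lines) and lines[-1].startswith("EXIT=")
    if not finished:
        return False
    return overwrite_finished


with tempfile.TemporaryDirectory() as tmp:
    finished_log = Path(tmp) / "finished.log"
    finished_log.write_text("working...\nEXIT=0\n")
    running_log = Path(tmp) / "running.log"
    running_log.write_text("working...\n")

    results = {
        "missing": may_write_log(Path(tmp) / "missing.log"),
        "running": may_write_log(running_log, overwrite_finished=True),
        "finished": may_write_log(finished_log),
        "finished_with_flag": may_write_log(finished_log, overwrite_finished=True),
    }

print(results["finished"], results["finished_with_flag"])  # prints False True
```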

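* fix(skill): PR#24 review r2: deterministic tests for the triage-monitor state machine loop branches (#21)

applied-2: add deterministic tests for the WU-7 triage-monitor main-loop state
machine branches (claimed/spawned/failed/done), covering the actual dispatch
path rather than only helpers. Merging auto-refact-dev resolved the test file
conflict (both sides' tests kept).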

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(skill): split SKILL.md into controller contract + phase index, move heavy content to REFERENCE.md (#12 consensus)

Phase 9 r2 structural consensus: SKILL.md goes from 2537 to 665 lines
(controller contract + phase index + hard invariants); the full
banner/escalation/consensus templates, state schema, recovery playbook, daemon
command bodies, label bootstrap, and historical bilingual notes move to
REFERENCE.md (2443 lines), linked by stable anchors. Adds
test_skill_entrypoint_contract.py + test_skill_reference_anchors.py to guard
the frontmatter / line budget / anchor integrity / must-keep invariants.

Closes #12

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

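* fix(skill): PR#30 review r1: wire contract tests into the test path + split-boundary assertions (#12)

tests reject: the 2 new contract tests were not wired into the test path.
applied-1: host.env.example TEST_CMD switched to unittest discovery (scripts/
now has ≥3 test_*.py files, crossing the #16 split threshold); updated the
source-contract assertions in test_ensure_project_rules_fixed_points.py to
match the #12 split boundary (SKILL keeps controller invariants + lazy links,
heavy details live in REFERENCE).
The live dogfood host.env TEST_CMD has already been synced to discovery by the
controller.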

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

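* fix(skill): PR#30 review r2: add the split's Refactor Old/New self-documentation block (#12)

quality reject: the large refactor lacked its own
Refactor(iter3/skill-md-controller-split) Old/New block.
applied-1: add that self-documentation block to SKILL.md, explaining the split.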

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…consensus) (#33)

* refactor(skill): host-agnostic skill root self-location, dropping the hardcoded .claude/skills (#19 consensus)

Phase 9 r4 structural consensus (converged in 4 rounds): dev_sync_daemon.py /
triage-monitor.sh do inline self-location (CODEX_REFACTOR_LOOP_SKILL_ROOT is
an optional override; otherwise they self-locate from the
__file__/BASH_SOURCE parent and validate SKILL.md/spawn-codex.sh/prompts/; an
invalid override is fatal, with no fallback to .claude/skills); the active
launch/dispatch path examples in SKILL.md/REFERENCE.md become skill-relative,
plus a Skill root contract section; locator wording in 9 prompts +
_github-post-rules.md becomes skill-relative. + 4 self-location tests.
SkillRootLocatorV1/skill_root.py/an external $SKILL_ROOT/host.env fields are
not introduced.

Closes #19

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
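
The self-location rule above (optional override, otherwise walk up from the script, validate, never fall back silently) can be sketched as follows. This is an illustration, not the body of dev_sync_daemon.skill_root(): the function names and the parent-walking strategy are assumptions; only the environment variable name and the three validated entries come from the commit message.

```python
import os
from pathlib import Path

OVERRIDE_ENV = "CODEX_REFACTOR_LOOP_SKILL_ROOT"


def is_skill_root(path):
    """A directory counts as a skill root if it holds the three entries the
    commit message says are validated."""
    return (
        (path / "SKILL.md").is_file()
        and (path / "spawn-codex.sh").is_file()
        and (path / "prompts").is_dir()
    )


def resolve_skill_root(script_path, env=None):
    """Resolve the skill root for a script located somewhere inside it."""
    env = os.environ if env is None else env
    override = env.get(OVERRIDE_ENV)
    if override:
        root = Path(override).resolve()
        if not is_skill_root(root):
            # An invalid override is fatal; no fallback to a hardcoded path.
            raise SystemExit(f"FATAL: invalid {OVERRIDE_ENV}: {root}")
        return root
    # Default branch: walk up from the script until a valid root is found.
    for parent in Path(script_path).resolve().parents:
        if is_skill_root(parent):
            return parent
    raise SystemExit("FATAL: could not self-locate the skill root")
```

A script would typically call `resolve_skill_root(__file__)` once at startup, before any mutation, so that a bad override stops the daemon immediately.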

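* fix(skill): PR#33 review r1: skill_root branch behavior test + comment format + stale cleanup (#19)

tests reject: one branch of dev_sync_daemon.skill_root() lacked a behavior
test. applied-3: add a deterministic behavior test for that branch + fix the
format of 2 Refactor Old/New comments + remove 1 stale comment line that
conflicted with the skill-root semantics.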

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

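* fix(skill): PR#33 review r2: triage default self-location behavior test + test class split + SKILL scope note (#19)

Fix the reject + 2 advisory comments per the Phase 8 r2 verdict:
- Add a CODEX_REFACTOR_LOOP_SKILL_ROOT_PRINT short-circuit hook to
  triage-monitor.sh that prints the resolved skill root and exits, after
  self-location and before any mutation.
  Add SkillRootContractSourceRegressionTests::test_triage_monitor_default_self_location,
  asserting that with the override unset the hook resolves to the tmp install
  directory (behavior test for the default branch).
- Move the new skill-root locator regression tests out of
  SkillContractSourceRegressionTests (~395 LOC, over this file's 250 LOC
  threshold) into a new class, SkillRootContractSourceRegressionTests, making
  the original class contiguous-readable again.
- Add a scope line at the end of the "Skill Root Contract" section of
  SKILL.md, stating that detailed path examples and host install variants stay
  in REFERENCE.md while SKILL.md keeps only controller-level invariants.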

verify: python3 -m unittest discover -s skills/codex-refactor-loop/scripts
        -p 'test_*.py' → 70 tests passed.

⟦AI:AUTO-LOOP⟧

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
loning and others added 28 commits May 29, 2026 03:56
…ective)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ssue143-actor-heartbeat-lease

# Conflicts:
#	skills/codex-refactor-loop/REFERENCE.md
#	skills/codex-refactor-loop/SKILL.md
…(round 1)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…heartbeat-lease

fix(#143): actor-owned DaemonHeartbeatLease (the daemon heartbeat follows the work loop, eliminating false liveness reports)
…ssue139-wake-source-contract

# Conflicts:
#	skills/codex-refactor-loop/REFERENCE.md
#	skills/codex-refactor-loop/SKILL.md
#	skills/codex-refactor-loop/scripts/test_skill_entrypoint_contract.py
…ource-contract

fix(#139): unify the wake-source contract wording (every session must maintain the Monitor bridge)
…regression)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…sion + scheduler behavior test)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nv-example

feat(#140): complete host.env.example (v1.0 generalization A)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-host-hardcode

fix(#126): remove hardcoded host placeholders from prompts (reuse the host.env surface)
…round 1)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ssue141-install-walkthrough

# Conflicts:
#	skills/codex-refactor-loop/SKILL.md
#	skills/codex-refactor-loop/scripts/test_anti_stop_restart_helper_contract.py
…l-walkthrough

feat(#141): downstream install walkthrough (v1.0 generalization B)
…ssue138-test-daemon-leak

# Conflicts:
#	skills/codex-refactor-loop/scripts/test_restart_daemons.py
…aemon-leak

fix(#138): fix test_restart_daemons leaking detached daemon processes
…_pr_merges_min

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-marker-parser

fix(#149): fix the router marker parser (ending the dispatch gap)
…-pr-merges-writer

fix(#145): merge_pr writes recent-pr-merges.json (fixes the release gate signal)
… daemon-heartbeats.json

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ion(architect/tests reject)(round 1)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…eartbeat-source

fix(#154): release gate fresh_heartbeats reads the real heartbeat source (clears the last permanently red signal)
@loning
Contributor Author

loning commented May 28, 2026

🚀 dev rollup (release prerequisite)

Roll up all of this round's v1.0 work (merge-reference single file, #143 heartbeat-lease, #149 router parser, #139/#140/#141/#126/#138/#145/#154, etc.) from auto-refact-dev to the review base dev. CI is 4/4 green, MERGEABLE/CLEAN. After the merge, dev runs the required checks, which clears required_checks_recent_green. The actual release is still gated by auto_release_gate --dispatch.

⟦AI:AUTO-LOOP⟧

@loning loning merged commit 45b6935 into dev May 28, 2026
10 checks passed
