feat: add opencode.db cleanup and migration failure auto-recovery #200

Merged: Svtter merged 1 commit into main from feat/cleanup-opencode-db on Jun 7, 2026
Conversation

@Svtter Svtter commented Jun 7, 2026

Collaborator

Summary

Closes #198

Prevents and recovers from stale opencode.db SQLite database issues on self-hosted runners, where the database can grow to 160MB+ and cause duplicate column name migration errors when opencode upgrades.

Changes

Scheme A — Preventive cleanup (before run)

  • New shared/cleanup-db.sh: standalone script that checks opencode.db size and deletes it if it exceeds a configurable threshold (default 50MB)
  • New github-run-opencode input cleanup-db: "true" (default, 50MB threshold), a number like "100" for custom MB threshold, or "false" to disable
  • New step in github-run-opencode/action.yml between install and run
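
The three accepted values of cleanup-db and the size comparison described above can be sketched as follows. This is a minimal illustration, not the code from shared/cleanup-db.sh: the function names `resolve_threshold_mb` and `exceeds_threshold` are hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map the cleanup-db input to a threshold in MB.
# Prints the threshold, prints nothing when cleanup is disabled, and
# returns 1 for anything that is not "true", "false", or a non-negative integer.
resolve_threshold_mb() {
  local input="$1"
  case "$input" in
    false) return 0 ;;                      # cleanup disabled: no threshold
    true)  echo 50 ;;                       # default threshold from the PR description
    ''|*[!0-9]*)
      echo "invalid cleanup-db value: $input" >&2
      return 1 ;;
    *) echo "$((10#$input))" ;;             # custom threshold; 10# avoids octal parsing of "010"
  esac
}

# Hypothetical helper: succeeds when size_bytes is strictly larger than threshold_mb.
# Assumes 1 MB = 1024 * 1024 bytes.
exceeds_threshold() {
  local size_bytes="$1" threshold_mb="$2"
  (( size_bytes > threshold_mb * 1024 * 1024 ))
}
```

With these helpers, a value of "0" resolves to a 0MB threshold, so any non-empty database would count as exceeding it.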

Scheme C — Migration failure auto-recovery (during run)

  • run-opencode/run-opencode.sh: detects duplicate column name errors in opencode output, auto-deletes the stale database, and retries once
  • Recovery is limited to a single attempt to avoid infinite loops
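
The detect, delete, and retry-once behavior can be sketched as below. This is a simplified stand-in rather than the loop in run-opencode/run-opencode.sh: `run_with_recovery` is a hypothetical name, the command to run is passed in as arguments, and the script's existing attempt counting is left out.

```shell
#!/usr/bin/env bash
# Sketch: run a command; if it fails and its log contains a migration error,
# delete the database and retry exactly once.
# Usage: run_with_recovery <db_path> <log_file> <command> [args...]
run_with_recovery() {
  local db_path="$1" log_file="$2"
  shift 2
  local migration_recovery_done=false
  local status

  while true; do
    status=0
    "$@" >"$log_file" 2>&1 || status=$?
    if (( status == 0 )); then
      return 0
    fi

    # Recover only once, so a persistent error cannot loop forever.
    if [[ "$migration_recovery_done" == "false" ]] \
        && grep -qi "duplicate column name" "$log_file"; then
      echo "migration error detected; deleting $db_path and retrying once" >&2
      rm -f -- "$db_path"
      migration_recovery_done=true
      continue
    fi

    return "$status"
  done
}
```

The flag is set before `continue`, so a second failure with the same message falls through to the final `return` instead of deleting the database again.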

Docs

  • Updated github-run-opencode/README.md with new input and feature description
  • Updated run-opencode/README.md with migration recovery description
  • Updated shared/README.md with cleanup-db.sh documentation

Testing

The test suite produces the same results on this branch as on main: the 21 failures and 4 errors already exist on main (caused by Python 3.14 type union syntax and a missing dogfood workflow file) and are not introduced by this PR.

Files changed

| File | Change |
| --- | --- |
| shared/cleanup-db.sh | New: standalone db cleanup script |
| github-run-opencode/action.yml | Add cleanup-db input + cleanup step |
| run-opencode/run-opencode.sh | Add migration error detection + auto-recovery |
| github-run-opencode/README.md | Document new feature |
| run-opencode/README.md | Document migration recovery |
| shared/README.md | Document cleanup-db.sh |

Implements issue #198 — prevent and recover from stale SQLite database
issues on self-hosted runners.

Scheme A (preventive cleanup):
- New shared/cleanup-db.sh checks opencode.db size before each run
- Deletes db if it exceeds configurable threshold (default 50MB)
- New github-run-opencode input 'cleanup-db' (default: true)

Scheme C (migration failure auto-recovery):
- run-opencode/run-opencode.sh now detects 'duplicate column name' errors
- Auto-deletes the stale db and retries once

Files changed:
- shared/cleanup-db.sh: new standalone cleanup script
- github-run-opencode/action.yml: add cleanup-db input + step
- run-opencode/run-opencode.sh: add migration error recovery
- READMEs updated for both actions + shared/

github-actions Bot commented Jun 7, 2026


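Analysis

Read all 6 changed files in the PR:

Scheme A: preventive cleanup (before run)

| Spec requirement | Implementation status |
| --- | --- |
| shared/cleanup-db.sh checks the size of opencode.db and deletes it when it exceeds the threshold | ✅ Implemented (shared/cleanup-db.sh:14-31) |
| cleanup-db input: true (default 50MB), a number (custom MB), false (disabled) | ✅ Implemented (action.yml:156-163 input definition; :245-257 step handles all three values) |
| Step sits between install and run | ✅ Implemented (action.yml:245, after the install step at :243 and before the run step at :259) |

Scheme C: migration failure auto-recovery (during run)

| Spec requirement | Implementation status |
| --- | --- |
| run-opencode.sh detects duplicate column name errors | ✅ Implemented (run-opencode.sh:148, grep -qi "duplicate column name") |
| Auto-deletes the stale database and retries once | ✅ Implemented (:149-155, deletes the db, then continue to retry) |
| Recovery limited to a single attempt to avoid infinite loops | ✅ Implemented (:133 migration_recovery_done=false; :148 checks the flag; set to true after recovery) |

Docs

| Spec requirement | Implementation status |
| --- | --- |
| github-run-opencode/README.md updated | ✅ Implemented (:19-20 feature description, :50 input documentation) |
| run-opencode/README.md updated | ✅ Implemented (:27 migration recovery description) |
| shared/README.md updated | ✅ Implemented (:13-19 cleanup-db.sh documentation) |

All spec requirements are fully implemented; no omissions were found.

No omissions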


github-actions Bot commented Jun 7, 2026


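Final decision: CONDITIONAL MERGE

This PR introduces automatic database cleanup and migration recovery. The architecture is sound and the defense-in-depth design is good, but there is one blocking issue that needs to be fixed.


Blocking Issues

  1. CLI compatibility of the OPENCODE_DB_PATH environment variable is unverified. The recovery mechanism in run-opencode.sh locates and deletes the database file through OPENCODE_DB_PATH, but it has not been confirmed that the opencode CLI actually recognizes this variable. If the CLI does not use it, the recovery logic will delete the wrong path or a nonexistent one, and auto-recovery on migration failure will be completely ineffective. Before merging, verify that the CLI supports this variable, or switch to the way the CLI actually resolves the path.

Suggestions

  1. Two improvements to grep -qi "duplicate column name" (security + performance): add -m1 to stop scanning after the first match (performance); narrow the match (for example, restrict it to stderr or combine it with a SQLite error prefix) so that AI response content cannot trigger the recovery logic by accident (security).

  2. Validate the OPENCODE_DB_PATH path (security): before rm -f, verify that the path lies within the expected location (such as $HOME/.local/share/opencode/), so that a malicious workflow cannot delete sensitive system files through the environment variable.

  3. Emit a warning when the stat fallback is used (performance): when both the Linux and macOS stat calls fail, echo 0 returns silently. A warning would help with troubleshooting.

  4. Remove the redundant condition (quality): action.yml already has the guard if: ${{ inputs.cleanup-db != 'false' }}, so if [[ "$cleanup_val" == "false" ]]; then exit 0; fi in run-opencode.sh is dead code and should be removed.

  5. Handling of the boundary value "0" (quality): cleanup-db: "0" sets the threshold to 0MB, so any existing database is deleted. State this explicitly in the docs, or add a lower bound of >= 1.

  6. Consistent log level for recovery (quality): the printf output on successful recovery should use ::warning:: instead, matching the warning level of the cleanup step.

  7. Export timing (quality): the default value of OPENCODE_DB_MAX_SIZE_MB in action.yml could be exported only when needed, rather than setting the environment variable unconditionally.

  8. File locking under concurrency (performance): multiple workflows running concurrently on the same self-hosted runner may race. Coordinate them with flock or an atomic mkdir lock.

  9. ${{ github.action_path }} injection risk (security, pre-existing): this is a known platform-level limitation of GitHub Actions composite actions. It already exists in all actions and is not introduced by this PR.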
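
The path check in suggestion 2 could look roughly like the sketch below. The function name `is_safe_db_path` is hypothetical, and the allowed directory is supplied by the caller; the review only names $HOME/.local/share/opencode/ as an example of an expected location.

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeed only when db_path's parent directory resolves
# to allowed_dir or to a directory below it.
is_safe_db_path() {
  local db_path="$1" allowed_dir="$2"
  local resolved_dir resolved_allowed

  # cd -P / pwd -P resolve symlinks and collapse ".." components.
  resolved_dir="$(cd -P -- "$(dirname -- "$db_path")" 2>/dev/null && pwd -P)" || return 1
  resolved_allowed="$(cd -P -- "$allowed_dir" 2>/dev/null && pwd -P)" || return 1

  # The trailing slash keeps /foo/opencode-evil from matching /foo/opencode.
  case "$resolved_dir/" in
    "$resolved_allowed"/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use before deleting (paths are illustrative):
#   if is_safe_db_path "$db_path" "$HOME/.local/share/opencode"; then
#     rm -f -- "$db_path"
#   fi
```

A path whose parent directory does not exist is rejected, which is harmless here because there would be nothing to delete.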


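📋 Detailed results from each reviewer
quality

CONDITIONAL MERGE

This PR adds two features: 1) automatic cleanup of an oversized opencode.db before the run; 2) automatic recovery and rebuild when a SQLite migration fails. The overall structure is sound, but the following issues need to be fixed.

Blocking issues:

  1. shared/cleanup-db.sh line 12: set -euo pipefail conflicts with the stat fallback. When both stat commands fail (for example, in a minimal container without stat), the final echo 0 does not cause an exit under set -e (because it is part of the || chain). However, if one of the stat commands succeeds but returns a non-zero exit code (which does not happen in practice), or, more importantly, because the whole || chain runs inside the $(...) subshell, set -e in the subshell may cause an unexpected exit. Consider removing set -e or handling this case explicitly.

    Addendum: on further review, the issue above does not trigger under normal conditions. The blocking issue that really needs attention is:

  2. github-run-opencode/action.yml: the logic of the cleanup-db step and the recovery mechanism in run-opencode/run-opencode.sh have a path inconsistency. The cleanup step in action.yml deletes OPENCODE_DB_PATH (default ~/.local/share/opencode/opencode.db), but the recovery mechanism in run-opencode/run-opencode.sh also deletes the same path. If cleanup-db has already deleted the database, opencode github run may still fail afterwards because of other corrupted files in the cache or config. More importantly, it needs to be confirmed whether the OPENCODE_DB_PATH environment variable in run-opencode.sh is actually used by the opencode CLI. If the CLI does not recognize it, the recovery logic will delete the wrong path or have no effect.

Suggestions:

  1. Redundant condition: if: ${{ inputs.cleanup-db != 'false' }} in action.yml already ensures the step does not run when the value is false, yet the run script still contains the dead code if [[ "$cleanup_val" == "false" ]]; then exit 0; fi. Remove the redundant inner check.

  2. Boundary value handling: cleanup-db: "0" sets the threshold to 0MB, so any existing database file is deleted. This is an extreme case, but the behavior should be stated in the docs or code, or a minimum value (such as >= 1) should be enforced.

  3. Log consistency: after a successful recovery, run-opencode.sh uses printf to emit a plain log line rather than ::warning::. Use the same warning level as the cleanup step so that users notice it in the workflow summary.

  4. Location of export in action.yml: the default OPENCODE_DB_MAX_SIZE_MB="50" is exported unconditionally in every case. Setting it only when needed would be slightly cleaner.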

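security

AT RISK

Security analysis summary

This PR introduces two safety mechanisms: size-threshold cleanup of the SQLite database and automatic recovery from migration failures. The overall design is sound and input validation is reasonably good, but there are several risk points that a malicious workflow could exploit.


Blocking issues: none

No severe vulnerability was found that must block the merge.


Suggestions

  1. Unrestricted OPENCODE_DB_PATH may allow arbitrary file deletion
    In cleanup-db.sh:20 and run-opencode.sh:149, db_path="${OPENCODE_DB_PATH:-$HOME/...}" is passed directly to rm -f. If a workflow sets the OPENCODE_DB_PATH environment variable to a sensitive system path such as /etc/passwd or ~/.ssh/authorized_keys, those files would be deleted unintentionally. Suggestion: add path validation to ensure the target lies within the expected location (for example, only allow files under $HOME/.local/share/opencode/), or normalize the path and check its prefix.

  2. The grep match in the auto-recovery logic is too broad
    run-opencode.sh:147 uses grep -qi "duplicate column name" "$log_file" as the trigger for deleting the database. If normal opencode output (such as the content of an AI response) happens to contain that string, the database would be deleted by mistake. Suggestion: match only stderr output, or match a more precise SQLite error pattern (such as the combination of SQL logic error and duplicate column name), or match only a specific prefix in opencode's internally formatted output.

  3. ${{ github.action_path }} injection risk (existing pattern, not introduced by this PR)
    ${{ github.action_path }}/../shared/cleanup-db.sh at action.yml:258 follows the standard GitHub Actions composite action pattern. In theory, if an attacker can control the action's checkout path (for example, through a malicious fork), this could lead to shell injection. This is a known limitation of the GitHub Actions platform, and the pattern is present in every action in this project.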
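
One way to narrow the match from suggestion 2 is to require an error prefix on the same line as the message, as sketched below. The helper name `is_migration_error` and the default prefixes are assumptions made for illustration; the exact text opencode prints on a migration failure is not shown in this PR. grep -q already stops reading at the first match, so the sketch does not add -m1.

```shell
#!/usr/bin/env bash
# Hypothetical matcher: succeed only when a line contains an assumed SQLite
# error prefix followed by "duplicate column name".
# Usage: is_migration_error <log_file> [prefix_regex]
is_migration_error() {
  local log_file="$1"
  # Assumed prefixes, NOT confirmed opencode output; pass your own as $2.
  local prefix="${2:-SQLITE_ERROR|SQL logic error}"
  grep -qiE "(${prefix}).*duplicate column name" "$log_file"
}
```

A log line that merely quotes the phrase, such as an AI response explaining the error, no longer triggers recovery unless it also carries one of the prefixes.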

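performance

CONCERNS

This PR adds cleanup-db.sh and the automatic SQLite migration error recovery logic in run-opencode.sh. Overall performance is good and there are no serious problems, only a few points that could be optimized.

Blocking issues: none

Suggestions:

  1. grep -qi "duplicate column name" "$log_file" in run-opencode.sh has no -m1, so the log file is read in full every time opencode exits with a non-zero status. If the log file is large (for example, when opencode produces a lot of output), this may cause unnecessary I/O. Add -m1 to stop reading after the first match.
  2. The database deletion in cleanup-db.sh and run-opencode.sh does not take file locking into account. If multiple workflows run concurrently on the same self-hosted runner, a race condition is possible (two instances check the file size and try to delete it at the same time). Deleting the database is idempotent (rm -f does not fail), but this may lead to unnecessary repeated stat and rm operations. Coordinate through flock or an atomic mkdir lock.
  3. If both stat fallbacks in cleanup-db.sh (macOS and Linux) fail, the script silently returns 0. Deletion is not triggered in that case, but no error is reported either. Emit a warning when stat fails to help with troubleshooting.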
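
The mkdir-based coordination mentioned in item 2 can be sketched as follows. `with_dir_lock` is a hypothetical wrapper, not part of this PR; it relies only on mkdir failing when the directory already exists, and it does not handle stale locks left behind by a killed process.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command while holding a directory-based lock.
# Usage: with_dir_lock <lock_dir> <max_tries> <command> [args...]
with_dir_lock() {
  local lock_dir="$1" max_tries="$2"
  shift 2
  local tries=0

  # mkdir is atomic: exactly one concurrent caller can create the directory.
  until mkdir -- "$lock_dir" 2>/dev/null; do
    tries=$((tries + 1))
    if (( tries >= max_tries )); then
      echo "could not acquire lock $lock_dir" >&2
      return 1
    fi
    sleep 1
  done

  # Run the command, then release the lock whether or not it succeeded.
  local status=0
  "$@" || status=$?
  rmdir -- "$lock_dir"
  return "$status"
}

# Example (illustrative paths):
#   with_dir_lock "$HOME/.local/share/opencode/.cleanup.lock" 30 \
#     bash shared/cleanup-db.sh
```

On Linux runners, flock from util-linux is an alternative that releases the lock automatically when the process exits.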
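architecture

SOUND

The changes in this PR follow the project's existing architectural patterns and introduce no architecture-level problems.

Analysis summary:

  • Coupling: the new shared/cleanup-db.sh is a standalone utility script with no extra dependencies; the migration recovery logic is cohesive within the run loop of run-opencode.sh, with no cross-module coupling.
  • Module placement: putting shared/cleanup-db.sh in the shared directory is reasonable (it can be reused by multiple actions); adding the cleanup-db input and step to github-run-opencode/action.yml (the upstream orchestration layer) matches the layering of responsibilities; adding migration recovery to the run loop of run-opencode.sh is the right location.
  • Layering: github-run-opencode performs preventive cleanup (before the run) and run-opencode performs reactive recovery (on errors during the run), forming defense in depth with no overlap in responsibilities.
  • Interface design: the cleanup-db input supports "true" (default 50MB), a number (custom threshold in MB), and "false" (off), which is flexible and backward compatible; the environment variables OPENCODE_DB_PATH and OPENCODE_DB_MAX_SIZE_MB decouple the script from the action inputs.
  • Shotgun surgery: the changes are confined to where each piece of logic belongs (the new script in shared/, github-run-opencode/action.yml, run-opencode.sh) and are not scattered around.
  • Consistency: the way scripts are invoked (action.yml => shared/*.sh) matches existing practice in this repository; the retry/recovery logic reuses the existing while loop pattern in run-opencode.sh.

Blocking issues: none

Suggestions: none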

@Svtter Svtter merged commit d94e830 into main Jun 7, 2026
4 checks passed
@Svtter Svtter deleted the feat/cleanup-opencode-db branch June 7, 2026 04:14
@Svitter Svitter added the labels triaged (Issue has been triaged), review:p2 (Minor review findings), feat (New feature or enhancement), and setup (Setup and installation) Jun 7, 2026

@Svitter Svitter left a comment

Contributor

Review: P2 — needs maintainer call

This PR adds a two-pronged DB cleanup feature (preventive size-threshold deletion + reactive migration-error recovery) targeting stale opencode.db on self-hosted runners. The approach is sound and the implementation is small and focused (6 files, 78 lines). Ranked P2 because cleanup-db defaults to "true", changing default behavior — conventions require explicit maintainer sign-off.

Findings summary:

  • No tests for cleanup-db.sh or the run-opencode.sh migration recovery block. The existing test suite has zero coverage for these paths.
  • Redundant false guard in github-run-opencode/action.yml:252 — the inline if [[ "$cleanup_val" == "false" ]] is dead code since the step's if: already gates on != 'false'.
  • Suboptimal stat order on Linux in shared/cleanup-db.sh:23: stat -f%z (macOS) runs and fails on every Linux invocation before falling through to stat -c%s. Swap them, since Linux is the only supported runner.
  • OPENCODE_DB_PATH not plumbed — neither the cleanup step nor the run step passes a custom DB path. Both scripts hardcode the default ~/.local/share/opencode/opencode.db. Users with non-default paths need to set OPENCODE_DB_PATH in their workflow env:, but this isn't documented.

None of these are blocking correctness issues. The stat fallback works despite the order; the recovery continue correctly avoids double-counting the attempt. Docs are updated across all three READMEs.

Thanks for the clean, focused PR.

run: |
set -euo pipefail
cleanup_val="$INPUT_CLEANUP_DB"
if [[ "$cleanup_val" == "false" ]]; then exit 0; fi

Contributor

should-fix: This false check is dead code. The step's if: guard on line 245 already ensures inputs.cleanup-db != 'false', so $cleanup_val can never be "false" when this inline script executes. Removing it avoids misleading future readers.

Comment thread shared/cleanup-db.sh
exit 0
fi

size_bytes="$(stat -f%z "$db_path" 2>/dev/null || stat -c%s "$db_path" 2>/dev/null || echo 0)"

Contributor

should-fix: stat -f%z (BSD/macOS) runs first, fails on Linux (exit 1, verified), then stat -c%s (GNU/Linux) succeeds via ||. The action only supports Linux (runner.os != 'Linux' exits early at action.yml:168). Swapping to stat -c%s first eliminates an unnecessary failing subprocess on every invocation. The macOS fallback can remain as-is after the ||.
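
A sketch of the swapped order, assuming the size lookup is factored into a helper (the name `file_size_bytes` is hypothetical). It tries GNU stat first, keeps the BSD/macOS form as the fallback, and prints a warning to stderr instead of silently returning 0 when both fail.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the size of a file in bytes.
# GNU stat (Linux) is tried first; BSD stat (macOS) is the fallback.
file_size_bytes() {
  local path="$1" size
  if size="$(stat -c%s "$path" 2>/dev/null)" \
      || size="$(stat -f%z "$path" 2>/dev/null)"; then
    echo "$size"
  else
    # Keep the original fallback value, but make the failure visible.
    echo "warning: could not determine size of $path; treating it as 0" >&2
    echo 0
  fi
}
```

Because the warning goes to stderr, callers that capture the result with `$(file_size_bytes "$db_path")` still receive only the number.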

Comment thread shared/cleanup-db.sh
@@ -0,0 +1,32 @@
#!/usr/bin/env bash

Contributor

should-fix: Neither cleanup-db.sh nor the migration recovery block in run-opencode.sh (line 148) has test coverage. The existing test suite (tests/test_all.py) has no references to cleanup-db, OPENCODE_DB_PATH, or duplicate column name. At minimum, cleanup-db.sh should have a test that verifies it deletes a file exceeding the threshold and leaves a file under the threshold alone.


# Auto-recover from SQLite migration failures: delete the stale db and retry once.
if [[ "$migration_recovery_done" == "false" ]] && grep -qi "duplicate column name" "$log_file"; then
db_path="${OPENCODE_DB_PATH:-$HOME/.local/share/opencode/opencode.db}"

Contributor

should-fix: OPENCODE_DB_PATH is never set by github-run-opencode/action.yml for either the cleanup-db step or the run-opencode step. If a user configures opencode with a non-default database path, neither the preventive cleanup (cleanup-db.sh) nor the reactive recovery (this line) will find it — both hardcode ~/.local/share/opencode/opencode.db. Consider adding an optional db-path input to github-run-opencode/action.yml that flows OPENCODE_DB_PATH to both steps, or document that users must set OPENCODE_DB_PATH in their workflow env: for custom paths.

Labels

feat (New feature or enhancement), review:p2 (Minor review findings), setup (Setup and installation), triaged (Issue has been triaged)

Development

Successfully merging this pull request may close these issues.

feat: add a workflow run that periodically cleans up opencode.db

2 participants