From c3aa1c9eaeb13239c93f999c3379a9f9751e58ad Mon Sep 17 00:00:00 2001
From: Ralf Anton Beier <ralf_beier@me.com>
Date: Thu, 30 Apr 2026 05:52:09 +0200
Subject: [PATCH] =?UTF-8?q?docs:=20agent-fleet=20wave-1=20synthesis=20?=
 =?UTF-8?q?=E2=80=94=20quickstart=20+=20prioritized=20bug=20list?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Output of a 14-agent parallel review of Temper. Each agent had a distinct
persona (junior dev, safety-critical architect, QA, security, DevOps, docs
reviewer, performance, day-2 ops, Probot/Octokit expert, supply-chain, LLM
prompt engineer, rivet integration, test quality, senior reviewer) and a
non-overlapping scope.

13 of 14 reported successfully. Aggregated into two artifacts:

## docs/agent-fleet/quickstart.md
Hand-off doc for future agents: what Temper is, what's verified working
(don't redesign), what's confirmed broken, what's deliberately deferred,
and how to pick up a fix without repeating wave-1's investigation.

## docs/agent-fleet/bugs.md
Prioritized list of 35 findings, severity-tagged (5 critical, 10 high,
15 medium, 5 low). Each carries a file:line citation and at least one
wave-1 agent flag. Ends with action priorities ("if only one PR this
week", "if a sprint").

## Why these matter

Wave-1 surfaced bugs that 9 months of iterative dev hadn't:

- `synchronizeIssueLabels` silently deletes user-created labels on every
  webhook. Destructive across the org.
- `WEBHOOK_SECRET` defaults to literal "development" when env unset.
  Trust boundary fail-open.
- Dashboard POST routes are unauthenticated.
- Test mock retrofit fakes the broken `octokit.issues.X` namespace —
  exactly why PRs #22, #29, #34 each had to fix it again at a new call site.
- `success: true` returned when configuration silently degraded.
- `/health` always returns 200.

The cross-cutting theme: "looks like it ran" failure pattern across
ruleset fallback, signed-signatures swallow, scheduler tick errors, and
review-save IO. Fix by class, not case-by-case.

## Not in this commit

Wave-2 work — picking up specific fixes, multi-lens AI review, GitHub
Check Run integration, splitting `src/app.js`. Documented as open
questions in `quickstart.md`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 docs/agent-fleet/bugs.md       | 345 +++++++++++++++++++++++++++++++++
 docs/agent-fleet/quickstart.md | 133 +++++++++++++
 2 files changed, 478 insertions(+)
 create mode 100644 docs/agent-fleet/bugs.md
 create mode 100644 docs/agent-fleet/quickstart.md

diff --git a/docs/agent-fleet/bugs.md b/docs/agent-fleet/bugs.md
new file mode 100644
index 0000000..d9cbd32
--- /dev/null
+++ b/docs/agent-fleet/bugs.md
@@ -0,0 +1,345 @@
+# Temper — prioritized bug list (wave-1 fleet output)
+
+Findings from 13 specialist agents (one pending). Each has a file:line citation
+and at least one wave-1 agent flag. Severity tiers:
+
+- 🔴 **Critical** — destructive, exploitable, or trust-boundary
+- 🟠 **High** — silent wrongness, lost work, audit-trail gap
+- 🟡 **Medium** — quality, performance, hygiene
+- 🟢 **Low** — polish
+
+Sort: severity, then independent-flagger count, then ease-of-fix.
+
+---
+
+## 🔴 Critical
+
+### 1. `synchronizeIssueLabels` silently deletes user-created labels (`src/labels.js:69-78`)
+Runs on every `repository.created`, every `/sync-all-repos`, every
+`/configure-repo`, every `reconcile-repo` task. Any label not in `config.yml
+issue_labels` is destroyed. Open issues lose their tags. **There is no opt-out
+and no preserve-list.** Single `/sync-all-repos` invocation triggers this
+across every repo in the org.
+- **Flagged by:** QA bug-hunter (#1)
+- **Fix:** add `synchronize_mode: 'merge' | 'replace'` (default `'merge'` —
+  only create/update; never delete). Behind a config flag for explicit
+  opt-in to destructive sync.
+
+### 2. `WEBHOOK_SECRET` fail-open default
+Probot defaults the webhook secret to literal `"development"` when the env var
+is unset. `index.js` and `mapLegacyEnvVars` (`src/app.js:1136-1149`) don't
+fail-fast on missing secret. A misconfigured deploy silently accepts forged
+webhooks signed with `"development"`.
+- **Flagged by:** Security auditor (#1 high)
+- **Fix:** in `index.js` before `await run(registerApp)`:
+  ```js
+  if (!process.env.WEBHOOK_SECRET || process.env.WEBHOOK_SECRET === 'development') {
+    throw new Error('WEBHOOK_SECRET is required and cannot equal "development"');
+  }
+  ```
+
+### 3. Dashboard POST routes are unauthenticated (`src/dashboard.js:1287-1495`)
+`POST /dashboard/actions/sync`, `/actions/review`, `/actions/analyze-org`, and
+JSON APIs `/api/org/repos`, `/api/org/health` are reachable without auth. Any
+caller able to hit the bot's port can trigger org-wide reconfiguration or
+exfiltrate repo data.
+- **Flagged by:** Security auditor (#3 high)
+- **Fix:** Bearer / shared-secret middleware for all `/dashboard/*` and
+  `/api/*` routes; or bind to `127.0.0.1` only behind an authenticating
+  reverse proxy.
+
+### 4. `issues.opened` provisioning lacks the auth gate (`src/app.js:327-417`)
+Comment-driven commands enforce `requireOrgMember` + `allowed_command_users`.
+The issue-form provisioning path (controller_repo) doesn't. Anyone able to
+file a labelled issue in the controller repo provisions a real org repo.
+- **Flagged by:** Security auditor (#2 high)
+- **Fix:** add the same allowlist + org-member check before
+  `enqueueTask('provision-repo', ...)`.
+
+### 5. Two SQLite connections to the same `dedup.db` (`src/app.js:1039-1047`)
+`initKVStore` is called twice (idempotency + ai_rate_limits) against the same
+file. WAL tolerates multi-connection reads, but writes from two connections
+can `SQLITE_BUSY` outside a single transaction. When `markProcessed` and
+`recordReview` race, one throws → handler crashes mid-flight → `markProcessed`
+skipped → **same delivery is reprocessed and the AI review runs twice.**
+- **Flagged by:** QA bug-hunter (#3 critical)
+- **Fix:** share one `Database` connection between the two KV instances.
+
+---
+
+## 🟠 High
+
+### 6. Chatops handler doesn't `markProcessed` on early returns or exceptions
+`handleChatopsIssue` early-returns on auth failure / missing fields without
+`markProcessed`. Exception during `closeIssue()` after `commentBack()`
+succeeded leaves the wrapping handler unable to mark the delivery — Probot
+retries → **duplicate analysis-report issues, duplicate auth-rejection
+comments**. Same pattern for `/check-config`, `/sync-all-repos`,
+`/allow-merge-commit`.
+- **Flagged by:** QA bug-hunter (#2 critical)
+- **Fix:** `try/finally` wrapping with `markProcessed` always, plus
+  idempotency via dedup key on the action itself.
+
+### 7. `success: true` returned when configuration silently degraded
+- `applyBranchEnforcement` falls back to legacy on rulesets 422/403/404
+  silently (`src/repository.js:30-58`).
+- `setRequiredSignatures` swallows 404/422 with a `warn`
+  (`src/branch-protection.js:55-62`).
+- Result: `configureRepository` returns `{success: true}` even when signed
+  commits aren't enforced and no rulesets are applied.
+- **Flagged by:** Safety-critical architect (#1, structural), QA, DevOps
+- **Fix:** read-back-and-assert pattern. After applying, fetch the actual
+  state and assert it matches the desired set. Surface mismatches as
+  `{success: false, missing: [...]}`. Emit a GitHub Check Run on the repo's
+  default branch with conclusion `failure` so an org-required check refuses
+  merges until the bot succeeds.
+
+### 8. `/health` always returns 200 (`src/app.js:984+`)
+Returns `status: 'healthy'` unconditionally. Scheduler stuck, SQLite locked,
+AI endpoint dead, disk full: all return 200. PM2 never restarts on hang.
+- **Flagged by:** DevOps/SRE (#1 critical), Safety-critical architect
+- **Fix:** Real liveness signal: scheduler last-tick-age threshold, DB ping,
+  free-disk threshold, last-error counter. Return 503 on any. Wire to PM2
+  `--max-memory-restart` and external uptime monitor.
+
+### 9. Inline AI review on `pull_request.opened` is not durable (`src/app.js:840-929`)
+Auto AI review runs in the webhook handler with no task-store enqueue. PM2
+restart mid-`reviewPullRequest` = silent loss. Same for the `enableAutoMerge`
+mutation. Compare to `provision-repo`, `generate-dependabot`,
+`revert-merge-settings`, `reconcile-repo` which ARE durable.
+- **Flagged by:** DevOps/SRE, Performance, Safety architect
+- **Fix:** enqueue `auto-review` and `auto-enable-merge` tasks; webhook
+  handler returns within ~50 ms.
+
+### 10. `reviews.json` non-atomic full-rewrite (`src/ai-review.js:622-651`)
+`fs.writeFileSync(REVIEWS_PATH, JSON.stringify(_reviews))` on every review.
+If process killed mid-write the file is truncated → next `loadReviews()`
+swallows via bare `catch {}` (line 631) → **all review history silently
+resets to `[]`.**
+- **Flagged by:** QA, Performance, DevOps
+- **Fix:** replace with a SQLite table next to `tasks.db`. Append, never
+  rewrite. Bonus: add tamper-evidence (SHA chain) for the audit-trail claim
+  the safety architect flagged.
+
+### 11. Test mock retrofit hides production bugs (`__tests__/integration/{ai-review,app}.test.js`)
+`createMockOctokit` retrofits `octokit.issues.X` jest.fn()s over the real
+`octokit.request` route table. **Production code that calls `.issues.X` passes
+tests and crashes in prod.** This is exactly the PR #22 / #29 / #34 regression
+class still active. The smoke `webhook.test.js` accepts `[200, 202, 401, 500]`
+for `issue_comment.created`, swallowing the very 500 a `.issues` call would
+produce.
+- **Flagged by:** Test-quality engineer, Senior reviewer (pending)
+- **Fix:** replace with a "real-shape" octokit fake exposing only `.request()`
+  and `.paginate()` — Probot v14's actual surface. Tighten e2e to `[200, 202]`
+  only.
+
+### 12. Slop regex too greedy (`src/ai-review-prompt.js:65+`)
+`\b(could|should|may|consider)\b` drops legitimate findings:
+- "this **could** panic on null deref at line 42"
+- "function **should** not be called from async context"
+- "**Consider**[ed]" — `\bconsider\s+/` kills "the constructor doesn't consider the empty-list case"
+Quote-or-die `String.includes` is whitespace-fragile (CRLF/tab normalisation).
+Schema accepts `quoted_line: "+"` (1-char minimum) — every diff has `+`.
+- **Flagged by:** LLM prompt engineer
+- **Fix:** convert single-word patterns to phrase patterns ("we should
+  consider", "might be wise to"). Normalise whitespace before quote check.
+  Tighten schema: `quoted_line` `pattern: "^[+-].+"`, `claim minLength: 30`.
+
+### 13. `pull_request.opened` has no idempotency guard (`src/app.js:840+`)
+Only the 5-min AI rate limit prevents duplicate auto-reviews on webhook
+redelivery. `enablePullRequestAutoMerge` GraphQL is called every time without
+guard.
+- **Flagged by:** QA bug-hunter
+- **Fix:** standard `isProcessed`/`markProcessed` pattern at handler entry.
+
+### 14. Self-update: no signature verification, no rollback
+`src/self-update.js:6-28` spawns `tools/self-update/target/release/temper-self-update`
+inheriting full env (line 21, `...process.env` leaks all secrets to child).
+No SHA / signature check on the binary. `git reset --hard origin/main` is not
+atomic with `npm ci`; if `npm ci` fails after reset, you're on new code with
+old/partial deps and no rollback.
+- **Flagged by:** Safety architect, DevOps
+- **Fix:** snapshot SHA before reset, smoke `node -e "require('./index.js')"`
+  before `pm2 restart`, auto-revert on failure. Verify a signed release tag
+  before exec'ing the binary. Whitelist env vars passed to the child.
+
+### 15. Tarball extraction lacks symlink/path-traversal safety (`src/rivet-fetch.js:48`)
+`tar -xz --strip-components=1 -C destDir` — no `--no-same-owner`, no `-P`
+rejection, no symlink guard. Malicious PR head ref can include a symlink
+pointing outside `destDir`; subsequent `runRivetOracle` reads `rivet.yaml`
+from that target.
+- **Flagged by:** Security auditor (medium)
+- **Fix:** drop symlinks via JS `tar` lib with `filter`, or `--no-overwrite-dir`
+  + pre-validate entries.
+
+---
+
+## 🟡 Medium
+
+### 16. Multiple `.repos.X` / `.pulls.X` calls in dashboard.js (PR #22 risk class)
+`octokit.repos.get` (`dashboard.js:79`), `getCommitSignatureProtection` (:83),
+`pulls.list` (:87), `getBranch` (:432), `compareCommits` (:440), `listBranches`
+(:457), `actions.listWorkflowRunsForRepo` (:74), `checks.listForRef` (:103).
+Plus `app.js:268` `context.octokit.repos.getBranch` — exact same shape as the
+PR #22/#29/#34 saga.
+- **Flagged by:** Probot/Octokit expert
+- **Fix:** sweep these to `.request()` form. dashboard.js builds its own
+  octokit (line 31), so the namespace IS reliable there — but app.js:268 is
+  on `context.octokit` and shares the failure mode.
+
+### 17. Documentation references npm scripts that don't exist
+README.md:141-144 and DEVELOPMENT.md:230-233 list `npm run deploy:docker`,
+`deploy:heroku`, `deploy:server`, `deploy:netcup`. Only `docker:build` exists
+in package.json.
+- **Flagged by:** Junior dev, Documentation reviewer
+- **Fix:** delete the fictional scripts from docs OR add them to package.json.
+
+### 18. GITHUB_APP_SETUP.md uses obsolete env var names
+Uses `GITHUB_APP_ID`/`GITHUB_PRIVATE_KEY`/`GITHUB_WEBHOOK_SECRET` (legacy);
+`.env.example` and DEVELOPMENT.md use the canonical `APP_ID`/`PRIVATE_KEY`/
+`WEBHOOK_SECRET`. Junior devs copy whichever they read last.
+- **Flagged by:** Junior dev, Documentation reviewer
+- **Fix:** update GITHUB_APP_SETUP.md to canonical names.
+
+### 19. Sequential GitHub fetches in `reviewPullRequest` (`src/ai-review.js:382-398`)
+3 strictly sequential REST calls (PR meta, diff media-type, files): ~600-1500
+ms before any work. Independent — could be `Promise.all`.
+- **Flagged by:** Performance
+- **Fix:** parallelise; saves 0.5-1 s per review.
+
+### 20. Sequential rivet head + base validate (`src/ai-review.js:417-451`)
+Two tarball fetches + two extracts + two rivet runs, sequentially. ~50%
+wall-time savings if parallelised.
+- **Flagged by:** Performance
+- **Fix:** `Promise.all` the head and base oracle runs.
+
+### 21. `synchronizeAllRepositories` no concurrency cap (`src/organization.js:34-52`)
+Fully sequential `for...of`. ~30-60 s for 13 repos. With 100 repos it can
+also exceed the 5000/h GitHub rate limit.
+- **Flagged by:** Performance, Probot expert
+- **Fix:** `p-limit(3)` bounded concurrency.
+
+### 22. `provenance` job attests `package.json` only (`.github/workflows/ci.yml:75-92`)
+Doesn't attest the Docker image, the self-update Rust binary, or the SBOM.
+Actions pinned by tag (`@v4`, `@v0`), not commit SHA. `node:20-alpine` is a
+floating tag.
+- **Flagged by:** Supply-chain
+- **Fix:** build actual artifacts, capture digests, feed digests into
+  `attest-build-provenance@v2` and `attest-sbom@v2` as `subject-digest`.
+  SHA-pin every action. Pin base image to digest.
+
+### 23. `npm audit ... || true` swallows findings (`.github/workflows/ci.yml:21`)
+Audit is advisory, not enforcing. Vulnerable deps merge to main without alarm.
+- **Flagged by:** Supply-chain
+- **Fix:** drop `|| true`. Set `--audit-level=high` if you want to allow
+  moderate transitive findings.
+
+### 24. eslint@8 is end-of-life
+No security backports as of 2024-10. Quality gate against deprecated toolchain.
+- **Flagged by:** Supply-chain
+- **Fix:** migrate to eslint v9 (flat config) — sizable but isolated.
+
+### 25. `provisioning.js` regex accepts invalid GitHub repo names
+`REPO_NAME_RE = /^[a-zA-Z0-9._-]+$/` accepts `.`, `..`, `.git`, leading
+dots, `-`, `--`. GitHub 422s on these; user sees generic "❌ Failed to
+create repository".
+- **Flagged by:** QA, Junior dev
+- **Fix:** stricter regex; reject leading dots / dashes / `.git` suffix.
+
+### 26. `_reviewTimestamps` in-memory Map grows unbounded (`src/ai-review.js:22`)
+Dead code when persistent store wired; memory leak when not. No `cleanup()`.
+- **Flagged by:** QA, DevOps
+- **Fix:** delete the in-memory fallback now that SQLite KV is the canonical path.
+
+### 27. `dependabot.js` order-sensitive equality check (`src/dependabot.js:108`)
+`JSON.stringify(currentConfig) === JSON.stringify(targetConfig)` — yaml.load
+preserves source order; identical configs in different key orders compare
+unequal → unnecessary writes (and PRs when `use_pull_requests: true`).
+- **Flagged by:** QA
+- **Fix:** deep-equal a normalised form, not stringified.
+
+### 28. Module-level `loadConfig()` and `startAutoCleanup()` at import time
+`src/config.js:82`, `src/idempotency.js:74`. Side effects at module import
+make the bot untestable in isolation and prevent a "validated config or
+refuse to start" boot policy.
+- **Flagged by:** Safety architect, Senior reviewer
+- **Fix:** explicit `init()` called from `index.js`; refuse to start on
+  invalid config.
+
+### 29. AI prompt-injection mitigation incomplete (`src/ai-review.js:68-71`)
+`sanitizeAIOutput` only strips `::workflow-command::`. Doesn't redact
+`<img>` HTML, mass `@mentions`, or fake "Approval" verdicts. PR diffs are
+attacker-controlled.
+- **Flagged by:** Security auditor (medium), LLM expert
+- **Fix:** widen sanitizer; consider posting AI output as a quote block to
+  prevent markdown rendering of model output.
+
+### 30. Rivet binary version drift
+Bot ships with rivet v0.4.3; rivet repo's main may use newer schema (this
+already happened during dev). Conversion bugs become silent
+"unknown→unknown" findings.
+- **Flagged by:** Rivet expert
+- **Fix:** on startup, run `rivet --version` and assert against a pinned
+  `expected_rivet_version`. On mismatch, log loudly and disable the oracle.
+
+---
+
+## 🟢 Low
+
+### 31. `CONTRIBUTING.md` links to non-existent `CODE_OF_CONDUCT.md`
+- **Flagged by:** Junior dev, Documentation
+- **Fix:** add the file or remove the link.
+
+### 32. `CHANGELOG.md` migration guide tells users to install probot v12, octokit v19
+`package.json` ships probot v14, octokit v22.
+- **Flagged by:** Documentation
+- **Fix:** delete the outdated migration block.
+
+### 33. `verify-netcup-deployment.sh` is a printf script with no enforcement
+Every check is informational; only missing-files exits non-zero. Suggests
+`node index.js &` with no PM2.
+- **Flagged by:** DevOps
+- **Fix:** make checks enforcing or delete; commit `ecosystem.config.js`
+  for PM2 instead.
+
+### 34. No `LICENSE` file at repo root; `package.json` declares MIT
+18 BSD-3 + 12 Apache-2.0 transitive deps require attribution that the
+repo doesn't carry.
+- **Flagged by:** Supply-chain
+- **Fix:** add LICENSE + NOTICE.
+
+### 35. Probot v14 events not subscribed but should be
+- `pull_request.synchronize` — bot doesn't re-review on force-push (10-line fix)
+- `pull_request.reopened` — symmetrical to `.opened`
+- `pull_request_review` / `pull_request_review_comment` — bot can't react to
+  human reviews
+- `check_run.rerequested` — would let `/review-pr` be triggered from Checks UI
+- `installation.created` / `installation_repositories.added` — could replace
+  manual `/sync-all-repos` flow
+- **Flagged by:** Probot/Octokit expert
+
+---
+
+## What's NOT here
+
+- The senior code reviewer's report wasn't in when this was written. When it
+  arrives, fold into the relevant categories above (likely module-cohesion
+  and complexity hotspots, which the safety architect already covered).
+- Wave-2 will add new findings as fixes are attempted (and surface bugs
+  in fixes themselves).
+
+---
+
+## Action priorities for the human
+
+If only one PR this week: **#1 (label deletion)**. It's destructive, runs
+on every webhook, and the fix is small.
+
+If two: add **#2 (`WEBHOOK_SECRET` fail-fast)**. One line; closes a real
+trust boundary.
+
+If three: add **#11 (real-shape octokit test fake)**. Single change that
+prevents the entire PR-22 regression class from ever shipping again.
+
+If a sprint: 1, 2, 4, 5, 7, 8, 9 — covers destructive, trust, audit-trail,
+liveness in one sweep.
diff --git a/docs/agent-fleet/quickstart.md b/docs/agent-fleet/quickstart.md
new file mode 100644
index 0000000..de43b40
--- /dev/null
+++ b/docs/agent-fleet/quickstart.md
@@ -0,0 +1,133 @@
+# Temper — agent-fleet quickstart
+
+Hand-off doc for wave-2 agents. Read this **before** investigating Temper for
+the first time so you don't repeat work that 14 wave-1 agents already did.
+
+## What Temper is, in one paragraph
+
+Probot v14 (Node.js, ESM) GitHub App that hardens `pulseengine` org repos to a
+declarative `config.yml`. Reacts to `repository.created`, `issue_comment.created`,
+`issues.opened`, `pull_request.{opened,closed}`, and `push`. Persists state to
+SQLite (`data/tasks.db`, `data/dedup.db`). Runs a 5-minute scheduler tick.
+Deployed on a single Netcup VPS (3.8 GB RAM, 2 cores, no swap) under PM2 with
+self-update on `push` to `main`. Talks to a local Ollama
+(`qwen2.5-coder:3b`) for AI code review on PRs, and to a `rivet` CLI for
+mechanical traceability oracle on `pulseengine/rivet`-instrumented repos.
+
+## What's been verified to work (don't redesign these)
+
+- **Repository Rulesets API is the right replacement** for legacy branch
+  protection on empty repos (`src/rulesets.js`). The translator + drift
+  detection works correctly when contexts are non-empty.
+- **Persistent task store** (`src/task-store.js`) — atomic claim via
+  `UPDATE ... RETURNING`, dedup by key, exponential backoff, 3-attempt cap.
+- **Idempotency + AI rate-limit** moved to SQLite KV (`src/persistent-kv.js`,
+  `src/idempotency.js`, `src/ai-review.js`). Survives PM2 restart.
+- **AI review pipeline shape** — strict-JSON system prompt + grammar-constrained
+  Ollama `response_format` + parser + slop filter + quote-or-die + deterministic
+  `computeVerdict` is **the right architecture for a 3 B model**. Don't redesign.
+- **Rivet oracle integration** — `runRivetValidate`, `runRivetImpact`,
+  `subtractFindings` (delta vs base), `groupOracleFindings` (collapse identical
+  messages). Tarball-fetch is side-effect-clean.
+- **chatops_repo restriction** — `chatops_repo.enabled: true` plus a private
+  admin repo correctly hides ChatOps surface from public repos.
+- **Issue forms in `docs/temper-ops-template/`** dispatch to ChatOps via
+  `handleChatopsIssue` based on `chatops:<command>` label. Smoke-tested.
+
+## What's confirmed broken (the bugs.md file is the prioritized list)
+
+Top categories from 13 specialists:
+
+1. **Silent state mutation** — `synchronizeIssueLabels` deletes labels not in
+   target list. **Destructive** on every webhook tick.
+2. **Trust-boundary fail-open** — `WEBHOOK_SECRET` defaults to literal
+   `"development"` in Probot when env var is unset. Index doesn't fail-fast.
+3. **Unauthenticated dashboard POST routes** can trigger org-wide actions and
+   exfiltrate repo data.
+4. **`issues.opened` provisioning has no auth gate** (the comment-driven
+   commands do; the issue-form path doesn't).
+5. **`/health` lies** — always returns 200 even when scheduler is dead. PM2
+   never restarts on hang.
+6. **Silent degradation reported as success** — rulesets 422 → fallback to
+   legacy → "✅ Configured". `setRequiredSignatures` 404 → swallowed → audit
+   trail says "configured" but enforcement isn't there.
+7. **Test mocks fake the broken namespace** — `createMockOctokit` retrofits
+   `octokit.issues.X` over `octokit.request` so production code that uses
+   `.issues.X` *passes tests* and *crashes in prod*. This is exactly the PR #22
+   regression class, still active in 2 test files.
+8. **Module-level singletons everywhere** — `_taskStore`, `_scheduler`,
+   `_octokit`, `_reviews`, `activeLogger`, `config`. No injection seam.
+9. **Inline AI review on `pull_request.opened`** is not in the task store; PM2
+   restart mid-review = silent loss.
+10. **No GitHub Check Run** — AI review can't be a required check, so its
+    "request_changes" verdict cannot actually gate merges.
+
+## What's been deliberately left for wave-2 (open questions)
+
+- **Multi-lens AI review** — discover (3 narrow lenses) → fresh-session validate
+  → emit. The single-shot strict-JSON path is wave-1; multi-lens is the next
+  quality jump and was scoped but not built.
+- **GitHub Check Runs API integration** — converting AI review verdicts into
+  `check_run` results that can be required-checked. Architecture decision pending.
+- **`pull_request.synchronize` event subscription** — bot doesn't re-review on
+  force-push today. Probot expert flagged this as a 10-line change.
+- **Replace `reviews.json` with a SQLite table** — DevOps + Performance both
+  flagged the unbounded full-file-rewrite pattern.
+- **Split `src/app.js`** — 1159 lines doing event handling, ChatOps slash
+  commands, ChatOps issue-form dispatch, persistence init, scheduler init,
+  helper utilities. Senior code reviewer's recommended next refactor.
+- **CLI `temper-admin`** — single binary wrapping every diagnostic in the
+  runbook. Day-2 ops agent's #1 missing tool.
+- **Per-repo config overlay** — currently global only. Many fixes touch this
+  surface; do it before adding policy variations.
+
+## How to pick up a fix
+
+1. Skim `docs/agent-fleet/bugs.md` — pick a bug.
+2. Verify the file:line citation is still accurate (the codebase moves; some
+   findings may have been fixed in PR #34+).
+3. Branch off `main`. Open a focused PR. Don't bundle.
+4. Tests: **don't trust** `__tests__/integration/{ai-review,app}.test.js` —
+   their `createMockOctokit` retrofit fakes the broken namespace. If your fix
+   touches an octokit call, add a "real-shape" test (only `.request()` and
+   `.paginate()`, no `.issues`/`.repos` namespaces).
+5. PR description should cite which wave-1 agent (or multiple) flagged the bug,
+   and quote the file:line reference.
+
+## Reference: the 14 wave-1 personas + their non-overlapping scopes
+
+| # | Persona | Primary scope |
+|---|---|---|
+| 1 | Junior dev | README/DEVELOPMENT/SETUP onboarding experience |
+| 2 | Safety-critical EE architect | Failure modes vs ASIL-D/safety-critical fitness |
+| 3 | QA bug-hunter | Race conditions, error swallowing, off-by-one |
+| 4 | Security auditor | OWASP top 10, signature, injection, traversal |
+| 5 | DevOps / SRE | Production readiness, observability, recovery |
+| 6 | Documentation reviewer | Docs vs code drift |
+| 7 | Senior code reviewer | Module cohesion, complexity hotspots (pending) |
+| 8 | Performance engineer | Hot paths, latency, parallelism |
+| 9 | Day-2 operations | Runbook for 8 incident types |
+| 10 | Probot/Octokit expert | v14 idioms, deprecated patterns, missing events |
+| 11 | Supply-chain compliance | SBOM, provenance, deps, license, lockfiles |
+| 12 | LLM prompt engineer | Strict-JSON contract, slop filter, multi-lens |
+| 13 | Rivet integration | Oracle correctness, schema drift, missed features |
+| 14 | Test quality | Coverage shape, mock leakage, flakiness |
+
+Each agent's full report is in `/private/tmp/claude-501/.../tasks/<id>.output`
+on the orchestrator's machine; the synthesis here is the canonical record.
+
+## Cross-cutting themes (multiple agents converged)
+
+- **"Looks like it ran" failure pattern** — repeated across the codebase:
+  ruleset 422 → silent legacy fallback, signed-signatures 404 → silent skip,
+  scheduler tick error → silent skip, save-review IO error → silent reset to
+  `[]`. Fix by class, not case-by-case.
+- **The bot can't actually gate commits** — it posts a comment, not a Check
+  Run. Every "AI review enforces standards" framing in docs is hollow until
+  this is wired.
+- **Mocking too much** — `app.test.js` jest.mocks 12 modules. Most tests
+  exercise routing logic against ghosts. Add at least one wired integration
+  test per dispatch path.
+- **Module-level state is the architectural debt** — every singleton blocks
+  isolation, testing, and multi-installation. Untangling this is the
+  highest-leverage refactor.