feat(privacy): local-only operator-report mode for the wiki privacy gate #3450
…y gate When the promotion gate blocks, its public output stays redacted to leak-N labels. The new --operator-report mode prints the offending page paths and a remediation sequence so an operator or agent doesn't have to reverse-engineer the gate — but it hard-refuses to run in CI (GITHUB_ACTIONS set or CI=true → exit 2 before any detection), so canonical names can never reach a public Actions log. Detection is shared through a single collectLeaks helper so the redacted and operator paths cannot drift, and a runCli seam makes the CI-refusal and both output paths directly testable without spawning. Closes #3408
fro-bot
left a comment
The shape here is right: one detection pipeline (collectLeaks) feeding two formatters, with the redacted gate and the operator report sharing a single source of truth. That kills the drift risk between what CI sees and what an operator sees — the two outputs can't disagree about which pages leaked.
The privacy-critical ordering is correct. In runCli the CI check fires before collectLeaks, so when GITHUB_ACTIONS is set the function returns exit 2 with zero file I/O and zero canonical names touched. The refusal predicate (env.GITHUB_ACTIONS !== undefined || env.CI === 'true') is conservative in the right direction — any value of GITHUB_ACTIONS, even empty string, refuses. That's fail-closed, which is the only acceptable default when the failure mode is a private owner--repo name hitting a public log.
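A minimal sketch of the refusal predicate quoted above, to make the fail-closed behavior concrete. The function name `refusesOperatorReport` and the `Env` type are hypothetical; only the boolean expression itself comes from the PR.

```typescript
// Hypothetical env shape; the real script presumably reads process.env.
type Env = { GITHUB_ACTIONS?: string; CI?: string }

// Predicate as quoted in the review: any defined GITHUB_ACTIONS value
// (including the empty string) refuses, as does CI being exactly 'true'.
function refusesOperatorReport(env: Env): boolean {
  return env.GITHUB_ACTIONS !== undefined || env.CI === 'true'
}

// A few illustrative environments and whether the operator report is refused.
const cases: Array<[string, Env]> = [
  ['GITHUB_ACTIONS=true', { GITHUB_ACTIONS: 'true' }],
  ['GITHUB_ACTIONS set but empty', { GITHUB_ACTIONS: '' }],
  ['CI=true', { CI: 'true' }],
  ['CI=false', { CI: 'false' }],
  ['neither set', {}],
]

for (const [label, env] of cases) {
  console.log(`${label}: refuse=${refusesOperatorReport(env)}`)
}
```

The empty-string case is the one that distinguishes `!== undefined` from a truthiness check: a truthiness check would let an empty `GITHUB_ACTIONS` through.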
I verified the gate wiring: merge-data.yaml:54 invokes the script with no flag, so the promotion gate stays on the redacted path. --operator-report is never wired into any workflow step. Good.
Verdict: PASS
Blocking issues
None.
Non-blocking concerns
- The `runCli` seam injects `env` but not the filesystem; the local-operator path tests drive behavior through the hoisted `node:fs/promises` mock rather than real I/O. That's consistent with the existing `gateway-announce.ts` convention in this repo, so no change requested; just noting the operator path's real-FS behavior (reading actual data-branch pages) is exercised by the mock contract, not an integration test. Acceptable for a gate whose security property is the CI short-circuit, which is directly tested.
- `formatOperatorReport`'s "do not paste into public logs" warning is load-bearing documentation, not an enforced control. The enforcement lives entirely in the `runCli` refusal. That's the correct place for it, but it means the only thing standing between this and a leak is that no future caller invokes `formatOperatorReport` outside the guarded branch. The doc comment says as much; keep it that way.
Missing tests
None. Coverage hits the load-bearing cases: CI-refusal via both GITHUB_ACTIONS and CI, the assertion that refusal stdout/stderr carry no .md substring, normal-mode-in-CI still running redacted, the local actionable path emitting the filename, and the negative assertion that operator output does not use the leak-N redacted format. 100/100 pass locally.
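The cases listed above can be illustrated with a small in-process sketch. This is not the PR's code: `runCliSketch`, the injected `detect` callback, the exit code 1 for a blocked gate, the message text, and the exact `leak-N` line format are all assumptions made for illustration. Only exit 2 on refusal and the refusal predicate come from the PR. The real `runCli` likely reads files asynchronously; the sketch is synchronous for brevity.

```typescript
type Env = { GITHUB_ACTIONS?: string; CI?: string }

interface CliResult {
  exitCode: number
  stdout: string
  stderr: string
}

// Hypothetical stand-in for runCli; detection is injected so calls can be counted.
function runCliSketch(argv: readonly string[], env: Env, detect: () => string[]): CliResult {
  const wantsReport = argv.indexOf('--operator-report') !== -1
  const inCi = env.GITHUB_ACTIONS !== undefined || env.CI === 'true'

  if (wantsReport && inCi) {
    // Refuse before detection so no page name is ever read or printed.
    return { exitCode: 2, stdout: '', stderr: 'refusing --operator-report in CI\n' }
  }

  const leaks = detect()
  if (leaks.length === 0) {
    return { exitCode: 0, stdout: '', stderr: '' }
  }

  // Same detection result, two renderings: real paths locally, labels otherwise.
  const lines = wantsReport ? leaks : leaks.map((_, i) => `leak-${i + 1}`)
  return { exitCode: 1, stdout: lines.join('\n') + '\n', stderr: '' } // exit 1 is assumed
}

// Counting stub in place of real detection; the page name is a placeholder.
let detectCalls = 0
const detect = (): string[] => {
  detectCalls += 1
  return ['owner--repo.md']
}

// Operator report requested under CI: refused, detection never invoked.
const refused = runCliSketch(['--operator-report'], { GITHUB_ACTIONS: 'true' }, detect)
const callsAfterRefusal = detectCalls

// Normal mode under CI: still runs, output stays redacted.
const redactedInCi = runCliSketch([], { CI: 'true' }, detect)

// Operator report locally: lists the offending page by name.
const local = runCliSketch(['--operator-report'], {}, detect)
```

Counting calls to the injected detector is one way to assert "zero detection" directly, rather than inferring it from the absence of filenames in the output.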
Risk assessment: LOW
Additive change to a single script plus tests. No workflow, permission, or dependency surface touched. The one genuinely high-stakes property — no canonical name reaching a public log — is enforced by a pre-detection short-circuit and asserted by tests that check for the absence of any .md substring in CI output. The redacted gate path is unchanged.
Run Summary
| Field | Value |
|---|---|
| Event | pull_request |
| Repository | fro-bot/.github |
| Run ID | 26996184167 |
| Cache | hit |
| Session | ses_169de39e2ffec6JnGSrtlCW5ku |
When the promotion-time privacy gate blocks, its public output is redacted to per-run `leak-N` labels, because canonical `owner--repo` page names must never reach the public Actions log. But that left the remediation path under-specified: an operator or agent had to reverse-engineer the gate to find which pages to fix.

This adds a local-only `--operator-report` mode that prints the offending page paths and a remediation sequence, while keeping the public CI output redacted.

- `--operator-report` refuses to run when `GITHUB_ACTIONS` is set or `CI=true`. It returns exit 2 with a short message before any detection runs, so no canonical name can ever reach a public log. Verified against a real leak fixture: under CI it refuses with zero filename output; locally it lists the offending page; the normal redacted gate is unchanged.
- Detection is shared through a single `collectLeaks` helper, so the two outputs cannot drift.
- A `runCli(argv, env)` seam makes the CI-refusal and both output paths directly testable without spawning a process. 16 new tests, including the CI short-circuit and the local actionable path.

Closes #3408