docs(knowledge): record the wiki-presence provenance boundary as an accepted residual #3451
…ccepted residual The promotion gate's attribution check trusts a page's self-declared structured sources rather than independently proving the body concerns that repo. Document why that is safe — the real provenance boundary is the upstream Fro Bot identity, data-branch sole-writer, and authority-guard controls, so no untrusted actor can plant a page with forged sources — and why content-level provenance is intentionally not built (no reachable attack; the machinery is not worth it). Recorded so the decision is explicit and revisitable if upstream controls weaken. Closes #3418
fro-bot
left a comment
This is the kind of residual worth carving into the schema: a security decision that looks like an omission until someone writes down why it's an omission. The note is accurate against the code it describes.
I verified the claims rather than taking them on faith:
- `scripts/check-wiki-private-presence.ts` exists and does exactly what the note says. `parseFrontmatterSources` (L52) extracts structured `sources[].url`, and `sourceUrlMatchesRepo` (L108) enforces an exact `https://github.com/{owner}/{repo}` match: strict protocol, host, port, and first-two-path-segment checks. The structured-sources path (L195-207) deliberately ignores body substring, which is precisely the "trusts the page's own declared sources" boundary the note describes.
- The "it does not independently prove the page body genuinely concerns that repository" claim is correct. The gate verifies a declared URL matches the slug; it never inspects whether the prose is actually about that repo.
- The upstream controls cited are real: `check-wiki-authority.ts` exists and gates wiki writes to `fro-bot`/`fro-bot[bot]` (schema.md:88-90), and the private-presence gate is wired into `merge-data.yaml`. The `data`-sole-writer invariant holds, so the "no path to plant a page with forged self-declared sources" reasoning is sound; the attribution check is genuinely defense-in-depth, not the load-bearing wall.
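To make the boundary in the bullets above concrete, here is a minimal TypeScript sketch of an exact-match attribution check. It is not the code in `scripts/check-wiki-private-presence.ts`: the names `WikiPageSketch`, `urlMatchesRepoSketch`, and `pageDeclaresRepoSketch` are hypothetical, and the choices to use case-sensitive comparison and to allow extra path segments after owner/repo are assumptions made for illustration.

```typescript
// Hypothetical sketch only; the real gate lives in
// scripts/check-wiki-private-presence.ts and may differ in detail.
interface WikiPageSketch {
  sources: { url: string }[]; // structured frontmatter sources
  body: string; // page prose; deliberately never consulted below
}

// True only for https://github.com/{owner}/{repo}[/...] with no custom port.
function urlMatchesRepoSketch(rawUrl: string, owner: string, repo: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // unparseable URLs never match
  }
  if (url.protocol !== "https:") return false; // strict protocol
  if (url.hostname !== "github.com") return false; // strict host
  if (url.port !== "") return false; // no non-default port
  // Only the first two path segments are compared (assumption: exact case).
  const segments = url.pathname.split("/").filter((s) => s.length > 0);
  if (segments.length < 2) return false;
  return segments[0] === owner && segments[1] === repo;
}

// Attribution trusts the page's declared sources; the body is ignored.
function pageDeclaresRepoSketch(page: WikiPageSketch, owner: string, repo: string): boolean {
  return page.sources.some((s) => urlMatchesRepoSketch(s.url, owner, repo));
}

// A page whose prose is unrelated still passes if its declared source matches.
const examplePage: WikiPageSketch = {
  sources: [{ url: "https://github.com/example-owner/example-repo" }],
  body: "Prose about something else entirely.",
};
const exampleResult = pageDeclaresRepoSketch(examplePage, "example-owner", "example-repo");
```

In this sketch a decoy such as `https://evil.example/github.com/example-owner/example-repo` is rejected on the host check, while `examplePage` is accepted even though its body never mentions the repository. That second case is the accepted residual: it is only safe because pages cannot reach the check without passing the upstream identity and sole-writer controls.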
The threat model is honest about its own shape: it names the gap (content-level provenance), explains why the gap is unreachable today (identity + sole-writer), and leaves a tripwire — "revisitable if the upstream controls ever weaken." That last clause is the part that matters. Residuals rot when the conditions that made them safe drift out from under them silently. This one carries its own expiry condition.
Additive, docs-only, no behavior change. Closes #3418. Approving.
Verdict: PASS
Blocking issues
None.
Non-blocking concerns
None. The note correctly scopes itself to wiki/repos/ slug attribution and does not overclaim coverage of non-repo wiki areas (consistent with schema.md:8, which already flags those as relying on the in-progress companion scan).
Missing tests
None required. Docs-only change to knowledge/schema.md, which is explicitly editable through normal PRs to main (schema.md:105). No code, no behavior, nothing to test. The gate logic this note describes is already covered by the script's own exported, testable seams (detectPrivateWikiLeaks, sourceUrlMatchesRepo, runCli).
Risk assessment: LOW
Single-file, +8/-0, prose-only. No workflow, dependency, or permissions surface touched. Worst case if the note were wrong is a misleading comment — but it isn't wrong; it tracks the implementation.
Run Summary
| Field | Value |
|---|---|
| Event | pull_request |
| Repository | fro-bot/.github |
| Run ID | 26996674533 |
| Cache | hit |
| Session | ses_169d01d9affeBBMfmzLeJenSaQ |
Records the wiki-presence gate's provenance boundary as an explicit, accepted residual in `knowledge/schema.md`.
The promotion gate's attribution check confirms a slug-matching `wiki/repos/` page declares its public repository in structured `sources[].url` with an exact owner/repo match. That stops decoy-URL mis-attribution, but it trusts the page's own declared sources; it does not independently prove the body concerns that repository.
The note documents why that is safe and why nothing more is built:
`data` branch is the sole writer, and the authority guard rejects any other origin. A page can only reach the attribution check by passing through those controls, so there is no path to plant a page with forged self-declared sources.
Docs-only; no code or behavior change.
Closes #3418