feat: oracle delta + grouping + honest "Automated review" header by avrabe · Pull Request #31 · pulseengine/temper

avrabe · 2026-04-27T20:36:34Z

Why

spar#175, a docs PR, got an oracle review with 91 findings. Every one was pre-existing, and the same message was repeated 50+ times. User feedback: "not sure if i like it, unclear what to do with it". A PR review should answer "did this PR make things worse?", not "what's the static state of the project?".

Three changes

1. Delta filter — `subtractFindings`

Run `rivet validate` at both HEAD and BASE, and surface only findings present at HEAD but not at BASE. The pre-existing backlog is filtered out. On a docs PR this takes 91 findings to 0. Cost: one extra tarball fetch, extract, and validate (~5–15 s).
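The subtraction step can be sketched as a set difference over finding identities. This is a minimal illustration, not the code from this PR: the `Finding` shape, the `artifactId` field, and the choice of identity key are all assumptions made for the example.

```typescript
// Hypothetical finding shape; the real fields in temper may differ.
interface Finding {
  source: string;
  severity: string;
  message: string;
  artifactId: string;
}

// Assumed identity: two findings are "the same" when all four fields match.
function findingKey(f: Finding): string {
  return JSON.stringify([f.source, f.severity, f.message, f.artifactId]);
}

// Keep only the findings that appear at HEAD but not at BASE.
function subtractFindings(head: Finding[], base: Finding[]): Finding[] {
  const baseKeys = new Set(base.map(findingKey));
  return head.filter((f) => !baseKeys.has(findingKey(f)));
}
```

If HEAD and BASE produce identical findings, as on a docs-only PR, the result is an empty list.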

2. Grouping — `groupOracleFindings`

Findings sharing the same `(source, severity, message)` collapse into one finding with an `artifact_ids[]` list. 50 "every requirement should be satisfied by ..." lines become 1 line listing all affected REQ ids.
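A rough sketch of the grouping follows, assuming each raw finding carries a single `artifact_id` (a hypothetical field name). The `renderAffected` helper and its exact separator are also assumptions; only the `maxIdsShown` default of 10 and the "… (+N more)" suffix come from the PR description.

```typescript
// Hypothetical input and output shapes for illustration only.
interface RawFinding {
  source: string;
  severity: string;
  message: string;
  artifact_id: string;
}

interface GroupedFinding {
  source: string;
  severity: string;
  message: string;
  artifact_ids: string[];
}

// Collapse findings that share (source, severity, message) into one entry.
function groupOracleFindings(findings: RawFinding[]): GroupedFinding[] {
  const groups = new Map<string, GroupedFinding>();
  for (const f of findings) {
    const key = JSON.stringify([f.source, f.severity, f.message]);
    const existing = groups.get(key);
    if (existing) {
      existing.artifact_ids.push(f.artifact_id);
    } else {
      groups.set(key, {
        source: f.source,
        severity: f.severity,
        message: f.message,
        artifact_ids: [f.artifact_id],
      });
    }
  }
  // Map preserves insertion order, so groups appear in first-seen order.
  return Array.from(groups.values());
}

// Show at most maxIdsShown ids, then summarise the remainder.
function renderAffected(ids: string[], maxIdsShown = 10): string {
  const shown = ids.slice(0, maxIdsShown).join(", ");
  const hidden = ids.length - maxIdsShown;
  return hidden > 0 ? `${shown}, … (+${hidden} more)` : shown;
}
```

Keying on a serialized tuple keeps the grouping to a single pass over the findings.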

3. Honest header rename

Old "AI Code Review" was misleading when the AI contributed only the summary line. Now:

- Header: `## Automated review for PR #N`
- HTML marker `` for supersede detection (independent of the visible header text)
- Footer breakdown: `Findings: X mechanical (rivet) · Y from local AI model.`
- `supersedePreviousReviews` matches both the new marker and the legacy `AI Code Review` string, so existing comments still get marked outdated.
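The marker-based detection could look roughly like the sketch below. The marker text is not visible in this page, so `HYPOTHETICAL_MARKER` is a placeholder invented for the example, and `isPreviousReview` and `renderReviewComment` are illustrative helper names rather than functions from the PR.

```typescript
// Placeholder only: the real marker string is not shown in the PR page.
const HYPOTHETICAL_MARKER = "<!-- example-review-marker -->";
// Legacy header text that older review comments still carry.
const LEGACY_HEADER = "AI Code Review";

// A comment counts as a previous review if it has the marker or the legacy header.
function isPreviousReview(commentBody: string): boolean {
  return (
    commentBody.includes(HYPOTHETICAL_MARKER) ||
    commentBody.includes(LEGACY_HEADER)
  );
}

// Assemble the marker, the new header, and the footer breakdown.
function renderReviewComment(
  prNumber: number,
  mechanical: number,
  fromModel: number
): string {
  return [
    HYPOTHETICAL_MARKER,
    `## Automated review for PR #${prNumber}`,
    "",
    `_Findings: ${mechanical} mechanical (rivet) · ${fromModel} from local AI model._`,
  ].join("\n");
}
```

Because detection relies on the hidden marker, the visible header can be reworded later without breaking detection of older comments.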

Test plan

- [x] All 788 tests pass (was 773; 15 added covering subtract / group / rename / footer / grouped render)
- [x] eslint clean
- [ ] After deploy, `/review-pr` on spar#175 should show: (a) a much shorter findings list (delta only), (b) grouped messages, (c) the new header with breakdown.
- [ ] On a docs-only PR the oracle should report 0 new findings, so the comment is either skipped or shows just the model summary.

Risk & rollout

- Risk: medium. Three behaviour changes ship together. Each is gated by existing config (`rivet_oracle.enabled`).
- Rollout: self-update on merge.

🤖 Generated with Claude Code

## Why

spar#175 (the docs PR) got an oracle review with **91 findings, every one pre-existing**, with the *same message* repeated 50+ times. The user's reaction was "not sure if i like it, unclear what to do with it", which is the right reaction. A PR review should answer "did this PR make things worse", not "list every diagnostic in the project".

Three independent changes, shipped together because they're all the same "make the review legible" theme:

## 1. Delta filter (`subtractFindings`)

Run `rivet validate` at HEAD AND at BASE. Surface only findings present in HEAD but not in BASE. Pre-existing diagnostics get filtered. On a docs PR this drops 91 findings to 0, which is the correct answer.

Cost: one extra tarball fetch + extract + validate (~5–15 s). Acceptable.

## 2. Grouping (`groupOracleFindings`)

Findings sharing the same `(source, severity, message)` collapse into one finding listing all affected `artifact_ids`:

- Before: 50 lines of "REQ-XXX: every requirement should be satisfied by …"
- After: 1 line: "Every requirement should be satisfied by … — affecting: REQ-INST-003, REQ-TRANSFORM-002, … (+45 more)"

`message` is now stored separately on findings (was inlined into `claim`). `maxIdsShown` defaults to 10, then "… (+N more)".

## 3. Honest header rename

"AI Code Review" was misleading on PRs where the AI contributed nothing beyond the summary line (most PRs to date). Now:

- Header: "## Automated review for PR #N"
- HTML marker `` for supersede detection (independent of visible header text)
- Footer breakdown: "_Findings: X mechanical (rivet) · Y from local AI model._"
- `supersedePreviousReviews` matches BOTH the new marker AND the legacy "AI Code Review" string so old comments still get marked outdated.

## Test plan

- [x] All 788 tests pass (was 773; added 15 covering subtract / group / header rename / breakdown footer / grouped finding render)
- [x] eslint clean
- [ ] After deploy: `/review-pr` on spar#175 (or any non-trivial PR) should now show: (a) a much shorter findings list (delta only), (b) grouped messages, (c) the new "Automated review" header with breakdown.
- [ ] On a docs-only PR the oracle should report 0 new findings → comment either skipped (verdict: approve, findings: []) or shows just the model summary.

## Risk & rollout

- Risk: medium. Three behaviour changes in one release. Each is opt-in via existing config (`rivet_oracle.enabled`), and the verdict-from-findings pipeline gracefully handles 0 findings (silence > slop).
- Rollout: self-update on merge.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

avrabe merged commit 310fc3d into main Apr 28, 2026
5 checks passed

avrabe deleted the feat/oracle-delta-grouping-rename branch April 28, 2026 16:50

avrabe mentioned this pull request May 1, 2026

docs: fix doc drift (deploy scripts, env vars, stale CHANGELOG) #42

Merged

avrabe merged 1 commit into main from feat/oracle-delta-grouping-rename
