feat: oracle delta + grouping + honest "Automated review" header#31

Merged: avrabe merged 1 commit into main from feat/oracle-delta-grouping-rename on Apr 28, 2026

avrabe commented Apr 27, 2026

Why

spar#175 got an oracle review with 91 findings, every one pre-existing, with the same message repeated 50+ times. User feedback: "not sure if i like it, unclear what to do with it". A PR review should answer "did this PR make things worse?", not "what's the static state of the project?".

Three changes

1. Delta filter — `subtractFindings`

Run `rivet validate` at both HEAD and BASE. Surface only findings present at HEAD but not at BASE, so the pre-existing backlog is filtered out. On a docs PR: 91 → 0. Cost: one extra tarball fetch + validate (~5–15 s).
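
The delta step can be sketched as a set difference. This is only an illustration, not the code merged in this PR: the finding shape (`source`, `severity`, `message`, `artifact_id`) and the identity key built by the hypothetical `findingKey` helper are assumptions, since the description does not say how two findings are compared.

```javascript
// Hypothetical finding shape: { source, severity, message, artifact_id }.
// Assumption: two findings are "the same" when all four fields match.
function findingKey(f) {
  return JSON.stringify([f.source, f.severity, f.message, f.artifact_id]);
}

// Keep only the findings present at HEAD that are absent at BASE.
function subtractFindings(headFindings, baseFindings) {
  const baseKeys = new Set(baseFindings.map(findingKey));
  return headFindings.filter((f) => !baseKeys.has(findingKey(f)));
}

const base = [
  { source: "rivet", severity: "warning", message: "unsatisfied requirement", artifact_id: "REQ-1" },
];
const head = [
  { source: "rivet", severity: "warning", message: "unsatisfied requirement", artifact_id: "REQ-1" },
  { source: "rivet", severity: "warning", message: "unsatisfied requirement", artifact_id: "REQ-2" },
];

const introduced = subtractFindings(head, base);
console.log(introduced.map((f) => f.artifact_id)); // [ 'REQ-2' ]
```

With identical HEAD and BASE inputs, as on a docs-only PR, the result is an empty list.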

2. Grouping — `groupOracleFindings`

Findings with the same `(source, severity, message)` collapse into one finding with an `artifact_ids[]` list. 50 "every requirement should be satisfied by ..." lines → 1 line listing all affected REQ ids.

3. Honest header rename

Old "AI Code Review" was misleading when the AI contributed only the summary line. Now:

  • Header: `## Automated review for PR #N`
  • HTML marker `<!-- temper-automated-review -->` for supersede detection (independent of header text)
  • Footer breakdown: `Findings: X mechanical (rivet) · Y from local AI model.`
  • `supersedePreviousReviews` still matches the legacy `AI Code Review` string so existing comments get marked outdated.
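
As a rough illustration of the new header and footer, the comment body might be assembled as below. The function name `renderReviewComment`, the placement of the marker on the first line, and the rule that `source === "rivet"` identifies a mechanical finding are all assumptions made for this sketch; only the header, marker, and footer strings come from the description.

```javascript
// Marker string quoted from the PR description; its position in the body is assumed.
const REVIEW_MARKER = "<!-- temper-automated-review -->";

// Hypothetical renderer: builds the comment body with header and footer breakdown.
function renderReviewComment(prNumber, findings) {
  // Assumption: findings produced by rivet carry source === "rivet";
  // everything else is counted as coming from the local AI model.
  const mechanical = findings.filter((f) => f.source === "rivet").length;
  const fromModel = findings.length - mechanical;
  return [
    REVIEW_MARKER,
    `## Automated review for PR #${prNumber}`,
    "",
    ...findings.map((f) => `- ${f.message}`),
    "",
    `_Findings: ${mechanical} mechanical (rivet) · ${fromModel} from local AI model._`,
  ].join("\n");
}

const body = renderReviewComment(31, [
  { source: "rivet", message: "every requirement should be satisfied" },
  { source: "model", message: "possible off-by-one in loop bound" },
]);
console.log(body);
```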

Test plan

  • 788 tests pass (was 773 — +15 covering subtract / group / rename / footer / grouped render)
  • eslint clean
  • After deploy + `/review-pr` on spar#175: (a) finding count drops drastically (delta only), (b) grouped messages, (c) new header with breakdown.
  • Docs-only PRs → oracle reports 0 new findings → comment skipped entirely.

Risk & rollout

  • Risk: medium. Three behaviour changes coupled. Each is gated by existing config.
  • Rollout: self-update on merge.

🤖 Generated with Claude Code

## Why
spar#175 (the docs PR) got an oracle review with **91 findings, every one
pre-existing**, with the *same message* repeated 50+ times. The user's
reaction was "not sure if i like it, unclear what to do with it" — the
right reaction. A PR review should answer "did this PR make things worse",
not "list every diagnostic in the project".

Three independent changes, shipped together because they're all the same
"make the review legible" theme:

## 1. Delta filter (`subtractFindings`)
Run `rivet validate` at HEAD AND at BASE. Surface only findings present in
HEAD but not in BASE. Pre-existing diagnostics get filtered. On a docs PR
this drops 91 findings to 0 — which is the correct answer.

Cost: one extra tarball fetch + extract + validate (~5–15 s). Acceptable.

## 2. Grouping (`groupOracleFindings`)
Findings sharing the same `(source, severity, message)` collapse into one
finding listing all affected `artifact_ids`:

  Before: 50 lines of "REQ-XXX: every requirement should be satisfied by …"
  After:  1 line: "Every requirement should be satisfied by … —
                  affecting: REQ-INST-003, REQ-TRANSFORM-002, … (+45 more)"

`message` is now stored separately on findings (was inlined into `claim`).
`maxIdsShown` defaults to 10, then "… (+N more)".
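
A minimal sketch of the grouping and the truncated id list follows. It assumes each raw finding carries a single `artifact_id`, and the helper `renderGroupedFinding` and its exact output format are hypothetical; only the grouping key, the `artifact_ids` field, and the `maxIdsShown` default of 10 are taken from the text above.

```javascript
// Collapse findings that share (source, severity, message) into one grouped finding.
function groupOracleFindings(findings) {
  const groups = new Map();
  for (const f of findings) {
    const key = JSON.stringify([f.source, f.severity, f.message]);
    if (!groups.has(key)) {
      groups.set(key, {
        source: f.source,
        severity: f.severity,
        message: f.message,
        artifact_ids: [],
      });
    }
    groups.get(key).artifact_ids.push(f.artifact_id);
  }
  // Map preserves insertion order, so groups appear in first-seen order.
  return [...groups.values()];
}

// Hypothetical renderer: show at most maxIdsShown ids, then "… (+N more)".
function renderGroupedFinding(group, maxIdsShown = 10) {
  const shown = group.artifact_ids.slice(0, maxIdsShown);
  const hidden = group.artifact_ids.length - shown.length;
  const suffix = hidden > 0 ? `, … (+${hidden} more)` : "";
  return `${group.message} (affecting: ${shown.join(", ")}${suffix})`;
}

// Twelve findings with the same message but different artifact ids.
const repeated = Array.from({ length: 12 }, (_, i) => ({
  source: "rivet",
  severity: "warning",
  message: "every requirement should be satisfied",
  artifact_id: `REQ-${i + 1}`,
}));

const grouped = groupOracleFindings(repeated);
console.log(grouped.length); // 1
console.log(renderGroupedFinding(grouped[0]));
```

Keying on the message text means two findings with different wording never merge, even when they describe the same rule, which is why the message needs to be stored separately from `claim`.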

## 3. Honest header rename
"AI Code Review" was misleading on PRs where the AI contributed nothing
beyond the summary line (most PRs to date). Now:

- Header: "## Automated review for PR #N"
- HTML marker `<!-- temper-automated-review -->` for supersede detection
  (independent of visible header text)
- Footer breakdown: "_Findings: X mechanical (rivet) · Y from local AI model._"
- supersedePreviousReviews matches BOTH the new marker AND the legacy
  "AI Code Review" string so old comments still get marked outdated.
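
The dual match could look roughly like the predicate below. The name `isPreviousReview` is hypothetical, and plain substring matching is an assumption; the PR only states that both the new marker and the legacy header string are recognised.

```javascript
const REVIEW_MARKER = "<!-- temper-automated-review -->";
const LEGACY_HEADER = "AI Code Review";

// Hypothetical predicate: true when a comment body looks like an earlier review,
// whether it was posted before or after the header rename.
function isPreviousReview(commentBody) {
  return commentBody.includes(REVIEW_MARKER) || commentBody.includes(LEGACY_HEADER);
}

console.log(isPreviousReview("## AI Code Review for PR #12\n...")); // true
console.log(isPreviousReview(`${REVIEW_MARKER}\n## Automated review for PR #12`)); // true
console.log(isPreviousReview("LGTM, thanks!")); // false
```

Matching on the hidden marker rather than the visible header means the header text can change again without breaking supersede detection.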

## Test plan
- [x] All 788 tests pass (was 773 — added 15 covering subtract / group /
      header rename / breakdown footer / grouped finding render)
- [x] eslint clean
- [ ] After deploy: `/review-pr` on spar#175 (or any non-trivial PR) should
      now show: (a) a much shorter findings list (delta only), (b) grouped
      messages, (c) the new "Automated review" header with breakdown.
- [ ] On a docs-only PR the oracle should report 0 new findings → comment
      either skipped (verdict: approve, findings: []) or shows just the
      model summary.

## Risk & rollout
- Risk: medium. Three behaviour changes in one release. Each is opt-in
  via existing config (rivet_oracle.enabled), and the verdict-from-findings
  pipeline gracefully handles 0 findings (silence > slop).
- Rollout: self-update on merge.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

avrabe merged commit 310fc3d into main on Apr 28, 2026
5 checks passed
avrabe deleted the feat/oracle-delta-grouping-rename branch on April 28, 2026 16:50