feat(review-pr): auto-filter low-risk files before chunking by derekmisler · Pull Request #215 · docker/cagent-action

derekmisler · 2026-05-21T19:38:57Z

Summary

Fixes the root cause of large-PR review timeouts by making the risk-scoring pipeline automatically filter out low-risk files before diff chunking, without requiring per-repo exclude-paths configuration.

Root causes addressed

1. Risk scoring happened after chunking: scores were computed but never used to filter the diff. Every file (including test files, generated code, etc.) got chunked and processed by the full multi-agent pipeline.
2. TEST_FILE_RE in score-risk.ts missed Rust/Ruby test files: Rust test/bench/spec files and Ruby spec files scored non-zero and were included in chunks unnecessarily.

Changes

Change 1 — Extend `TEST_FILE_RE` in `src/score-risk/score-risk.ts`

Extended the regex to also match:

- Rust suffix patterns: _test.rs, _bench.rs, _spec.rs
- Ruby spec files: _spec.rb
- Any file under a tests/, test/, benches/, __tests__/, or specs/ directory component

Added 9 new Vitest tests covering these new patterns.
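For illustration only, the added patterns could be expressed roughly as below. This is a sketch, not the actual `TEST_FILE_RE` from `score-risk.ts`: the names `NEW_TEST_PATTERNS_RE` and `matchesNewTestPattern` are hypothetical, and the sketch covers only the patterns listed above, not whatever the existing regex already matched.

```typescript
// Hypothetical regex covering only the patterns added in this PR.
// The real TEST_FILE_RE in score-risk.ts may be structured differently.
const NEW_TEST_PATTERNS_RE =
  /(?:^|\/)(?:tests?|benches|__tests__|specs)\/|_(?:test|bench|spec)\.rs$|_spec\.rb$/;

// Hypothetical helper: true when the path matches one of the new patterns.
function matchesNewTestPattern(path: string): boolean {
  return NEW_TEST_PATTERNS_RE.test(path);
}

console.log(matchesNewTestPattern("crates/core/src/parser_test.rs")); // true (Rust suffix)
console.log(matchesNewTestPattern("lib/user_spec.rb")); // true (Ruby spec)
console.log(matchesNewTestPattern("benches/throughput.rs")); // true (directory component)
console.log(matchesNewTestPattern("src/latest/main.rs")); // false ("latest" is not "test")
```

Anchoring the directory alternatives on `(?:^|\/)` keeps names such as `latest/` or `contest/` from matching as if they were `test/`.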

Change 2 — New module `src/auto-filter-diff/`

Pure TypeScript module with two-phase filtering:

- Phase 1 (auto-exclude): Removes all score-0 files. Unknown files (not in riskScores) are kept because they still need review.
- Phase 2 (progressive cap): If the remaining diff exceeds maxDiffLines (default 3000), removes the lowest-risk files first until it fits. Always keeps at least 1 file.

Includes 18 Vitest unit tests covering all phases and edge cases.
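The two phases can be sketched as follows. This is a minimal illustration under stated assumptions, not the module's real code: `FileDiff`, `autoFilterSketch`, and the per-file `lines` count are hypothetical, a cap of 0 is assumed to skip Phase 2, and unscored files are never removed in Phase 2 (the invariant the review below asks for).

```typescript
// Hypothetical shape: one entry per file in the diff.
interface FileDiff {
  path: string;
  lines: number; // diff lines contributed by this file (assumed available)
}

// Sketch of the two-phase filter; the real autoFilterDiff signature may differ.
function autoFilterSketch(
  files: FileDiff[],
  riskScores: Map<string, number>,
  maxDiffLines = 3000,
): FileDiff[] {
  // Phase 1: drop files explicitly scored 0. Unscored files are kept.
  const kept = files.filter((f) => riskScores.get(f.path) !== 0);

  // Assumption: a cap of 0 (or less) disables Phase 2.
  if (maxDiffLines <= 0) return kept;

  let remaining = kept.reduce((sum, f) => sum + f.lines, 0);

  // Phase 2: only scored files are removal candidates, lowest risk first.
  const removable = kept
    .filter((f) => riskScores.has(f.path))
    .sort((a, b) => riskScores.get(a.path)! - riskScores.get(b.path)!);

  const dropped = new Set<string>();
  for (const f of removable) {
    if (remaining <= maxDiffLines) break; // diff now fits
    if (kept.length - dropped.size <= 1) break; // always keep at least 1 file
    dropped.add(f.path);
    remaining -= f.lines;
  }
  return kept.filter((f) => !dropped.has(f.path));
}

const demoFiles: FileDiff[] = [
  { path: "src/a.ts", lines: 2000 },
  { path: "src/a_test.rs", lines: 500 },
  { path: "docs/x.md", lines: 1500 },
  { path: "src/new.ts", lines: 400 }, // unscored
];
const demoScores = new Map<string, number>([
  ["src/a.ts", 5],
  ["src/a_test.rs", 0],
  ["docs/x.md", 1],
]);
const demoResult = autoFilterSketch(demoFiles, demoScores).map((f) => f.path);
console.log(demoResult); // ["src/a.ts", "src/new.ts"]
```

In the demo, Phase 1 drops the score-0 test file, leaving 3900 lines. Phase 2 then removes `docs/x.md` (the lowest score) to reach 2400 lines, and the unscored `src/new.ts` is never considered for removal.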

Change 3 — Wire up in `review-pr/action.yml`

- Added max-diff-lines input (default "3000", set "0" to disable)
- Moved "Score file risk" step to run before "Split diff into chunks"
- Added new "Auto-filter low-risk files" step between scoring and chunking

New step order: Filter excluded paths → Score file risk → Auto-filter → Split into chunks → Generate file history

Change 4 — Propagate through `.github/workflows/review-pr.yml`

Added max-diff-lines as a workflow-level number input (default 3000) and passed it through to the review-pr action call.
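Because action inputs arrive as strings, the step that consumes max-diff-lines has to turn the value into a number and honor the "0 disables" convention. A hedged sketch of one way to do that is below; `parseMaxDiffLines` and `DEFAULT_MAX_DIFF_LINES` are hypothetical names, and falling back to the default on empty or invalid input is an assumption, not behavior confirmed by this PR.

```typescript
const DEFAULT_MAX_DIFF_LINES = 3000;

// Hypothetical helper: returns the cap as a number, or null when the cap is disabled.
function parseMaxDiffLines(raw: string | undefined): number | null {
  const trimmed = (raw ?? "").trim();
  if (trimmed === "") return DEFAULT_MAX_DIFF_LINES; // input not provided

  const n = Number(trimmed);
  // Assumption: non-integer or negative values fall back to the default.
  if (!Number.isInteger(n) || n < 0) return DEFAULT_MAX_DIFF_LINES;

  return n === 0 ? null : n; // "0" means the progressive cap is disabled
}

console.log(parseMaxDiffLines("3000")); // 3000
console.log(parseMaxDiffLines("0")); // null
console.log(parseMaxDiffLines(undefined)); // 3000
```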

Change 5 — `tsup.config.ts`

Added 'src/auto-filter-diff/index.ts' to the entry map so tsup bundles it as dist/auto-filter-diff.js.

Test results

Test Files  20 passed (20)
     Tests  358 passed (358)

Biome formatting and TypeScript type-check pass cleanly. The pre-existing actionlint warning about the node24 runner in test-e2e.yml is unrelated to this PR (present on main before these changes).

docker-agent

Assessment: 🟡 NEEDS ATTENTION

This PR correctly implements a two-phase auto-filter mechanism and moves risk scoring before diff chunking. The new TEST_FILE_RE additions for Rust/Ruby patterns look correct, and the tsup.config.ts entry is properly wired. One medium-severity logic issue was found in Phase 2 of autoFilterDiff.

derekmisler

Assessment: 🟡 NEEDS ATTENTION

No new findings from this review pass the quality gate. One confirmed bug is already flagged in an existing inline comment (line 164 of src/auto-filter-diff/auto-filter-diff.ts). The Phase 2 progressive cap uses slice(0, -1) to protect the last file in the ascending-risk sort, but when multiple unscored files exist they all sort to Infinity and only the very last one is protected. The others remain removable, violating the "unknown = needs review" invariant.
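The flagged behavior can be reproduced in isolation. The snippet below is a reconstruction from the description in this review, not the code at line 164: `Scored`, `removableBuggy`, and `removableCandidates` are hypothetical names, and the fix shown is only one possible approach.

```typescript
// Hypothetical shape: unscored files carry score = Infinity.
type Scored = { path: string; score: number };

// Already sorted by ascending risk, as Phase 2 is described to do.
const sortedAsc: Scored[] = [
  { path: "README.md", score: 1 },
  { path: "src/new_a.ts", score: Infinity },
  { path: "src/new_b.ts", score: Infinity },
];

// Pattern described in the review: everything except the last entry is removable.
const removableBuggy = sortedAsc.slice(0, -1).map((f) => f.path);
console.log(removableBuggy); // ["README.md", "src/new_a.ts"] (an unscored file is removable)

// One possible fix: never treat unscored files as removal candidates.
function removableCandidates(files: Scored[]): Scored[] {
  const scored = files.filter((f) => Number.isFinite(f.score));
  // If every file is scored, still protect the highest-risk one
  // so that at least one file always survives.
  return scored.length === files.length ? scored.slice(0, -1) : scored;
}

console.log(removableCandidates(sortedAsc).map((f) => f.path)); // ["README.md"]
```

Filtering on `Number.isFinite` also sidesteps comparing `Infinity` with `Infinity` in a subtraction-based comparator, which would yield `NaN`.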

All other items examined were verified clean: missing set -euo pipefail in the auto-filter step (single-command step, no functional impact), remainingLines preamble accounting (no production trigger path), TEST_FILE_RE regex (works correctly), step ordering (correct: filter → score → auto-filter → chunk), lock-check guard asymmetry (harmless: pr.diff is never created when skip=true), and extractFilePath edge cases (correct for normal paths).

Signed-off-by: Derek Misler <derek.misler@docker.com>

docker-agent reviewed May 21, 2026

Comment thread src/auto-filter-diff/auto-filter-diff.ts Outdated

derekmisler commented May 21, 2026

derekmisler force-pushed the feat/auto-filter-large-diffs branch 3 times, most recently from da46da7 to 77bdd24, on May 22, 2026 15:00

feat(review-pr): auto-filter low-risk files before chunking

0b9207e

Signed-off-by: Derek Misler <derek.misler@docker.com>

derekmisler force-pushed the feat/auto-filter-large-diffs branch from 77bdd24 to 0b9207e on May 22, 2026 16:02

derekmisler wants to merge 1 commit into main from feat/auto-filter-large-diffs
