From 9afd5580d20ef4dca8d84b3668ffeacf91e3df88 Mon Sep 17 00:00:00 2001
From: Jarek Potiuk <jarek@potiuk.com>
Date: Wed, 27 May 2026 22:41:48 +0200
Subject: [PATCH 1/4] fix(pr-management-triage): match Copilot bots without
 requiring `[bot]` suffix
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The `copilot_review_stale` precondition (row 2 of the decision table)
listed `copilot-pull-request-reviewer[bot]`, `copilot[bot]`, etc. as
the matchable logins. GitHub's GraphQL `Actor.login` field returns
some of these without the `[bot]` suffix — most visibly for
`copilot-pull-request-reviewer` itself, which appears as
`copilot-pull-request-reviewer` (no suffix) on the PR's `author.login`
even though the underlying account is the same bot.

Strict `[bot]`-suffix matching excludes those threads from row 2.
Real-world impact seen during triage on `apache/airflow`: a PR with
two stale (55-day-old) Copilot review threads classified as
`deterministic_flag → comment` (only static-check failure surfaced)
instead of `stale_copilot_review → draft`. The maintainer had to
manually re-route the action.

Relax the matcher to "case-insensitive substring `copilot` in login;
`[bot]` suffix optional". The 7-day age threshold and the
no-author-reply qualifier are unchanged — only the login-string
match is widened to align with what GraphQL actually returns.
---
 .../pr-management-triage/classify-and-act.md  | 22 ++++++++++++++-----
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/.claude/skills/pr-management-triage/classify-and-act.md b/.claude/skills/pr-management-triage/classify-and-act.md
index 7d5df4c6..18854f8e 100644
--- a/.claude/skills/pr-management-triage/classify-and-act.md
+++ b/.claude/skills/pr-management-triage/classify-and-act.md
@@ -379,12 +379,22 @@ with the reply visible.
 ### `copilot_review_stale`
 
 The PR has at least one unresolved review thread whose first
-comment author matches a Copilot bot login (case-insensitive:
-`copilot-pull-request-reviewer[bot]`, `copilot[bot]`,
-`github-copilot[bot]`, or any login matching `copilot*[bot]` /
-`github-copilot*[bot]`) AND that comment's `createdAt` is ≥ 7
-days ago AND no author reply in the same thread or on the PR
-after that timestamp.
+comment author matches a Copilot bot login (case-insensitive
+substring match on the login). The skill matches any of:
+`copilot-pull-request-reviewer`, `copilot`, `github-copilot`,
+or any login containing the substring `copilot` followed
+optionally by `[bot]`. The `[bot]` suffix is matched when
+present but NOT required — GitHub's GraphQL `Actor.login`
+returns the bot username without the `[bot]` suffix for some
+Copilot integrations (e.g. `copilot-pull-request-reviewer`),
+even though the same bot appears as `copilot-pull-request-reviewer[bot]`
+on the REST API. Requiring `[bot]` excludes real Copilot
+threads from this rule and lets them age past the 7-day
+threshold without triggering. The threshold itself is unchanged.
+
+The thread must also satisfy: that comment's `createdAt` is
+≥ 7 days ago AND no author reply in the same thread or on the
+PR after that timestamp.
 
 ### `static_check`
 

From 91b5eadf90e0f2371b54a991ef16544f0992f667 Mon Sep 17 00:00:00 2001
From: Jarek Potiuk <jarek@potiuk.com>
Date: Wed, 27 May 2026 22:42:19 +0200
Subject: [PATCH 2/4] fix(pr-management-triage): route mixed static-check
 failures to `comment`, not `rerun`

Insert decision row 12b between rows 12 and 13. The original ordering
left a hole: row 12 fires only when *every* failed check is a static
check; if even one non-static failure was mixed in, row 12 missed and
row 13 (`failed_count <= 2 -> rerun`) caught the case and proposed a
rerun. Rerunning is a waste when at least one of the failures is a
static check that needs a code fix to pass.

Real-world impact seen during triage on `apache/airflow`: a PR with
`CI image checks / Static checks: FAILURE` plus
`provider distributions tests / Compat 3.0.6: FAILURE` (one static
+ one possibly-flaky compat test, total = 2 failures) classified as
`deterministic_flag -> rerun`. The maintainer overrode to `comment`
because the static-check failure will deterministically re-fail
without a code change.

Row 12b is intentionally placed *after* row 12 so the all-static
case still takes the more specific row (same action, different
reason template clarifies which case the maintainer is seeing).
---
 .claude/skills/pr-management-triage/classify-and-act.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.claude/skills/pr-management-triage/classify-and-act.md b/.claude/skills/pr-management-triage/classify-and-act.md
index 18854f8e..d587ce4c 100644
--- a/.claude/skills/pr-management-triage/classify-and-act.md
+++ b/.claude/skills/pr-management-triage/classify-and-act.md
@@ -85,6 +85,7 @@ Action verbs are defined in [`actions.md`](actions.md).
 | 10 | [`ci_failures_only`](#ci_failures_only) AND every failure ∈ `recent_main_failures`             | `deterministic_flag`       | `rerun`                | All N CI failures also appear in recent main-branch PRs — likely systemic, suggest rerun |
 | 11 | [`ci_failures_only`](#ci_failures_only) AND any failure ∈ `recent_main_failures`               | `deterministic_flag`       | `rerun`                | K/N CI failures match recent main-branch PRs — likely systemic |
 | 12 | [`ci_failures_only`](#ci_failures_only) AND every failed check is a [static check](#static_check) | `deterministic_flag`     | `comment`              | Only static-check failures — needs a code fix, not a rerun |
+| 12b | [`ci_failures_only`](#ci_failures_only) AND **any** failed check is a [static check](#static_check) (and not all — row 12 captured that case) | `deterministic_flag`     | `comment`              | Mixed failures including a static-check failure — code fix needed; a rerun would re-fail on the static check |
 | 13 | [`ci_failures_only`](#ci_failures_only) AND `failed_count <= 2` AND `commits_behind <= 50`     | `deterministic_flag`       | `rerun`                | N CI failure(s) on otherwise clean PR — likely flaky, suggest rerun |
 | 14a | [`author_confirmation_received`](#author_confirmation_received)                                | `author_confirmed_ready`   | `mark-ready`           | Author confirmed PR is ready for maintainer review — apply label |
 | 14b | [`pending_author_confirmation`](#pending_author_confirmation)                                  | `awaiting_author_confirmation` | `skip`             | Awaiting author confirmation requested M days ago |

From 878ab1f09db02011597aef2c9682c07e2a759de9 Mon Sep 17 00:00:00 2001
From: Jarek Potiuk <jarek@potiuk.com>
Date: Wed, 27 May 2026 22:42:51 +0200
Subject: [PATCH 3/4] fix(pr-management-triage): widen F5b to scan the last 5
 collaborator comments
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

F5b previously inspected only the *most recent* collaborator comment
for an unanswered `@`-mention to another maintainer. That misses the
common case where:

  1. Maintainer A pings maintainer B for an opinion ("@B what do
     you think of this approach?") in comment #N
  2. Maintainer C (often the viewer) posts a quality-criteria
     comment in comment #N+1 because of an orthogonal issue (CI
     red, merge conflicts, etc.)

The A->B ping is still outstanding — B has not replied — but F5b
only looks at comment #N+1, doesn't see the older mention, and the
PR slides into the decision table. The maintainer ends up auto-
proposing actions on a PR where two maintainers are mid-conversation.

Real-world impact seen during triage on `apache/airflow`: a PR
where one maintainer asked another for an opinion 2 days ago, then
the viewer drafted the PR yesterday for quality issues. The
classifier proposed `request-author-confirmation` today — talking
over the older unanswered ping.

Widen F5b to walk the last 5 collaborator comments, newest-first,
and fire on the first one matching the unanswered-`@`-mention
condition. The 5-comment window matches the bounded depth the
batch query already fetches; no additional GraphQL cost.
---
 .claude/skills/pr-management-triage/classify-and-act.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.claude/skills/pr-management-triage/classify-and-act.md b/.claude/skills/pr-management-triage/classify-and-act.md
index d587ce4c..1f0d5825 100644
--- a/.claude/skills/pr-management-triage/classify-and-act.md
+++ b/.claude/skills/pr-management-triage/classify-and-act.md
@@ -46,7 +46,7 @@ filter is skipped silently from the main triage flow.
 | F3 | Draft and not stale | `isDraft == true` and any activity within the last 14 days. Stale-sweep classifications in [`stale-sweeps.md`](stale-sweeps.md) may still pull the PR back in. |
 | F4 | Already marked ready, no regression | `labels` contains `ready for maintainer review` AND CI green AND `mergeable != CONFLICTING` AND no unresolved **collaborator** threads (same collaborator-author qualifier as rows 19/20 / [`unresolved_threads_only`](#unresolved_threads_only) — contributor-author threads alone don't count as a regression for an already-ready PR). **Regression bypasses this filter** — any of: CI red, new conflict, or a new unresolved collaborator thread whose triggering event (failing check `startedAt`, conflict detection, thread `createdAt`) is *after* the label-add timestamp. The typical case is a contributor pushing a rebase or fixup commit to a ready-for-review PR that re-introduces deterministic failures, OR a maintainer leaving a new review thread post-label-add. PRs bypassing F4 fall through to the decision table normally; the cross-cutting [`strip-ready-on-downgrade` hard rule](#hard-rules-cross-cutting-the-table) ensures the label comes off if a `deterministic_flag` row fires. |
 | F5a | Recent collaborator comment (author cooldown) | Most recent comment from the **union of** general-issue comments (`comments(last:10)`) and **review-thread comments** (`reviewThreads.nodes.comments`) is by a `COLLABORATOR`/`MEMBER`/`OWNER`, `createdAt < 72h` ago, AND posted after `commits(last:1).committedDate`. The review-thread leg is essential — a maintainer asking a clarifying question in-thread is just as much an active conversation as a top-level comment, and treating only the latter routes the PR to `ping` / `request-author-confirmation` while the maintainer is still mid-sentence. |
-| F5b | Maintainer-to-maintainer ping unanswered | Most recent collaborator comment from the **union** described in F5a above `@`-mentions one or more logins other than the PR author AND none of those mentioned logins have posted on the PR (general comments **or** review threads) or in `latestReviews` after that comment. Team mentions (e.g. `@<upstream>-committers`) are conservatively treated as F5b matches. |
+| F5b | Maintainer-to-maintainer ping unanswered | **Walk the most recent 5 collaborator comments** from the **union** described in F5a above (newest-first). For each, if it `@`-mentions one or more logins other than the PR author AND none of those mentioned logins have posted on the PR (general comments **or** review threads) or in `latestReviews` after that comment, F5b fires. The 5-comment window catches the case where maintainer A pings maintainer B, then a different maintainer C (often the viewer) posts a quality-criteria comment on the PR — the more-recent quality comment is not itself a ping, but the older A→B ping is still unanswered and the PR is still mid-conversation between maintainers. Without the deep scan F5b only inspects the most recent collaborator comment and the older ping slides past unfiltered. The 5-comment cap is the same one driving `comments(last:10)` / `reviewThreads.comments(first:5)` — bounded scan, no extra GraphQL. Team mentions (e.g. `@<upstream>-committers`) are conservatively treated as F5b matches. |
 | F6 | Maintainer co-drafted | `isDraft == true` AND any of: (a) `latestReviews` has a node with `authorAssociation ∈ {OWNER, MEMBER, COLLABORATOR}` AND `author.login ≠ <viewer>` AND `state ∈ {COMMENTED, CHANGES_REQUESTED, APPROVED}` AND `submittedAt > commits(last:1).committedDate` AND review body is non-empty (avoids the "review with only inline thread comments and an empty top-level body" false positive — those are already counted by row 14/15 unresolved-thread logic); (b) `comments(last:10)` has a node with `authorAssociation ∈ {OWNER, MEMBER, COLLABORATOR}` AND `author.login ≠ <viewer>` AND `length(bodyText) ≥ 80` AND `createdAt > commits(last:1).committedDate`. Trivial signals (emoji-only, `+1`, `lgtm`, pure `@team` pings without prose) do not count — those are already covered by F5a/F5b or are below the substantive-engagement threshold. **Stale-sweep classifications in [`stale-sweeps.md`](stale-sweeps.md) may still pull the PR back in** — F6 only suppresses duplicate-proposal rows from the decision table, not eventual-resurfacing on a different action. |
 
 F5a, F5b, and F6 override every signal in the decision table —

From b284aff73d3b7ab052656978d424a33c97c6f10f Mon Sep 17 00:00:00 2001
From: Jarek Potiuk <jarek@potiuk.com>
Date: Wed, 27 May 2026 22:44:00 +0200
Subject: [PATCH 4/4] feat(pr-management-triage): add row 0 to skip
 stale-abandoned first-timer PRs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Decision row 1 fires `approve-workflow` for any first-time-contributor
PR with no real CI run, regardless of how long ago the maintainer
already gave the contributor feedback or how long the contributor has
been silent. The result: a PR that was triaged 8+ weeks ago, with no
new commits since, keeps surfacing every triage sweep and the
maintainer either has to re-approve CI on stalled code (which then
re-fails on the same unaddressed quality issues) or override to `skip`
manually.

Add row 0 + the `first_time_stale_abandoned` precondition. The row
fires *before* row 1 and routes to `skip` when all of these hold:

  - author is FIRST_TIME_CONTRIBUTOR / FIRST_TIMER,
  - a prior viewer triage marker exists,
  - no commits have been pushed since the marker, and
  - the PR's head commit is >= 30 days old.

The PR is then left to the stale-sweep flow (sweep 1b — "untriaged
draft >= 14 days" — or its successor), which is the appropriate
closure pathway.

The 30-day threshold is deliberately longer than the grace windows
(24h / 96h) and the F5a cooldown (72h). It captures *abandonment*,
not slow response — a contributor who replies within a week and
then stalls for another week is not abandoned, just busy.

Real-world impact seen during triage on `apache/airflow`: a
first-time-contributor PR triaged on 2026-04-02 with no push since
classified as `approve-workflow` 8 weeks later. The maintainer
skipped it manually; row 0 retires it automatically.
---
 .../pr-management-triage/classify-and-act.md  | 35 +++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/.claude/skills/pr-management-triage/classify-and-act.md b/.claude/skills/pr-management-triage/classify-and-act.md
index 1f0d5825..2c76e631 100644
--- a/.claude/skills/pr-management-triage/classify-and-act.md
+++ b/.claude/skills/pr-management-triage/classify-and-act.md
@@ -72,6 +72,7 @@ Action verbs are defined in [`actions.md`](actions.md).
 
 | #  | Precondition (all must hold)                                                                  | Classification             | Action                  | Reason template |
 |----|-----------------------------------------------------------------------------------------------|---------------------------|------------------------|-----------------|
+| 0  | [`first_time_stale_abandoned`](#first_time_stale_abandoned) — first-time contributor PR with a prior viewer triage marker and no commits since the marker, ≥ 30 days old | `first_time_stale_abandoned` | `skip` | First-time contributor's PR was triaged ≥ 30d ago, no push since — let the stale-sweep retire it rather than re-approving CI |
 | 1  | `head_sha` appears in the per-page `action_required` REST index, OR ([`first_time_no_real_ci`](#first_time_no_real_ci))     | `pending_workflow_approval` | `approve-workflow`     | First-time contributor — review the diff and approve CI, or flag suspicious |
 | 2  | [`copilot_review_stale`](#copilot_review_stale)                                               | `stale_copilot_review`     | `draft` (include the specific Copilot thread URL in the violation body) | Unaddressed Copilot review ≥ 7 days old — convert to draft |
 | 3  | Viewer comment containing the triage marker exists, posted after last commit, age < 7 days, sub-state `waiting` | `already_triaged`         | `skip`                 | Already triaged M days ago — still waiting on author |
@@ -430,6 +431,40 @@ Count of PRs by the same author seen on the **current page** that
 already matched a `deterministic_flag` row. Per-page only — does
 not persist across sessions.
 
+### `first_time_stale_abandoned`
+
+All of:
+
+- `authorAssociation == FIRST_TIME_CONTRIBUTOR` or
+  `FIRST_TIMER`.
+- A viewer triage marker exists in `comments(last:10)` (i.e.
+  a comment by the viewer containing the literal string
+  `Pull Request quality criteria`).
+- The PR's `commits(last:1).committedDate` is **at or before**
+  the triage marker's `createdAt` — author has not pushed
+  since the maintainer's feedback.
+- `<now> - committedDate >= 30 days`.
+
+Order matters: this precondition is evaluated by **row 0**,
+which fires *before* row 1 (`pending_workflow_approval`).
+Without it, an abandoned first-time PR that has sat for months
+since being triaged keeps surfacing in the
+`approve-workflow` group every sweep, where the maintainer
+either re-approves CI on dead code or has to skip it manually.
+
+A row-0 hit routes to `skip` — the [stale-sweep
+flow](stale-sweeps.md) is the right place to retire the PR
+(via sweep 1b's "untriaged draft >= 14 days" trigger, since
+the row-0 marker test does *not* set the draft state). The
+classifier's job here is just to keep the PR out of the
+workflow-approval group.
+
+The 30-day threshold is intentionally longer than the
+[grace periods](#grace-periods) (24h / 96h) and the F5a
+cooldown (72h). It captures *abandonment*, not slow response —
+a contributor who replies within a week and then stalls for
+another week is not abandoned, just busy.
+
 ### `first_time_no_real_ci`
 
 Author is `FIRST_TIME_CONTRIBUTOR` or `FIRST_TIMER` AND one of: