Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
167 changes: 125 additions & 42 deletions .github/skills/pr-finisher/SKILL.md
Original file line number Diff line number Diff line change
@@ -1,38 +1,91 @@
---
name: pr-finisher
description: Prepare an open pull request for merge by fixing local validation and failing checks.
description: Prepare an open pull request for merge from a GitHub Copilot cloud agent. Drives Reviews, local validation, and Mergeable to a ready state. Does not merge, and cannot trigger CI.
---

# PR Finisher

Use this skill when asked to get a pull request to a merge-ready state.
Drive an open PR for the current branch to a merge-ready state from a **GitHub Copilot cloud agent**. **Do not merge.** When everything you can act on is done, report ready-for-human-merge and stop.

## Goal
## Execution context

Take the current pull request from "almost done" to "ready to merge" by running the standard local checks, fixing failures, checking GitHub status, and pushing the resulting changes.
This skill runs inside a GitHub Copilot cloud agent, not on a developer's machine.

## Required Workflow
- **The agent's pushes do not trigger CI.** Workflows on the PR will not re-run after the agent commits. Any existing `statusCheckRollup` reflects an earlier HEAD and goes stale the moment the agent pushes.
- **Local `make` targets are the agent's authoritative correctness signal** before push. CI is observational only.
- **Re-running CI is a hand-off to a human** (close/reopen the PR, `workflow_dispatch`, or a push from a maintainer). The agent must surface this in its summary.
- **No watch / no sleep loops.** The agent has no async wait state; one pass + summary + stop.

1. Confirm the pull request context for the current branch.
2. Run `make fmt`.
3. Run the `copilot-review` skill to collect and address all in-scope pull request review comments.
4. Run `make lint` and fix any lint failures.
5. Run `make test-unit` and fix any failing unit tests.
6. Check the current GitHub checks for the pull request and identify every failing check.
7. If `make test` is failing locally or the failing checks show the equivalent failure in CI, fix those failures too.
8. If wasm golden tests fail, or a test fix changes the expected wasm compiler output, run `make update-wasm-golden`.
9. Re-run the affected local validation until it passes.
10. Push the final changes.
## Three merge-ready conditions

## Check Inspection
A PR is merge-ready when **all three** are satisfied. Work them **concurrently**.

- Use GitHub tools to inspect the current pull request status and check runs.
- When the user mentions CI, build, test, or workflow failures, fetch the failing job logs before deciding on a fix.
- Distinguish between failures already covered by local commands (`fmt`, `lint`, `test-unit`, `test`) and unrelated external failures.
| Condition | Definition | Agent's signal |
|---|---|---|
| **Reviews** | Every unresolved review thread is addressed on its merits, replied to, and resolved. Code changes alone do not satisfy this. | `copilot-review` skill + GraphQL `reviewThreads` |
| **Checks** | Local `make fmt` / `make lint` / `make test-unit` / `make test` pass. Last-known CI runs reviewed at log level. | `make` targets locally; `gh pr checks` / `gh run view --log-failed` for prior runs |
| **Mergeable** | PR is OPEN, not draft, `mergeable: MERGEABLE`, not `BEHIND` if the repo requires up-to-date branches. | `gh pr view --json mergeable,mergeStateStatus,state,isDraft` |

## Command Order
Because the agent cannot re-trigger CI, "Checks" is satisfied at the agent's level when **local validation passes and prior CI failures have been root-caused and fixed in the pushed commits**. Final green CI requires a human to re-trigger after the agent stops.

Run commands in this order unless a failure requires an earlier retry:
Top-level PR comments and review bodies are useful feedback but **not** a merge gate. Read and action useful ones; do not block on them.

## Hard rules

- **Do not merge.** Never run `gh pr merge`, enable auto-merge, or enqueue. This skill stops at "ready for merge."
- **Do not post stand-alone PR comments.** Only reply on existing review threads / comments that need a response. Do not ping reviewers or CODEOWNERS.
- **Always disable pagers** for `gh`: prefix with `GH_PAGER=""` or pipe through `cat`. Without this, commands hang in non-interactive shells.
- **Never wait for CI to re-run.** No `bash sleep`, no `gh run watch`, no `gh pr checks --watch`, no re-check loop after push. The agent's pushes will not trigger workflows; waiting is futile.
- **Local validation is non-negotiable before push.** Because CI will not re-run, the only correctness gate the agent gets is `make ...` locally. Treat a green local run as the bar.
- **Reviews are not done until reply + resolve both succeed.** Code change alone ≠ thread handled.
- **Smallest fix that works.** Don't change unrelated code. Fix lint before tests.
- **Pre-existing unrelated failures** → identify explicitly in the summary; do not guess-fix.

## CI-fix anti-patterns (do not do these)

A failing CI step is a signal, not a nuisance. Even though the agent cannot re-run CI to confirm, the following are **forbidden** and should trigger `ask_user` instead:

- Disabling, skipping, or neutering shared tooling (build caches, lint rules, type checks, env vars, required checks) to make a failure go away.
- "Temporary" disables with a TODO to re-enable later. They outlive the PR and become permanent.
- Lowering coverage thresholds, removing assertions, or loosening a test until it passes. If the test is wrong about product behavior, fix its **logic** (assertions, fixtures, setup); don't relax it.
- Bundling a workaround with a real fix ("belt and suspenders"). Ship one real fix or escalate. Never both.
- Special-casing one OS/runner to hide a failure on that platform.

**Anti-pattern test:** if the change would make the failure invisible on future PRs without solving it, stop and escalate.

**Before declaring a tool broken on a platform:** reproduce locally, check version/config, look for transient causes (timeouts, network, runner state). Most "X is broken on macOS/Windows" reports are transient flakes on healthy tooling.

**For flaky infra** (caches, registries, runners): prefer narrow fixes — targeted retry, higher timeout, pre-flight health check. If a narrow fix doesn't land in one or two attempts, escalate via `ask_user`.

## Workflow

The agent runs this once. There is no monitoring loop.

### 1. Triage

```bash
GH_PAGER="" gh pr view <number> --json state,isDraft,reviewDecision,mergeable,mergeStateStatus,statusCheckRollup,headRefOid
GH_PAGER="" gh pr checks <number>
```

If merged/closed, report and stop. Otherwise classify each condition as ✅ / ❌ / ⏳ / ❓. The CI snapshot here is your **only** view of CI for this run — capture which checks failed and why before changing anything, because after you push it will be stale.

### 2. Address Reviews

Delegate to the `copilot-review` skill. For each unresolved thread: make change → commit → reply → resolve. A thread is not handled until reply + resolve both succeed. Push only after all in-scope threads are handled (batch with other fixes below).

### 3. Address Mergeable

```bash
GH_PAGER="" gh pr view <number> --json mergeable,mergeStateStatus
```

- `CONFLICTING` → resolve conflicts using the repo's conventions. If you cannot determine the correct resolution, `ask_user`.
- `mergeStateStatus: BEHIND` → update branch from base. After updating, scan the new commits for tooling drift (lockfiles, toolchains, lint configs); re-run installs if manifests changed, and flag drift in the summary so any new errors read as drift, not regressions.

### 4. Address Checks (local + prior CI)

**Local validation** — the agent's only correctness signal. Run in order; fix at each step before moving on:

```bash
make fmt
Expand All @@ -41,34 +94,64 @@ make test-unit
make test
```

Only run `make test` after `make test-unit` and lint are clean, or when failing PR checks indicate the broader test suite is still broken.

## Fix Scope
If a `make test` fix changes wasm compiler output, or wasm golden tests fail:

- Make the smallest changes needed to get the pull request green.
- Fix lint issues before test issues.
- Do not change unrelated code just because it is nearby.
- If a failure is pre-existing and unrelated to the pull request, report that explicitly in your final user or PR update instead of guessing.
```bash
make update-wasm-golden
```

## Wasm Golden Files
Then re-run the affected tests.

When a `make test` fix changes wasm compiler output or wasm golden tests fail, regenerate the wasm goldens with:
**Prior CI failures** — for each failure captured during triage, pull logs and fix the root cause:

```bash
make update-wasm-golden
GH_PAGER="" gh run view <run_id> --log-failed
```

Classify as: real product/test bug, infra flake, or third-party flake. Apply the fix in the agent's commits and, where possible, **reproduce the fix locally** via the matching `make` target. If the failure can't be reproduced locally (infra-only), state that in the summary so the human re-triggers CI with eyes open. Per anti-pattern rules: 1–2 narrow attempts, then `ask_user`.

### 5. Push and stop

Push once, after all fixes are in. **Do not re-check `gh pr checks` expecting a new run.** Print the summary and stop.

## Summary format

At the stopping point, print:

```
- ✅ Reviews — <plain language>
- ✅ Checks (local) — <plain language>
- <status> Checks (CI) — stale after agent push; needs human re-trigger. Prior failures: <fixed | open | not reproducible locally>
- ✅ Mergeable — <plain language>

Actions taken: <what changed in this run>
Hand-off: CI must be re-triggered by a maintainer (close/reopen PR, workflow_dispatch, or push) before merge.
Still needed: <human review, anything not actionable from the agent>
```

Then re-run the relevant tests.
Status vocabulary:
- ✅ satisfied — checked and passing
- ❌ failing — checked and failing
- ⏳ pending — running, waiting for signal (rare for the agent; never use for the post-push CI state)
- ❓ unknown — could not be checked (API error, indeterminate, or CI stale after agent push). Never use ❌ for this.

**Translate status into plain language.** Don't write bare labels. Always state explicitly that CI on the agent's HEAD is unverified until a human re-triggers it.

## Stopping conditions

- **Ready for merge (pending human CI re-trigger)** — local validation green, Reviews resolved, Mergeable clean. Summarize and stop.
- **Nothing actionable remains** — non-actionable blocker (human approval, external service). Summarize and stop.
- **Truly stuck** — unresolvable conflicts, ambiguous feedback, irreproducible failures. `ask_user` with context.

## Completion Standard
## Completion standard

The task is complete only when all of the following are true:
The task is complete only when all are true:

- `make fmt` has been run.
- The `copilot-review` skill was run and all in-scope review comments were addressed.
- `make lint` passes, or any unrelated pre-existing failure is explicitly identified.
- `make test-unit` passes, or any unrelated pre-existing failure is explicitly identified.
- Failing pull request checks were inspected and summarized.
- `make test` was fixed when it was part of the failing state.
- Wasm goldens were regenerated when required.
- The final changes were pushed.
- `make fmt`, `make lint`, `make test-unit` all pass (or unrelated pre-existing failures explicitly identified).
- `make test` was run and fixed when it was part of the failing state; wasm goldens regenerated when required.
- The `copilot-review` skill addressed all in-scope review threads (reply + resolve succeeded for each).
- Mergeable condition was checked; conflicts resolved and `BEHIND` updated when present.
- Prior CI failures were inspected at the log level and either fixed at the root cause (with a local reproduction where possible) or explicitly flagged as not locally reproducible / escalated.
- Final changes were pushed exactly once at the end. No post-push re-check loop.
- A structured ✅/❌/⏳/❓ summary was printed, including an explicit hand-off line for the human CI re-trigger.
- No `gh pr merge` was run.
Loading