Suppress failure-tracking issues from Build Failure Analysis workflows#8726

Merged
Evangelink merged 1 commit into main from dev/amauryleve/bfa-suppress-failure-issues
Jun 1, 2026

@Evangelink

Stops the Build Failure Analysis workflow from spamming tracking issues (e.g. #8685) when its AI agent flakes on otherwise-successful builds.

Why

The workflow runs on every PR. gh-aw does not expose a job-level `if:` hook to gate the auto-generated Copilot CLI step, so the agent is invoked unconditionally; on successful builds it simply noops. The upstream Copilot/AI service is intermittently flaky and sometimes returns "Failed to get response from the AI model"; when that happens during a noop call, the run is marked failed and the gh-aw safe-output machinery auto-files a "[aw] Build Failure Analysis failed" tracking issue.

Example flake: https://github.com/microsoft/testfx/actions/runs/26727721772 -- build succeeded (0 errors), the binlog-mcp / NuGet MCP / dump-binlog steps correctly skipped (they are gated on `steps.build.outcome == 'failure'`), but Execute GitHub Copilot CLI still ran, retried 4 outer x 5 inner times, and exited 1 with "Last error: Unknown error".
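
For readers unfamiliar with the gating mentioned above, here is a minimal sketch of a step conditioned on the build step's outcome, in plain GitHub Actions syntax. The step names and the `./build.sh` command are hypothetical and are not copied from the actual workflow. `continue-on-error: true` is also an assumption: a step-level `if:` that contains no status function such as `failure()` is implicitly combined with `success()`, so without it the gated step would be skipped after a failed build.

```yaml
steps:
  - name: Build                      # hypothetical step name
    id: build
    continue-on-error: true          # assumption: lets later steps run and inspect the outcome
    run: ./build.sh                  # hypothetical build command

  - name: Dump binlog                # hypothetical stand-in for one of the gated analysis steps
    if: steps.build.outcome == 'failure'
    run: echo "runs only when the build step failed"
```

On a successful build the second step is skipped, which matches what the example run shows for the binlog-mcp, NuGet MCP, and dump-binlog steps. The auto-generated Copilot CLI step carries no such condition, which is why it still ran.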

What

Adds safe-outputs.report-failure-as-issue: false to both:

  • .github/workflows/build-failure-analysis.md
  • .github/workflows/build-failure-analysis-command.md
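
A minimal sketch of what the added setting might look like in each file's frontmatter, assuming the dotted path `safe-outputs.report-failure-as-issue` maps to a nested YAML key. All other frontmatter keys are omitted, and the surrounding layout is an assumption rather than a copy of the actual files.

```yaml
---
# ... existing frontmatter keys omitted ...
safe-outputs:
  # ... existing safe-output settings omitted ...
  report-failure-as-issue: false   # the setting added by this PR
---
```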

This is the gh-aw-sanctioned switch for exactly this scenario (the auto-generated tracking-issue body links to it as "Stop reporting this workflow as a failure"). The compiled change in both .lock.yml files is a single env-var flip (GH_AW_FAILURE_REPORT_AS_ISSUE: "true" -> "false"), plus the usual frontmatter-hash and prompt-heredoc-marker churn.
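
As an illustration, the compiled result in each .lock.yml would look roughly like the fragment below. Only the variable name and its two values come from this PR; where the `env:` block sits inside the lock file is not shown here and is an assumption.

```yaml
env:
  # Previously "true"; the value is a quoted string in the compiled output.
  GH_AW_FAILURE_REPORT_AS_ISSUE: "false"
```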

The workflow stays advisory: real analyses still post PR comments / inline suggestion blocks on legit build failures, and any run-level failures remain visible in the Actions UI for anyone who wants to investigate.

Out of scope

A deeper fix (split the build into a separate job and gate the agent job on `needs.build.outputs.outcome == 'failure'`) would also save the wasted AI API calls on successful builds, but it is non-trivial for three reasons: gh-aw applies the top-level `if:` to the activation job rather than the agent job (see .github/workflows/adhoc-qa.{md,lock.yml}); NuGet.Mcp.Server would need to be installed on both jobs; and GH_AW_BINLOG_PATH would become invalid across jobs. That refactor can land separately if needed; this PR focuses on stopping the immediate issue spam.
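
For illustration only, the deferred refactor would look roughly like this in plain GitHub Actions syntax. This is a sketch under stated assumptions, not something the gh-aw sources can express directly given the constraints above: the job names, runner label, and `./build.sh` command are hypothetical.

```yaml
jobs:
  build:
    runs-on: ubuntu-latest             # hypothetical runner
    outputs:
      # Expose the build step's outcome so a dependent job can read it.
      outcome: ${{ steps.build.outcome }}
    steps:
      - uses: actions/checkout@v4
      - id: build
        continue-on-error: true        # assumption: keep the job green so the output is still published
        run: ./build.sh                # hypothetical build command

  agent:
    needs: build
    # The agent job, and its AI API calls, would only start after a failed build.
    if: needs.build.outputs.outcome == 'failure'
    runs-on: ubuntu-latest
    steps:
      - run: echo "analysis agent would run here"   # placeholder for the generated agent steps
```

On GitHub-hosted runners each job starts on a fresh machine, so the binlog produced by `build` would have to be handed over explicitly (for example as an uploaded artifact). That is the reason a path such as GH_AW_BINLOG_PATH would stop being valid across jobs.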

Lock files regenerated with gh aw compile --strict (v0.75.4, the version pinned for the rest of the BFA lock files).

The Build Failure Analysis workflow runs on every PR. Its AI agent is
invoked unconditionally because gh-aw exposes no job-level `if:` hook
to gate the auto-generated Copilot CLI step. On successful builds the
agent immediately noops, but the upstream Copilot/AI service is
intermittently flaky (`Failed to get response from the AI model`), so
those otherwise-pointless invocations occasionally fail the run, which
in turn opens a tracking issue (see #8685 and follow-ups).

Set `safe-outputs.report-failure-as-issue: false` on both
build-failure-analysis.md and build-failure-analysis-command.md. The
workflow remains advisory: real analyses still post comments on legit
build failures, and run-level failures stay visible in the Actions UI
for investigation -- but transient AI flakes no longer spam tracking
issues.

Lock files regenerated with `gh aw compile --strict` (v0.75.4, the
version pinned for the rest of the BFA lock files); the only effective
change downstream is `GH_AW_FAILURE_REPORT_AS_ISSUE: false`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 1, 2026 08:06

Copilot AI left a comment

Pull request overview

This PR updates the gh-aw “Build Failure Analysis” advisory workflows to stop auto-filing failure-tracking issues when the Copilot/AI service flakes during otherwise-successful builds (e.g., issue #8685). The workflows will still fail visibly in the Actions UI, but will no longer spam [aw] ... failed tracking issues.

Changes:

  • Set safe-outputs.report-failure-as-issue: false in both Build Failure Analysis workflow sources.
  • Regenerate both compiled .lock.yml files so they reflect GH_AW_FAILURE_REPORT_AS_ISSUE: "false".

Summary per file:

| File | Description |
| --- | --- |
| .github/workflows/build-failure-analysis.md | Disables gh-aw failure tracking issue creation via safe-outputs.report-failure-as-issue: false. |
| .github/workflows/build-failure-analysis.lock.yml | Regenerated compiled workflow; flips GH_AW_FAILURE_REPORT_AS_ISSUE to "false" and updates hashes/heredoc markers. |
| .github/workflows/build-failure-analysis-command.md | Mirrors the same failure-issue suppression for the slash-command variant. |
| .github/workflows/build-failure-analysis-command.lock.yml | Regenerated compiled workflow; flips GH_AW_FAILURE_REPORT_AS_ISSUE to "false" and updates hashes/heredoc markers. |

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 0

@Evangelink Evangelink left a comment

Review: Suppress failure-tracking issues from Build Failure Analysis workflows

| # | Dimension | Status | Severity |
| --- | --- | --- | --- |
| 1 | Algorithmic Correctness | ✅ N/A | - |
| 2 | Threading & Concurrency | ✅ N/A | - |
| 3 | Security & IPC Contract Safety | ✅ N/A | - |
| 4 | Public API & Binary Compatibility | ✅ N/A | - |
| 5 | Performance & Allocations | ✅ N/A | - |
| 6 | Cross-TFM Compatibility | ✅ N/A | - |
| 7 | Resource & IDisposable Management | ✅ N/A | - |
| 8 | Defensive Coding at Boundaries | ✅ N/A | - |
| 9 | Localization & Resources | ✅ N/A | - |
| 10 | Test Isolation | ✅ N/A | - |
| 11 | Assertion Quality | ✅ N/A | - |
| 12 | Flakiness Patterns | ✅ N/A | - |
| 13 | Test Completeness & Coverage | ✅ N/A | - |
| 14 | Data-Driven Test Coverage | ✅ N/A | - |
| 15 | Code Structure & Simplification | ✅ N/A | - |
| 16 | Naming & Conventions | ✅ N/A | - |
| 17 | Documentation Accuracy | ✅ Clean | - |
| 18 | Analyzer & Code Fix Quality | ✅ N/A | - |
| 19 | IPC Wire Compatibility | ✅ N/A | - |
| 20 | Build Infrastructure & Dependencies | ✅ Clean | - |
| 21 | Scope & PR Discipline | ✅ Clean | - |

✅ 21/21 dimensions clean — no findings.


Overall Assessment

The change is correct, safe, and well-motivated:

  • Lock file consistency: Both .lock.yml files correctly reflect the GH_AW_FAILURE_REPORT_AS_ISSUE: "true" → "false" flip that corresponds to report-failure-as-issue: false in their respective .md sources. The frontmatter hash updates are expected.
  • Motivation is sound: The PR description and inline comments accurately describe the root cause (unconditional agent invocation on every PR, transient AI service flakes producing spurious tracking issues). The referenced example run corroborates the scenario.
  • Trade-off is acknowledged: The comment in build-failure-analysis.md explicitly notes that run-level failures remain visible in the Actions UI, so genuine infrastructure failures are not silently swallowed — they just no longer auto-file issues.
  • gh-aw-sanctioned mechanism: Using the dedicated report-failure-as-issue: false switch is the correct, first-party approach for this scenario rather than a workaround.
  • Scope is disciplined: The deeper fix (splitting build/agent jobs) is explicitly deferred with a concrete explanation of why it's non-trivial; no follow-up issues are needed since the PR description documents it clearly.

No actionable issues found.

Generated by Expert Code Review (on open) for issue #8726 · sonnet46 1.5M

@Evangelink Evangelink merged commit 511fa67 into main Jun 1, 2026
26 of 27 checks passed
@Evangelink Evangelink deleted the dev/amauryleve/bfa-suppress-failure-issues branch June 1, 2026 08:17