Enable gh-aw-detection on 50% of agentic workflows #40698
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
✅ PR Code Quality Reviewer completed the code quality review.
✅ Test Quality Sentinel completed test quality analysis. No test files were added or modified in this PR. The PR only contains changes to workflow .md and .lock.yml files (enabling gh-aw-detection on 50% of agentic workflows). Test Quality Sentinel skipped.
✅ Design Decision Gate 🏗️ completed the design decision gate check. No ADR enforcement needed: PR #40698 does not have the 'implementation' label (has_implementation_label=false) and has 0 new lines of code in business logic directories (requires_adr_by_default_volume=false, no custom config). The PR changes are limited to .github/skills shell/python query files, which are outside the default business logic directories.
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
Pull request overview
Rolls out the gh-aw-detection feature flag to additional agentic workflows so compiled .lock.yml workflows use the external threat-detect binary path (emitting detection_result.json) instead of the inline execution path.
Changes:
- Added `features: gh-aw-detection: true` to a subset of workflow markdown sources (selected alphabetically).
- Recompiled workflows, producing updated `.lock.yml` that installs `threat-detect`, runs it under AWF, uploads `detection_result.json`, and concludes detection via `threat-detect conclude`.
- Updated detection artifacts to upload both `detection_result.json` and `detection.log` (for workflows on the external detector path).
Show a summary per file
| File | Description |
|---|---|
| .github/workflows/daily-repo-chronicle.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-rendering-scripts-verifier.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-reliability-review.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-performance-summary.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-observability-report.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-news.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-news.lock.yml | Recompiled workflow using external threat-detect binary path and new conclude step. |
| .github/workflows/daily-model-inventory.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-mcp-concurrency-analysis.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-malicious-code-scan.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-malicious-code-scan.lock.yml | Recompiled workflow metadata after enabling gh-aw-detection. |
| .github/workflows/daily-issues-report.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-issues-report.lock.yml | Recompiled workflow using external threat-detect binary path and new conclude step. |
| .github/workflows/daily-hippo-learn.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-hippo-learn.lock.yml | Recompiled workflow using external threat-detect binary path. |
| .github/workflows/daily-geo-optimizer.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-function-namer.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-function-namer.lock.yml | Recompiled workflow using external threat-detect binary path. |
| .github/workflows/daily-formal-spec-verifier.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-file-diet.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-fact.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-experiment-report.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-doc-updater.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-doc-updater.lock.yml | Recompiled workflow using external threat-detect binary path. |
| .github/workflows/daily-doc-healer.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-compiler-threat-spec-optimizer.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-compiler-quality.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-community-attribution.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-code-metrics.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-cli-performance.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-caveman-optimizer.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-cache-strategy-analyzer.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-awf-spec-compiler-surfacing.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-awf-spec-compiler-surfacing.lock.yml | Recompiled workflow using external threat-detect binary path. |
| .github/workflows/daily-aw-cross-repo-compile-check.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-astrostylelite-markdown-spellcheck.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/daily-astrostylelite-markdown-spellcheck.lock.yml | Recompiled workflow using external threat-detect binary path. |
| .github/workflows/daily-assign-issue-to-user.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/craft.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/copilot-session-insights.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/copilot-pr-prompt-analysis.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/copilot-pr-nlp-analysis.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/copilot-pr-merged-report.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/copilot-opt.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/copilot-cli-deep-research.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/copilot-agent-analysis.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/copilot-agent-analysis.lock.yml | Recompiled workflow using external threat-detect binary path. |
| .github/workflows/contribution-check.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/constraint-solving-potd.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/commit-changes-analyzer.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/commit-changes-analyzer.lock.yml | Recompiled workflow using external threat-detect binary path. |
| .github/workflows/code-scanning-fixer.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/cloclo.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/cli-version-checker.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/cli-version-checker.lock.yml | Recompiled workflow using external threat-detect binary path. |
| .github/workflows/cli-consistency-checker.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/claude-code-user-docs-review.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/ci-doctor.md | Adds features: gh-aw-detection: true to a provenance-managed workflow source. |
| .github/workflows/ci-coach.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/ci-coach.lock.yml | Recompiled workflow using external threat-detect binary path. |
| .github/workflows/chaos-pr-bundle-fuzzer.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/chaos-pr-bundle-fuzzer.lock.yml | Recompiled workflow using external threat-detect binary path. |
| .github/workflows/breaking-change-checker.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/brave.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/blog-auditor.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/blog-auditor.lock.yml | Recompiled workflow using external threat-detect binary path. |
| .github/workflows/auto-triage-issues.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/auto-triage-issues.lock.yml | Recompiled workflow using external threat-detect binary path. |
| .github/workflows/avenger.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/aw-failure-investigator.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/audit-workflows.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/artifacts-summary.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/architecture-guardian.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/archie.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/approach-validator.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/approach-validator.lock.yml | Recompiled workflow using external threat-detect binary path. |
| .github/workflows/api-consumption-report.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/ai-moderator.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/ai-moderator.lock.yml | Recompiled workflow metadata after enabling gh-aw-detection. |
| .github/workflows/agent-persona-explorer.md | Enables gh-aw-detection via frontmatter feature flag. |
| .github/workflows/agent-persona-explorer.lock.yml | Recompiled workflow using external threat-detect binary path. |
| .github/workflows/agent-performance-analyzer.md | Enables gh-aw-detection via frontmatter feature flag. |
Copilot's findings
- Files reviewed: 128/128 changed files
- Comments generated: 3
| source: githubnext/agentics/workflows/ci-doctor.md@e15e57b40918dbca11b350c55d02ab61934afa75 | ||
| features: | ||
| gh-aw-detection: true | ||
| --- |
| run: | | ||
| threat-detect conclude --result-file /tmp/gh-aw/threat-detection/detection_result.json |
| run: | | ||
| threat-detect conclude --result-file /tmp/gh-aw/threat-detection/detection_result.json |
Skills-Based Review 🧠
Applied /zoom-out and /improve-codebase-architecture — 3 observations worth addressing before 100% rollout, plus one positive callout. No blocking issues, but the timeout concern (#1 below) is worth a quick follow-up.
📋 Key Themes & Highlights
Key Themes
- Timeout enforcement boundary shift: The external-detector path drops the YAML step-level `timeout-minutes:` that the inline-engine path emitted. The runner-enforced guard is replaced by `GH_AW_TIMEOUT_MINUTES` in env (binary-honoured). The job-level cap (45 min) is a coarser backstop: a hung AWF host process could starve the conclude/cleanup steps of their time budget.
- Network config silently omitted: The AWF config for the external-detector path drops the `"network"` key. This is intentional (threat-detect routes through the api-proxy, no domain allowlist needed at the AWF layer), but a single-line code comment would prevent future reviewers from treating the missing key as a bug.
- Conclude step output contract untested: `TestExternalDetectorPath` verifies the step name but not that the binary actually writes `conclusion`/`reason`/`success` to `$GITHUB_OUTPUT`. The gate is only as good as that contract.
Positive Highlights
- ✅ Replacing the 40-line github-script conclude block with a single `threat-detect conclude` invocation is a clear architecture win: less inline JS, single place of truth in the binary.
- ✅ Uploading `detection_result.json` alongside `detection.log` enables programmatic comparison between the old and new paths during the rollout window.
- ✅ `set -o pipefail` preserved, so non-zero exits from `threat-detect` correctly fail the step.
- ✅ Binary pinned to `v0.2.2` via the `DefaultThreatDetectVersion` constant, with no floating references.
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 287.9 AIC · ⌖ 9.96 AIC · ⊞ 6.9K
| continue-on-error: true | ||
| run: | | ||
| bash "${RUNNER_TEMP}/gh-aw/actions/install_threat_detect_binary.sh" v0.2.2 | ||
| - name: Execute threat detection with AWF |
[/zoom-out] The new execution step drops the timeout-minutes: YAML field that the inline-engine path had at step level.
Without a step-level timeout, GitHub Actions will not terminate a hung threat-detect process until the job-level cap (45 min here) is exhausted. The old path had timeout-minutes: 20 as a runner-enforced safety net in addition to GH_AW_TIMEOUT_MINUTES: 20. The new path relies solely on the binary honouring that env var; if AWF freezes at the host level before the binary runs, no runner-side guard exists.
💡 Suggestion
Add a step-level timeout to buildExternalDetectorExecutionStep in pkg/workflow/threat_detection.go, matching the pattern in copilot_engine_execution.go line 755:
steps = append(steps, " timeout-minutes: "+timeoutValue+"\n")
The timeoutValue can be sourced from DefaultAgenticWorkflowTimeout (same constant used for the main agent step), keeping both paths consistent.
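A minimal sketch of the idea, assuming the step is assembled as a slice of YAML lines. `buildStepWithTimeout`, its parameters, the indentation, and the placeholder command are hypothetical and are not taken from the actual `buildExternalDetectorExecutionStep` code:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// buildStepWithTimeout is a hypothetical helper, not the real compiler code.
// It emits a YAML step that carries both guards: timeout-minutes (enforced by
// the runner) and GH_AW_TIMEOUT_MINUTES (honoured by the binary).
func buildStepWithTimeout(name string, timeoutMinutes int, command string) []string {
	timeoutValue := strconv.Itoa(timeoutMinutes)
	var steps []string
	steps = append(steps, "      - name: "+name+"\n")
	// Runner-enforced guard: the step is cancelled after this many minutes.
	steps = append(steps, "        timeout-minutes: "+timeoutValue+"\n")
	steps = append(steps, "        env:\n")
	// Binary-honoured guard, kept alongside the runner-enforced one.
	steps = append(steps, "          GH_AW_TIMEOUT_MINUTES: "+timeoutValue+"\n")
	steps = append(steps, "        run: |\n")
	steps = append(steps, "          "+command+"\n")
	return steps
}

func main() {
	// The command is a placeholder, not the real detection invocation.
	step := buildStepWithTimeout("Execute threat detection with AWF", 20, "echo placeholder")
	fmt.Print(strings.Join(step, ""))
}
```

Emitting both values from the same variable keeps the two guards from drifting apart, which is the consistency the suggestion asks for.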
| GH_AW_MAX_AI_CREDITS="${GH_AW_MAX_AI_CREDITS:-400}" | ||
| printf '%s\n' "{\"\$schema\":\"https://github.com/github/gh-aw-firewall/releases/download/v0.27.7/awf-config.schema.json\",\"network\":{\"allowDomains\":[\"api.business.githubcopilot.com\",\"api.enterprise.githubcopilot.com\",\"api.github.com\",\"api.githubcopilot.com\",\"api.individual.githubcopilot.com\",\"github.com\",\"host.docker.internal\",\"registry.npmjs.org\",\"telemetry.enterprise.githubcopilot.com\"]},\"apiProxy\":{\"enabled\":true,\"enableTokenSteering\":true,\"maxRuns\":500,\"maxAiCredits\":${GH_AW_MAX_AI_CREDITS},\"maxCacheMisses\":5},\"container\":{\"imageTag\":\"0.27.7,squid=sha256:deb1d4e19de62d51cee0508057a596a19315c3423ada4d675cad136dc8037c96,agent=sha256:aae231e4635c8999d039c132f1602d3df850fe9b84a00aa2b5ac981179b5661c,api-proxy=sha256:009caf2e3d88fa77b64e9a03a95a228fc58db0f1701c6d324b29ba5a3c7c79b6,cli-proxy=sha256:4757f198a3fa20f88bdbe70be7ae1a05f127d9c0a9e96a5d6460ef40c08fc83d\"}}" > "${RUNNER_TEMP}/gh-aw/awf-config.json" | ||
| GH_AW_MAX_AI_CREDITS="${{ vars.GH_AW_DEFAULT_DETECTION_MAX_AI_CREDITS || '400' }}" | ||
| printf '%s\n' "{\"\$schema\":\"https://github.com/github/gh-aw-firewall/releases/download/v0.27.7/awf-config.schema.json\",\"apiProxy\":{\"enabled\":true,\"enableTokenSteering\":true,\"maxRuns\":500,\"maxAiCredits\":${GH_AW_MAX_AI_CREDITS},\"maxCacheMisses\":5},\"container\":{\"imageTag\":\"0.27.7,squid=sha256:deb1d4e19de62d51cee0508057a596a19315c3423ada4d675cad136dc8037c96,agent=sha256:aae231e4635c8999d039c132f1602d3df850fe9b84a00aa2b5ac981179b5661c,api-proxy=sha256:009caf2e3d88fa77b64e9a03a95a228fc58db0f1701c6d324b29ba5a3c7c79b6,cli-proxy=sha256:4757f198a3fa20f88bdbe70be7ae1a05f127d9c0a9e96a5d6460ef40c08fc83d\"}}" > "${RUNNER_TEMP}/gh-aw/awf-config.json" |
[/zoom-out] The AWF config for the external detector path omits the "network" key entirely, whereas the inline Copilot engine path emitted an explicit allowDomains list (sourced from GetThreatDetectionAllowedDomains).
The behaviour is intentional — threat-detect uses the AWF API proxy for LLM calls so no explicit domain allowlist is needed — but a comment in buildExternalDetectorExecutionStep explaining this would make it clear to future readers that the missing key is deliberate, not an oversight.
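One way such an optional key can be dropped from generated JSON is a nil pointer combined with `omitempty`. The sketch below is illustrative only: the struct shapes are simplified and hypothetical, and the rule "emit `network` only when extra domains exist" is an assumption, not a description of how the compiler builds the AWF config:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical, simplified shapes; the real AWF config has more fields.
type networkConfig struct {
	AllowDomains []string `json:"allowDomains"`
}

type apiProxyConfig struct {
	Enabled bool `json:"enabled"`
}

type awfConfig struct {
	// A nil pointer plus omitempty drops the "network" key entirely.
	Network  *networkConfig `json:"network,omitempty"`
	APIProxy apiProxyConfig `json:"apiProxy"`
}

// renderAWFConfig emits the "network" key only when extra domains are given
// (an assumed rule for this sketch).
func renderAWFConfig(extraDomains []string) (string, error) {
	cfg := awfConfig{APIProxy: apiProxyConfig{Enabled: true}}
	if len(extraDomains) > 0 {
		cfg.Network = &networkConfig{AllowDomains: extraDomains}
	}
	out, err := json.Marshal(cfg)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	withoutNetwork, _ := renderAWFConfig(nil)
	withNetwork, _ := renderAWFConfig([]string{"api.github.com"})
	fmt.Println(withoutNetwork) // {"apiProxy":{"enabled":true}}
	fmt.Println(withNetwork)    // {"network":{"allowDomains":["api.github.com"]},"apiProxy":{"enabled":true}}
}
```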
💡 Suggested comment
In pkg/workflow/threat_detection.go, around the threatDetectionData construction (line ~1096):
// NetworkPermissions.Allowed carries only user-specified extra domains.
// The standard Copilot threat-detection domain list (GetThreatDetectionAllowedDomains)
// is intentionally omitted here: threat-detect routes all model calls through the
// AWF API proxy (apiProxy.enabled=true), so no host-level network allowDomains entry
// is required. Omitting the "network" key keeps the AWF config minimal.
| } | ||
| } | ||
| run: | | ||
| threat-detect conclude --result-file /tmp/gh-aw/threat-detection/detection_result.json |
[/zoom-out] The Conclude threat detection step expects threat-detect conclude to write conclusion, reason, and success to $GITHUB_OUTPUT — these feed the job-level outputs detection_conclusion, detection_reason, detection_success used by downstream jobs.
The compiler comment in buildExternalDetectorConcludeStep documents this contract, but TestExternalDetectorPath only checks that the string "threat-detect conclude" appears in the step; it does not assert that the step outputs are wired up. If the binary stops writing to $GITHUB_OUTPUT (e.g. a semver bump changes its interface), the gate silently becomes a no-op.
💡 Suggestion
Extend TestExternalDetectorPath in pkg/workflow/threat_detection_isolation_test.go to verify the output contract:
// Step outputs for detection_conclusion must be set so the job can export them.
if !strings.Contains(detectionSection, "steps.detection_conclusion.outputs.conclusion") {
t.Error("Conclude step outputs must be referenced in job outputs")
}
Alternatively, add an integration test that runs threat-detect conclude with a fixture detection_result.json and checks $GITHUB_OUTPUT.
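The checking half of such an integration test could look roughly like the sketch below. It assumes the binary writes the simple single-line `key=value` form to `$GITHUB_OUTPUT`; `missingOutputs` is a hypothetical helper, and the fixture content is invented for illustration rather than captured from `threat-detect`:

```go
package main

import (
	"fmt"
	"strings"
)

// missingOutputs returns the required keys that are absent from
// GITHUB_OUTPUT-style content. Only the single-line "key=value" form is
// handled; the multi-line "key<<DELIMITER" form is out of scope here.
func missingOutputs(content string, required []string) []string {
	seen := make(map[string]bool)
	for _, line := range strings.Split(content, "\n") {
		key, _, found := strings.Cut(line, "=")
		if found {
			seen[strings.TrimSpace(key)] = true
		}
	}
	var missing []string
	for _, key := range required {
		if !seen[key] {
			missing = append(missing, key)
		}
	}
	return missing
}

func main() {
	// Invented fixture: "success" is deliberately left out.
	fixture := "conclusion=success\nreason=\n"
	required := []string{"conclusion", "reason", "success"}
	fmt.Println(missingOutputs(fixture, required)) // [success]
}
```

A test built on this would fail as soon as one of the three keys disappears, instead of letting the gate degrade silently.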
| issue: 33280 | ||
|
|
||
| features: | ||
| gh-aw-detection: true |
[/zoom-out] The 50% rollout selects workflows alphabetically (A–D range visible in this diff). Alphabetical order is not random, so the cohort may share structural characteristics — e.g. all workflows whose names start with letters near the top of the alphabet could cluster by creation era, team, or trigger pattern.
A few questions worth confirming before merging:
- Is there monitoring/alerting that will surface detection failures specifically for the new binary path vs. the inline-engine path?
- Is there a rollback plan documented if the 50% cohort starts producing systematic false positives or detection errors?
- If results look good, what is the intended next step (100% rollout, or graduated increases)?
💡 On sampling strategy
For a statistically representative sample, random shuffling before slicing at 50% is preferable to alphabetical selection. Alphabetical order can introduce bias if there is any correlation between workflow name and workflow characteristics (e.g., daily-* workflows are all scheduled and may behave differently from event-driven workflows). That said, alphabetical is operationally convenient and fine for a first canary pass — just worth noting if the comparison results are used to make conclusions about parity.
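A sketch of shuffle-then-slice selection, assuming a fixed seed is acceptable so the cohort stays reproducible across recompiles. `selectCohort` and the seed value are hypothetical; the workflow names are a handful taken from this PR's file list:

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// selectCohort returns a pseudo-random half of names (rounded down).
// A fixed seed makes the selection reproducible from run to run.
func selectCohort(names []string, seed int64) []string {
	shuffled := append([]string(nil), names...) // copy; leave the input untouched
	sort.Strings(shuffled)                      // canonical order, independent of input order
	r := rand.New(rand.NewSource(seed))
	r.Shuffle(len(shuffled), func(i, j int) {
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	})
	cohort := shuffled[:len(shuffled)/2]
	sort.Strings(cohort) // sorted only for readable output
	return cohort
}

func main() {
	names := []string{"archie", "brave", "cloclo", "craft", "daily-fact", "daily-news"}
	fmt.Println(selectCohort(names, 42))
}
```

Sorting before shuffling means the cohort depends only on the set of names and the seed, not on the order in which files happen to be listed.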
| RUNNER_TEMP: ${{ runner.temp }} | ||
| S2STOKENS: true | ||
| TRACEPARENT: ${{ env.GITHUB_AW_OTEL_TRACE_ID != '' && env.GITHUB_AW_OTEL_PARENT_SPAN_ID != '' && format('00-{0}-{1}-01', env.GITHUB_AW_OTEL_TRACE_ID, env.GITHUB_AW_OTEL_PARENT_SPAN_ID) || '' }} | ||
| - name: Upload threat detection artifact |
[/improve-codebase-architecture] ✅ Good improvement here: the upload step now captures both detection_result.json and detection.log instead of just the log.
The structured JSON artifact makes it much easier to programmatically compare results between the old inline path and the new binary path during the rollout. Combined with the cleaner threat-detect conclude step replacing the 40-line github-script module, this is a meaningful reduction in moving parts.
REQUEST CHANGES — 1 blocking issue found.
Of the 128 changed files (64 .md + 64 .lock.yml), all changes are correct and well-structured with one exception: ci-doctor.md carries a source: frontmatter field that makes it upstream-managed and read-only in this repo. Adding features: gh-aw-detection: true directly to it violates the AGENTS.md policy and will be silently overwritten on the next gh aw update.
Other observations (non-blocking)
- The 64 new detection lock files match the format already used by the 50 pre-existing detection-enabled workflows — consistent.
- The 64 new detection lock files match the format already used by the 50 pre-existing detection-enabled workflows, so the format is consistent.
- Detection job uses `threat-detect v0.2.2` binary across all new workflows, same as existing ones.
- No step-level or job-level timeout on the `Execute threat detection with AWF` step. This pattern is consistent with existing detection workflows, so it is not new here, but worth tracking as a systemic gap.
- The PR description states "previously only 43 had it"; the accurate count is 50 on the base branch (the 43 figure filters to only those with an `engine:` field; 7 detection-enabled workflows lack `engine:`).
🔎 Code quality review by PR Code Quality Reviewer · 238.6 AIC · ⌖ 7.19 AIC · ⊞ 5.1K
| echo "✅ PR pre-analysis complete. Agent should start with $SUMMARY_FILE" | ||
|
|
||
| source: githubnext/agentics/workflows/ci-doctor.md@e15e57b40918dbca11b350c55d02ab61934afa75 | ||
| features: |
Upstream-managed workflow modified directly — policy violation. ci-doctor.md declares source: githubnext/agentics/workflows/ci-doctor.md@e15e57b..., making it read-only in this repo; features: gh-aw-detection: true must be applied via the upstream, not edited here directly.
💡 Correct approach
Per AGENTS.md, workflows with a source: frontmatter entry are upstream-managed and must not be edited locally:
Treat those workflow source files as read-only in this repository.
Do not manually edit their generated `.lock.yml` files.
Approved paths to enable gh-aw-detection for ci-doctor:
- Add `features: gh-aw-detection: true` to the workflow in the upstream source (githubnext/agentics/workflows/ci-doctor.md), then pin the new commit hash here and run `gh aw update`.
- If local feature overrides are a supported concept (check `gh aw` docs), confirm the mechanism is documented and that the compiled lock is then generated via `make recompile` rather than hand-edited.
As-is, the direct edit to ci-doctor.md (and the accompanying manual change to ci-doctor.lock.yml) will be silently overwritten the next time gh aw update is run against the current pinned ref.
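A pre-merge guard for this policy could be as small as the sketch below, which flags workflow sources whose frontmatter has a top-level `source:` key. `isUpstreamManaged` is a hypothetical helper; it scans lines between the first pair of `---` delimiters rather than parsing YAML, which is a simplification:

```go
package main

import (
	"fmt"
	"strings"
)

// isUpstreamManaged reports whether the frontmatter of a workflow markdown
// file has a top-level "source:" key. Simplified: no real YAML parsing.
func isUpstreamManaged(markdown string) bool {
	lines := strings.Split(markdown, "\n")
	if strings.TrimSpace(lines[0]) != "---" {
		return false // no frontmatter block at the top of the file
	}
	for _, line := range lines[1:] {
		if strings.TrimSpace(line) == "---" {
			return false // frontmatter ended without a source key
		}
		if strings.HasPrefix(line, "source:") {
			return true
		}
	}
	return false
}

func main() {
	managed := "---\nsource: githubnext/agentics/workflows/ci-doctor.md@e15e57b40918dbca11b350c55d02ab61934afa75\nfeatures:\n  gh-aw-detection: true\n---\nbody\n"
	local := "---\nfeatures:\n  gh-aw-detection: true\n---\nbody\n"
	fmt.Println(isUpstreamManaged(managed)) // true
	fmt.Println(isUpstreamManaged(local))   // false
}
```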
Weekly update covering the week of June 15–22, 2026:
- Compiler +320% performance regression fix (#40662)
- New deferinloop Go linter (#40679)
- gh-aw-detection rollout to 50% of workflows (#40698)
- JSON-RPC error handling fix (#40715)
- Skillet sparse checkout path fix (#40684)
- FNV-1a heredoc delimiter generation (#40696)
- Agent of the Week: delight ✨

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Rolls out the `gh-aw-detection` feature flag to exactly 50% of agentic workflows (107/214). Previously only 43 workflows (20%) had it enabled.
Changes
- Added `features: gh-aw-detection: true` to 64 agentic workflows (identified by `engine:` frontmatter field), selected alphabetically
What `gh-aw-detection` does
Gates the external threat-detect binary path: emits a detection job that runs the pinned `threat-detect` binary and writes `detection_result.json` instead of using the inline execution path.