[aw-failures] [aw] Skillet floods Actions with startup-failures on copilot/* branch pushes (recurring — 73 failed runs / 6h as o
[Content truncated due to length] #40447
In the 6h window ending 2026-06-20 08:00 UTC, 21 of 23 agentic-workflow failures (91%) were the Skillet workflow (.github/workflows/skillet.lock.yml). Every run is an instantaneous startup-failure with no agent activity. No existing agentic-workflows issue tracks this signature.
createdAt == startedAt == updatedAt (06:08:33Z) — the run never progressed past scheduling.
Audit classification: "Workflow 'Skillet' failed before agent activation — no error logs were available to analyze."
Run-level metadata reports zero failed jobs and zero failed steps — i.e. no job was ever dispatched (startup_failure signature).
audit-diff firewall comparison across runs 27862450162 / 27859225167 / 27856881609: 0 new domains, 0 status changes, 0 anomalies — a stable, systemic signature with no drift.
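The signature described above (identical timestamps, no dispatched jobs) can be expressed as a small predicate. This is a minimal sketch, assuming each run is available as a dictionary; the `jobs` and `conclusion` keys and the overall record shape are assumptions for illustration, not the failure investigator's actual schema.

```python
def is_startup_failure(run: dict) -> bool:
    """Return True when a run matches the zero-progress startup-failure signature.

    Assumed (hypothetical) record shape: createdAt / startedAt / updatedAt
    timestamp strings, a `jobs` list, and a `conclusion` string.
    """
    timestamps = {run.get("createdAt"), run.get("startedAt"), run.get("updatedAt")}
    # All three timestamps present and identical: the run never left scheduling.
    never_progressed = len(timestamps) == 1 and None not in timestamps
    # No job was ever dispatched.
    no_jobs = len(run.get("jobs", [])) == 0
    failed = run.get("conclusion") in {"failure", "startup_failure"}
    return failed and never_progressed and no_jobs


example_run = {
    "conclusion": "failure",
    "createdAt": "2026-06-20T06:08:33Z",
    "startedAt": "2026-06-20T06:08:33Z",
    "updatedAt": "2026-06-20T06:08:33Z",
    "jobs": [],
}
print(is_startup_failure(example_run))  # True
```

A run that dispatched at least one job, or whose timestamps differ, falls outside this signature and should be investigated as a real failure.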
The Skillet source on main triggers only on workflow_dispatch (centralized slash-command dispatch via agentic_commands.yml). The failing runs, however, fire on event = push against copilot/* feature branches. The most consistent explanation is that those branches carry a stale / divergent compiled skillet.lock.yml (recompiled on main at 2026-06-20 07:59, after all failures) whose on: block still matches push events. GitHub schedules that branch-local workflow on each Copilot push; it fails at startup (0s, no jobs, no logs) because the run context does not match a valid slash-command activation. Result: every Copilot agent push emits one red Skillet run.
Evidence detail
run_summary.json for 27862450162: "event": "push", "headBranch": "copilot/update-custom-css-themes", "displayTitle": "Fix docs text contrast and Safari backdrop blur support", Duration 0, Turns 0.
On main, the skillet.lock.yml on: block contains only workflow_dispatch (plus an internal aw_context input); there is no push/pull_request trigger, confirming that the running config diverges from main.
No firewall, MCP, or tooling anomalies (audit-diff clean) — rules out network/proxy regression.
Proposed remediation
Confirm the on: triggers of skillet.lock.yml as it exists on a live copilot/* branch (e.g. copilot/update-custom-css-themes) and diff against main's compiled lock.
If branch locks are stale, recompile/rebase Copilot branches, or have the compiler emit a guard so a centralized slash-command lock cannot be scheduled by push/pull_request on feature branches (e.g. an early if on github.event_name == 'workflow_dispatch').
Consider scoping Skillet (and other private centralized commands) so branch pushes do not schedule them at all, eliminating the startup-failure noise.
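The first remediation step, diffing the on: triggers of a branch lock against main, could look roughly like the sketch below. It assumes both on: blocks have already been parsed into Python objects. The branch-side value shown is hypothetical, since the actual contents of a copilot/* lock file have not been confirmed, and both helper names are illustrative.

```python
def trigger_names(on_block) -> set:
    """Normalize a workflow `on:` value (string, list, or mapping) to event names."""
    if on_block is None:
        return set()
    if isinstance(on_block, str):
        return {on_block}
    if isinstance(on_block, dict):
        return set(on_block.keys())
    return set(on_block)  # list form: ["push", "pull_request"]


def unexpected_triggers(branch_on, main_on) -> set:
    """Events that can schedule the branch copy but not the copy on main."""
    return trigger_names(branch_on) - trigger_names(main_on)


# main: workflow_dispatch only, with the internal aw_context input (per the evidence above).
main_on = {"workflow_dispatch": {"inputs": {"aw_context": {}}}}
# HYPOTHETICAL stale branch lock that still carries a push trigger.
branch_on = {"push": {"branches": ["copilot/*"]}, "workflow_dispatch": {}}

print(sorted(unexpected_triggers(branch_on, main_on)))  # ['push']
```

An empty result for a live copilot/* branch would argue against the stale-lock hypothesis. One caveat when loading the files: a YAML 1.1 parser such as PyYAML may read the bare key on as the boolean True, so the key lookup may need to account for that.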
Success criteria / verification
Zero Skillet runs with event = push on copilot/* branches over a subsequent 24h window.
Skillet failure count in the failure-investigator 6h window drops from 21 to ~0 absent a real slash-command failure.
Any future Skillet failure carries an attributed failed job/step and non-zero duration (i.e. real execution, not startup_failure).
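The criteria above can be checked mechanically over a list of run records. The sketch below is illustrative only: the field names `workflowName`, `failedJobs`, and `durationSeconds` are assumptions, and the caller is expected to have already filtered the list to the 24h verification window.

```python
from fnmatch import fnmatch


def check_success(runs: list) -> dict:
    """Summarize Skillet runs against the success criteria.

    Field names (workflowName, failedJobs, durationSeconds) are hypothetical.
    """
    skillet = [r for r in runs if r.get("workflowName") == "Skillet"]
    # Criterion 1: no push-triggered Skillet runs on copilot/* branches.
    push_runs = [
        r for r in skillet
        if r.get("event") == "push" and fnmatch(r.get("headBranch", ""), "copilot/*")
    ]
    # Criterion 2: failure count should fall to roughly zero.
    failures = [r for r in skillet if r.get("conclusion") == "failure"]
    # Criterion 3: any remaining failure has a failed job and non-zero duration.
    unattributed = [
        r for r in failures
        if not r.get("failedJobs") or r.get("durationSeconds", 0) <= 0
    ]
    return {
        "push_runs": len(push_runs),
        "failures": len(failures),
        "unattributed_failures": len(unattributed),
    }


sample = [
    {"workflowName": "Skillet", "event": "push", "headBranch": "copilot/update-custom-css-themes",
     "conclusion": "failure", "failedJobs": [], "durationSeconds": 0},
    {"workflowName": "Skillet", "event": "workflow_dispatch", "headBranch": "main",
     "conclusion": "failure", "failedJobs": ["agent"], "durationSeconds": 412},
]
print(check_success(sample))
```

The fix is verified when `push_runs` and `unattributed_failures` are both zero for the window.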
Existing-issue correlation
Not a duplicate: no open agentic-workflows issue references Skillet or this startup-failure signature.
Update — 2026-06-20 19:13 UTC (re-confirmed, escalating)
Fix the Skillet startup-failure flood on copilot/* pushes: this one cluster is now 82% of all agentic-workflow failures (73 of 89 in 6h), up ~3.5× from 21 at first report. Same signature, no drift: event = push, copilot/* head branch, 0s duration, zero jobs/steps/logs, createdAt == updatedAt. Root cause and remediation are unchanged and verified still applicable; this is now the top reliability-noise source in the repo and should be prioritized.
Representative runs (this window): §27880980627 (copilot/apply-suggestions-from-discussion-40483), §27878842425, §27877892017 (copilot/agent-persona-exploration-another-one). Spot-checked via gh run view: all event=push, createdAt == updatedAt, 0s duration.
New low-frequency watch item (not yet filed, not escalated): Daily Security Observability Report §27876792899, a 0-turn agent failure on main/schedule, 10.4m duration, audit posture delta write_capable → read_only vs successful baseline §27429719048. Single occurrence in 6h; monitoring one more cycle before filing to avoid duplicating the known 0-turn/ERR_CONFIG class.
Not real failures this window (excluded, no action): Smoke CI (5× cancelled) and Deployment Incident Monitor (4× cancelled) — superseded runs, not agent failures.
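The shares and growth factor quoted in this report can be rechecked with a few lines of arithmetic. The variable names are illustrative; the counts are the ones stated in the report.

```python
# Counts taken from the report: first window (ending 08:00 UTC) and update window (19:13 UTC).
first_skillet, first_total = 21, 23
later_skillet, later_total = 73, 89

first_share = round(100 * first_skillet / first_total)   # percent of failures that were Skillet
later_share = round(100 * later_skillet / later_total)
growth = round(later_skillet / first_skillet, 1)         # increase in Skillet failures per 6h window

print(first_share)  # 91
print(later_share)  # 82
print(growth)       # 3.5
```

The share fell from 91% to 82% only because the total grew; the absolute Skillet count per window rose about 3.5×.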
Key metrics (representative run §27862450162)
conclusion = failure, event = push, headBranch = copilot/update-custom-css-themes.
Duration = 0, ActionMinutes = 0, Turns = 0, TokenUsage = 0, ErrorCount = 0.
Affected workflow and run IDs
Skillet (.github/workflows/skillet.lock.yml), private: true, slash_command (centralized, name: "*").
push events on copilot/* branches: 27862450162, 27862210273, 27862038117, 27861738057, 27859225167, 27859181269, 27859173283, 27859099139, 27859022467, 27858955332, 27858921530, 27858868996, 27858836010, 27858612004, 27858594866, 27858560026, 27858531804, 27858098105, 27857778483, 27857320814, 27856881609.
Other failures correlated to existing issues
[...] Execute GitHub Copilot CLI, BYOK 403); Avenger §27859038019 → [aw-failures] [aw] Avenger agent job fails at "Parse agent logs" — ERR_CONFIG "no structured log entries" despite successful age [Content truncated due to length] #40145 (Parse agent logs, ERR_CONFIG).
References:
Related to [aw-failures] [aw] Failure Investigation Report — 6h window (2026-06-17 19:34 UTC) #39883
Other failures in the 2026-06-20 19:13 UTC window, correlated to existing issues:
ERR_CONFIG: no structured log entries → [aw-failures] [aw] Avenger agent job fails at "Parse agent logs" — ERR_CONFIG "no structured log entries" despite successful age [Content truncated due to length] #40145 (5 runs this window, e.g. §27880644893).
Daily Issues Report Generator → [aw-failures] [aw] Daily Issues Report Generator fails 100% — Copilot Python sample driver crashes (ModuleNotFoundError: 'copilo [Content truncated due to length] #40380 (1 run, §27874987840).