[aw-failures] [aw] Copilot strict bash allowlist starves two more workflows — SPDD Spec Planner & Formal Spec Verifier hit max t
[Content truncated due to length] #40853
Problem statement
Two Copilot (BYOK claude-sonnet-4.6) workflows fail because their strict: true bash/tool allowlists deny routine read-only operations the agent issues. After 5 denials the Copilot SDK driver trips its guard.tool_denials_exceeded threshold and stops the session early (max tool denials threshold reached (5/5)), so the agent job exits failure having produced 0 writes, after burning 16–21 AIC over a single 19–24 minute turn.
This is the same failure class as #40755 (Daily Compiler Threat Spec Optimizer), now confirmed on two additional workflows. The pattern is systemic to strict-allowlist Copilot workflows, not workflow-specific.
Affected workflows and run IDs
shell(sed -n '1,100p' /tmp/copilot-tool-output-*.txt)
read(/home/runner/work/gh-aw/gh-aw/pkg/intent/resolver_test.go)
Comparator (SPDD last green): §27504800878, succeeded read-only in 1 turn on 2026-06-14.
Evidence
failureClass=permission_denied, permissionDeniedCount=11, hasNumerousPermissionDenied=true; harness logs "attempt 1: detected numerous permission-denied issues — not retrying (classified as missing tool/permission issue)" and emits missing_tool.
[sdk-driver] max tool denials threshold reached (5/5); stopping SDK session early.
audit-diff (27971023563 vs green baseline 27504800878): no firewall anomalies (has_anomalies=false); the failed run consumed 16.577 AIC in a single 24m28s turn, confirming a tool-denial spin loop, not a network/provider fault.
Probable root cause
The strict: true allowlists enumerate exact command forms (e.g. shell(sed -n), specific shell(cat ...) globs) but the agent issues legitimate variants that fall outside them:
sed -n '1,100p' <file> (line-range read) vs the allowed bare sed -n,
the builtin read/view tool on Go source/test files not covered by the shell(cat ...) globs.
The 5-denial guard then aborts the whole session, so a few off-allowlist reads kill an otherwise-functional run.
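The interaction described above can be illustrated with a small Python sketch. It is not the gh-aw or Copilot SDK implementation: the glob-style matching in is_allowed, the allowlist entries, the shortened file paths, and the function names are all assumptions made for illustration. Only the threshold of 5 and the shape of the denied calls come from this report.

```python
from fnmatch import fnmatchcase

# Hypothetical allowlist entries, modeled on the forms named in this issue.
STRICT_ALLOWLIST = ["shell(sed -n)", "shell(cat docs/*.md)"]
# Hypothetical broadened entries that also admit line-range reads and file reads.
BROADENED_ALLOWLIST = STRICT_ALLOWLIST + ["shell(sed -n *)", "read(pkg/*)"]
MAX_TOOL_DENIALS = 5  # threshold quoted in the driver log (5/5)

# Illustrative tool calls; paths are shortened stand-ins, not real run data.
CALLS = [
    "shell(sed -n '1,100p' spec.md)",
    "read(pkg/intent/resolver_test.go)",
    "shell(sed -n '101,200p' spec.md)",
    "shell(sed -n '201,300p' spec.md)",
    "read(pkg/intent/resolver.go)",
    "shell(cat docs/spec.md)",
]


def is_allowed(call: str, allowlist: list[str]) -> bool:
    """Assumed semantics: a call passes only if it matches one entry as a glob."""
    return any(fnmatchcase(call, pattern) for pattern in allowlist)


def run_session(calls, allowlist, max_denials=MAX_TOOL_DENIALS):
    """Replay tool calls and stop once the denial count reaches max_denials."""
    denials = 0
    executed = []
    for call in calls:
        if is_allowed(call, allowlist):
            executed.append(call)
            continue
        denials += 1
        if denials >= max_denials:
            # Mirrors the reported behavior: the whole session ends early.
            return {"status": "aborted", "denials": denials, "executed": executed}
    return {"status": "completed", "denials": denials, "executed": executed}
```

Under these assumptions, run_session(CALLS, STRICT_ALLOWLIST) aborts after the fifth denied read without executing anything, even though the sixth call would have been allowed, while run_session(CALLS, BROADENED_ALLOWLIST) completes with no denials.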
Proposed remediation
Broaden the bash allowlists in daily-spdd-spec-planner.md and daily-formal-spec-verifier.md to cover read-range and file-read operations the agents actually need (e.g. allow sed -n, read/view of repo source/spec paths).
Raise the tool_denials threshold (currently 5) so a handful of denied probes does not abort a long session.
Success criteria / verification
conclusion=success, >1 turn, and ≥1 intended safe output.
No guard.tool_denials_exceeded events in agent-stdio.log.
permissionDeniedCount < 5 per run.
Correlation
Same class as #40755 (strict allowlist denies sed/awk/read). Parent report: #39883.
Related to #39883
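The success criteria listed earlier could be checked mechanically after a rerun. The following Python sketch is one possible shape for such a check. The function name check_run and its parameters are hypothetical, and it assumes the run's conclusion, turn count, safe-output count, permissionDeniedCount, and the text of agent-stdio.log have already been collected by some other means. It also assumes the log criterion means that no guard events appear at all. The two marker strings are the ones quoted in this report.

```python
# Log markers quoted in this issue; everything else here is illustrative.
DENIAL_MARKERS = (
    "guard.tool_denials_exceeded",
    "max tool denials threshold reached",
)


def check_run(conclusion, turns, safe_outputs, permission_denied_count, log_text):
    """Return the unmet success criteria; an empty list means the run passes."""
    problems = []
    if conclusion != "success":
        problems.append(f"conclusion={conclusion}")
    if turns <= 1:
        problems.append(f"turns={turns} (expected >1)")
    if safe_outputs < 1:
        problems.append("no intended safe output")
    if permission_denied_count >= 5:
        problems.append(
            f"permissionDeniedCount={permission_denied_count} (expected <5)"
        )
    if any(marker in log_text for marker in DENIAL_MARKERS):
        problems.append("denial guard tripped in agent-stdio.log")
    return problems
```

A run shaped like the failures reported here (conclusion failure, one turn, no safe outputs, 11 permission denials, and the sdk-driver abort line in the log) would fail every check, while a healthy rerun would return an empty list.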