fix: add concurrency groups to prevent engine rate limiting #5662
The API proxy was returning HTTP 403 when the max-runs limit was hit, which caused the Copilot CLI harness to misinterpret it as an authentication failure rather than a turn budget exhaustion. Change the status code to 429 (Too Many Requests), which more accurately reflects the condition and allows clients to distinguish auth failures from resource limits.

Also bumps the contribution-check workflow max-turns from 3 to 4 to give the agent enough headroom to complete its task.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
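The status-code distinction described in this commit can be sketched as follows. This is a minimal illustration, not the code in `containers/api-proxy/guards/common-guard-checks.js`: the names `RunBudget`, `GuardRejection`, `checkMaxRuns`, and `classifyRejection`, as well as the message text, are assumptions made for the example.

```typescript
// Hypothetical shapes; none of these names come from the api-proxy source.
interface RunBudget {
  maxRuns: number;
  runsUsed: number;
}

interface GuardRejection {
  status: number;
  message: string;
}

// Returns null when the request may proceed, otherwise the rejection to send.
function checkMaxRuns(budget: RunBudget): GuardRejection | null {
  if (budget.runsUsed < budget.maxRuns) {
    return null;
  }
  // 429 marks an exhausted budget; a 403 here would look like an auth failure.
  return { status: 429, message: "max runs exceeded" };
}

// Client side: classify a rejection by its status code alone.
function classifyRejection(
  status: number
): "auth-failure" | "budget-exhausted" | "other" {
  if (status === 401 || status === 403) return "auth-failure";
  if (status === 429) return "budget-exhausted";
  return "other";
}
```

With this split, a harness that treats 403 as an authentication problem no longer sees that code when the run budget is simply used up.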
Update test assertion to match the new max-turns value in the contribution-check workflow.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
In rootless Docker mode, files created inside the agent container are owned by remapped UIDs that the host process cannot delete. This caused cleanup to fail with EACCES when removing the chroot-home directory.

Apply the same rootless permission repair pattern already used for artifact directories: on EACCES, spin up a short-lived container with CHOWN/DAC_OVERRIDE/FOWNER capabilities to fix ownership, then retry removal.

Fixes github/gh-aw#42101

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
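The repair-then-retry pattern from this commit could look roughly like the sketch below. It is not the implementation in `src/artifact-preservation.ts`: the helper names (`removeWithRepair`, `buildRepairArgs`, `isPermissionError`), the `/target` mount point, and the `chown -R` repair command are assumptions for illustration. The removal and repair steps are injected so the control flow can be exercised without Docker.

```typescript
// Hypothetical helpers; the names and the repair command are illustrative only.
type RemoveFn = (path: string) => void;
type RepairFn = (path: string) => void;

// True for Node-style errors whose code is EACCES.
function isPermissionError(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "EACCES"
  );
}

// Try to remove; on EACCES run the repair step once, then retry the removal.
// Any other error, or a failure on the retry, propagates to the caller.
function removeWithRepair(path: string, remove: RemoveFn, repair: RepairFn): void {
  try {
    remove(path);
    return;
  } catch (err) {
    if (!isPermissionError(err)) throw err;
  }
  repair(path);
  remove(path);
}

// Builds `docker` arguments for a short-lived container that fixes ownership.
// Assumes the image ships a `chown` binary; `owner` is supplied by the caller.
function buildRepairArgs(image: string, hostPath: string, owner: string): string[] {
  return [
    "run", "--rm",
    "--cap-add", "CHOWN",
    "--cap-add", "DAC_OVERRIDE",
    "--cap-add", "FOWNER",
    "-v", `${hostPath}:/target`,
    image,
    "chown", "-R", owner, "/target",
  ];
}
```

In real use, `remove` might wrap `fs.rmSync(path, { recursive: true, force: true })` and `repair` might pass the result of `buildRepairArgs` to `child_process.execFileSync("docker", ...)`. Both choices are assumptions of this sketch.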
Add concurrency groups to security-guard and test-coverage-reporter workflows to prevent multiple simultaneous runs from competing for copilot engine API quota. This addresses transient 403/429 failures caused by concurrent workflow runs exhausting provider rate limits.

- security-guard: scoped per PR number, cancel-in-progress
- test-coverage-reporter: scoped per ref, cancel-in-progress

Fixes #5650, #5652, #5654

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ome-cleanup # Conflicts: # src/artifact-preservation.ts
✅ Coverage Check Passed
Overall Coverage
📁 Per-file Coverage Changes (1 files)
Pull request overview
This PR primarily aims to reduce Copilot engine quota/rate-limit contention by serializing workflow runs via GitHub Actions concurrency, while also adjusting several related runtime/guard behaviors and regenerating the compiled .lock.yml workflows.
Changes:
- Add/adjust `concurrency` groups (and `cancel-in-progress`) for agentic workflows to reduce simultaneous runs.
- Improve cleanup robustness by passing image/prefix context into workdir removal and attempting rootless permission repair on deletion failure.
- Update API-proxy “max runs” guard behavior/tests (status code semantics) and increase `contribution-check` max-turns (with corresponding test/lock updates).
| File | Description |
|---|---|
| `src/container-cleanup.ts` | Passes runtime image/prefix context into work directory removal during cleanup. |
| `src/artifact-preservation.ts` | Adds options plumbing and rootless permission-repair retry when removing chroot home. |
| `scripts/ci/contribution-check-workflow.test.ts` | Updates assertion for contribution-check max-turns value. |
| `containers/api-proxy/server.websocket.test.js` | Updates WebSocket guard test to expect 429 for max-runs exceeded. |
| `containers/api-proxy/server.token-guards.test.js` | Updates HTTP guard test to expect 429 and message for max-runs exceeded. |
| `containers/api-proxy/guards/common-guard-checks.js` | Changes max-runs guard status code to 429 and updates rationale comment. |
| `.github/workflows/test-coverage-reporter.md` | Adds workflow-level concurrency group for reporter runs. |
| `.github/workflows/test-coverage-reporter.lock.yml` | Regenerated compiled workflow reflecting concurrency and other compiler output changes. |
| `.github/workflows/security-guard.md` | Adds workflow-level concurrency group for Security Guard runs. |
| `.github/workflows/security-guard.lock.yml` | Regenerated compiled workflow reflecting concurrency and other compiler output changes. |
| `.github/workflows/contribution-check.md` | Increases max-turns and retains concurrency configuration. |
| `.github/workflows/contribution-check.lock.yml` | Regenerated compiled workflow reflecting max-turns and other compiler output changes. |
Review details
- Files reviewed: 5/5 changed files
- Comments generated: 0
- Review effort level: Low
✅ Build Test Suite completed successfully!
📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤
🔑 Smoke Copilot PAT PAT auth validated. All systems operational. ✅
✅ Smoke Gemini completed. All facets verified. 💎
🚀 Security Guard has started processing this pull request
✅ Contribution Check completed successfully! Contribution guidelines review complete for PR #5662: no important missing items found; no comment needed.
📡 Smoke OTel Tracing completed. All tracing scenarios validated. ✅
Chroot tests passed! Smoke Chroot - All security and functionality tests succeeded.
✅ Smoke Copilot BYOK AOAI (Entra) completed. Copilot AOAI BYOK (Entra) mode operational. 🔓
✅ Smoke Copilot BYOK AOAI (api-key) completed. Copilot AOAI BYOK (api-key) mode operational. 🔓
✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟
✅ Smoke Claude passed
✅ Smoke Copilot BYOK completed. Copilot BYOK mode operational. 🔓
🔌 Smoke Services - All services reachable! ✅
Smoke Test: Claude Engine Validation
Overall result: PASS
🔥 Smoke Test Results - Copilot PAT Auth
Overall: PASS Auth mode: PAT (COPILOT_GITHUB_TOKEN)
Smoke Test: Copilot BYOK (Direct) Mode
✅ GitHub MCP Connectivity: PR data retrieved successfully
Status: PASS - Running in direct BYOK mode (COPILOT_PROVIDER_API_KEY)
cc @lpcox
Running in direct BYOK mode (COPILOT_PROVIDER_API_KEY + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw) PASS
🤖 Smoke Test Results
PR: fix: add concurrency groups to prevent engine rate limiting - @lpcox
Overall:
Smoke test results:
Warning: Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "registry.npmjs.org"
See Network Configuration for more information.
Chroot Runtime Version Comparison
Result: ❌ FAILED - Python and Node.js versions differ between host and chroot environments. The
🏗️ Build Test Suite Results
Overall: 8/8 ecosystems passed - ✅ PASS
Smoke Test: API Proxy OTEL Tracing - Results
All 5 scenarios pass. OTEL tracing integration is functional.
Smoke Test: Gemini Engine Validation
Overall status: PASS
Warning: Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "localhost"
See Network Configuration for more information.
Smoke Test: GitHub Actions Services Connectivity
Overall: FAIL
Summary
Adds concurrency groups to `security-guard` and `test-coverage-reporter` workflows to prevent multiple simultaneous runs from competing for copilot engine API quota.
Problem
All three workflow failures (#5650, #5652, #5654) were caused by transient copilot engine failures.
These occurred because multiple workflows triggered concurrently on the same PR/branch, exhausting the provider's rate limits.
Fix
Added `concurrency` configuration to the two workflows that were missing it:
- `group: "security-guard-${{ github.event.pull_request.number || github.ref }}"` with `cancel-in-progress: true`
- `group: "test-coverage-reporter-${{ github.ref }}"` with `cancel-in-progress: true`
The `contribution-check` workflow already had a concurrency group configured.
This ensures that redundant runs are cancelled, reducing concurrent load on the copilot engine API and preventing rate limit exhaustion.
Fixes #5650, #5652, #5654
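To make the grouping behaviour concrete, here is a small TypeScript model of how the two group keys resolve and what cancel-in-progress does to an older run in the same group. It is only a simulation written for this discussion: `EventContext`, `securityGuardGroup`, `coverageReporterGroup`, and `RunScheduler` are hypothetical names, and GitHub Actions evaluates the real expressions itself.

```typescript
// Hypothetical model of the fields the group expressions read.
interface EventContext {
  pullRequestNumber?: number; // absent for events that are not pull requests
  ref: string;
}

// Mirrors `github.event.pull_request.number || github.ref`:
// use the PR number when present, otherwise fall back to the ref.
function securityGuardGroup(ctx: EventContext): string {
  return `security-guard-${ctx.pullRequestNumber || ctx.ref}`;
}

// Mirrors `test-coverage-reporter-${{ github.ref }}`.
function coverageReporterGroup(ctx: EventContext): string {
  return `test-coverage-reporter-${ctx.ref}`;
}

// Toy scheduler: at most one active run per group, newer runs replace older ones.
class RunScheduler {
  private active = new Map<string, number>();

  // Starts a run and returns the id of the run it cancelled, if any.
  start(group: string, runId: number): number | undefined {
    const cancelled = this.active.get(group);
    this.active.set(group, runId);
    return cancelled;
  }
}
```

Two pushes to the same PR map to the same `security-guard` key, so the second run replaces the first instead of competing with it for engine quota. Runs for different PRs or refs get different keys and are unaffected.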