fix: handle EACCES during chroot-home cleanup in rootless Docker#5653
Conversation
The API proxy was returning HTTP 403 when the max-runs limit was hit, which caused the Copilot CLI harness to misinterpret it as an authentication failure rather than a turn budget exhaustion. Change the status code to 429 (Too Many Requests) which more accurately reflects the condition and allows clients to distinguish auth failures from resource limits. Also bumps the contribution-check workflow max-turns from 3 to 4 to give the agent enough headroom to complete its task. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Update test assertion to match the new max-turns value in the contribution-check workflow. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
In rootless Docker mode, files created inside the agent container are owned by remapped UIDs that the host process cannot delete. This caused cleanup to fail with EACCES when removing the chroot-home directory. Apply the same rootless permission repair pattern already used for artifact directories: on EACCES, spin up a short-lived container with CHOWN/DAC_OVERRIDE/FOWNER capabilities to fix ownership, then retry removal. Fixes github/gh-aw#42101 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
| Metric | Base | PR | Delta |
|---|---|---|---|
| Lines | 98.20% | 98.11% | 📉 -0.09% |
| Statements | 98.13% | 98.04% | 📉 -0.09% |
| Functions | 99.54% | 99.54% | ➡️ +0.00% |
| Branches | 94.19% | 93.98% | 📉 -0.21% |
📁 Per-file Coverage Changes (2 files)
| File | Lines (Before → After) | Statements (Before → After) |
|---|---|---|
src/artifact-preservation.ts |
100.0% → 92.4% (-7.61%) | 100.0% → 92.4% (-7.61%) |
src/workdir-setup.ts |
92.7% → 94.5% (+1.82%) | 92.7% → 94.5% (+1.82%) |
Coverage comparison generated by scripts/ci/compare-coverage.ts
There was a problem hiding this comment.
Pull request overview
This PR addresses a cleanup failure in rootless Docker scenarios by adding an EACCES-recovery path when deleting the *-chroot-home directory, reusing the existing “rootless permission repair” mechanism that chowns files back to the host UID/GID before retrying deletion.
Changes:
- Forward rootless/docker image context from
cleanup()intoremoveWorkDirectories()so permission repair can run during cleanup. - Add
EACCEShandling for*-chroot-homedeletion: attemptrmSync, onEACCESrunfixArtifactPermissionsForRootless, then retryrmSync. - Update contribution-check workflow turn limits and adjust API-proxy max-runs guard semantics/tests (403 → 429).
Show a summary per file
| File | Description |
|---|---|
| src/container-cleanup.ts | Passes docker/rootless context options through to removeWorkDirectories() during cleanup. |
| src/artifact-preservation.ts | Adds EACCES recovery for chroot-home deletion using rootless permission repair + retry. |
| scripts/ci/contribution-check-workflow.test.ts | Updates workflow assertion for max-turns (3 → 4). |
| containers/api-proxy/server.websocket.test.js | Updates WebSocket max-runs guard expectation to 429. |
| containers/api-proxy/server.token-guards.test.js | Updates HTTP max-runs guard expectation to 429 and asserts error message. |
| containers/api-proxy/guards/common-guard-checks.js | Changes max-runs guard status code to 429 and updates rationale comment. |
| .github/workflows/contribution-check.md | Increases workflow max-turns to 4. |
| .github/workflows/contribution-check.lock.yml | Regenerates locked workflow output consistent with updated workflow config (including maxRuns/max-turns). |
Review details
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 8/8 changed files
- Comments generated: 4
- Review effort level: Low
| type RemoveWorkDirectoriesOptions = { | ||
| dockerHostPathPrefix?: string; | ||
| imageRegistry?: string; | ||
| imageTag?: string; | ||
| agentImage?: string; | ||
| }; | ||
|
|
||
| export function removeWorkDirectories(workDir: string, options: RemoveWorkDirectoriesOptions = {}): void { |
| // In rootless Docker, files created inside the container may be owned by | ||
| // remapped UIDs that the host process cannot delete. Fix permissions via | ||
| // a privileged container, then retry removal. | ||
| if (error && typeof error === 'object' && 'code' in error && error.code === 'EACCES') { | ||
| logger.debug('Chroot home removal failed with EACCES; attempting rootless permission repair'); |
| // Terminal hard cap: 429 signals the agent has exhausted its allowed | ||
| // turns. Unlike transient rate-limits, the Retry-After is absent so | ||
| // well-behaved clients won't retry indefinitely. | ||
| statusCode: 429, |
| pull-requests: read | ||
| issues: read | ||
| max-turns: 3 | ||
| max-turns: 4 |
|
@copilot run pr-finisher skill |
|
✅ Copilot review passed with no inline comments. @lpcox Add the |
|
@copilot address review feedback |
Addressed in I reverted the unrelated workflow/API-proxy changes so this PR is back to the rootless cleanup fix, and kept the focused |
|
📡 Smoke OTel Tracing completed. All tracing scenarios validated. ✅ |
|
🚀 Security Guard has started processing this pull request |
|
📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤 |
|
🔌 Smoke Services — All services reachable! ✅ |
|
✅ Smoke Claude passed |
|
✅ Smoke Copilot BYOK completed. Copilot BYOK mode operational. 🔓 |
|
✅ Contribution Check completed successfully! Contribution guidelines check complete for PR #5653: the PR includes tests for the cleanup change, has a clear description with a related issue reference, updates source files in the appropriate location, and does not appear to need documentation changes. |
|
✅ Build Test Suite completed successfully! |
|
🔑 Smoke Copilot PAT PAT auth validated. All systems operational. ✅ |
|
✅ Smoke Copilot BYOK AOAI (Entra) completed. Copilot AOAI BYOK (Entra) mode operational. 🔓 |
|
Chroot tests passed! Smoke Chroot - All security and functionality tests succeeded. |
|
✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟 |
|
✅ Smoke Copilot BYOK AOAI (api-key) completed. Copilot AOAI BYOK (api-key) mode operational. 🔓 |
|
❌ Smoke Gemini reports failed. Facets need polishing... Smoke test failed: MCP and Connectivity tests ❌. File writing and Bash tools ✅. PR titles partially unavailable. |
|
| Metric | Base | PR | Delta |
|---|---|---|---|
| Lines | 98.20% | 98.20% | ➡️ +0.00% |
| Statements | 98.13% | 98.13% | ➡️ +0.00% |
| Functions | 99.54% | 99.54% | ➡️ +0.00% |
| Branches | 94.19% | 94.14% | 📉 -0.05% |
📁 Per-file Coverage Changes (2 files)
| File | Lines (Before → After) | Statements (Before → After) |
|---|---|---|
src/artifact-preservation.ts |
100.0% → 97.8% (-2.18%) | 100.0% → 97.8% (-2.18%) |
src/workdir-setup.ts |
92.7% → 94.5% (+1.82%) | 92.7% → 94.5% (+1.82%) |
Coverage comparison generated by scripts/ci/compare-coverage.ts
Smoke Test: Claude Engine Validation
Overall result: PASS
|
🔥 Smoke Test Results
PR: fix: handle EACCES during chroot-home cleanup in rootless Docker Overall: PASS
|
Smoke Test: Copilot PAT Auth — PARTIAL PASS
Overall: PARTIAL — pre-step template variables ( Auth mode: PAT (COPILOT_GITHUB_TOKEN) | Author: @lpcox
|
|
Smoke Test: Copilot BYOK (Direct) ✅ PASS✅ GitHub MCP connectivity verified Running in direct BYOK mode via
|
🔭 Smoke Test: API Proxy OpenTelemetry Tracing
All scenarios pass. The OTEL tracing integration is complete and fully tested.
|
|
fix: handle EACCES during chroot-home cleanup in rootless Docker\n\n✅ GitHub reads\n✅ Playwright title check\n✅ smoke-test file\n✅ npm ci && npm run build\n\nPASS Warning Firewall blocked 1 domainThe following domain was blocked by the firewall during workflow execution:
network:
allowed:
- defaults
- "registry.npmjs.org"See Network Configuration for more information.
|
Chroot Version Comparison Results
Status: FAILED — Python and Node.js versions differ between host and chroot environments.
|
Smoke Test: Services Connectivity
Overall: FAIL —
|
|
@lpcox
|
🏗️ Build Test Suite Results
Overall: 8/8 ecosystems passed — ✅ PASS
|
Problem
In rootless Docker mode (e.g. rootless network isolation on GitHub Actions runners), files created inside the agent container (like
.aws/config) are owned by remapped UIDs. When AWF tries to clean up thechroot-homedirectory after execution,fs.rmSyncfails withEACCESbecause the host process doesn't have permission to delete these remapped-UID files.Fix
Apply the same rootless permission repair pattern already used for artifact directories (
fixArtifactPermissionsForRootless):rmSyncfirst (fast path for non-rootless)EACCES, spin up a short-lived Docker container withCHOWN/DAC_OVERRIDE/FOWNERcapabilities tochownthe files back to the host userrmSyncafter permission repairThe
cleanup()function now forwardsdockerHostPathPrefix,imageRegistry,imageTag, andagentImagetoremoveWorkDirectoriesso it can perform the rootless repair.Changes
src/artifact-preservation.ts:removeWorkDirectoriesnow catchesEACCESand retries after rootless permission repairsrc/container-cleanup.ts: Pass rootless context options through toremoveWorkDirectoriesFixes github/gh-aw#42101