fix(cli-proxy): replace gh api rate_limit liveness probe with TCP connectivity check#4677
Replace gh api rate_limit with a raw TCP check (consistent with healthcheck.sh). The old probe failed if the TCP tunnel was not yet bound when it ran (race condition), GH_TOKEN was absent, or the gh CLI version changed how it routes API paths for a custom GH_HOST.

Changes:
- Add a tunnel-readiness loop that polls 127.0.0.1:DIFC_PORT before running the liveness probe, eliminating the startup race condition
- Replace gh api rate_limit with a bash /dev/tcp connectivity check directly to DIFC_HOST:DIFC_PORT (no token needed, version-agnostic)
- Raise default MAX_LIVENESS_ATTEMPTS from 2 to 5 and sleep from 1s to 2s for resilience against transient startup delays
- Improve error messages to identify the failing component clearly
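For readers unfamiliar with the bash /dev/tcp pattern mentioned above, here is a minimal sketch of such a connectivity check. It is not the code from entrypoint.sh or healthcheck.sh: the function name `tcp_check` and the timeout argument are hypothetical, and it assumes bash built with /dev/tcp support plus the coreutils `timeout` command.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from the PR): tcp_check HOST PORT [TIMEOUT_SECONDS]
# Returns 0 if a TCP connection to HOST:PORT can be opened, non-zero otherwise.
tcp_check() {
  local host="$1" port="$2" timeout_s="${3:-2}"
  # Open the socket in a child bash so the descriptor is closed when it exits.
  # Inside the child, $0 is the host and $1 is the port.
  # `timeout` (coreutils, assumed available) bounds a connect that would hang.
  timeout "$timeout_s" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

A call such as `tcp_check "$DIFC_HOST" "$DIFC_PORT" || echo "DIFC proxy unreachable" >&2` needs no token and no gh CLI, which is the property the commit message relies on.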
✅ Coverage Check Passed
Pull request overview
This PR hardens the awf-cli-proxy sidecar startup sequence by removing a brittle gh api rate_limit liveness probe and replacing it with TCP-based readiness/liveness checks, preventing early exit(1) failures that were blocking Copilot engine workflows.
Changes:
- Adds a bounded wait loop to ensure the local TCP tunnel is listening on `127.0.0.1:${DIFC_PORT}` before any external connectivity probing.
- Replaces the external DIFC proxy liveness probe with a raw TCP connectivity check to `${DIFC_HOST}:${DIFC_PORT}` (no `GH_TOKEN`/`gh` CLI dependency).
- Increases default retry/backoff parameters and improves startup error messages for clearer diagnosis.
Show a summary per file
| File | Description |
|---|---|
| containers/cli-proxy/entrypoint.sh | Adds tunnel readiness polling and switches DIFC liveness probing from gh api to TCP checks with more resilient retry defaults and clearer errors. |
Copilot's findings
- Files reviewed: 1/1 changed files
- Comments generated: 0
|
Warning: Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

network:
  allowed:
    - defaults
    - "registry.npmjs.org"

See Network Configuration for more information.
|
🏗️ Build Test Suite Results
Overall: 8/8 ecosystems passed — ✅ PASS
|
Smoke Test Results: FAIL ❌
|
🔬 Smoke Test Results
Overall: FAIL
PR: fix(cli-proxy): replace gh api rate_limit liveness probe with TCP connectivity check
|
Smoke Test Results — Auth mode: PAT (COPILOT_GITHUB_TOKEN)
PR: fix(cli-proxy): replace gh api rate_limit liveness probe with TCP connectivity check
Overall: PARTIAL. MCP and HTTP passed; file test unverifiable (template vars not substituted by runner).
|
🔑 Copilot BYOK Smoke Test Results
✅ Test 1: GitHub MCP Connectivity (verified last 2 merged PRs)
Running in direct BYOK mode (COPILOT_PROVIDER_API_KEY) via api-proxy → api.githubcopilot.com
Overall: PASS ✅
|
`awf-cli-proxy` was exiting with status 1 at startup due to a fragile liveness probe introduced in #4486 and a race condition worsened by the IPv6 tunnel change in #4626, blocking all `copilot` engine workflows.

Root causes
- `gh api rate_limit` ran immediately after `node tcp-tunnel.js &`, before the tunnel had bound `localhost:${DIFC_PORT}`, causing an instant `connection refused`
- `gh api rate_limit` requires a valid `GH_TOKEN`, depends on the `gh` CLI routing API paths correctly for a custom `GH_HOST` (changed between versions), and only retried twice (2 × 6s window)

Changes
- Poll `127.0.0.1:${DIFC_PORT}` (up to 10 × 0.5s) before the liveness probe, eliminating the race condition
- Probe liveness with `bash /dev/tcp/${DIFC_HOST}/${DIFC_PORT}`: same pattern as `healthcheck.sh`; no token, no `gh` CLI, no API path dependency
- `MAX_LIVENESS_ATTEMPTS` 2→5, sleep 1s→2s
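The two-phase startup described above (wait for the local tunnel, then probe the DIFC proxy with retries) can be sketched roughly as follows. This is an illustration under assumptions, not the actual entrypoint.sh: `retry`, `probe_port`, and `startup_checks` are hypothetical names, the error messages are invented, and the fractional `sleep 0.5` assumes GNU sleep. Only `DIFC_HOST`, `DIFC_PORT`, `MAX_LIVENESS_ATTEMPTS`, and the attempt/sleep values come from the description.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry ATTEMPTS SLEEP_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the final one.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Hypothetical probe: succeeds if HOST:PORT accepts a TCP connection.
# The subshell closes the socket on exit. Note: no connect timeout here.
probe_port() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Sketch of the startup order: local tunnel first, then the remote proxy.
startup_checks() {
  # Phase 1: tunnel readiness, up to 10 x 0.5s.
  if ! retry 10 0.5 probe_port 127.0.0.1 "$DIFC_PORT"; then
    echo "tunnel is not listening on 127.0.0.1:${DIFC_PORT}" >&2
    return 1
  fi
  # Phase 2: liveness, default 5 attempts with a 2s sleep.
  if ! retry "${MAX_LIVENESS_ATTEMPTS:-5}" 2 probe_port "$DIFC_HOST" "$DIFC_PORT"; then
    echo "DIFC proxy unreachable at ${DIFC_HOST}:${DIFC_PORT}" >&2
    return 1
  fi
}
```

Keeping the probe as an argument to `retry` means the same loop serves both phases, and separate error messages make it clear whether the tunnel or the proxy is the failing component.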