[awf] awf-cli-proxy: 5-run failure cluster — DIFC proxy unreachable due to IPv4/IPv6 mismatch #4657

Description

@lpcox

Problem: A cluster of 5 runs across 3 workflows (PR Sous Chef ×3, Issue Monster, Daily Code Metrics) was killed with 0 token usage and 0 agent turns; the agent was never invoked. All failures share the same signature: awf-cli-proxy cannot reach the host DIFC proxy at host.docker.internal:18443.

Context: github/gh-aw#38201. Window: ~6h on 2026-06-09. Engine-agnostic (Copilot and Claude both affected). Distinct from #37989, which is an auth failure at turn 1; this cluster never reaches the agent.

Root Cause: The awf-cli-proxy sidecar resolves localhost to IPv6 [::1] on dual-stack hosts, but the DIFC proxy only listens on IPv4 0.0.0.0:18443. Because the sidecar fails fast after only 2 retries, it exits before the agent container can start. The agent container has a depends_on condition that requires the cli-proxy container to be healthy, so the whole workflow is aborted.
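
The address-family mismatch described above can be reproduced in isolation. The following is a minimal sketch, not code from awf: it binds a throwaway IPv4-only listener on 0.0.0.0 (using an ephemeral port rather than 18443) and then tries to connect over both loopback addresses. The helper names `can_connect` and `demo_mismatch` are hypothetical.

```python
import socket


def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and hosts without IPv6 support.
        return False


def demo_mismatch():
    """Bind an IPv4-only listener and probe it via IPv4 and IPv6 loopback."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("0.0.0.0", 0))  # IPv4 wildcard, ephemeral port
    listener.listen(1)
    port = listener.getsockname()[1]
    try:
        return {
            "127.0.0.1": can_connect("127.0.0.1", port),
            "::1": can_connect("::1", port),
        }
    finally:
        listener.close()


if __name__ == "__main__":
    for address, ok in demo_mismatch().items():
        print(f"{address}: {'reachable' if ok else 'unreachable'}")
```

A client that resolves localhost to [::1] first sees the same result as the second probe here: nothing is listening on the IPv6 loopback, so the connection fails even though the service is up on IPv4.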

Proposed Solution:

  1. Fix start_cli_proxy.sh to probe 127.0.0.1 explicitly instead of localhost.
  2. Alternatively, bind the DIFC proxy on :: (dual-stack) to accept both [::1] and 127.0.0.1.
  3. Increase retry count from 2 to ≥10 with exponential backoff.
  4. Improve error messages to distinguish "proxy unreachable" from "proxy not yet ready" for faster triage.
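
Items 1, 3, and 4 above can be sketched together as a readiness probe. This is an illustration in Python under stated assumptions, not the contents of start_cli_proxy.sh: the function names, the base delay, the cap, and the wording of the triage labels are all hypothetical. Only the 127.0.0.1 target, port 18443, and the retry count of 10 come from this issue.

```python
import socket
import time


def backoff_delays(attempts, base=0.5, cap=8.0):
    """Exponential delays: base, 2*base, 4*base, ... each capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


def describe_failure(exc):
    """Map a socket error to a triage label (assumed mapping, for illustration)."""
    if isinstance(exc, ConnectionRefusedError):
        # Nothing is listening on this address: the proxy has not started yet,
        # or it is bound to a different address family.
        return "refused: proxy not yet ready or bound to another address"
    if isinstance(exc, socket.timeout):
        return "unreachable: connection timed out"
    return f"unreachable: {exc}"


def wait_for_proxy(host="127.0.0.1", port=18443, attempts=10, sleep=time.sleep):
    """Probe an explicit IPv4 address, retrying with exponential backoff."""
    delays = backoff_delays(attempts)
    for attempt, delay in enumerate(delays, start=1):
        try:
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError as exc:
            print(f"attempt {attempt}/{attempts}: {describe_failure(exc)}")
        if attempt < attempts:
            sleep(delay)  # no sleep after the final attempt
    return False


if __name__ == "__main__":
    raise SystemExit(0 if wait_for_proxy() else 1)
```

Passing the literal 127.0.0.1 bypasses name resolution entirely, so the probe cannot be steered to [::1] by the host's resolver ordering. The `sleep` parameter exists only so the loop can be exercised without real delays.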

Generated by Firewall Issue Dispatcher · 380.5 AIC · 27.8K
