
[awf] copilot-harness: Intermittent 400 "model not supported" for claude-opus-4.6 due to upstream model catalogue inconsistency #3902

Description

@lpcox

Problem

20–25% of scheduled runs targeting claude-opus-4.6 fail within ~4 seconds with a 400 "The requested model is not supported" error, while the identical workflow succeeds minutes later. The model catalogue returned by the Copilot API varies between requests (30 vs 39 models), causing intermittent entitlement mismatches.

Context

Upstream issue: github/gh-aw#35075

The awf-reflect step observes "fetched N model(s)" counts that differ from run to run. The harness receives the 400 and does not retry (log message: "model not supported — not retrying").

Root Cause

The upstream Copilot model-catalogue/entitlement API is non-deterministic: the set of models returned varies across requests for the same identity. This is not a gh-aw-firewall bug, but the firewall and harness surface it.
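
One way to confirm the inconsistency is to diff catalogue snapshots captured on different runs. The following is a minimal Python sketch, assuming each snapshot is available as a list of model ID strings. The snapshot format and the helper name `diff_catalogues` are hypothetical and not part of the harness.

```python
def diff_catalogues(first, second, target):
    """Compare two model-catalogue snapshots (iterables of model ID strings).

    Hypothetical helper: reports which models appear in only one snapshot
    and whether the target model is present in each.
    """
    a, b = set(first), set(second)
    return {
        "only_in_first": sorted(a - b),
        "only_in_second": sorted(b - a),
        "target_in_first": target in a,
        "target_in_second": target in b,
    }
```

If the target model appears in one snapshot but not the other for the same identity, that supports the transient-entitlement explanation rather than a configuration error in the workflow.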

Proposed Solution

  1. Route this issue to the Copilot platform team to stabilise model-catalogue API responses.
  2. As a firewall-side mitigation, add a limited retry (2–3 attempts with backoff) in the harness for 400 "model not supported" responses before surfacing them as a terminal failure, since the condition is transient.
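
The retry in item 2 could look roughly like the sketch below. It is written in Python for illustration only and is not the harness's actual code: `ModelNotSupportedError`, `call_with_retry`, and the `send` callable are hypothetical names, and the default of 3 attempts with a 1-second base delay is an assumption within the 2–3 attempt range proposed above.

```python
import time


class ModelNotSupportedError(Exception):
    """Hypothetical stand-in for a 400 'model not supported' response."""


def call_with_retry(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send(); retry on ModelNotSupportedError with exponential backoff.

    Re-raises the last error once max_attempts is exhausted, so a
    persistent entitlement problem still surfaces as a terminal failure.
    Any other exception propagates immediately without a retry.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except ModelNotSupportedError:
            if attempt == max_attempts:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Only the specific "model not supported" condition is retried here. Other 400 responses would remain terminal, which keeps genuine request errors from being masked. The `sleep` parameter is injectable so the backoff can be exercised in tests without real delays.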

Generated by Firewall Issue Dispatcher · sonnet46 1.6M ·
