api-proxy: fallback unavailable gpt-5.x requests to highest available family model #3602
@copilot Update the PR to include more comprehensive logging for model aliasing events. Put them in `token-diag.log` and update the specs and schema to reflect their structure.
Implemented in 81320f3. Model aliasing now emits structured diagnostics to |
✅ Coverage Check Passed
Overall Coverage
📁 Per-file Coverage Changes (1 files)
Coverage comparison generated by |
Smoke Test Results
PR #3596: Avoid
Status: FAIL
|
Smoke Test Results: Claude Engine Validation
✅ GitHub API: 2 PR entries confirmed
Overall: PASS ✅
|
Smoke Test Results: PASS ✅
PR #3602: api-proxy: fallback unavailable
Note: Running in BYOK offline mode (
|
API Proxy OTEL Tracing Validation Results
✅ Scenario 1: Module Loading
Status: PASS
✅ Scenario 2: Test Suite
Status: PASS (33/33 tests)
✅ Scenario 3: Env Var Forwarding
Status: PASS
✅ Scenario 4: Token Tracker Integration
Status: PASS
✅ Scenario 5: OTEL Diagnostics
Status: EXPECTED (no export when OTLP_ENDPOINT unset)
Overall Result: ✅ All scenarios PASS
The api-proxy OTEL integration is production-ready:
Warning: Firewall blocked 3 domains
The following domains were blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "127.0.0.1"
    - "api.example.com"
    - "api.openai.com"
See Network Configuration for more information.
|
Smoke Test Results
Overall Status: FAIL
Warning: Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "localhost"
See Network Configuration for more information.
|
Pull request overview
This PR improves AWF’s API proxy model resolution by adding an in-proxy fallback for unavailable gpt-5.<minor> requests (e.g. gpt-5.5), selecting the highest available gpt-5.* model from the provider’s cached model list to avoid upstream 400s and retry loops. It also adds structured token diagnostics logging for model alias resolution/rewrite steps and corresponding test coverage.
Changes:
- Extend `resolveModel()` / `rewriteModelInBody()` to fall back unavailable `gpt-5.<minor>` requests to the highest available `gpt-5.*` model for the provider.
- Emit structured JSONL diagnostics to `token-diag.log` (when `AWF_DEBUG_TOKENS=1`) for model resolution steps and rewrites.
- Add targeted unit tests for the new fallback behavior and for structured diagnostics logging/schema validation.
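The fallback in the first bullet can be illustrated with a minimal sketch. This is not the PR's code: `pickGpt5Fallback` and `gpt5Minor` are hypothetical names, and the sketch assumes only bare `gpt-5` and `gpt-5.<minor>` IDs count as candidates, whereas the real resolver reuses its existing version ordering and may rank suffixed variants too.

```javascript
// Hypothetical sketch; names and candidate rules are assumptions, not the PR's implementation.

// Matches "gpt-5.<minor>" requests such as "gpt-5.5".
const GPT5_MINOR = /^gpt-5\.(\d+)$/;

// Returns the numeric minor version for "gpt-5" / "gpt-5.<minor>", or null for any other ID.
function gpt5Minor(id) {
  if (id === "gpt-5") return 0;
  const match = GPT5_MINOR.exec(id);
  return match ? Number(match[1]) : null;
}

// Picks the highest available gpt-5 family model when the requested minor is unavailable.
// Returns null when no fallback applies (not a gpt-5.<minor> request, already available,
// or no family candidate in the list).
function pickGpt5Fallback(requested, availableModels) {
  if (!GPT5_MINOR.test(requested)) return null;
  if (availableModels.includes(requested)) return null;
  const candidates = availableModels
    .map((id) => ({ id, minor: gpt5Minor(id) }))
    .filter((candidate) => candidate.minor !== null)
    // Highest minor first; the ID tie-break keeps the result deterministic.
    .sort((a, b) => b.minor - a.minor || a.id.localeCompare(b.id));
  return candidates.length > 0 ? candidates[0].id : null;
}

// Usage: the list stands in for the provider's cached model list.
const fallback = pickGpt5Fallback("gpt-5.5", ["gpt-4o", "gpt-5.2", "gpt-5.4", "gpt-5"]);
console.log(fallback); // → "gpt-5.4"
```

Returning `null` for "no fallback" lets a caller keep its existing pass-through behavior for IDs outside the `gpt-5` family.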
Show a summary per file
| File | Description |
|---|---|
| containers/api-proxy/model-resolver.js | Adds gpt-5.<minor> unavailable fallback to highest available gpt-5.* model using existing version ordering. |
| containers/api-proxy/model-resolver.test.js | Adds resolver and body-rewrite tests covering the new gpt-5.<minor> fallback behavior. |
| containers/api-proxy/server.js | Writes structured diagnostics for model resolution/rewrite via token-persistence.diag(). |
| containers/api-proxy/server.models.test.js | Adds test verifying structured token-diag.log records are emitted for model alias rewrites. |
| containers/api-proxy/token-persistence.js | Introduces token-diag schema helpers (record builder + validator) and switches diag logging to structured JSONL records. |
| containers/api-proxy/token-tracker.schema.test.js | Adds schema-helper tests for token-diag record building and validation. |
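The schema helpers added to `token-persistence.js` are not shown in this thread, so the following is only a sketch of what a record builder and validator for JSONL diagnostics might look like. The field names (`ts`, `event`, `provider`, `requested`, `resolved`) and function names are assumptions for illustration. Only the `AWF_DEBUG_TOKENS=1` gate comes from the overview above.

```javascript
// Hypothetical record shape; the PR's actual token-diag schema is not shown in this thread.
const REQUIRED_FIELDS = ["ts", "event", "provider", "requested", "resolved"];

// Diagnostics are only written when AWF_DEBUG_TOKENS=1 (per the PR overview).
function diagEnabled(env) {
  return env.AWF_DEBUG_TOKENS === "1";
}

// Builds one structured record; `now` is injectable so tests can pin the timestamp.
function buildDiagRecord(event, fields, now = new Date()) {
  return { ts: now.toISOString(), event, ...fields };
}

// Returns a list of validation errors; an empty list means the record is valid.
function validateDiagRecord(record) {
  const errors = [];
  for (const key of REQUIRED_FIELDS) {
    if (typeof record[key] !== "string" || record[key].length === 0) {
      errors.push(`missing or non-string field: ${key}`);
    }
  }
  return errors;
}

// JSONL means one JSON object per line, so each record ends with a newline.
function toJsonl(record) {
  return JSON.stringify(record) + "\n";
}

// Usage: build, validate, and serialize one fallback event.
const record = buildDiagRecord("model_fallback", {
  provider: "openai",
  requested: "gpt-5.5",
  resolved: "gpt-5.4",
});
if (diagEnabled({ AWF_DEBUG_TOKENS: "1" }) && validateDiagRecord(record).length === 0) {
  process.stdout.write(toJsonl(record));
}
```

Keeping the builder and validator as pure functions makes them easy to cover in a schema test without touching the log file.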
Copilot's findings
- Files reviewed: 6/6 changed files
- Comments generated: 0
Smoke: FAIL
PRs: Stabilize test-coverage-reporter by isolating main-action unit tests from DinD probing; Remove unused
Warning: Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "registry.npmjs.org"
See Network Configuration for more information.
|
Chroot Runtime Version Test Results
The chroot environment was tested to verify runtime versions match the host:
Overall Result: ❌ FAILED
The chroot environment has different Python and Node.js versions than the host. Only Go versions match. This indicates the bind mounts are correctly exposing host binaries, but the versions differ due to the base Ubuntu 22.04 image having older runtime versions than the GitHub Actions runner.
|
Smoke Test: Services Connectivity — ❌ FAIL
Result: FAIL — AWF sandbox cannot reach GitHub Actions service containers
|
🏗️ Build Test Suite Results
Overall: 8/8 ecosystems passed: ✅ PASS
All build and test operations completed successfully across all 18 projects in 8 different programming language ecosystems.
|
The proxy previously forwarded unknown model IDs unchanged when no alias matched, causing avoidable upstream 400s (e.g. `gpt-5.5` on `/chat/completions`) and retry loops. This change adds in-proxy fallback selection for unavailable `gpt-5.<minor>` requests using the already-cached provider model list.

Model resolution behavior
- In `resolveModel()`, after alias and direct-match checks fail, the resolver now detects `gpt-5.<minor>` requests and selects the highest available `gpt-5.*` model for the current provider.
- Selection uses the existing version ordering (`compareByVersion`) to keep it deterministic and consistent with alias candidate ranking.
- The fallback is logged, e.g. `requested model "gpt-5.5" not available, falling back to "gpt-5.4"`.

Body rewrite path
- `rewriteModelInBody()` now rewrites unavailable requests like `gpt-5.5` to the selected fallback (`gpt-5.4`) instead of returning `null` and letting the invalid model pass through upstream.

Focused coverage
- Tests cover the `gpt-5` minor fallback.
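
The body rewrite path described above could look roughly like the sketch below. It is not the PR's `rewriteModelInBody()`: the function name, the `resolve` callback, and the rule that `null` means "forward the body unchanged" are assumptions drawn from the description.

```javascript
// Hypothetical sketch of a body-rewrite helper; not the PR's rewriteModelInBody().
// `resolve` is an assumed callback: it returns a replacement model ID, or null for none.
function rewriteModelInBodySketch(rawBody, resolve) {
  let parsed;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return null; // not JSON: leave the body untouched
  }
  if (parsed === null || typeof parsed !== "object" || typeof parsed.model !== "string") {
    return null; // no model field to rewrite
  }
  const resolved = resolve(parsed.model);
  if (!resolved || resolved === parsed.model) {
    return null; // nothing to change, caller forwards the original body
  }
  // Copy the body and swap only the model field.
  return JSON.stringify({ ...parsed, model: resolved });
}

// Usage: a stub resolver that maps the unavailable gpt-5.5 to gpt-5.4.
const stubResolve = (model) => (model === "gpt-5.5" ? "gpt-5.4" : null);
const rewritten = rewriteModelInBodySketch('{"model":"gpt-5.5","messages":[]}', stubResolve);
console.log(rewritten); // → {"model":"gpt-5.4","messages":[]}
```

Because the helper only returns a string when something changed, the proxy can recompute `Content-Length` for rewritten bodies and skip that work otherwise.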