
fix(api-proxy): add embedding model pricing to resolve unknown model rejection #4936

Merged
lpcox merged 3 commits into main from copilot/aw-refactor-opportunity-scanner on Jun 14, 2026

Conversation

Copilot AI commented Jun 14, 2026


The Refactoring Opportunity Scanner (and any workflow with maxAiCredits active) was hitting GH_AW_UNKNOWN_MODEL_AI_CREDITS because the Copilot CLI makes embedding model requests (text-embedding-3-small, text-embedding-ada-002) that were absent from the api-proxy pricing table. With no defaultAiCreditsPricing fallback, every such request returned HTTP 400, setting both error flags even though the main agent work completed successfully.
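The failure mode described above can be sketched roughly as follows. This is an illustration only, not the api-proxy's actual guard: the function name `checkModelPricing`, the placeholder model `some-chat-model`, and the shape of the returned object are all hypothetical. Only the embedding model names, the `maxAiCredits` and `defaultAiCreditsPricing` options, the HTTP 400 status, and the `GH_AW_UNKNOWN_MODEL_AI_CREDITS` code come from this PR.

```javascript
// Hypothetical sketch of the rejection path; not the real ai-credits-guard code.
const KNOWN_PRICING = {
  // Placeholder entry standing in for models the table already covered.
  'some-chat-model': { input: 1.0, cachedInput: 0, cacheWrite: 0, output: 2.0 },
};

function checkModelPricing(model, options = {}, table = KNOWN_PRICING) {
  const { maxAiCredits, defaultAiCreditsPricing } = options;

  // Without a credit cap there is nothing to enforce, so never reject.
  if (maxAiCredits == null) {
    return { allowed: true, pricing: table[model] ?? null };
  }

  // With a cap, the request needs a price: table entry first, then the fallback.
  const pricing = table[model] ?? defaultAiCreditsPricing ?? null;
  if (pricing === null) {
    return { allowed: false, status: 400, error: 'GH_AW_UNKNOWN_MODEL_AI_CREDITS' };
  }
  return { allowed: true, pricing };
}

// Before this PR: the embedding model is missing from the table.
const before = checkModelPricing('text-embedding-3-small', { maxAiCredits: 100 });

// After this PR: the same request resolves because the table has an entry.
const after = checkModelPricing('text-embedding-3-small', { maxAiCredits: 100 }, {
  ...KNOWN_PRICING,
  'text-embedding-3-small': { input: 0.02, cachedInput: null, cacheWrite: null, output: 0.00 },
});

console.log(before.status, after.allowed); // 400 true
```

Under these assumptions, adding table entries fixes the rejection without configuring a default fallback, so models that are genuinely unknown are still rejected.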

Changes

  • containers/api-proxy/ai-credits-pricing.js — adds two embedding model entries:

    'text-embedding-3-small': { input: 0.02, cachedInput: null, cacheWrite: null, output: 0.00 },
    'text-embedding-ada-002': { input: 0.10, cachedInput: null, cacheWrite: null, output: 0.00 },

    text-embedding-3-small-inference is covered by the existing prefix-match logic.

  • containers/api-proxy/guards/ai-credits-guard.test.js — adds four tests: credit calculation for each embedding model, prefix-match resolution for text-embedding-3-small-inference, and non-rejection of all three when maxAiCredits is active.
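The existing prefix-match logic is not shown in this PR. One way it could resolve `text-embedding-3-small-inference` is sketched below, assuming an exact lookup first and then the longest table key that the requested model name starts with. The function name `resolvePricing` and the longest-prefix rule are assumptions for illustration; the two entries are copied from the description above.

```javascript
// Entries as listed in this PR's description.
const EMBEDDING_PRICING = {
  'text-embedding-3-small': { input: 0.02, cachedInput: null, cacheWrite: null, output: 0.00 },
  'text-embedding-ada-002': { input: 0.10, cachedInput: null, cacheWrite: null, output: 0.00 },
};

// Hypothetical resolver: exact match first, then the longest matching prefix.
function resolvePricing(model, table = EMBEDDING_PRICING) {
  if (Object.hasOwn(table, model)) {
    return table[model];
  }
  const longestPrefix = Object.keys(table)
    .filter((key) => model.startsWith(key))
    .sort((a, b) => b.length - a.length)[0];
  return longestPrefix ? table[longestPrefix] : null;
}

console.log(resolvePricing('text-embedding-3-small-inference').input); // 0.02
console.log(resolvePricing('not-a-listed-model'));                     // null
```

Preferring the longest prefix keeps a more specific entry from being shadowed by a shorter one if both are ever present in the table.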

Embedding models (text-embedding-3-small, text-embedding-ada-002) were
missing from the api-proxy AI credits pricing table. With maxAiCredits
active and no defaultAiCreditsPricing fallback, these requests were
rejected with HTTP 400, triggering the GH_AW_UNKNOWN_MODEL_AI_CREDITS
flag that caused the Refactoring Opportunity Scanner to file issue #4935.

Add pricing entries:
- text-embedding-3-small: $0.02/M input, $0.00 output (no output tokens)
- text-embedding-ada-002: $0.10/M input, $0.00 output
- text-embedding-3-small-inference resolves via prefix-match

Closes #4935
Copilot AI changed the title from "[WIP] Fix AI credits rate limit issue in refactoring scanner" to "fix(api-proxy): add embedding model pricing to resolve unknown model rejection" on Jun 14, 2026
Copilot finished work on behalf of lpcox June 14, 2026 17:30
Copilot AI requested a review from lpcox June 14, 2026 17:30
@lpcox lpcox marked this pull request as ready for review June 14, 2026 17:32
Copilot AI review requested due to automatic review settings June 14, 2026 17:32
@github-actions


✅ Coverage Check Passed

Overall Coverage

| Metric | Base | PR | Delta |
|---|---|---|---|
| Lines | 96.60% | 96.88% | 📈 +0.28% |
| Statements | 96.47% | 96.75% | 📈 +0.28% |
| Functions | 98.80% | 98.80% | ➡️ +0.00% |
| Branches | 91.18% | 91.28% | 📈 +0.10% |

📁 Per-file Coverage Changes (2 files)
| File | Lines (Before → After) | Statements (Before → After) |
|---|---|---|
| src/workdir-setup.ts | 92.6% → 94.4% (+1.85%) | 92.6% → 94.4% (+1.85%) |
| src/artifact-preservation.ts | 85.4% → 100.0% (+14.64%) | 85.4% → 100.0% (+14.64%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

Copilot AI left a comment


Pull request overview

This PR updates the api-proxy’s AI credits pricing so embedding requests made by the Copilot CLI no longer trigger GH_AW_UNKNOWN_MODEL_AI_CREDITS (and HTTP 400s) when maxAiCredits is enabled and no default fallback pricing is configured.

Changes:

  • Add curated pricing entries for text-embedding-3-small and text-embedding-ada-002.
  • Add Jest coverage to confirm AI credits calculation and non-rejection behavior for these embedding models (including prefix match for text-embedding-3-small-inference).
Show a summary per file
| File | Description |
|---|---|
| containers/api-proxy/ai-credits-pricing.js | Adds explicit pricing entries for embedding models so they resolve as “known” and don’t trigger unknown-model rejection. |
| containers/api-proxy/guards/ai-credits-guard.test.js | Adds tests covering embedding model pricing resolution and behavior under AWF_MAX_AI_CREDITS. |

Copilot's findings


  • Files reviewed: 2/2 changed files
  • Comments generated: 2

Comment on lines +31 to +33
    // Embedding models (output tokens are not produced; output cost is 0)
    'text-embedding-3-small': { input: 0.02, cachedInput: null, cacheWrite: null, output: 0.00 },
    'text-embedding-ada-002': { input: 0.10, cachedInput: null, cacheWrite: null, output: 0.00 },

Comment on lines +124 to +130
    it('resolves text-embedding-3-small with input-only pricing', () => {
      const usage = applyAiCreditsUsage({ input_tokens: 1_000_000, output_tokens: 0 }, 'text-embedding-3-small');
      // input: 1M × $0.02/M / 10_000 denominator = 2.0 AIC
      expect(usage).not.toBeNull();
      expect(usage.aiCreditsThisResponse).toBeCloseTo(2.0, 5);
      expect(usage.outputCreditsThisResponse).toBe(0);
    });
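
The arithmetic in the test comment above can be worked through in isolation. The sketch below assumes credits are computed as token count times the per-million price, divided by the 10_000 denominator the comment mentions. The helper names `tokenCredits` and `embeddingCredits` are hypothetical, and the real `applyAiCreditsUsage` may structure this differently.

```javascript
// Denominator taken from the test comment in this PR.
const CREDIT_DENOMINATOR = 10_000;

// Hypothetical helper: credits for one token class at a per-million price.
function tokenCredits(tokens, pricePerMillion) {
  return (tokens * pricePerMillion) / CREDIT_DENOMINATOR;
}

// Hypothetical helper: input and output credits for one response.
function embeddingCredits(usage, pricing) {
  const input = tokenCredits(usage.input_tokens, pricing.input);
  const output = tokenCredits(usage.output_tokens, pricing.output);
  return { input, output, total: input + output };
}

const small = embeddingCredits(
  { input_tokens: 1_000_000, output_tokens: 0 },
  { input: 0.02, output: 0.00 },
);
const ada = embeddingCredits(
  { input_tokens: 1_000_000, output_tokens: 0 },
  { input: 0.10, output: 0.00 },
);

// Roughly 2 and 10 credits per million input tokens respectively.
console.log(small.total.toFixed(2), ada.total.toFixed(2));
```

Comparing with a tolerance, as the test does with `toBeCloseTo`, avoids depending on exact floating-point results for prices such as 0.02.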
…hen tests

- Replace cachedInput: null with cachedInput: 0 for consistency with
  all other pricing entries (avoids null coercion fragility)
- Use non-zero output_tokens in embedding tests to verify that
  output: 0.00 pricing is actually enforced

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
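
Both follow-up changes in the commit message above can be illustrated with plain JavaScript. The helper `cachedPrice` and the price of 5 are hypothetical and are not part of the api-proxy. They only show how `null` and `0` can diverge depending on the operator, and why a test with zero output tokens cannot tell an output price of 0 apart from any other price.

```javascript
const withNull = { input: 0.02, cachedInput: null, output: 0.00 };
const withZero = { input: 0.02, cachedInput: 0, output: 0.00 };

// Hypothetical helper: falls back to the input price when no cached price is set.
const cachedPrice = (pricing) => pricing.cachedInput ?? pricing.input;

console.log(500 * withNull.cachedInput);  // 0 (null coerces to 0 in multiplication)
console.log(cachedPrice(withNull));       // 0.02 (?? treats null as missing)
console.log(cachedPrice(withZero));       // 0
console.log(typeof withNull.cachedInput); // "object"

// With zero output tokens, any output price produces zero output credits.
const outputCredits = (tokens, pricePerMillion) => (tokens * pricePerMillion) / 10_000;
const zeroTokensWrongPrice = outputCredits(0, 5);   // 0, so a wrong price goes unnoticed
const someTokensWrongPrice = outputCredits(500, 5); // non-zero, so a wrong price is caught
const someTokensZeroPrice = outputCredits(500, 0);  // 0, which is what output: 0.00 should give
console.log(zeroTokensWrongPrice, someTokensWrongPrice, someTokensZeroPrice); // 0 0.25 0
```

Using `0` throughout means every code path sees the same value regardless of whether it multiplies, compares, or falls back.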
@github-actions


🏗️ Build Test Suite Results

Ecosystem Project Build/Install Tests Status
Bun elysia 1/1 passed ✅ PASS
Bun hono 1/1 passed ✅ PASS
C++ fmt N/A ✅ PASS
C++ json N/A ✅ PASS
Deno oak N/A 1/1 passed ✅ PASS
Deno std N/A 1/1 passed ✅ PASS
.NET hello-world N/A ✅ PASS
.NET json-parse N/A ✅ PASS
Go color ok ✅ PASS
Go env ok ✅ PASS
Go uuid ok ✅ PASS
Java gson 1/1 passed ✅ PASS
Java caffeine 1/1 passed ✅ PASS
Node.js clsx All passed ✅ PASS
Node.js execa All passed ✅ PASS
Node.js p-limit All passed ✅ PASS
Rust fd 1/1 passed ✅ PASS
Rust zoxide 1/1 passed ✅ PASS

Overall: 8/8 ecosystems passed — ✅ PASS

Generated by Build Test Suite for issue #4936

@github-actions


Smoke Test: Copilot BYOK (Direct Mode) ✅ PASS

  • ✅ GitHub MCP connectivity (listed 2 closed PRs)
  • ✅ GitHub.com reachable (MCP success)
  • ✅ File write/read (agent I/O working)
  • ✅ BYOK inference path active (responding via api-proxy sidecar → api.githubcopilot.com)

Mode: Direct BYOK (COPILOT_PROVIDER_API_KEY via api-proxy sidecar, dummy key in agent)

🔑 BYOK report filed by Smoke Copilot BYOK

@github-actions


🔥 Smoke Test Results — PASS

Test Status
GitHub MCP connectivity
GitHub.com HTTP ✅ 200
File write/read

PR: fix(api-proxy): add embedding model pricing to resolve unknown model rejection
Author: @Copilot | Assignees: @lpcox @Copilot

Overall: PASS

📰 BREAKING: Report filed by Smoke Copilot

@github-actions


PR titles:

  • Refactor OpenAI BYOK base URL parsing to reuse shared proxy URL normalization
  • refactor(api-proxy): split proxy-request.js into http-client.js and body-handler.js

Checks:

  • Merged PR review: ✅
  • Safe inputs GH query: ✅
  • Playwright GitHub title: ✅
  • File write/read: ✅
  • Discussion lookup/comment: ✅
  • Build: ✅

Overall: PASS

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • registry.npmjs.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "registry.npmjs.org"

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex

@github-actions


🔍 Smoke Test: API Proxy OpenTelemetry Tracing

Scenario Result Notes
1. Module Loading otel.js loads; exports startRequestSpan, setTokenAttributes, setBudgetAttributes, endSpan, endSpanError, shutdown, isEnabled
2. Test Suite 59 tests passed, 0 failures (2 suites: otel.test.js, otel-fanout.test.js)
3. Env Var Forwarding src/services/api-proxy-service-config.ts forwards GH_AW_OTLP_ENDPOINTS, OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_HEADERS, GITHUB_AW_OTEL_TRACE_ID, GITHUB_AW_OTEL_PARENT_SPAN_ID, OTEL_SERVICE_NAME to the api-proxy container
4. Token Tracker Integration onUsage callback present in token-tracker-http.js (lines 64, 256–258) as OTEL hook point
5. OTEL Diagnostics ⚠️ No spans exported — container not running in this environment (DinD not supported); local fallback path /var/log/api-proxy/otel.jsonl confirmed in otel.js

All scenarios pass or are expected-pending. No unexpected failures found.

📡 OTel tracing validated by Smoke OTel Tracing

@github-actions


Smoke test: PAT auth — PASS ✅

Test Result
GitHub MCP connectivity
GitHub.com HTTP ✅ 200
File write/read

Auth mode: PAT (COPILOT_GITHUB_TOKEN) · @lpcox @Copilot

🔑 PAT report filed by Smoke Copilot PAT

@github-actions


Chroot Version Comparison Results

| Runtime | Host Version | Chroot Version | Match? |
|---|---|---|---|
| Python | Python 3.12.13 | Python 3.12.3 | ❌ |
| Node.js | v24.16.0 | v22.22.3 | ❌ |
| Go | go1.22.12 | go1.22.12 | ✅ |

Overall: ❌ Not all versions match — Go matches, but Python and Node.js differ between host and chroot environments.

Tested by Smoke Chroot

@github-actions


Smoke Test: GitHub Actions Services Connectivity

| Check | Result |
|---|---|
| Redis PING | ❌ timeout/no response |
| PostgreSQL pg_isready | ❌ no response |
| PostgreSQL SELECT 1 | ❌ timeout/no response |

host.docker.internal resolves to 172.17.0.1 but no services are reachable on ports 6379 or 5432.

Overall: FAIL

🔌 Service connectivity validated by Smoke Services

@github-actions


@lpcox

Smoke Test Results:

  • MCP connectivity: ✅
  • GitHub.com connectivity: ✅
  • File write/read: ✅
  • BYOK inference: ✅

Running in direct BYOK mode (AWF_AUTH_TYPE=github-oidc + AWF_AUTH_AZURE_* + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw) authenticated via Microsoft Entra

Overall: PASS

🪪 BYOK (AOAI Entra) report filed by Smoke Copilot BYOK AOAI (Entra)

@github-actions


Smoke Test Results

  • GitHub MCP Testing: ❌ (Tools missing)
  • GitHub.com Connectivity: ❌ (SSL Error 000)
  • File Writing Testing: ✅
  • Bash Tool Testing: ✅

Overall Status: FAIL

PR Titles:

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • localhost

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "localhost"

See Network Configuration for more information.

💎 Faceted by Smoke Gemini


3 participants