fix(api-proxy): stop double-counting cached tokens in AI credits by lpcox · Pull Request #4760 · github/gh-aw-firewall

lpcox · 2026-06-11T15:31:29Z

Problem

AI credits are significantly overcharged when prompt caching is active. Both Anthropic and OpenAI report input_tokens as the total input including cached tokens, but the credit calculation was charging:

The full input_tokens at the full input rate ($3.00/Mtok for claude-sonnet-4-6)
Plus cache_read_tokens at the cache rate ($0.30/Mtok)

This double-counts the cached portion — charging it at full price (included in input_tokens) and again at cache price.

Example (claude-sonnet-4-6, 3M total input, 2.9M from cache)

Calculation	Formula	Result
❌ Before (bug)	3.0M × $3/Mtok + 2.9M × $0.30/Mtok + output	~1017 AIC
✅ After (correct)	0.1M × $3/Mtok + 2.9M × $0.30/Mtok + output	~192 AIC

The cached tokens were being charged at $3.30/Mtok (full + cache rate) instead of just $0.30/Mtok.
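
The input-side arithmetic of this example can be checked with a short sketch. It works in dollars rather than AIC because the dollar-to-credit conversion is not shown in this description, and it leaves out the output-token term. The names `INPUT_RATE`, `CACHE_READ_RATE`, `before`, and `after` are illustrative and are not taken from the api-proxy source.

```javascript
// Rates quoted above for claude-sonnet-4-6, in dollars per million tokens.
const INPUT_RATE = 3.0;
const CACHE_READ_RATE = 0.3;

// Token counts from the example, in millions of tokens.
const totalInput = 3.0;
const cacheRead = 2.9;

// Before: every input token at the full rate, then cached tokens charged again.
const before = totalInput * INPUT_RATE + cacheRead * CACHE_READ_RATE;

// After: only the non-cached remainder is charged at the full rate.
const after = (totalInput - cacheRead) * INPUT_RATE + cacheRead * CACHE_READ_RATE;

console.log(before.toFixed(2)); // 9.87
console.log(after.toFixed(2));  // 1.17
```

The gap between the two results is the cached portion charged at the full input rate, which is the overcharge this PR removes.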

Fix

Subtract cache_read_tokens and cache_write_tokens from input_tokens before applying the full input rate:

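// Charge the full input rate only on tokens that were neither read from nor written to the cache.
const nonCachedInput = Math.max(0, totalInput - cacheReadTokens - cacheWriteTokens);
const inputCredits = (nonCachedInput * pricing.input) / CREDIT_DENOMINATOR;
// Cache reads and cache writes are each charged once, at their own rates.
const cachedInputCredits = (cacheReadTokens * pricing.cachedInput) / CREDIT_DENOMINATOR;
const cacheWriteCredits = (cacheWriteTokens * pricing.cacheWrite) / CREDIT_DENOMINATOR;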
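
For readers without the surrounding file, the snippet above can be wrapped into a self-contained sketch. The function name `computeInputCredits`, the shape of the `usage` and `pricing` objects, the `CREDIT_DENOMINATOR` value, and the cache-write rate are all assumptions made for illustration. The real definitions live in the api-proxy source and are not shown in this description.

```javascript
// Assumed scale factor: pricing is expressed per million tokens here, so the
// results below are in dollars. The real CREDIT_DENOMINATOR may differ.
const CREDIT_DENOMINATOR = 1_000_000;

// Hypothetical helper: returns the input-side charges for one request.
function computeInputCredits(usage, pricing) {
  const totalInput = usage.inputTokens ?? 0;
  const cacheReadTokens = usage.cacheReadTokens ?? 0;
  const cacheWriteTokens = usage.cacheWriteTokens ?? 0;

  // Clamp at zero in case the reported cache counts exceed the reported total.
  const nonCachedInput = Math.max(0, totalInput - cacheReadTokens - cacheWriteTokens);

  return {
    inputCredits: (nonCachedInput * pricing.input) / CREDIT_DENOMINATOR,
    cachedInputCredits: (cacheReadTokens * pricing.cachedInput) / CREDIT_DENOMINATOR,
    cacheWriteCredits: (cacheWriteTokens * pricing.cacheWrite) / CREDIT_DENOMINATOR,
  };
}

// Only the input and cached-input rates come from this PR; cacheWrite is assumed.
const pricing = { input: 3.0, cachedInput: 0.3, cacheWrite: 3.75 };
const credits = computeInputCredits(
  { inputTokens: 3_000_000, cacheReadTokens: 2_900_000 },
  pricing,
);
console.log(credits);
```

The `Math.max(0, ...)` guard matters when the subtraction would go negative: the non-cached charge then falls to zero instead of becoming a negative credit.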

Provider semantics confirmed

Anthropic: input_tokens includes cache_read_input_tokens + cache_creation_input_tokens (docs)
OpenAI: prompt_tokens includes prompt_tokens_details.cached_tokens (docs)
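
The two usage shapes listed above could be mapped onto common fields before the credit calculation runs. The sketch below is an assumption about how such a mapping might look, not the proxy's actual normalizer: `normalizeUsage` and its output field names are hypothetical, while the provider field names are the ones named in this section. It takes the semantics stated in this PR as given.

```javascript
// Hypothetical normalizer: maps a provider usage object onto common fields.
function normalizeUsage(provider, usage) {
  if (provider === "anthropic") {
    return {
      inputTokens: usage.input_tokens ?? 0,
      cacheReadTokens: usage.cache_read_input_tokens ?? 0,
      cacheWriteTokens: usage.cache_creation_input_tokens ?? 0,
    };
  }
  if (provider === "openai") {
    return {
      inputTokens: usage.prompt_tokens ?? 0,
      cacheReadTokens: usage.prompt_tokens_details?.cached_tokens ?? 0,
      // No cache-write field is listed for OpenAI above, so none is mapped.
      cacheWriteTokens: 0,
    };
  }
  throw new Error(`Unknown provider: ${provider}`);
}

// Example with made-up token counts.
const normalized = normalizeUsage("openai", {
  prompt_tokens: 1200,
  prompt_tokens_details: { cached_tokens: 1000 },
});
console.log(normalized);
```

Missing fields default to zero, so a response without cache details is charged entirely at the full input rate.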

Testing

Updated existing credit calculation tests with correct expected values
Added new test: 3M input / 2.9M cached scenario asserting < 250 AIC (was ~1017)
All 999 api-proxy tests pass

Both Anthropic and OpenAI report input_tokens as the TOTAL input including cache_read and cache_creation tokens. The AI credits calculation was charging the full input_tokens at the full input rate, then ALSO charging cache_read_tokens at the cache rate, effectively double-counting cached tokens.

Example (claude-sonnet-4-6, 3M input, 2.9M cached):
Before: 3M × $3/Mtok + 2.9M × $0.30/Mtok ≈ 1017 AIC
After: 0.1M × $3/Mtok + 2.9M × $0.30/Mtok ≈ 192 AIC (correct)

The fix subtracts cache_read_tokens and cache_write_tokens from input_tokens before applying the full input rate, since those portions are already accounted for at their respective discounted rates.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-06-11T15:33:06Z

✅ Coverage Check Passed

Overall Coverage

Metric	Base	PR	Delta
Lines	96.42%	96.46%	📈 +0.04%
Statements	96.34%	96.38%	📈 +0.04%
Functions	98.77%	98.77%	➡️ +0.00%
Branches	90.74%	90.78%	📈 +0.04%

📁 Per-file Coverage Changes (1 files)

File	Lines (Before → After)	Statements (Before → After)
`src/config-writer.ts`	89.3% → 90.9% (+1.65%)	89.3% → 90.9% (+1.65%)

Coverage comparison generated by scripts/ci/compare-coverage.ts

github-actions · 2026-06-11T15:36:35Z

GitHub API: ✅ PASS
GitHub check: ✅ PASS
File verify: ✅ PASS

Total: PASS

💥 [THE END] — Illustrated by Smoke Claude

github-actions · 2026-06-11T15:37:02Z

🔥 Smoke Test: Copilot PAT Auth — FAIL

Test	Result
GitHub MCP connectivity	✅ PASS
GitHub.com HTTP connectivity	❌ Pre-step data unavailable (`${{ steps.smoke-data.outputs.SMOKE_HTTP_CODE }}` not substituted)
File write/read	❌ Pre-step data unavailable (path template not substituted)

Overall: FAIL — pre-step outputs were not evaluated; tests 2 & 3 could not be verified.

PR: fix(api-proxy): stop double-counting cached tokens in AI credits · Author: @lpcox · Reviewer: @Copilot

Auth mode: PAT (COPILOT_GITHUB_TOKEN)

🔑 PAT report filed by Smoke Copilot PAT

github-actions · 2026-06-11T15:37:12Z

Smoke Test: Copilot BYOK (Direct) Mode ✅

Test	Result
GitHub MCP connectivity	✅
GitHub.com HTTP	✅ (HTTP 200)
File write/read	✅
BYOK inference path	✅ (agent → api-proxy → api.githubcopilot.com)

Mode: Direct BYOK (COPILOT_PROVIDER_API_KEY via api-proxy sidecar)
Status: PASS — all tests passed

Author: @lpcox | Reviewers: @Copilot

🔑 BYOK report filed by Smoke Copilot BYOK

github-actions · 2026-06-11T15:37:33Z

@lpcox

GitHub MCP connectivity: ✅
GitHub.com connectivity (HTTP 200/301): ✅
File write/read in sandbox: ✅
Running in direct BYOK mode via api-proxy → Azure OpenAI (Foundry, o4-mini-aw): ✅

Overall: PASS

🔑 BYOK (AOAI api-key) report filed by Smoke Copilot BYOK AOAI (api-key)

github-actions · 2026-06-11T15:37:45Z

@lpcox

GitHub MCP connectivity: ❌
GitHub.com connectivity: ✅
File write/read test: ❌
BYOK inference path: ✅

Running in direct BYOK mode (AWF_AUTH_TYPE=github-oidc + AWF_AUTH_AZURE_* + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw) authenticated via Microsoft Entra

Overall: FAIL

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

api.openai.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "api.openai.com"

See Network Configuration for more information.

🪪 BYOK (AOAI Entra) report filed by Smoke Copilot BYOK AOAI (Entra)

github-actions · 2026-06-11T15:37:47Z

Reviewed PRs:

fix(api-proxy): use 'token' auth prefix for GHES enterprise Copilot API #4755
fix: update workflow test SHA assertions after recompile #4720

Checks:

GitHub title check: ✅
Temp file write/read: ✅
Discussion lookup/comment: ✅
Build (npm ci && npm run build): ✅

Overall status: PASS

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

registry.npmjs.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "registry.npmjs.org"

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex

github-actions · 2026-06-11T15:38:30Z

🔭 Smoke Test: API Proxy OpenTelemetry Tracing

Scenario	Result	Details
1. Module Loading	✅	`otel.js` loads cleanly; exports: `startRequestSpan`, `setTokenAttributes`, `setBudgetAttributes`, `endSpan`, `endSpanError`, `shutdown`, `isEnabled`
2. Test Suite	✅	39/39 tests pass across span creation, token attributes, parent context propagation, OTLP export, file export, and shutdown
3. Env Var Forwarding	✅	`api-proxy-service-config.ts` forwards `OTEL_EXPORTER_OTLP_ENDPOINT`, `OTEL_EXPORTER_OTLP_HEADERS`, `GITHUB_AW_OTEL_TRACE_ID`, `GITHUB_AW_OTEL_PARENT_SPAN_ID`, `OTEL_SERVICE_NAME`
4. Token Tracker Integration	✅	`onUsage` callback present in `token-tracker-http.js` (lines 62/66/269); invoked after normalized usage extraction
5. OTEL Diagnostics	✅	No `OTEL_EXPORTER_OTLP_ENDPOINT` set → graceful degradation to local file fallback (`/var/log/api-proxy/otel.jsonl`); `isEnabled()` returns `true`

All scenarios pass. OTEL tracing integration is functioning correctly on this PR.

📡 OTel tracing validated by Smoke OTel Tracing

github-actions · 2026-06-11T15:38:46Z

🔍 Chroot Version Comparison

Runtime	Host Version	Chroot Version	Match?
Python	Python 3.12.13	Python 3.12.3	❌
Node.js	v24.16.0	v22.22.3	❌
Go	go1.22.12	go1.22.12	✅

Result: Not all tests passed. Python and Node.js versions differ between host and chroot environments.

Tested by Smoke Chroot

github-actions · 2026-06-11T15:40:25Z

Smoke Test Results

Check	Result
Redis PING	❌ No response
PostgreSQL pg_isready	❌ No response
PostgreSQL SELECT 1	❌ No response

Overall: FAIL — host.docker.internal is unreachable from this runner. Service containers are not accessible.

🔌 Service connectivity validated by Smoke Services

github-actions · 2026-06-11T16:03:31Z

🤖 Smoke Test: Copilot Engine — PASS

Test	Result
GitHub MCP	✅ PR listed: fix(api-proxy): stop double-counting cached tokens in AI credits
GitHub.com connectivity	✅ HTTP 200
File write/read	✅ `smoke-test-copilot-27359745034.txt` verified

Overall: PASS — @lpcox (no assignees)

📰 BREAKING: Report filed by Smoke Copilot

Copilot AI review requested due to automatic review settings June 11, 2026 15:31

Copilot started reviewing on behalf of lpcox June 11, 2026 15:31 View session

lpcox temporarily deployed to aoai-model June 11, 2026 15:33 — with GitHub Actions Inactive

github-actions Bot added the smoke-claude label Jun 11, 2026

github-actions Bot added the smoke-copilot-byok label Jun 11, 2026

lpcox temporarily deployed to aoai-model June 11, 2026 15:37 — with GitHub Actions Inactive

github-actions Bot mentioned this pull request Jun 11, 2026

[aw] No-Op Runs #4045

Closed

github-actions Bot added the smoke-copilot-byok-aoai-apikey label Jun 11, 2026

github-actions Bot added smoke-copilot-byok-aoai-entra smoke-codex labels Jun 11, 2026

github-actions Bot mentioned this pull request Jun 11, 2026

[aw] Contribution Check failed #4735

Closed

lpcox temporarily deployed to aoai-model June 11, 2026 15:37 — with GitHub Actions Inactive

github-actions Bot added the smoke-copilot label Jun 11, 2026

lpcox merged commit ab519be into main Jun 11, 2026
84 of 85 checks passed

lpcox deleted the fix-ai-credits-cache-double-count branch June 11, 2026 15:49

lpcox removed the request for review from Copilot June 11, 2026 15:52

github-actions Bot mentioned this pull request Jun 11, 2026

fix: chroot runner tool cache mountpoints #4733

Merged

This was referenced Jun 11, 2026

perf(test-coverage-reporter): reduce token usage ~7-10% per run #4764

Merged

perf(doc-maintainer): reduce per-run token usage #4765

Merged

lpcox merged 1 commit into main from fix-ai-credits-cache-double-count
