[token-consumption] Daily AIC Consumption Report - 2026-06-22 #40784

### Executive Summary

In the last 24h, **48,888.5 AIC** were consumed across **117 unique agentic workflows** in `github/gh-aw`, spread over **746** AI-credit-bearing `gen_ai` spans (avg **65.5 AIC/event**, p95 **253.8**, single-event max **662.1**). The **top 10 workflows account for ~50%** of all AIC (~24,588). Two PR-review workflows alone — *PR Code Quality Reviewer* (5,661) and *Matt Pocock Skills Reviewer* (5,405) — consume ~23% combined.

- **Sentry AIC: queryable, but only via an explicit numeric cast.** The bare attribute `gh-aw.aic` resolves to an empty *string-typed* EAP column (returns `null`; `sum()` is rejected as "string type"). Real values are only reachable through `tags[gh-aw.aic,number]`. Any dashboard/saved query using the bare name will show AIC as missing — a silent observability trap.
- **Grafana AIC: not verifiable this run.** The Tempo query MCP tools (`tempo_traceql-search`, `tempo_get-attribute-names`, `tempo_get-trace`) are absent from this MCP build; only datasource discovery is available. Treated as an observability gap (unknown), not zero.
- No production failures: `errors` and `logs` datasets both returned **0** events in the window.

### Key Metrics

| Metric | Value |
|---|---|
| Events analyzed (`gen_ai` spans) | 15,495 |
| All spans (gen_ai + http.server + default) | 25,206 |
| Events with AIC data | 746 |
| Events with AIC data (Sentry) | 746 |
| Events with AIC data (Grafana) | Unknown — Tempo query tools unavailable |
| Total AIC | 48,888.5 |
| Unique workflows (AIC > 0) | 117 |
| Avg AIC/event | 65.5 |
| P95 AIC/event | 253.8 |
| Max single-event AIC | 662.1 |
| Events missing workflow name (`gen_ai`) | 11,443 (73.8%) |
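
The derived rows in this table follow from the raw counts by simple arithmetic. A minimal sketch that recomputes them, assuming only the figures shown above (the variable names are illustrative, not fields from the Sentry response):

```python
# Figures copied from the Key Metrics table.
total_aic = 48_888.5
aic_events = 746
gen_ai_spans = 15_495
spans_missing_workflow = 11_443

# Derived values that the table reports.
avg_aic_per_event = total_aic / aic_events
missing_workflow_share = spans_missing_workflow / gen_ai_spans
spans_without_aic = gen_ai_spans - aic_events

print(f"avg AIC/event: {avg_aic_per_event:.1f}")                # 65.5
print(f"missing workflow name: {missing_workflow_share:.1%}")   # 73.8%
print(f"gen_ai spans without AIC: {spans_without_aic:,}")       # 14,749
```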

### Top 10 Workflows by AIC Consumption

| Workflow | Events | Total AIC | Avg AIC/Event |
|---|---:|---:|---:|
| PR Code Quality Reviewer | 77 | 5,661.4 | 73.5 |
| Matt Pocock Skills Reviewer | 81 | 5,405.1 | 66.7 |
| Test Quality Sentinel | 72 | 3,397.7 | 47.2 |
| Design Decision Gate 🏗️ | 81 | 2,341.4 | 28.9 |
| PR Sous Chef | 65 | 2,060.0 | 31.7 |
| Contribution Check | 15 | 1,374.1 | 91.6 |
| Semantic Function Refactoring | 3 | 1,335.3 | 445.1 |
| [aw] Failure Investigator (6h) | 7 | 1,101.5 | 157.4 |
| Daily Rendering Scripts Verifier | 2 | 1,004.1 | 502.0 |
| Daily AgentRx Trace Optimizer | 2 | 907.3 | 453.6 |

Representative verified event (run id): *PR Code Quality Reviewer* run [`27933865956`](https://github.com/github/gh-aw/actions/runs/27933865956) emitted 123.2 AIC on a single `claude-sonnet-4.6` span, well below the window's single-event maximum of 662.1 AIC. The three highest *per-event* averages in the top 10, *Daily Rendering Scripts Verifier* (502/event), *Daily AgentRx Trace Optimizer* (454/event), and *Semantic Function Refactoring* (445/event), come from low-volume workflows that are expensive per event; worth watching.
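
The shares and per-event averages quoted in this section can be recomputed from the table rows. The sketch below does that and flags workflows whose per-event average is far above the fleet average. The `OUTLIER_FACTOR` threshold is a hypothetical choice for illustration and is not part of the report's method.

```python
# (workflow, events, total AIC) rows copied from the top-10 table.
TOP10 = [
    ("PR Code Quality Reviewer", 77, 5661.4),
    ("Matt Pocock Skills Reviewer", 81, 5405.1),
    ("Test Quality Sentinel", 72, 3397.7),
    ("Design Decision Gate 🏗️", 81, 2341.4),
    ("PR Sous Chef", 65, 2060.0),
    ("Contribution Check", 15, 1374.1),
    ("Semantic Function Refactoring", 3, 1335.3),
    ("[aw] Failure Investigator (6h)", 7, 1101.5),
    ("Daily Rendering Scripts Verifier", 2, 1004.1),
    ("Daily AgentRx Trace Optimizer", 2, 907.3),
]
TOTAL_AIC = 48_888.5   # from Key Metrics
FLEET_AVG = 65.5       # avg AIC/event, from Key Metrics
OUTLIER_FACTOR = 5.0   # hypothetical threshold, not from the report

top10_total = sum(aic for _, _, aic in TOP10)
top10_share = top10_total / TOTAL_AIC
top2_share = (TOP10[0][2] + TOP10[1][2]) / TOTAL_AIC

# Workflows whose per-event average exceeds OUTLIER_FACTOR x the fleet average,
# sorted from most to least expensive per event.
outliers = sorted(
    ((name, aic / events) for name, events, aic in TOP10
     if aic / events > OUTLIER_FACTOR * FLEET_AVG),
    key=lambda row: row[1],
    reverse=True,
)

print(f"top-10 total: {top10_total:,.1f} AIC ({top10_share:.1%} of all AIC)")
print(f"top-2 share: {top2_share:.1%}")
for name, per_event in outliers:
    print(f"{name}: {per_event:.1f} AIC/event ({per_event / FLEET_AVG:.1f}x fleet avg)")
```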

> Ranking note: positions 1–8 (all ≥1,101 AIC) are firm. Positions 9–10 are provisional: ~3,200 AIC across 17 low-volume (≤2-event) workflows could not be individually ranked. The MCP `list_events` tool cannot sort by a casted aggregate, so groups were fetched sorted by `-count()` and capped at 100 of 117 workflows. The captured ranking covers 45,686 / 48,888 AIC (93.4%).
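
The client-side ranking described in the note can be sketched as follows. This is an illustration under stated assumptions: `Group`, `rank_by_aic`, and `coverage` are hypothetical names, and the sample rows are a subset of the table above arranged in the order a `-count()` sort would return them.

```python
from typing import NamedTuple


class Group(NamedTuple):
    workflow: str
    events: int
    total_aic: float


def rank_by_aic(groups, top_n=10):
    """Re-rank count-sorted groups by total AIC, highest first."""
    return sorted(groups, key=lambda g: g.total_aic, reverse=True)[:top_n]


def coverage(captured_aic, grand_total):
    """Fraction of all AIC that the fetched groups account for."""
    return captured_aic / grand_total


# Sample rows in `-count()` order (events descending), as the server returns them.
fetched = [
    Group("Matt Pocock Skills Reviewer", 81, 5405.1),
    Group("Design Decision Gate 🏗️", 81, 2341.4),
    Group("PR Code Quality Reviewer", 77, 5661.4),
    Group("Test Quality Sentinel", 72, 3397.7),
    Group("PR Sous Chef", 65, 2060.0),
]

ranked = rank_by_aic(fetched, top_n=3)
print([g.workflow for g in ranked])
print(f"coverage: {coverage(45_686, 48_888.5):.1%}")
```

Because the fetch is capped by event count, any workflow outside the fetched groups is invisible to the sort, which is why the lower positions stay provisional.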

### Grafana AIC Findings

**Grafana AIC: not queryable this run (observability gap, cause = tooling, not data).**

- The Tempo datasource exists and is healthy: `grafanacloud-ghaw-traces` (uid `grafanacloud-traces`, Tempo, region `prod-eu-west-2`).
- The Grafana MCP build exposed **only** `list_datasources` / `get_datasource`. The trace-query tools required to inspect AIC on spans — `tempo_get-attribute-names`, `tempo_traceql-search`, `tempo_get-attribute-values`, `tempo_get-trace` — returned `No such tool available`. AIC presence/typing in Tempo therefore **could not be confirmed**.
- Evidence (emit-side, not query-side): spans are exported to Sentry and Tempo from the same OTLP payload, and `actions/setup/js/send_otlp_span.cjs:2129` encodes `gh-aw.aic` as `doubleValue`; the inline comment (`:2120`) asserts Tempo indexes it so `{ span."gh-aw.aic" > 0 }` is queryable. This is plausible but **unverified** in this run. Suggested manual TraceQL: `{ span."gh-aw.aic" > 0 }` over `now-24h` on datasource `grafanacloud-traces`.
- `events_with_aic_data_grafana` is reported as **Unknown** (not 0) per the unknown-vs-zero rule.
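
For the manual check suggested above, the TraceQL query can be sent to Tempo's search endpoint through the Grafana datasource proxy. The sketch below only builds the request URL and sends nothing. The Grafana base URL is a placeholder, and the proxy path and the `q`/`start`/`end`/`limit` parameters are assumptions that should be checked against the deployed Grafana and Tempo versions before use.

```python
import time
from urllib.parse import parse_qs, urlencode, urlsplit

TRACEQL = '{ span."gh-aw.aic" > 0 }'  # query suggested in the findings above


def tempo_search_url(grafana_base, datasource_uid, traceql,
                     window_seconds=24 * 3600, now=None, limit=20):
    """Build a Tempo search URL covering the last `window_seconds`."""
    end = int(now if now is not None else time.time())
    params = {
        "q": traceql,
        "start": end - window_seconds,  # unix epoch seconds
        "end": end,
        "limit": limit,
    }
    base = grafana_base.rstrip("/")
    # Assumed Grafana datasource-proxy path in front of Tempo's /api/search.
    return f"{base}/api/datasources/proxy/uid/{datasource_uid}/api/search?{urlencode(params)}"


# "https://grafana.example.invalid" is a placeholder, not the real instance.
url = tempo_search_url("https://grafana.example.invalid", "grafanacloud-traces",
                       TRACEQL, now=1_800_000_000)
decoded = parse_qs(urlsplit(url).query)
print(url)
print(decoded["q"][0])  # the TraceQL survives URL encoding unchanged
```

A non-empty result would confirm that Tempo indexes `gh-aw.aic` as a number; an empty result would still need to be distinguished from a genuinely empty window.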

<details>
<summary>Data Quality and Gaps</summary>

- **Events missing workflow identifiers:** 11,443 of 15,495 `gen_ai` spans (73.8%) have `gh-aw.workflow.name = null`. These are the per-LLM-call child spans; only the token-owning span carries workflow attribution. All 746 AIC-bearing spans **do** have a workflow name (no null bucket), so AIC attribution itself is complete — but generic `gen_ai`-level workflow rollups will undercount.
- **Events missing AIC attributes:** 14,749 of 15,495 `gen_ai` spans carry no AIC (expected — AIC is only attached to the job that owns token usage). `default`-op (3,641) and `http.server` (6,069) spans carry no AIC.
- **Sentry-specific AIC caveat:** `gh-aw.aic` is dual-registered (string + number) in EAP. Bare references hit the empty string column. Canonical query path is `tags[gh-aw.aic,number]` in both `query` and aggregate `fields`. `gh-aw.aic:>0` (string filter) returns nothing; `tags[gh-aw.aic,number]:>0` returns the real 746-span population.
- **Tool limitation:** `list_events` rejects `ORDER BY` on casted aggregates and on `count_unique(...)` ("orderby must also be in the selected columns"). Only `-count()` and plain-field sorts work, so a server-side top-N by AIC is not possible; ranking was done client-side over a 100-group fetch.
- **Assumptions/fallbacks:** Sentry treated as canonical to avoid double counting (spans dual-exported). `total_events_analyzed` = `gen_ai` span population (15,495). A single run can emit multiple AIC-bearing spans (e.g. agent + activation/detection jobs), so event count ≠ run count.
- **Grafana-specific AIC caveat:** see Grafana AIC Findings — query tooling unavailable; AIC queryability unverified.
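
To make the Sentry caveat concrete, the sketch below builds the typed attribute reference once and reuses it in both the filter and the aggregate, then assembles an Explore URL with the same parameters as the canonical link in References. The helper names (`typed_tag`, `explore_url`) are hypothetical; the parameter names and project id are copied from that link.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def typed_tag(name, kind="number"):
    """Explicitly typed attribute reference, e.g. tags[gh-aw.aic,number]."""
    return f"tags[{name},{kind}]"


AIC = typed_tag("gh-aw.aic")


def explore_url(base, project_id, stats_period="24h"):
    """Assemble an aggregate-mode Explore URL grouped by workflow name."""
    def compact(obj):
        return json.dumps(obj, separators=(",", ":"))

    params = [
        ("query", f"{AIC}:>0"),  # the bare `gh-aw.aic:>0` matches nothing
        ("project", project_id),
        ("aggregateField", compact({"groupBy": "gh-aw.workflow.name"})),
        ("aggregateField", compact({"yAxes": ["count()", f"sum({AIC})"]})),
        ("mode", "aggregate"),
        ("statsPeriod", stats_period),
        ("table", "span"),
    ]
    return f"{base}?{urlencode(params)}"


url = explore_url("https://github.sentry.io/explore/traces/", "4511347087179777")
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0])
print(decoded["aggregateField"])
```

Keeping the cast in a single helper makes it harder for a saved query to fall back to the bare name by accident.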

</details>

### Recommendations

1. **Cap or tier the PR-review fleet.** *PR Code Quality Reviewer*, *Matt Pocock Skills Reviewer*, *Test Quality Sentinel*, *Design Decision Gate*, and *PR Sous Chef* are the top 5 (~18,866 AIC, ~39% of total). Consider gating them on PR size/paths, deduping overlapping review passes, or moving low-signal checks to a cheaper model — these run on nearly every PR.
2. **Investigate high per-event outliers.** *Daily Rendering Scripts Verifier* (502/event), *Daily AgentRx Trace Optimizer* (454/event), and *Semantic Function Refactoring* (445/event) burn roughly 7× the fleet average of 65.5 AIC/event despite low volume. Audit their prompts/context size for runaway token usage.
3. **Fix the Sentry AIC type ambiguity (instrumentation gap).** `gh-aw.aic` resolving to an empty string column means most users will conclude "AIC is missing." Either emit under a name that registers cleanly as numeric (e.g. a dedicated `gh-aw.aic_credits` double with no prior string writes) or document `tags[gh-aw.aic,number]` as the required access path in all saved queries/alerts/dashboards.
4. **Restore Grafana AIC verifiability + propagate workflow attribution.** Enable the Tempo query MCP tools (or document a manual TraceQL runbook) so Grafana AIC can be confirmed numeric each run, and consider propagating `gh-aw.workflow.name` onto child `gen_ai` spans so the 73.8% currently-unattributed spans can be rolled up by workflow.
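
Recommendations 3 and 4 both come down to how span attributes are encoded at emit time. The sketch below is a Python illustration of the OTLP/JSON attribute shape, not the code in `send_otlp_span.cjs`; the function names are hypothetical. It forces AIC to a `doubleValue` and copies `gh-aw.workflow.name` from a parent span onto a child span.

```python
def otlp_attribute(key, value):
    """Encode one attribute in OTLP/JSON form, picking the value type explicitly."""
    if isinstance(value, bool):      # check bool first: bool is a subclass of int
        typed = {"boolValue": value}
    elif isinstance(value, int):
        typed = {"intValue": str(value)}  # 64-bit ints are strings in OTLP/JSON
    elif isinstance(value, float):
        typed = {"doubleValue": value}
    else:
        typed = {"stringValue": str(value)}
    return {"key": key, "value": typed}


def aic_attribute(aic, key="gh-aw.aic"):
    """Always emit AIC as a double so it is never written as a string or int."""
    return otlp_attribute(key, float(aic))


def child_span_attributes(parent_attrs, own_attrs,
                          inherited=("gh-aw.workflow.name",)):
    """Copy selected parent attributes onto a child span, then add its own."""
    merged = {k: parent_attrs[k] for k in inherited if k in parent_attrs}
    merged.update(own_attrs)
    return [otlp_attribute(k, v) for k, v in merged.items()]


parent = {"gh-aw.workflow.name": "PR Code Quality Reviewer", "gh-aw.aic": 123.2}
child = child_span_attributes(parent, {"gen_ai.request.model": "claude-sonnet-4.6"})
print(aic_attribute(123))  # an int input still becomes a doubleValue
print(child)
```

Only the workflow name is inherited here; AIC stays on the token-owning span so that per-workflow sums are not double counted.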

### References

- Sentry — AIC by workflow (canonical query): https://github.sentry.io/explore/traces/?query=tags%5Bgh-aw.aic%2Cnumber%5D:%3E0&project=4511347087179777&aggregateField=%7B%22groupBy%22:%22gh-aw.workflow.name%22%7D&aggregateField=%7B%22yAxes%22:%5B%22count%28%29%22%2C%22sum%28tags%5Bgh-aw.aic%2Cnumber%5D%29%22%5D%7D&mode=aggregate&statsPeriod=24h&table=span
- Sentry — representative verified trace (`PR Code Quality Reviewer`, run 27933865956): https://github.sentry.io/explore/traces/trace/680b92a6eefe5b30e91ab6fac538471a
- Grafana/Tempo — datasource `grafanacloud-traces` (EU-West-2); suggested TraceQL `{ span."gh-aw.aic" > 0 }` over `now-24h` (query tools unavailable this run)
- Run reference: [§27933865956](https://github.com/github/gh-aw/actions/runs/27933865956)







> Generated by [📊 Daily AIC Consumption Report (Sentry + Grafana OTel)](https://github.com/github/gh-aw/actions/runs/27957153639) · 283 AIC · ⌖ 33.7 AIC · ⊞ 7.2K · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw%2Fdaily-token-consumption-report%22&type=issues)
> - [x] expires on Jun 23, 2026, 5:56 AM UTC-08:00
