Executive Summary
In the last 24h, 48,888.5 AIC were consumed across 117 unique agentic workflows in github/gh-aw, spread over 746 AI-credit-bearing gen_ai spans (avg 65.5 AIC/event, p95 253.8, single-event max 662.1). The top 10 workflows account for ~50% of all AIC (~24,588). Two PR-review workflows alone — PR Code Quality Reviewer (5,661) and Matt Pocock Skills Reviewer (5,405) — consume ~23% combined.
- Sentry AIC: queryable, but only via an explicit numeric cast. The bare attribute gh-aw.aic resolves to an empty string-typed EAP column (returns null; sum() is rejected as "string type"). Real values are only reachable through tags[gh-aw.aic,number]. Any dashboard/saved query using the bare name will show AIC as missing, which is a silent observability trap.
- Grafana AIC: not verifiable this run. The Tempo query MCP tools (tempo_traceql-search, tempo_get-attribute-names, tempo_get-trace) are absent from this MCP build; only datasource discovery is available. Treated as an observability gap (unknown), not zero.
- No production failures: errors and logs datasets both returned 0 events in the window.
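The bare-name trap described above can be caught mechanically. The following is a minimal sketch, assuming saved queries are available as plain strings; the helper names `uses_bare_aic` and `to_typed` are hypothetical, and only the `tags[gh-aw.aic,number]` accessor comes from this report.

```python
import re

# Typed accessor taken from this report.
TYPED_AIC = "tags[gh-aw.aic,number]"

# Match gh-aw.aic unless it already sits inside tags[...] or is the prefix
# of a longer attribute name (for example gh-aw.aic_credits).
_BARE_AIC = re.compile(r"(?<!tags\[)gh-aw\.aic(?![\w.-])")


def uses_bare_aic(query: str) -> bool:
    """Return True if the query references gh-aw.aic without the numeric cast."""
    return _BARE_AIC.search(query) is not None


def to_typed(query: str) -> str:
    """Rewrite bare gh-aw.aic references to the typed accessor (idempotent)."""
    return _BARE_AIC.sub(TYPED_AIC, query)


if __name__ == "__main__":
    for q in ["gh-aw.aic:>0", "sum(gh-aw.aic)", "tags[gh-aw.aic,number]:>0"]:
        print(uses_bare_aic(q), to_typed(q))
```

A check like this could run over dashboard and alert definitions before they are saved, so that a bare reference is flagged instead of silently returning null.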
Key Metrics
| Metric | Value |
| --- | --- |
| Events analyzed (gen_ai spans) | 15,495 |
| All spans (gen_ai + http.server + default) | 25,206 |
| Events with AIC data | 746 |
| Events with AIC data (Sentry) | 746 |
| Events with AIC data (Grafana) | Unknown (Tempo query tools unavailable) |
| Total AIC | 48,888.5 |
| Unique workflows (AIC > 0) | 117 |
| Avg AIC/event | 65.5 |
| P95 AIC/event | 253.8 |
| Max single-event AIC | 662.1 |
| Events missing workflow name (gen_ai) | 11,443 (73.8%) |
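The derived figures in the table follow directly from the raw counts. As a quick arithmetic check (a sketch using only numbers from this report; the variable names are illustrative):

```python
# Raw counts copied from the Key Metrics table.
total_aic = 48_888.5
aic_events = 746
gen_ai_spans = 15_495
missing_workflow_name = 11_443

# Average AIC per AIC-bearing event.
avg_aic_per_event = round(total_aic / aic_events, 1)

# Share of gen_ai spans without a workflow name, in percent.
missing_name_pct = round(100 * missing_workflow_name / gen_ai_spans, 1)

# gen_ai spans that carry no AIC attribute.
spans_without_aic = gen_ai_spans - aic_events

print(avg_aic_per_event, missing_name_pct, spans_without_aic)
```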
Top 10 Workflows by AIC Consumption
| Workflow | Events | Total AIC | Avg AIC/Event |
| --- | --- | --- | --- |
| PR Code Quality Reviewer | 77 | 5,661.4 | 73.5 |
| Matt Pocock Skills Reviewer | 81 | 5,405.1 | 66.7 |
| Test Quality Sentinel | 72 | 3,397.7 | 47.2 |
| Design Decision Gate 🏗️ | 81 | 2,341.4 | 28.9 |
| PR Sous Chef | 65 | 2,060.0 | 31.7 |
| Contribution Check | 15 | 1,374.1 | 91.6 |
| Semantic Function Refactoring | 3 | 1,335.3 | 445.1 |
| [aw] Failure Investigator (6h) | 7 | 1,101.5 | 157.4 |
| Daily Rendering Scripts Verifier | 2 | 1,004.1 | 502.0 |
| Daily AgentRx Trace Optimizer | 2 | 907.3 | 453.6 |
Highest single-event consumers (run id): PR Code Quality Reviewer run 27933865956 emitted 123.2 AIC on a single claude-sonnet-4.6 span. The two highest per-event averages — Daily Rendering Scripts Verifier (502/event) and Semantic Function Refactoring (445/event) — are low-volume but expensive-per-run; worth watching.
Ranking note: positions 1–8 (all ≥1,101 AIC) are firm. Positions 9–10 sit in a band where ~3,200 AIC across 17 low-volume (≤2-event) workflows could not be individually ranked because the MCP list_events tool cannot sort by a casted aggregate, so groups were pulled sorted by -count() and capped at 100 of 117 workflows. Captured ranking covers 45,686 / 48,888 AIC (93.4%).
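The client-side ranking described in the note can be sketched as follows. This is an illustration under stated assumptions, not the report's actual code: the `Group` type and function names are hypothetical, and the sample rows are a subset of the table above, listed in the count-sorted order a `-count()` fetch would return.

```python
from typing import List, NamedTuple


class Group(NamedTuple):
    workflow: str
    events: int
    total_aic: float


def rank_by_aic(groups: List[Group], top_n: int = 10) -> List[Group]:
    """Re-sort count-ordered groups by total AIC, descending, and keep top_n."""
    return sorted(groups, key=lambda g: g.total_aic, reverse=True)[:top_n]


def coverage_pct(groups: List[Group], grand_total: float) -> float:
    """Percent of the grand total AIC covered by the given groups."""
    return round(100 * sum(g.total_aic for g in groups) / grand_total, 1)


# Subset of rows from the table, in descending event-count order.
groups = [
    Group("Matt Pocock Skills Reviewer", 81, 5405.1),
    Group("Design Decision Gate", 81, 2341.4),
    Group("PR Code Quality Reviewer", 77, 5661.4),
    Group("Test Quality Sentinel", 72, 3397.7),
    Group("Daily Rendering Scripts Verifier", 2, 1004.1),
]

top3 = rank_by_aic(groups, top_n=3)
print([g.workflow for g in top3], coverage_pct(groups, 48_888.5))
```

Because the fetch is capped by count rather than by AIC, a low-volume, high-cost workflow can fall outside the fetched groups entirely, which is why the coverage figure is worth reporting alongside the ranking.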
Grafana AIC Findings
Grafana AIC: not queryable this run (observability gap, cause = tooling, not data).
- The Tempo datasource exists and is healthy: grafanacloud-ghaw-traces (uid grafanacloud-traces, Tempo, region prod-eu-west-2).
- The Grafana MCP build exposed only list_datasources / get_datasource. The trace-query tools required to inspect AIC on spans (tempo_get-attribute-names, tempo_traceql-search, tempo_get-attribute-values, tempo_get-trace) returned "No such tool available". AIC presence/typing in Tempo therefore could not be confirmed.
- Evidence (emit-side, not query-side): spans are exported to Sentry and Tempo from the same OTLP payload, and actions/setup/js/send_otlp_span.cjs:2129 encodes gh-aw.aic as doubleValue; the inline comment (:2120) asserts Tempo indexes it so { span."gh-aw.aic" > 0 } is queryable. This is plausible but unverified in this run. Suggested manual TraceQL: { span."gh-aw.aic" > 0 } over now-24h on datasource grafanacloud-traces.
events_with_aic_data_grafana is reported as Unknown (not 0) per the unknown-vs-zero rule.
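For a manual check outside the MCP tooling, the suggested TraceQL could be issued against Tempo's HTTP search endpoint. The sketch below only builds the request URL. It assumes a directly reachable Tempo API at a placeholder base URL and leaves out authentication, which Grafana Cloud requires; the function name is hypothetical.

```python
import time
from typing import Optional
from urllib.parse import urlencode

# TraceQL suggested in this report.
TRACEQL = '{ span."gh-aw.aic" > 0 }'


def tempo_search_url(base_url: str, now: Optional[int] = None,
                     hours: int = 24, limit: int = 20) -> str:
    """Build a Tempo /api/search URL covering the last `hours` hours.

    `base_url` is a placeholder; start and end are Unix epoch seconds.
    """
    end = int(time.time()) if now is None else now
    start = end - hours * 3600
    params = {"q": TRACEQL, "start": start, "end": end, "limit": limit}
    return f"{base_url.rstrip('/')}/api/search?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host, used only to show the shape of the URL.
    print(tempo_search_url("https://tempo.example.invalid"))
```

A non-empty result would confirm that the attribute is indexed as a number in Tempo; an empty result would still need to be distinguished from a typing problem before it is reported as zero.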
Data Quality and Gaps
- Events missing workflow identifiers: 11,443 of 15,495 gen_ai spans (73.8%) have gh-aw.workflow.name = null. These are the per-LLM-call child spans; only the token-owning span carries workflow attribution. All 746 AIC-bearing spans do have a workflow name (no null bucket), so AIC attribution itself is complete, but generic gen_ai-level workflow rollups will undercount.
- Events missing AIC attributes: 14,749 of 15,495 gen_ai spans carry no AIC (expected, since AIC is only attached to the job that owns token usage). default-op (3,641) and http.server (6,069) spans carry no AIC.
- Sentry-specific AIC caveat: gh-aw.aic is dual-registered (string + number) in EAP. Bare references hit the empty string column. The canonical query path is tags[gh-aw.aic,number] in both query and aggregate fields. gh-aw.aic:>0 (string filter) returns nothing; tags[gh-aw.aic,number]:>0 returns the real 746-span population.
- Tool limitation: list_events rejects ORDER BY on casted aggregates and count_unique(...) ("orderby must also be in the selected columns"). Only -count() and plain-field sorts work, so a server-side top-N by AIC is not possible; ranking was done client-side over a 100-group fetch.
- Assumptions/fallbacks: Sentry treated as canonical to avoid double counting (spans dual-exported). total_events_analyzed = gen_ai span population (15,495). A single run can emit multiple AIC-bearing spans (e.g. agent + activation/detection jobs), so event count ≠ run count.
- Grafana-specific AIC caveat: see Grafana AIC Findings. Query tooling was unavailable, so AIC queryability is unverified.
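Because one run can emit several AIC-bearing spans, per-run cost needs a rollup rather than a span count. A minimal sketch, assuming each span record exposes a run identifier and a numeric AIC value (the field names `run_id` and `aic` and the sample values are hypothetical):

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def aic_per_run(spans: Iterable[Mapping]) -> Dict[str, float]:
    """Sum AIC across all spans that belong to the same run."""
    totals: Dict[str, float] = defaultdict(float)
    for span in spans:
        totals[span["run_id"]] += span["aic"]
    return dict(totals)


# Illustrative data: run-a emits an agent span and a detection span.
spans = [
    {"run_id": "run-a", "aic": 10.0},
    {"run_id": "run-a", "aic": 2.5},
    {"run_id": "run-b", "aic": 4.0},
]

per_run = aic_per_run(spans)
print(len(spans), "events across", len(per_run), "runs:", per_run)
```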
Recommendations
- Cap or tier the PR-review fleet. PR Code Quality Reviewer, Matt Pocock Skills Reviewer, Test Quality Sentinel, Design Decision Gate, and PR Sous Chef are the top 5 (~18,866 AIC, ~39% of total). Consider gating them on PR size/paths, deduping overlapping review passes, or moving low-signal checks to a cheaper model — these run on nearly every PR.
- Investigate high per-event outliers. Daily Rendering Scripts Verifier (502/event) and Semantic Function Refactoring (445/event) burn ~7× the avg per run despite low volume. Audit their prompts/context size for runaway token usage.
- Fix the Sentry AIC type ambiguity (instrumentation gap). gh-aw.aic resolving to an empty string column means most users will conclude "AIC is missing." Either emit under a name that registers cleanly as numeric (e.g. a dedicated gh-aw.aic_credits double with no prior string writes) or document tags[gh-aw.aic,number] as the required access path in all saved queries/alerts/dashboards.
- Restore Grafana AIC verifiability + propagate workflow attribution. Enable the Tempo query MCP tools (or document a manual TraceQL runbook) so Grafana AIC can be confirmed numeric each run, and consider propagating gh-aw.workflow.name onto child gen_ai spans so the 73.8% currently-unattributed spans can be rolled up by workflow.
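Until attribution is propagated at emit time, a query-side backfill is one possible stopgap. The sketch below assumes that unattributed child spans share a trace ID with the span that carries the workflow name; that assumption is not verified in this report, and the field names `trace_id` and `workflow_name` are hypothetical.

```python
from typing import Dict, List, Optional


def propagate_workflow_name(spans: List[dict]) -> List[dict]:
    """Copy the workflow name onto unattributed spans of the same trace."""
    # First pass: remember the first non-empty workflow name seen per trace.
    name_by_trace: Dict[str, str] = {}
    for span in spans:
        name: Optional[str] = span.get("workflow_name")
        if name:
            name_by_trace.setdefault(span["trace_id"], name)

    # Second pass: fill in missing names; spans in traces with no
    # attributed span stay None so they remain visible as a gap.
    return [
        {**span,
         "workflow_name": span.get("workflow_name")
         or name_by_trace.get(span["trace_id"])}
        for span in spans
    ]


spans = [
    {"trace_id": "t1", "workflow_name": "PR Sous Chef"},
    {"trace_id": "t1", "workflow_name": None},
    {"trace_id": "t2", "workflow_name": None},
]
filled = propagate_workflow_name(spans)
print([s["workflow_name"] for s in filled])
```

Leaving unresolved spans as None keeps the unknown-vs-zero distinction used elsewhere in this report.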
References
- Sentry trace (PR Code Quality Reviewer, run 27933865956): https://github.sentry.io/explore/traces/trace/680b92a6eefe5b30e91ab6fac538471a
- Grafana Tempo datasource grafanacloud-traces (EU-West-2); suggested TraceQL { span."gh-aw.aic" > 0 } over now-24h (query tools unavailable this run)
Generated by 📊 Daily AIC Consumption Report (Sentry + Grafana OTel) · 283 AIC · ⌖ 33.7 AIC · ⊞ 7.2K · ◷