Executive Summary
- 4 runs sampled across 4 distinct workflows from the last 24 hours
- Workflows covered: Smoke Codex, Smoke Claude, [aw] Failure Investigator (6h), AI Moderator
- Median first-request size: 19,020 chars | P95: 32,686 chars
- Highest-level finding: The two smoke test workflows (Smoke Claude, Smoke Codex) include large shared skills, particularly shared/github-queries-mcp-script.md (~16.5 KB), that are largely unused for smoke-test purposes. The Failure Investigator embeds MCP tool parameter docs inline that could be fetched at runtime, and ~50% of its 20 turns are data-gathering that should move to deterministic steps. Security policy boilerplate (4 lines × 4 runs) and safe-output noop examples (5× each run) are repeated verbatim across every prompt.
Highest-Leverage Changes
- [skills] Drop shared/github-queries-mcp-script.md from smoke tests: it contributes ~16.5 KB (~50% of Smoke Claude's 32 KB prompt) but only repos and pull_requests toolsets are configured.
- [workflow-md] Remove inline MCP tool parameter docs from aw-failure-investigator.md: the top two sections (audit + CLI usage) add >3 KB that can be fetched via --help at runtime.
- [agents] Move failure run ID discovery to a deterministic pre-step in aw-failure-investigator.md: audit evidence shows ~50% of 20 turns are data-gathering, inflating effective tokens to 9.3 M.
- [workflow-md] Deduplicate repeated boilerplate in smoke workflows: dup_line_ratio of 0.111 (Smoke Claude) and 0.090 (Smoke Codex) means 9–11% of lines are exact duplicates (workspace paths, memory paths, "Verify the tool call succeeds" repeated 5×).
- [skills] Enable gh-proxy and cli-proxy on smoke workflows: none of the three largest-prompt workflows (Smoke Claude, Smoke Codex, Failure Investigator) have tools.github.mode: gh-proxy or tools.cli-proxy: true. Only AI Moderator has cli-proxy: true. Enabling these allows MCP-tool-oriented wording to be replaced with compact CLI-form commands.
- [skills] Extract reporting structure guidelines from aw-failure-investigator.md into a referenced shared skill: currently ~2 KB of formatting rules (heading levels, progressive disclosure, report structure pattern) are inlined.
- [workflow-md] Trim the safe-output noop example JSON (appears 5× across all 4 runs): a one-line reference to the schema is sufficient.
Key Metrics
| Metric | Value |
| --- | --- |
| Sampled runs | 4 |
| Distinct workflows | 4 |
| Median chars | 19,020 |
| P95 chars | 32,686 |
| Largest sampled request | Smoke Claude (32,686 chars) |
Per-Run First-Request Metrics
| Run ID | Workflow | Chars | Lines | Headings | Code Fences | Dup Line Ratio | Token Usage |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 26922082621 | Smoke Claude | 32,686 | 681 | 56 | 19 | 0.111 | 1,392,018 |
| 26922082608 | Smoke Codex | 20,451 | 411 | 23 | 9 | 0.090 | 1,589,240 |
| 26924689485 | [aw] Failure Investigator (6h) | 17,589 | 356 | 31 | 11 | 0.053 | 1,541,840 |
| 26924934999 | AI Moderator | 13,803 | 258 | 14 | 5 | 0.011 | 90,124 |
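The median and P95 figures in Key Metrics can be reproduced from the Chars column above. The following is a minimal sketch, not the optimizer's actual code: it assumes the P95 uses the nearest-rank method, and the helper name nearest_rank_percentile is hypothetical.

```python
import math
import statistics

# Chars column from the per-run table above.
first_request_chars = {
    "Smoke Claude": 32_686,
    "Smoke Codex": 20_451,
    "[aw] Failure Investigator (6h)": 17_589,
    "AI Moderator": 13_803,
}


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile (assumed method, not confirmed by the report)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


median_chars = statistics.median(first_request_chars.values())
p95_chars = nearest_rank_percentile(first_request_chars.values(), 95)
largest = max(first_request_chars, key=first_request_chars.get)

print(f"median={median_chars:,.0f} p95={p95_chars:,} largest={largest}")
```

With only four samples, nearest-rank P95 selects the largest value, so P95 and the largest sampled request coincide. An interpolating percentile definition would give a value below 32,686.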
Repeated Ambient Context Signals
These lines/fragments appear verbatim 4 or more times across the sampled run prompts:
- [5×] {"noop": {"message": "No action needed: [brief explanation...]"}}: safe-output noop example repeated in every run
- [5×] "- Verify the tool call succeeds": repeated within Smoke Claude alone
- [4×] Immutable security policy block (3 lines, ~400 chars): injected identically into all 4 prompts; runtime injection; no action needed, but confirms system overhead
- [4×] <instruction>When you need to create temporary files...</instruction>: system injection, expected
- [4×] "Do NOT attempt to edit files outside these directories": system injection, expected
Largest unique skill sections driving Smoke Claude size:
- shared/github-queries-mcp-script.md: ~16.5 KB inline (~50% of total prompt)
- "#### audit — Inspect a specific run": 6,693 chars
- "# Works with any tool — just match the parameter names": 3,699 chars (appears in Smoke Codex and Failure Investigator too)
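One way to surface repeated fragments like these is to count, for each distinct line, how many prompts contain it. A minimal sketch follows. The function name and the toy prompts are hypothetical stand-ins, and the optimizer's real detection logic may differ.

```python
from collections import Counter


def lines_shared_across_prompts(prompts, min_prompts):
    """Map each non-blank line to the number of prompts that contain it,
    keeping only lines present in at least `min_prompts` prompts."""
    presence = Counter()
    for text in prompts.values():
        # A set counts each line once per prompt, however often it repeats there.
        unique_lines = {line.strip() for line in text.splitlines() if line.strip()}
        presence.update(unique_lines)
    return {line: n for line, n in presence.items() if n >= min_prompts}


# Toy prompts (hypothetical): only the boilerplate line is common to all three.
sample_prompts = {
    "workflow-a": "Do NOT attempt to edit files outside these directories\nRun the smoke checks",
    "workflow-b": "Do NOT attempt to edit files outside these directories\nInvestigate failures",
    "workflow-c": "Moderate the issue\nDo NOT attempt to edit files outside these directories",
}

shared = lines_shared_across_prompts(sample_prompts, min_prompts=3)
print(shared)
```

Counting presence per prompt, rather than raw occurrences, separates cross-workflow boilerplate from a line that simply repeats many times inside one prompt.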
Deterministic Analysis Output
Script: /tmp/gh-aw/ambient-context/analyze_requests.py
Output: /tmp/gh-aw/ambient-context/request-analysis.json + .md
Key computed metrics:
- Smoke Claude: 56 headings, 186 list items, 19 code fences, 104 "tools" keyword hits, 43 "safe_outputs" keyword hits
- Smoke Codex: 23 headings, 104 list items, 9 code fences, 58 "tools" keyword hits
- Failure Investigator: 31 headings, 93 list items, eff. tokens = 9,346,226 (20 turns × ~467K avg)
- AI Moderator: 14 headings, lowest dup ratio (0.011), lowest prompt size (13,803 chars) — closest to well-scoped baseline
Audit for Failure Investigator: agentic_fraction=0.50 — half of 20 turns are pure data gathering.
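The report does not define dup_line_ratio precisely. The sketch below shows one plausible definition, assuming it is the share of non-blank lines that repeat an earlier line in the same prompt, alongside simple heading and code-fence counts. The function name prompt_metrics is hypothetical, and analyze_requests.py may compute these differently.

```python
FENCE = "`" * 3  # code-fence marker, built indirectly to keep this example readable


def prompt_metrics(text):
    """Compute simple structural metrics for one prompt (assumed definitions)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    seen = set()
    duplicates = 0
    for line in lines:
        if line in seen:
            duplicates += 1  # repeat of a line already seen in this prompt
        seen.add(line)
    fence_markers = sum(1 for line in lines if line.startswith(FENCE))
    return {
        "chars": len(text),
        "lines": len(lines),
        # Counts every line starting with '#', including comments inside fences.
        "headings": sum(1 for line in lines if line.startswith("#")),
        "code_fences": fence_markers // 2,  # opening + closing marker = one fence
        "dup_line_ratio": round(duplicates / len(lines), 3) if lines else 0.0,
    }


sample = "\n".join([
    "# Title",
    "- Verify the tool call succeeds",
    FENCE + "bash",
    "echo hi",
    FENCE,
    "- Verify the tool call succeeds",
    "## Next",
    "- Verify the tool call succeeds",
])
metrics = prompt_metrics(sample)
print(metrics)
```

In the toy input, the repeated "Verify the tool call succeeds" line accounts for two of eight non-blank lines, giving a ratio of 0.25.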
Recommendations by Category
Workflow Markdown
- [HIGH] smoke-claude.md / smoke-codex.md: Remove shared/github-queries-mcp-script.md from the skills: list. This file is 16.5 KB and provides GitHub API query scripts for issues, discussions, and labels, none of which are exercised by a smoke test that only configures repos and pull_requests toolsets. Estimated saving: ~16 KB from Smoke Claude prompt.
- [HIGH] aw-failure-investigator.md: Replace the inline MCP tool parameter documentation block (audit/logs/status sections, ~3 KB) with a single bootstrapping call: agenticworkflows --help or agenticworkflows logs --help. The agent uses bash: ["*"] and can fetch this at runtime. Needs manual review to confirm all referenced parameters are discoverable via --help.
- [MEDIUM] smoke-claude.md / smoke-codex.md: Deduplicate repeated workspace/memory path lines and "Verify the tool call succeeds" boilerplate. dup_line_ratio of 0.09–0.11 means ~10% of lines are redundant. Consolidate into a single definitions block.
- [MEDIUM] smoke-claude.md / smoke-codex.md / aw-failure-investigator.md: Enable tools.github.mode: gh-proxy and tools.cli-proxy: true. Currently only ai-moderator.md has cli-proxy: true. Enabling proxies allows replacing verbose MCP-tool-oriented step wording with compact gh CLI commands, reducing both prompt size and turn count.
Skills
- [HIGH] Smoke tests: shared/github-queries-mcp-script.md should not be included by smoke test workflows. Consider gating skill inclusion on active toolset configuration (i.e., only include it when the issues/discussions toolsets are configured).
- [MEDIUM] aw-failure-investigator.md: Extract the reporting structure guidelines (~2 KB of heading-level rules, progressive disclosure patterns, Airbnb-inspired design principles) into a shared skill, shared/reporting.md, so other workflows can reference it without inlining it. Check whether shared/reporting.md already exists and whether it covers this content.
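The gating idea above could look roughly like the following. This is a hypothetical sketch: the SKILL_TOOLSETS mapping, the select_skills function, and the notion that a skill declares which toolsets it needs are all illustrative assumptions, not an existing feature of the workflow compiler.

```python
# Hypothetical mapping from a shared skill to the toolsets it is useful for.
SKILL_TOOLSETS = {
    "shared/github-queries-mcp-script.md": {"issues", "discussions"},
}


def select_skills(requested_skills, active_toolsets):
    """Keep a skill only if it is ungated or one of its toolsets is active."""
    active = set(active_toolsets)
    selected = []
    for skill in requested_skills:
        needed = SKILL_TOOLSETS.get(skill)
        if needed is None or needed & active:
            selected.append(skill)
    return selected


# Smoke tests configure only the repos and pull_requests toolsets.
# The second skill is listed purely for illustration.
smoke_skills = select_skills(
    ["shared/github-queries-mcp-script.md", "shared/reporting.md"],
    ["repos", "pull_requests"],
)
print(smoke_skills)  # ['shared/reporting.md']
```

Skills without an entry in the mapping pass through unchanged, so gating would be opt-in per skill and could not silently drop content from existing workflows.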
Agents
- [HIGH] aw-failure-investigator.md: Move failure run ID pre-fetch to a deterministic GitHub Actions run: step before agent activation. The audit confirms ~50% of 20 turns are data-gathering (effective tokens: 9.3 M). Injecting structured run data as a cache: artifact (the workflow already has a cache block) would cut agentic turns by ~10 and save millions of effective tokens per run. Needs manual review to define the pre-fetch query scope.
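The savings estimate can be checked with back-of-envelope arithmetic. The sketch below assumes every turn costs roughly the same number of effective tokens and that the pre-step removes the data-gathering turns outright. Both are simplifications, so treat the result as a rough estimate rather than a measurement.

```python
effective_tokens = 9_346_226  # Failure Investigator, from the analysis output
turns = 20
agentic_fraction = 0.50       # from the audit; the other half is data gathering

# Assumption: tokens are spread evenly across turns.
avg_tokens_per_turn = effective_tokens / turns
data_gathering_turns = round(turns * (1 - agentic_fraction))
estimated_saving = avg_tokens_per_turn * data_gathering_turns

print(f"avg per turn: {avg_tokens_per_turn:,.0f}")
print(f"turns removed: {data_gathering_turns}")
print(f"estimated saving: {estimated_saving:,.0f} effective tokens")
```

Under these assumptions the pre-step would remove about 10 turns and roughly 4.7 M effective tokens per run, which is in line with the "millions of effective tokens" claim.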
References
- §26922082621 — Smoke Claude (32,686 chars, 1.39 M tokens)
- §26924689485 — Failure Investigator (17,589 chars, 9.35 M eff. tokens, 20 turns)
- §26922082608 — Smoke Codex (20,451 chars, 1.59 M tokens)
Generated by 🌫️ Daily Ambient Context Optimizer · sonnet46 3M · ◷