[ambient-context] Daily Ambient Context Optimizer - 2026-06-15 #39453

Description

@github-actions

Executive Summary

  • 4 runs sampled (all with status: completed) from a pool of 17 eligible runs in the last 24 h
  • 4 distinct workflows covered: Copilot Opt, PR Triage Agent, Failure Investigator, Daily Safe Output Integrator
  • Median first-request size: 11,564 chars · P95: 19,966 chars (Copilot Opt)
  • Highest-leverage finding: copilot-opt.md inlines the full jqschema skill (3,032 chars of bash examples) and carries a 10,496-char main body—together they account for 68 % of the largest sampled prompt. Separately, pr-triage-agent.md imports a dead code-review config (901 chars) and runs an expensive frontier model on a read-only triage task.

Note on request source: no event-logs.jsonl was present in sandbox/firewall/logs/api-proxy-logs/ for any sampled run, and all user.message events in the agent session logs had empty content. All first-request sizes are therefore derived from prompt.txt (compilation artifact), cross-checked against the token-usage.jsonl first-entry input_tokens. Char/token ratios below 1.0 (PR Triage, Daily Safe Output Integrator) indicate that tool-schema JSON sent via the API adds tokens not visible in prompt.txt.
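
The cross-check described in this note can be sketched in a few lines. This is a minimal illustration, assuming token-usage.jsonl holds one JSON object per line and that the first entry carries an input_tokens field (the field name comes from the note; the helper name char_token_ratio and the exact file layout are assumptions, not taken from the analysis script).

```python
import json


def char_token_ratio(prompt_text: str, token_usage_jsonl: str) -> float:
    """Return len(prompt.txt) divided by input_tokens of the first usage entry."""
    # Use the first non-empty line of the JSONL payload (assumed layout).
    first_line = next(line for line in token_usage_jsonl.splitlines() if line.strip())
    input_tokens = json.loads(first_line)["input_tokens"]
    if input_tokens <= 0:
        raise ValueError("input_tokens must be positive")
    return len(prompt_text) / input_tokens


# Figures reported for the PR Triage Agent run: 12,603 chars, 16,898 input tokens.
# The second JSONL line is a made-up later entry that the helper must ignore.
ratio = char_token_ratio("x" * 12603, '{"input_tokens": 16898}\n{"input_tokens": 20000}\n')
print(f"{ratio:.3f}")
```

A ratio below 1.0 means the API request carried more tokens than prompt.txt has characters, which is what points to tool-schema JSON that is not visible in prompt.txt.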


Highest-Leverage Changes

  1. [skills / copilot-opt] Replace the direct jqschema SKILL.md import with a lightweight reference — the verbose bash examples add 3,032 chars on every run. (High)
  2. [workflow-md / copilot-opt] Slim the main body (10,496 chars, 49 headings, 18 code fences); collapse example blocks and consolidate repetitive phase descriptions. (High)
  3. [workflow-md / pr-triage-agent] Remove the pr-code-review-config.md import (901 chars / ~1,208 tokens): it is code-review configuration that is never used in a triage-only workflow. (Medium — safe immediately)
  4. [workflow-md / pr-triage-agent] Downgrade engine from claude-sonnet-4.6 to haiku or gpt-4.1-mini; triage is read-only label/classify work with no code generation. (High — needs brief validation)
  5. [agents / copilot-opt] Move the data-gathering work (92 % of turns) into deterministic pre-agent steps: entries; this eliminates redundant context re-sends on every turn. (Medium)

Key Metrics

| Metric | Value |
| --- | --- |
| Sampled runs | 4 |
| Distinct workflows | 4 |
| Median first-request chars | 11,564 |
| P95 first-request chars | 19,966 |
| Largest sampled request | Copilot Opt: 19,966 chars / 18,756 tokens |

Per-Run First-Request Metrics

| Run | Workflow | chars | input_tokens | char/tok | AIC | turns | Conclusion |
| --- | --- | --- | --- | --- | --- | --- | --- |
| §27569486853 | Copilot Opt | 19,966 | 18,756 | 1.065 | 401.0 | 1 | success |
| §27570973333 | PR Triage Agent | 12,603 | 16,898 | 0.746 | 629.8 | 1 | success |
| §27572546323 | Failure Investigator | 10,526 | 4,473 | 2.353† | 572.8 | 38 | success |
| §27572285541 | Daily Safe Output Integrator | 10,004 | 16,134 | 0.620 | 323.5 | 1 | failure |

† Failure Investigator uses the claude (Claude Code) engine; prompt.txt is significantly larger than what Claude Code sends as its first API message, so the ratio is not directly comparable.
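
The median and P95 figures can be reproduced from the four per-run character counts. The sketch below assumes a nearest-rank percentile, which matches the reported P95; the report does not state which method the analysis script uses, and nearest_rank_percentile is a hypothetical helper name.

```python
import math
import statistics


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile: the value at rank ceil(pct/100 * n) in sorted order."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# First-request sizes in chars, taken from the per-run metrics.
first_request_chars = [19966, 12603, 10526, 10004]

median_chars = statistics.median(first_request_chars)
p95_chars = nearest_rank_percentile(first_request_chars, 95)
print(median_chars, p95_chars)
```

With only four samples, the nearest-rank P95 is simply the largest run, so it mainly describes Copilot Opt rather than a tail of the distribution. The computed median is the midpoint of the two middle runs, which the report shows without the fractional part.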

Repeated Ambient Context Signals
  • Shared 1,363-char security-policy prefix identical across all 4 runs (infrastructure-injected; not actionable per workflow author).
  • <safe-outputs> block (2,578 chars) repeated verbatim in all 4 runs — infrastructure-injected.
  • reporting.md (497 chars) imported into every run that produces reports; content is short and shared, so no savings available here.
  • noop-reminder.md (411 chars) duplicated in the imports of copilot-opt and pr-triage-agent alongside the infrastructure <safe-outputs> block that already covers the same requirement — possible double-instruction overlap.
  • pr-code-review-config.md in pr-triage-agent injects 901 chars of review-tool guidance (diff fetching, comment threading) into a workflow that only applies labels and writes one issue — entirely dead context.
  • jqschema SKILL.md embeds two full worked examples with multi-step bash pipelines (3,032 chars total), the largest imported skill block observed.
  • <safe-output-tools> section in Daily Safe Output Integrator is 3,984 chars, 65 % larger than the same section in Copilot Opt (2,409 chars). This suggests the workflow's safe-output configuration exposes more tools than necessary and is worth auditing.

Deterministic Analysis Output

Script: /tmp/gh-aw/ambient-context/analyze_requests.py (stdlib only)

Top sections by line count (across sampled runs):

| Workflow | Largest section | Lines |
| --- | --- | --- |
| Copilot Opt | (preamble / system) | 73 |
| PR Triage Agent | (preamble / system) | 87 |
| Failure Investigator | (preamble / system) | 61 |
| Daily Safe Output | (preamble / system) | 73 |

The preamble (infrastructure tags + imported shared context) dominates first-request line counts in every run; the actual workflow-specific sections are relatively compact.

Keyword line density (lines mentioning keyword / total lines):

Workflow tools agents safe_outputs workflow
Copilot Opt 30 17 13 14
PR Triage Agent 19 8 13 11
Failure Investigator 19 14 12
Daily Safe Output 18 3 21

Copilot Opt has the highest agent-keyword density, consistent with the 49 headings and multi-phase orchestration description.

Duplicate line ratios: Copilot Opt 0.048 (highest), all others 0.000. The duplicates in Copilot Opt stem from repeated bash command patterns in the jqschema skill examples.
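
A duplicate line ratio of this kind can be computed as follows. This is a sketch under stated assumptions: lines are stripped, blank lines are ignored, and every occurrence after the first counts as a duplicate. The report does not give the script's exact definition, and duplicate_line_ratio is a hypothetical name.

```python
from collections import Counter


def duplicate_line_ratio(text: str) -> float:
    """Share of non-blank lines that repeat an earlier non-blank line."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return 0.0
    counts = Counter(lines)
    # Each occurrence beyond the first is counted as one duplicate.
    duplicates = sum(n - 1 for n in counts.values())
    return duplicates / len(lines)


# Made-up fragment with one repeated bash line, for illustration only.
sample_prompt = "\n".join([
    "echo '{}' | ./jqschema.sh",
    "cat data.json",
    "echo '{}' | ./jqschema.sh",
    "done",
])
print(duplicate_line_ratio(sample_prompt))
```

Under this definition, repeated command patterns in worked examples raise the ratio directly, which is consistent with the jqschema explanation above.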


Recommendations by Category

Workflow Markdown

  1. pr-triage-agent.md — remove pr-code-review-config.md import (Medium, safe immediately)

    • The pr-review-base.md import pulls in pr-code-review-config.md (901 chars), which documents how to fetch diffs and submit code reviews. PR Triage only applies labels and creates one report issue; it never calls review tools.
    • Removing this import saves ~1,208 tokens on every scheduled run (6 h cadence → ~4 runs/day).
    • Also confirm whether the pr-review-base.md import itself is needed, or if only the pr-data portion is required.
  2. pr-triage-agent.md — downgrade engine model (High, needs brief validation)

    • The workflow runs claude-sonnet-4.6 for a read-only label/classify/issue-create task. The audit flagged it as a model-downgrade candidate.
    • Set engine.model: claude-haiku-4-5 (or gpt-4.1-mini) in the frontmatter. Estimated cost reduction: 8–10× per run at similar quality for structured triage.
  3. copilot-opt.md — slim the main body (High, needs review)

    • 10,496 chars with 49 headings and 18 code fences is the single largest workflow body in the sample. The verbose Phase 0–3 descriptions and multi-step bash examples can be condensed.
    • Target: reduce to ≤6,000 chars by collapsing example code into one-liner hints, merging phases that share a single decision point, and moving static background into a separate imported stub loaded only when needed.

Skills

  1. copilot-opt.md — replace jqschema SKILL.md inline import with a short reference (High, safe immediately)
    • The import embeds 3,032 chars of usage narrative and bash examples (two full worked workflows with echo + pipeline steps). The skill's core value is in the jqschema.sh script path; the narrative examples can be cut to a 3-line stub: path, input format, output format.
    • Alternatively, load the skill only when a phase requires schema discovery (not in the system preamble).
    • Removing verbose examples saves ~2,400–2,700 chars from every Copilot Opt first request.
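
Taking the main-body target from the Workflow Markdown recommendations together with the jqschema savings estimated here, the effect on the largest sampled prompt can be worked out directly. The sketch assumes the two savings do not overlap (the skill import is counted separately from the main body) and that all other prompt content stays unchanged; the variable names are illustrative.

```python
LARGEST_PROMPT_CHARS = 19_966        # Copilot Opt first request
MAIN_BODY_CHARS = 10_496             # current copilot-opt.md body
MAIN_BODY_TARGET_CHARS = 6_000       # upper bound of the slimming target
SKILL_SAVINGS_CHARS = (2_400, 2_700) # estimated range from trimming jqschema examples

# Savings from slimming the body to its target size.
body_savings = MAIN_BODY_CHARS - MAIN_BODY_TARGET_CHARS

# Projected first-request size for the low and high skill-savings estimates.
projected_chars = [
    LARGEST_PROMPT_CHARS - body_savings - skill_savings
    for skill_savings in SKILL_SAVINGS_CHARS
]

for size in projected_chars:
    reduction = 1 - size / LARGEST_PROMPT_CHARS
    print(f"{size} chars ({reduction:.1%} smaller)")
```

These are projections from the report's own estimates, not measurements; a follow-up run would be needed to confirm the actual first-request size.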

Agents

  1. copilot-opt.md — move data-fetching to deterministic steps: (Medium, needs design)
    • The audit found 92 % of turns are data-gathering (session log reads, PR fetches, file loads). These are deterministic and can run as pre-agent steps: entries that write their output to /tmp/gh-aw/agent/.
    • The workflow already imports shared/copilot-session-data-fetch.md and shared/copilot-pr-data-fetch.md — confirm these are being executed as pre-agent steps (not as runtime-imported LLM instructions). If they are LLM-directed at runtime, move them to deterministic steps: in the frontmatter.
    • This reduces multi-turn context growth and eliminates re-sending the full prompt on each data-gathering step.

References

Generated by 🌫️ Daily Ambient Context Optimizer · 1.6K AIC · ⌖ 14.5 AIC · ⊞ 21.9K ·

  • expires on Jun 22, 2026, 12:57 PM UTC-08:00
