perf(doc-maintainer): reduce per-run token usage #4765
- Rec 1: exclude AGENTS.md, CLAUDE.md, skill.md from doc pool to eliminate ~10K tokens of duplicate content per run
- Rec 2: reduce doc preview from head-80 to head-40 (~5,250 tokens saved)
- Rec 3: lower max-turns from 6 to 4 (prevents runaway turn-5/6 waste)
- Rec 4: narrow git log window in steps from 7 days to 24 hours (daily workflow; keeps check_relevant_changes job at 7 days)
- Rec 5: remove ## Context background section from system prompt
- Recompile + post-process lock file

Closes #4742
Pull request overview
Reduces token usage and runaway behavior in the doc-maintainer workflow by tightening the context it gathers and constraining the agent’s execution limits, while keeping the existing “should we run?” change-detection gate.
Changes:
- Deduplicate the root-level markdown doc pool by excluding agent meta-files (avoids loading duplicated AGENTS.md/CLAUDE.md content).
- Reduce the amount of preloaded context (head -80 → head -40) and shorten the included git diff preview.
- Lower agent max turns (6 → 4) and tighten step-level git log lookback (7 days → 24 hours) for the agent context generation.
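The duplication named in the first change can be checked locally. The sketch below assumes, as the PR description states, that AGENTS.md is a symlink to CLAUDE.md; the helper name `is_same_doc` is hypothetical and is not part of the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the workflow): succeeds when both paths
# resolve to the same underlying file, e.g. a symlink and its target.
is_same_doc() {
  # -ef compares device and inode numbers after following symlinks.
  [ "$1" -ef "$2" ]
}
```

Running `is_same_doc AGENTS.md CLAUDE.md && echo duplicate` in the repository root would flag the pair, which is why loading both costs tokens without adding information.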
| File | Description |
|---|---|
| .github/workflows/doc-maintainer.md | Adjusts the human-readable workflow source and agent prompt to reduce context size and cap turns. |
| .github/workflows/doc-maintainer.lock.yml | Regenerated lock file reflecting the new context-building logic and max-turns/maxRuns settings. |
Copilot's findings
- Files reviewed: 2/2 changed files
- Comments generated: 2
  ## Your Mission

- Review git commits from the past 7 days, identify documentation that has drifted out of sync with code, and create a PR with the necessary updates.
-
- ## Context
-
- This repository is a security-critical firewall for GitHub Copilot CLI. Accurate documentation is essential for safe usage. The documentation frequently drifts out of sync with code changes, especially:
- - Architecture changes (Docker, containers, networking, iptables)
- - CLI flag additions and modifications
- - MCP configuration changes
- - Security guidance updates
+ Review git commits from the past 24 hours, identify documentation that has drifted out of sync with code, and create a PR with the necessary updates.

  git log --since="24 hours ago" --format="=== Commit %H: %s ===" --stat -- src/ containers/ scripts/ docs/ '*.md' | head -20 > "$CONTEXT_DIR/recent-diffs.txt"

- git log --since="7 days ago" --format="%H" -- src/ containers/ scripts/ | \
+ git log --since="24 hours ago" --format="%H" -- src/ containers/ scripts/ | \
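The split between a wide change-detection gate and a narrow agent context can be sketched as follows. This is an illustration under assumptions, not the workflow's actual step: the function names `commits_touching_code` and `recent_diff_preview` are hypothetical, while the `git log` options and path list are taken from the diff above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: commit hashes that touched code paths within a window.
# $1 is a git date expression such as "7 days ago" or "24 hours ago".
commits_touching_code() {
  git log --since="$1" --format="%H" -- src/ containers/ scripts/
}

# Hypothetical helper: the short per-commit stat preview handed to the agent.
recent_diff_preview() {
  git log --since="24 hours ago" --format="=== Commit %H: %s ===" --stat \
    -- src/ containers/ scripts/ docs/ '*.md' | head -n 20
}

# Gate on the wide window; build agent context from the narrow one.
run_if_relevant() {
  if [ -z "$(commits_touching_code '7 days ago')" ]; then
    echo "no relevant changes; skipping"
    return 0
  fi
  recent_diff_preview
}
```

With this shape, a commit from three days ago still lets the run proceed through the gate, but it no longer appears in the context the agent reads.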
@copilot address review feedback
✅ Coverage Check Passed
Overall Coverage
📁 Per-file Coverage Changes (1 files)
Coverage comparison generated by
Smoke Test Results: Copilot BYOK ✅ PASS
Mode: Direct BYOK (COPILOT_PROVIDER_API_KEY) → api-proxy sidecar → api.githubcopilot.com Status: PASS | Author:
🔬 Smoke Test Results — Auth mode: PAT (COPILOT_GITHUB_TOKEN)
Overall: PASS PR: perf(doc-maintainer): reduce per-run token usage — author
🔬 Smoke Test Results
PR: perf(doc-maintainer): reduce per-run token usage Overall: FAIL (file test unverifiable —
GitHub MCP connectivity: ✅ Running in direct BYOK mode (COPILOT_PROVIDER_API_KEY + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw) Status: PASS
Smoke test results
fix(api-proxy): stop double-counting cached tokens in AI credits — ✅
Smoke Test: GitHub Actions Services Connectivity
Overall: FAIL
GitHub API: ✅ PASS Total: PASS
The doc-maintainer workflow was burning redundant tokens per run: AGENTS.md (a symlink to CLAUDE.md) was included twice in the doc pool, the doc preview was unnecessarily wide at 80 lines, and the prompt/context could be tightened without changing the 7-day change-detection gate.

Changes

- Exclude AGENTS.md, CLAUDE.md, and skill.md from the `find . -maxdepth 1` doc pool. These are agent meta-files, not project docs, and AGENTS.md is a symlink to CLAUDE.md, causing identical content to load twice.
- `head -80` → `head -40` in the Pre-load step; section headers plus opening paragraphs are sufficient for the agent to decide whether a file needs updating.
- `max-turns: 6` → `4`; observed runs used ≤4 turns, and the higher ceiling only enabled wasteful failure spirals.
- `--since="7 days ago"` → `--since="48 hours ago"` in the three `git log` calls inside `steps:` only. The check_relevant_changes job gate stays at 7 days, while the agent context now stays focused and still covers delayed daily runs.
- Remove the `## Context` background section; the agent has what it needs from the task steps alone.
- Update the scripts/ci/doc-maintainer-workflow.test.ts assertions to match the optimized workflow and recompiled lock file.

Lock file recompiled via `gh aw compile`.
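A minimal sketch of the trimmed pre-load step described above, assuming the doc pool is built from root-level `*.md` files. The function names `list_doc_pool` and `preview_docs` and the `=== file ===` header format are hypothetical; the real step in the workflow may be structured differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: root-level markdown docs, minus the agent meta-files.
list_doc_pool() {
  find . -maxdepth 1 -name '*.md' \
    ! -name 'AGENTS.md' ! -name 'CLAUDE.md' ! -name 'skill.md' \
    | sort
}

# Hypothetical helper: read file names on stdin and print the first
# 40 lines of each (head -n 40 is the POSIX spelling of head -40).
preview_docs() {
  while IFS= read -r doc; do
    printf '=== %s ===\n' "$doc"
    head -n 40 "$doc"
  done
}
```

Piping one into the other (`list_doc_pool | preview_docs`) yields at most 41 lines per document, so the preview grows with the number of project docs rather than with their length.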