110 changes: 110 additions & 0 deletions .github/aw/optimize-agentic-workflow.md
@@ -0,0 +1,110 @@
---
description: Analyze and reduce token consumption in agentic workflows — audit-based measurement, DataOps, gh-proxy, sub-agents, and prompt optimization.
disable-model-invocation: true
---

# Agentic Workflow Token Optimizer

Help users reduce the AI token usage and cost of GitHub Agentic Workflows in this repository.

## Load These References First

- [github-agentic-workflows.md](github-agentic-workflows.md)
- [token-optimization.md](token-optimization.md)
- [workflow-editing.md](workflow-editing.md)
- [syntax.md](syntax.md)

Load these only when relevant:

- [experiments.md](experiments.md)
- [safe-outputs.md](safe-outputs.md)

## Available Commands

```bash
gh aw audit <run-id> --json
gh aw audit <base-run-id> <optimized-run-id>
gh aw logs <workflow-name> --json
gh aw compile <workflow-name>
gh aw status
```

## Start the Conversation

Ask for one of these inputs:

- a workflow run URL (or run ID) to analyze
- a workflow name to review the source
- the guardrail that was exceeded (max-ai-credits, max-daily-ai-credits, max-tool-denials, max-turns / timeout)

## Fast Path: Run URL Provided

If the user gives a GitHub Actions run URL:

1. Extract the run ID
2. Run `gh aw audit <run-id> --json`
3. Inspect `agent_usage.aic`, `agent_usage.input_tokens`, `agent_usage.output_tokens`, `agent_usage.cache_read_tokens`
4. Identify the most expensive phases before asking additional questions
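
The fast path above can be sketched in JavaScript roughly as follows. This is a minimal sketch and not part of gh-aw: `extractRunId` and `summarizeAgentUsage` are hypothetical helper names, the sample values are invented, and only the four `agent_usage` field names are taken from the steps above. The rest of the `gh aw audit --json` output shape is not assumed.

```javascript
// Hypothetical helpers; only the agent_usage field names come from the steps above.

/** Extract the numeric run ID from a GitHub Actions run URL, or return null. */
function extractRunId(runUrl) {
  const match = /\/actions\/runs\/(\d+)/.exec(runUrl);
  return match ? match[1] : null;
}

/** Pull the four usage fields out of parsed audit JSON, defaulting missing values to 0. */
function summarizeAgentUsage(audit) {
  const usage = (audit && audit.agent_usage) || {};
  return {
    aic: usage.aic ?? 0,
    inputTokens: usage.input_tokens ?? 0,
    outputTokens: usage.output_tokens ?? 0,
    cacheReadTokens: usage.cache_read_tokens ?? 0,
  };
}

// Illustrative values only, not real audit output.
const sampleAudit = {
  agent_usage: { aic: 12.5, input_tokens: 180000, output_tokens: 4200, cache_read_tokens: 95000 },
};
const runId = extractRunId("https://github.com/octo-org/octo-repo/actions/runs/1234567890");
console.log(runId, summarizeAgentUsage(sampleAudit));
```

Returning `null` for an unrecognized URL lets the caller fall back to asking the user for a run ID instead of passing a bad value to `gh aw audit`.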

## Guardrail-Specific Entry Points

### `max-ai-credits` exceeded

The workflow was stopped because it consumed more AI Credits than the configured per-run budget.

Priority checks:
1. Which tool calls dominated token usage? (`token-usage.jsonl`)
2. Is the prompt front-loading large payloads that could be fetched on demand?
3. Are there repetitive extraction steps that sub-agents could handle cheaply?
4. Does the frontier model handle tasks that a small model could do?

### `max-daily-ai-credits` exceeded

The workflow is being blocked because its 24-hour AI Credits budget is exhausted.

Priority checks:
1. What is the run cadence? (scheduled too frequently?)
2. Does the workflow use cheap triage before escalating to the frontier model?
3. Is batching or caching applicable to reduce run frequency?
4. Are there noop early-exits for events that do not require agent action?

### `max-tool-denials` exceeded

The Copilot SDK hit the tool-denial threshold, indicating the prompt attempted actions outside the allowed tool policy.

Priority checks:
1. What tool was repeatedly denied? (last denied reason in the failure issue)
2. Is the tool missing from the workflow's permissions/firewall config?
3. Can the prompt be revised to avoid the denied operation entirely?
4. Would a DataOps pre-step satisfy the data need without a tool call?

### Timeout / `max-turns` exceeded

The agent ran out of time or turns before completing the task.

Priority checks:
1. Is the task decomposable into smaller, faster sub-tasks?
2. Are there long-running tool calls that could be replaced with DataOps pre-steps?
3. Is the prompt asking the agent to do too much in one run?
4. Can `max-turns` or `timeout-minutes` be raised, or should the task be split?

## Optimization Analysis Plan

After measuring token usage, produce a prioritized plan:

1. **Measure**: run `gh aw audit <run-id> --json` and summarize AI Credits and the per-call token breakdown
2. **Identify top cost drivers**: list the three most expensive phases/tool calls
3. **Apply quick wins first**: DataOps pre-steps, `gh-proxy`, `cli-proxy`, prompt trimming
4. **Sub-agent delegation**: identify repetitive per-item loops suitable for small-model workers
5. **Prompt caching**: verify stable context appears before dynamic content
6. **Experiment**: add an `experiments:` entry with `metric: "aic"` to measure the change
7. **Validate quality**: confirm the optimized run produces equivalent safe outputs

Present the plan clearly before making any edits. Confirm with the user before applying changes.
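
Step 2 of the plan above (listing the three most expensive phases/tool calls) could be sketched as below. This is an illustration under stated assumptions: `topCostDrivers` is a hypothetical helper, and the `{ name, tokens }` record shape is invented for the example. The actual layout of `token-usage.jsonl` or of the audit output is not described here, so real records would need to be mapped into this shape first.

```javascript
// Hypothetical record shape: { name: string, tokens: number }.
// Sums tokens per name, then returns the `limit` largest totals in descending order.
function topCostDrivers(records, limit = 3) {
  const totals = new Map();
  for (const { name, tokens } of records) {
    totals.set(name, (totals.get(name) || 0) + tokens);
  }
  return [...totals.entries()]
    .map(([name, tokens]) => ({ name, tokens }))
    .sort((a, b) => b.tokens - a.tokens)
    .slice(0, limit);
}

// Invented sample data for illustration.
const sampleRecords = [
  { name: "list_issues", tokens: 1000 },
  { name: "get_file", tokens: 5000 },
  { name: "list_issues", tokens: 10000 },
  { name: "search_code", tokens: 2000 },
  { name: "create_comment", tokens: 3000 },
];
console.log(topCostDrivers(sampleRecords));
```

Aggregating by name before sorting matters because a cheap call repeated many times can cost more than a single large one.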

## Editing Workflow

1. Edit `.github/workflows/<workflow-name>.md`
2. Recompile: `gh aw compile <workflow-name>`
3. Commit both the source and the generated `.lock.yml`
4. Report the estimated savings and link to the PR or commit
1 change: 1 addition & 0 deletions .github/skills/agentic-workflows/SKILL.md
@@ -32,6 +32,7 @@ Load these files from `github/gh-aw` (they are not available locally).
- `.github/aw/memory.md`
- `.github/aw/messages.md`
- `.github/aw/network.md`
- `.github/aw/optimize-agentic-workflow.md`
- `.github/aw/patterns.md`
- `.github/aw/pr-reviewer.md`
- `.github/aw/report.md`
112 changes: 112 additions & 0 deletions .github/skills/optimize-agentic-workflow/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,112 @@
---
name: optimize-agentic-workflow
description: Analyze and reduce token consumption in agentic workflows — guardrail-specific entry points, measurement, and optimization techniques.
---

# Agentic Workflow Token Optimizer

Help users reduce the AI token usage and cost of GitHub Agentic Workflows in this repository.

## Load These References First

Load these files from `github/gh-aw` (they are not available locally).

- `.github/aw/github-agentic-workflows.md`
- `.github/aw/token-optimization.md`
- `.github/aw/workflow-editing.md`
- `.github/aw/syntax.md`

Load these only when relevant:

- `.github/aw/experiments.md`
- `.github/aw/safe-outputs.md`

## Available Commands

```bash
gh aw audit <run-id> --json
gh aw audit <base-run-id> <optimized-run-id>
gh aw logs <workflow-name> --json
gh aw compile <workflow-name>
gh aw status
```

## Start the Conversation

Ask for one of these inputs:

- a workflow run URL (or run ID) to analyze
- a workflow name to review the source
- the guardrail that was exceeded (max-ai-credits, max-daily-ai-credits, max-tool-denials, max-turns / timeout)

## Fast Path: Run URL Provided

If the user gives a GitHub Actions run URL:

1. Extract the run ID
2. Run `gh aw audit <run-id> --json`
3. Inspect `agent_usage.aic`, `agent_usage.input_tokens`, `agent_usage.output_tokens`, `agent_usage.cache_read_tokens`
4. Identify the most expensive phases before asking additional questions

## Guardrail-Specific Entry Points

### `max-ai-credits` exceeded

The workflow was stopped because it consumed more AI Credits than the configured per-run budget.

Priority checks:
1. Which tool calls dominated token usage? (`token-usage.jsonl`)
2. Is the prompt front-loading large payloads that could be fetched on demand?
3. Are there repetitive extraction steps that sub-agents could handle cheaply?
4. Does the frontier model handle tasks that a small model could do?

### `max-daily-ai-credits` exceeded

The workflow is being blocked because its 24-hour AI Credits budget is exhausted.

Priority checks:
1. What is the run cadence? (scheduled too frequently?)
2. Does the workflow use cheap triage before escalating to the frontier model?
3. Is batching or caching applicable to reduce run frequency?
4. Are there noop early-exits for events that do not require agent action?
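
The run-cadence check above comes down to simple arithmetic: runs per day times average AI Credits per run, compared with the daily budget. The sketch below is a worked example with invented numbers. `estimateDailyCredits` and `maxRunsWithinBudget` are hypothetical helper names, not gh-aw functions.

```javascript
// Hypothetical helper names and numbers; none of these values come from a real workflow.

/** Estimated AI Credits consumed per day at a given cadence. */
function estimateDailyCredits(runsPerDay, avgAicPerRun) {
  return runsPerDay * avgAicPerRun;
}

/** Largest whole number of runs per day that fits within the daily budget. */
function maxRunsWithinBudget(dailyBudget, avgAicPerRun) {
  if (avgAicPerRun <= 0) {
    throw new RangeError("avgAicPerRun must be positive");
  }
  return Math.floor(dailyBudget / avgAicPerRun);
}

const runsPerDay = 48; // e.g. a schedule that fires every 30 minutes
const avgAicPerRun = 2.5;
const dailyBudget = 100;

console.log(estimateDailyCredits(runsPerDay, avgAicPerRun) > dailyBudget); // true: 120 > 100
console.log(maxRunsWithinBudget(dailyBudget, avgAicPerRun)); // 40
```

With these example numbers, either the cadence drops to 40 runs per day or the average cost per run has to fall, which is where cheap triage and noop early-exits come in.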

### `max-tool-denials` exceeded

The Copilot SDK hit the tool-denial threshold, indicating the prompt attempted actions outside the allowed tool policy.

Priority checks:
1. What tool was repeatedly denied? (last denied reason in the failure issue)
2. Is the tool missing from the workflow's permissions/firewall config?
3. Can the prompt be revised to avoid the denied operation entirely?
4. Would a DataOps pre-step satisfy the data need without a tool call?

### Timeout / `max-turns` exceeded

The agent ran out of time or turns before completing the task.

Priority checks:
1. Is the task decomposable into smaller, faster sub-tasks?
2. Are there long-running tool calls that could be replaced with DataOps pre-steps?
3. Is the prompt asking the agent to do too much in one run?
4. Can `max-turns` or `timeout-minutes` be raised, or should the task be split?

## Optimization Analysis Plan

After measuring token usage, produce a prioritized plan:

1. **Measure** — run `gh aw audit <run-id> --json` and summarize AI Credits and per-call token breakdown
2. **Identify top cost drivers** — list the three most expensive phases/tool calls
3. **Apply quick wins first** — DataOps pre-steps, `gh-proxy`, `cli-proxy`, prompt trimming
4. **Sub-agent delegation** — identify repetitive per-item loops suitable for small-model workers
5. **Prompt caching** — verify stable context appears before dynamic content
6. **Experiment** — add an `experiments:` entry with `metric: "aic"` to measure the change
7. **Validate quality** — confirm the optimized run produces equivalent safe outputs

Present the plan clearly before making any edits. Confirm with the user before applying changes.
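
When reporting the effect of an optimization (steps 6 and 7 above), the saving can be expressed as an absolute and a percentage drop in `agent_usage.aic` between a base run and an optimized run. The following is a minimal sketch with invented values. `aicSavings` is a hypothetical helper, and it assumes both audit objects expose `agent_usage.aic` as a number.

```javascript
// Hypothetical helper; assumes each parsed audit object has a numeric agent_usage.aic.
function aicSavings(baseAudit, optimizedAudit) {
  const base = baseAudit.agent_usage.aic;
  const optimized = optimizedAudit.agent_usage.aic;
  if (!(base > 0)) {
    throw new RangeError("base run must have a positive aic value");
  }
  const absolute = base - optimized;
  return { absolute, percent: (absolute / base) * 100 };
}

// Invented sample values for illustration.
const baseRun = { agent_usage: { aic: 20 } };
const optimizedRun = { agent_usage: { aic: 15 } };
console.log(aicSavings(baseRun, optimizedRun)); // { absolute: 5, percent: 25 }
```

A negative `percent` means the change made the run more expensive, which is worth surfacing before validating output quality.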

## Editing Workflow

1. Edit `.github/workflows/<workflow-name>.md`
2. Recompile: `gh aw compile <workflow-name>`
3. Commit both the source and the generated `.lock.yml`
4. Report the estimated savings and link to the PR or commit
39 changes: 39 additions & 0 deletions actions/setup/js/handle_agent_failure.cjs
@@ -1665,6 +1665,40 @@ function buildDailyAICExceededContext(hasDailyAICExceeded, totalAIC, threshold)
);
}

/**
 * Build the "Optimize token consumption" details section for the failure issue when a guardrail
 * limit was the root cause of the failure.
 *
 * Guardrails that trigger this section:
 * - max-ai-credits: per-run AI Credits budget exceeded
 * - max-daily-ai-credits: 24-hour per-workflow AI Credits quota exhausted
 * - max-tool-denials: Copilot SDK tool-denial threshold hit
 * - max-turns / timeout: agent ran out of turns or wall-clock time
 *
 * @param {object} options
 * @param {boolean} options.maxAICreditsExceeded - max-ai-credits guardrail triggered
 * @param {boolean} options.hasDailyAICExceeded - max-daily-ai-credits guardrail triggered
 * @param {boolean} options.hasToolDenialsExceeded - max-tool-denials guardrail triggered
 * @param {boolean} options.isTimedOut - timeout / max-turns guardrail triggered
 * @param {string} options.runUrl - URL to the failed workflow run
 * @returns {string} Rendered section or empty string when no guardrail was triggered
 */
function buildOptimizeTokenConsumptionContext({ maxAICreditsExceeded, hasDailyAICExceeded, hasToolDenialsExceeded, isTimedOut, runUrl }) {
  const guardrailTriggered = maxAICreditsExceeded || hasDailyAICExceeded || hasToolDenialsExceeded || isTimedOut;
  if (!guardrailTriggered) {
    return "";
  }

  let guardrailName = "guardrail limit";
  if (maxAICreditsExceeded) guardrailName = "max-ai-credits";
  else if (hasDailyAICExceeded) guardrailName = "max-daily-ai-credits";
  else if (hasToolDenialsExceeded) guardrailName = "max-tool-denials";
  else if (isTimedOut) guardrailName = "max-turns / timeout";

  const templatePath = getPromptPath("optimize_token_consumption_context.md");
  return renderTemplateFromFile(templatePath, { guardrail_name: guardrailName, run_url: runUrl });
}
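
When more than one guardrail flag is set, the `if`/`else if` chain above reports only the first match, so `max-ai-credits` takes precedence over the others. The standalone sketch below restates that selection logic so the precedence can be checked in isolation. `pickGuardrailName` is a hypothetical name used for illustration; it is not a function in `handle_agent_failure.cjs`, and it omits the template rendering.

```javascript
// Standalone restatement of the guardrail-name selection, for illustration only.
// pickGuardrailName is hypothetical; the real function goes on to render a template.
function pickGuardrailName({ maxAICreditsExceeded, hasDailyAICExceeded, hasToolDenialsExceeded, isTimedOut }) {
  if (maxAICreditsExceeded) return "max-ai-credits";
  if (hasDailyAICExceeded) return "max-daily-ai-credits";
  if (hasToolDenialsExceeded) return "max-tool-denials";
  if (isTimedOut) return "max-turns / timeout";
  return null; // no guardrail triggered; the real function returns "" in this case
}

// A run that both exceeded its per-run budget and timed out is labeled by the budget guardrail.
console.log(pickGuardrailName({ maxAICreditsExceeded: true, isTimedOut: true })); // "max-ai-credits"
console.log(pickGuardrailName({ hasToolDenialsExceeded: true, isTimedOut: true })); // "max-tool-denials"
```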

// Maps engine ID (GH_AW_ENGINE_ID) to credential name for use with GH_AW_ENGINE_API_HOSTS.
const ENGINE_ID_TO_CREDENTIAL = /** @type {Record<string, string>} */ {
copilot: "`COPILOT_GITHUB_TOKEN`",
@@ -3116,6 +3150,9 @@ async function main() {
  // Build credential auth error context (firewall audit.jsonl 401/403 from provider endpoints)
  const credentialAuthErrorContext = buildCredentialAuthErrorContext();

  // Build optimize token consumption context (shown when a guardrail was the failure root cause)
  const optimizeTokenConsumptionContext = buildOptimizeTokenConsumptionContext({ maxAICreditsExceeded, hasDailyAICExceeded, hasToolDenialsExceeded, isTimedOut, runUrl });

  // Create template context with sanitized workflow name
  const templateContext = {
    workflow_name: sanitizedWorkflowName,
@@ -3151,6 +3188,7 @@ async function main() {
    lockdown_check_failed_context: lockdownCheckFailedContext,
    stale_lock_file_failed_context: staleLockFileFailedContext,
    daily_ai_credits_exceeded_context: dailyAICExceededContext,
    optimize_token_consumption_context: optimizeTokenConsumptionContext,
  };

  // Render the issue template
@@ -3234,6 +3272,7 @@ module.exports = {
  buildLockdownCheckFailedContext,
  buildStaleLockFileFailedContext,
  buildDailyAICExceededContext,
  buildOptimizeTokenConsumptionContext,
  buildTimeoutContext,
  shouldBuildEngineFailureContext,
  isIssueWritePermissionError,