feat: forecast — context-aware downloads, no workflow logs, partial results on timeout, enriched report with repo aggregate by Copilot · Pull Request #37457 · github/gh-aw

Copilot · 2026-06-07T01:03:35Z

The forecast command downloaded workflow run logs (unnecessary for AIC computation), ignored the timeout context in artifact downloads, discarded all collected results when a timeout fired, and produced a report with insufficient cost detail.

Changes

Skip workflow log downloads — Introduces forecastDownloadUsageArtifact as a focused replacement for the general-purpose downloadRunArtifacts. It lists artifacts, downloads only the matching usage artifact, and returns ErrNoArtifacts immediately when none is found — no diagnostic log download fallback.
Context propagation — loadCachedRunAIC now accepts context.Context and forwards it through to artifact downloads. Previously used context.Background(), so downloads ran to completion even after the configured --timeout fired.
Partial results on timeout/cancel — emitPartialForecastResults emits JSON/table output for all workflows processed before the interrupt, then returns the timeout exit code. Previously all progress was silently discarded.
Structured logging — loadCachedRunAIC and forecastDownloadUsageArtifact now log cache hits, cache misses, download start/completion, artifact counts, and context interruptions via forecastRunLog.
Workflow YAML resilience — On exit 124 (CLI timeout), the step now checks whether report.json has content. With partial results: ::warning:: + continue (cache saves, issue is created with partial data). Without any results: ::error:: + exit 1 as before.
Enriched forecast report — The CLI table and GitHub issue now show P50/Run (per-run median AIC), P95/Run (95th-percentile per-run AIC), Weekly (P50) and Monthly (P50) projected totals, and a TOTAL row across all workflows. A detailed run-samples section lists every sampled run with its ID, date, and raw AIC for human review (collapsible <details> block in the issue body). The "increase the horizon" tip is removed.
Repo-level aggregate — ForecastResult now carries a totals field (weekly_p50, monthly_p50) that sums the weekly and monthly P50 projections across all workflows. This is computed by the new computeForecastTotals helper (preferring Monte Carlo P50 over point estimates) and is always present in report.json, so consumers never need to aggregate individual workflow entries themselves. The CLI table TOTAL row and the GitHub issue TOTAL row both use this pre-computed value.

…loading - Add forecastDownloadUsageArtifact: downloads only the usage artifact without workflow run log downloads (not needed for AIC computation) - Thread context.Context through loadCachedRunAIC so downloads respect the forecast timeout instead of using context.Background() - Produce partial forecast results on timeout or user cancellation instead of discarding all progress — results collected so far are emitted before the timeout error is returned - Add structured logging throughout loadCachedRunAIC: cache hits, cache misses, download start/end, artifact counts, and interruption - Update workflow YAML: on timeout with partial results, emit a warning and continue (instead of always failing with exit 1) Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

…rtialForecastResults Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

pelikhan · 2026-06-07T01:23:42Z

@copilot Check the compiler code and ensure it generates a step in the conclusion job to create the usage artifact.

pelikhan · 2026-06-07T01:27:20Z

@copilot

the forecast report must provide PFC and P95 for AI credit costs per run, as well as weekly and monthly totals
provide a detailed section listing all reports used in the computation. It should also include raw AI credit consumption usage numbers for human review.
remove the tip about increasing the horizon.

Copilot · 2026-06-07T01:31:51Z

@copilot Check the compiler code and ensure it generates a step in the conclusion job to create the usage artifact.

The compiler already generates usage artifact steps in the conclusion job. In notify_comment.go, buildConclusionJob calls buildUsageArtifactUploadSteps (line 72), which emits two steps:

"Collect usage artifact files" — copies aw_info.json and token_usage.jsonl files (from both agent and detection sandboxes) into /tmp/gh-aw/usage/.
"Upload usage artifact" — uploads the collected files as the usage (or <prefix>usage in workflow_call context) artifact using actions/upload-artifact.

Both steps run with if: always() and continue-on-error: true. The TestConclusionJobIncludesUsageArtifactSteps test (and related prefix tests) verify this behaviour and all pass.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

… to forecast table Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

…run samples section, remove horizon tip Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot · 2026-06-07T01:50:21Z

@copilot

the forecast report must provide PFC and P95 for AI credit costs per run, as well as weekly and monthly totals...

Done in commits e4e5aa5 and 1bd0d5c. Changes:

P50/Run & P95/Run — per-run median and 95th-percentile AIC computed from sorted sampled-run observations, shown in both the CLI table and the GitHub issue.
Weekly & Monthly totals — always computed (regardless of --period) via separate Monte Carlo simulations. CLI table and issue body both show Weekly (P50) and Monthly (P50) columns, with a TOTAL row when multiple workflows are present.
Detailed run samples section — CLI prints a per-workflow breakdown table (Run ID, Date, AIC). Issue body renders the same data as a collapsible <details> block.
Horizon tip removed — zeroProjectedTip and its {zero_projected_tip} template placeholder are gone; the forecast_issue.md template now uses {run_samples_section} in that slot.

pelikhan · 2026-06-07T02:04:36Z

@copilot, the forecast should also provide an aggregate of all forecasts so we know the total forecast for the repo.

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR expands the forecast output to include per-run percentile metrics, weekly/monthly projections, and detailed run samples, while optimizing artifact downloads for forecasting and emitting partial results on interruption.

Changes:

Add per-run AIC samples, P50/P95 per-run metrics, and weekly/monthly projections (incl. Monte Carlo summaries) to forecast results and table output.
Replace forecast artifact downloading with a usage-artifact-only implementation and improve cancellation/timeout behavior by emitting partial results.
Update GitHub Action issue template + JS generator/tests to render the new summary columns and a run-samples details section.

Show a summary per file

File	Description
pkg/cli/forecast.go	Adds new forecast result fields (percentiles, weekly/monthly projections, run samples), partial-result emission, and usage-only artifact downloader.
pkg/cli/forecast_test.go	Updates mocks and callsites for new `loadCachedRunAIC(ctx, ...)` signature.
actions/setup/md/forecast_issue.md	Switches issue template placeholder from zero-tip to run-samples section.
actions/setup/js/create_forecast_issue.cjs	Updates issue body generator to render new columns, totals row, and run-samples details block; keeps legacy fallback.
actions/setup/js/create_forecast_issue.test.cjs	Updates existing expectations for new table format and adds tests for run samples + TOTAL row.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Files reviewed: 5/5 changed files
Comments generated: 5

+  // Legacy fallback: derive weekly/monthly from the configured-period P50 when new fields are absent.
+  const hasNewFields = workflows.some(w => w?.p50_aic_per_run != null || w?.weekly_projected_aic != null);
+  const legacyRows = hasNewFields
+    ? null
+    : workflows.map(workflow => {
+        const p50 = workflow?.monte_carlo?.p50_projected_aic ?? workflow?.projected_aic ?? workflow?.monte_carlo?.p50_projected_effective_tokens ?? workflow?.projected_effective_tokens ?? 0;
+        return [escapeCell(workflow.workflow_id), workflow.sampled_runs ?? 0, Number(p50)];
+      });


+func loadCachedRunAIC(ctx context.Context, runID int64, verbose bool) float64 {
 	dir := filepath.Join(defaultLogsOutputDir, fmt.Sprintf("run-%d", runID))
 	summary, ok := loadRunSummary(dir, verbose)
 	if ok && summary != nil && summary.TokenUsage != nil && summary.TokenUsage.TotalAIC > 0 {
+		forecastRunLog.Printf("AIC cache hit for run %d: aic=%.3f (from run_summary.json)", runID, summary.TokenUsage.TotalAIC)
 		return summary.TokenUsage.TotalAIC
 	}


+	// Sort partial results by Monte Carlo P50 descending (mirrors the full-results sort).
+	sort.Slice(results, func(i, j int) bool {
+		pi := results[i].ProjectedAIC
+		if mc := results[i].MonteCarlo; mc != nil {
+			pi = mc.P50ProjectedAIC
+		}
+		pj := results[j].ProjectedAIC
+		if mc := results[j].MonteCarlo; mc != nil {
+			pj = mc.P50ProjectedAIC
+		}
+		return pi > pj
+	})


+	// Compute P50 and P95 of individual run AIC (per-run percentiles, not period totals).
+	sortedAIC := make([]int, len(aicObservations))
+	copy(sortedAIC, aicObservations)
+	sort.Ints(sortedAIC)
+	result.P50AIC = roundForecastAIC(float64(percentileInt(sortedAIC, 50)) / 1000)
+	result.P95AIC = roundForecastAIC(float64(percentileInt(sortedAIC, 95)) / 1000)
+
 	// Compute observed run frequency: runs per calendar day over the history window,
 	// scaled to the projection period.
-	result.ObservedRunsPerPeriod = float64(n) / float64(config.Days) * float64(periodDays)
+	observedRunsPerDay := float64(n) / float64(config.Days)
+	result.ObservedRunsPerPeriod = observedRunsPerDay * float64(periodDays)
+
+	// Point estimates for weekly (7-day) and monthly (30-day) projections.
+	weeklyRuns := observedRunsPerDay * 7
+	monthlyRuns := observedRunsPerDay * 30
+	result.WeeklyProjectedAIC = roundForecastAIC(weeklyRuns * result.AvgAIC)
+	result.MonthlyProjectedAIC = roundForecastAIC(monthlyRuns * result.AvgAIC)


+  for (const wf of workflows) {
+    const samples = Array.isArray(wf?.run_samples) ? wf.run_samples : [];
+    for (const s of samples) {
+      const runID = s?.run_id ?? "";
+      const date = s?.date ?? "";
+      const aic = formatAIC(s?.aic ?? 0);
+      lines.push(`| ${escapeCell(wf.workflow_id)} | #${runID} | ${date} | ${aic} |`);
+    }
+  }


Copilot · 2026-06-07T02:17:16Z

@copilot, the forecast should also provide an aggregate of all forecasts so we know the total forecast for the repo.

Done in the latest commits. The ForecastResult struct now carries a Totals *ForecastTotals field ({"weekly_p50": ..., "monthly_p50": ...}) that sums the weekly and monthly P50 projections across all workflows — using Monte Carlo P50 when available, falling back to the point estimate. This is computed by the new computeForecastTotals helper and is populated in both the normal and partial-results paths, so report.json always exposes the repo-level aggregate directly. The existing TOTAL row in the CLI table and GitHub issue now uses the same pre-computed value. create_forecast_issue.cjs also prefers report.totals when present.

Copilot AI and others added 2 commits June 7, 2026 00:58

docs: improve docstrings for forecastDownloadUsageArtifact and emitPa…

45cf56a

…rtialForecastResults Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI assigned Copilot and pelikhan Jun 7, 2026

Copilot AI changed the title ~~feat: improve forecast command resilience, logging, and artifact downloading~~ feat: forecast — context-aware downloads, no workflow logs, partial results on timeout Jun 7, 2026

Copilot AI requested a review from pelikhan June 7, 2026 01:04

Copilot created this pull request from a session on behalf of pelikhan June 7, 2026 01:04 View session

Copilot finished work on behalf of pelikhan June 7, 2026 01:04

Copilot started work on behalf of pelikhan June 7, 2026 01:23 View session

github-actions Bot mentioned this pull request Jun 7, 2026

[PR Triage Report] PR Triage Report — 2026-06-07 #37460

Closed

Apply remaining changes

efa945e

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot finished work on behalf of pelikhan June 7, 2026 01:32

Copilot started work on behalf of pelikhan June 7, 2026 01:32 View session

github-actions Bot mentioned this pull request Jun 7, 2026

[Contribution Check Report] Contribution Check — 2026-06-07 #37465

Closed

Copilot AI and others added 2 commits June 7, 2026 01:44

feat: add P50/P95 per-run, weekly/monthly totals, run samples section…

33b45b8

… to forecast table Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

feat: update forecast issue template — P50/P95/weekly/monthly table, …

d469064

…run samples section, remove horizon tip Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI changed the title ~~feat: forecast — context-aware downloads, no workflow logs, partial results on timeout~~ feat: forecast — context-aware downloads, no workflow logs, partial results on timeout, enriched report Jun 7, 2026

Copilot finished work on behalf of pelikhan June 7, 2026 01:50

Copilot started work on behalf of pelikhan June 7, 2026 02:04 View session

pelikhan marked this pull request as ready for review June 7, 2026 02:06

Copilot AI review requested due to automatic review settings June 7, 2026 02:06

pelikhan merged commit 881b5db into main Jun 7, 2026
15 checks passed

pelikhan deleted the copilot/improve-forecast-command branch June 7, 2026 02:06

Copilot AI reviewed Jun 7, 2026

View reviewed changes

Copilot AI restored the copilot/improve-forecast-command branch June 7, 2026 02:11

Copilot AI changed the title ~~feat: forecast — context-aware downloads, no workflow logs, partial results on timeout, enriched report~~ feat: forecast — context-aware downloads, no workflow logs, partial results on timeout, enriched report with repo aggregate Jun 7, 2026

Copilot finished work on behalf of pelikhan June 7, 2026 02:17

github-actions Bot mentioned this pull request Jun 7, 2026

[aw] No-Op Runs #37456

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: forecast — context-aware downloads, no workflow logs, partial results on timeout, enriched report with repo aggregate#37457

feat: forecast — context-aware downloads, no workflow logs, partial results on timeout, enriched report with repo aggregate#37457
pelikhan merged 5 commits into
mainfrom
copilot/improve-forecast-command

Copilot AI commented Jun 7, 2026 •

edited

Loading

Uh oh!

pelikhan commented Jun 7, 2026

Uh oh!

pelikhan commented Jun 7, 2026

Uh oh!

Copilot AI commented Jun 7, 2026

Uh oh!

Copilot AI commented Jun 7, 2026

Uh oh!

pelikhan commented Jun 7, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI commented Jun 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

Copilot AI commented Jun 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Uh oh!

pelikhan commented Jun 7, 2026

Uh oh!

pelikhan commented Jun 7, 2026

Uh oh!

Copilot AI commented Jun 7, 2026

Uh oh!

Copilot AI commented Jun 7, 2026

Uh oh!

pelikhan commented Jun 7, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Copilot's findings

Uh oh!

Copilot AI commented Jun 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Copilot AI commented Jun 7, 2026 •

edited

Loading