Enable Smoke Pi safeoutputs CLI mounting and migrate 20 non-smoke workflows to Pi#40375
Conversation
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
/smoke-pi |
|
🚀 Smoke Pi MISSION COMPLETE! Pi delivered. 🥧 |
|
@copilot investigate this issue: https://github.com/github/gh-aw/actions/runs/27844212213 |
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Investigated run 27844212213 and found a remaining sandbox permission warning: MCPG couldn’t create |
|
/smoke-pi |
|
🚀 Smoke Pi MISSION COMPLETE! Pi delivered. 🥧 |
|
@copilot it seems that pi is not finding the "safeoutputs" cli shell scrip. See {"type":"tool_execution_update","toolCallId":"call_yFrJ4VFeqQJDlm4uWATqbRaB","toolName":"bash","args":{"command":"printf '%s' '{"message":"Could not emit required safe output: safeoutputs CLI not available; smoke test completed locally with FAIL status."}' | safeoutputs noop .","timeout":30},"partialResult":{"content":[]}} Review pi bash tool implementation to determine why this is happening. |
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Fixed in Root cause: Pi engine did not render MCP config, so the I added Pi MCP config rendering (matching Claude/Gemini behavior), regenerated |
|
/smoke-pi |
|
🚀 Smoke Pi MISSION COMPLETE! Pi delivered. 🥧 Caution agentic threat detected |
|
Caution agentic threat detected Overview
Warning Firewall blocked 1 domainThe following domain was blocked by the firewall during workflow execution:
network:
allowed:
- defaults
- "proxy.golang.org"See Network Configuration for more information.
|
|
@copilot remove the custom sandbox mounts from smoke-pi |
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Done in |
|
✅ PR Code Quality Reviewer completed the code quality review. |
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. No ADR enforcement needed: PR #40375 does not have the 'implementation' label (has_implementation_label=false) and has 0 new lines of code in business logic directories (requires_adr_by_default_volume=false, threshold=100). |
|
✅ Test Quality Sentinel completed test quality analysis. |
There was a problem hiding this comment.
Pull request overview
This pull request updates gh-aw to better support the Pi engine by adding Pi-specific MCP config rendering (so Start MCP Gateway receives generated MCP JSON), removing Smoke Pi’s custom sandbox write-mount tweaks, and migrating a set of non-smoke workflows to run with engine.id: pi / engine.model: copilot/gpt-5.4 (with accompanying lockfile regeneration).
Changes:
- Added
PiEngine.RenderMCPConfig()and expanded MCP config generation tests to cover Pi. - Migrated multiple workflow markdown sources to the Pi engine and adjusted tool settings (e.g.,
tools.cli-proxy: true,tools.github.mode: gh-proxy) where required. - Regenerated affected compiled
.lock.ymlworkflows, includingsmoke-pi.lock.ymland several migrated workflows’ lock files.
Show a summary per file
| File | Description |
|---|---|
| pkg/workflow/pi_mcp.go | Adds Pi engine MCP config rendering via the standard JSON MCP renderer. |
| pkg/workflow/agentic_workflow_test.go | Extends MCP config generation tests to include Pi. |
| .github/workflows/unbloat-docs.md | Migrates workflow engine config to Pi. |
| .github/workflows/spec-enforcer.md | Migrates workflow engine config to Pi. |
| .github/workflows/schema-consistency-checker.md | Migrates workflow engine config to Pi and enables cli-proxy. |
| .github/workflows/poem-bot.md | Migrates workflow engine config to Pi. |
| .github/workflows/lint-monster.md | Migrates workflow engine config to Pi. |
| .github/workflows/issue-monster.md | Migrates workflow engine config to Pi. |
| .github/workflows/hippo-embed.md | Migrates workflow engine config to Pi and sets tools.github.mode: gh-proxy. |
| .github/workflows/dependabot-worker.md | Migrates workflow engine config to Pi. |
| .github/workflows/dependabot-campaign.md | Migrates workflow engine config to Pi. |
| .github/workflows/daily-news.md | Migrates workflow engine config to Pi and sets tools.github.mode: gh-proxy. |
| .github/workflows/daily-multi-device-docs-tester.md | Migrates workflow engine config to Pi and sets tools.github.mode: gh-proxy. |
| .github/workflows/daily-hippo-learn.md | Migrates workflow engine config to Pi. |
| .github/workflows/daily-function-namer.md | Migrates workflow engine config to Pi. |
| .github/workflows/daily-doc-updater.md | Migrates workflow engine config to Pi. |
| .github/workflows/daily-awf-spec-compiler-surfacing.md | Migrates workflow engine config to Pi. |
| .github/workflows/commit-changes-analyzer.md | Migrates workflow engine config to Pi. |
| .github/workflows/chaos-pr-bundle-fuzzer.md | Migrates workflow engine config to Pi and sets tools.github.mode: gh-proxy. |
| .github/workflows/auto-triage-issues.md | Migrates workflow engine config to Pi. |
| .github/workflows/agent-persona-explorer.md | Migrates workflow engine config to Pi and sets tools.github.mode: gh-proxy. |
| .github/workflows/ab-testing-advisor.md | Migrates workflow engine config to Pi. |
| .github/workflows/smoke-pi.lock.yml | Regenerated compiled workflow for Smoke Pi (includes Pi MCP gateway JSON piping). |
| .github/workflows/lint-monster.lock.yml | Regenerated compiled workflow for Pi engine migration. |
| .github/workflows/issue-monster.lock.yml | Regenerated compiled workflow for Pi engine migration. |
| .github/workflows/hippo-embed.lock.yml | Regenerated compiled workflow for Pi engine migration (includes mcpscripts + safeoutputs gateway config). |
| .github/workflows/daily-hippo-learn.lock.yml | Regenerated compiled workflow for Pi engine migration. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 34/43 changed files
- Comments generated: 4
🧪 Test Quality Sentinel Report✅ Test Quality Score: 100/100 — Excellent
📊 Metrics & Test Classification (1 test analyzed)
Go: 1 ( Verdict
|
|
@copilot run pr-finisher skill |
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /tdd, /zoom-out, and /improve-codebase-architecture — approving with minor suggestions.
📋 Key Themes & Highlights
Key Themes
- Core fix is minimal and correct:
pi_mcp.goaddsRenderMCPConfigtoPiEngineby delegating torenderDefaultJSONMCPConfig— the same pattern as Claude and Gemini. This unblocksstart_mcp_gateway.cjsso it generates MCP config for Pi andsafeoutputsgets mounted as a CLI. - Integration test pre-exists:
TestCompiledLockFiles_SmokePiKeepsCLIProxySafeoutputsWiringincompiled_lock_files_test.go(not changed here) already validatesGH_AW_MCP_CLI_SERVERS=["safeoutputs"]in the lock file — this was the failing test the PR fixes. - Unit test coverage gap: The new row in
TestAgenticWorkflowsMCPConfigGenerationexercisesagentic-workflows, notsafe-outputs, so the broken code path isn't exercised at unit level. - Bulk migration to experimental Pi: 20 workflows moved off Copilot/Claude. All compiled lock files flip
GH_AW_INFO_FIREWALL_ENABLEDtofalseand drop the AWF firewall container stack — intentional for Pi, but worth calling out explicitly.
Positive Highlights
- ✅ Minimal, targeted fix — 18 lines, zero speculative changes
- ✅ Pattern consistency with Claude/Gemini is exactly right
- ✅
compiled_lock_files_test.goprovides strong end-to-end validation of the safeoutputs CLI wiring - ✅ Pi required settings (
cli-proxy: true,github.mode: gh-proxy) correctly added to all 20 migrated workflows - ✅
smoke-pi.mdsandbox mount cleanup is clean
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 144.9 AIC · ⌖ 10 AIC · ⊞ 6.9K
| piMCPLog.Printf("Rendering MCP config for Pi: tool_count=%d, mcp_tool_count=%d", len(tools), len(mcpTools)) | ||
|
|
||
| // Pi uses JSON format without Copilot-specific fields and multi-line args. | ||
| return renderDefaultJSONMCPConfig(yaml, tools, mcpTools, workflowData, constants.ShellMcpServersJsonPath) |
There was a problem hiding this comment.
[/tdd] The fix is correct and mirrors Claude/Gemini exactly — renderDefaultJSONMCPConfig is the right delegation. The existing TestCompiledLockFiles_SmokePiKeepsCLIProxySafeoutputsWiring validates the safeoutputs CLI mount at lock-file level, which is solid coverage.
One gap: the new unit test in agentic_workflow_test.go only exercises mcpTools: ["agentic-workflows"], not "safe-outputs" — the actual broken path. A targeted Pi + safe-outputs sub-test would pin the root cause at unit level.
💡 Suggested test outline
t.Run("Pi/safe-outputs", func(t *testing.T) {
wd := workflowDataWithSafeOutputs()
var sb strings.Builder
err := NewPiEngine().RenderMCPConfig(\u0026sb, wd.Tools, []string{"safe-outputs"}, wd)
require.NoError(t, err)
assert.Contains(t, sb.String(), "safeoutputs",
"Pi MCP config must include safeoutputs server so CLI mounting works")
})This directly documents the bug that was fixed.
There was a problem hiding this comment.
Fixed in 6bf8621. Added TestPiMCPConfig_SafeOutputs in agentic_workflow_test.go that calls NewPiEngine().RenderMCPConfig with mcpTools: ["safe-outputs"] and asserts the output contains "safeoutputs" — directly documenting the root cause of the original bug.
| var piMCPLog = logger.New("workflow:pi_mcp") | ||
|
|
||
| // RenderMCPConfig renders the MCP configuration for Pi engine. | ||
| func (e *PiEngine) RenderMCPConfig(yaml *strings.Builder, tools map[string]any, mcpTools []string, workflowData *WorkflowData) error { |
There was a problem hiding this comment.
[/improve-codebase-architecture] pi_mcp.go is structurally identical to claude_mcp.go and gemini_mcp.go (same 18-line shape, same delegation to renderDefaultJSONMCPConfig, same log prefix pattern). That's fine for now since Pi genuinely uses the same JSON config format.
Worth adding a brief comment explaining why Pi uses renderDefaultJSONMCPConfig rather than a Pi-specific renderer — e.g., distinguishing it from Codex (TOML + custom shell policy) or Copilot (extra Copilot fields). This makes the intent clear if Pi's config needs to diverge in future.
💡 Example clarifying comment
// Pi uses the same JSON MCP format as Claude and Gemini: no Copilot-specific
// "type"/"tools" fields, no multi-line args. If Pi requires custom config
// sections (e.g., shell-policy or provider blocks), add them here similarly
// to how CodexEngine.RenderMCPConfig handles TOML-specific sections.There was a problem hiding this comment.
Fixed in 6bf8621. Added a comment to pi_mcp.go explaining why renderDefaultJSONMCPConfig is used (same JSON format as Claude/Gemini, no Copilot-specific fields) and what to do if Pi requires custom config sections in the future.
| engine: | ||
| bare: true | ||
| id: copilot | ||
| id: pi |
There was a problem hiding this comment.
[/zoom-out] Migrating 20 non-smoke workflows to an experimental engine in a single PR is a meaningful blast radius. Pi sets GH_AW_INFO_FIREWALL_ENABLED: false and GH_AW_INFO_AWF_VERSION: "" in compiled lock files, meaning all migrated workflows run without the AWF firewall container stack — a significant security posture change compared to Copilot.
No blocking issue if this is intentional policy for Pi, but worth confirming explicitly in the PR description or a follow-up ADR so reviewers don't have to infer it from the lock file diffs.
There was a problem hiding this comment.
Acknowledged — the Pi firewall posture gap was unintentional. Fixed in 6bf8621 by adding enableFirewallByDefaultForPi (matching the pattern of enableFirewallByDefaultForCopilot and enableFirewallByDefaultForClaude). All Pi workflows with sandbox: agent: awf now get GH_AW_INFO_FIREWALL_ENABLED: "true" and the AWF firewall stack, putting them on the same security posture as Copilot and Claude workflows.
There was a problem hiding this comment.
REQUEST_CHANGES — one correctness bug must be fixed before merge; two non-blocking observations
Blocking: daily-function-namer.md still contains an experiments block defining Claude model variants, but the engine model is now hardcoded to copilot/gpt-5.4. The compiled lock file still runs the full experiment-selection machinery (picks a Claude variant, writes state to the experiments branch, sets GH_AW_EXPERIMENTS_MODEL_SIZE in three places) but the selected variant is never consumed by the Pi invocation. Every daily run produces meaningless A/B metrics and corrupts the experiment tracking branch.
Non-blocking observations
GH_AW_INFO_FIREWALL_ENABLED: "false" across all 20 migrated lock files (misleading metadata): AWF IS still running in the compiled YAML for workflows with sandbox: agent: awf — the actual firewall behavior is unchanged. The flag is wrong because compiler_yaml.go uses getFirewallConfig (which requires an explicit network.firewall block or a sandbox.agent.version override) rather than isFirewallEnabled to set this flag. The copilot engine had enableFirewallByDefaultForCopilot that injected an explicit config; Pi does not. See inline comment on daily-news.lock.yml.
Mixed mount-escaping in smoke-pi.lock.yml: intentional but undocumented — the two expansion scopes in a single mounts array will silently break if a generator change or a new mount entry uses the wrong quoting. See inline comment.
pi_mcp.go path choice undocumented: a one-line comment explaining why Pi uses ShellMcpServersJsonPath (not TmpMcpServersJsonPath like crush/opencode) would prevent a future mis-routing. See inline comment.
🔎 Code quality review by PR Code Quality Reviewer · 239.4 AIC · ⌖ 9.6 AIC · ⊞ 5.1K
| id: claude | ||
| model: "${{ needs.activation.outputs.model_size }}" | ||
| id: pi | ||
| model: copilot/gpt-5.4 |
There was a problem hiding this comment.
Orphaned experiments block silently poisons A/B data: the model is now hardcoded to copilot/gpt-5.4, but the experiments block below still compiles to full Claude-variant selection infrastructure in the lock file.
💡 Impact and fix
The compiled daily-function-namer.lock.yml still:
- Runs a
pick-experimentstep that samples fromclaude-sonnet-4-6/claude-haiku-4-5-20251001 - Sets
GH_AW_EXPERIMENT_SPECwith those Claude variant names (in 3 env-var occurrences) - Writes experiment state to the
experiments/dailyfunctionnamerbranch on every run
But GH_AW_ENGINE_MODEL and GH_AW_INFO_MODEL are both copilot/gpt-5.4 — the selected variant is never consumed. Every daily run records meaningless experiment metrics that corrupt future analysis, and the selection machinery runs wastefully.
Fix: remove the experiments block from daily-function-namer.md entirely (then make recompile), or if the experiment should continue with Pi-compatible models, update the variants accordingly.
There was a problem hiding this comment.
Fixed in 6bf8621. Removed the entire experiments block from daily-function-namer.md and recompiled. The lock file no longer contains pick-experiment steps or Claude variant env vars. Also updated TestDailyFunctionNamerUsesConcreteClaudeModelsForExperiment to assert that no experiments block is present and that the engine is Pi.
| "mcpServers": { | ||
| "safeoutputs": { | ||
| "container": "ghcr.io/github/gh-aw-node", | ||
| "mounts": ["\${GITHUB_WORKSPACE}:\${GITHUB_WORKSPACE}:rw", "${RUNNER_TEMP}/gh-aw/safeoutputs:${RUNNER_TEMP}/gh-aw/safeoutputs:rw", "/tmp/gh-aw:/tmp/gh-aw:rw"], |
There was a problem hiding this comment.
Fragile mixed-escape semantics in the mounts array: \${GITHUB_WORKSPACE} is backslash-escaped (resolved inside the container), while ${RUNNER_TEMP} is bare (expanded by the outer shell before JSON reaches Node).
💡 Details
The mounts array mixes two resolution scopes in a single structure:
"mounts": [
"\${GITHUB_WORKSPACE}:\${GITHUB_WORKSPACE}:rw", // ← resolved by container / entrypoint
"${RUNNER_TEMP}/gh-aw/safeoutputs:...:rw", // ← resolved by shell heredoc expansion
"/tmp/gh-aw:/tmp/gh-aw:rw"
]This is not wrong today, but it requires any future maintainer (or the code generator) to know which placeholders belong to which scope. If the generator changes escaping rules or a new mount is added with the wrong quoting, one path will silently resolve to a literal string like ${GITHUB_WORKSPACE} at runtime, causing a mount that never maps a real directory — with no obvious error.
A comment in the template code (or the generated file header) explaining the two-scope convention would prevent future breakage.
There was a problem hiding this comment.
Acknowledged. The mixed-escape semantics in the mounts array are intentional: \${GITHUB_WORKSPACE} is escaped so the container entrypoint resolves it, while ${RUNNER_TEMP} is left bare so the shell expands it before passing the JSON to Node. This is consistent with how the same mount pattern is generated for other engines. A comment explaining the two-scope convention in the template renderer would be valuable — tracked as a follow-up improvement to avoid future mis-routing.
| GH_AW_INFO_ALLOWED_DOMAINS: '["*.grafana.net","*.sentry.io","defaults","node","python"]' | ||
| GH_AW_INFO_FIREWALL_ENABLED: "true" | ||
| GH_AW_INFO_AWF_VERSION: "v0.27.7" | ||
| GH_AW_INFO_FIREWALL_ENABLED: "false" |
There was a problem hiding this comment.
GH_AW_INFO_FIREWALL_ENABLED reports false while AWF is still running: this metadata flag is misleading for all 20 migrated workflows that have sandbox: agent: awf.
💡 Root cause
compiler_yaml.go sets this flag from getFirewallConfig(data).Enabled. For workflows using sandbox: agent: awf (string form, no explicit version), getFirewallConfig returns nil because:
- There is no explicit
network.firewall.enabled: truein the frontmatter sandbox.agent.versionis not set (which is the only path that makesgetFirewallConfigfall back toisFirewallEnabled)
When the engine was copilot, enableFirewallByDefaultForCopilot injected an explicit FirewallConfig{Enabled: true} into NetworkPermissions.Firewall, so the flag was correctly true. Pi has no equivalent default-enable path.
The actual AWF steps (Install AWF binary, sudo -E awf ...) are still present and running in the compiled YAML — confirmed at lines 683–1169 of the current lock file. Only the informational flag is wrong.
Fix options:
- In
compiler_yaml.go, change thefirewallEnabledcomputation to useisFirewallEnabled(data)instead ofgetFirewallConfig(data)(or fall back to it whengetFirewallConfigreturns nil). - Alternatively, ensure Pi workflows with
sandbox: agent: awfexplicitly setnetwork.firewall.enabled: truein the migration.
There was a problem hiding this comment.
Fixed in 6bf8621. Added enableFirewallByDefaultForPi in firewall.go and called it from runPostEngineValidations in compiler_orchestrator_engine.go. Pi workflows with sandbox: agent: awf now get an explicit FirewallConfig{Enabled: true} injected, so getFirewallConfig returns non-nil and GH_AW_INFO_FIREWALL_ENABLED is correctly set to "true" in compiled lock files. Wasm golden files and lock files updated via make update-wasm-golden and make recompile.
| piMCPLog.Printf("Rendering MCP config for Pi: tool_count=%d, mcp_tool_count=%d", len(tools), len(mcpTools)) | ||
|
|
||
| // Pi uses JSON format without Copilot-specific fields and multi-line args. | ||
| return renderDefaultJSONMCPConfig(yaml, tools, mcpTools, workflowData, constants.ShellMcpServersJsonPath) |
There was a problem hiding this comment.
ShellMcpServersJsonPath choice needs a comment explaining why it diverges from crush/opencode.
💡 Details
Of the five JSON MCP engines in this codebase:
- Claude, Antigravity, Gemini, and now Pi →
constants.ShellMcpServersJsonPath - Crush, OpenCode →
constants.TmpMcpServersJsonPath
The comment "Pi uses JSON format without Copilot-specific fields and multi-line args" explains the format choice but not the path choice. A future maintainer or a Pi CLI version bump that changes where the CLI expects its config file will have no guidance on which constant to verify. One line documenting why Pi uses the shell path (e.g., "Pi CLI resolves config from the same path as the shell environment, consistent with Claude/Antigravity") would prevent a silent mis-routing bug.
There was a problem hiding this comment.
Fixed in 6bf8621. Updated the comment in pi_mcp.go to explain both the format choice (same JSON as Claude/Gemini, no Copilot-specific fields) and the path choice (ShellMcpServersJsonPath because Pi CLI resolves its config from the shell environment path, consistent with Claude/Gemini/Antigravity — Crush and OpenCode use TmpMcpServersJsonPath because their CLIs look in a different location).
|
@copilot run pr-finisher skill |
…update tests and golden files, remove orphaned experiments Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
PR finisher run complete. Here's the status:
Actions taken: No new commits needed — all review feedback was already addressed in Hand-off: A maintainer should resolve the 7 remaining |
Smoke Pi runs were falling back to direct
ghwrites because Pi could not reliably emit safe-outputs via the mountedsafeoutputsCLI. This PR fixes Pi MCP config rendering so thesafeoutputsCLI is mounted correctly, keeps Smoke Pi sandbox mounts at their default configuration, and additionally migrates 20 random non-smoke workflows to use Pi as the agentic engine.Pi MCP config rendering fix
Start MCP Gatewayinvokesstart_mcp_gateway.cjswith generated MCP config for Pi workflows.safeoutputsto be mounted and resolved in Pi bash tool execution.Sandbox mount alignment
allowWritemount additions.smoke-pi.mdsandbox write paths at the default entries for this workflow.MCP heredoc secret escaping fix
$VARto\${VAR}in MCP config heredoc env var values for all non-Copilot JSON engines (mcp_renderer_builtin.go,mcp_scripts_renderer.go).GITHUB_TOKEN,GH_AW_MCP_SCRIPTS_API_KEY) beforestart_mcp_gateway.cjsruns and logs the full config.Pi firewall enablement
enableFirewallByDefaultForPi(matching the pattern of Copilot and Claude) so Pi workflows withsandbox: agent: awfcorrectly getGH_AW_INFO_FIREWALL_ENABLED: "true"and the AWF firewall container stack.Test coverage update
pkg/workflow/agentic_workflow_test.go.TestPiMCPConfig_SafeOutputsunit test that directly documents the root-cause bug: Pi MCP config must include thesafeoutputsserver entry.TestDailyFunctionNamerUsesConcreteClaudeModelsForExperimentto assert that no orphaned experiments block is present and the engine is Pi.Workflow engine migration (new scope)
engine.id: pi,engine.model: copilot/gpt-5.4).tools.cli-proxy: trueandtools.github.mode: gh-proxy) so migrated workflows compile and run with Pi constraints.experimentsblock fromdaily-function-namer.md(Claude variants were never consumed after migrating to Pi, producing meaningless A/B data on every run).Compiled workflow alignment
make recompileto reflect Pi engine, firewall enablement, and corrected\${VAR}escaping across all non-Copilot JSON engine workflows.