fix: move smoke-gemini tests into agent container #2401
Previously, smoke-gemini ran all tests (curl connectivity, file write/read, `gh pr list`) in a host pre-step, then had the agent merely verify pre-computed results. This meant the tests validated host connectivity, not AWF sandbox connectivity.

Now the agent performs all tests inside the sandbox (like smoke-claude), properly exercising the firewall's domain allowlist, bash tool, and MCP connectivity from within the container.

Changes:
- Remove `Pre-compute smoke test data` host pre-step
- Move test requirements into agent prompt
- Agent now runs curl, file write/read, and MCP calls itself
- Keep post-step safe-output validation unchanged

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
✅ Coverage Check Passed
Pull request overview
This PR updates the smoke-gemini workflow so that connectivity and filesystem checks are executed inside the AWF agent sandbox container (instead of being pre-computed on the host), aligning the Gemini smoke test pattern with the existing smoke-claude workflow approach.
Changes:
- Removed the host “Pre-compute smoke test data” step (curl / file I/O / `gh pr list`) so results are no longer produced outside the sandbox.
- Updated the agent prompt to directly perform GitHub MCP PR listing, `curl` connectivity, and file write/read via bash inside the container.
- Regenerated the compiled `.lock.yml` workflow accordingly (removing smoke-data outputs and wiring `github.run_id` through prompt substitutions).
| File | Description |
|---|---|
| .github/workflows/smoke-gemini.md | Removes host-side precomputation and instructs the Gemini agent to run all smoke checks inside the sandbox. |
| .github/workflows/smoke-gemini.lock.yml | Regenerated compiled workflow removing the pre-step and its derived env/expressions; keeps prompt interpolation functional. |
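
The in-container checks described above could look roughly like the following. This is a minimal sketch, not the workflow's actual prompt or implementation: the function names, the marker file name, and the target URL are hypothetical, and the MCP PR-listing check is omitted because it depends on the agent's tool interface.

```python
import subprocess
import tempfile
from pathlib import Path


def check_file_roundtrip(directory: Path, run_id: str) -> bool:
    """Write a marker file, read it back, and confirm the content matches."""
    # Hypothetical file name; the real workflow may use a different one.
    path = directory / f"smoke-test-{run_id}.txt"
    expected = f"smoke test run {run_id}"
    path.write_text(expected, encoding="utf-8")
    return path.read_text(encoding="utf-8") == expected


def check_curl(url: str, timeout_s: int = 10) -> bool:
    """Return True if curl gets a successful HTTP response from url."""
    try:
        result = subprocess.run(
            ["curl", "-sS", "--fail", "--max-time", str(timeout_s),
             "-o", "/dev/null", url],
            capture_output=True,
            check=False,
        )
    except FileNotFoundError:
        # curl is not installed in this environment.
        return False
    return result.returncode == 0


def summarize(results: dict[str, bool]) -> str:
    """Overall status: PASS only if at least one check ran and all passed."""
    return "PASS" if results and all(results.values()) else "FAIL"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        results = {
            # Assumed target; it must be on the firewall's allowlist to pass.
            "curl": check_curl("https://api.github.com"),
            "file": check_file_roundtrip(Path(tmp), run_id="example"),
        }
    print(f"Overall: {summarize(results)}")
```

The point of running something like this inside the container rather than on the host is that `check_curl` only succeeds if the sandbox's firewall actually permits the request.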
Copilot's findings
- Files reviewed: 2/2 changed files
- Comments generated: 0
Smoke Test Results:
Overall: PASS
🔥 Smoke Test Results
Overall: PASS
PR by @lpcox. Reviewer:
Smoke Test: Copilot BYOK Offline Mode — PR #2401 by @lpcox, reviewer
Running in BYOK offline mode (
Overall: PARTIAL. BYOK inference and MCP work; pre-step smoke data was not injected into prompt (unexpanded
Smoke Test
Merged PRs: "feat: unify schema versioning — use repo release tag for all schemas, publish JSONL schemas as release assets"; "fix: load schema via require() for pkg/esbuild compat"

Warning: Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:

    network:
      allowed:
        - defaults
        - "registry.npmjs.org"

See Network Configuration for more information.
🏗️ Build Test Suite Results
Overall: 8/8 ecosystems passed — ✅ PASS
Smoke Test: GitHub Actions Services Connectivity
Overall: FAIL
PR Titles:
Test Results:
Overall status: PASS
Summary
The smoke-gemini workflow was running all tests (curl connectivity, file write/read, `gh pr list`) in a host pre-step, then having the Gemini agent merely verify pre-computed results. This meant the tests validated host connectivity, not AWF sandbox connectivity.

Changes
- Removed the `Pre-compute smoke test data` host pre-step that ran curl, file write, and gh commands on the host
- The agent now performs the tests itself inside the sandbox (same pattern as `smoke-claude.md`)

Why
Smoke tests should exercise the firewall's domain allowlist, bash tool, and MCP connectivity from within the container. Running tests on the host bypasses the sandbox entirely.
Pattern comparison