fix: move smoke-gemini tests into agent container #2401

Merged: lpcox merged 1 commit into main from fix/smoke-gemini-agent-testing on May 2, 2026

@lpcox lpcox commented May 2, 2026

Collaborator

Summary

The smoke-gemini workflow was running all tests (curl connectivity, file write/read, gh pr list) in a host pre-step, then having the Gemini agent merely verify pre-computed results. This meant the tests validated host connectivity, not AWF sandbox connectivity.

Changes

  • Removed the Pre-compute smoke test data host pre-step that ran curl, file write, and gh commands on the host
  • Moved test requirements into the agent prompt so the Gemini agent performs all tests inside the AWF sandbox container (matching the pattern used by smoke-claude.md)
  • Agent now runs curl connectivity, file write/read, and MCP calls itself inside the sandbox
  • Post-step safe-output validation unchanged

Why

Smoke tests should exercise the firewall's domain allowlist, bash tool, and MCP connectivity from within the container. Running tests on the host bypasses the sandbox entirely.
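
As a rough illustration of the kind of checks that only mean something when run inside the container, here is a minimal Python sketch of a file write/read round trip and a curl-based HTTP status probe. The function names, the marker file name, and the message text are hypothetical and are not taken from the workflow; the real agent is instructed through its prompt and runs these checks with its bash tool.

```python
import subprocess
import tempfile
from pathlib import Path


def file_round_trip(directory: Path, run_id: str) -> bool:
    """Write a marker file and read it back; True when the content matches."""
    # Hypothetical file name and message, chosen only for this sketch.
    path = directory / f"smoke-test-{run_id}.txt"
    expected = f"Smoke test passed for run {run_id}"
    path.write_text(expected, encoding="utf-8")
    return path.read_text(encoding="utf-8") == expected


def http_status(url: str, timeout_s: int = 10) -> int:
    """Return the HTTP status code curl reports for url, or 0 on failure."""
    try:
        result = subprocess.run(
            ["curl", "-s", "-o", "/dev/null", "-w", "%{http_code}",
             "--max-time", str(timeout_s), url],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        # curl is not installed in this environment.
        return 0
    code = result.stdout.strip()
    return int(code) if code.isdigit() else 0


if __name__ == "__main__":
    # Only the offline check runs here; the HTTP probe needs network access.
    with tempfile.TemporaryDirectory() as tmp:
        print("file round trip ok:", file_round_trip(Path(tmp), "12345"))
```

Run on the host, `http_status("https://github.com")` would only show that the runner has network access. Run inside the sandbox, the same call has to pass through the firewall's domain allowlist, which is the property the smoke test is meant to verify.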

Pattern comparison

| Workflow | Before | After |
| --- | --- | --- |
| smoke-claude | Agent does all tests ✅ | (unchanged) |
| smoke-copilot | Host pre-computes, agent verifies | (unchanged, separate fix) |
| smoke-gemini | Host pre-computes, agent verifies ❌ | Agent does all tests |

Previously, smoke-gemini ran all tests (curl connectivity, file write/read,
gh pr list) in a host pre-step, then had the agent merely verify pre-computed
results. This meant the tests validated host connectivity, not AWF sandbox
connectivity.

Now the agent performs all tests inside the sandbox (like smoke-claude),
properly exercising the firewall's domain allowlist, bash tool, and MCP
connectivity from within the container.

Changes:
- Remove Pre-compute smoke test data host pre-step
- Move test requirements into agent prompt
- Agent now runs curl, file write/read, and MCP calls itself
- Keep post-step safe-output validation unchanged

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@lpcox lpcox requested a review from Mossaka as a code owner May 2, 2026 18:05
Copilot AI review requested due to automatic review settings May 2, 2026 18:05
github-actions Bot commented May 2, 2026

Contributor

✅ Coverage Check Passed

Overall Coverage

| Metric | Base | PR | Delta |
| --- | --- | --- | --- |
| Lines | 85.76% | 85.84% | 📈 +0.08% |
| Statements | 85.64% | 85.72% | 📈 +0.08% |
| Functions | 88.11% | 88.11% | ➡️ +0.00% |
| Branches | 78.65% | 78.69% | 📈 +0.04% |

📁 Per-file Coverage Changes (1 files)

| File | Lines (Before → After) | Statements (Before → After) |
| --- | --- | --- |
| src/docker-manager.ts | 87.4% → 87.7% (+0.29%) | 87.0% → 87.3% (+0.27%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

Copilot AI left a comment

Contributor

Pull request overview

This PR updates the smoke-gemini workflow so that connectivity and filesystem checks are executed inside the AWF agent sandbox container (instead of being pre-computed on the host), aligning the Gemini smoke test pattern with the existing smoke-claude workflow approach.

Changes:

  • Removed the host “Pre-compute smoke test data” step (curl / file I/O / gh pr list) so results are no longer produced outside the sandbox.
  • Updated the agent prompt to directly perform GitHub MCP PR listing, curl connectivity, and file write/read via bash inside the container.
  • Regenerated the compiled .lock.yml workflow accordingly (removing smoke-data outputs and wiring github.run_id through prompt substitutions).

Summary per file:

| File | Description |
| --- | --- |
| .github/workflows/smoke-gemini.md | Removes host-side precomputation and instructs the Gemini agent to run all smoke checks inside the sandbox. |
| .github/workflows/smoke-gemini.lock.yml | Regenerated compiled workflow removing the pre-step and its derived env/expressions; keeps prompt interpolation functional. |

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 0

github-actions Bot commented May 2, 2026

Contributor

Smoke Test Results:

Overall: PASS

💥 [THE END] — Illustrated by Smoke Claude

github-actions Bot commented May 2, 2026

Contributor

🔥 Smoke Test Results

Test Result
GitHub MCP connectivity
GitHub.com HTTP connectivity (200)
File write/read

Overall: PASS

PR by @lpcox. Reviewer: @Mossaka.

📰 BREAKING: Report filed by Smoke Copilot

github-actions Bot commented May 2, 2026

Contributor

Smoke Test: Copilot BYOK Offline Mode — PR #2401 by @lpcox, reviewer @Mossaka

| Test | Result |
| --- | --- |
| GitHub MCP connectivity | ✅ Listed merged PR #2398 |
| GitHub.com HTTP connectivity | ⚠️ Pre-step template vars not expanded (${{ steps.smoke-data.outputs.SMOKE_HTTP_CODE }}) |
| File write/read | ⚠️ Pre-step template vars not expanded |
| BYOK inference (agent → api-proxy → api.githubcopilot.com) | ✅ Responding via BYOK path |

Running in BYOK offline mode (COPILOT_OFFLINE=true) via api-proxy → api.githubcopilot.com.

Overall: PARTIAL. BYOK inference and MCP work; pre-step smoke data was not injected into the prompt (unexpanded ${{ }} variables).
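
A failure like the one reported above can be caught mechanically before the prompt reaches the agent. The sketch below is a hypothetical pre-flight check, not something the workflow is known to do: it scans the rendered prompt text for GitHub Actions expressions that were never substituted. The function name `find_unexpanded` is made up for illustration.

```python
import re

# GitHub Actions expressions look like ${{ ... }}; any that survive into the
# rendered prompt were never substituted.
PLACEHOLDER_RE = re.compile(r"\$\{\{\s*(.*?)\s*\}\}")


def find_unexpanded(prompt: str) -> list[str]:
    """Return the expression text of every ${{ ... }} left in the prompt."""
    return PLACEHOLDER_RE.findall(prompt)


if __name__ == "__main__":
    sample = "HTTP status: ${{ steps.smoke-data.outputs.SMOKE_HTTP_CODE }}"
    for expression in find_unexpanded(sample):
        print("unexpanded expression:", expression)
```

A non-empty result would indicate that the step output the prompt depends on was missing or was not wired through, which matches the symptom in this report.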

🔑 BYOK report filed by Smoke Copilot BYOK

github-actions Bot commented May 2, 2026

Contributor

Smoke Test

Merged PRs: "feat: unify schema versioning — use repo release tag for all schemas, publish JSONL schemas as release assets"; "fix: load schema via require() for pkg/esbuild compat"
Queried PRs: "fix: move smoke-gemini tests into agent container"; "[docs] docs: add awf-config.v1.schema.json references to config spec"
GitHub merged PR review ✅
GH CLI query ❌ (safeinputs-gh unavailable; fallback gh used)
Playwright ✅ | Tavily ❌
File+cat ✅ | Discussion comment ✅ | Build ✅
Overall: FAIL

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • registry.npmjs.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "registry.npmjs.org"

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex

github-actions Bot commented May 2, 2026

Contributor

🏗️ Build Test Suite Results

| Ecosystem | Project | Build/Install | Tests | Status |
| --- | --- | --- | --- | --- |
| Bun | elysia | | 1/1 passed | ✅ PASS |
| Bun | hono | | 1/1 passed | ✅ PASS |
| C++ | fmt | | N/A | ✅ PASS |
| C++ | json | | N/A | ✅ PASS |
| Deno | oak | N/A | 1/1 passed | ✅ PASS |
| Deno | std | N/A | 1/1 passed | ✅ PASS |
| .NET | hello-world | | N/A | ✅ PASS |
| .NET | json-parse | | N/A | ✅ PASS |
| Go | color | | 1/1 passed | ✅ PASS |
| Go | env | | 1/1 passed | ✅ PASS |
| Go | uuid | | 1/1 passed | ✅ PASS |
| Java | gson | | 1/1 passed | ✅ PASS |
| Java | caffeine | | 1/1 passed | ✅ PASS |
| Node.js | clsx | | All passed | ✅ PASS |
| Node.js | execa | | All passed | ✅ PASS |
| Node.js | p-limit | | All passed | ✅ PASS |
| Rust | fd | | 1/1 passed | ✅ PASS |
| Rust | zoxide | | 1/1 passed | ✅ PASS |

Overall: 8/8 ecosystems passed — ✅ PASS

Generated by Build Test Suite for issue #2401 · ● 483.6K ·

github-actions Bot commented May 2, 2026

Contributor

Smoke Test: GitHub Actions Services Connectivity

| Check | Result |
| --- | --- |
| Redis PING | ❌ Timeout (no response) |
| PostgreSQL pg_isready | ❌ No response |
| PostgreSQL SELECT 1 | ❌ Timeout |

Overall: FAIL

host.docker.internal is unreachable from this environment — all three service connectivity checks failed.
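
When all three service checks fail together like this, a plain TCP probe can help separate "the host is unreachable" from "the service is misbehaving". The following is a minimal sketch under stated assumptions: `tcp_reachable` is a hypothetical helper, and the report does not say which ports the services listen on, so any port passed to it (for example the conventional defaults 6379 for Redis and 5432 for PostgreSQL) is an assumption.

```python
import socket


def tcp_reachable(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """True when a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers DNS resolution failures, refused connections, and timeouts.
        return False
```

If `tcp_reachable("host.docker.internal", 6379)` and the equivalent PostgreSQL probe both return False, that would be consistent with the host itself being unreachable, as the report concludes, rather than with a fault in either service.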

🔌 Service connectivity validated by Smoke Services

github-actions Bot commented May 2, 2026

Contributor

PR Titles:

  • feat: unify schema versioning — use repo release tag for all schemas, publish JSONL schemas as release assets
  • fix: load schema via require() for pkg/esbuild compat

Test Results:

  • GitHub MCP Testing: ✅
  • GitHub.com Connectivity: ✅
  • File Writing Testing: ✅
  • Bash Tool Testing: ✅

Overall status: PASS

💎 Faceted by Smoke Gemini

@lpcox lpcox merged commit bd3eb05 into main May 2, 2026
67 of 69 checks passed
@lpcox lpcox deleted the fix/smoke-gemini-agent-testing branch May 2, 2026 18:31