Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
30 changes: 15 additions & 15 deletions .github/workflows/hourly-ci-cleaner.lock.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions .github/workflows/hourly-ci-cleaner.md
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,7 @@ tools:
edit:
sandbox:
agent:
id: awf
mounts:
- "/usr/bin/make:/usr/bin/make:ro"
- "/usr/bin/go:/usr/bin/go:ro"
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
# ADR-29663: Default Sandbox Agent Type to AWF for Ambiguous Configurations and Require Explicit ID in Strict Mode

**Date**: 2026-05-02
**Status**: Draft
**Deciders**: pelikhan, copilot-swe-agent

---

## Part 1 — Narrative (Human-Friendly)

### Context

The `sandbox.agent` configuration in gh-aw workflow files allows users to customize the AWF (Agent Workflow Firewall) sandbox that isolates agent execution. A user can specify just a version pin — for example `{ version: "v0.25.29" }` — without providing an explicit `id` or `type` field, intending to use the AWF sandbox at a specific version. However, `getAgentType()` returns an empty string when neither `ID` nor `Type` is set on `AgentSandboxConfig`, and `isSupportedSandboxType("")` returns `false`, which caused `isSandboxEnabled()` to return `false`. This silently disabled the AWF firewall and ran the agent on the host runner, breaking MCP connectivity (the "smoke-gemini" incident). `applySandboxDefaults` only defaulted the agent type to AWF for a `nil` agent config, not for a non-nil agent config with an empty type.

### Decision

We will default `sandbox.agent.type` to `awf` whenever `sandbox.agent` is a non-nil, non-disabled object that carries no explicit `id` or `type`. This is a fail-safe default: a user who writes any `sandbox.agent` block clearly intends to use a sandbox, so we treat an absent `id`/`type` as an implicit AWF selection rather than as disabled. Additionally, we will reject `sandbox.agent` configurations without an explicit `id` in strict mode, where ambiguous configurations are not acceptable and must fail loudly rather than silently defaulting.

### Alternatives Considered

#### Alternative 1: Fail-Closed — Reject Ambiguous Configurations in All Modes

Require `id: awf` explicitly in every `sandbox.agent` block, in both normal and strict mode. Return a validation error for any agent object with no `id`/`type`.

This was not chosen for non-strict mode because it would break existing workflow files that pin a version without an explicit id — a common and reasonable pattern. The intent is unambiguous enough (user provided an agent block) to justify defaulting. Failing closed in non-strict mode would be a breaking change with no security benefit, since the only ambiguity is *which* sandbox type, not whether a sandbox should run.

#### Alternative 2: Document the Behavior and Do Nothing

Accept the existing behavior where a version-only agent config disables the sandbox, and document this as a known footgun. Users would need to always write `id: awf` explicitly to opt into the sandbox.

This was not chosen because the prior behavior (silently disabling the firewall) is a security regression. Any `sandbox.agent` object signals clear intent to run in a sandbox; treating it as disabled is surprising and unsafe. The cost of changing the behavior is minimal while the security benefit (always-on firewall when configured) is high.

### Consequences

#### Positive
- Workflows that pin an AWF version without an explicit `id` now correctly run inside the sandbox, closing the silent firewall-bypass footgun.
- Strict mode now provides a clear, actionable error when `sandbox.agent` lacks an explicit `id`, preventing ambiguous configurations from reaching production.
- The default behavior is "more sandboxed" rather than "less sandboxed", which is the safer default for a security boundary.

#### Negative
- Any workflow that intentionally relied on the empty-type behavior to bypass the sandbox (i.e., accidentally "using" the bug as a feature) will now have the sandbox re-enabled. Such workflows must add `disabled: true` explicitly if they truly intend to skip the sandbox.
- Strict mode is now more restrictive for `sandbox.agent` blocks: workflows that previously passed strict validation with a version-only agent must be updated to add `id: awf`.

#### Neutral
- The `hourly-ci-cleaner.md` workflow required a one-line fix (`id: awf`) to conform to strict mode after this change was introduced.
- `applySandboxDefaults` now has an additional code path; the behavior is guarded by `isSupportedSandboxType(getAgentType(...))`, reusing existing helper functions without adding new abstractions.

---

## Part 2 — Normative Specification (RFC 2119)

> The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHALL**, **SHALL NOT**, **SHOULD**, **SHOULD NOT**, **RECOMMENDED**, **MAY**, and **OPTIONAL** in this section are to be interpreted as described in [RFC 2119](https://www.rfc-editor.org/rfc/rfc2119).

### Sandbox Agent Default Behavior

1. When `sandbox.agent` is a non-nil object and `Disabled` is `false`, `applySandboxDefaults` **MUST** set `Agent.Type` to `awf` if `isSupportedSandboxType(getAgentType(agent))` returns `false`.
2. `applySandboxDefaults` **MUST NOT** override `Agent.Type` when the agent is explicitly disabled (`Disabled == true`).
3. `applySandboxDefaults` **MUST NOT** override an already-valid, explicitly-set `Agent.Type` (i.e., when `isSupportedSandboxType(getAgentType(agent))` returns `true`).
4. The AWF default **SHOULD** be applied before returning from `applySandboxDefaults` so that downstream consumers always receive a fully-resolved sandbox configuration.

### Strict Mode Validation

1. In strict mode, `validateStrictSandboxCustomization` **MUST** return an error if `sandbox.agent` is non-nil, `Disabled` is `false`, and `isSupportedSandboxType(getAgentType(agent))` returns `false`.
2. The error message **MUST** instruct the user to add `id: awf` explicitly to their `sandbox.agent` configuration.
3. A `sandbox.agent` block with `Disabled: true` **MUST NOT** trigger this strict-mode validation error; the disabled-agent case is handled by `validateStrictFirewall`.
4. Workflow files that use the AWF sandbox in strict mode **MUST** include an explicit `id: awf` field in their `sandbox.agent` block.

### Conformance

An implementation is considered conformant with this ADR if it satisfies all **MUST** and **MUST NOT** requirements above. Failure to meet any **MUST** or **MUST NOT** requirement constitutes non-conformance.

---

*This is a DRAFT ADR generated by the [Design Decision Gate](https://github.com/github/gh-aw/actions/runs/25240361053) workflow. The PR author must review, complete, and finalize this document before the PR can merge.*
13 changes: 13 additions & 0 deletions pkg/workflow/sandbox.go
Original file line number Diff line number Diff line change
Expand Up @@ -135,6 +135,7 @@ func migrateSRTToAWF(sandboxConfig *SandboxConfig) *SandboxConfig {
// applySandboxDefaults applies default values to sandbox configuration
// If no sandbox config exists, creates one with awf as default agent
// If sandbox config exists but has no agent, sets agent to awf (unless agent is explicitly disabled)
// If sandbox.agent is an object with no id/type (e.g., version-only), defaults the type to awf
func applySandboxDefaults(sandboxConfig *SandboxConfig, engineConfig *EngineConfig) *SandboxConfig {
// First, migrate any SRT references to AWF (codemod)
sandboxConfig = migrateSRTToAWF(sandboxConfig)
Expand Down Expand Up @@ -168,6 +169,18 @@ func applySandboxDefaults(sandboxConfig *SandboxConfig, engineConfig *EngineConf
sandboxConfig.Agent = &AgentSandboxConfig{
Type: SandboxTypeAWF,
}
return sandboxConfig
}

// If sandbox.agent is configured but has no type/ID set (e.g., a version-only object
// like { version: "v0.25.29" } that reached here without a prior `return`), default
// the type to awf so the sandbox is always enabled. This prevents a bare
// sandbox.agent object from silently disabling the firewall by leaving the type empty.
// Note: this block is only reached when Agent != nil and Disabled == false (the
// Disabled case returned early above).
if !isSupportedSandboxType(getAgentType(sandboxConfig.Agent)) {
Comment on lines +177 to +181
sandboxLog.Print("Sandbox agent has no type/ID configured, defaulting to awf")
sandboxConfig.Agent.Type = SandboxTypeAWF
}

return sandboxConfig
Expand Down
55 changes: 51 additions & 4 deletions pkg/workflow/sandbox_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -89,16 +89,63 @@ func TestApplySandboxDefaults(t *testing.T) {
},
},
},
{
// version-only object (no id/type) must default to AWF so the sandbox is
// always enabled, matching the previous analysis of the smoke-gemini bug.
name: "version-only agent defaults to AWF",
config: &SandboxConfig{
Agent: &AgentSandboxConfig{
Version: "v0.25.29",
},
},
engine: &EngineConfig{ID: "gemini"},
expected: &SandboxConfig{
Agent: &AgentSandboxConfig{
Type: SandboxTypeAWF,
Version: "v0.25.29",
},
},
},
{
// An agent object with only an empty string ID must also default to AWF.
name: "empty ID agent defaults to AWF",
config: &SandboxConfig{
Agent: &AgentSandboxConfig{},
},
engine: &EngineConfig{ID: "copilot"},
expected: &SandboxConfig{
Agent: &AgentSandboxConfig{
Type: SandboxTypeAWF,
},
},
},
{
// Explicitly disabled agent must never be overridden.
name: "disabled agent is preserved",
config: &SandboxConfig{
Agent: &AgentSandboxConfig{
Disabled: true,
},
},
engine: nil,
expected: &SandboxConfig{
Agent: &AgentSandboxConfig{
Disabled: true,
},
},
},
}

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := applySandboxDefaults(tt.config, tt.engine)
if tt.expected != nil {
require.NotNil(t, result)
require.NotNil(t, result.Agent)
assert.Equal(t, tt.expected.Agent.Type, result.Agent.Type)
require.NotNil(t, result)
require.NotNil(t, result.Agent)
assert.Equal(t, tt.expected.Agent.Type, result.Agent.Type, "agent type")
if tt.expected.Agent.Version != "" {
assert.Equal(t, tt.expected.Agent.Version, result.Agent.Version, "agent version")
}
assert.Equal(t, tt.expected.Agent.Disabled, result.Agent.Disabled, "agent disabled flag")
})
}
}
Expand Down
Loading