fix: sidebar prompt injection defense (v0.13.4.0) by garrytan · Pull Request #611 · garrytan/gstack

garrytan · 2026-03-29T01:21:36Z

Summary

Three security layers for the Chrome sidebar extension, which has bash access via Claude:

XML prompt framing with trust boundaries — user messages wrapped in <user-message> tags, XML special chars escaped to prevent tag injection. System prompt explicitly instructs Claude to treat content as data.
Bash command allowlist — system prompt restricts bash to browse binary commands only ($B goto, $B click, $B snapshot). All other commands (curl, rm, cat) are forbidden. Prevents prompt injection from escalating to arbitrary code execution.
Opus default — sidebar now uses the most injection-resistant model by default.

Bug fix: sidebar-agent.ts was silently rebuilding its own Claude args from scratch, ignoring --model, --allowedTools, and all other server-side arg changes. Fixed to use queued args from server.ts.

Design doc: docs/designs/ML_PROMPT_INJECTION_KILLER.md covers the follow-up ML classifier PR (DeBERTa via @huggingface/transformers, BrowseSafe-bench red team harness, and the ambitious Bun-native 5ms inference vision).

Test Coverage

12 new tests in browse/test/sidebar-security.test.ts:

XML escaping (tag closing attacks, ampersands, clean passthrough)
Command allowlist (system prompt contains restrictions)
Opus model default
Trust boundary instructions
Sidebar-agent arg plumbing fix

All existing tests pass. Zero regressions.

Pre-Landing Review

CEO review (SCOPE EXPANSION): 6 proposals, 5 accepted, 1 deferred.
Eng review: 1 issue (ONNX + compiled Bun compat), resolved with @huggingface/transformers v4.
Codex review: 15 findings, 4 critical, all resolved (command allowlist instead of removing Bash, arg plumbing fix, XML escaping, salted payload hashes).

Test plan

12 sidebar security tests pass
Full test suite passes (0 failures)

🤖 Generated with Claude Code

…t, arg plumbing Three security fixes for the Chrome sidebar: 1. XML-framed prompts with trust boundaries and escape of < > & in user messages to prevent tag injection attacks. 2. Bash command allowlist in system prompt — only browse binary commands ($B goto, $B click, etc.) allowed. All other bash commands forbidden. 3. Fix sidebar-agent.ts ignoring queued args — server-side --model and --allowedTools changes were silently dropped because the agent rebuilt args from scratch instead of using the queue entry. Also defaults sidebar to Opus (harder to manipulate). 12 new tests covering XML escaping, command allowlist, Opus default, trust boundary instructions, and arg plumbing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ML prompt injection defense design doc + P0 TODO for follow-up PR. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions · 2026-03-29T01:24:48Z

E2E Evals: ✅ PASS

8/8 tests passed | $1.06 total cost | 12 parallel runners

Suite	Result	Status	Cost
e2e-browse	2/2	✅	$0.13
e2e-deploy	2/2	✅	$0.24
e2e-qa-workflow	1/1	✅	$0.43
llm-judge	1/1	✅	$0.02
e2e-deploy	2/2	✅	$0.24

12x ubicloud-standard-2 (Docker: pre-baked toolchain + deps) | wall clock ≈ slowest suite

loadSession() was restoring worktreePath and claudeSessionId from prior crashes. The worktree directory no longer existed (deleted on cleanup) and --resume with a dead session ID caused claude to fail silently. Now validates worktree exists on load and clears stale claude session IDs on every server restart. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

VERSION was bumped to 0.13.4.0 in garrytan#611 but package.json was left at 0.13.3.0, causing the gen-skill-docs version-match test to fail. https://claude.ai/code/session_013TrjT8R7grdziMP1ffRivA

* fix: sidebar prompt injection defense — XML framing, command allowlist, arg plumbing Three security fixes for the Chrome sidebar: 1. XML-framed prompts with trust boundaries and escape of < > & in user messages to prevent tag injection attacks. 2. Bash command allowlist in system prompt — only browse binary commands ($B goto, $B click, etc.) allowed. All other bash commands forbidden. 3. Fix sidebar-agent.ts ignoring queued args — server-side --model and --allowedTools changes were silently dropped because the agent rebuilt args from scratch instead of using the queue entry. Also defaults sidebar to Opus (harder to manipulate). 12 new tests covering XML escaping, command allowlist, Opus default, trust boundary instructions, and arg plumbing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.13.4.0) ML prompt injection defense design doc + P0 TODO for follow-up PR. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: clear stale worktree and claude session on sidebar reconnect loadSession() was restoring worktreePath and claudeSessionId from prior crashes. The worktree directory no longer existed (deleted on cleanup) and --resume with a dead session ID caused claude to fail silently. Now validates worktree exists on load and clears stale claude session IDs on every server restart. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

garrytan and others added 2 commits March 28, 2026 18:20

chore: bump version and changelog (v0.13.4.0)

abf030c

ML prompt injection defense design doc + P0 TODO for follow-up PR. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

garrytan merged commit ea7dbc9 into main Mar 29, 2026
18 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: sidebar prompt injection defense (v0.13.4.0)#611

fix: sidebar prompt injection defense (v0.13.4.0)#611
garrytan merged 3 commits into
mainfrom
garrytan/extension-prompt-injection-defense

garrytan commented Mar 29, 2026

Uh oh!

github-actions Bot commented Mar 29, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

garrytan commented Mar 29, 2026

Summary

Test Coverage

Pre-Landing Review

Test plan

Uh oh!

github-actions Bot commented Mar 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

E2E Evals: ✅ PASS

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions Bot commented Mar 29, 2026 •

edited

Loading