Prototype Starlark script middleware for vMCP #4714
Draft
Prototype a "tool script middleware" that lets agents write Starlark scripts to orchestrate multiple MCP tool calls in a single atomic operation. This validates the Starlark execution model from RFC THV-0060 without committing to the full session initialization scope.

Key components:
- Starlark execution engine with step limits and script wrapping for top-level return support
- Tool bridge converting MCP tools into callable Starlark functions with type conversion between Go/JSON and Starlark values
- parallel() builtin for concurrent fan-out of tool calls
- HTTP middleware intercepting execute_tool_script and injecting it into tools/list with dynamic descriptions
- Wired into vMCP server above authz so scripts only see authorized tools

Includes unit tests, in-process acceptance tests, a k8s e2e test, demo manifests for a Kind cluster with 8 enterprise dummy MCP servers, and /incident-triage skills for interactive demos.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
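The "script wrapping for top-level return support" listed in the commit message can be pictured roughly as follows. This is a minimal sketch and not the code in `pkg/script/engine.go`: the helper name `wrapScript` and the identifiers `__script_main__` and `__result__` are hypothetical. It assumes the engine nests the user's script inside a function body, so that a top-level `return` becomes an ordinary function return whose value can be read back after execution.

```go
package main

import (
	"fmt"
	"strings"
)

// wrapScript is a hypothetical helper (not taken from the PR). It indents the
// user's script into the body of a generated function and then calls that
// function, storing the returned value in a global the host can read.
func wrapScript(src string) string {
	var b strings.Builder
	b.WriteString("def __script_main__():\n")
	wrote := false
	for _, line := range strings.Split(strings.TrimRight(src, "\n"), "\n") {
		if strings.TrimSpace(line) == "" {
			// Keep blank lines so reported line numbers are off by exactly one.
			b.WriteString("\n")
			continue
		}
		b.WriteString("    " + line + "\n")
		wrote = true
	}
	if !wrote {
		// An empty script still needs a statement to form a valid body.
		b.WriteString("    pass\n")
	}
	b.WriteString("__result__ = __script_main__()\n")
	return b.String()
}

func main() {
	// The tool name list_services is made up for illustration.
	src := "services = list_services()\nreturn len(services)\n"
	fmt.Print(wrapScript(src))
}
```

Plain indentation like this is naive: it also shifts the contents of multi-line string literals and assumes the script indents with spaces, so a real implementation would need to handle those cases.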
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation
Alternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
This was referenced Apr 10, 2026
Note
This is a prototype / proof-of-concept. It validates the Starlark execution model described in RFC THV-0060 without committing to the full session initialization scope. Not intended for merge as-is — this is a starting point for discussion and iteration.
Summary
Adds an `execute_tool_script` virtual tool to vMCP that accepts a Starlark script. The script can call any authorized MCP tool as a function, use loops and conditionals to cross-reference results, fan out calls with `parallel()`, and return a single aggregated result, all server-side in one tool call.

What's included
- `pkg/script/`: Starlark execution engine, MCP tool bridge with type conversion, `parallel()` builtin for concurrent fan-out, HTTP middleware for request interception and `tools/list` injection
- `/incident-triage` skill for interactive demos

Type of change
Test plan
- `task test`
- `task lint-fix`

Deployed to a local Kind cluster with 8 dummy MCP servers. Connected via `thv run` as a remote MCP server. Verified that `execute_tool_script` appears in `tools/list` with a dynamic description, then executed scripts with loops over degraded services, `parallel()` fan-out, and string parsing. Compared `/incident-triage` (scripted) vs `/incident-triage-lame` (sequential) side-by-side.

Changes
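| File | Description |
| --- | --- |
| `pkg/script/engine.go` | Starlark execution engine: script wrapping for top-level `return`, step limits, print capture |
| `pkg/script/bridge.go` | MCP tool bridge with type conversion, `parallel()` builtin, result parsing with SDK wrapper unwrapping |
| `pkg/script/middleware.go` | HTTP middleware: intercepts `execute_tool_script`, injects into `tools/list` with dynamic description, `innerToolCaller` for backend dispatch |
| `pkg/script/*_test.go` | Tests |
| `pkg/vmcp/server/server.go` | `ScriptMiddleware` config field, applied above authz in `Handler()` |
| `cmd/vmcp/app/commands.go` | |
| `test/e2e/.../virtualmcp_script_test.go` | k8s e2e test |
| `demo/script-middleware/` | Demo manifests for a Kind cluster with 8 dummy MCP servers |
| `.claude/skills/incident-triage/` | Skill using `execute_tool_script` with `parallel()` |
| `.claude/skills/incident-triage-lame/` | Sequential variant used for comparison |

Special notes for reviewers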
This is a prototype. Known limitations and things to resolve before any production path:
- `httptest.NewRecorder` used in production code for inner tool calls (works fine, but unconventional)
- `parallel()` creates a goroutine per callable with no concurrency cap
- `{"result": value}` structured content unwrapping is specific to mcp-go SDK behavior

Generated with Claude Code