
feat(mcp): add sampling/createMessage support #2815

Merged: dgageot merged 2 commits into docker:main from dgageot:feat/mcp-sampling on May 18, 2026

@dgageot (Member) commented May 18, 2026

Fixes #2809.

Implements MCP sampling (sampling/createMessage) so MCP servers attached to a docker-agent can ask the host to drive its own LLM, mirroring the existing elicitation plumbing.

What

  • New tools.SamplingHandler capability (pkg/tools/sampling.go, pkg/tools/capabilities.go).
  • MCP toolset advertises Sampling in ClientCapabilities and wires CreateMessageHandler for both stdio and remote transports (pkg/tools/mcp/{mcp.go,session_client.go,stdio.go,remote.go}).
  • pkg/runtime/sampling.go: converts inbound CreateMessageParams to []chat.Message (system prompt, text/image/audio, role mapping), clones the current agent's model with the requested MaxTokens, drains the resulting stream and returns an mcp.CreateMessageResult with the assistant text, model id and a translated stopReason.
  • pkg/runtime/loop.go passes the sampling handler through tools.ConfigureHandlers.
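To make the conversion and drain steps above concrete, here is a minimal sketch. It is not the code in pkg/runtime/sampling.go: `ChatMessage`, `SamplingMessage`, `CreateMessageParams`, `Chunk`, `toChatMessages` and `drain` are hypothetical stand-ins for the real chat and MCP SDK types, only text content is modeled, and treating an empty role as "user" is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ChatMessage is a hypothetical stand-in for the runtime's chat.Message.
type ChatMessage struct {
	Role    string
	Content string
}

// SamplingMessage and CreateMessageParams are simplified, hypothetical
// stand-ins for the MCP SDK types; only text content is modeled.
type SamplingMessage struct {
	Role string
	Text string
}

type CreateMessageParams struct {
	SystemPrompt string
	Messages     []*SamplingMessage
}

// Chunk is a hypothetical streamed model response fragment.
type Chunk struct {
	Text         string
	FinishReason string
}

// toChatMessages emits the system prompt first, then maps each MCP role
// onto a chat role. Defaulting an empty role to "user" is an assumption.
func toChatMessages(p *CreateMessageParams) ([]ChatMessage, error) {
	if p == nil {
		return nil, errors.New("nil sampling request")
	}
	var out []ChatMessage
	if p.SystemPrompt != "" {
		out = append(out, ChatMessage{Role: "system", Content: p.SystemPrompt})
	}
	for i, m := range p.Messages {
		if m == nil {
			return nil, fmt.Errorf("message %d is nil", i)
		}
		role := m.Role
		switch role {
		case "":
			role = "user"
		case "user", "assistant":
			// Supported as-is.
		default:
			return nil, fmt.Errorf("message %d: unsupported role %q", i, m.Role)
		}
		out = append(out, ChatMessage{Role: role, Content: m.Text})
	}
	return out, nil
}

// drain concatenates streamed text and keeps the last non-empty finish reason.
func drain(stream <-chan Chunk) (text, finish string) {
	var sb strings.Builder
	for c := range stream {
		sb.WriteString(c.Text)
		if c.FinishReason != "" {
			finish = c.FinishReason
		}
	}
	return sb.String(), finish
}

func main() {
	msgs, err := toChatMessages(&CreateMessageParams{
		SystemPrompt: "Be brief.",
		Messages:     []*SamplingMessage{{Role: "user", Text: "Summarize this log."}},
	})
	if err != nil {
		panic(err)
	}
	stream := make(chan Chunk, 2)
	stream <- Chunk{Text: "All "}
	stream <- Chunk{Text: "good.", FinishReason: "stop"}
	close(stream)
	text, finish := drain(stream)
	fmt.Println(len(msgs), text, finish)
}
```

In the real handler the drained text, the model id and the translated stop reason would then be packed into the mcp.CreateMessageResult.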

Design notes

  • Host stays in control. Server-supplied ModelPreferences are advisory only; we use the agent's currently-selected model. Tool use is intentionally disabled (tools=nil) so sampling cannot recurse into the agent loop.
  • Structured output is cleared per call. Without this, the server would inherit the agent's JSON-schema response format and silently get a reshaped reply.
  • Hardened against misbehaving servers. Per-request limits are enforced before we touch the model: at most 256 messages, 1 MiB per text block (and per system prompt), 8 MiB per image/audio block. Nil messages and unknown roles are rejected.
  • Visibility. Inbound requests log at Info; completion logs model + finish reason + byte count at Debug.
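The per-request limits listed above could be enforced roughly as follows. This is a sketch under stated assumptions, not the merged code: `Request`, `Message` and `validate` are hypothetical names, each message carries a single content block, and the byte limits are applied to the payload as received (whether the real check measures encoded or decoded bytes is not stated in this PR description).

```go
package main

import (
	"errors"
	"fmt"
)

// Limits quoted in the design notes.
const (
	maxMessages   = 256
	maxTextBytes  = 1 << 20 // 1 MiB per text block and per system prompt
	maxMediaBytes = 8 << 20 // 8 MiB per image/audio block
)

// Message is a hypothetical, simplified sampling message with one block.
type Message struct {
	Role string
	Kind string // "text", "image" or "audio"
	Data []byte
}

// Request is a hypothetical stand-in for the inbound sampling params.
type Request struct {
	SystemPrompt string
	Messages     []*Message
}

// validate rejects a request before any model call is made.
func validate(r *Request) error {
	if r == nil {
		return errors.New("nil sampling request")
	}
	if len(r.Messages) == 0 {
		return errors.New("sampling request has no messages")
	}
	if len(r.Messages) > maxMessages {
		return fmt.Errorf("too many messages: %d > %d", len(r.Messages), maxMessages)
	}
	if len(r.SystemPrompt) > maxTextBytes {
		return errors.New("system prompt exceeds 1 MiB")
	}
	for i, m := range r.Messages {
		if m == nil {
			return fmt.Errorf("message %d is nil", i)
		}
		switch m.Role {
		case "", "user", "assistant":
			// Accepted; an empty role is assumed to default to "user".
		default:
			return fmt.Errorf("message %d: unknown role %q", i, m.Role)
		}
		limit := maxMediaBytes
		switch m.Kind {
		case "text":
			limit = maxTextBytes
		case "image", "audio":
			// Media blocks keep the larger limit.
		default:
			return fmt.Errorf("message %d: unsupported content kind %q", i, m.Kind)
		}
		if len(m.Data) > limit {
			return fmt.Errorf("message %d: %s block exceeds %d bytes", i, m.Kind, limit)
		}
	}
	return nil
}

func main() {
	ok := &Request{Messages: []*Message{{Role: "user", Kind: "text", Data: []byte("hi")}}}
	big := &Request{Messages: []*Message{{Role: "user", Kind: "text", Data: make([]byte, maxTextBytes+1)}}}
	fmt.Println(validate(ok))
	fmt.Println(validate(big))
}
```

Running every check before the model is touched means a misbehaving server costs a few length comparisons rather than an inference call.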

Tests

pkg/runtime/sampling_test.go covers the message conversion (text, system prompt, image data URL, audio fallback, default role, unsupported role), every rejection path (nil request, system-prompt-only, nil entry, oversize text/image/system-prompt, too-many-messages), the finish-reason → stopReason mapping, and the data-URL helper. Existing MCP client mocks (pkg/tools/mcp/{mcp_test.go,reconnect_test.go}) were updated for the new interface method.
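For readers unfamiliar with the two small helpers mentioned above, a sketch of what they might look like follows. The function names `stopReason` and `dataURL` and the finish-reason strings "stop" and "length" are assumptions; "endTurn" and "maxTokens" are stop reasons defined by the MCP specification. The fallback to "endTurn" for unrecognized reasons is also an assumption.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// stopReason translates a provider finish reason into an MCP stopReason.
// The input strings are assumptions about the runtime's finish reasons.
func stopReason(finish string) string {
	switch finish {
	case "length":
		return "maxTokens"
	default:
		// "stop", empty and unrecognized reasons are treated as a normal end of turn.
		return "endTurn"
	}
}

// dataURL builds an RFC 2397 data URL from raw (not yet encoded) bytes.
func dataURL(mimeType string, raw []byte) string {
	return "data:" + mimeType + ";base64," + base64.StdEncoding.EncodeToString(raw)
}

func main() {
	// Table-driven check in the spirit of the tests described above.
	cases := []struct{ in, want string }{
		{"stop", "endTurn"},
		{"length", "maxTokens"},
		{"", "endTurn"},
	}
	for _, c := range cases {
		fmt.Printf("%q -> %s (want %s)\n", c.in, stopReason(c.in), c.want)
	}
	fmt.Println(dataURL("image/png", []byte("hi"))) // prints data:image/png;base64,aGk=
}
```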

Validation

  • task lint — 0 issues (golangci-lint + custom cops + go mod tidy --diff)
  • task test — all packages pass

@dgageot dgageot requested a review from a team as a code owner May 18, 2026 10:11
@docker-agent

PR Review Failed — The review agent encountered an error and could not complete the review. View logs.

@dgageot dgageot merged commit 75b2c4c into docker:main May 18, 2026
8 checks passed
@EronWright

Requirements compliance (against #2809)

| Requirement | Status |
|---|---|
| Capability declaration: MUST declare sampling at init; MUST declare sampling.tools if tools supported | Sampling: &mcp.SamplingCapabilities{} declared (no Tools field, correct since tools not supported). Servers see the flag absent and MUST NOT send tool-enabled requests. ✅ |
| Human-in-the-loop: SHOULD present prompt for user review before forwarding; SHOULD allow edits; SHOULD present response before returning | Not implemented. No approval dialog, no pause, no visibility into the prompt or response. ❌ |
| Tool use: ToolUseContent/ToolResultContent pairing; iteration limits | Intentionally excluded. Tools capability not declared, so servers MUST NOT send tool requests. A valid and correct deferral. ✅ (deferred) |
| Model selection: respect modelPreferences hints; client retains final authority | MaxTokens is honored. Priority scores and hints (model name substrings) are silently ignored; host's configured model is always used. Client retains authority ✅, hints not respected ❌ |
| Security: user approval, rate limiting, validate content, iteration limits on tool loops | Input validation (size limits, message count) is solid. No rate limiting. No user approval (same gap as HitL). ⚠️ |

The two open items are human-in-the-loop (no approval UI) and model preferences (hints dropped rather than considered). Both are acknowledged as intentional in the design notes — flagging here for follow-up tracking.
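As one possible direction for the model-preferences follow-up, hints could be treated as ordered, case-insensitive substrings of model names while the host keeps the final say. The sketch below is hypothetical and not part of this PR: `pickModel`, the `available` list and the model names are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// pickModel returns the first available model whose name contains one of the
// hints (checked in order, case-insensitively). If nothing matches, the
// host's current model wins, so the client retains final authority.
func pickModel(current string, available, hints []string) string {
	for _, h := range hints {
		h = strings.ToLower(strings.TrimSpace(h))
		if h == "" {
			continue
		}
		for _, m := range available {
			if strings.Contains(strings.ToLower(m), h) {
				return m
			}
		}
	}
	return current
}

func main() {
	available := []string{"acme-small", "acme-large"} // hypothetical model ids
	fmt.Println(pickModel("acme-small", available, []string{"Large"}))
	fmt.Println(pickModel("acme-small", available, []string{"other-vendor"}))
}
```

Restricting matches to an allow-list the host already configured keeps the hints advisory: a server can nudge the choice but cannot select a model the user never enabled.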

@EronWright

EronWright commented May 18, 2026

Thanks @dgageot for getting a jump on this; I was the OP. Would you mind if I filed a follow-up issue to implement tool callbacks and close the remaining functional gap? The "tools" aspect allows the LLM to invoke tool callbacks while processing the sampling message.

Thanks again!

sequenceDiagram
    participant H as cagent
    participant S as MCP Server
    participant L as LLM

    activate H
    H->>+S: tools/call {name, arguments}

    note over S: needs LLM inference

    S->>+H: sampling/createMessage<br/>{messages, tools: [...]}
    H->>+L: chat completion
    L-->>-H: ToolUseContent<br/>stopReason: "toolUse"
    H-->>-S: CreateMessageResult<br/>{tool_use, stopReason: "toolUse"}

    note over S: executes tool locally

    S->>+H: sampling/createMessage<br/>{messages + tool_use + tool_result, tools: [...]}
    H->>+L: chat completion
    L-->>-H: TextContent<br/>stopReason: "endTurn"
    H-->>-S: CreateMessageResult<br/>{text, stopReason: "endTurn"}

    S-->>-H: tool result
    deactivate H


Linked issue: feat: MCP sampling (sampling/createMessage)