fix(server): align ProtocolError re-throw with spec error classification#1769
fix(server): align ProtocolError re-throw with spec error classification#1769felixweinberger wants to merge 11 commits intomainfrom
Conversation
Re-throw all ProtocolError instances from the tools/call handler as JSON-RPC errors. Previously only UrlElicitationRequired was re-thrown; other ProtocolErrors thrown inside the try block (output validation, task misconfiguration) were wrapped as isError: true tool results. Per the MCP spec's error classification: - Input validation failures are tool-execution errors (isError: true) - Output validation failures are server errors (JSON-RPC InternalError) - Task misconfiguration is a protocol mismatch (JSON-RPC error) Changes: - validateToolInput now throws plain Error so input validation stays tool-level (isError: true) - validateToolOutput now uses InternalError code instead of InvalidParams (output validation failure is a server-side bug, not client fault) - catch block re-throws any ProtocolError, matching python-sdk semantics and allowing tool handlers to deliberately throw protocol-level errors
🦋 Changeset detectedLatest commit: 54ef5ea The changes in this PR will be included in the next version bump. This PR includes changesets to release 5 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
@modelcontextprotocol/client
@modelcontextprotocol/server
@modelcontextprotocol/express
@modelcontextprotocol/fastify
@modelcontextprotocol/hono
@modelcontextprotocol/node
commit: |
There was a problem hiding this comment.
The code change itself is small and logically sound, but this is an explicitly breaking change to core error handling in the tools/call handler that warrants human sign-off — particularly the design choice to keep MethodNotFound for task-required-without-task (vs InvalidParams), and whether the changeset should be minor given the breaking nature.
Extended reasoning...
Overview
This PR broadens the catch block in McpServer's tools/call handler from only re-throwing UrlElicitationRequired to re-throwing all ProtocolError instances. It also downgrades validateToolInput from throwing ProtocolError to plain Error (so input validation stays as isError: true), and changes output validation error codes from InvalidParams to InternalError. Documentation and tests are updated accordingly.
Security risks
No security concerns. The change affects error classification, not authorization or data exposure.
Level of scrutiny
This is a self-described breaking change to a core code path. The tools/call handler is exercised by every tool invocation. Consumers relying on result.isError to detect output validation failures or handler-thrown ProtocolErrors will see different behavior. The changeset marks it as minor but the PR description and migration docs both call it breaking — a human should verify the semver classification. The files are also covered by CODEOWNERS (@modelcontextprotocol/typescript-sdk).
Other factors
The two bug reports are both pre-existing issues not introduced by this PR. The code logic is correct and well-motivated (aligning with spec classification and Python SDK behavior). Tests are updated to match. Migration docs are thorough. The concern is purely about the design-level decisions that a maintainer should validate.
bhosmer-ant
left a comment
There was a problem hiding this comment.
The spec-alignment reasoning is correct (input validation → tool result, output validation → InternalError, task-required → InvalidParams all check out against MCP spec §Tools/Error Handling). Bughunter found two issues with the broadened catch-all:
Inline comment below on the main one (getTask InvalidParams now exposed).
Also — pre-existing consistency gap this PR exposes (not in diff, at ~L313): handleAutomaticTaskPolling throws plain Error('No task store provided for task-capable tool.') for a server misconfig, while the analogous taskSupport mismatch at L177 uses ProtocolError(InternalError). Both are server-side config bugs, but with this PR's broadened re-throw, L177 now becomes a JSON-RPC error while L313 stays {isError: true}. Worth aligning while you're here?
Test-coverage nits (non-blocking):
- Tests at L1429/1551/6432 assert error messages but not codes — since the PR's thesis is code alignment, consider
.rejects.toMatchObject({ code: ProtocolErrorCode.X, message: /.../ })(matching the existing pattern at L1824) - The new auto-polling output validation path (commit 3f29f35, mcp.ts:335-339) has no test coverage
| @@ -206,8 +206,8 @@ export class McpServer { | |||
| await this.validateToolOutput(tool, result, request.params.name); | |||
| return result; | |||
| } catch (error) { | |||
There was a problem hiding this comment.
This broadened catch now re-throws ProtocolError(InvalidParams) from RequestTaskStore.getTask (taskManager.ts:666-668) when a task vanishes mid-polling at L327. Before this PR, that was wrapped as {isError: true}; now it surfaces as a JSON-RPC InvalidParams error — but a task going missing during automatic polling isn't the client's fault.
Options:
- Catch-and-convert in
handleAutomaticTaskPollingbefore L327'sgetTaskresult is consumed - Or fix the underlying
RequestTaskStore.getTaskto throwInternalErrorinstead ofInvalidParamsfor task-not-found
Also: the null-check at L328-330 (if (!task) throw new ProtocolError(InternalError, ...)) is dead code — getTask throws before it can return null.
…ling Addresses review from @bhosmer-ant on #1769: - handleAutomaticTaskPolling: wrap getTask in try/catch and convert ProtocolError(InvalidParams) to InternalError when a task vanishes mid-poll. A task going missing during automatic polling is a server-side issue — the client didn't request a task. - handleAutomaticTaskPolling: change plain Error('No task store...') to ProtocolError(InternalError) for consistency with the analogous taskSupport config check at L177. - Update three tests to assert error codes (not just messages) using toMatchObject, matching the pattern at L1824. Follow-up (non-blocking): test coverage for the auto-polling output validation path added in 3f29f35.
There was a problem hiding this comment.
LGTM — all prior concerns addressed, only nits remain.
Extended reasoning...
Overview
This PR broadens the tools/call catch block to re-throw all ProtocolError instances as JSON-RPC errors instead of wrapping them as { isError: true } results. It also changes input validation to throw plain Error (spec-compliant), upgrades output validation error codes from InvalidParams to InternalError, fixes a pre-existing MethodNotFound → InvalidParams error code for task-required-without-task, adds validateToolOutput to the automatic polling path, and handles task-vanishing mid-poll by converting InvalidParams to InternalError. Five changed files total: changeset, two migration doc updates, the main mcp.ts logic, and integration tests.
Security Risks
No security-sensitive code paths (auth, crypto, permissions) are touched. The change affects error surface area: protocol errors that were previously swallowed now propagate to clients. This is the intended fix and is spec-compliant.
Level of Scrutiny
Moderate. This is a targeted behavior fix in the tools/call error handling path with breaking changes documented in the migration guide. The logic is well-contained and the test changes directly validate the new behavior. The change is consistent with how the python-sdk handles the equivalent path.
Other Factors
All substantive concerns from prior review rounds have been addressed: the MethodNotFound error code has been corrected, validateToolOutput is now called in the automatic polling path with a correct status === completed guard, and task-vanishing during polling now surfaces as InternalError rather than InvalidParams. Two nits remain (dead null check after try-catch, minor doc gap) but are captured as inline comments and do not affect correctness.
- Remove dead null-guard after getTask try-catch (getTask never returns null) - Add migration doc bullet for no-task-store InternalError case - Add test for output schema validation in automatic-polling path Prior commits on this branch already addressed: MethodNotFound -> InvalidParams, plain Error -> ProtocolError for no-task-store, getTask InvalidParams -> InternalError conversion, status-guarded validateToolOutput, and error-code assertions in tests.
There was a problem hiding this comment.
LGTM — all previously raised issues are addressed in the current revision.
Extended reasoning...
Overview
This PR touches packages/server/src/server/mcp.ts (the tools/call catch block, validateToolInput, validateToolOutput, and handleAutomaticTaskPolling), the corresponding integration tests, and migration documentation.
Security risks
The change broadens the catch block from re-throwing only UrlElicitationRequired to re-throwing all ProtocolError instances. This is intentional and correct per the MCP spec. No new attack surface is introduced; server-side bugs (output validation failures) now surface as InternalError instead of being silently swallowed as isError: true results.
Level of scrutiny
This is a breaking change to a production error-handling path. Prior review rounds identified several real bugs: wrong error code for task-required-without-task, missing output validation in the automatic polling path, dead null check, failed-task validation guard, and vanishing-task error surfacing. All have been addressed in the current revision as confirmed by the diff.
Other factors
The migration guides enumerate all five behavioral changes. The test-only nit about dead code in the new automatic-polling test (unreachable null guard and unused getTask handler) has no impact on test correctness or production behavior. Test coverage for the new validateToolOutput path in automatic polling is present and correct.
| const task = await ctx.task.store.getTask(ctx.task.id); | ||
| if (!task) { | ||
| throw new Error('Task not found'); | ||
| } | ||
| return task; | ||
| }, |
There was a problem hiding this comment.
🟡 The getTask handler registered in the new automatic-polling test (lines 6716–6721) contains two forms of dead code: its if (!task) null check is unreachable because RequestTaskStore.getTask throws ProtocolError(InvalidParams) rather than returning null, and the entire handler is never invoked in this test since handleAutomaticTaskPolling calls ctx.task.store.getTask(taskId) directly, bypassing all registered tool handlers. Consider removing the handler or replacing it with a comment explaining why no getTask override is needed here.
Extended reasoning...
What the bug is
The new test added at line 6642 (should validate output schema for taskSupport "optional" tool result returned via automatic polling) registers a getTask handler as part of the tool task handler object:
getTask: async (_args, ctx) => {
const task = await ctx.task.store.getTask(ctx.task.id);
if (!task) { // dead code
throw new Error("Task not found");
}
return task;
},This handler contains two independent dead-code problems.
Dead null check
In a handler context, ctx.task.store is a RequestTaskStore. Looking at taskManager.ts:666-669, RequestTaskStore.getTask wraps the underlying InMemoryTaskStore.getTask and explicitly throws ProtocolError(InvalidParams) when the underlying store returns null — it never propagates a null return value. The TypeScript return type is Promise<Task> (non-nullable). Therefore getTask has exactly two observable outcomes: it throws, or it returns a valid non-null Task. The if (!task) branch at line 6717 can never evaluate to true, making lines 6717–6719 unreachable dead code. This is the same pattern the PR itself cleaned up in production code (the if (!updatedTask) guard in handleAutomaticTaskPolling).
Handler never invoked
More fundamentally, this particular test exercises the automatic polling path — the client calls callTool without task augmentation, triggering handleAutomaticTaskPolling in mcp.ts. That method calls ctx.task.store.getTask(taskId) directly on the RequestTaskStore at line 328 (inside the polling loop). It does not dispatch through the registered getTask handler on the tool. Tool handlers for getTask are only invoked when a client sends an explicit tasks/get request, which does not happen in this test. The registered getTask handler is therefore completely inert for the scenario under test.
Step-by-step proof
- The test registers a
registerToolTaskwithtaskSupport: optionaland agetTaskhandler. - The client calls
callToolwithout ataskfield →isTaskRequest = false. - In the
tools/callhandler:taskSupport === optional && !isTaskRequest && isTaskHandler→ callshandleAutomaticTaskPolling. handleAutomaticTaskPollingcallsctx.task.store.getTask(taskId)directly at line 328 — theRequestTaskStore, not the user-registered handler.- The registered
getTaskhandler is never called. Its body, including theif (!task)check, is dead code for this test.
Why the refutation does not change the conclusion
The refutation correctly notes that this is test-only scaffolding with zero impact on test correctness or production behavior. That observation is accurate and informs the severity — this is genuinely a nit. However, the dead code is still real: a reader studying this test to understand how getTask handlers work would be misled into thinking (a) the handler runs during automatic polling, and (b) ctx.task.store.getTask can return null. Both impressions are incorrect.
How to fix
Remove the getTask handler from the test entirely, or replace it with a comment noting that no getTask override is needed since this test exercises the automatic polling path (which bypasses tool-level handlers). Also remove the if (!task) null guard from any getTask handlers elsewhere in the test suite that contain the same pattern.
Re-throws all
ProtocolErrorinstances from thetools/callhandler as JSON-RPC errors. Previously onlyUrlElicitationRequiredwas re-thrown; otherProtocolErrors thrown inside the try block (output validation, task misconfiguration) were silently wrapped asisError: truetool results.Motivation and Context
The MCP spec classifies tool errors into two categories:
isError: true): API failures, input validation errors, business logic errorsThe current catch block only re-throws
UrlElicitationRequired, which means output validation failures (a server-side bug) and task misconfiguration errors get demoted to tool-levelisError: trueresults when they should be protocol-level JSON-RPC errors.This also means tool handlers that deliberately
throw new ProtocolError(...)get their intent overridden — the python-sdk re-throws allMCPErrorin the equivalent path.How Has This Been Tested?
Updated existing integration tests to reflect the new behavior. All tests pass locally.
Breaking Changes
Yes — output validation failures and task-required-without-task now throw
ProtocolErrorinstead of returning{ isError: true }. See migration guide updates.Input validation behavior is unchanged (still
isError: true, per spec).Types of changes
Checklist
Additional context
Surfaced while triaging #1674, which proposed the broadened re-throw but would have also promoted input validation to JSON-RPC (spec violation). This PR applies the broadening while keeping input validation tool-level by changing
validateToolInputto throw plainErrorinstead ofProtocolError.