Skip to content

Preserve fresh per-request identity in vMCP backend transports#5335

Merged
tgrunnagle merged 6 commits into
mainfrom
issue_5323_vmcp-stale-identity
May 20, 2026
Merged

Preserve fresh per-request identity in vMCP backend transports#5335
tgrunnagle merged 6 commits into
mainfrom
issue_5323_vmcp-stale-identity

Conversation

@tgrunnagle
Copy link
Copy Markdown
Contributor

Summary

  • vMCP backend HTTP clients captured the request *auth.Identity once at session/client-creation time and re-injected that snapshot on every outgoing backend request, overriding the fresh identity placed on req.Context() by auth.TokenValidator.Middleware. Because that fresh identity carries upstream tokens transparently refreshed by upstreamtoken.InProcessService.GetAllValidTokens, the override silently re-injected stale upstream access tokens for the entire session lifetime — users saw a forced re-auth roughly every 24h (the upstream provider's access-token lifespan).
  • The captured identity is now treated as a fallback: it is injected only when req.Context() carries no identity. This preserves the streamable-HTTP Close()/DELETE teardown path — mcp-go constructs that request from context.Background(), which loses the per-request identity — while letting normal backend calls use the freshly refreshed identity from the request context.
  • The fix is applied to both transport chains that exhibited the bug: identityPropagatingRoundTripper in pkg/vmcp/client/client.go (per-call client) and the new identityRoundTripper in pkg/vmcp/session/internal/backend/mcp_session.go (persistent session-backed client). Existing health-check marker propagation in the per-call client is preserved.

Closes #5323

Type of change

  • Bug fix
  • New feature
  • Refactoring (no behavior change)
  • Dependency update
  • Documentation
  • Other (describe):

Test plan

  • Unit tests (task test)
  • Linting (task lint-fix)
  • Manual testing (describe below)

Unit-test coverage added/updated:

  • pkg/vmcp/client/client_test.go — round-tripper tests reorganized into Per-request / Fallback / Health-check sections; new cases assert that a fresh identity on req.Context() is preserved unchanged, that the fallback is injected only when the request context has no identity, and that the health-check marker behavior is unchanged.
  • pkg/vmcp/session/internal/backend/roundtripper_test.go — mirrors the above for the session-backed identityRoundTripper.
  • pkg/vmcp/session/internal/backend/mcp_session_identity_refresh_test.go (new) — higher-level regression test that exercises the full transport chain (headerForwardRoundTripperidentityRoundTripperauthRoundTripper with UpstreamInjectStrategyhttp.DefaultTransport) against a fake backend. Asserts that the upstream Authorization header on each tools/call reflects the identity placed on the per-request context, NOT the identity captured at session-init time. Verified to fail under the pre-fix always-override behavior, so it is a true [vMCP] identityPropagatingRoundTripper captures stale identity, bypassing per-request upstream token refresh #5323 regression guard.

Manual verification: traced through auth.TokenValidator.MiddlewareInProcessService.GetAllValidTokens → vMCP outgoing-auth strategies (upstream_inject, tokenexchange, aws_sts) to confirm that all three read identity from req.Context() and therefore benefit from the fix.

API Compatibility

  • This PR does not break the v1beta1 API, OR the api-break-allowed label is applied and the migration guidance is described above.

Changes

File Change
pkg/vmcp/client/client.go identityPropagatingRoundTripper no longer overrides identity on req.Context(); the captured identity is now a fallback used only when the request context carries none. Doc comments rewritten to document the invariant.
pkg/vmcp/session/internal/backend/mcp_session.go New identityRoundTripper with the same fallback-only semantics, inserted into the session-backed transport chain. Documents asymmetric health-check handling (not needed in this chain).
pkg/vmcp/client/client_test.go Reorganized round-tripper tests; new cases for per-request identity preservation, fallback-only injection, and health-check marker behavior.
pkg/vmcp/session/internal/backend/roundtripper_test.go Same coverage for the session-backed round-tripper.
pkg/vmcp/session/internal/backend/mcp_session_identity_refresh_test.go New higher-level regression test exercising the full transport chain end-to-end.

Does this introduce a user-facing change?

Yes. Users of vMCP backends configured with upstreamInject, tokenExchange, or awsSTS outgoing auth no longer see forced re-auth when the upstream provider's access token expires mid-session. Upstream tokens are transparently refreshed on every tool call as long as the vMCP-issued refresh token remains valid.

Special notes for reviewers

identityPropagatingRoundTripper (pkg/vmcp/client) and identityRoundTripper
(pkg/vmcp/session/internal/backend) captured the request identity once at
session/client-creation time and re-injected that snapshot on every
outgoing backend request, overriding the fresh identity placed on
req.Context() by auth.TokenValidator.Middleware. That fresh identity
carries upstream tokens transparently refreshed by
upstreamtoken.InProcessService.GetAllValidTokens, so overriding it
silently re-injected stale upstream access tokens for the entire
session lifetime — manifesting as a forced re-auth roughly every 24h.

The captured identity is now treated as a fallback: it is injected only
when req.Context() carries no identity. This preserves the existing
streamable-HTTP Close()/DELETE teardown path (mcp-go constructs that
request from context.Background(), which loses the per-request identity)
while letting normal backend calls use the freshly refreshed identity
from the request context.

Closes #5323
Fixed issues from code review:
- HIGH: Added higher-level regression test exercising the full transport
  chain against fakeBackend, asserting that a fresh identity placed on the
  per-request context drives the upstream Authorization header on tools/call
  while the captured session-init identity authenticates the initialize
  handshake. Manually verified the test fails under the pre-fix
  always-override behavior, so it is a true #5323 regression guard.
- MEDIUM: Documented the asymmetric health-check handling on the
  identityRoundTripper doc comment in mcp_session.go (no marker propagation
  here because health probes do not flow through session-backed clients).
- MEDIUM: Clarified that the fallback identity in httpBackendClient is
  captured per-call (not per-session) and is functionally only required to
  keep mcp-go's Close() DELETE authenticated; the bug being fixed lives in
  the persistent-session path.
- MEDIUM: Restructured the round-tripper test sections in both
  client_test.go and roundtripper_test.go: added a block-comment header
  summarizing the design invariant, grouped tests into Per-request /
  Fallback / Health-check sub-sections, and renamed *_NoFallback_* tests
  to *_PerRequestIdentity_* so the canonical path leads with the canonical
  name.
- Lint cleanup: a pre-existing unparam failure on fakeBackend.headersFor
  was tripped on this branch; resolved by having the new integration test
  call headersFor with both initialize and tools/call methods.
Fixed issues from iteration 2 code review:
- MEDIUM: Added a maintenance-note comment block at the Initialize
  assertion in mcp_session_identity_refresh_test.go explaining that the
  block is load-bearing for the unparam linter (keeps
  fakeBackend.headersFor's `method` parameter genuinely keyed by method)
  as well as for test documentation. Closes the readability gap a future
  contributor might trip on if they tried to delete the Initialize block
  as redundant.

Deferred MEDIUM issues filed as follow-ups:
- #5333: Consolidate the duplicate
  identityPropagatingRoundTripper and identityRoundTripper implementations
  into a shared internal transport package.
- #5334: Extend the identity-refresh integration test to
  wire TokenValidator.Middleware + a fake UpstreamTokenRefresher (AC5 as
  literally written on #5323).
@github-actions github-actions Bot added the size/M Medium PR: 300-599 lines changed label May 19, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.35%. Comparing base (2ce3ef7) to head (76ad527).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5335      +/-   ##
==========================================
- Coverage   68.36%   68.35%   -0.02%     
==========================================
  Files         620      620              
  Lines       63316    63258      -58     
==========================================
- Hits        43285    43239      -46     
+ Misses      16800    16791       -9     
+ Partials     3231     3228       -3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor Author

@tgrunnagle tgrunnagle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multi-agent review summary

Reviewed against the repo's .claude/rules/ (go-style, vmcp-anti-patterns, security, testing, pr-creation) using 5 specialist agents (auth/security, concurrency, Go code quality, vMCP architecture, general). Codex cross-review skipped (CLI not installed).

Bottom line: the fix is correct and well-tested at the layer it targets. No HIGH findings. Six pre-merge polish items improve clarity, robustness, and drift-resistance.

Consensus findings (score ≥ 7)

# Severity Where Title
1 MEDIUM mcp_session_identity_refresh_test.go "Maintenance note" + Initialize assertion is a lint-driven test smell
2 MEDIUM both round-trippers Duplicate implementation risks drift; add cross-reference comments now
3 MEDIUM client.go factory comment Doc claim understates fallback scope (injects on any context lacking identity)
4 LOW pkg/auth/context.go (not in diff) IdentityFromContext returns ok=true for typed-nil *Identity
5 LOW 4 files Identity-invariant prose duplicated across 4 files (drift risk)
6 LOW both round-tripper structs Document immutability invariant on the captured *Identity

Details on #1, #2, #3, #5, #6 are inline. #4 is independent of this PR's diff so noted here:

pkg/auth/context.go:51IdentityFromContext does ctx.Value(...).(*Identity) and returns (nil, true) if a caller stored a typed-nil *Identity via raw context.WithValue. The new round-tripper guards if !hasIdentity && fallbackIdentity != nil so a typed-nil would suppress the fallback and pass nil to UpstreamInjectStrategy, which would nil-deref on identity.UpstreamTokens[providerName]. WithIdentity is nil-safe so this is latent rather than active. Recommendation: tighten to return identity, ok && identity != nil. One-line, defensive, no caller impact. Out of scope for this PR — file a follow-up if you agree.

What's solid

  • Identity invariant correctly inverted and consistently applied in both transport chains
  • Regression test (TestHTTPSession_PerRequestIdentity_DrivesUpstreamAuthHeader) verified to fail under pre-fix behavior — true #5323 regression guard
  • No new log statements, no credential leakage risk introduced
  • No public API change; all renames on unexported types
  • Doc comments accurately describe behavior; field rename identityfallbackIdentity makes the invariant self-documenting
  • Inner round-trippers (authRoundTripper, headerForwardRoundTripper) still clone before mutating, so the new "skip-clone-when-no-mutation" fast path is safe
  • Test files correctly colocated with existing fakeBackend helpers
  • The doc comment correctly notes WHY health-check propagation is absent in the session-backed connector (forward-pointer for future maintainers)

Dropped findings (consensus < 7, listed for auditability)

  • Mutable mutated flag violates immutable-assignment style (LOW, 6) — stylistic taste
  • sess.Close() in t.Cleanup lacks a bounded path (LOW, 6) — in-process httptest, low practical risk
  • Upstream mcp-go fix not tracked anywhere (LOW, 6) — fixing Close() to reuse session ctx upstream would obviate the fallback entirely
  • Over-long "Scope note" comment on defaultClientFactory (LOW, 5) — specialists disagreed; appropriate for the bug's subtlety
  • E2E test bypasses TokenValidator.Middleware (LOW, 6) — acknowledged in PR description and tracked as #5334
  • Behavior change is a soft backwards-compat break (LOW, 5) — OOT callers relying on the buggy override
  • Several INFO-level items (test naming, magic strings, 5s timeouts, line count vs CLAUDE.md cap)

Process notes

  • PR is in draft state. Per .claude/rules/pr-creation.md ("Reviewer trust signals"), mark ready-for-review before requesting human review, or note in the description that draft is intentional pending follow-up.
  • Net production code is small (~10 lines of real logic + comments + renames). The PR's line count is dominated by tests and doc comments, well under the 400-line cap for non-test/non-doc changes.

Comment thread pkg/vmcp/session/internal/backend/mcp_session_identity_refresh_test.go Outdated
Comment thread pkg/vmcp/client/client.go
Comment thread pkg/vmcp/client/client.go
Comment thread pkg/vmcp/client/client_test.go Outdated
Comment thread pkg/vmcp/session/internal/backend/mcp_session.go
Addresses #5335 review comments:
- MEDIUM pkg/vmcp/client/client.go (3268152045): add cross-reference
  to twin in mcp_session.go and document fallbackIdentity immutability
  invariant.
- LOW pkg/vmcp/session/internal/backend/mcp_session.go (3268152063):
  add symmetric immutability note on captured fallback pointer.
- LOW pkg/vmcp/client/client_test.go (3268152058): replace duplicated
  "Design invariant" prose with a reference to the canonical location
  (identityPropagatingRoundTripper in client.go). Apply the same
  shortening to roundtripper_test.go to eliminate four-way drift risk.
Addresses #5335 review comments:
- MEDIUM pkg/vmcp/session/internal/backend/mcp_session_identity_refresh_test.go
  (3268152036): drop the Initialize-headers assertion and the 7-line
  "Maintenance note" that propped up the unparam linter. The test now
  focuses solely on the #5323 regression invariant. The unparam shield
  moves onto fakeBackend.headersFor itself with a justification that the
  helper is method-agnostic by design.
Addresses #5335 review comment:
- MEDIUM pkg/vmcp/client/client.go (3268152048): the Scope note claimed
  the fallback was "functionally only required to keep mcp-go's Close()
  DELETE authenticated", but the code injects on ANY request whose
  context lacks an identity. Update the comment to honestly describe
  the broader scope (fallback fires for any context-less request; only
  Close() exercises that path today) and call out that future code
  paths dropping identity would be silently authenticated by the
  captured snapshot unless a method/URL gate is added.
@tgrunnagle
Copy link
Copy Markdown
Contributor Author

Re: body finding #4 (IdentityFromContext returns (nil, true) for typed-nil *Identity) — filed as #5337 for follow-up. Keeping it out of this PR to stay focused on the #5323 transport-layer regression; the hazard is latent (no in-tree caller currently stores a typed-nil) and the fix is a 1-line defensive tighten at the read site, well-scoped to its own PR.

@github-actions github-actions Bot added size/M Medium PR: 300-599 lines changed and removed size/M Medium PR: 300-599 lines changed labels May 19, 2026
@tgrunnagle tgrunnagle marked this pull request as ready for review May 19, 2026 18:40
@tgrunnagle tgrunnagle merged commit e5670ea into main May 20, 2026
45 checks passed
@tgrunnagle tgrunnagle deleted the issue_5323_vmcp-stale-identity branch May 20, 2026 14:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/M Medium PR: 300-599 lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[vMCP] identityPropagatingRoundTripper captures stale identity, bypassing per-request upstream token refresh

2 participants