Skip to content

feat: Add directory mounting capabilities to vt run and registry run#11

Merged
JAORMX merged 1 commit into
mainfrom
volume-mounts
Mar 26, 2025
Merged

feat: Add directory mounting capabilities to vt run and registry run#11
JAORMX merged 1 commit into
mainfrom
volume-mounts

Conversation

@JAORMX
Copy link
Copy Markdown
Collaborator

@JAORMX JAORMX commented Mar 26, 2025

This commit adds the ability to mount directories from the host into containers
via both vt run and vt registry run commands. This feature enhances the
flexibility of MCP servers by allowing them to access files from the host system.

Volume Mount Format:

  • Basic format: -v host-path:container-path
  • Read-only mount: -v host-path:container-path:ro

The implementation supports three types of mounts:

  1. Single path: When only one path is provided, it's used as both the source and
    target (e.g., -v /path/to/dir)
  2. Different paths: When two paths are provided, the first is the host path and
    the second is the container path (e.g., -v /host/path:/container/path)
  3. Resource URIs: For future extensibility, the source can be a URI that
    identifies a resource (e.g., -v volume://name:/container/path)

Security Features:

  • Command injection detection: Paths are validated to prevent command injection
  • Null byte detection: Paths containing null bytes are rejected
  • Path normalization: All paths are normalized using filepath.Clean

Implementation Details:

  • Added MountDeclaration type to represent mount declarations
  • Updated Profile struct to use MountDeclaration for Read and Write mounts
  • Added volume flag to both vt run and registry run commands
  • Implemented processVolumeMounts function to parse and validate volume mounts
  • Updated container client to handle MountDeclaration objects

The feature has been thoroughly tested with both read-write and read-only mounts,
and all code quality checks have been passed.

Signed-off-by: Juan Antonio Osorio ozz@stacklok.com

This commit adds the ability to mount directories from the host into containers
via both `vt run` and `vt registry run` commands. This feature enhances the
flexibility of MCP servers by allowing them to access files from the host system.

Volume Mount Format:
- Basic format: `-v host-path:container-path`
- Read-only mount: `-v host-path:container-path:ro`

The implementation supports three types of mounts:
1. Single path: When only one path is provided, it's used as both the source and
   target (e.g., `-v /path/to/dir`)
2. Different paths: When two paths are provided, the first is the host path and
   the second is the container path (e.g., `-v /host/path:/container/path`)
3. Resource URIs: For future extensibility, the source can be a URI that
   identifies a resource (e.g., `-v volume://name:/container/path`)

Security Features:
- Command injection detection: Paths are validated to prevent command injection
- Null byte detection: Paths containing null bytes are rejected
- Path normalization: All paths are normalized using filepath.Clean

Implementation Details:
- Added MountDeclaration type to represent mount declarations
- Updated Profile struct to use MountDeclaration for Read and Write mounts
- Added volume flag to both vt run and registry run commands
- Implemented processVolumeMounts function to parse and validate volume mounts
- Updated container client to handle MountDeclaration objects

The feature has been thoroughly tested with both read-write and read-only mounts,
and all code quality checks have been passed.

Signed-off-by: Juan Antonio Osorio <ozz@stacklok.com>
@JAORMX JAORMX requested review from ChrisJBurns and dmjb March 26, 2025 10:40
@JAORMX JAORMX merged commit 2f8b497 into main Mar 26, 2025
@dmjb dmjb deleted the volume-mounts branch March 26, 2025 12:15
tgrunnagle added a commit that referenced this pull request Apr 29, 2026
Closes the 14 review threads from review #4199898531 that are scoped to
this PR; #3 and #5 are intentionally deferred to PR #5044 which is the
PR that consumes the ClientSecret plumbing and RedirectURI propagation.

Behaviour fixes:

- #2 synthesiseRegistrationEndpoint and resolveUpstreamRedirectURI now
  preserve the issuer's path component, so multi-tenant upstreams that
  ship DCR without advertising it (or this auth server when it itself
  has a tenant prefix in its issuer) keep their tenant prefix in the
  synthesised URL and the defaulted redirect URI.
- #7 resolveDCRCredentials wraps registration in a package-level
  singleflight.Group keyed by DCRKey so concurrent callers for the same
  upstream coalesce into a single RFC 7591 registration. The leader's
  closure rechecks the cache before any network I/O so followers that
  arrive just after a Put returns observe the fresh entry.
- #9 applyResolution writes ClientID only when the run-config copy
  leaves it empty, treating it symmetrically with the endpoint writes.
  This is defence-in-depth for callers that bypass the resolver's
  validateResolveInputs precondition check.
- #10 the cfg.RegistrationEndpoint short-circuit branch in
  resolveDCREndpoints validates URL well-formedness and HTTPS-or-
  loopback locally, so a non-HTTPS endpoint fails before
  performRegistration constructs a bearer-token transport for it.
- #13 endpointsFromMetadata validates discovered authorization_endpoint
  and token_endpoint before populating dcrEndpoints, so a self-
  consistent metadata document with http:// URLs cannot flow through
  to the auth-code or token-exchange path.

Doc / scope:

- #6 trust-assumption note added on DCRUpstreamConfig with a link to
  the follow-up SSRF-hardening issue (#5135) per the
  "Document Architectural Constraints" rule.
- #8 NewInMemoryDCRCredentialStore doc now explicitly calls out what
  the in-memory store does and does NOT solve (cross-replica sharing,
  durability, cross-process write coordination).

Test hygiene:

- #4 deleted the literal-equals-literal dcrStaleAgeThreshold test that
  added no meaningful coverage.
- #11 atomic.AddInt32 / LoadInt32 for wellKnownHits / customHits in
  TestFetchAuthorizationServerMetadataFromURL, matching the sibling
  pattern in dcr_test.go.
- #12 new failingDCRStore test double + two cases asserting the
  cache.Get / cache.Put wrap messages ("dcr: cache lookup", "dcr: cache
  put") that operators see when a future Redis backend errors.
- #14 renamed the TestNeedsDCR case from
  "client_id with dcr (invalid but returns false)" to
  "client_id wins over dcr_config (defensive AND semantic)" so the
  case label describes the intentional defensive behaviour rather
  than reading like a bug report.
- #15 removed a //nolint:lll directive that was suppressing a non-
  existent violation (the line is 108 chars; the limit is 130).
- #16 new TestInMemoryDCRCredentialStore_ConcurrentAccess fans out
  16 goroutines doing alternating Put/Get against overlapping and
  disjoint keys, with a fail-fast deadline, so go test -race actually
  exercises the sync.RWMutex guard.

New tests cover each behaviour change:

- TestResolveDCRCredentials_SingleflightCoalescesConcurrentCallers
  pins exactly-one-registration under N concurrent callers.
- TestSynthesiseRegistrationEndpoint_PreservesIssuerPath /
  TestResolveUpstreamRedirectURI_PreservesIssuerPath cover the path-
  preservation fix.
- TestApplyResolution_DoesNotOverwritePreProvisionedClientID covers
  the symmetric ClientID write.
- TestResolveDCREndpoints_DirectRegistrationEndpointValidated covers
  the local registration-endpoint guard.
- TestEndpointsFromMetadata_RejectsInsecureDiscoveredEndpoints covers
  the metadata authz/token endpoint guard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tgrunnagle added a commit that referenced this pull request Apr 29, 2026
Closes the 14 review threads from review #4199898531 that are scoped to
this PR; #3 and #5 are intentionally deferred to PR #5044 which is the
PR that consumes the ClientSecret plumbing and RedirectURI propagation.

Behaviour fixes:

- #2 synthesiseRegistrationEndpoint and resolveUpstreamRedirectURI now
  preserve the issuer's path component, so multi-tenant upstreams that
  ship DCR without advertising it (or this auth server when it itself
  has a tenant prefix in its issuer) keep their tenant prefix in the
  synthesised URL and the defaulted redirect URI.
- #7 resolveDCRCredentials wraps registration in a package-level
  singleflight.Group keyed by DCRKey so concurrent callers for the same
  upstream coalesce into a single RFC 7591 registration. The leader's
  closure rechecks the cache before any network I/O so followers that
  arrive just after a Put returns observe the fresh entry.
- #9 applyResolution writes ClientID only when the run-config copy
  leaves it empty, treating it symmetrically with the endpoint writes.
  This is defence-in-depth for callers that bypass the resolver's
  validateResolveInputs precondition check.
- #10 the cfg.RegistrationEndpoint short-circuit branch in
  resolveDCREndpoints validates URL well-formedness and HTTPS-or-
  loopback locally, so a non-HTTPS endpoint fails before
  performRegistration constructs a bearer-token transport for it.
- #13 endpointsFromMetadata validates discovered authorization_endpoint
  and token_endpoint before populating dcrEndpoints, so a self-
  consistent metadata document with http:// URLs cannot flow through
  to the auth-code or token-exchange path.

Doc / scope:

- #6 trust-assumption note added on DCRUpstreamConfig with a link to
  the follow-up SSRF-hardening issue (#5135) per the
  "Document Architectural Constraints" rule.
- #8 NewInMemoryDCRCredentialStore doc now explicitly calls out what
  the in-memory store does and does NOT solve (cross-replica sharing,
  durability, cross-process write coordination).

Test hygiene:

- #4 deleted the literal-equals-literal dcrStaleAgeThreshold test that
  added no meaningful coverage.
- #11 atomic.AddInt32 / LoadInt32 for wellKnownHits / customHits in
  TestFetchAuthorizationServerMetadataFromURL, matching the sibling
  pattern in dcr_test.go.
- #12 new failingDCRStore test double + two cases asserting the
  cache.Get / cache.Put wrap messages ("dcr: cache lookup", "dcr: cache
  put") that operators see when a future Redis backend errors.
- #14 renamed the TestNeedsDCR case from
  "client_id with dcr (invalid but returns false)" to
  "client_id wins over dcr_config (defensive AND semantic)" so the
  case label describes the intentional defensive behaviour rather
  than reading like a bug report.
- #15 removed a //nolint:lll directive that was suppressing a non-
  existent violation (the line is 108 chars; the limit is 130).
- #16 new TestInMemoryDCRCredentialStore_ConcurrentAccess fans out
  16 goroutines doing alternating Put/Get against overlapping and
  disjoint keys, with a fail-fast deadline, so go test -race actually
  exercises the sync.RWMutex guard.

New tests cover each behaviour change:

- TestResolveDCRCredentials_SingleflightCoalescesConcurrentCallers
  pins exactly-one-registration under N concurrent callers.
- TestSynthesiseRegistrationEndpoint_PreservesIssuerPath /
  TestResolveUpstreamRedirectURI_PreservesIssuerPath cover the path-
  preservation fix.
- TestApplyResolution_DoesNotOverwritePreProvisionedClientID covers
  the symmetric ClientID write.
- TestResolveDCREndpoints_DirectRegistrationEndpointValidated covers
  the local registration-endpoint guard.
- TestEndpointsFromMetadata_RejectsInsecureDiscoveredEndpoints covers
  the metadata authz/token endpoint guard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tgrunnagle added a commit that referenced this pull request May 1, 2026
* Add authserver DCR credential store and resolver

Implements Phase 2 steps 2f/2c of the DCR story (#5038):

- DCRCredentialStore interface with in-memory implementation keyed by
  (Issuer, RedirectURI, ScopesHash). No TTL — RFC 7591 registrations
  are long-lived and only purged by explicit deregistration. The
  ScopesHash tuple is designed to compose a Redis key segment in
  Phase 3 without redefining the canonical form.
- resolveDCRCredentials performs cache lookup, authorization server
  metadata discovery (or direct registration endpoint), explicit
  endpoint override, auth-method intersection against the preference
  order private_key_jwt > client_secret_basic > client_secret_post >
  none, synthesis of {origin}/register when metadata omits
  registration_endpoint, and a single call to
  oauthproto.RegisterClientDynamically with the initial access token
  injected as Authorization: Bearer via a wrapping RoundTripper.
- Resolution stored before returning, per the write-durable-first
  rule in .claude/rules/go-style.md.

Address code review feedback on DCR resolver

Fixed issues from code review:

- HIGH: DCRUpstreamConfig.DiscoveryURL now honoured. Added
  oauthproto.FetchAuthorizationServerMetadataFromURL which fetches
  the operator-configured URL exactly (no well-known-path fallback)
  and enforces RFC 8414 §3.3 issuer equality. Updated field doc
  comments to match behaviour.
- HIGH: TestResolveDCRCredentials_CacheHitShortCircuits rewritten
  to wire the count-everything server as the issuer and assert
  totalRequests == 0, so a regression that moved discovery before
  the cache lookup would actually fail the test.
- MEDIUM: Added tests for discovered-scopes fallback and empty-scope
  omitted-field branches.
- MEDIUM: Exposed oauthproto.NewDefaultDCRClient so newDCRHTTPClient
  inherits the canonical DCR timeout policy rather than duplicating
  it — future timeout changes propagate automatically.
- MEDIUM: Replaced hand-rolled readAll / readAllResp helpers with
  io.ReadAll.
- MEDIUM: Documented that direct RegistrationEndpoint disables
  server-capability negotiation.
- MEDIUM: Documented that applyResolution always overwrites ClientID
  and explained the precondition enforcement.
- MEDIUM: scopesHash now deduplicates (slices.Compact) so scope sets
  that differ only by duplicated values share a canonical digest.
  Test subtests updated accordingly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Address PR #5042 review feedback

Refuse HTTP redirects on DCR registration so an upstream cannot 30x us
into re-issuing the registration request — and the attached RFC 7591
initial access token — against a foreign origin. RFC 7591 §3 does not
require redirect support, so refusing them is the conservative fix.

- newDCRHTTPClient now sets http.Client.CheckRedirect to refuse all
  redirects (errDCRRedirectRefused), regardless of whether an initial
  access token is configured. The bearerTokenTransport itself is
  unchanged; the redirect block is the actual mitigation, not a header
  scrub inside RoundTrip.
- New TestResolveDCRCredentials_DoesNotForwardBearerOnRedirect: stands
  up an upstream whose /register 302s to a separate "foreign" httptest
  server, and asserts (a) the resolver errors with errDCRRedirectRefused
  and (b) the foreign server receives zero requests — proving the
  bearer never crossed origins.

Also includes a stray blank-line whitespace fix to discovery_test.go
left from the earlier rebase conflict resolution.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Address remaining PR #5042 review feedback

Closes the 14 review threads from review #4199898531 that are scoped to
this PR; #3 and #5 are intentionally deferred to PR #5044 which is the
PR that consumes the ClientSecret plumbing and RedirectURI propagation.

Behaviour fixes:

- #2 synthesiseRegistrationEndpoint and resolveUpstreamRedirectURI now
  preserve the issuer's path component, so multi-tenant upstreams that
  ship DCR without advertising it (or this auth server when it itself
  has a tenant prefix in its issuer) keep their tenant prefix in the
  synthesised URL and the defaulted redirect URI.
- #7 resolveDCRCredentials wraps registration in a package-level
  singleflight.Group keyed by DCRKey so concurrent callers for the same
  upstream coalesce into a single RFC 7591 registration. The leader's
  closure rechecks the cache before any network I/O so followers that
  arrive just after a Put returns observe the fresh entry.
- #9 applyResolution writes ClientID only when the run-config copy
  leaves it empty, treating it symmetrically with the endpoint writes.
  This is defence-in-depth for callers that bypass the resolver's
  validateResolveInputs precondition check.
- #10 the cfg.RegistrationEndpoint short-circuit branch in
  resolveDCREndpoints validates URL well-formedness and HTTPS-or-
  loopback locally, so a non-HTTPS endpoint fails before
  performRegistration constructs a bearer-token transport for it.
- #13 endpointsFromMetadata validates discovered authorization_endpoint
  and token_endpoint before populating dcrEndpoints, so a self-
  consistent metadata document with http:// URLs cannot flow through
  to the auth-code or token-exchange path.

Doc / scope:

- #6 trust-assumption note added on DCRUpstreamConfig with a link to
  the follow-up SSRF-hardening issue (#5135) per the
  "Document Architectural Constraints" rule.
- #8 NewInMemoryDCRCredentialStore doc now explicitly calls out what
  the in-memory store does and does NOT solve (cross-replica sharing,
  durability, cross-process write coordination).

Test hygiene:

- #4 deleted the literal-equals-literal dcrStaleAgeThreshold test that
  added no meaningful coverage.
- #11 atomic.AddInt32 / LoadInt32 for wellKnownHits / customHits in
  TestFetchAuthorizationServerMetadataFromURL, matching the sibling
  pattern in dcr_test.go.
- #12 new failingDCRStore test double + two cases asserting the
  cache.Get / cache.Put wrap messages ("dcr: cache lookup", "dcr: cache
  put") that operators see when a future Redis backend errors.
- #14 renamed the TestNeedsDCR case from
  "client_id with dcr (invalid but returns false)" to
  "client_id wins over dcr_config (defensive AND semantic)" so the
  case label describes the intentional defensive behaviour rather
  than reading like a bug report.
- #15 removed a //nolint:lll directive that was suppressing a non-
  existent violation (the line is 108 chars; the limit is 130).
- #16 new TestInMemoryDCRCredentialStore_ConcurrentAccess fans out
  16 goroutines doing alternating Put/Get against overlapping and
  disjoint keys, with a fail-fast deadline, so go test -race actually
  exercises the sync.RWMutex guard.

New tests cover each behaviour change:

- TestResolveDCRCredentials_SingleflightCoalescesConcurrentCallers
  pins exactly-one-registration under N concurrent callers.
- TestSynthesiseRegistrationEndpoint_PreservesIssuerPath /
  TestResolveUpstreamRedirectURI_PreservesIssuerPath cover the path-
  preservation fix.
- TestApplyResolution_DoesNotOverwritePreProvisionedClientID covers
  the symmetric ClientID write.
- TestResolveDCREndpoints_DirectRegistrationEndpointValidated covers
  the local registration-endpoint guard.
- TestEndpointsFromMetadata_RejectsInsecureDiscoveredEndpoints covers
  the metadata authz/token endpoint guard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Gate DCR "none" auth method on S256 PKCE advertisement

RFC 7636 §4.2 / OAuth 2.1 require S256 PKCE for public clients. The
resolver previously accepted the "none" token-endpoint auth method
whenever an upstream advertised it, with no check against
code_challenge_methods_supported. Registering as "none" against an
upstream that only advertises "plain" — or omits the field entirely —
would either break at runtime (our auth-code path always sends
code_challenge_method=S256) or silently downgrade to a non-compliant
plain-PKCE flow.

- dcrEndpoints now carries codeChallengeMethodsSupported, populated
  from metadata.CodeChallengeMethodsSupported in endpointsFromMetadata.
- selectTokenEndpointAuthMethod takes the slice as a second argument
  and skips "none" when S256 is not advertised, falling through to
  the next less-preferred method. When "none" is the only mutually
  supported method and S256 is missing, the function returns an
  explicit error citing the spec so operators see a clear failure
  at boot rather than a silent runtime downgrade.
- TestResolveDCRCredentials_AuthMethodPreference's "falls back to
  none" case now also advertises S256 to reflect the compliant flow.
- New TestResolveDCRCredentials_RefusesNoneWithoutS256 pins the
  rejection in both shapes the upstream can take it (field omitted,
  field lists only plain).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Capture RFC 7591 client_secret_expires_at and refetch on expiry

oauthproto decodes RFC 7591 §3.2.1 client_id_issued_at and
client_secret_expires_at, but buildResolution dropped both. The
authoritative consequence was for client_secret_expires_at: a non-zero
upstream-asserted expiry was invisible to lookupCachedResolution, so
the cache served an expired secret indefinitely, every token-endpoint
call after the upstream's expiry would 401, and the resolver had no
signal to refetch.

The 90-day dcrStaleAgeThreshold added for Step 2g observability is a
separate, heuristic signal ("consider rotating") and does not address
this — the upstream-asserted expiry is authoritative.

- DCRResolution now carries ClientIDIssuedAt and ClientSecretExpiresAt
  as time.Time. The wire convention "0 means absent / does not expire"
  maps to the zero time.Time so callers can use IsZero() uniformly,
  via a small epochSecondsToTime helper.
- buildResolution populates both fields by converting the int64 epoch
  values from the registration response.
- lookupCachedResolution treats an entry with a non-zero
  ClientSecretExpiresAt that has passed as a cache miss, so the
  singleflight body re-runs registerAndCache and overwrites the stale
  entry via the existing cache.Put. A Debug log attributes the
  refetch to client_secret_expires_at so operators can correlate the
  re-registration with the upstream's expiry rather than guessing.

Tests:

- TestBuildResolution_PopulatesRFC7591ExpiryFields covers all three
  shapes (both populated, expires_at omitted, both omitted) and that
  zero on the wire becomes the zero time.Time.
- TestResolveDCRCredentials_RefetchesOnExpiredCachedSecret pins the
  cache-miss-on-expiry behaviour: an upstream that returns an already-
  expired secret causes back-to-back resolver calls to register twice.
- TestResolveDCRCredentials_HonoursFutureExpiryAndZero pins the cache-
  hit path: future expiry and the zero-means-does-not-expire wire
  convention both serve from the cache without re-registering.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Recover panics inside the DCR singleflight closure

singleflight.Group re-panics the leader's panic in every follower, so
without recovery a panic in registerAndCache (or anything it calls —
the registration HTTP exchange, the cache.Put, etc.) would crash the
leader goroutine and every concurrent follower for the same DCRKey
with the same value. Net effect: a single bad upstream response could
take out N goroutines instead of one error path.

- The dcrFlight.Do closure now installs a defer/recover that converts
  a panic into a normal error. The recovered value is logged at
  slog.Error with a full debug.Stack() so the panic is not silently
  swallowed; the returned error wraps the recovered value so callers
  can react to it as a normal failure mode.
- Doc comment on the singleflight call expanded to explain why the
  recover is necessary (the fan-out semantics of singleflight).

Tests:

- panickingPutDCRStore: a new test double whose Put panics with a
  configured value while Get returns a normal cache miss, so callers
  reach the singleflight closure and trigger the panic via cache.Put
  inside registerAndCache.
- TestResolveDCRCredentials_RecoversPanicInsideSingleflight: spawns
  N=6 concurrent callers for the same DCRKey, each with its own
  defer/recover so an un-recovered panic is captured as a test
  failure rather than crashing the suite. Asserts every caller
  received an error containing the panic-marker string and the
  recovered value, and that none of them observed an actual panic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Tighten singleflight coalescing test to fail loudly on timing slip

The existing assertion "registrationCalls == 1" is necessary but not
sufficient: a goroutine that reached resolveDCRCredentials after the
leader's cache.Put returned would short-circuit through
lookupCachedResolution, take the cache hit, and still leave the count
at 1. Under CI load the 50ms gate window was easy to miss, which would
silently weaken what the test claims to cover.

- Wrap the in-memory store in a countingStore decorator that records
  every Get that returned a hit.
- Assert cache.hits.Load() == 0 after wg.Wait — any non-zero value
  means a late-arriver took the lookupCachedResolution short-circuit
  instead of joining the singleflight, and the coalescing path was
  not actually exercised.
- Bump the gate window from 50ms to 250ms for headroom under CI load.
  The countingStore assertion turns any remaining timing slip into a
  loud failure rather than a silent regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Consolidate validateResolveInputs tests into one table

The previous TestResolveDCRCredentials_RequiresClientIDEmpty and
TestResolveDCRCredentials_RequiresDCRConfig each exercised one branch
of validateResolveInputs, leaving three uncovered: nil run-config,
empty issuer, and nil credential store. Per
.claude/rules/testing.md "Consolidation", folding into one table-
driven test covers every branch in one place and shrinks duplicated
setup.

- Replace the two single-branch tests with
  TestResolveDCRCredentials_RejectsInvalidInputs, which iterates over
  all five branches and asserts the wrapped error message contains
  the expected substring drawn from validateResolveInputs's literals.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Note in-memory DCR store is for tests / single-replica dev

Reviewer asked us to confirm the framing and document it on the
constructor. The unbounded-cache concern in .claude/rules/go-style.md
"Resource Leaks" does not bite for this implementation because the
key space is bounded by the operator-configured upstream count and
because Phase 3 Redis is the production answer; the doc now states
that explicitly so a future reader does not assume this is a
production-grade store.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Reject nil resolution in DCRCredentialStore.Put

A nil resolution previously fell through as a silent no-op: the caller
got a successful return, the next Get returned a miss, and there was
no error trail to debug from. Per the constructor-validation rule in
.claude/rules/go-style.md, fail loudly at the boundary.

- inMemoryDCRCredentialStore.Put returns an explicit error when the
  resolution is nil, before taking the lock.
- DCRCredentialStore.Put interface doc records the contract so future
  backends (e.g. the Phase 3 Redis store) behave consistently.
- New TestInMemoryDCRCredentialStore_Put_RejectsNilResolution pins the
  rejection and asserts the rejected Put did not leave any partial
  entry behind.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Disambiguate local vs upstream issuer in DCR resolver

The resolveDCRCredentials doc said "issuer is the upstream authorization
server's issuer identifier used both for discovery and for keying the
cache." That was wrong on both counts: the upstream's issuer is
recovered separately from rc.DCRConfig.DiscoveryURL via
deriveExpectedIssuerFromDiscoveryURL, and the parameter actually names
*this* auth server. An inline comment in registerAndCache already
contradicted the public doc, and a future caller reading only the
public doc would have produced (a) a wrong-origin default redirect URI
under the upstream IdP and (b) a cache key that does not identify the
auth-server context.

Renaming the parameter and documenting the contract closes the trap.

- resolveDCRCredentials: parameter renamed issuer → localIssuer; doc
  rewritten to spell out that it is *this* auth server's issuer, used
  to key the cache and default the redirect URI, and that the
  upstream's issuer is recovered separately. Same rename applied to
  validateResolveInputs, registerAndCache, lookupCachedResolution,
  chooseRegistrationScopes, and resolveUpstreamRedirectURI for
  internal consistency.
- endpointsFromMetadata, synthesiseRegistrationEndpoint: parameter
  renamed issuer → upstreamIssuer (it is in fact the upstream's,
  recovered from the DiscoveryURL) so the local/upstream split is
  symmetric.
- DCRKey.Issuer: doc clarifies the field carries *this* auth server's
  local issuer; two different local issuers registering against the
  same upstream are distinct OAuth clients and must not share cache
  entries.
- Structured-log keys "issuer" → "local_issuer" everywhere the value
  is the local issuer, so operators can tell the two apart in logs
  too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Require explicit endpoints when DCR bypasses discovery

When dcr_config.registration_endpoint is set, the resolver bypasses
RFC 8414 / OIDC Discovery and therefore cannot populate
authorization_endpoint or token_endpoint from server metadata. The
operator-facing prose on DCRUpstreamConfig already says the run-config
must supply both, but neither validator enforced it. The downstream
effect was silent: dcrEndpoints came back with empty authorize/token
URLs, applyExplicitEndpointOverrides only filled them from rc, and
applyResolution only wrote when rc fields were empty — so a
misconfigured run-config completed registration successfully and only
broke at the first authorize or token-exchange call.

- OAuth2UpstreamRunConfig.Validate now rejects a config that sets
  DCRConfig.RegistrationEndpoint without both AuthorizationEndpoint
  and TokenEndpoint, with an error message that names the field and
  explains the discovery-bypass rationale.
- The discovery-flow path (DCRConfig.DiscoveryURL) is unaffected:
  metadata populates the endpoints there, so leaving them empty in
  the run-config remains valid.
- Existing test case "DCRConfig with only registration_endpoint is
  valid" updated to also set authorize/token endpoints (the path it
  was claiming to cover); two new cases pin the rejection when each
  endpoint is missing; one new case pins that DiscoveryURL is still
  valid without explicit endpoints.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants