🤖 feat: surface entitlements and gate provisioner reconciliation #67

Merged
ThomasK33 merged 1 commit into main from provisioner-jrq9 on Feb 12, 2026

Conversation

@ThomasK33
Member

Summary

Surface license tier and entitlements on CoderControlPlane.status, and gate
CoderProvisioner reconciliation on the external_provisioner_daemons
entitlement so unlicensed setups short-circuit cleanly.

Background

External provisioner daemons are license-gated in Coder. Before this change,
CoderProvisioner reconciliation could attempt key/deployment work even when
the deployment was not entitled, causing noisy failures and wasted reconcile
work.

Implementation

  • Added control-plane status fields for:
    • licenseTier
    • entitlementsLastChecked
    • externalProvisionerDaemonsEntitlement
  • Added control-plane entitlement reconciliation via /api/v2/entitlements
    using operator credentials, with best-effort tier derivation.
  • Added CoderProvisionerConditionExternalProvisionersEntitled and enforced an
    entitlement gate in provisioner reconciliation:
    • fast-path from control-plane status
    • fallback API query via bootstrap client when status is unavailable
  • Extended bootstrap client interface/SDK implementation with
    Entitlements(ctx, coderURL, sessionToken).
  • Added a CoderControlPlane watch + spec.controlPlaneRef.name index in the
    provisioner controller for quicker entitlement re-evaluation after control
    plane status changes.
  • Updated API docs and regenerated code/manifests for status schema updates.

Validation

  • make codegen
  • make manifests
  • make docs-reference
  • make verify-vendor
  • make test
  • make build
  • make lint
  • make docs-build

Risks

  • Moderate: reconciliation flow changed in both control-plane and provisioner
    controllers.
  • Main risk is false-negative entitlement gating if entitlement APIs are
    unavailable; mitigated by explicit condition reasons, retries, and fallback
    query behavior.

📋 Implementation Plan

Plan: Surface license tier/entitlements on CoderControlPlane and gate CoderProvisioner

Context / Why

External provisioner daemons are a paid Coder feature. Today, CoderProvisionerReconciler will attempt to create provisioner keys and deploy provisionerd pods even when the target Coder deployment is not entitled, leading to confusing failures and wasted reconciliation work.

We recently merged license reconciliation into the CoderControlPlane controller (it already tracks when a license was last applied). We can extend this to also surface license tier/type and the external provisioner entitlement on CoderControlPlane.status, then use that as a fast path for provisioner reconciliation.

Goals:

  • CoderControlPlane: publish status.licenseTier (best-effort) and status.externalProvisionerDaemonsEntitlement by querying /api/v2/entitlements with the operator token.
  • CoderProvisioner: short-circuit early when the control plane reports external_provisioner_daemons is not entitled, and re-evaluate automatically once it becomes entitled.
  • Keep a fallback path where the provisioner controller queries entitlements directly if the control plane hasn’t published them yet (or operator access is disabled).

Non-goals (initially): deleting already-created provisioner Deployments/keys when a license is removed; we can decide policy later.

Evidence (repo + upstream SDK)

  • CoderControlPlane already has license reconciliation support and records:
    • status.licenseLastApplied and status.licenseLastAppliedHash (api/v1alpha1/codercontrolplane_types.go).
    • reconcileLicense(...) uploads license via codersdk.AddLicense and sets LicenseApplied condition (internal/controller/codercontrolplane_controller.go).
  • CoderProvisionerReconciler currently:
    • Fetches CoderControlPlane, reads bootstrap session token, then calls BootstrapClient.EnsureProvisionerKey(...) and creates Secret/RBAC/Deployment (internal/controller/coderprovisioner_controller.go).
    • Does not check entitlements today.
  • Vendored codersdk exposes entitlements:
    • Client.Entitlements(ctx) (codersdk.Entitlements, error).
    • Feature constant codersdk.FeatureExternalProvisionerDaemons = "external_provisioner_daemons".
    • Entitlement.Entitled() is true for entitled and grace_period.
  • Coder docs confirm Premium-only features that we can use as tier signals:
    • Custom Roles are Premium (docs/admin/users/groups-roles.md → maps to codersdk.FeatureCustomRoles).
    • Multiple Organizations are Premium (docs/admin/users/organizations.md → maps to codersdk.FeatureMultipleOrganizations).

Implementation details

1) Surface license tier/type + provisioner entitlement on CoderControlPlane.status (fast path)

Files:

  • api/v1alpha1/codercontrolplane_types.go
  • internal/controller/codercontrolplane_controller.go
  • internal/app/controllerapp/controllerapp.go
  • internal/controller/codercontrolplane_controller_test.go

API additions (status fields)

Add new, user-visible status fields so other controllers can make fast decisions without extra Coder API calls:

type CoderControlPlaneStatus struct {
    ...

    // LicenseTier is a best-effort classification of the currently-applied license.
    // Values: none, trial, enterprise, premium, unknown.
    // +optional
    LicenseTier string `json:"licenseTier,omitempty"`

    // EntitlementsLastChecked is when the operator last queried /api/v2/entitlements.
    // +optional
    EntitlementsLastChecked *metav1.Time `json:"entitlementsLastChecked,omitempty"`

    // ExternalProvisionerDaemonsEntitlement is the entitlement value for
    // feature "external_provisioner_daemons".
    // Values: entitled, grace_period, not_entitled, unknown.
    // +optional
    ExternalProvisionerDaemonsEntitlement string `json:"externalProvisionerDaemonsEntitlement,omitempty"`
}

Controller logic

  • Introduce an EntitlementsInspector (or similar) interface on CoderControlPlaneReconciler with an SDK-backed implementation using codersdk.Client.Entitlements.
  • Wire the SDK-backed implementation in controllerapp.SetupControllers so production always reports these fields.
  • Add a reconcileEntitlements(...) step (called after reconcileLicense(...) so the values reflect the most recently applied license) that:
    • Runs even when spec.licenseSecretRef is nil (so status reflects licenses applied out-of-band).
    • Requires: nextStatus.Phase == Ready, nextStatus.OperatorAccessReady, nextStatus.OperatorTokenSecretRef != nil, and nextStatus.URL != "".
    • Reads the operator token Secret and calls /api/v2/entitlements.
    • Stamps EntitlementsLastChecked=now, ExternalProvisionerDaemonsEntitlement, and LicenseTier.

Tier derivation should be explicit and documented. A practical best-effort heuristic is:

func licenseTierFromEntitlements(ent codersdk.Entitlements) string {
    if !ent.HasLicense {
        return "none"
    }
    if ent.Trial {
        return "trial"
    }

    // Premium-only signals per docs: custom roles + multiple organizations.
    for _, f := range []codersdk.FeatureName{codersdk.FeatureCustomRoles, codersdk.FeatureMultipleOrganizations} {
        if feat, ok := ent.Features[f]; ok && feat.Entitlement.Entitled() {
            return "premium"
        }
    }

    return "enterprise"
}

Error handling guidance:

  • Treat transient API failures as Unknown (don’t hard-block forever); keep last-known values when available.
  • Requeue after operatorAccessRetryInterval when entitlements cannot be queried, so the status eventually updates when a license is installed later.
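One way to read the guidance above: when a query fails, keep whatever was last published and fall back to unknown only for fields that were never recorded. A minimal sketch, assuming a hypothetical `entitlementStatus` struct in place of the real `CoderControlPlaneStatus` and a hypothetical `applyEntitlementResult` helper (neither name comes from this PR):

```go
package main

import (
	"errors"
	"fmt"
)

// entitlementStatus is a hypothetical stand-in for the entitlement-related
// fields on CoderControlPlaneStatus.
type entitlementStatus struct {
	LicenseTier                           string
	ExternalProvisionerDaemonsEntitlement string
}

// errTransient simulates a temporary API failure for the demo below.
var errTransient = errors.New("entitlements query failed")

// applyEntitlementResult returns the status to publish after one query
// attempt. On success it takes the fresh values. On error it keeps the
// last-known values and fills only empty fields with "unknown".
func applyEntitlementResult(prev, fresh entitlementStatus, queryErr error) entitlementStatus {
	if queryErr == nil {
		return fresh
	}
	next := prev
	if next.LicenseTier == "" {
		next.LicenseTier = "unknown"
	}
	if next.ExternalProvisionerDaemonsEntitlement == "" {
		next.ExternalProvisionerDaemonsEntitlement = "unknown"
	}
	return next
}

func main() {
	prev := entitlementStatus{LicenseTier: "premium", ExternalProvisionerDaemonsEntitlement: "entitled"}
	// A failed query leaves the previously published values untouched.
	fmt.Printf("%+v\n", applyEntitlementResult(prev, entitlementStatus{}, errTransient))
	// With no history, a failed query publishes "unknown".
	fmt.Printf("%+v\n", applyEntitlementResult(entitlementStatus{}, entitlementStatus{}, errTransient))
}
```

The caller would still requeue after the retry interval in the error case; this helper only decides what to write to status.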

Note: adding status fields requires regenerating deepcopy + CRDs (see Validation section).

2) Add a provisioner status condition for entitlement

File: api/v1alpha1/coderprovisioner_types.go

Add a new condition constant (name bikesheddable; keep it explicit):

const (
    ...
    // CoderProvisionerConditionExternalProvisionersEntitled indicates whether the
    // referenced Coder deployment is entitled to run external provisioner daemons.
    CoderProvisionerConditionExternalProvisionersEntitled = "ExternalProvisionersEntitled"
)

Rationale: keeps licensing failures distinct from bootstrap secret / key failures.

3) Add an entitlements fetch helper to internal/coderbootstrap

Files:

  • internal/coderbootstrap/client.go (interface + SDK implementation)
  • internal/controller/*_test.go fakes implementing the interface

Extend coderbootstrap.Client with an entitlements method:

type Client interface {
    EnsureWorkspaceProxy(context.Context, RegisterWorkspaceProxyRequest) (RegisterWorkspaceProxyResponse, error)
    EnsureProvisionerKey(context.Context, EnsureProvisionerKeyRequest) (EnsureProvisionerKeyResponse, error)
    DeleteProvisionerKey(ctx context.Context, coderURL, sessionToken, orgName, keyName string) error

    // Entitlements returns the deployment entitlements from /api/v2/entitlements.
    Entitlements(ctx context.Context, coderURL, sessionToken string) (codersdk.Entitlements, error)
}

Implement on SDKClient using the existing authenticated-client helper (reusing timeout + optional rate-limit bypass):

func (c *SDKClient) Entitlements(ctx context.Context, coderURL, sessionToken string) (codersdk.Entitlements, error) {
    if coderURL == "" {
        return codersdk.Entitlements{}, xerrors.New("coder URL is required")
    }
    if sessionToken == "" {
        return codersdk.Entitlements{}, xerrors.New("session token is required")
    }

    client, err := newAuthenticatedClient(coderURL, sessionToken)
    if err != nil {
        return codersdk.Entitlements{}, err
    }

    ent, err := withOptionalRateLimitBypass(ctx, func(requestCtx context.Context) (codersdk.Entitlements, error) {
        return client.Entitlements(requestCtx)
    })
    if err != nil {
        return codersdk.Entitlements{}, xerrors.Errorf("get entitlements: %w", err)
    }

    if ent.Features == nil {
        // Fail-fast: the API contract should always include a features map.
        return codersdk.Entitlements{}, xerrors.New("assertion failed: entitlements.features is nil")
    }
    return ent, nil
}

Why do this instead of direct codersdk.New(...) in the reconciler?

  • Keeps all Coder-API concerns (timeouts, bypass ratelimit header) in one place.
  • Makes unit tests easy by stubbing BootstrapClient.Entitlements.

4) Short-circuit in CoderProvisionerReconciler.Reconcile

File: internal/controller/coderprovisioner_controller.go

Add a new entitlement gate immediately after we have:

  • controlPlane.Status.URL (already required), and
  • sessionToken (bootstrap credentials).

Fast path:

  • If controlPlane.status.entitlementsLastChecked is set and controlPlane.status.externalProvisionerDaemonsEntitlement is present (and optionally not stale), use it to decide without calling the Coder API.

Fallback:

  • Otherwise call r.BootstrapClient.Entitlements(...) and evaluate external_provisioner_daemons directly.

Pseudo-flow:

// 1) Fast path: trust ControlPlane status if present.
needsAPICheck := true
if controlPlane.Status.EntitlementsLastChecked != nil && controlPlane.Status.ExternalProvisionerDaemonsEntitlement != "" {
	switch controlPlane.Status.ExternalProvisionerDaemonsEntitlement {
	case "entitled", "grace_period":
		setCondition(provisioner, coderv1alpha1.CoderProvisionerConditionExternalProvisionersEntitled,
			metav1.ConditionTrue, "Entitled", "Coder deployment is entitled to external provisioner daemons")
		// Proceed without calling the Coder API.
		needsAPICheck = false
	case "not_entitled":
		setCondition(provisioner, coderv1alpha1.CoderProvisionerConditionExternalProvisionersEntitled,
			metav1.ConditionFalse, "NotEntitled", "Coder deployment is not entitled to external provisioner daemons")
		if err := r.Status().Update(ctx, provisioner); err != nil {
			return ctrl.Result{}, err
		}
		return ctrl.Result{RequeueAfter: 2 * time.Minute}, nil
	default:
		// Unknown → fall through to API check.
	}
}

// 2) Fallback: query coderd only when the fast path could not decide.
if needsAPICheck {
	ent, err := r.BootstrapClient.Entitlements(ctx, controlPlane.Status.URL, sessionToken)
	if err != nil {
		// Set ExternalProvisionersEntitled=False, Reason=EntitlementsQueryFailed and retry.
	}

	feature, ok := ent.Features[codersdk.FeatureExternalProvisionerDaemons]
	if !ok || !feature.Entitlement.Entitled() {
		// Not entitled → short-circuit with RequeueAfter.
	}

	setCondition(... True, "Entitled", "Coder deployment is entitled to external provisioner daemons")
}

Recommended condition reasons/messages:

  • NotEntitled: “Coder deployment is not entitled to external provisioner daemons; install a Premium/Enterprise license to enable external provisioners.”
  • EntitlementsQueryFailed: “Failed to query Coder entitlements; retrying.”
  • NotSupported (HTTP 404): “Coder deployment does not expose /api/v2/entitlements; cannot verify license.” (treat as blocked, requeue)
  • Forbidden (HTTP 401/403): “Bootstrap token is not authorized to read entitlements; retrying.”
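The reasons above key off the HTTP status of the failed entitlements request. A minimal sketch of that mapping, assuming a hypothetical `reasonForStatus` helper that receives the status code already extracted from the SDK error:

```go
package main

import (
	"fmt"
	"net/http"
)

// reasonForStatus maps the HTTP status code of a failed entitlements request
// to one of the condition reasons listed above. It is only meant for error
// responses; any unrecognized status falls back to EntitlementsQueryFailed.
func reasonForStatus(statusCode int) string {
	switch statusCode {
	case http.StatusNotFound:
		return "NotSupported"
	case http.StatusUnauthorized, http.StatusForbidden:
		return "Forbidden"
	default:
		return "EntitlementsQueryFailed"
	}
}

func main() {
	for _, code := range []int{404, 401, 403, 500} {
		fmt.Printf("%d -> %s\n", code, reasonForStatus(code))
	}
}
```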

Re-evaluation requirement:

  • Use RequeueAfter when not entitled so the controller will automatically re-check.
  • (Optional) Add jitter to the requeue interval to avoid synchronized thundering-herd.

Placement note:

  • Keep the check out of the deletion path so finalizer cleanup continues to be best-effort even in unlicensed states.

5) (Optional but recommended) Watch CoderControlPlane changes to reconcile faster

File: internal/controller/coderprovisioner_controller.go (SetupWithManager)

Add a watch on CoderControlPlane that enqueues CoderProvisioner objects in the same namespace whose spec.controlPlaneRef.name matches the updated control plane.

This reduces time-to-recover after a license install from “up to RequeueAfter” to “immediate”.

Implementation sketch:

  • Add an index field on CoderProvisioner.spec.controlPlaneRef.name.
  • In SetupWithManager, add Watches(&coderv1alpha1.CoderControlPlane{}, handler.EnqueueRequestsFromMapFunc(...)).

(If we do this, keep the periodic RequeueAfter anyway; it’s the safety net.)
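The mapping logic behind the watch can be illustrated without controller-runtime. The sketch below uses hypothetical stand-in types (`provisioner`, `request`) and a linear scan; in the real controller the field index would replace the scan and the result would be reconcile requests.

```go
package main

import "fmt"

// provisioner is a hypothetical, trimmed-down stand-in for CoderProvisioner.
type provisioner struct {
	Namespace       string
	Name            string
	ControlPlaneRef string // corresponds to spec.controlPlaneRef.name
}

// request is a stand-in for a reconcile request key (namespace/name).
type request struct {
	Namespace string
	Name      string
}

// provisionersForControlPlane returns one request per provisioner in the
// given namespace that references the given control plane.
func provisionersForControlPlane(all []provisioner, namespace, controlPlaneName string) []request {
	var out []request
	for _, p := range all {
		if p.Namespace == namespace && p.ControlPlaneRef == controlPlaneName {
			out = append(out, request{Namespace: p.Namespace, Name: p.Name})
		}
	}
	return out
}

func main() {
	all := []provisioner{
		{Namespace: "team-a", Name: "gpu", ControlPlaneRef: "coder"},
		{Namespace: "team-a", Name: "cpu", ControlPlaneRef: "other"},
		{Namespace: "team-b", Name: "gpu", ControlPlaneRef: "coder"},
	}
	// Only team-a/gpu references control plane "coder" in namespace team-a.
	fmt.Println(provisionersForControlPlane(all, "team-a", "coder"))
}
```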

Tests

Unit/envtest: CoderControlPlane surfaces license tier + entitlements

File: internal/controller/codercontrolplane_controller_test.go

Add coverage for the new status fields:

  • Use a fake EntitlementsInspector (or whatever interface we introduce) that returns a controlled codersdk.Entitlements payload.
  • Assert CoderControlPlane.status is populated after reconcile when the control plane is Ready and operator access is available:
    • entitlementsLastChecked is set.
    • externalProvisionerDaemonsEntitlement matches the entitlements payload.
    • licenseTier is derived correctly (e.g., none when has_license=false, trial when trial=true, premium when custom_roles or multiple_organizations are entitled, else enterprise).
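The tier expectations above lend themselves to a table. A minimal, self-contained sketch follows; `entitlements` and `tierFor` are hypothetical stand-ins for `codersdk.Entitlements` and the tier heuristic, so the example runs without the vendored SDK.

```go
package main

import "fmt"

// entitlements is a hypothetical, trimmed-down stand-in for
// codersdk.Entitlements. Features maps a feature name to whether it is
// entitled (entitled or grace_period).
type entitlements struct {
	HasLicense bool
	Trial      bool
	Features   map[string]bool
}

// tierFor mirrors the best-effort heuristic from the plan using the stand-in type.
func tierFor(ent entitlements) string {
	if !ent.HasLicense {
		return "none"
	}
	if ent.Trial {
		return "trial"
	}
	for _, f := range []string{"custom_roles", "multiple_organizations"} {
		if ent.Features[f] {
			return "premium"
		}
	}
	return "enterprise"
}

func main() {
	cases := []struct {
		name string
		ent  entitlements
		want string
	}{
		{"no license", entitlements{}, "none"},
		{"trial", entitlements{HasLicense: true, Trial: true}, "trial"},
		{"custom roles", entitlements{HasLicense: true, Features: map[string]bool{"custom_roles": true}}, "premium"},
		{"multiple orgs", entitlements{HasLicense: true, Features: map[string]bool{"multiple_organizations": true}}, "premium"},
		{"licensed, no premium signal", entitlements{HasLicense: true}, "enterprise"},
	}
	for _, tc := range cases {
		got := tierFor(tc.ent)
		fmt.Printf("%-28s got=%-10s ok=%v\n", tc.name, got, got == tc.want)
	}
}
```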

Unit/envtest: CoderProvisioner short-circuit behavior

File: internal/controller/coderprovisioner_controller_test.go

Extend the shared fakeBootstrapClient (currently in coderworkspaceproxy_controller_test.go) to implement:

Entitlements(ctx context.Context, coderURL, sessionToken string) (codersdk.Entitlements, error)

Add test cases:

  1. Not entitled (fast path)

    • Set controlPlane.status.entitlementsLastChecked and controlPlane.status.externalProvisionerDaemonsEntitlement=not_entitled.
    • Assert:
      • BootstrapClient.Entitlements is not called.
      • EnsureProvisionerKey is not called.
      • No key Secret / Deployment is created.
      • Condition ExternalProvisionersEntitled=False.
      • Reconcile returns RequeueAfter > 0.
  2. Not entitled (fallback API)

    • Leave the control plane entitlement fields unset/unknown.
    • Fake entitlements returns features[external_provisioner_daemons].entitlement=not_entitled.
    • Assert the same short-circuit behavior.
  3. Entitled

    • Either set control plane status to entitled/grace_period or have the fake entitlements return entitled.
    • Assert existing behavior unchanged (key created, resources created) and condition ExternalProvisionersEntitled=True.
  4. Entitlements API forbidden / error

    • Fake entitlements returns a codersdk.Error with 401/403 (or generic error).
    • Assert condition is set to False with Forbidden/EntitlementsQueryFailed and the reconcile retries.
  5. (If we implement the optional watch) verify indexer + mapping function enqueues provisioners.

Validation / Rollout

  • Run: make test, make build, make lint.
  • Because API types/CRDs change (new CoderControlPlaneStatus fields), run make codegen and make manifests, and commit generated changes.

Generated with mux • Model: openai:gpt-5.3-codex • Thinking: xhigh • Cost: $1.23

@ThomasK33
Member Author

@codex review

Please review the entitlement status + provisioner gating changes.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e54c913ecd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/controller/codercontrolplane_controller.go Outdated
@ThomasK33
Member Author

@codex review

Addressed feedback about status-update loops by only updating entitlementsLastChecked
when first set or when derived entitlement/tier values change.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1887fdddbd

Comment thread internal/controller/coderprovisioner_controller.go Outdated
@ThomasK33
Member Author

@codex review

Addressed the new feedback by avoiding no-op status updates on the entitlement
requeue/error paths in CoderProvisionerReconciler.
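A minimal sketch of the idea, assuming a hypothetical `condition` struct and `setConditionIfChanged` helper rather than the controller's actual types: the status write is issued only when the condition really changed, so a repeated "not entitled" result does not trigger another update event and the `RequeueAfter` backoff is respected.

```go
package main

import "fmt"

// condition is a hypothetical, trimmed-down stand-in for metav1.Condition.
type condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// setConditionIfChanged upserts next into conds and reports whether anything
// changed. Callers would skip the status update when it returns false.
func setConditionIfChanged(conds []condition, next condition) ([]condition, bool) {
	for i := range conds {
		if conds[i].Type != next.Type {
			continue
		}
		if conds[i] == next {
			return conds, false
		}
		conds[i] = next
		return conds, true
	}
	return append(conds, next), true
}

func main() {
	notEntitled := condition{Type: "ExternalProvisionersEntitled", Status: "False", Reason: "NotEntitled"}

	conds, changed := setConditionIfChanged(nil, notEntitled)
	fmt.Println(len(conds), changed) // first observation: write status

	conds, changed = setConditionIfChanged(conds, notEntitled)
	fmt.Println(len(conds), changed) // same result again: skip the write
}
```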

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0efa5b7b64

Comment thread internal/controller/coderprovisioner_controller.go Outdated
@ThomasK33
Member Author

@codex review

Addressed the stale fast-path concern by re-checking entitlements via API once
CoderControlPlane.status.entitlementsLastChecked is older than the configured
retry interval.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f20b3933b9

Comment thread internal/controller/coderprovisioner_controller.go
@ThomasK33
Member Author

@codex review

Addressed the stale entitled fast-path concern: stale cached entitlement status
(now for both entitled and not-entitled fast paths) falls through to a fresh
/api/v2/entitlements check before gating decisions.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 167caedd49

Comment thread internal/controller/codercontrolplane_controller.go Outdated
@ThomasK33
Member Author

@codex review

Addressed the timestamp freshness concern by refreshing
status.entitlementsLastChecked periodically (not every reconcile) and on
entitlement/tier value changes.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a4ddad17c0

Comment thread internal/controller/codercontrolplane_controller.go Outdated
Add CoderControlPlane status fields for license tier and external provisioner
entitlements, reconcile them from coderd, and gate CoderProvisioner
reconciliation when external provisioner daemons are not entitled.

Also add a provisioner condition for entitlement state, wire a control-plane
watch/index for faster rechecks, and extend bootstrap/tests for entitlements.

Follow-ups:
- avoid control-plane entitlements status churn while still refreshing
  `status.entitlementsLastChecked` periodically (and on value changes) and
  requeueing entitlement checks on the refresh interval
- avoid no-op status writes on provisioner entitlement requeue paths so the
  configured backoff is respected in unlicensed environments
- re-check stale control-plane entitlement cache entries (including both
  entitled and not-entitled fast paths) via API before deciding gating

@ThomasK33
Member Author

@codex review

Addressed the latest concern by scheduling periodic entitlement reconciliation
requeues in CoderControlPlaneReconciler on the refresh interval.

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Another round soon, please!

@ThomasK33 ThomasK33 added this pull request to the merge queue Feb 12, 2026
Merged via the queue into main with commit 453dd1d Feb 12, 2026
11 checks passed
@ThomasK33 ThomasK33 deleted the provisioner-jrq9 branch February 12, 2026 17:03