feat: reconcile control planes and workspace proxies #23

Merged: ibetitsmike merged 7 commits into main from mike/operator-controlplane-workspaceproxy on Feb 10, 2026

@ibetitsmike ibetitsmike commented Feb 10, 2026

Summary

This PR upgrades coder-k8s from a placeholder controller into a functional operator for two resource types:

  • CoderControlPlane (coder.com/v1alpha1)
  • WorkspaceProxy (coder.com/v1alpha1)

It reconciles control-plane and workspace-proxy Deployments/Services, supports optional proxy bootstrap token creation through the Coder SDK, and updates generated manifests/codegen/output accordingly.

Background

coder-k8s previously scaffolded API types and controller wiring, but reconciliation was intentionally a no-op. We need a usable operator path for deploying and managing:

  • the Coder control plane, and
  • external workspace proxies

through CRDs stored in etcd and managed with standard Kubernetes workflows.

Implementation

  • Expanded CoderControlPlane API:
    • richer spec (image, replicas, service, extraArgs, extraEnv, imagePullSecrets)
    • richer status (observedGeneration, readyReplicas, url, phase, conditions)
  • Added shared API helpers in api/v1alpha1/types_shared.go.
  • Added WorkspaceProxy API (workspaceproxy_types.go) with:
    • direct token mode (primaryAccessURL + proxySessionTokenSecretRef)
    • optional bootstrap mode (bootstrap.coderURL, credentials secret, generated token secret)
  • Replaced placeholder control-plane reconcile loop with real reconciliation:
    • manages Deployment + Service
    • updates status based on observed workload state
  • Added WorkspaceProxyReconciler:
    • manages Deployment + Service
    • manages token Secret when bootstrap mode is enabled
    • updates proxy status fields
  • Added SDK bootstrap integration in internal/coderbootstrap/client.go:
    • uses github.com/coder/coder/v2/codersdk as-is
    • creates/updates workspace proxy in Coder and regenerates tokens when needed
  • Wired both reconcilers in controller app startup.
  • Added/updated tests:
    • control-plane reconcile coverage
    • workspace-proxy reconcile coverage
    • SDK bootstrap client tests
  • Regenerated and updated:
    • deepcopy code
    • CRD manifests
    • RBAC manifests
    • sample manifests
    • go.mod/go.sum/vendor.
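
As a rough illustration of "updates status based on observed workload state", the following is a minimal, stdlib-only sketch rather than the PR's actual code. The struct, the `deriveStatus` helper, the phase strings (`Pending`, `Ready`), and the in-cluster URL format are all assumptions made for the example; only the status field names come from the list above.

```go
package main

import "fmt"

// ControlPlaneStatus mirrors the status fields named in the PR description.
// The Go field names and types are assumptions for this sketch.
type ControlPlaneStatus struct {
	ObservedGeneration int64
	ReadyReplicas      int32
	URL                string
	Phase              string
}

// deriveStatus is a hypothetical helper that maps observed Deployment state
// to a status value. The phase strings and URL format are illustrative only.
func deriveStatus(generation int64, desired, ready int32, service, namespace string, port int32) ControlPlaneStatus {
	phase := "Pending"
	if desired > 0 && ready >= desired {
		phase = "Ready"
	}
	return ControlPlaneStatus{
		ObservedGeneration: generation,
		ReadyReplicas:      ready,
		URL:                fmt.Sprintf("http://%s.%s.svc.cluster.local:%d", service, namespace, port),
		Phase:              phase,
	}
}

func main() {
	// One of two desired replicas is ready, so the sketch reports Pending.
	fmt.Printf("%+v\n", deriveStatus(3, 2, 1, "coder", "default", 80))
	// All desired replicas are ready, so the sketch reports Ready.
	fmt.Printf("%+v\n", deriveStatus(3, 2, 2, "coder", "default", 80))
}
```

Recording `observedGeneration` alongside the phase lets a reader of the status tell whether it reflects the latest spec or an earlier one.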

Validation

  • make codegen
  • make manifests
  • make test
  • make build
  • make verify-vendor

Risks

  • Dependency footprint / churn (medium): adding codersdk currently pulls in a large transitive dependency graph and causes substantial vendor churn.
  • Bootstrap semantics (medium): bootstrap mode assumes valid Coder credentials and expected workspace-proxy API behavior; misconfigured secrets will block readiness.
  • Image/feature compatibility (medium): workspace proxy reconciliation assumes a Coder image that supports wsproxy server.

Generated with mux • Model: openai:gpt-5.3-codex • Thinking: xhigh • Cost: $0.54

Latest updates (February 10, 2026)

  • Rebased the branch onto main (806142a).

  • Addressed Codex feedback to avoid unnecessary bootstrap credential reads:

    • WorkspaceProxyReconciler.resolveProxyCredentials now checks for an existing generated proxy token Secret before reading bootstrap credentials.
    • This allows steady-state reconciliation to continue even after bootstrap credentials are removed/rotated, as long as the generated proxy token Secret is present.
  • Added regression coverage:

    • TestWorkspaceProxyReconcile_WithBootstrap_UsesExistingTokenWithoutCredentials verifies reconcile succeeds with an existing token secret and no bootstrap credential secret.
  • Re-synced go.mod, go.sum, and vendor/ after the rebase so vendored dependencies and module metadata remain consistent.

  • Bounded app.kubernetes.io/instance label values for WorkspaceProxy children to 63 characters with deterministic hash suffixing to avoid reconciliation failures on long CR names; added TestWorkspaceProxyReconcile_TruncatesLongInstanceLabelValue coverage.
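
The credential short-circuit described above can be sketched as follows. This is a simplified model under stated assumptions, not the body of `WorkspaceProxyReconciler.resolveProxyCredentials`: the `secretStore` map stands in for cluster Secret reads, and `resolveProxyToken`, `errNoCredentials`, and the `mint` callback are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// secretStore is a hypothetical in-memory stand-in for reading Secrets.
type secretStore map[string]string

var errNoCredentials = errors.New("bootstrap credentials secret not found")

// resolveProxyToken returns the proxy session token. It checks the generated
// token Secret first and only reads bootstrap credentials when no token exists.
func resolveProxyToken(store secretStore, tokenSecret, credsSecret string, mint func(creds string) (string, error)) (string, error) {
	// Steady state: reuse the generated token without touching credentials.
	if token, ok := store[tokenSecret]; ok && token != "" {
		return token, nil
	}
	// First-time bootstrap: credentials are required to mint a token.
	creds, ok := store[credsSecret]
	if !ok {
		return "", errNoCredentials
	}
	token, err := mint(creds)
	if err != nil {
		return "", fmt.Errorf("bootstrap proxy token: %w", err)
	}
	store[tokenSecret] = token
	return token, nil
}

func main() {
	mint := func(creds string) (string, error) { return "token-for-" + creds, nil }

	// Credentials were removed, but the generated token Secret still exists.
	store := secretStore{"proxy-token": "existing-token"}
	token, err := resolveProxyToken(store, "proxy-token", "bootstrap-creds", mint)
	fmt.Println(token, err)
}
```

With this ordering, rotating or deleting the bootstrap credentials only matters when a token has to be created.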


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5c22f1b1df

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread internal/controller/workspaceproxy_controller.go Outdated
Comment thread internal/coderbootstrap/client.go
@ibetitsmike ibetitsmike force-pushed the mike/operator-controlplane-workspaceproxy branch from 5c22f1b to 91002f5 Compare February 10, 2026 09:56
@ibetitsmike
Collaborator Author

@codex review

Addressed the previously unresolved Codex threads and pushed follow-up fixes:

  • prefixed WorkspaceProxy child Deployment/Service names to avoid collisions with CoderControlPlane children in the same namespace
  • configured an explicit timeout on the bootstrap codersdk HTTP client
  • added/updated controller tests for the new naming behavior and collision scenario
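
To make the first two fixes concrete, here is a small sketch under stated assumptions. The `wsproxy-` prefix, the helper names, and the 30-second timeout are hypothetical; the PR states only that WorkspaceProxy child names are prefixed and that the bootstrap HTTP client has an explicit timeout. The sketch also assumes control-plane children reuse the CR name unchanged.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// workspaceProxyPrefix is a hypothetical prefix chosen for this sketch.
const workspaceProxyPrefix = "wsproxy-"

// controlPlaneChildName assumes control-plane children reuse the CR name.
func controlPlaneChildName(crName string) string { return crName }

// workspaceProxyChildName prefixes the CR name so a WorkspaceProxy and a
// CoderControlPlane with the same name get different child object names.
func workspaceProxyChildName(crName string) string { return workspaceProxyPrefix + crName }

// newBootstrapHTTPClient returns an HTTP client with an explicit timeout.
// The zero-value http.Client has no timeout, so a stalled request could
// otherwise block the caller indefinitely.
func newBootstrapHTTPClient(timeout time.Duration) *http.Client {
	return &http.Client{Timeout: timeout}
}

func main() {
	fmt.Println(controlPlaneChildName("coder"), workspaceProxyChildName("coder"))
	client := newBootstrapHTTPClient(30 * time.Second)
	fmt.Println(client.Timeout)
}
```

A fixed prefix separates the two kinds when their CR names match; it does not by itself rule out a control plane whose own name already starts with the prefix.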

Validation run locally:

  • make verify-vendor
  • make test
  • make build

@ibetitsmike
Collaborator Author

@codex review

Follow-up fix pushed after CI feedback:

  • fixed crd-ref-docs markdown template row rendering so generated reference tables emit one row per line
  • regenerated docs/reference/api/codercontrolplane.md

Re-validated locally:

  • make verify-vendor
  • make test
  • make build


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a75c6ff317

Comment thread internal/controller/workspaceproxy_controller.go Outdated
Comment thread internal/controller/workspaceproxy_controller.go Outdated
@ibetitsmike ibetitsmike force-pushed the mike/operator-controlplane-workspaceproxy branch from a75c6ff to 1d76c01 Compare February 10, 2026 10:30
@ibetitsmike
Collaborator Author

@codex review

Rebased this branch onto current main and addressed the unresolved bootstrap-credential thread by short-circuiting on an existing generated proxy token secret before reading bootstrap credentials. Added regression coverage in TestWorkspaceProxyReconcile_WithBootstrap_UsesExistingTokenWithoutCredentials.

Validation run locally:

  • make verify-vendor
  • make build
  • make test

@ibetitsmike
Collaborator Author

@codex review

Follow-up fix pushed for the remaining label-length thread:

  • workspaceProxyLabels now bounds app.kubernetes.io/instance values to 63 chars (deterministic truncate+hash), so long WorkspaceProxy names no longer fail child Deployment/Service validation.
  • Added TestWorkspaceProxyReconcile_TruncatesLongInstanceLabelValue regression coverage.
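
A deterministic truncate+hash scheme of this kind can be sketched as below. This is one plausible approach, not necessarily what `workspaceProxyLabels` does: the choice of SHA-256, the 8-character hex suffix, and the `boundLabelValue` name are assumptions. It also assumes the input is a valid Kubernetes object name (ASCII, starting with an alphanumeric character).

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

const (
	maxLabelValueLen = 63 // Kubernetes limit for label values
	hashSuffixLen    = 8  // assumption: number of hex characters kept from the hash
)

// boundLabelValue returns name unchanged when it fits, otherwise a truncated
// prefix plus a hash of the full name, so distinct long names stay distinct.
func boundLabelValue(name string) string {
	if len(name) <= maxLabelValueLen {
		return name
	}
	sum := sha256.Sum256([]byte(name))
	suffix := hex.EncodeToString(sum[:])[:hashSuffixLen]
	// Leave room for the "-" separator and the suffix.
	prefix := name[:maxLabelValueLen-hashSuffixLen-1]
	// Avoid doubled separators such as "--" or ".-" before the suffix.
	prefix = strings.TrimRight(prefix, "-._")
	return prefix + "-" + suffix
}

func main() {
	fmt.Println(boundLabelValue("proxy-eu-west"))
	long := strings.Repeat("workspace-proxy-", 6) // 96 characters
	bounded := boundLabelValue(long)
	fmt.Println(len(bounded), bounded)
}
```

Because the suffix is derived from the full name, the same CR always produces the same label value across reconciles, so selectors built from the label remain stable.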

Validation run locally:

  • make test
  • make build

@ibetitsmike ibetitsmike added this pull request to the merge queue Feb 10, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1b864dc315

func controlPlaneLabels(name string) map[string]string {
    return map[string]string{
        "app.kubernetes.io/name": "coder-control-plane",
        "app.kubernetes.io/instance": name,

[P1] Bound control-plane instance label to 63 chars

controlPlaneLabels sets app.kubernetes.io/instance to the full CoderControlPlane name, but Kubernetes label values are limited to 63 characters; any valid CR name longer than that will cause child Deployment/Service writes to be rejected during reconciliation. This means long-named control planes cannot converge even though the CR itself can be created, so this label needs the same truncate+hash handling added for WorkspaceProxy.
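
The mismatch described in this comment can be reproduced without a cluster. The sketch below assumes the resource name is validated as a DNS subdomain (up to 253 characters), which is the usual rule for object names, and checks the label value against the documented 63-character limit and character pattern. `isValidLabelValue` is a hypothetical helper, not a function from the PR.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

const (
	maxObjectNameLen = 253 // DNS subdomain limit that applies to most object names
	maxLabelValueLen = 63  // limit for label values
)

// labelValuePattern follows the documented label value syntax: empty, or
// alphanumeric at both ends with '-', '_', '.' and alphanumerics in between.
var labelValuePattern = regexp.MustCompile(`^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$`)

// isValidLabelValue reports whether v could be used as a label value.
func isValidLabelValue(v string) bool {
	return len(v) <= maxLabelValueLen && labelValuePattern.MatchString(v)
}

func main() {
	name := strings.Repeat("a", 100)
	// The name fits the object-name limit but not the label-value limit.
	fmt.Println(len(name) <= maxObjectNameLen, isValidLabelValue(name)) // prints "true false"
}
```

Running a check like this on the computed labels in a unit test would surface the problem before the API server rejects the child Deployment or Service.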

Merged via the queue into main with commit 671d3ed Feb 10, 2026
10 checks passed
@ibetitsmike ibetitsmike deleted the mike/operator-controlplane-workspaceproxy branch February 10, 2026 10:41
1 participant