Skip to content

🤖 feat: add WATCH support to aggregated API server storage#72

Merged
ThomasK33 merged 12 commits into
mainfrom
api-server-y1p0
Feb 13, 2026
Merged

🤖 feat: add WATCH support to aggregated API server storage#72
ThomasK33 merged 12 commits into
mainfrom
api-server-y1p0

Conversation

@ThomasK33
Copy link
Copy Markdown
Member

Summary

Add WATCH support (rest.Watcher) to the aggregated API server's codersdk-backed storage for coderworkspaces and codertemplates resources, enabling kubectl get --watch, informers, and other standard Kubernetes watch clients.

Background

The aggregated API server (aggregation.coder.com/v1alpha1) previously served CoderWorkspace and CoderTemplate resources through custom REST storage backed by codersdk, but did not implement rest.Watcher. This meant clients could not use ?watch=true — a fundamental Kubernetes API capability used by kubectl --watch, controller informers, and client-go watch helpers.

This is a best-effort watch implementation: events are emitted when mutations pass through this API server. Out-of-band Coder changes (e.g., builds completing, changes via the Coder UI) are not reflected until a future enhancement adds background polling or bridges codersdk watch streams.

Implementation

New shared helpers (internal/aggregated/storage/watch.go)

  • filterForListOptions(requestNamespace, opts) — builds a watch.FilterFunc that applies namespace scoping, label selector, and field selector filtering
  • validateFieldSelector(sel) — rejects unsupported fields; allows metadata.name and metadata.namespace
  • watchBroadcasterQueueLen constant (100 events)

Storage changes (workspace.go, template.go)

  • Added *watch.Broadcaster and sync.Once to each storage struct
  • Initialized broadcaster in constructors with watch.DropIfChannelFull policy
  • Destroy() now shuts down broadcaster (idempotent via sync.Once)
  • Added Watch(ctx, opts) implementing rest.Watcher:
    • Defensive nil assertions
    • Namespace, label, and field selector filtering
    • Context cancellation and timeout support
    • Compile-time _ rest.Watcher = (*Storage)(nil) assertions
  • Emit watch events from successful mutations (best-effort, never fails the API request):
    • WorkspaceStorage: Added on Create, Modified on Update (non-no-op only), Modified on Delete (async deletion)
    • TemplateStorage: Added on Create, Modified on Update, Deleted on Delete (synchronous)

Tests (internal/aggregated/storage/watch_test.go)

  • TestTemplateStorageWatch_AddedModifiedDeleted — full lifecycle
  • TestWorkspaceStorageWatch_AddedModified — create + update toggle
  • TestWatchRespectsFieldSelectorMetadataName — field selector filtering
  • TestWatchStopsOnContextCancel — cleanup on disconnect
  • TestValidateFieldSelector — shared helper unit tests
  • TestFilterForListOptions — filter construction unit tests

Validation

  • make build
  • make test
  • make lint
  • make verify-vendor

Risks

  • Low: Watch semantics are best-effort (no resume/resourceVersion tracking, no out-of-band event bridging). This is documented and intentional for the MVP — clients that depend on complete event streams should be aware of this limitation.
  • None for existing behavior: All changes are additive (new interface implementation + new file). Existing CRUD paths are unchanged except for the best-effort broadcast call after success returns, which is fire-and-forget.

📋 Implementation Plan

Plan: Add WATCH support to aggregated API server storage

Context / Why

The aggregated API server currently serves aggregation.coder.com/v1alpha1 resources (coderworkspaces, codertemplates) via custom, codersdk-backed REST storage.

Today those storage implementations do not implement k8s.io/apiserver/pkg/registry/rest.Watcher, so Kubernetes clients cannot do ?watch=true (e.g. kubectl get … --watch, informers).

Goal: implement a best-effort WATCH endpoint by:

  • Implementing rest.Watcher on both storages.
  • Broadcasting watch events when this API server successfully performs CREATE / UPDATE / DELETE operations.
  • Supporting basic label/field selector filtering so normal client-go/kubectl usage works.

Note: because these resources are not persisted in etcd (they are proxied to Coder), watch semantics will not be identical to native Kubernetes resources. This plan focuses on correctness + practicality for the common case.

Evidence (what I verified)

  • Storage implementations:
    • internal/aggregated/storage/workspace.go and internal/aggregated/storage/template.go implement CRUD and list/get, but do not implement rest.Watcher.
    • They are codersdk-backed, not in-memory.
  • Aggregated API server installs these storages directly:
    • internal/app/apiserverapp/apiserverapp.go (VersionedResourcesStorageMap maps coderworkspaces/codertemplates to storage.New*Storage).
  • Required interface:
    • vendor/k8s.io/apiserver/pkg/registry/rest/rest.go defines type Watcher interface { Watch(ctx, *internalversion.ListOptions) (watch.Interface, error) }.
  • Watch primitives available in-tree:
    • vendor/k8s.io/apimachinery/pkg/watch/mux.go provides watch.Broadcaster.
    • vendor/k8s.io/apimachinery/pkg/watch/filter.go provides watch.Filter.

Implementation details

1) Add a watch broadcaster to each storage and make Destroy() shut it down

Files:

  • internal/aggregated/storage/workspace.go
  • internal/aggregated/storage/template.go

Changes:

  • Extend each storage struct with:
    • broadcaster *watch.Broadcaster
    • broadcasterShutdown sync.Once (so Destroy() can be called multiple times)
  • Initialize in constructors:
    • watch.NewBroadcaster(<queueLen>, watch.DropIfChannelFull)
      • Prefer DropIfChannelFull so slow/abandoned watchers can’t stall API requests.
  • Update Destroy():
    • broadcasterShutdown.Do(func(){ broadcaster.Shutdown() })

Also update interface assertions at the top of each file:

var _ rest.Watcher = (*WorkspaceStorage)(nil)
// and
var _ rest.Watcher = (*TemplateStorage)(nil)

2) Implement Watch(ctx, options) for both storage types

Files:

  • internal/aggregated/storage/workspace.go
  • internal/aggregated/storage/template.go
  • (new) internal/aggregated/storage/watch.go (shared helpers)

High-level behavior:

  • Create a new watcher from the broadcaster.
  • Ensure it is stopped when the request context is done.
  • Apply label/field selector filtering using watch.Filter.

Suggested helper shape (shared):

// supportedFieldKeys: metadata.name, metadata.namespace, metadata.uid
func validateFieldSelector(sel fields.Selector) error { ... }

func filterForListOptions(
  requestNamespace string,
  opts *metainternalversion.ListOptions,
) (watch.FilterFunc, error) { ... }

Watch method pseudo-code:

func (s *TemplateStorage) Watch(ctx context.Context, opts *metainternalversion.ListOptions) (watch.Interface, error) {
  if s == nil { return nil, fmt.Errorf("assertion failed: template storage must not be nil") }
  if ctx == nil { return nil, fmt.Errorf("assertion failed: context must not be nil") }
  if s.broadcaster == nil { return nil, fmt.Errorf("assertion failed: template broadcaster must not be nil") }

  // Apply timeout from opts.TimeoutSeconds if set.
  if opts != nil && opts.TimeoutSeconds != nil {
    var cancel context.CancelFunc
    ctx, cancel = context.WithTimeout(ctx, time.Duration(*opts.TimeoutSeconds)*time.Second)
    defer cancel()
  }

  w, err := s.broadcaster.Watch()
  if err != nil { return nil, err }

  // Ensure cleanup on disconnect.
  go func() {
    <-ctx.Done()
    w.Stop()
  }()

  ns, err := namespaceFromRequestContext(ctx)
  if err != nil { w.Stop(); return nil, err }

  filter, err := filterForListOptions(ns, opts)
  if err != nil { w.Stop(); return nil, err }
  if filter != nil {
    w = watch.Filter(w, filter)
  }

  return w, nil
}

Selector behavior details:

  • LabelSelector:
    • If nil/empty: match all.
    • Else: evaluate against ObjectMeta.Labels.
  • FieldSelector:
    • Validate requirements up-front and return BadRequest for unsupported fields.
    • Support exact-match selection on:
      • metadata.name
      • metadata.namespace
      • metadata.uid
    • Evaluate via fields.Set{...} + sel.Matches(...).
  • Namespace scoping:
    • If request namespace is non-empty (normal namespaced watch), filter out events whose metadata.namespace differs.
    • If request namespace is empty (--all-namespaces), do not apply a namespace filter.

3) Emit watch events from successful mutations

Files:

  • internal/aggregated/storage/workspace.go
  • internal/aggregated/storage/template.go

Pattern:

  • After the SDK call succeeds and we have the object we’re returning, broadcast a deep-copied event:
    • watch.Added after Create
    • watch.Modified after Update
    • watch.Deleted after template Delete

Template delete:

  • We already fetch the template before deletion (TemplateByName). Convert that to k8s once and reuse it for:
    • delete validation
    • broadcasting the Deleted event

Workspace delete (special):

  • Delete is asynchronous in Coder and returns deleted=false.
  • Don’t emit a Deleted watch event.
  • Instead, capture the returned delete-transition build (currently ignored), update the in-memory workspace struct like Update() does, and emit a Modified event:
    • This gives watchers a signal that deletion was requested without lying about final deletion.

Important: do not fail the API request if broadcasting fails (broadcaster may be shutting down). Treat broadcaster errors as best-effort.

4) Add focused unit tests for watch behavior

Files:

  • (new) internal/aggregated/storage/watch_test.go (preferred) or extend internal/aggregated/storage/storage_test.go.

Tests to add:

  1. TestTemplateStorageWatchEmitsAddedModifiedDeleted
    • Start watch, create template, update template, delete template.
    • Assert watch channel receives the corresponding event types and object names.
  2. TestWorkspaceStorageWatchEmitsAddedModified
    • Start watch, create workspace, update workspace (toggle running).
    • Assert Added then Modified events.
  3. TestWatchRespectsFieldSelectorMetadataName
    • Start a watch with fields.OneTermEqualSelector("metadata.name", "acme.ops-template").
    • Create a different template and assert no event.
    • Create the target template and assert event received.
  4. TestWatchStopsOnContextCancel
    • Start watch with cancellable context; cancel; ensure watcher stops (ResultChan closes or no goroutine leak).

Use existing helpers:

  • newMockCoderServer(t)
  • newTestClientProvider(t, server.URL)
  • namespacedContext("control-plane")

5) Validation

Run (in Exec mode):

  • make test
  • (if files touched beyond tests) make lint and make build
Future enhancements (intentionally out of MVP scope)
  • Reflect out-of-band Coder changes:
    • Workspace build status changes, deletions completing, template changes done outside Kubernetes won’t be emitted unless they pass through this API server.
    • Options:
      1. Background poller (List+diff) feeding the broadcaster.
      2. For workspaces, bridge codersdk.Client.WatchWorkspace(ctx, id) for per-object watches.
  • ResourceVersion / resume semantics:
    • This MVP treats watch as “from now” best-effort.
    • Supporting robust resume would require an event cache / RV tracking.
  • SendInitialEvents / bookmarks:
    • Client-go may use watch-list (sendInitialEvents=true) in some configurations.
    • If needed, implement initial synthetic Added events + Bookmark via Broadcaster.WatchWithPrefix.

Generated with mux • Model: anthropic:claude-opus-4-6 • Thinking: xhigh • Cost: $2.27

@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0171590c9c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/aggregated/storage/workspace.go
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed P2 feedback: workspace delete watch event now populates LatestBuild and UpdatedAt from the delete-transition build returned by CreateWorkspaceBuild, matching the pattern used in the Update method. This ensures watchers see the actual delete-transition state rather than the pre-delete snapshot.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4f807a7aee

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/aggregated/storage/workspace.go
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Implemented the requested watch-list handling change by explicitly rejecting unsupported watch-list options with BadRequest:

  • reject sendInitialEvents
  • reject resourceVersionMatch

Applied to both WorkspaceStorage.Watch and TemplateStorage.Watch, and added focused unit tests for these paths.

@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Rebased onto current main, resolved conflicts, and preserved the watch-list option rejection fix + tests. Local verify-vendor/test/build/lint all pass again after rebase.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cf9327bc76

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/aggregated/storage/watch.go Outdated
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed the new P1:

  • watch options validation now allows the defaulted legacy combo (sendInitialEvents=true + resourceVersionMatch=NotOlderThan) when resourceVersion is empty or 0
  • still rejects explicit unsupported watch-list semantics outside that legacy/defaulted path
  • added unit/integration coverage for both allowed defaulted cases and rejected non-legacy cases

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d70ee1d64d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/aggregated/storage/watch.go Outdated
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed the latest P1 by reverting legacy watch-list acceptance:

  • defaulted/explicit sendInitialEvents=true is now rejected again (BadRequest)
  • resourceVersionMatch remains rejected
  • added/updated tests to explicitly cover rejection for legacy default combos (resourceVersion="" and "0")
  • retained acceptance of sendInitialEvents=false

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: faf95ad4c4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/aggregated/storage/watch.go
Comment thread internal/aggregated/storage/template.go
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed both latest P1 threads:

  1. Watch now rejects non-empty resourceVersion (BadRequest) to avoid pretending to support replay semantics.
  2. Broadcaster initialization now uses watch.WaitIfChannelFull (workspace + template) to avoid silently dropping events under load.

Also updated watch tests to cover these behaviors (resourceVersion rejection and legacy/defaulted watch-list rejection expectations).

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7738fe2d38

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/aggregated/storage/workspace.go Outdated
Comment thread internal/aggregated/storage/watch.go Outdated
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed both latest P1 findings with a compromise that preserves correctness without blocking writes:

  • Kept broadcaster in WaitIfChannelFull mode to avoid silent per-watcher event drops.
  • Moved mutation event emission to broadcastEventAsync(...), so Create/Update/Delete request handlers are not blocked by slow watch consumers.
  • Stopped rejecting plain resourceVersion (list-watch flow), while still rejecting unsupported watch-list semantics (sendInitialEvents=true, resourceVersionMatch).
  • Updated unit tests accordingly.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dc07eba7e2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/aggregated/storage/watch.go Outdated
Comment thread internal/aggregated/storage/watch.go Outdated
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed the latest P1s:

  • Replaced goroutine-per-event broadcasting with per-storage ordered dispatch queues (watchEvents channel + single dispatcher goroutine), preserving mutation ordering in emitted watch events.
  • This also avoids unbounded goroutine accumulation under watcher backpressure.
  • ResourceVersion remains accepted for normal list-watch flows; unsupported watch-list semantics are still rejected.
  • Local verify-vendor/test/build/lint all pass.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 430048757f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/aggregated/storage/workspace.go Outdated
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed the latest DoS concern:

  • enqueueWatchEvent is now non-blocking under watch backpressure.
  • If the dispatch queue is full, it drops the oldest queued event and enqueues the newest event to keep writes responsive and preserve recency.
  • This keeps mutation handlers from stalling behind slow/stuck watch clients while still preserving ordered dispatch for queued events via the single worker goroutine.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ff88b2d274

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/aggregated/storage/template.go Outdated
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Applied the requested broadcaster-mode change:

  • Workspace and template storage now initialize broadcasters with watch.DropIfChannelFull.
  • This prevents one slow/stuck watcher from stalling distribution to all other watchers.
  • Combined with the non-blocking local watch-event queue policy, mutation handlers remain responsive under watch backpressure.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9b13ec5d96

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/aggregated/storage/template.go Outdated
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed the latest watch stopper leak concern:

  • Introduced stopAwareWatch wrapper in watch.go.
  • WorkspaceStorage.Watch and TemplateStorage.Watch now use this wrapper and their stopper goroutines observe both context/timeout and watcher termination (Done()), so calling Stop() with a live context no longer leaves stopper goroutines running.
  • Added focused tests for stopAwareWatch behavior.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Breezy!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@ThomasK33 ThomasK33 added this pull request to the merge queue Feb 13, 2026
@ThomasK33
Copy link
Copy Markdown
Member Author

Merged via the queue into main with commit b4535b3 Feb 13, 2026
8 checks passed
@ThomasK33 ThomasK33 deleted the api-server-y1p0 branch February 13, 2026 12:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant