🤖 fix: back off provisioner reconciliation on coderd 429s by ThomasK33 · Pull Request #65 · coder/coder-k8s

ThomasK33 · 2026-02-12T13:42:36Z

Summary

This PR hardens CoderProvisioner reconciliation against coderd API rate limiting by adding explicit 429 backoff behavior, plus optional rate-limit bypass with fallback in bootstrap client calls. It also includes the existing flake.nix dev-tooling update present in this workspace.

Background

Provisioner reconciliation was entering tight error loops after coderd returned HTTP 429 (Too Many Requests), which amplified API pressure and prevented stable reconciliation. The operator needed controller-level pacing and safer bootstrap API behavior when bypass headers are unavailable.

Implementation

Added explicit per-resource jittered exponential backoff for coderd 429s in CoderProvisionerReconciler:
- base 2s, cap 2m, floor 1s, jitter ratio 0.2
- converts 429 failures into RequeueAfter instead of immediate error retries
- sets ProvisionerKeyReady=False with Reason=RateLimited
- resets per-resource backoff after non-rate-limited outcomes
Added bootstrap SDK helpers:
- withOptionalRateLimitBypass(...)
- bypassRateLimitRoundTripper to inject X-Coder-Bypass-Ratelimit: true when requested
- automatic retry without bypass when server rejects bypass (428 Precondition Required)
- exported IsRateLimitError(err) helper for controller logic
Applied bypass/fallback flow to:
- workspace proxy create/patch operations
- provisioner key create/query/delete paths
Added tests:
- controller backoff + RateLimited condition coverage
- workspace proxy bypass fallback coverage
- provisioner key ensure/delete bypass fallback coverage
Included existing workspace change in flake.nix (yazi added to dev shell packages)

Validation

make test
make build
make lint
make verify-vendor

Risks

Low-to-medium: reconcile timing changes for 429 paths only.
Main risk is slower convergence during sustained coderd throttling; this is intentional to prevent hot-loop amplification and API overload.
Scope is limited to bootstrap calls and provisioner reconciliation retry behavior.

Generated with mux • Model: openai:gpt-5.3-codex • Thinking: xhigh • Cost: $1.46

- add explicit jittered exponential requeue for CoderProvisioner rate-limit responses
- add optional X-Coder-Bypass-Ratelimit handling with automatic fallback when rejected
- extend controller and bootstrap tests for backoff + bypass fallback behavior
- include existing flake.nix tooling update present in workspace

ThomasK33 · 2026-02-12T13:42:50Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8ffc4f20e6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review"

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

- avoid unconditional status writes in 429 defer path
- only persist condition changes when status actually changed
- keep a stable RateLimited condition message so self-updates do not trigger rapid reconciles
- update rate-limit test assertion accordingly
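The status-write guard described in the commit notes above might look something like this. It is a sketch with stand-in types: `condition` is a simplified, hypothetical substitute for the Kubernetes condition type, and `setCondition`, `markRateLimited`, the `write` callback, and the message text are illustrative names and values that do not come from the PR.

```go
package main

// condition is a simplified, hypothetical stand-in for a Kubernetes condition.
type condition struct {
	Type, Status, Reason, Message string
}

// A fixed message (no delay, attempt count, or timestamp) keeps repeated
// 429s from producing a "new" status on every reconcile.
const rateLimitedMessage = "coderd returned 429; reconciliation is backing off"

// setCondition returns a copy of conds with c upserted by Type, and reports
// whether the result differs from the input.
func setCondition(conds []condition, c condition) ([]condition, bool) {
	out := append([]condition(nil), conds...)
	for i := range out {
		if out[i].Type != c.Type {
			continue
		}
		if out[i] == c {
			return out, false
		}
		out[i] = c
		return out, true
	}
	return append(out, c), true
}

// markRateLimited sets ProvisionerKeyReady=False/RateLimited and calls write
// only when the stored conditions actually changed.
func markRateLimited(conds []condition, write func([]condition) error) ([]condition, error) {
	updated, changed := setCondition(conds, condition{
		Type:    "ProvisionerKeyReady",
		Status:  "False",
		Reason:  "RateLimited",
		Message: rateLimitedMessage,
	})
	if !changed {
		return conds, nil // nothing to persist, so no self-triggered reconcile
	}
	if err := write(updated); err != nil {
		return conds, err
	}
	return updated, nil
}
```

Skipping the write when nothing changed matters because a status update on the watched resource can itself enqueue another reconcile, which would defeat the backoff.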

ThomasK33 · 2026-02-12T13:55:39Z

@codex review

chatgpt-codex-connector · 2026-02-12T14:04:14Z

Codex Review: Didn't find any major issues. Keep them coming!

chatgpt-codex-connector Bot reviewed Feb 12, 2026

Comment thread internal/controller/coderprovisioner_controller.go Outdated

ThomasK33 added this pull request to the merge queue Feb 12, 2026

Merged via the queue into main with commit d68b694 Feb 12, 2026
8 checks passed

ThomasK33 deleted the operator-spzg branch February 12, 2026 14:11

ThomasK33 merged 2 commits into main from operator-spzg
