Support the scale subresource on ModelDeployment #155

Merged
dennis-upbound merged 1 commit into main from tipping-the-scales on Jun 16, 2026

@negz (Collaborator) commented on Jun 15, 2026

Fixes #87

A ModelDeployment scales along one axis: spec.replicas. Each replica is a complete serving instance, so scaling means adding or removing whole replicas. Until now nothing exposed that axis through the Kubernetes scale subresource, so kubectl scale and event-driven autoscalers had no standard way to drive it. The design called for the subresource, but XRDs couldn't configure one until Crossplane v2.3 (crossplane/crossplane#7004), released recently.

This declares the scale subresource on the ModelDeployment XRD, mapping spec.replicas to the scale spec and status.replicas.total to the scale status. total is the count of scheduled replicas, which the composition function already writes; it mirrors how Deployment and LeaderWorkerSet report status.replicas as the observed total while keeping readiness in a separate field. The subresource declares no labelSelectorPath: a ModelDeployment's replica pods run on remote workload clusters, not alongside the XR, so a metric-based HorizontalPodAutoscaler can't observe them to scrape metrics. Scaling is by kubectl scale or an external-metrics autoscaler like KEDA. The project now requires Crossplane >=v2.3.0.
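
To make the mapping concrete, here is a minimal Python sketch of how a scale view could be projected from a ModelDeployment. It illustrates only the two paths named above. The helper names (`resolve`, `scale_view`) are hypothetical, and the `SCALE_PATHS` keys are an assumption borrowed from the CustomResourceDefinition scale subresource, not taken from this PR's diff. It is not how the API server implements the subresource.

```python
from typing import Any

# Assumed key names, mirroring the CustomResourceDefinition scale subresource.
# The paths themselves are the ones this PR describes.
SCALE_PATHS = {
    "specReplicasPath": ".spec.replicas",
    "statusReplicasPath": ".status.replicas.total",
}


def resolve(obj: dict[str, Any], path: str, default: int = 0) -> Any:
    """Walk a simple dotted path such as '.spec.replicas' through nested dicts.

    Simplification: a missing value falls back to `default`.
    """
    current: Any = obj
    for key in path.strip(".").split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def scale_view(xr: dict[str, Any]) -> dict[str, Any]:
    """Project a ModelDeployment-shaped dict onto a Scale-shaped dict."""
    return {
        "spec": {"replicas": resolve(xr, SCALE_PATHS["specReplicasPath"])},
        "status": {"replicas": resolve(xr, SCALE_PATHS["statusReplicasPath"])},
    }


# Hypothetical object: three replicas requested, two scheduled so far.
example = {
    "spec": {"replicas": 3},
    "status": {"replicas": {"total": 2}},
}
print(scale_view(example))
```

Readiness is deliberately absent from the projected view: as described above, the scale status carries only the scheduled total.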

$ kubectl scale modeldeployment/qwen3-8b --replicas=3
modeldeployment.modelplane.ai/qwen3-8b scaled
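
Besides a manual kubectl scale, an external-metrics autoscaler would drive the same field by computing a desired replica count from a metric it can read without touching the workload clusters. The sketch below is a simplified, hypothetical version of that calculation. `desired_replicas`, its parameters, and the queue-depth numbers are illustrative assumptions, not KEDA's or the HorizontalPodAutoscaler's actual code. The result is the value such a controller would write to spec.replicas through the scale subresource.

```python
import math


def desired_replicas(metric_total: float, target_per_replica: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Return the replica count that keeps per-replica load at or below target."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    wanted = math.ceil(metric_total / target_per_replica)
    # Clamp to the configured bounds, as autoscalers typically do.
    return max(min_replicas, min(max_replicas, wanted))


# Hypothetical numbers: 250 queued requests, 100 requests per replica as target.
print(desired_replicas(250, 100))
```

Because each replica is a complete serving instance, a whole-number replica count is the only output the calculation needs.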

Generating the Python schemas for an XRD with a scale subresource clobbered the ModelDeployment model with the autoscaling Scale type, which broke the composition functions that import it. crossplane/cli#119 fixes that in the schema generator; the crossplane-cli flake input (already pinned to main for an unreleased fix) moves forward to a main commit that includes it.

I have:

  • Read and followed Modelplane's contribution process.
  • Run nix flake check (or ./nix.sh flake check) and made sure it passes.
  • Added or updated tests covering any composition function changes. No function changes; the XRD already had the status.replicas.total the subresource points at.
  • Signed off every commit with git commit -s.

Comment thread design/design.md Outdated
Comment thread crossplane-project.yaml Outdated
@negz force-pushed the tipping-the-scales branch from 1dff9cc to 16bc4fa on June 15, 2026 22:57
ModelDeployment's only scaling axis is spec.replicas: each replica is a
complete serving instance, and scaling means adding or removing whole
replicas. Nothing exposed that axis through the Kubernetes scale
subresource, so `kubectl scale` and event-driven autoscalers had no
standard way to drive it. The design called for the subresource, but
XRDs couldn't configure one until Crossplane v2.3.

This declares the scale subresource on the XRD, mapping spec.replicas to
the scale spec and status.replicas.total to the scale status. The total
is the count of scheduled replicas, mirroring how Deployment and
LeaderWorkerSet report status.replicas as the observed total while
keeping readiness in a separate field. It declares no labelSelectorPath:
a ModelDeployment's replica pods run on remote workload clusters, not
alongside the XR, so a metric-based HorizontalPodAutoscaler can't observe
them. Scaling is by kubectl or an external-metrics autoscaler like KEDA.

The project now requires Crossplane v2.3, the first release in which an
XRD can configure the scale subresource.

Fixes #87.

Signed-off-by: Nic Cope <nicc@rk0n.org>
@negz force-pushed the tipping-the-scales branch from 16bc4fa to da635d3 on June 15, 2026 23:29
@negz marked this pull request as ready for review on June 15, 2026 23:29
Copilot AI review requested due to automatic review settings June 15, 2026 23:29

Copilot AI left a comment

Pull request overview

Adds Kubernetes scale-subresource support to the ModelDeployment XRD so standard tools (e.g., kubectl scale) and event-driven autoscalers can drive spec.replicas, and bumps the minimum Crossplane version to one that supports XRD scale subresources.

Changes:

  • Declare the scale subresource on the ModelDeployment XRD, mapping spec.replicas and status.replicas.total.
  • Require Crossplane >= v2.3.0 (XRD scale-subresource support) and update documentation accordingly.
  • Advance the pinned crossplane/cli flake input to include the schema-generation fix for XRDs with scale subresources.

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated no comments.

Summary per file:

| File | Description |
| --- | --- |
| flake.nix | Documents why crossplane/cli remains pinned to main (now includes the scale-subresource schema-gen fix). |
| flake.lock | Updates the pinned crossplane/cli revision to a commit containing the needed generator fix. |
| design/design.md | Updates autoscaling design notes to reflect status.replicas.total as the scale status source. |
| crossplane-project.yaml | Bumps the minimum Crossplane version to >=v2.3.0 for XRD scale-subresource support. |
| apis/modeldeployments/definition.yaml | Adds the XRD subresources.scale mapping from spec.replicas to status.replicas.total. |

@dennis-upbound merged commit 92d34bb into main on Jun 16, 2026
6 checks passed
@negz deleted the tipping-the-scales branch on June 16, 2026 16:56

Linked issue: Support scale subresource on ModelDeployment

3 participants