feat: add AzureNodeClass CRD at karpenter.azure.com/v1alpha1 #1487
base: comtalyst/vm-path-refactor-3-helpers
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,198 @@ | ||
| --- | ||
| apiVersion: apiextensions.k8s.io/v1 | ||
| kind: CustomResourceDefinition | ||
| metadata: | ||
| annotations: | ||
| controller-gen.kubebuilder.io/version: v0.19.0 | ||
| name: azurenodeclasses.karpenter.azure.com | ||
| spec: | ||
| group: karpenter.azure.com | ||
| names: | ||
| categories: | ||
| - karpenter | ||
| kind: AzureNodeClass | ||
| listKind: AzureNodeClassList | ||
| plural: azurenodeclasses | ||
| shortNames: | ||
| - aznc | ||
| - azncs | ||
| singular: azurenodeclass | ||
| scope: Cluster | ||
| versions: | ||
| - additionalPrinterColumns: | ||
| - jsonPath: .status.conditions[?(@.type=='Ready')].status | ||
| name: Ready | ||
| type: string | ||
| - jsonPath: .metadata.creationTimestamp | ||
| name: Age | ||
| type: date | ||
| - jsonPath: .spec.imageID | ||
| name: ImageID | ||
| priority: 1 | ||
| type: string | ||
| name: v1alpha1 | ||
| schema: | ||
| openAPIV3Schema: | ||
| description: |- | ||
| AzureNodeClass is the Schema for the AzureNodeClass API. | ||
| AzureNodeClass is a more generic node class for provisioning Azure VMs | ||
| that are not necessarily managed by AKS. It supports custom images, | ||
| custom bootstrap data (userData), and per-NodeClass identity configuration. | ||
| properties: | ||
| apiVersion: | ||
| description: |- | ||
| APIVersion defines the versioned schema of this representation of an object. | ||
| Servers should convert recognized schemas to the latest internal value, and | ||
| may reject unrecognized values. | ||
| More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | ||
| type: string | ||
| kind: | ||
| description: |- | ||
| Kind is a string value representing the REST resource this object represents. | ||
| Servers may infer this from the endpoint the client submits requests to. | ||
| Cannot be updated. | ||
| In CamelCase. | ||
| More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | ||
| type: string | ||
| metadata: | ||
| type: object | ||
| spec: | ||
| description: |- | ||
| spec is the top level specification for the Azure Karpenter Provider. | ||
| This will contain configuration necessary to launch instances in Azure. | ||
| properties: | ||
| imageID: | ||
| description: |- | ||
| imageID is the ARM resource ID of the image that instances use. | ||
| This can be a Compute Gallery image, Shared Image Gallery image, Community Gallery image, | ||
| or any valid Azure image resource ID. | ||
| When set, imageFamily-based image resolution is bypassed entirely. | ||
| The user is responsible for ensuring the image is compatible with the selected instance types. | ||
| Examples: | ||
| /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.Compute/galleries/{gallery}/images/{image}/versions/{version} | ||
| /CommunityGalleries/{gallery}/images/{image}/versions/{version} | ||
| maxLength: 1024 | ||
| pattern: (?i)^(\/subscriptions\/[^\/]+\/resourceGroups\/[^\/]+\/providers\/Microsoft\.Compute\/.*|\/CommunityGalleries\/[^\/]+\/images\/[^\/]+\/versions\/[^\/]+)$ | ||
| type: string | ||
| managedIdentities: | ||
| description: |- | ||
| managedIdentities is a list of user-assigned managed identity resource IDs | ||
| to attach to provisioned VMs. These are merged with any global identities | ||
| configured via the --node-identities flag. | ||
| items: | ||
| type: string | ||
| maxItems: 10 | ||
| type: array | ||
| osDiskSizeGB: | ||
| description: osDiskSizeGB is the size of the OS disk in GB. | ||
| format: int32 | ||
| maximum: 4096 | ||
| minimum: 30 | ||
| type: integer | ||
| security: | ||
| description: security is a collection of security related karpenter | ||
| fields. | ||
| properties: | ||
| encryptionAtHost: | ||
| description: |- | ||
| encryptionAtHost specifies whether host-level encryption is enabled for provisioned nodes. | ||
| For more information, see: | ||
| https://learn.microsoft.com/en-us/azure/virtual-machines/disk-encryption#encryption-at-host---end-to-end-encryption-for-your-vm-data | ||
| type: boolean | ||
| type: object | ||
| tags: | ||
| additionalProperties: | ||
| type: string | ||
| description: tags to be applied on Azure resources like instances. | ||
| type: object | ||
| x-kubernetes-validations: | ||
| - message: tags keys must be less than 512 characters | ||
| rule: self.all(k, size(k) <= 512) | ||
| - message: tags keys must not contain '<', '>', '%', '&', or '?' | ||
| rule: self.all(k, !k.matches('[<>%&?]')) | ||
| - message: tags keys must not contain '\' | ||
| rule: self.all(k, !k.contains('\\')) | ||
| - message: tags values must be less than 256 characters | ||
| rule: self.all(k, size(self[k]) <= 256) | ||
| userData: | ||
| description: |- | ||
| userData is the base64-encoded custom data that will be passed to the VM at creation time. | ||
| The caller must pre-encode their cloud-init or bootstrap script to base64, as the Azure API | ||
| expects this field to contain a base64-encoded string. | ||
| The user is fully responsible for providing valid bootstrap/cloud-init data. | ||
| When this field is set, no Karpenter-managed bootstrapping is performed. | ||
| maxLength: 87380 | ||
| type: string | ||
| vnetSubnetID: | ||
| description: |- | ||
| vnetSubnetID is the subnet used by nics provisioned with this nodeclass. | ||
| If not specified, we will use the default --vnet-subnet-id specified in karpenter's options config. | ||
| pattern: (?i)^\/subscriptions\/[^\/]+\/resourceGroups\/[a-zA-Z0-9_\-().]{0,89}[a-zA-Z0-9_\-()]\/providers\/Microsoft\.Network\/virtualNetworks\/[^\/]+\/subnets\/[^\/]+$ | ||
| type: string | ||
| type: object | ||
| status: | ||
| description: status contains the resolved state of the AzureNodeClass. | ||
| properties: | ||
| conditions: | ||
| description: conditions contains signals for health and readiness | ||
| items: | ||
| description: Condition aliases the upstream type and adds additional | ||
| helper methods | ||
| properties: | ||
| lastTransitionTime: | ||
| description: |- | ||
| lastTransitionTime is the last time the condition transitioned from one status to another. | ||
| This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable. | ||
| format: date-time | ||
| type: string | ||
| message: | ||
| description: |- | ||
| message is a human readable message indicating details about the transition. | ||
| This may be an empty string. | ||
| maxLength: 32768 | ||
| type: string | ||
| observedGeneration: | ||
| description: |- | ||
| observedGeneration represents the .metadata.generation that the condition was set based upon. | ||
| For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date | ||
| with respect to the current state of the instance. | ||
| format: int64 | ||
| minimum: 0 | ||
| type: integer | ||
| reason: | ||
| description: |- | ||
| reason contains a programmatic identifier indicating the reason for the condition's last transition. | ||
| Producers of specific condition types may define expected values and meanings for this field, | ||
| and whether the values are considered a guaranteed API. | ||
| The value should be a CamelCase string. | ||
| This field may not be empty. | ||
| maxLength: 1024 | ||
| minLength: 1 | ||
| pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$ | ||
| type: string | ||
| status: | ||
| description: status of the condition, one of True, False, Unknown. | ||
| enum: | ||
| - "True" | ||
| - "False" | ||
| - Unknown | ||
| type: string | ||
| type: | ||
| description: type of condition in CamelCase or in foo.example.com/CamelCase. | ||
| maxLength: 316 | ||
| pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$ | ||
| type: string | ||
| required: | ||
| - lastTransitionTime | ||
| - message | ||
| - reason | ||
| - status | ||
| - type | ||
| type: object | ||
| type: array | ||
| type: object | ||
| type: object | ||
| served: true | ||
| storage: true | ||
| subresources: | ||
| status: {} |
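
The `spec.userData` and `spec.imageID` constraints in the CRD above can be checked client-side before a manifest is applied. The following is a minimal sketch, not code from this PR: `encodeUserData` and `maxUserDataLen` are hypothetical names, and the regular expression is copied from the CRD's `imageID` pattern. The `maxLength` of 87380 is exactly the base64-encoded length of 65535 raw bytes, so the helper encodes first and checks the encoded length.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"regexp"
)

// maxUserDataLen mirrors the CRD's maxLength for spec.userData (hypothetical constant name).
const maxUserDataLen = 87380

// imageIDPattern is copied verbatim from the CRD's spec.imageID pattern.
var imageIDPattern = regexp.MustCompile(`(?i)^(\/subscriptions\/[^\/]+\/resourceGroups\/[^\/]+\/providers\/Microsoft\.Compute\/.*|\/CommunityGalleries\/[^\/]+\/images\/[^\/]+\/versions\/[^\/]+)$`)

// encodeUserData (hypothetical helper) base64-encodes a bootstrap script and
// rejects results that would exceed the CRD's maxLength.
func encodeUserData(script []byte) (string, error) {
	encoded := base64.StdEncoding.EncodeToString(script)
	if len(encoded) > maxUserDataLen {
		return "", fmt.Errorf("encoded userData is %d characters, limit is %d", len(encoded), maxUserDataLen)
	}
	return encoded, nil
}

func main() {
	userData, err := encodeUserData([]byte("#!/bin/bash\necho bootstrap\n"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("userData:", userData)

	// Check a Community Gallery style ID against the CRD pattern.
	id := "/CommunityGalleries/example/images/example/versions/1.0.0"
	fmt.Println("imageID matches pattern:", imageIDPattern.MatchString(id))
}
```

The API server still enforces both constraints on admission; a local check like this only surfaces the failure earlier.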
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,136 @@ | ||
| # AzureVM Provision Mode | ||
|
|
||
| **Author:** @comtalyst | ||
|
|
||
| **Last updated:** March 7, 2026 | ||
|
|
||
| **Status:** Proposed | ||
|
|
||
| ## Overview | ||
|
|
||
| AzureVM provision mode enables Karpenter to provision standalone Azure Virtual Machines that are **not** part of an AKS cluster. This opens Karpenter as a general-purpose VM autoscaler for any Kubernetes distribution running on Azure (e.g., self-managed k8s, Rancher, OpenShift, Talos). | ||
|
|
||
| In existing AKS modes (`AKSMachineAPI`, `AKSScriptless`, `BootstrappingClient`), Karpenter relies heavily on AKS-specific infrastructure: image family resolution via AKS VHD build system, node bootstrapping via AKS's cloud-init/CSE pipeline, AKS billing extensions, and AKS load balancer backend pool management. AzureVM mode bypasses all of these, giving the user full control over image selection and node bootstrapping. | ||
|
|
||
| ### Goals | ||
|
|
||
| * Allow Karpenter to provision VMs outside of AKS clusters | ||
| * Support user-provided VM images (Compute Gallery, SIG, or any ARM image resource) | ||
| * Support user-provided bootstrap data (cloud-init / custom scripts via `userData`) | ||
| * Support per-NodeClass subscription, resource group, and location overrides for multi-subscription deployments | ||
| * Support per-NodeClass managed identity assignment | ||
| * Support optional data disk attachment | ||
| * Maintain backward compatibility — existing AKS modes are unaffected | ||
|
|
||
| ### Non-Goals | ||
|
|
||
| * Windows VM support (Linux only for now) | ||
| * Automatic image updates or OS patching | ||
| * Karpenter-managed node bootstrapping (the user provides their own) | ||
| * AKS billing extension or AKS identifying extension in AzureVM mode | ||
| * Node auto-join to AKS clusters (use AKS modes for that) | ||
|
|
||
| ## Architecture | ||
|
|
||
| ### New CRD: AzureNodeClass (karpenter.azure.com/v1alpha1) | ||
|
|
||
| A new CRD `AzureNodeClass` is introduced alongside the existing `AKSNodeClass`. It contains fields relevant to generic Azure VM provisioning: | ||
|
|
||
| ```yaml | ||
| apiVersion: karpenter.azure.com/v1alpha1 | ||
| kind: AzureNodeClass | ||
| metadata: | ||
| name: my-nodeclass | ||
| spec: | ||
| imageID: "/subscriptions/.../Microsoft.Compute/galleries/.../versions/1.0.0" | ||
| userData: "#!/bin/bash\nkubeadm join ..." | ||
| vnetSubnetID: "/subscriptions/.../subnets/worker-subnet" | ||
| osDiskSizeGB: 128 | ||
| dataDiskSizeGB: 256 | ||
| subscriptionID: "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee" | ||
| resourceGroup: "my-custom-rg" | ||
| location: "westus2" | ||
| managedIdentities: | ||
| - "/subscriptions/.../userAssignedIdentities/my-identity" | ||
| tags: | ||
| environment: production | ||
| security: | ||
| encryptionAtHost: true | ||
| ``` | ||
|
|
||
| ### Adapter Pattern | ||
|
|
||
| Internally, `AzureNodeClass` is converted to `AKSNodeClass` via an adapter function (`AKSNodeClassFromAzureNodeClass`). The VM provider always operates on `AKSNodeClass`, with AzureVM-specific fields carried via `json:"-"` fields that don't appear in the AKSNodeClass CRD schema: | ||
|
|
||
| ``` | ||
| AzureNodeClass → adapter → AKSNodeClass (with hidden fields) → VM Provider | ||
| ``` | ||
|
|
||
| This avoids duplicating the entire VM provider and keeps the code path unified. | ||
|
|
||
| ### Provider Behavior by Mode | ||
|
|
||
| | Behavior | AKS Modes | AzureVM Mode | | ||
| |---|---|---| | ||
| | Image resolution | AKS VHD image families | User-provided `imageID` | | ||
| | Node bootstrap | AKS cloud-init / CSE | User-provided `userData` | | ||
| | LB backend pools | Configured from AKS LB | Skipped | | ||
| | NSG lookup | AKS-managed NSG | Skipped | | ||
| | Billing extension | Installed | Skipped | | ||
| | Identifying extension | Installed | Skipped | | ||
| | CSE extension | Installed (bootstrappingclient) | Skipped | | ||
| | K8s version validation | Required | Skipped | | ||
| | Data disks | Not supported | Optional via `dataDiskSizeGB` | | ||
| | Multi-subscription | Not supported | Optional via `subscriptionID` | | ||
|
|
||
| ## Decisions | ||
|
|
||
| ### Decision 1: Separate CRD vs. extending AKSNodeClass | ||
|
|
||
| #### Option A: Add all fields to AKSNodeClass | ||
| Pro: Single CRD. Con: Pollutes the AKSNodeClass with non-AKS fields; confusing UX for AKS users. | ||
|
|
||
| #### Option B: New AzureNodeClass CRD with adapter pattern | ||
| Pro: Clean separation of concerns; each CRD has only the fields relevant to its use case. Con: Slight code complexity from the adapter. | ||
|
|
||
| #### Conclusion: Option B | ||
| The adapter pattern keeps the AKSNodeClass API clean and focused on AKS, while AzureNodeClass serves the standalone VM use case. The adapter is a simple mapping function, not a complex abstraction layer. | ||
|
|
||
| ### Decision 2: Multi-subscription client management | ||
|
|
||
| #### Option A: Create new SDK clients per-request | ||
| Pro: Simple. Con: Expensive — Azure SDK client creation involves HTTP transport setup. | ||
|
|
||
| #### Option B: Lazy, cached per-subscription client pool (AZClientManager) | ||
| Pro: Clients are created once per subscription and reused. Thread-safe via double-checked locking. Con: Slight memory overhead for cached clients. | ||
|
|
||
| #### Conclusion: Option B | ||
| `AZClientManager` provides `GetClients(subscriptionID)` which returns cached `SubscriptionClients` (containing VirtualMachinesClient, VirtualMachineExtensionsClient, NetworkInterfacesClient, SubnetsClient). Default subscription returns the existing AZClient's clients directly. | ||
|
|
||
| ### Decision 3: Data disk configuration | ||
|
|
||
| Data disks are configured as Premium_LRS managed disks attached at LUN 0 with auto-delete on VM termination. This is a simple, opinionated default suitable for container runtime storage. Future iterations may support multiple disks, custom storage account types, or per-disk configuration. | ||
|
|
||
| ## PR Chain | ||
|
|
||
| The feature is delivered as a chain of incremental PRs: | ||
|
|
||
| 1. **PR 1487 — AzureNodeClass CRD** (`dd8cb731`): Defines the new CRD and adapter | ||
| 2. **PR 1488 — AzureVM provision mode** (`ad6c5a2d`): Adds `--provision-mode=azurevm` flag with relaxed validation | ||
| 3. **PR 1489 — Azure VM provider** (`3bfa8942`): Core VM provider changes for AzureVM mode | ||
| 4. **PR 1497 — Multi-subscription + data disk** (`d0963558`): Per-NodeClass overrides, AZClientManager, data disk | ||
|
|
||
| ## Testing | ||
|
|
||
| * Unit tests for all new helper functions (configureStorageProfile, configureOSProfile, buildVMIdentity, configureDataDisk, resolveEffectiveClients) | ||
| * Unit tests for AZClientManager (default subscription, lazy creation) | ||
| * Unit tests for AKSNodeClassFromAzureNodeClass adapter (all field mappings) | ||
| * Unit tests for GetManagedExtensionNames (AzureVM mode returns no extensions) | ||
| * E2E tests planned with self-managed k8s cluster using custom images | ||
|
|
||
| ## Production Readiness | ||
|
|
||
| * **RBAC**: The controller's managed identity / service principal must have VM Contributor and Network Contributor roles in any target subscription | ||
| * **Quotas**: Standard Azure VM quotas apply per-subscription | ||
| * **Observability**: Existing Karpenter metrics (vm_create_start, vm_create_failure) apply. Error codes are extracted via `ErrorCodeForMetrics` | ||
| * **Upgrade path**: AzureNodeClass is v1alpha1; field changes before GA are expected | ||
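
Decision 2 in the design doc above describes a lazy per-subscription client cache that is thread-safe via double-checked locking. The sketch below illustrates that pattern only; it is not the PR's `AZClientManager`. `clientCache`, `newClientCache`, and the single-field `SubscriptionClients` stand-in are hypothetical, and the default-subscription shortcut mentioned in the doc is not modeled.

```go
package main

import (
	"fmt"
	"sync"
)

// SubscriptionClients is a stand-in for the bundle of per-subscription SDK
// clients; the real type would hold the Azure clients listed in the doc.
type SubscriptionClients struct {
	SubscriptionID string
}

// clientCache is a hypothetical, simplified analogue of AZClientManager.
type clientCache struct {
	mu      sync.RWMutex
	clients map[string]*SubscriptionClients
	newFn   func(subscriptionID string) (*SubscriptionClients, error)
}

func newClientCache(newFn func(string) (*SubscriptionClients, error)) *clientCache {
	return &clientCache{clients: map[string]*SubscriptionClients{}, newFn: newFn}
}

// GetClients returns the cached clients for a subscription, creating them once.
func (c *clientCache) GetClients(subscriptionID string) (*SubscriptionClients, error) {
	// First check: cheap read lock for the common, already-cached case.
	c.mu.RLock()
	cached, ok := c.clients[subscriptionID]
	c.mu.RUnlock()
	if ok {
		return cached, nil
	}

	// Second check: another goroutine may have created the entry while we
	// were waiting for the write lock.
	c.mu.Lock()
	defer c.mu.Unlock()
	if cached, ok := c.clients[subscriptionID]; ok {
		return cached, nil
	}
	created, err := c.newFn(subscriptionID)
	if err != nil {
		return nil, err
	}
	c.clients[subscriptionID] = created
	return created, nil
}

func main() {
	created := 0
	cache := newClientCache(func(subscriptionID string) (*SubscriptionClients, error) {
		created++ // only ever runs while the write lock is held
		return &SubscriptionClients{SubscriptionID: subscriptionID}, nil
	})

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if _, err := cache.GetClients("sub-a"); err != nil {
				fmt.Println("error:", err)
			}
		}()
	}
	wg.Wait()
	fmt.Println("clients created:", created)
}
```

Holding the write lock while the factory runs serializes client creation across subscriptions. That is the simple choice; a per-key `sync.Once` would reduce contention at the cost of more bookkeeping.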
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -30,6 +30,8 @@ var ( | ||
| //CompatibilityGroup = "compatibility." + Group | ||
| //go:embed crds/karpenter.azure.com_aksnodeclasses.yaml | ||
| AKSNodeClassCRD []byte | ||
| //go:embed crds/karpenter.azure.com_azurenodeclasses.yaml | ||
| AzureNodeClassCRD []byte | ||
| //go:embed crds/karpenter.sh_nodepools.yaml | ||
| NodePoolCRD []byte | ||
| //go:embed crds/karpenter.sh_nodeclaims.yaml | ||
| @@ -38,6 +40,7 @@ var ( | ||
| NodeOverlayCRD []byte | ||
| CRDs = []*apiextensionsv1.CustomResourceDefinition{ | ||
| object.Unmarshal[apiextensionsv1.CustomResourceDefinition](AKSNodeClassCRD), | ||
| object.Unmarshal[apiextensionsv1.CustomResourceDefinition](AzureNodeClassCRD), | ||
| Review comment (Collaborator, Author): Question: Adding | ||
| object.Unmarshal[apiextensionsv1.CustomResourceDefinition](NodePoolCRD), | ||
| object.Unmarshal[apiextensionsv1.CustomResourceDefinition](NodeClaimCRD), | ||
| object.Unmarshal[apiextensionsv1.CustomResourceDefinition](NodeOverlayCRD), | ||
Issue: The design doc YAML example includes fields that don't exist in this PR's CRD: `subscriptionID`, `resourceGroup`, `location`, and `dataDiskSizeGB`. These are added in PR #1497 later in the chain. This creates a confusing artifact where the design doc advertises an API surface that doesn't match the CRD in this commit. Consider either: … are added in a subsequent PR" to set expectations.