|
| 1 | +# AzureVM Provision Mode |
| 2 | + |
| 3 | +**Author:** @comtalyst |
| 4 | + |
| 5 | +**Last updated:** March 7, 2026 |
| 6 | + |
| 7 | +**Status:** Proposed |
| 8 | + |
| 9 | +## Overview |
| 10 | + |
| 11 | +AzureVM provision mode enables Karpenter to provision standalone Azure Virtual Machines that are **not** part of an AKS cluster. This lets Karpenter act as a general-purpose VM autoscaler for any Kubernetes distribution running on Azure (e.g., self-managed Kubernetes, Rancher, OpenShift, Talos). |
| 12 | + |
| 13 | +In the existing AKS modes (`AKSMachineAPI`, `AKSScriptless`, `BootstrappingClient`), Karpenter relies heavily on AKS-specific infrastructure: image family resolution via the AKS VHD build system, node bootstrapping via AKS's cloud-init/CSE pipeline, AKS billing extensions, and AKS load balancer backend pool management. AzureVM mode bypasses all of these, giving the user full control over image selection and node bootstrapping. |
| 14 | + |
| 15 | +### Goals |
| 16 | + |
| 17 | +* Allow Karpenter to provision VMs outside of AKS clusters |
| 18 | +* Support user-provided VM images (Azure Compute Gallery, formerly Shared Image Gallery (SIG), or any other ARM image resource) |
| 19 | +* Support user-provided bootstrap data (cloud-init / custom scripts via `userData`) |
| 20 | +* Support per-NodeClass subscription, resource group, and location overrides for multi-subscription deployments |
| 21 | +* Support per-NodeClass managed identity assignment |
| 22 | +* Support optional data disk attachment |
| 23 | +* Maintain backward compatibility — existing AKS modes are unaffected |
| 24 | + |
| 25 | +### Non-Goals |
| 26 | + |
| 27 | +* Windows VM support (Linux only for now) |
| 28 | +* Automatic image updates or OS patching |
| 29 | +* Karpenter-managed node bootstrapping (the user provides their own) |
| 30 | +* AKS billing extension or AKS identifying extension in AzureVM mode |
| 31 | +* Node auto-join to AKS clusters (use AKS modes for that) |
| 32 | + |
| 33 | +## Architecture |
| 34 | + |
| 35 | +### New CRD: AzureNodeClass (karpenter.azure.com/v1alpha1) |
| 36 | + |
| 37 | +A new CRD `AzureNodeClass` is introduced alongside the existing `AKSNodeClass`. It contains fields relevant to generic Azure VM provisioning: |
| 38 | + |
| 39 | +```yaml |
| 40 | +apiVersion: karpenter.azure.com/v1alpha1 |
| 41 | +kind: AzureNodeClass |
| 42 | +metadata: |
| 43 | +  name: my-nodeclass |
| 44 | +spec: |
| 45 | +  imageID: "/subscriptions/.../Microsoft.Compute/galleries/.../versions/1.0.0" |
| 46 | +  userData: "#!/bin/bash\nkubeadm join ..." |
| 47 | +  vnetSubnetID: "/subscriptions/.../subnets/worker-subnet" |
| 48 | +  osDiskSizeGB: 128 |
| 49 | +  dataDiskSizeGB: 256 |
| 50 | +  subscriptionID: "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee" |
| 51 | +  resourceGroup: "my-custom-rg" |
| 52 | +  location: "westus2" |
| 53 | +  managedIdentities: |
| 54 | +    - "/subscriptions/.../userAssignedIdentities/my-identity" |
| 55 | +  tags: |
| 56 | +    environment: production |
| 57 | +  security: |
| 58 | +    encryptionAtHost: true |
| 59 | +``` |
| 60 | +
|
| 61 | +### Adapter Pattern |
| 62 | +
|
| 63 | +Internally, `AzureNodeClass` is converted to `AKSNodeClass` via an adapter function (`AKSNodeClassFromAzureNodeClass`). The VM provider always operates on `AKSNodeClass`. AzureVM-specific values are carried in fields tagged `json:"-"`, which are excluded from JSON serialization and therefore do not appear in the AKSNodeClass CRD schema: |
| 64 | + |
| 65 | +``` |
| 66 | +AzureNodeClass → adapter → AKSNodeClass (with hidden fields) → VM Provider |
| 67 | +``` |
| 68 | +
|
| 69 | +This avoids duplicating the entire VM provider and keeps the code path unified. |
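
The adapter idea can be sketched as below. This is a minimal illustration, not the actual implementation: the two spec structs are hypothetical, trimmed-down stand-ins for the real types, the signature of `AKSNodeClassFromAzureNodeClass` is assumed, and `schemaJSON` is a helper invented here to show that `json:"-"` fields stay out of the serialized object.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AzureNodeClassSpec is a hypothetical, trimmed subset of the AzureNodeClass spec.
type AzureNodeClassSpec struct {
	ImageID      string `json:"imageID"`
	UserData     string `json:"userData,omitempty"`
	VNETSubnetID string `json:"vnetSubnetID,omitempty"`
	OSDiskSizeGB int32  `json:"osDiskSizeGB,omitempty"`
}

// AKSNodeClassSpec is a hypothetical, trimmed subset of the AKSNodeClass spec.
type AKSNodeClassSpec struct {
	VNETSubnetID string `json:"vnetSubnetID,omitempty"`
	OSDiskSizeGB int32  `json:"osDiskSizeGB,omitempty"`

	// Hidden fields: populated only by the adapter, never serialized,
	// so they do not become part of the AKSNodeClass schema.
	ImageID  string `json:"-"`
	UserData string `json:"-"`
}

// AKSNodeClassFromAzureNodeClass maps shared fields one-to-one and carries
// AzureVM-only values in the hidden fields (assumed signature).
func AKSNodeClassFromAzureNodeClass(in AzureNodeClassSpec) AKSNodeClassSpec {
	return AKSNodeClassSpec{
		VNETSubnetID: in.VNETSubnetID,
		OSDiskSizeGB: in.OSDiskSizeGB,
		ImageID:      in.ImageID,
		UserData:     in.UserData,
	}
}

// schemaJSON returns the serialized form of the adapted spec.
func schemaJSON(s AKSNodeClassSpec) string {
	b, err := json.Marshal(s)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	out := AKSNodeClassFromAzureNodeClass(AzureNodeClassSpec{
		ImageID:      "example-image-id",
		UserData:     "#!/bin/bash",
		VNETSubnetID: "example-subnet-id",
		OSDiskSizeGB: 128,
	})
	// The provider can read out.ImageID in memory, but it is absent from the JSON.
	fmt.Println(out.ImageID)
	fmt.Println(schemaJSON(out))
}
```

The hidden fields exist only in memory for the lifetime of the adapted object, which is why the provider can consume them without any change to the published AKSNodeClass API.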
| 70 | +
|
| 71 | +### Provider Behavior by Mode |
| 72 | +
|
| 73 | +| Behavior | AKS Modes | AzureVM Mode | |
| 74 | +|---|---|---| |
| 75 | +| Image resolution | AKS VHD image families | User-provided `imageID` | |
| 76 | +| Node bootstrap | AKS cloud-init / CSE | User-provided `userData` | |
| 77 | +| LB backend pools | Configured from AKS LB | Skipped | |
| 78 | +| NSG lookup | AKS-managed NSG | Skipped | |
| 79 | +| Billing extension | Installed | Skipped | |
| 80 | +| Identifying extension | Installed | Skipped | |
| 81 | +| CSE extension | Installed (bootstrappingclient) | Skipped | |
| 82 | +| K8s version validation | Required | Skipped | |
| 83 | +| Data disks | Not supported | Optional via `dataDiskSizeGB` | |
| 84 | +| Multi-subscription | Not supported | Optional via `subscriptionID` | |
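
One way to read the extension rows of this table is as a single mode switch. The sketch below assumes a string-typed provision mode and uses placeholder extension names; only the `azurevm` and `bootstrappingclient` mode values and the function name `GetManagedExtensionNames` come from this document, and the signature shown is an assumption.

```go
package main

import "fmt"

// ProvisionMode is a hypothetical string type for the provision-mode setting.
type ProvisionMode string

const (
	ModeAKSScriptless       ProvisionMode = "aksscriptless" // assumed value
	ModeBootstrappingClient ProvisionMode = "bootstrappingclient"
	ModeAzureVM             ProvisionMode = "azurevm"
)

// Placeholder extension names, for illustration only.
const (
	billingExtension     = "billing-extension"
	identifyingExtension = "identifying-extension"
	cseExtension         = "cse-extension"
)

// GetManagedExtensionNames returns the VM extensions Karpenter would manage
// in the given mode (assumed signature). AzureVM mode manages none.
func GetManagedExtensionNames(mode ProvisionMode) []string {
	switch mode {
	case ModeAzureVM:
		return nil
	case ModeBootstrappingClient:
		return []string{billingExtension, identifyingExtension, cseExtension}
	default:
		return []string{billingExtension, identifyingExtension}
	}
}

func main() {
	for _, m := range []ProvisionMode{ModeAKSScriptless, ModeBootstrappingClient, ModeAzureVM} {
		fmt.Println(m, GetManagedExtensionNames(m))
	}
}
```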
| 85 | +
|
| 86 | +## Decisions |
| 87 | +
|
| 88 | +### Decision 1: Separate CRD vs. extending AKSNodeClass |
| 89 | +
|
| 90 | +#### Option A: Add all fields to AKSNodeClass |
| 91 | +Pro: Single CRD. Con: Pollutes the AKSNodeClass with non-AKS fields; confusing UX for AKS users. |
| 92 | +
|
| 93 | +#### Option B: New AzureNodeClass CRD with adapter pattern |
| 94 | +Pro: Clean separation of concerns; each CRD has only the fields relevant to its use case. Con: Slight code complexity from the adapter. |
| 95 | +
|
| 96 | +#### Conclusion: Option B |
| 97 | +The adapter pattern keeps the AKSNodeClass API clean and focused on AKS, while AzureNodeClass serves the standalone VM use case. The adapter is a simple mapping function, not a complex abstraction layer. |
| 98 | +
|
| 99 | +### Decision 2: Multi-subscription client management |
| 100 | +
|
| 101 | +#### Option A: Create new SDK clients per-request |
| 102 | +Pro: Simple. Con: Expensive — Azure SDK client creation involves HTTP transport setup. |
| 103 | +
|
| 104 | +#### Option B: Lazy, cached per-subscription client pool (AZClientManager) |
| 105 | +Pro: Clients are created once per subscription and reused. Thread-safe via double-checked locking. Con: Slight memory overhead for cached clients. |
| 106 | +
|
| 107 | +#### Conclusion: Option B |
| 108 | +`AZClientManager` provides `GetClients(subscriptionID)`, which returns cached `SubscriptionClients` (containing VirtualMachinesClient, VirtualMachineExtensionsClient, NetworkInterfacesClient, SubnetsClient). For the default subscription, it returns the existing AZClient's clients directly instead of creating new ones. |
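
A lazily populated, double-checked cache along these lines might look like the following sketch. `SubscriptionClients` is reduced to a single field, the constructor and the injected factory function are hypothetical, and treating an empty subscription ID as "use the default" is an assumption made for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// SubscriptionClients stands in for the real bundle of Azure SDK clients.
type SubscriptionClients struct {
	SubscriptionID string
}

// AZClientManager caches one SubscriptionClients per subscription.
type AZClientManager struct {
	defaultSubscriptionID string
	defaultClients        *SubscriptionClients
	newClients            func(subscriptionID string) (*SubscriptionClients, error)

	mu    sync.RWMutex
	cache map[string]*SubscriptionClients
}

// NewAZClientManager is a hypothetical constructor; the factory is injected
// so the sketch does not depend on the Azure SDK.
func NewAZClientManager(defaultSub string, defaults *SubscriptionClients,
	factory func(string) (*SubscriptionClients, error)) *AZClientManager {
	return &AZClientManager{
		defaultSubscriptionID: defaultSub,
		defaultClients:        defaults,
		newClients:            factory,
		cache:                 map[string]*SubscriptionClients{},
	}
}

// GetClients returns cached clients, creating them at most once per subscription.
func (m *AZClientManager) GetClients(subscriptionID string) (*SubscriptionClients, error) {
	// Default subscription: reuse the existing clients, no cache entry needed.
	if subscriptionID == "" || subscriptionID == m.defaultSubscriptionID {
		return m.defaultClients, nil
	}
	// First check under the cheap read lock.
	m.mu.RLock()
	c, ok := m.cache[subscriptionID]
	m.mu.RUnlock()
	if ok {
		return c, nil
	}
	// Second check under the write lock: another goroutine may have won the race.
	m.mu.Lock()
	defer m.mu.Unlock()
	if c, ok := m.cache[subscriptionID]; ok {
		return c, nil
	}
	c, err := m.newClients(subscriptionID)
	if err != nil {
		return nil, err
	}
	m.cache[subscriptionID] = c
	return c, nil
}

func main() {
	created := 0
	m := NewAZClientManager("sub-default", &SubscriptionClients{SubscriptionID: "sub-default"},
		func(id string) (*SubscriptionClients, error) {
			created++
			return &SubscriptionClients{SubscriptionID: id}, nil
		})
	first, _ := m.GetClients("sub-other")
	second, _ := m.GetClients("sub-other")
	fmt.Println(first == second, created)
}
```

Failed creations are not cached in this sketch, so a transient error for one subscription is retried on the next call.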
| 109 | +
|
| 110 | +### Decision 3: Data disk configuration |
| 111 | +
|
| 112 | +When `dataDiskSizeGB` is set, a single Premium_LRS managed data disk is attached at LUN 0 and is deleted automatically when the VM is terminated. This is a simple, opinionated default suitable for container runtime storage. Future iterations may support multiple disks, custom storage account types, or per-disk configuration. |
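
The opinionated default can be expressed in a few lines. The sketch below uses a local `DataDisk` struct as a stand-in for the Azure SDK type, and the `configureDataDisk` signature is assumed; the `Empty` create option is also an assumption, since this document does not say how the disk is created.

```go
package main

import "fmt"

// DataDisk is a hypothetical stand-in for the SDK's data disk type.
type DataDisk struct {
	LUN                int32
	DiskSizeGB         int32
	StorageAccountType string
	CreateOption       string
	DeleteOption       string
}

// configureDataDisk returns the data disk list for a VM, or nil when no
// data disk was requested (assumed signature).
func configureDataDisk(sizeGB *int32) []DataDisk {
	if sizeGB == nil || *sizeGB <= 0 {
		return nil
	}
	return []DataDisk{{
		LUN:                0,             // single disk, always at LUN 0
		DiskSizeGB:         *sizeGB,       // from dataDiskSizeGB
		StorageAccountType: "Premium_LRS", // fixed storage type
		CreateOption:       "Empty",       // assumption: a new blank disk
		DeleteOption:       "Delete",      // removed together with the VM
	}}
}

func main() {
	size := int32(256)
	fmt.Printf("%+v\n", configureDataDisk(&size))
	fmt.Println(len(configureDataDisk(nil)))
}
```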
| 113 | +
|
| 114 | +## PR Chain |
| 115 | +
|
| 116 | +The feature is delivered as a chain of incremental PRs: |
| 117 | +
|
| 118 | +1. **PR 1487 — AzureNodeClass CRD** (`dd8cb731`): Defines the new CRD and adapter |
| 119 | +2. **PR 1488 — AzureVM provision mode** (`ad6c5a2d`): Adds `--provision-mode=azurevm` flag with relaxed validation |
| 120 | +3. **PR 1489 — Azure VM provider** (`3bfa8942`): Core VM provider changes for AzureVM mode |
| 121 | +4. **PR 1497 — Multi-subscription + data disk** (`d0963558`): Per-NodeClass overrides, AZClientManager, data disk |
| 122 | +
|
| 123 | +## Testing |
| 124 | +
|
| 125 | +* Unit tests for all new helper functions (`configureStorageProfile`, `configureOSProfile`, `buildVMIdentity`, `configureDataDisk`, `resolveEffectiveClients`) |
| 126 | +* Unit tests for `AZClientManager` (default subscription, lazy creation) |
| 127 | +* Unit tests for the `AKSNodeClassFromAzureNodeClass` adapter (all field mappings) |
| 128 | +* Unit tests for `GetManagedExtensionNames` (AzureVM mode returns no extensions) |
| 129 | +* E2E tests planned with self-managed k8s cluster using custom images |
| 130 | +
|
| 131 | +## Production Readiness |
| 132 | +
|
| 133 | +* **RBAC**: The controller's managed identity / service principal must have the Virtual Machine Contributor and Network Contributor roles in every target subscription |
| 134 | +* **Quotas**: Standard Azure VM quotas apply per-subscription |
| 135 | +* **Observability**: Existing Karpenter metrics (`vm_create_start`, `vm_create_failure`) apply. Error codes are extracted via `ErrorCodeForMetrics` |
| 136 | +* **Upgrade path**: AzureNodeClass is v1alpha1; field changes before GA are expected |