KVM: Assign explicit PCI slot when hot-plugging NIC for sequential naming#12826
KVM: Assign explicit PCI slot when hot-plugging NIC for sequential naming#12826jmsperu wants to merge 2 commits into
Conversation
…ial naming When hot-plugging a NIC to a running VM, libvirt auto-assigns the next free PCI slot. Since non-NIC devices (virtio-serial, disk, balloon, watchdog) occupy slots immediately after existing NICs, the hot-plugged NIC gets a much higher slot number (e.g. 0x09 instead of 0x05), causing the guest to see non-sequential interface names (ens9 instead of ens5). This fix queries the domain XML to find all used PCI slots and assigns the next free slot after the highest existing NIC slot. This matches the approach already used by LibvirtReplugNicCommandWrapper which preserves PCI slots during re-plug operations. Fixes apache#12825 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## 4.20 #12826 +/- ##
============================================
+ Coverage 16.24% 16.26% +0.01%
- Complexity 13411 13436 +25
============================================
Files 5664 5666 +2
Lines 500463 500657 +194
Branches 60779 60804 +25
============================================
+ Hits 81308 81434 +126
- Misses 410059 410111 +52
- Partials 9096 9112 +16
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR aims to make KVM NIC hot-plug behavior more deterministic by explicitly setting a PCI slot for the newly attached NIC, with the goal of preserving predictable/sequential interface naming inside the guest.
Changes:
- Adds logic in
LibvirtPlugNicCommandWrapperto compute the next available PCI slot from the domain XML. - Sets the computed PCI slot on the
InterfaceDefbefore callingDomain.attachDevice(). - Introduces fallback behavior to let libvirt auto-assign the slot when XML cannot be read or when no slots appear available.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| String domXml = vm.getXMLDesc(0); | ||
|
|
||
| // Parse all PCI slot numbers currently in use | ||
| Set<Integer> usedSlots = new HashSet<>(); | ||
| Pattern slotPattern = Pattern.compile("slot='0x([0-9a-fA-F]+)'"); | ||
| Matcher matcher = slotPattern.matcher(domXml); | ||
| while (matcher.find()) { | ||
| usedSlots.add(Integer.parseInt(matcher.group(1), 16)); | ||
| } |
| /** | ||
| * Finds the next available PCI slot for a hot-plugged NIC by examining | ||
| * all PCI slots currently in use by the domain. This ensures the new NIC | ||
| * gets a sequential PCI address relative to existing NICs, resulting in | ||
| * predictable interface naming in the guest OS (e.g. ens5 instead of ens9). | ||
| */ | ||
| private Integer findNextAvailablePciSlot(final Domain vm, final List<InterfaceDef> pluggedNics) { | ||
| try { | ||
| String domXml = vm.getXMLDesc(0); | ||
|
|
||
| // Parse all PCI slot numbers currently in use | ||
| Set<Integer> usedSlots = new HashSet<>(); | ||
| Pattern slotPattern = Pattern.compile("slot='0x([0-9a-fA-F]+)'"); | ||
| Matcher matcher = slotPattern.matcher(domXml); | ||
| while (matcher.find()) { | ||
| usedSlots.add(Integer.parseInt(matcher.group(1), 16)); | ||
| } | ||
|
|
||
| // Find the highest PCI slot used by existing NICs | ||
| int maxNicSlot = 0; | ||
| for (InterfaceDef pluggedNic : pluggedNics) { | ||
| if (pluggedNic.getSlot() != null && pluggedNic.getSlot() > maxNicSlot) { | ||
| maxNicSlot = pluggedNic.getSlot(); | ||
| } | ||
| } | ||
|
|
||
| // Find next free slot starting from maxNicSlot + 1 | ||
| // PCI slots range from 0x01 to 0x1f (slot 0 is reserved for host bridge) | ||
| for (int slot = maxNicSlot + 1; slot <= 0x1f; slot++) { | ||
| if (!usedSlots.contains(slot)) { | ||
| return slot; | ||
| } | ||
| } | ||
|
|
||
| logger.warn("No free PCI slots available, letting libvirt auto-assign"); | ||
| return null; | ||
| } catch (LibvirtException e) { | ||
| logger.warn("Failed to get domain XML for PCI slot calculation, letting libvirt auto-assign", e); | ||
| return null; | ||
| } |
| // Explicitly assign PCI slot to ensure sequential NIC naming in the guest. | ||
| // Without this, libvirt auto-assigns the next free PCI slot which may be | ||
| // non-sequential with existing NICs (e.g. ens9 instead of ens5), causing | ||
| // guest network configuration to fail. | ||
| Integer nextSlot = findNextAvailablePciSlot(vm, pluggedNics); | ||
| if (nextSlot != null) { | ||
| interfaceDef.setSlot(nextSlot); | ||
| logger.debug("Assigning PCI slot 0x" + String.format("%02x", nextSlot) + " to hot-plugged NIC"); | ||
| } |
|
@jmsperu |
Test ResultsTested on CloudStack 4.22.0.0 with patched Environment
Test 1: Hot-plug NIC gets sequential PCI slot ✅BEFORE — VM with 1 NIC, PCI slots 0x01–0x07 in use: slot='0x01' (function 0x1) Action: Added NIC via AFTER — New NIC assigned slot 0x08 (sequentially after highest used slot 0x07): slot='0x01' (function 0x1) Test 2: VM remains functional ✅VM continued running throughout the hot-plug operation. Both NICs operational. Test 3: Fallback behavior ✅Covered by code review — Checklist
|
|
Two-month gentle ping. The test results posted on 2026-03-26 cover the scenarios I worked through — happy to address any additional verification asks. The current codecov/project FAILURE is on lines not touched by this patch (the change is a 1-line PCI-slot assignment fix in LibvirtVMDef.java). If anyone has a moment to take a look, I'm here. |
| // Parse all PCI slot numbers currently in use | ||
| Set<Integer> usedSlots = new HashSet<>(); | ||
| Pattern slotPattern = Pattern.compile("slot='0x([0-9a-fA-F]+)'"); | ||
| Matcher matcher = slotPattern.matcher(domXml); | ||
| while (matcher.find()) { | ||
| usedSlots.add(Integer.parseInt(matcher.group(1), 16)); | ||
| } | ||
|
|
||
| // Find the highest PCI slot used by existing NICs | ||
| int maxNicSlot = 0; | ||
| for (InterfaceDef pluggedNic : pluggedNics) { | ||
| if (pluggedNic.getSlot() != null && pluggedNic.getSlot() > maxNicSlot) { | ||
| maxNicSlot = pluggedNic.getSlot(); | ||
| } | ||
| } | ||
|
|
||
| // Find next free slot starting from maxNicSlot + 1 | ||
| // PCI slots range from 0x01 to 0x1f (slot 0 is reserved for host bridge) | ||
| for (int slot = maxNicSlot + 1; slot <= 0x1f; slot++) { | ||
| if (!usedSlots.contains(slot)) { | ||
| return slot; | ||
| } | ||
| } |
There was a problem hiding this comment.
I’d prefer these three loops to be three separate methods. (just a style remark but might be useful in code de-dup and stack-trace analysis at some point)
| // Find the highest PCI slot used by existing NICs | ||
| int maxNicSlot = 0; | ||
| for (InterfaceDef pluggedNic : pluggedNics) { | ||
| if (pluggedNic.getSlot() != null && pluggedNic.getSlot() > maxNicSlot) { | ||
| maxNicSlot = pluggedNic.getSlot(); | ||
| } | ||
| } | ||
|
|
||
| // Find next free slot starting from maxNicSlot + 1 | ||
| // PCI slots range from 0x01 to 0x1f (slot 0 is reserved for host bridge) | ||
| for (int slot = maxNicSlot + 1; slot <= 0x1f; slot++) { | ||
| if (!usedSlots.contains(slot)) { | ||
| return slot; | ||
| } | ||
| } |
| String domXml = vm.getXMLDesc(0); | ||
|
|
||
| // Parse all PCI slot numbers currently in use | ||
| Set<Integer> usedSlots = new HashSet<>(); | ||
| Pattern slotPattern = Pattern.compile("slot='0x([0-9a-fA-F]+)'"); | ||
| Matcher matcher = slotPattern.matcher(domXml); | ||
| while (matcher.find()) { | ||
| usedSlots.add(Integer.parseInt(matcher.group(1), 16)); | ||
| } |
| // Explicitly assign PCI slot to ensure sequential NIC naming in the guest. | ||
| // Without this, libvirt auto-assigns the next free PCI slot which may be | ||
| // non-sequential with existing NICs (e.g. ens9 instead of ens5), causing | ||
| // guest network configuration to fail. | ||
| Integer nextSlot = findNextAvailablePciSlot(vm, pluggedNics); | ||
| if (nextSlot != null) { | ||
| interfaceDef.setSlot(nextSlot); | ||
| logger.debug("Assigning PCI slot 0x" + String.format("%02x", nextSlot) + " to hot-plugged NIC"); | ||
| } | ||
|
|
||
| vm.attachDevice(interfaceDef.toString()); |
|
@blueorangutan package |
|
@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 18043 |
|
@jmsperu , unit test error on both deb and rpm: |
LibvirtPlugNicCommandWrapper.findNextAvailablePciSlot calls vm.getXMLDesc(0) and pipes the result straight into Pattern.matcher, which NPEs if libvirt returned null (or, in the LibvirtComputingResourceTest.testPlugNicCommandNoMatchMack unit test, when the Domain mock isn't stubbed for getXMLDesc). Reported by @DaanHoogland after the SL packaging run on apache#12826. Defensive null check returns null from findNextAvailablePciSlot when the domain XML can't be parsed, which falls through to libvirt's auto-assignment of the PCI slot — same behaviour as before this PR when nextSlot is null. Also stubs Domain.getXMLDesc(0) in testPlugNicCommandNoMatchMack with a minimal <domain> XML that exercises the parser path (rather than just relying on the null-fallback), so the test continues to cover the happy path of the new logic.
|
Thanks for the catch @DaanHoogland 🙏 You were right — Just pushed two fixes:
Will keep an eye on the next SL run. |
|
@blueorangutan package |
|
@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 18047 |
|
@blueorangutan test |
|
@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
|
[SF] Trillian test result (tid-16195)
|
Description
Fixes #12825
When hot-plugging a NIC to a running KVM VM, libvirt auto-assigns the next free PCI slot. Since non-NIC PCI devices (virtio-serial controller, virtio-disk, memballoon, watchdog) occupy slots immediately after existing NICs (slots 0x05-0x08), the hot-plugged NIC gets a much higher slot number (e.g., 0x09), causing the guest to see
ens9instead of the expected sequentialens5. The NIC shows as DOWN with no IP.Changes
Modified
LibvirtPlugNicCommandWrapper.javato explicitly assign a PCI slot when hot-plugging:This matches the approach already used by
LibvirtReplugNicCommandWrapperwhich preserves PCI slots during re-plug operations (line 70:interfaceDef.setSlot(oldPluggedNic.getSlot())).Falls back gracefully to libvirt auto-assignment if:
Evidence
Before fix — hot-plugged NIC at non-sequential PCI slot 0x09:
After relocating non-NIC devices to high slots (proving the concept):
Note: For complete sequential naming on existing VMs, a follow-up change is needed to assign non-NIC PCI devices to higher slot numbers during VM creation in
createDevicesDef().Test plan
🤖 Generated with Claude Code