USHIFT-7237: fix bootupd --update-firmware failure in QEMU/KVM VMs#6885
USHIFT-7237: fix bootupd --update-firmware failure in QEMU/KVM VMs#6885eslutsky wants to merge 2 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds ChangesBootloader Installation Workaround
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 14 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (14 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: eslutsky The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest |
c238125 to
bab7eb0
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/kickstart-templates/includes/main-prologue.cfg`:
- Around line 7-19: The current sed monkey-patch in the %pre section lacks
robustness and validation. First, make the sed pattern more precise by anchoring
it with `^` and matching the opening parenthesis (e.g., use `/^def
have_bootupd\(/` instead of `/def have_bootupd/`) to avoid unintended matches in
comments or docstrings. Second, add explicit validation after the sed command
completes to verify the return statement was actually injected into the util.py
file; if the injection fails, echo a clear error message to `/dev/console` and
exit with a non-zero status rather than silently failing. Finally, add log
messages to `/dev/console` to report whether the patch succeeded or what went
wrong, so debugging failures is easier when the installer fails later.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 5e613d65-f0df-4cf2-80cc-7a51416e0367
📒 Files selected for processing (1)
test/kickstart-templates/includes/main-prologue.cfg
| # Work around bootupd 0.2.31 regression where "bootupctl backend install --auto" | ||
| # fails with "No update metadata for component BIOS/EFI found" in QEMU/KVM VMs. | ||
| # Patch anaconda's have_bootupd() to return False so the legacy bootloader | ||
| # configuration path is used instead of calling bootupctl. | ||
| # The %pre section runs before anaconda imports the rpm-ostree payload module, | ||
| # so the patched source file is picked up on first import. | ||
| %pre --erroronfail --log=/dev/console | ||
| UTIL_FILE=$(find /usr/lib*/python3*/site-packages/pyanaconda/modules/payloads/payload/rpm_ostree -name util.py 2>/dev/null | head -1) | ||
| if [ -n "${UTIL_FILE}" ] && grep -q 'def have_bootupd' "${UTIL_FILE}"; then | ||
| sed -i '/def have_bootupd/a\ return False' "${UTIL_FILE}" | ||
| find "$(dirname "${UTIL_FILE}")" -name '__pycache__' -type d -exec rm -rf {} + 2>/dev/null || true | ||
| fi | ||
| %end |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
What is the pyanaconda rpm_ostree util.py have_bootupd() function signature and implementation in RHEL 9.8?
💡 Result:
In RHEL 9.8 (and consistent with recent versions of the Anaconda installer), the have_bootupd function is a utility designed to detect whether the bootupd utility is present and applicable for an OSTree-based installation [1][2]. Function Signature: The function signature is: def have_bootupd(sysroot) Implementation Details: The function is located in the pyanaconda/modules/payloads/payload/rpm_ostree/util.py module. While specific lines may vary slightly by patch level, the implementation typically checks for the existence of the bootupctl binary within the context of the target system root (sysroot) [1]. It is used by Anaconda, specifically within the rpm-ostree payload installation tasks, to determine whether to utilize bootupd for bootloader configuration (such as installing boot loader updates or writing UUIDs) or to fall back to traditional methods like moving GRUB configuration files [1][3][2]. For example, in the ConfigureBootloader task, the code performs the following check [1]: if have_bootupd(self._sysroot): self._install_bootupd else: self._move_grub_config This utility ensures that Anaconda can handle modern OSTree-based systems that require bootupd for distribution-independent bootloader updates [2][4].
Citations:
- 1: https://github.com/rhinstaller/anaconda/blob/main/pyanaconda/modules/payloads/payload/rpm_ostree/installation.py
- 2: Automatic bootupd support rhinstaller/anaconda#5342
- 3: Dual boot support for bootupd with static GRUB config installations fedora-silverblue/issue-tracker#530
- 4: https://github.com/cgwalters/bootupd/tree/use-bootc-utils
🏁 Script executed:
# Check the PR/commit context and objectives in the repository
git log --oneline --all test/kickstart-templates/includes/main-prologue.cfg | head -20Repository: openshift/microshift
Length of output: 465
🏁 Script executed:
# Look for any PR description, issue references, or related discussions in the repo
find . -type f \( -name "*.md" -o -name "*.txt" -o -name "*.rst" \) | xargs rg -l "bootloader.*leavebootorder|have_bootupd" 2>/dev/null | head -10Repository: openshift/microshift
Length of output: 46
🏁 Script executed:
# Check if there's a specific pyanaconda version constraint or reference in the codebase
rg "pyanaconda|anaconda" -t cfg -t ini -t toml test/ | grep -i "version\|require" | head -20Repository: openshift/microshift
Length of output: 92
🏁 Script executed:
# Inspect the actual kickstart file for more context around the patch
cat -n test/kickstart-templates/includes/main-prologue.cfg | head -30Repository: openshift/microshift
Length of output: 1696
🏁 Script executed:
# Check commit message and full context for the file
git show bab7eb077 --statRepository: openshift/microshift
Length of output: 50378
🏁 Script executed:
# Look for any references to bootloader, leavebootorder, or related discussions
rg "bootloader|leavebootorder|have_bootupd" -i . --type sh --type py --type md 2>/dev/null | grep -v "node_modules\|\.git" | head -30Repository: openshift/microshift
Length of output: 46
🏁 Script executed:
# Check if there are any issue/PR references in the repo directory
find . -name "*.md" -o -name "*.txt" | xargs cat 2>/dev/null | grep -i "bootloader\|leavebootorder" | head -10Repository: openshift/microshift
Length of output: 46
Remove silent failure mode and improve robustness of the sed patch.
The monkey-patch approach works but lacks validation. If the pyanaconda file doesn't exist, the function definition differs, or the sed injection fails for any reason, the script exits silently and the installer fails later with a cryptic "No update metadata for component BIOS/EFI found" error instead of failing fast with clear diagnostics. Additionally, the sed pattern /def have_bootupd/ is unanchored and could inadvertently match text in comments or docstrings.
Recommend:
- Add explicit validation: check that the sed injection succeeded (e.g., verify the return statement exists in the file afterwards), and exit with a clear error message if it didn't.
- Use a more precise sed pattern (e.g.,
/^def have_bootupd\(/or/^def have_bootupd(/) to reduce the risk of unintended matches. - Consider adding a log message (to
/dev/consolesince--logis already set) to report success or failure for debugging.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/kickstart-templates/includes/main-prologue.cfg` around lines 7 - 19, The
current sed monkey-patch in the %pre section lacks robustness and validation.
First, make the sed pattern more precise by anchoring it with `^` and matching
the opening parenthesis (e.g., use `/^def have_bootupd\(/` instead of `/def
have_bootupd/`) to avoid unintended matches in comments or docstrings. Second,
add explicit validation after the sed command completes to verify the return
statement was actually injected into the util.py file; if the injection fails,
echo a clear error message to `/dev/console` and exit with a non-zero status
rather than silently failing. Finally, add log messages to `/dev/console` to
report whether the patch succeeded or what went wrong, so debugging failures is
easier when the installer fails later.
bootupd was upgraded from 0.2.27 to 0.2.31 in RHEL 9.8 repos. The new version's "bootupctl backend install --auto" fails with "No update metadata for component BIOS/EFI found" in QEMU/KVM VMs because the edge-commit ostree image lacks bootupd update metadata. Work around this by: 1. Patching anaconda's _install_bootupd() in %pre to suppress the BootloaderInstallationError, allowing the installation to complete. 2. Installing GRUB to the MBR manually in %post --nochroot for BIOS boot systems (EFI systems get bootloader files from the ostree deploy). This affects all kickstart templates via the shared main-prologue.cfg. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
bab7eb0 to
2b77220
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/kickstart-templates/includes/main-prologue.cfg`:
- Around line 22-26: The BOOT_DISK fallback logic on line 24 is too narrow, only
handling the vda device which assumes a QEMU environment, and will fail silently
on other disk types like NVMe (nvme0n1) or SATA (sda). Additionally, line 25's
|| true suppression mask grub2-install errors completely, preventing debugging
of installation failures. Fix this by enhancing the disk detection logic to
probe for common disk types (nvme0n1, sda, hda, and others) as fallbacks in
sequence, and remove the || true suppression, instead capturing the
grub2-install exit status and stderr and logging both to /dev/console to enable
debugging when the fallback path is taken.
- Around line 13-17: The sed command used to patch the INSTALL_FILE has three
issues: the pattern for matching raise BootloaderInstallationError is unanchored
and could match in comments or docstrings, the two-line replacement via n;s/^/#
/ assumes the raise statement spans two lines but won't handle single-line
raises correctly, and there is no validation after sed executes to confirm the
patch was successfully applied. Anchor the sed pattern to match only code
statements using `^\s*raise BootloaderInstallationError\b` to avoid false
positives in comments or docstrings, extend the sed replacement to handle both
single-line and multi-line raise statements, and add a validation check after
sed completes to verify the return statement was injected; if validation fails,
log an error message and exit the script rather than silently continuing.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 3d0024bb-9c47-494a-b67a-262f99d8fd60
📒 Files selected for processing (1)
test/kickstart-templates/includes/main-prologue.cfg
| if [ ! -d /sys/firmware/efi ]; then | ||
| BOOT_DISK=$(lsblk -ndo PKNAME "$(findmnt -no SOURCE /mnt/sysroot)" 2>/dev/null | head -1) | ||
| [ -z "${BOOT_DISK}" ] && BOOT_DISK=vda | ||
| grub2-install --target=i386-pc --boot-directory=/mnt/sysroot/boot "/dev/${BOOT_DISK}" 2>&1 || true | ||
| fi |
There was a problem hiding this comment.
Hardcoded boot disk fallback "vda" is fragile; silent error suppression masks failures.
Line 24's fallback to vda assumes a QEMU/virtio environment but will silently fail on NVMe (nvme0n1), SATA (sda), or other disk types. Additionally, line 25's || true suppresses grub2-install errors entirely, making it impossible to diagnose whether GRUB was actually installed if the fallback path is taken.
Recommend:
- Enhance disk detection to handle NVMe and other common disk types (e.g.,
nvme0n1p*,sda*,hda*). - Remove
|| truefrom line 25 and log grub2-install's exit status and stderr to/dev/consolefor debugging.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/kickstart-templates/includes/main-prologue.cfg` around lines 22 - 26,
The BOOT_DISK fallback logic on line 24 is too narrow, only handling the vda
device which assumes a QEMU environment, and will fail silently on other disk
types like NVMe (nvme0n1) or SATA (sda). Additionally, line 25's || true
suppression mask grub2-install errors completely, preventing debugging of
installation failures. Fix this by enhancing the disk detection logic to probe
for common disk types (nvme0n1, sda, hda, and others) as fallbacks in sequence,
and remove the || true suppression, instead capturing the grub2-install exit
status and stderr and logging both to /dev/console to enable debugging when the
fallback path is taken.
|
/retest |
8e52706 to
2e8f1d6
Compare
|
@eslutsky: This pull request references USHIFT-7237 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
2e8f1d6 to
7a8d496
Compare
Extend the bootupd workaround to handle EFI boot systems (ARM) and generate a proper grub.cfg for ostree BLS boot entries. - Remove stale grub.cfg from ESP to avoid overlay resolution errors (BZ#1575957), run grub2-install with the appropriate EFI target and --no-nvram for EFI systems. - Generate grub.cfg with blscfg and a UUID-based search for the boot partition (the partition has no filesystem label). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
7a8d496 to
94312e3
Compare
|
@eslutsky: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
The fix suggested in this PR is too intrusive. We are looking for other solutions. |
|
@ggiguash: Closed this PR. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Summary
bootupdto the ostree sysroot, causing anaconda to switch from the legacy grub bootloader path to thebootupdcode pathbootupdpath passes--update-firmwaretobootupctlby default, which tries to write EFI firmware variables — this fails in QEMU/KVM VMsel98-src@*scenario fails withBootloaderInstallationError: failed to write boot loader configurationbootloader --leavebootorderto the kickstart prologue, which tells anaconda to skip the--update-firmwareflagRoot cause
Anaconda's
_install_bootupd()method adds--update-firmwarewhenbootloader.KeepBootOrderis False (default). The--update-firmwareflag causesbootupctl backend installto attempt writing UEFI boot entries, which fails in VMs without real EFI firmware variable storage.The kickstart
bootloader --leavebootorderdirective setsKeepBootOrder=True, preventing the flag.Timeline
--update-firmwarepath triggersError trace
Test plan
el98-src@*) pass in e2e-aws-tests🤖 Generated with Claude Code
Summary by CodeRabbit