
feat: skip cloud-init ready report and add standalone report_ready script#8056

Draft
awesomenix wants to merge 2 commits into main from nishchay/exciting-work


@awesomenix

Bake experimental_skip_ready_report into VHD via cloud.cfg.d to skip
cloud-init's built-in health ready report to Azure fabric. Add a
standalone Python script (report_ready.py) that can be invoked from
CSE to report ready at the appropriate time during node provisioning.

Depends on canonical/cloud-init#6771.

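I tried importing the cloud-init library directly instead of writing a standalone script. Copilot pointed out the following problems with that approach and suggested a middle ground: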

The problems:

  1. Heavy import chain — importing cloudinit.sources.helpers.azure pulls in cloudinit.url_helper → requests, cloudinit.distros, cloudinit.subp, and dozens
    more. It works on the VM (cloud-init's deps are installed), but it's a large dependency surface.
  2. get_metadata_from_fabric() requires a distro object — it's a full cloud-init Distro class instance (needed for ISO ejection). You'd have to either
    construct one or monkey-patch it.
  3. report_success_to_host() depends on cloud-init's reporting handler registry — kvp.get_kvp_handler() looks up
    instantiated_handler_registry.registered_items["telemetry"], which is only populated during cloud-init's normal boot. Outside of cloud-init's lifecycle, this
    returns None and silently skips KVP reporting.
  4. Version coupling — your script would break if cloud-init refactors these internal APIs (they're not public API). Different VHDs may ship different
    cloud-init versions.
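One way to put a number on the "heavy import chain" concern in item 1 is to count how many modules an import actually loads. The following is a minimal sketch, not part of this PR; `modules_pulled_in` is a hypothetical helper name, and `json` is used as the target only so the snippet runs on a machine without cloud-init. On a VM with cloud-init installed, the same helper could be pointed at `cloudinit.sources.helpers.azure`.

```python
import importlib
import sys
from typing import List


def modules_pulled_in(module_name: str) -> List[str]:
    """Import module_name and return the modules that were newly loaded by it."""
    before = set(sys.modules)
    importlib.import_module(module_name)
    # Anything present now but not before was loaded by this import,
    # directly or transitively.
    return sorted(set(sys.modules) - before)


if __name__ == "__main__":
    # "json" is a stand-in target so the sketch runs anywhere.
    target = "json"
    new_modules = modules_pulled_in(target)
    print(f"{target}: {len(new_modules)} new modules loaded")
```

Modules that were already imported before the call are not counted, so the helper is best run in a fresh interpreter.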

A middle ground: you could import just the KVP handler class directly and the low-level wireserver pieces, bypassing the high-level functions:

from cloudinit.reporting.handlers import HyperVKvpReportingHandler
handler = HyperVKvpReportingHandler()
handler.write_key("PROVISIONING_REPORT", report_string)

But the wireserver reporting (GoalState + health POST) has deep entanglement with url_helper, distro objects, and telemetry decorators that make it hard to
call standalone.
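For context on what the KVP side involves if written without cloud-init, here is a rough sketch of encoding a single Hyper-V KVP record. It assumes the fixed-size layout used by the Linux Hyper-V KVP pool files (a 512-byte key followed by a 2048-byte value, both NUL-padded); those sizes, the helper names, and the example value are assumptions for illustration and have not been checked against report_ready.py in this PR.

```python
import struct
from typing import Tuple

# Assumed record layout: NUL-padded 512-byte key, then NUL-padded 2048-byte value.
KVP_KEY_SIZE = 512
KVP_VALUE_SIZE = 2048
KVP_RECORD = struct.Struct(f"{KVP_KEY_SIZE}s{KVP_VALUE_SIZE}s")


def encode_kvp_record(key: str, value: str) -> bytes:
    """Pack one key/value pair into a fixed-size KVP record."""
    key_bytes = key.encode("utf-8")
    value_bytes = value.encode("utf-8")
    # Keep at least one trailing NUL so readers can find the end of each field.
    if len(key_bytes) >= KVP_KEY_SIZE:
        raise ValueError("key too long for a KVP record")
    if len(value_bytes) >= KVP_VALUE_SIZE:
        raise ValueError("value too long for a KVP record")
    # struct's "s" format pads short byte strings with NULs.
    return KVP_RECORD.pack(key_bytes, value_bytes)


def decode_kvp_record(record: bytes) -> Tuple[str, str]:
    """Unpack a record and strip the NUL padding from both fields."""
    raw_key, raw_value = KVP_RECORD.unpack(record)
    key = raw_key.split(b"\x00", 1)[0].decode("utf-8")
    value = raw_value.split(b"\x00", 1)[0].decode("utf-8")
    return key, value


if __name__ == "__main__":
    # "result=success" is a hypothetical report string, not the real format.
    record = encode_kvp_record("PROVISIONING_REPORT", "result=success")
    print(len(record), decode_kvp_record(record))
```

A real writer would also need to append the record to the guest's KVP pool file under a file lock, which this sketch leaves out.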

Copilot AI review requested due to automatic review settings March 10, 2026 20:09

Copilot AI left a comment


Pull request overview

Adds a VHD-baked mechanism to suppress cloud-init’s built-in “ready” report and replaces it with an explicit, CSE-invoked readiness/failure report to Azure wireserver.

Changes:

  • Add a standalone report_ready.py script that writes Hyper-V KVP provisioning status and POSTs health to wireserver.
  • Bake report_ready.py into multiple VHD build variants and copy it to /opt/azure/containers/.
  • Add skipCloudInitReadyReport to write cloud-init config, and invoke the reporting script from cse_start.sh.
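To make the "POSTs health to wireserver" step more concrete, the sketch below builds the kind of XML health document such a POST would carry. The element names (`Health`, `GoalStateIncarnation`, `Container`, `RoleInstanceList`, `State`, `Details`, `SubStatus`, `Description`) and the `Ready` / `NotReady` / `ProvisioningFailed` values reflect my understanding of the report cloud-init sends and should be treated as assumptions; the function name and arguments are hypothetical and are not taken from report_ready.py.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def build_health_report(
    incarnation: str,
    container_id: str,
    instance_id: str,
    ready: bool = True,
    description: Optional[str] = None,
) -> str:
    """Build a health report document as a string (assumed schema)."""
    health = ET.Element("Health")
    ET.SubElement(health, "GoalStateIncarnation").text = incarnation
    container = ET.SubElement(health, "Container")
    ET.SubElement(container, "ContainerId").text = container_id
    role_list = ET.SubElement(container, "RoleInstanceList")
    role = ET.SubElement(role_list, "Role")
    ET.SubElement(role, "InstanceId").text = instance_id
    role_health = ET.SubElement(role, "Health")
    ET.SubElement(role_health, "State").text = "Ready" if ready else "NotReady"
    if not ready:
        # Failure reports carry a sub-status plus a free-form description.
        details = ET.SubElement(role_health, "Details")
        ET.SubElement(details, "SubStatus").text = "ProvisioningFailed"
        ET.SubElement(details, "Description").text = description or ""
    # ElementTree escapes characters such as "&" and "<" in text nodes.
    return ET.tostring(health, encoding="unicode")


if __name__ == "__main__":
    print(build_health_report("1", "container-id", "instance-id"))
```

Building the document with ElementTree rather than string formatting means a description such as `ExitCode: 1, a & b` is escaped automatically. The incarnation, container ID, and instance ID would come from the GoalState fetched beforehand.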

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| vhdbuilder/packer/vhd-image-builder-mariner.json | Adds report_ready.py to files uploaded into the build VM. |
| vhdbuilder/packer/vhd-image-builder-mariner-cvm.json | Same: stage report_ready.py into build VM. |
| vhdbuilder/packer/vhd-image-builder-mariner-arm64.json | Same: stage report_ready.py into build VM. |
| vhdbuilder/packer/vhd-image-builder-flatcar.json | Same: stage report_ready.py into build VM. |
| vhdbuilder/packer/vhd-image-builder-flatcar-arm64.json | Same: stage report_ready.py into build VM. |
| vhdbuilder/packer/vhd-image-builder-cvm.json | Same: stage report_ready.py into build VM. |
| vhdbuilder/packer/vhd-image-builder-base.json | Same: stage report_ready.py into build VM. |
| vhdbuilder/packer/vhd-image-builder-arm64-gen2.json | Same: stage report_ready.py into build VM. |
| vhdbuilder/packer/vhd-image-builder-acl.json | Same: stage report_ready.py into build VM. |
| vhdbuilder/packer/packer_source.sh | Copies report_ready.py into /opt/azure/containers/ with execute perms. |
| vhdbuilder/packer/install-dependencies.sh | Invokes skipCloudInitReadyReport during VHD build. |
| vhdbuilder/packer/imagecustomizer/azlosguard/azlosguard.yml | Ensures OSGuard imagecustomizer also places report_ready.py into /opt/azure/containers/. |
| parts/linux/cloud-init/artifacts/report_ready.py | New standalone readiness/failure reporting implementation. |
| parts/linux/cloud-init/artifacts/cse_start.sh | Calls report_ready.py on provisioning success/failure if present on the VHD. |
| parts/linux/cloud-init/artifacts/cse_install.sh | Adds skipCloudInitReadyReport() helper that writes cloud-init config to skip built-in ready report. |

addMarinerNvidiaRepo
updateDnfWithNvidiaPkg
overrideNetworkConfig || exit 1
skipCloudInitReadyReport || exit 1

Copilot AI Mar 10, 2026


skipCloudInitReadyReport is already invoked unconditionally earlier in this script, so calling it again in the Mariner/AzureLinux block is redundant and makes the flow harder to reason about. Consider removing this second call to keep the configuration applied in a single place.

Suggested change (remove this line):
skipCloudInitReadyReport || exit 1

Comment on lines +101 to +106
if [ -x /opt/azure/containers/report_ready.py ]; then
    if [ "$EXIT_CODE" -eq 0 ]; then
        python3 /opt/azure/containers/report_ready.py -v || echo "WARNING: Failed to report ready to Azure fabric"
    else
        python3 /opt/azure/containers/report_ready.py -v --failure --description "ExitCode: ${EXIT_CODE}, ${message_string}" || echo "WARNING: Failed to report failure to Azure fabric"
    fi

Copilot AI Mar 10, 2026


This report_ready.py invocation runs synchronously before log upload/exit, and can block provisioning for up to ~100s on wireserver timeouts/retries (GET/POST timeouts are 30s with multiple retries). If this is intended to be best-effort (as suggested by || echo "WARNING"), consider running it in the background on success and/or moving it after upload_logs (especially on failure) or passing tighter retry/timeout settings to avoid delaying provisioning and log upload.
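One way to bound the delay described above is to wrap the reporting call in an overall deadline and treat any failure as non-fatal. The sketch below illustrates that idea in Python and is not code from this PR; `run_best_effort` and the 20-second deadline are hypothetical choices.

```python
import subprocess
import sys
from typing import Sequence


def run_best_effort(cmd: Sequence[str], deadline_seconds: float) -> bool:
    """Run cmd with a hard overall deadline; return True only on exit code 0.

    Never raises for timeouts or a missing executable, so callers can log a
    warning and carry on with the rest of provisioning.
    """
    try:
        completed = subprocess.run(
            list(cmd),
            timeout=deadline_seconds,  # child is killed once this expires
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except subprocess.TimeoutExpired:
        return False
    except OSError:
        # Covers a missing or non-executable command.
        return False
    return completed.returncode == 0


if __name__ == "__main__":
    # A trivial child process stands in for the real reporting script here.
    if not run_best_effort([sys.executable, "-c", "pass"], deadline_seconds=20):
        print("WARNING: Failed to report ready to Azure fabric")
```

The deadline caps the total time spent regardless of how many retries the child attempts internally, which is the property the review comment is asking for.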

Comment on lines +101 to +106
if [ -x /opt/azure/containers/report_ready.py ]; then
    if [ "$EXIT_CODE" -eq 0 ]; then
        python3 /opt/azure/containers/report_ready.py -v || echo "WARNING: Failed to report ready to Azure fabric"
    else
        python3 /opt/azure/containers/report_ready.py -v --failure --description "ExitCode: ${EXIT_CODE}, ${message_string}" || echo "WARNING: Failed to report failure to Azure fabric"
    fi

Copilot AI Mar 10, 2026


This change updates cloud-init/CSE scripts under parts/, which are covered by snapshot-style golden tests in pkg/agent/testdata/* (e.g., baker_test.go reads ./testdata/<folder>/CustomData). Please run make generate (or regenerate the testdata via the repo’s standard workflow) and include the updated golden files in this PR; otherwise CI is likely to fail due to mismatched expected CustomData/CSE outputs.

awesomenix and others added 2 commits March 10, 2026 18:47
…ript

   Bake experimental_skip_ready_report into VHD via cloud.cfg.d to skip
   cloud-init's built-in health ready report to Azure fabric. Add a
   standalone Python script (report_ready.py) that can be invoked from
   CSE to report ready at the appropriate time during node provisioning.

   Depends on canonical/cloud-init#6771.

   Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>