CP-312160: secureboot certificate update design doc#7006

Open
chunjiez wants to merge 1 commit into master from private/chunjiez/doc

@chunjiez
Collaborator

Microsoft Secure Boot certificates from 2011 are reaching end-of-life, and legacy VMs may still contain only the old certificate set.

We design an out-of-band mechanism to update per-VM UEFI Secure Boot variables safely and at scale.

Signed-off-by: Chunjie Zhu <chunjie.zhu@cloud.com>
@chunjiez force-pushed the private/chunjiez/doc branch from 5d5aeb9 to c4ccf61 on April 13, 2026 08:25
@robhoes robhoes requested a review from psafont April 13, 2026 09:47
@dinhngtu
Contributor

Thanks for the document.

I'd like to ask a couple of questions:

  • There's no detection or handling of VMs with Secure Boot PCR7 binding. Is this explicitly out of scope and up to the admins to decide?
  • The certificate update check helper is not yet detailed. As EFI variable handling is part of varstored, is it a script that's part of varstored, or something else? Does it actively connect to XAPI and read the VM state or does it simply check the EFI-variables data handed over by XAPI (i.e. is it an active or passive script)?
  • (technically a question for varstored) Does the update involve a full override of all Secure Boot variables, or does it only overwrite/append a few specific ones?

@robhoes
Member

robhoes commented Apr 13, 2026

Thanks @chunjiez, this toolstack design accurately reflects what we have discussed internally.

From @dinhngtu's question, I think it would be good to add some more details about the interfaces between varstored and xapi:

  • A script will be introduced, which xapi calls to determine the SB cert state. Please explain its inputs, that it uses a library shared with varstored, and that it will be deprivileged like varstored.
  • varstored uses the existing varstored-guard process to make calls to xapi, which will need to be extended.

- `ok`: No update required (including non-applicable VM types)
- `update_available`: Update required
- `update_on_boot`: Update scheduled for next boot

Contributor

The names and the meaning seem to diverge. If an update is required, the state should reflect that and not dance around it.

  • ok
  • update_required
  • reboot_required

In the current proposal it is not obvious what is expected from the user.


### 3.1 VM Certificate State Model

`VM.secureboot_certificates_state` applies to VM-class objects, including:

Contributor

"including" could mean that there are more, unlisted objects. I would say: "these VM-class objects:"

Behavior:

- `mark=true`: require current state `update_available`, then set `update_on_boot`
- `mark=false`: require current state `update_on_boot`, then set `update_available`

Contributor

Would marked or scheduled be better?


Rules:

- `update=yes` -> set state `update_available`

Contributor

I find the name ambiguous. Does it mean: this is an update? This has been updated? Is update a state (noun) or an action (verb)?

Collaborator

It means it will be updated after calling set_NVRAM_EFI_variables

But should it be:
`update=yes` -> set state `ok`

Comment on lines +69 to +74
- `mark=true`: require current state `update_available`, then set `update_on_boot`
- `mark=false`: require current state `update_on_boot`, then set `update_available`

Validation:

- Reject invalid transitions with `OPERATION_NOT_ALLOWED`

Member

I think a simple state diagram done in mermaid would work well to show the interactions between this field and the NVRAM

https://mermaid.ai/open-source/syntax/stateDiagram.html
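
For illustration only, the `mark` transitions and the validation rule quoted above can also be written as a small transition table. This is a sketch, not xapi code: the names `CertState`, `OperationNotAllowed` and `set_mark` are hypothetical, and only the two transitions quoted in this thread are encoded. The contested `update=yes` rule is deliberately left out.

```python
from enum import Enum


class CertState(Enum):
    """Values of VM.secureboot_certificates_state quoted in this thread."""
    OK = "ok"
    UPDATE_AVAILABLE = "update_available"
    UPDATE_ON_BOOT = "update_on_boot"


class OperationNotAllowed(Exception):
    """Hypothetical stand-in for the OPERATION_NOT_ALLOWED API error."""


# (required current state, mark) -> new state, as quoted in the design doc.
_TRANSITIONS = {
    (CertState.UPDATE_AVAILABLE, True): CertState.UPDATE_ON_BOOT,
    (CertState.UPDATE_ON_BOOT, False): CertState.UPDATE_AVAILABLE,
}


def set_mark(current: CertState, mark: bool) -> CertState:
    """Return the state after a mark/unmark request; raise on an invalid transition."""
    try:
        return _TRANSITIONS[(current, mark)]
    except KeyError:
        raise OperationNotAllowed(
            f"cannot apply mark={mark} in state {current.value}"
        ) from None
```

Any pair not listed in the table, for example `mark=true` on a VM in state `ok`, is rejected. A table like this makes it easy to see which states have no outgoing `mark` transition.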


When varstored initializes a VM and sees `secureboot_certificates_state=update_on_boot`:

- Perform certificate update flow during boot-time initialization

Member

Can this flow end up in an invalid state that needs to be recovered from? That is, a state in which neither the new nor the outdated certificates are stored in the NVRAM.
