Backport of Stop processing ACME verifications when active node is stepped down into release/1.15.x #23286

Merged
stevendpclark merged 1 commit into release/1.15.x from backport/stevendpclark/vault-20315-acme-active-node-change/verbally-large-baboon on Sep 26, 2023

@hc-github-team-secure-vault-core
Collaborator

Backport

This PR is auto-generated from #23278 to be assessed for backporting due to the inclusion of the label backport/1.15.x.

The below text is copied from the body of the original PR.


  • Do not load existing ACME challenges persisted in storage on non-active nodes. This was the main culprit: secondary nodes would load the persisted challenges and try to resolve them, but their writes would fail, leading to the excessive logging.

    • We now handle this by not starting the ACME background thread on non-active nodes, and by checking within the scheduling loop and breaking out when the node is not active. That forces a re-read of the Closing channel, which should have been closed by the PKI plugin's Cleanup method.
  • If a node is stepped down from being the active node while it is processing a verification, we could get into an infinite loop due to an ErrReadOnly error when attempting to clean up a challenge entry.

  • Add a maximum number of retries for errors when decoding or fetching challenge/authorization entries from disk. For these errors we allow double the number of "normal" ACME retry attempts, to avoid collision issues. Note that these additional retry attempts are not persisted to disk and will restart on every node start.

  • Add a 1 second backoff after any disk-related error so that we do not immediately spin on disk/IO errors for challenges.


Overview of commits

@hc-github-team-secure-vault-core hc-github-team-secure-vault-core force-pushed the backport/stevendpclark/vault-20315-acme-active-node-change/verbally-large-baboon branch from e0d748e to fc18ac7 September 26, 2023 17:59
@github-actions github-actions bot added the hashicorp-contributed-pr If the PR is HashiCorp (i.e. not-community) contributed label Sep 26, 2023
@stevendpclark stevendpclark self-assigned this Sep 26, 2023
@stevendpclark stevendpclark added this to the 1.15.1 milestone Sep 26, 2023
@stevendpclark stevendpclark enabled auto-merge (squash) September 26, 2023 18:02
@github-actions

Build Results:
All builds succeeded! ✅

@stevendpclark stevendpclark merged commit 0943b82 into release/1.15.x Sep 26, 2023
@stevendpclark stevendpclark deleted the backport/stevendpclark/vault-20315-acme-active-node-change/verbally-large-baboon branch September 26, 2023 18:18
@github-actions

CI Results:
All Go tests succeeded! ✅
