VAULT-28192 fix Agent and Proxy consuming large amounts of CPU for auto-auth self-healing#27518
VAULT-28192 fix Agent and Proxy consuming large amounts of CPU for auto-auth self-healing#27518VioletHynes merged 5 commits intomainfrom
Conversation
…to-auth self-healing
|
CI Results: |
|
Build Results: |
| invalidTokenCh <- err | ||
| } | ||
| default: | ||
| case err := <-ts.runner.ServerErrCh: |
There was a problem hiding this comment.
I wonder if this is a possible scenario -
- IncomingToken receives a new token at the same time template sends an error back to ServerErrCh
- We reauthenticate first by honoring the ServerErrCh select first. Now IncomingCh has two values in the channel
- We try using the first token in IncomingCh but there is an error. Another error is sent to ServerErrCh
- Again the SeverErrCh is honored first and we reauthenticate.
We are now stuck in a loop where we always honor the token one behind valid token.
There was a problem hiding this comment.
Hmm, that's a good point! I do think it's likely in this scenario that both tokens will be valid, but it's still not a great state to be in. I'll rework this to drain the incoming channel in the same place we drain the invalid token channel. I think that should prevent any looping
There was a problem hiding this comment.
I definitely understand why we had it the way we had it before though, but I do think this might be the best fix, and the only situation it would struggle is if we have the two channels filled exactly simultaneously
There was a problem hiding this comment.
There was a problem hiding this comment.
I like this!! Thanks for adding Violet!
Description
Fixes an issue introduced in 1.17 where CPU usage in Agent and Proxy are extremely high due to the code taking the same path down a select statement repeatedly (in an infinite loop).
Will be backported to 1.17.
Fixes #27505
TODO only if you're a HashiCorp employee
getting backported to N-2, use the new style
backport/ent/x.x.x+entlabelsinstead of the old style
backport/x.x.xlabels.the normal
backport/x.x.xlabel (there should be only 1).of a public function, even if that change is in a CE file, double check that
applying the patch for this PR to the ENT repo and running tests doesn't
break any tests. Sometimes ENT only tests rely on public functions in CE
files.
in the PR description, commit message, or branch name.
description. Also, make sure the changelog is in this PR, not in your ENT PR.