Describe the bug
The auto-generated Erlang cookie (#68) changes on every Helm deployment, which breaks rolling updates of the cluster.
Version of Helm and Kubernetes: 3.8.3 and 1.21
What happened: Each time we roll over the cluster (i.e. deploy config changes), the rollout short-circuits on a replication failure because the two CouchDB nodes are running with different cookies. As a result, the rollover never completes and stops with one restarted pod left unhealthy:
NAME                READY   STATUS    RESTARTS   AGE
couch-1-couchdb-0   1/1     Running   0          105m
couch-1-couchdb-1   0/1     Running   0          61m
What you expected to happen: The rollover completes cleanly, with every pod becoming ready.
How to reproduce it (as minimally and precisely as possible): Deploy the Helm chart twice with different config and more than one replica.
Anything else we need to know:
Potentially related to #77
Possible fix is to use a stateful generation pattern like #74
Alternatively, the docs could make clear that setting the erlangFlags.setcookie value is required for a "rollover" to happen cleanly, or the chart's update policy could be changed, etc.
Related: if you set this variable, it is passed as a command-line argument and not as a secret env var. IMO it should be possible to toggle this independently, since an HA cluster's ability to restart safely and consistently depends on the cookie staying the same across deployments, and some people prefer using secrets.
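For reference, the workaround of pinning the cookie might look like the values override below. This is only a sketch: erlangFlags.setcookie is the key named above, the cookie string is a placeholder I made up, and as noted the value would currently end up as a command-line argument rather than a secret.

```yaml
# Values override (sketch): pin the Erlang cookie so that every deployment
# reuses the same value instead of getting a freshly generated one.
erlangFlags:
  # Placeholder, not a real cookie. Generate your own and keep it stable
  # across upgrades so all CouchDB nodes can keep talking to each other.
  setcookie: replace-with-a-fixed-random-string
```

Both deployments in the reproduction above would need to use the same override for the nodes to share a cookie.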