Changed auto-generated erlang cookie causes cluster restart issues #78

@colearendt

Describe the bug

The auto-generated Erlang cookie (#68) changes on every Helm deployment. As a result, rolling over the cluster fails.

Version of Helm and Kubernetes: 3.8.3 and 1.21

What happened: Each time we roll over the cluster (i.e., deploy config changes), the rollout short-circuits on a replication failure because the two CouchDB nodes are running with different cookies. As a result, the rollover never completes and stops with one restarted pod unhealthy:

NAME                READY   STATUS    RESTARTS   AGE
couch-1-couchdb-0   1/1     Running   0          105m
couch-1-couchdb-1   0/1     Running   0          61m

What you expected to happen: The cluster rolls over cleanly.

How to reproduce it (as minimally and precisely as possible): Deploy the Helm chart twice with different config and replicas > 1.

Anything else we need to know:

Potentially related to #77

A possible fix is to use a stateful generation pattern like #74.
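To illustrate what a stateful generation pattern could look like, here is a minimal sketch using Helm's `lookup` function to reuse a previously generated cookie instead of generating a new one on every upgrade. The Secret name and the `erlangCookie` key are assumptions for illustration and may not match what the chart or #74 actually use.

```yaml
# Hypothetical template sketch; the Secret name and key are assumed, not taken from the chart.
{{- $name := printf "%s-couchdb" .Release.Name }}
{{- $existing := lookup "v1" "Secret" .Release.Namespace $name }}
{{- /* Default to a freshly generated cookie (first install). */}}
{{- $cookie := randAlphaNum 32 | b64enc }}
{{- /* If the Secret already exists, keep its (already base64-encoded) cookie. */}}
{{- with $existing.data }}
{{- $cookie = index . "erlangCookie" | default $cookie }}
{{- end }}
apiVersion: v1
kind: Secret
metadata:
  name: {{ $name }}
type: Opaque
data:
  erlangCookie: {{ $cookie | quote }}
```

Note that `lookup` returns an empty result during `helm template` and `--dry-run`, so a sketch like this would still render a fresh cookie in those modes.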

Alternatively, the docs could make clear that setting the erlangFlags.setcookie value is required for a "rollover" to happen cleanly, the chart's update policy could be changed, etc.
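For reference, pinning the cookie as a workaround would look roughly like the values override below. The value shown is a placeholder, not a real cookie; the only requirement described in this issue is that it stays the same across deployments.

```yaml
# values override (sketch): keep the Erlang cookie fixed across helm upgrades
erlangFlags:
  # Placeholder value; replace with your own fixed random string.
  setcookie: "replace-with-a-fixed-random-string"
```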

Related: if you set this value, the cookie is passed as a command-line argument and not as a secret env var. IMO this should be toggleable independently, since an HA cluster's ability to restart safely and consistently depends on the cookie staying consistent, and some people prefer using secrets.
