
fix: allow MEET against isolated shard primaries during scale-up#147

Merged
jdheyburn merged 1 commit into
valkey-io:mainfrom
daanvinken:fix/meet-target-isolated-primary
Apr 24, 2026

Conversation

@daanvinken (Contributor) commented Apr 22, 2026

Description:

Scaling a single-node cluster (1 shard, 0 replicas) to add a replica gets stuck in an infinite reconciliation loop. findMeetTarget() skips shard primaries with cluster_known_nodes <= 1, so the new replica is never CLUSTER MEET'd to the primary. CLUSTER REPLICATE fails with "Unknown node" on every reconciliation.

The same bug also prevents adding a new shard (1 shard -> 2 shards), since the new shard's primary can't MEET the existing isolated primary either.

A shard primary that owns slots is always a valid MEET target regardless of cluster_known_nodes. This removes the IsIsolated() guard for shard primaries.

Discovered while investigating feedback on #135.

Testing:
Reproduced both cases on a local kind cluster: create a 1-shard-0-replica cluster, wait for Ready, then scale up.

Case 1: add replica (replicas: 0 -> replicas: 1)

Before fix - cluster stuck at Reconciling, replica never MEET'd:

```
$ kubectl get valkeycluster
NAME         STATE         REASON
test-scale   Reconciling   Reconciling

$ kubectl get valkeynodes -o custom-columns=NAME:.metadata.name,READY:.status.ready,ROLE:.status.role
NAME             READY   ROLE
test-scale-0-0   true    primary
test-scale-0-1   true    primary

$ kubectl exec statefulset/valkey-test-scale-0-1 -c server -- valkey-cli CLUSTER INFO | grep cluster_known_nodes
cluster_known_nodes:1
```

Operator logs repeating every 2s:

```
DEBUG  replica does not yet know primary (gossip pending); will retry  replica=10.244.0.9 primaryId=3441aa9a...
DEBUG  skipping replica; primary not ready yet
DEBUG  missing replicas, requeue..
```

After fix - MEET succeeds, cluster reaches Ready:

```
$ kubectl get valkeycluster
NAME         STATE   REASON
test-scale   Ready   ClusterHealthy

$ kubectl get valkeynodes -o custom-columns=NAME:.metadata.name,READY:.status.ready,ROLE:.status.role
NAME             READY   ROLE
test-scale-0-0   true    primary
test-scale-0-1   true    replica

$ kubectl exec statefulset/valkey-test-scale-0-0 -c server -- valkey-cli CLUSTER NODES
3441aa9a... 10.244.0.8:6379@16379 myself,master - 0 0 0 connected 0-16383
bd334739... 10.244.0.9:6379@16379 slave 3441aa9a... 0 1776858716401 0 connected
```

Operator logs:

```
DEBUG  meet node  node=10.244.0.9 target=10.244.0.8
DEBUG  events  Introduced 1 isolated node(s) to the cluster
```

Case 2: add shard (shards: 1 -> shards: 2)

After fix - new shard joins and slots are rebalanced:

```
$ kubectl get valkeycluster
NAME          STATE   REASON
test-scale2   Ready   ClusterHealthy

$ kubectl get valkeynodes -o custom-columns=NAME:.metadata.name,READY:.status.ready,ROLE:.status.role
NAME              READY   ROLE
test-scale2-0-0   true    primary
test-scale2-1-0   true    primary

$ kubectl exec statefulset/valkey-test-scale2-0-0 -c server -- valkey-cli CLUSTER NODES
a6e1a811... 10.244.0.13:6379@16379 master - 0 1776860233046 1 connected 0-8191
e9dae06c... 10.244.0.12:6379@16379 myself,master - 0 0 0 connected 8192-16383
```

```go
// isolated (fresh bootstrap, first reconcile).
func findMeetTarget(state *valkey.ClusterState, isolated []*valkey.NodeState) *valkey.NodeState {
	for _, shard := range state.Shards {
		if p := shard.GetPrimaryNode(); p != nil && !p.IsIsolated() {
```
@daanvinken (Contributor, Author) commented:
Admittedly it feels a bit weird removing this check, but a shard primary (in state.Shards) always has slots assigned, meaning it went through the full slot assignment flow and is a legitimate cluster member.

The only scenario where such a primary has cluster_known_nodes <= 1 is a single-node cluster being scaled up. Network partitions don't cause this, because cluster_known_nodes counts remembered peers even when they're in a failed state, and fresh nodes can't get slots while isolated thanks to the guard in assignSlotsToPendingPrimaries().

@jdheyburn (Collaborator) commented:

Thanks for raising this! I tested locally, and it looks like there is a regression from the previous PR: 1 shard with 0 replicas doesn't produce a healthy cluster:

```
# redis-cli cluster info
cluster_state:fail
cluster_slots_assigned:0
cluster_slots_ok:0
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_nodes_pfail:0
cluster_nodes_fail:0
cluster_voting_nodes_pfail:0
cluster_voting_nodes_fail:0
cluster_known_nodes:1
cluster_size:0
cluster_current_epoch:0
cluster_my_epoch:0
cluster_stats_messages_sent:0
cluster_stats_messages_received:0
total_cluster_links_buffer_limit_exceeded:0
#
```

Operator logs:

```
2026-04-23T15:45:01Z    DEBUG    reconcile...    {"controller": "valkeycluster", "controllerGroup": "valkey.io", "controllerKind": "ValkeyCluster", "ValkeyCluster": {"name":"cluster-sample","namespace":"valkey-operator-system"}, "namespace": "valkey-operator-system", "name": "cluster-sample", "reconcileID": "e6691409-3b43-49e9-9f4f-a5c5cf47d889"}
2026-04-23T15:45:01Z    INFO    getting system users secret: cluster-sample    {"controller": "valkeycluster", "controllerGroup": "valkey.io", "controllerKind": "ValkeyCluster", "ValkeyCluster": {"name":"cluster-sample","namespace":"valkey-operator-system"}, "namespace": "valkey-operator-system", "name": "cluster-sample", "reconcileID": "e6691409-3b43-49e9-9f4f-a5c5cf47d889"}
2026-04-23T15:45:01Z    DEBUG    internal ACLs unchanged    {"controller": "valkeycluster", "controllerGroup": "valkey.io", "controllerKind": "ValkeyCluster", "ValkeyCluster": {"name":"cluster-sample","namespace":"valkey-operator-system"}, "namespace": "valkey-operator-system", "name": "cluster-sample", "reconcileID": "e6691409-3b43-49e9-9f4f-a5c5cf47d889"}
2026-04-23T15:45:01Z    DEBUG    slots are not assigned, requeue..    {"controller": "valkeycluster", "controllerGroup": "valkey.io", "controllerKind": "ValkeyCluster", "ValkeyCluster": {"name":"cluster-sample","namespace":"valkey-operator-system"}, "namespace": "valkey-operator-system", "name": "cluster-sample", "reconcileID": "e6691409-3b43-49e9-9f4f-a5c5cf47d889", "unassignedSlots": [{"Start":0,"End":16383}]}
```

Adding a replica or a shard to it seemed to fix it, though.

Do you have the same on your side?

findMeetTarget() skipped shard primaries with cluster_known_nodes <= 1
(IsIsolated). When scaling a single-node cluster (1 shard, 0 replicas)
to add a replica, the existing primary has cluster_known_nodes:1 and was
skipped. findMeetTarget fell through to isolated[0] (the new replica),
which the caller then trimmed from the iteration list, causing MEET to
iterate an empty slice. The replica was never introduced to the primary,
and CLUSTER REPLICATE failed with "Unknown node" on every reconcile.

A shard primary that owns slots is always a valid MEET target regardless
of cluster_known_nodes. Remove the IsIsolated guard for shard primaries.

Signed-off-by: Daan Vinken <daanvinken@tythus.com>
daanvinken force-pushed the fix/meet-target-isolated-primary branch from 9f33994 to cd304d7 on Apr 23, 2026
@daanvinken (Contributor, Author) commented:

The 1s/0r creation issue you're seeing is #135, which has been merged. I've rebased this PR onto the latest main which includes it. The initial creation should work now, and this PR adds the scale-up fix on top.

I tested it locally on this branch; if you can still reproduce the issue, please share the exact config 🙏

@jdheyburn (Collaborator) commented:

Ah yeah sorry, I assumed that #135 was included in this branch. I did a round of testing and it works now - thanks for looking into this!

@jdheyburn jdheyburn merged commit 5e1621b into valkey-io:main Apr 24, 2026
8 checks passed
sandeepkunusoth pushed a commit that referenced this pull request Apr 29, 2026
## Description

Adds e2e coverage for scaling a 1-shard-0-replica cluster. This was
missing and led to the bugs found in #135 / #147. Two tests:
- Scale from 1 shard / 0 replicas to 1 shard / 1 replica (add replica)
- Scale from 1 shard / 0 replicas to 2 shards / 0 replicas (add shard)

Both verify the cluster reaches Ready with correct `cluster_known_nodes`
and `cluster_size`.

## Testing

Tests compile. Pattern follows the existing `rebalances slots on scale
out` e2e test. Validated locally by pointing context to kind cluster.

Signed-off-by: Daan Vinken <daanvinken@tythus.com>
sandeepkunusoth pushed a commit to sandeepkunusoth/valkey-k8s-operator that referenced this pull request May 5, 2026
…key-io#147)

sandeepkunusoth pushed a commit to sandeepkunusoth/valkey-k8s-operator that referenced this pull request May 5, 2026