KAFKA-19212: Correct the unclean leader election metric calculation for 3.9 #21134

mmatloka · 2025-12-12T08:32:44Z

This is an attempt to backport #19590 to Kafka 3.9. Copying the description from that PR:

"""
The current ElectionWasClean checks if the new leader is in the previous
ISR. However, there is a corner case in the partition reassignment. The
partition reassignment can change the partition replicas. If the new
preferred leader (the first one in the new replicas) is the last one to
join ISR, this preferred leader will be elected in the same partition
change.

For example: In the previous state, the partition is Leader: 0,
Replicas (2,1,0), ISR (1,0), Adding(2), removing(0). Then replica 2
joins the ISR. The new partition would be like: Leader: 2, Replicas
(2,1), ISR(1,2). The new leader 2 is not in the previous ISR (1,0) but
it is still a clean election.
"""
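
To make the corner case concrete, here is a minimal sketch that replays the example above against a simplified stand-in for the check. The class `CleanElectionExample` and the helper `leaderInPreviousIsr` are hypothetical names written for illustration; they mirror the described behaviour (the new leader must be in the previous ISR), not the exact Kafka source.

```java
import java.util.Arrays;

// Hypothetical, simplified stand-in for the check described above; not the Kafka source.
final class CleanElectionExample {

    // The election counts as clean only if the new leader was in the previous ISR.
    static boolean leaderInPreviousIsr(int newLeader, int[] previousIsr) {
        return Arrays.stream(previousIsr).anyMatch(id -> id == newLeader);
    }

    public static void main(String[] args) {
        int[] previousIsr = {1, 0}; // ISR before the change
        int newLeader = 2;          // replica 2 joins the ISR and is elected in the same change

        // Replica 2 is absent from the previous ISR, so this check reports the
        // election as unclean even though replica 2 was in sync when elected.
        System.out.println("clean according to previous-ISR check: "
                + leaderInPreviousIsr(newLeader, previousIsr));
    }
}
```

With the values from the example, the check returns `false`, which is the false positive that the metric reports.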

The original fix depends on ELR; however, AFAIK, ELRs are available in Kafka 4.0 and 4.1, not in 3.9. I prepared a simpler fix for 3.9 (we have encountered this problem on 3.9, and Confluent confirmed it is an issue).

Note: this is my first Apache contribution. I signed the ICLA and read the contributing guidelines; sorry if I did something incorrectly :)
Let me also mention the author of the original fix for 4.x: @CalvinConfluent

Thank you!

chia7712 · 2025-12-12T10:52:25Z

@mmatloka thanks for this patch. Do you have time to add an integration test to reproduce this bug?

mmatloka · 2025-12-12T11:36:42Z

> @mmatloka thanks for this patch. Do you have time to add an integration test to reproduce this bug?

Hi, I will try to add additional tests.

mmatloka · 2025-12-12T12:39:31Z

> @mmatloka thanks for this patch. Do you have time to add an integration test to reproduce this bug?

Hi, I added some additional tests. Could you check them, and if they’re not sufficient, point out where I should add more (I am not fully familiar with this codebase)?

metadata/src/main/java/org/apache/kafka/metadata/PartitionRegistration.java

mmatloka · 2025-12-15T09:21:00Z

metadata/src/main/java/org/apache/kafka/controller/metrics/ControllerMetricsChanges.java

            int[] prevIsr = prev != null ? prev.isr : next.replicas;
-            if (!PartitionRegistration.electionWasClean(next.leader, prevIsr)) {
+            // check if at the same step the partition we are adding to ISR is becoming the leader, don't treat that as unclean election.
+            int[] nextAddingReplicas = next.addingReplicas;


Could someone double-check my thinking? Will addingReplicas appear in prev or in next?
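
One way to reason about this question is to replay the example from the PR description with plain arrays. The sketch below is hypothetical: `AddingReplicaExample`, `contains`, and `wasClean` are not names from the patch, and the extra requirement that the leader be in the next ISR is an assumption added for illustration. It also assumes, as the example suggests, that the adding list is cleared once the reassignment completes, so replica 2 is listed under adding only in the previous state.

```java
import java.util.Arrays;

// Hypothetical illustration; names and logic are not taken from the patch.
final class AddingReplicaExample {

    static boolean contains(int[] ids, int id) {
        return Arrays.stream(ids).anyMatch(x -> x == id);
    }

    // Treats the election as clean if the new leader was already in the previous ISR,
    // or was an adding replica that is a member of the ISR after this change.
    static boolean wasClean(int newLeader, int[] prevIsr, int[] addingReplicas, int[] nextIsr) {
        return contains(prevIsr, newLeader)
                || (contains(addingReplicas, newLeader) && contains(nextIsr, newLeader));
    }

    public static void main(String[] args) {
        int[] prevIsr = {1, 0};
        int[] prevAdding = {2}; // from the example: Adding(2) in the previous state
        int[] nextAdding = {};  // assumption: cleared once the reassignment completes
        int[] nextIsr = {1, 2};

        System.out.println("using prev adding list: " + wasClean(2, prevIsr, prevAdding, nextIsr));
        System.out.println("using next adding list: " + wasClean(2, prevIsr, nextAdding, nextIsr));
    }
}
```

Under these assumptions only the previous state's adding list classifies the example as clean; whether the real records behave this way in 3.9 is exactly what needs confirming.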

gaurav-narula · 2025-12-15T23:59:38Z

Thanks for the patch! I've a question:

> The original fix depends on ELR

I'm unsure about this. Going through the original fix, IIUC, it modifies the "unclean" check to be based on LeaderRecoveryState.RECOVERING [0], which was introduced with KIP-704 and should therefore be available in 3.9. It also modifies the calculation of electionFromElrCounter, which was introduced in 4.1 with KAFKA-18954, but I suspect that can be dropped from the backport?
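
As a rough sketch of what a recovery-state-based check could look like: the class and method names below are hypothetical, and the enum is declared locally to keep the example self-contained (it only mirrors the two states that KIP-704 defines). This is not the code of the original fix.

```java
// Hypothetical sketch; not the code of the original fix.
final class RecoveryStateExample {

    // Local stand-in for the two leader recovery states defined by KIP-704.
    enum LeaderRecoveryState { RECOVERED, RECOVERING }

    // An election is counted as unclean only when the new leader has to recover,
    // regardless of whether it appeared in the previous ISR.
    static boolean electionWasUnclean(LeaderRecoveryState nextState) {
        return nextState == LeaderRecoveryState.RECOVERING;
    }

    public static void main(String[] args) {
        // Reassignment example: the new leader joined the ISR normally, so no recovery is needed.
        System.out.println(electionWasUnclean(LeaderRecoveryState.RECOVERED));
        // A leader elected from outside the ISR would be marked as recovering.
        System.out.println(electionWasUnclean(LeaderRecoveryState.RECOVERING));
    }
}
```

A check of this shape does not need the previous ISR or the adding list at all, which is why it would sidestep the reassignment corner case.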

mmatloka · 2025-12-16T06:06:07Z

> Thanks for the patch! I've a question:
>
> > The original fix depends on ELR
>
> I'm unsure about this. Going through the original fix, IIUC, it modifies the "unclean" check to be based on LeaderRecoveryState.RECOVERING [0], which was introduced with KIP-704 and should therefore be available in 3.9. It also modifies the calculation of electionFromElrCounter, which was introduced in 4.1 with KAFKA-18954, but I suspect that can be dropped from the backport?

Hi, the real question is which events lead to an adding replica becoming the leader in a single step. One case is probably when it is recovering. However, what we saw in practice was probably(?) the result of a partition reassignment, likely performed by Confluent SBC: a new, completely healthy, idle cluster with nothing happening, and suddenly an adding replica becomes the leader and the metrics show an unclean election. Maybe a manual reassignment could simulate this 🤔. Does a partition reassignment cause the leader to go through the recovering state?

KAFKA-19212: Correct the unclean leader election metric calculation f…

64fefc7

…or 3.9

Additional test and logic fix

50f7d45

Additional tests and improvements

16d02c4

Rename

5c366a7

FrankYang0529 reviewed Dec 15, 2025

metadata/src/main/java/org/apache/kafka/metadata/PartitionRegistration.java (outdated)

Code review fix - use addingReplicas directly

61d366d
