
KAFKA-9693: Kafka latency spikes caused by log segment flush on roll #13782

Closed
novosibman wants to merge 11 commits into apache:trunk from novosibman:trunk

Conversation

@novosibman commented May 30, 2023

Trunk version of the initial change: #13768 in branch "3.4".

Key difference from the branch change:
Passed and used the existing scheduler, which is already used for flushing large segment logs and indices.

In all cases the snapshot's fileChannel is kept open when passed to other threads for flushing and closing (so the try-with-resources is removed in this change).

Related issue https://issues.apache.org/jira/browse/KAFKA-9693

The issue with repeated latency spikes during Kafka log segment rolling is still reproducible on the latest versions, including kafka_2.13-3.4.0.

It was found that flushing the Kafka snapshot file during segment rolling blocks the producer request handling thread for some time. The latency improvement was reproduced in kafka_2.13-3.6.0-snapshot by offloading the flush operation. A single-node test configuration available on my side was used:
kafka_2.13-3.6.0-snapshot - trunk version
kafka_2.13-3.6.0-snapshot-fix - trunk version with the provided change

Test time increased to 1hr.

partitions=10 # rolling about every ~52 seconds
(latency graph)

partitions=100 # rolling events about every 8.5 minutes:
(latency graph)


@Hangleton left a comment

Many thanks for the patch and the collected data! Really interesting to see the impact of this change. A few questions:

  • What storage device and file system are used in the test?
  • Would you have a real-life workload where the impact of this change can be quantified? The workload generated by producer-perf-test.sh exhibits the problem the most because the segments of all replicas on the brokers start rolling at the same time, which is why it is also interesting to assess the impact using topic-partitions that have different ingress rates and/or use segments of different sizes.

@novosibman (Author) commented May 31, 2023

Many thanks for the patch and the collected data! Really interesting to see the impact of this change. A few questions:

* What storage device and file system are used in the test?

In the AWS config: i3en.2xlarge with 2 x 2500 NVMe SSDs
In the local lab config: 2 x Samsung_SSD_860_EVO_1TB
FS type: xfs
Two-dir setup: log.dirs=/ssd1/kafka/data,/ssd2/kafka/data

The FS format had a huge impact on results. Initially we used ext4 in our lab for regular testing.
Some ext4 example results (using the regular Kafka release kafka_2.13-3.4.0):
(latency graph)
After switching to xfs:
(latency graph)
ext4 was much worse before and during Kafka log rolling.

* Would you have a real-life workload where the impact of this change can be quantified? The workload generated by the producer-perf-test.sh exhibits the problem the most because the segments of all replicas on the brokers start rolling at the same time. Which is why it is also interesting to assess the impact using topic-partitions which have different ingress rate and/or use segments of different sizes.

We have no real-life workload scenarios available for Kafka perf testing. The alternative workload https://github.com/AzulSystems/kafka-benchmark shows slightly different rolling behavior compared to OMB:

OMB results example on the released kafka_2.13-3.4.0 version (using xfs):
(latency graph)

Kafka Tussle benchmark:
(latency graph)

^^^ same params used: acks=1 batchSize=1048510 consumers=4 lingerMs=1 mlen=1024 partitions=100 producers=4 rf=1 targetRate=200k time=30m topics=1

```diff
 // we manually override the state offset here prior to taking the snapshot.
 producerStateManager.updateMapEndOffset(newSegment.baseOffset)
-producerStateManager.takeSnapshot()
+producerStateManager.takeSnapshot(scheduler)
```
Member
I am wondering if we need to increase the default number of background threads, since we are adding more responsibility to them. Thoughts?

Hangleton commented Jun 1, 2023

@novosibman

Thanks for the reply. In the tests we conducted in KAFKA-9693, an NVMe SSD (one log directory) and ext4 were used along with jbd2, which likely penalized performance.

Are all the graphs shared for OMB and Kafka Tussle generated for Kafka with the fix in this PR?

@novosibman (Author) commented Jun 1, 2023

Are all the graphs shared for OMB and Kafka Tussle generated for Kafka with the fix in this PR?

The graphs with the fix are in the first description comment, marked with the kafka_2.13-3.6.0-snapshot-fix label.

The other graphs in the later comment are examples of how rolling affects results on different configurations and benchmarks using the regular Kafka release.

@novosibman (Author)

Provided an updated change: returned the original try-with-resources on writing, and added a utility method for flushing:

```java
try (FileChannel fileChannel = FileChannel.open(file.toPath(), StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
    fileChannel.write(buffer);
}
if (scheduler != null) {
    scheduler.scheduleOnce("flush-producer-snapshot", () -> Utils.flushFileQuietly(file.toPath(), "producer-snapshot"));
} else {
    Utils.flushFileQuietly(file.toPath(), "producer-snapshot");
}
```
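For context, the offload pattern above (write the file synchronously, fsync it on a background thread so the request-handling thread is not blocked by the flush) can be sketched in plain Java. This is a minimal standalone sketch, not the patch itself: the class name and structure are hypothetical, a `ScheduledExecutorService` stands in for Kafka's `Scheduler`, and `FileChannel.force` stands in for `Utils.flushFileQuietly`:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class AsyncSnapshotFlush {

    // Write the snapshot synchronously, then fsync it on a background thread.
    static void writeAndFlushAsync(Path file, ByteBuffer buffer, ScheduledExecutorService scheduler)
            throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            channel.write(buffer);
        }
        // Offload the flush so the caller (e.g. a request handler) returns immediately.
        scheduler.schedule(() -> {
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
                channel.force(true); // flush file contents and metadata to disk
            } catch (IOException e) {
                // a production version would log and swallow, as flushFileQuietly does
            }
        }, 0, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        Path file = Files.createTempFile("producer-snapshot", ".snapshot");
        writeAndFlushAsync(file, ByteBuffer.wrap("snapshot-data".getBytes()), scheduler);
        scheduler.shutdown();
        scheduler.awaitTermination(5, TimeUnit.SECONDS); // wait for the background flush
        System.out.println(Files.readString(file)); // prints "snapshot-data"
    }
}
```

Note that the data is durable only after the background `force` completes, which is exactly the consistency question raised later in this review.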

@divijvaidya (Member) left a comment

Thank you for making the changes. The changes look good and I left some minor comments.

Since we are adding new responsibility to the scheduler threads, I think we should advise users to reconsider the number of background threads [2]. Could you please add a note to the upgrade section [1] about considering an increase in background threads?

[1]

<h5><a id="upgrade_360_notable" href="#upgrade_360_notable">Notable changes in 3.6.0</a></h5>

[2] https://kafka.apache.org/documentation.html#brokerconfigs_background.threads
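As an illustration of the setting referenced in [2] (not part of this patch), the broker config is tuned in server.properties; the value below is only a hypothetical example, the Kafka default being 10:

```properties
# server.properties
# Number of threads used for background processing (Kafka default: 10).
# If producer-snapshot flushes are offloaded to the background scheduler,
# a larger pool may help absorb flush bursts at segment roll time.
background.threads=16
```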

Comment thread: clients/src/main/java/org/apache/kafka/common/utils/Utils.java (Outdated)

@showuon (Member) left a comment

Thanks for the PR. LGTM, left a minor comment.

@showuon (Member) commented Jun 6, 2023

Also, there are compile errors, please fix them. Thanks.

… style check error corrected, open/close operations reduced for scheduler == null case
@novosibman (Author)
Open/close changes provided.
Also corrected style check issue (in task ':storage:checkstyleMain').

@divijvaidya (Member)

Hey @novosibman, could you please respond to the rest of the comments at #13782 (review) and #13782 (comment)?

@showuon (Member) left a comment

LGTM! Thanks for the patch!

Some minor comments:

  1. Please start the PR title with KAFKA-9693, e.g.: KAFKA-9693: Kafka latency spikes caused by log segment flush on roll
  2. Please remember to update the notable changes for v3.6. If you have any problem, please let us know. Of course it can be a follow-up PR if you want. You can refer to this PR change.
  3. @divijvaidya, do you think we should backport this patch to the 3.5 branch? This issue has existed for a long time and is not a regression bug. Maybe not? Thoughts?

@showuon (Member) left a comment

Sorry, found an issue.

Comment thread: clients/src/main/java/org/apache/kafka/common/utils/Utils.java (Outdated)
@novosibman changed the title from "Suggest for performance fix: KAFKA-9693 Kafka latency spikes caused by log segment flush on roll - trunk version" to "KAFKA-9693: Kafka latency spikes caused by log segment flush on roll" on Jun 15, 2023
@divijvaidya (Member)

@showuon - with this change we don't have consistent data across the different flushed files on disk (since earlier they were flushed together, but now it's done asynchronously). I want to ensure that this inconsistency is ok and that recovery will not be hampered by it. Please wait for my review before merging this.

@github-actions

This PR is being marked as stale since it has not had any activity in 90 days. If you would like to keep this PR alive, please ask a committer for review. If the PR has merge conflicts, please update it with the latest from trunk (or the appropriate release branch).

If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed.

The github-actions bot added the stale (Stale PRs) label on Nov 27, 2023
@divijvaidya (Member)

@novosibman @ocadaruma do we still need this change after https://github.com/apache/kafka/pull/14242/files? Asking because with the latter PR merged in, we are not blocking the request handler thread while flushing the producer snapshot, which is the same as what this PR is trying to achieve. Hence, I think this could be closed.

@ocadaruma (Contributor)

@divijvaidya Yeah, my understanding is the same

@divijvaidya (Member)

Thanks for checking, @ocadaruma. I am going to close this PR; please feel free to re-open if you think that this is still not fixed.

@divijvaidya closed this on Feb 2, 2024

Labels: core (Kafka Broker), performance, stale (Stale PRs)

5 participants