Skip to content

KAFKA-17166: Use NoOpScheduler to rewrite LogManagerTest#testLogRecoveryMetrics#16641

Merged
chia7712 merged 1 commit intoapache:trunkfrom
FrankYang0529:KAFKA-17166
Jul 22, 2024
Merged

KAFKA-17166: Use NoOpScheduler to rewrite LogManagerTest#testLogRecoveryMetrics#16641
chia7712 merged 1 commit intoapache:trunkfrom
FrankYang0529:KAFKA-17166

Conversation

@FrankYang0529
Copy link
Copy Markdown
Member

  • LogManagerTest#testLogRecoveryMetrics does not require the scheduler, so we can introduce the NoOpScheduler to handle the test case.
  • Move checkpoint.writeForTruncation(epochs.values()); to under LeaderEpochFileCache read lock to avoid race condition like:
    • take a snapshot of the epoch entries
    • a new epoch entry is added and is flushed to disk
    • the scheduler then writes the snapshot to disk

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

appendRecordsToLog(mockTime, logDir1, 0, mockBrokerTopicStats, expectedSegmentsPerLog)
appendRecordsToLog(mockTime, logDir2, 1, mockBrokerTopicStats, expectedSegmentsPerLog)

val noOpsScheduler = new NoOpsScheduler()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about creating a (noop) mock scheduler? that is more simple :)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, using mock can have same effect. Thank you.

@FrankYang0529 FrankYang0529 force-pushed the KAFKA-17166 branch 2 times, most recently from dc3093c to 0c1c1c0 Compare July 21, 2024 13:36
Copy link
Copy Markdown
Contributor

@ocadaruma ocadaruma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your quick fix.
LGTM

@FrankYang0529 FrankYang0529 marked this pull request as ready for review July 22, 2024 01:21
Copy link
Copy Markdown
Member

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@chia7712
Copy link
Copy Markdown
Member

@FrankYang0529 Could you please rebase code to trigger QA again?

…eryMetrics

Signed-off-by: PoAn Yang <payang@apache.org>
@chia7712 chia7712 merged commit 44a44b7 into apache:trunk Jul 22, 2024
@FrankYang0529 FrankYang0529 deleted the KAFKA-17166 branch July 22, 2024 14:27
Copy link
Copy Markdown
Contributor

@junrao junrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FrankYang0529 : Thanks for the PR. Just a couple of minor comments.

}
}

private void writeToFileForTruncation() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it's clearer to rename this and checkpoint.writeForTruncation to sth like writeIfDirExists.

private void writeToFileForTruncation() {
lock.readLock().lock();
try {
checkpoint.writeForTruncation(epochs.values());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a comment on why it's important to hold the lock while writing the checkpoint file?

@junrao
Copy link
Copy Markdown
Contributor

junrao commented Jul 22, 2024

Also, could we cherry-pick this PR to 3.8 since it has the same correctness issue?

@chia7712
Copy link
Copy Markdown
Member

Also, could we cherry-pick this PR to 3.8 since it has the same correctness issue?

That will make 3.8 get in next RC. @jlprat WDYT?

@jlprat
Copy link
Copy Markdown
Contributor

jlprat commented Jul 22, 2024

Would this be considered a blocker for the release?
Right now RC2 is in voting stage. If this is a blocker for the 3.8.0 then I'll need to cut a new one.

We can definitely cherry pick it on the 3.8 branch and release this with the next patch or RC whatever happens first.

@chia7712
Copy link
Copy Markdown
Member

Would this be considered a blocker for the release?

This is correctness issue, so it should be considered a blocker even though it should be rare.

I will cherry-pick it (and #16614) to 3.8 later

@jlprat
Copy link
Copy Markdown
Contributor

jlprat commented Jul 22, 2024

Hi @chia7712 could you share this in the Vote thread for RC2? Just for completion / traceability https://lists.apache.org/thread/ownfopb4rmdkdqtqwcnzl3x6lbr03ql8

@chia7712
Copy link
Copy Markdown
Member

could you share this in the Vote thread for RC2? Just for completion / traceability https://lists.apache.org/thread/ownfopb4rmdkdqtqwcnzl3x6lbr03ql8

done

chia7712 pushed a commit that referenced this pull request Jul 22, 2024
…ryMetrics (#16641)

Reviewers: Okada Haruki <ocadaruma@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
@chia7712
Copy link
Copy Markdown
Member

I will cherry-pick it (and #16614) to 3.8 later

done. I have updated the jiras also.

@FrankYang0529 Please file a MINOR PR to address @junrao if you have free cycles.

@FrankYang0529
Copy link
Copy Markdown
Member Author

@FrankYang0529 Please file a MINOR PR

Yes, I will do it today.

abhi-ksolves pushed a commit to ksolves/kafka that referenced this pull request Jul 31, 2024
…ryMetrics (apache#16641)

Reviewers: Okada Haruki <ocadaruma@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants