[mysql-cdc] Fix the hung up of snapshot phase when reuse binaryLogClient #1915

Closed
lzshlzsh wants to merge 1 commit into apache:master from lzshlzsh:fix-reuse-binlogclient
Conversation

@lzshlzsh

Contributor

Because the callbacks (eventListeners and lifecycleListeners) of BinaryLogClient are kept in lists, and the BinaryLogClient may be reused (see MySqlSplitReader#checkSplitOrStartNext), when multiple snapshot splits are submitted to a SnapshotSplitReader, the callback list still contains MySqlBinlogSplitReadTask#handleEvent from snapshot splits that have already been processed. When a binlog event arrives, the callbacks of those processed splits are invoked, which causes the execute function of the current split's BackfillBinlogReadTask to end before it receives the BINLOG_END watermark event. As a result, the snapshot phase hangs.
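To make the mechanism concrete, here is a minimal, self-contained sketch of the listener leak. It does not use the real connector classes: `FakeBinlogClient` and `FakeBackfillTask` are hypothetical stand-ins that only model "a reused client that keeps listeners in a list" and "a per-split task that registers a callback". Unregistering the listener when a split finishes is shown as one possible remedy under these assumptions, not necessarily the exact change made in this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class ReusedClientListenerSketch {

    /** Hypothetical stand-in for a binlog client that stores its listeners in a list. */
    static class FakeBinlogClient {
        private final List<Consumer<String>> eventListeners = new ArrayList<>();

        void registerEventListener(Consumer<String> listener) {
            eventListeners.add(listener);
        }

        void unregisterEventListener(Consumer<String> listener) {
            eventListeners.remove(listener);
        }

        int listenerCount() {
            return eventListeners.size();
        }

        /** Delivers the event to every listener that is still registered. */
        void dispatch(String event) {
            for (Consumer<String> listener : new ArrayList<>(eventListeners)) {
                listener.accept(event);
            }
        }
    }

    /** Hypothetical stand-in for the backfill task of a single snapshot split. */
    static class FakeBackfillTask {
        final String splitId;
        final List<String> handled = new ArrayList<>();
        final Consumer<String> listener = this::handleEvent;

        FakeBackfillTask(String splitId) {
            this.splitId = splitId;
        }

        void handleEvent(String event) {
            handled.add(event);
        }
    }

    /**
     * Processes two splits on one reused client.
     * Returns {listeners registered after split 2, events seen by the split-1 task}.
     */
    static int[] run(boolean unregisterWhenDone) {
        FakeBinlogClient client = new FakeBinlogClient();

        FakeBackfillTask first = new FakeBackfillTask("split-1");
        client.registerEventListener(first.listener);
        client.dispatch("event-for-split-1");
        if (unregisterWhenDone) {
            // Possible remedy: drop the callback once the split has been processed.
            client.unregisterEventListener(first.listener);
        }

        // The same client instance is reused for the next split.
        FakeBackfillTask second = new FakeBackfillTask("split-2");
        client.registerEventListener(second.listener);
        client.dispatch("event-for-split-2");

        return new int[] {client.listenerCount(), first.handled.size()};
    }

    public static void main(String[] args) {
        int[] leaked = run(false);
        int[] cleaned = run(true);
        System.out.println("no cleanup:   listeners=" + leaked[0] + ", split-1 events=" + leaked[1]);
        System.out.println("with cleanup: listeners=" + cleaned[0] + ", split-1 events=" + cleaned[1]);
    }
}
```

Without cleanup, the finished split-1 task still receives the event meant for split-2, which mirrors the stale callbacks described above. With cleanup, only the current split's callback remains on the reused client.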

The following is a log from our online environment. It shows multiple MySqlStreamingChangeEventSource (the superclass of MySqlBinlogSplitReadTask) callbacks belonging to different snapshot splits.

io.debezium.connector.mysql.MySqlStreamingChangeEventSource - XXX: eventListeners(7): com.github.shyiko.mysql.binlog.jmx.BinaryLogClientStatistics@61540cca,com.github.shyiko.mysql.binlog.jmx.BinaryLogClientStatistics@352b5758,io.debezium.connector.mysql.MySqlStreamingChangeEventSource$$Lambda$1014/1247290871@703f0cf,io.debezium.connector.mysql.MySqlStreamingChangeEventSource$$Lambda$1015/190751860@5a253136,io.debezium.connector.mysql.MySqlStreamingChangeEventSource$$Lambda$1016/10641269@12fef255,com.github.shyiko.mysql.binlog.jmx.BinaryLogClientStatistics@18c84a61,com.github.shyiko.mysql.binlog.jmx.BinaryLogClientStatistics@55443f, lifecycleListeners(5): com.github.shyiko.mysql.binlog.jmx.BinaryLogClientStatistics@61540cca,com.github.shyiko.mysql.binlog.jmx.BinaryLogClientStatistics@352b5758,io.debezium.connector.mysql.MySqlStreamingChangeEventSource$ReaderThreadLifecycleListener@730a6982,com.github.shyiko.mysql.binlog.jmx.BinaryLogClientStatistics@18c84a61,com.github.shyiko.mysql.binlog.jmx.BinaryLogClientStatistics@55443f

We believe the improper use of the MySQL BinaryLogClient is the root cause of some task hang issues, such as #1156.

@lzshlzsh

Contributor Author

@leonardBang @kylemeow @minchowang Could you help take a look at this problem?

@lzshlzsh lzshlzsh changed the title Fix the hung up of snapshot phase when reuse binaryLogClient [mysql-cdc] Fix the hung up of snapshot phase when reuse binaryLogClient Feb 14, 2023
@lzshlzsh

Contributor Author

We just encountered this problem in our online environment. The snapshot stage got stuck, and the problem was resolved after applying this fix. @minchowang

@leonardBang leonardBang self-requested a review February 22, 2023 13:27
@leonardBang

Contributor

Thanks @lzshlzsh for the detailed report and fix! I'll review this PR asap.

@ruanhang1993 ruanhang1993 added this to the V2.5.0 milestone Jul 5, 2023
@yuxiqian

Member

Hi @lzshlzsh, thanks for your contribution! Before this PR can be merged, could you please rebase it onto the latest master branch?

cc @leonardBang @PatrickRen

@github-actions


This pull request has been automatically marked as stale because it has not had recent activity for 60 days. It will be closed in 30 days if no further activity occurs.

@github-actions


This pull request has been automatically marked as stale because it has not had recent activity for 60 days. It will be closed in 30 days if no further activity occurs.

@github-actions github-actions Bot added Stale and removed Stale labels Sep 24, 2024
@github-actions


This pull request has been automatically marked as stale because it has not had recent activity for 60 days. It will be closed in 30 days if no further activity occurs.

@github-actions github-actions Bot added the Stale label Nov 24, 2024
@github-actions


This pull request has been closed because it has not had recent activity. You can reopen it if you want to continue your work, and anyone who is interested is encouraged to continue work on this pull request.

@github-actions github-actions Bot closed this Dec 25, 2024
@suxinglee


@yuxiqian This needs attention.

@misaya295


@yuxiqian Is it fixed in another PR? I can't find this code in flink-cdc master.

@yuxiqian

Member

Hi Misaya, it seems the original author of this PR has been inactive for a long time, and this PR is way off HEAD branch and not ready for merge. Feel free to open another PR for this if it's still reproducible in the latest version. :)
