branch-4.0: [fix](load_stream) Fix flush token deadlock by ensuring wait_for_flush_tasks is called before destruction #60284#60285
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
There was a problem hiding this comment.
Pull request overview
This PR fixes a deadlock issue in load_stream by ensuring flush tasks are properly waited for before TabletStream destruction. The fix reverts the use of shared_from_this() and instead relies on explicit lifetime management through the IndexStream destructor.
Changes:
- Removed
std::enable_shared_from_thisinheritance from TabletStream and allshared_from_this()usages - Added
wait_for_flush_tasks()method to TabletStream to safely wait for all flush tasks before destruction - Added IndexStream destructor to ensure all TabletStreams wait for flush tasks before cleanup
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| be/src/runtime/load_stream.h | Removed shared_from_this inheritance, added wait_for_flush_tasks() declaration, added IndexStream destructor, added _flush_tasks_done flag |
| be/src/runtime/load_stream.cpp | Replaced shared_from_this with this in lambdas, implemented wait_for_flush_tasks() and IndexStream destructor, refactored pre_close() to use wait_for_flush_tasks() |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
1529b65 to
ed0add4
Compare
|
run buildall |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
ed0add4 to
3e507a7
Compare
…h_tasks is called before destruction Problem: When TabletStream is destroyed without pre_close() being called (e.g., on_idle_timeout scenario), the _flush_token destructor calls shutdown() which triggers deadlock detection if called from the pool thread. Root cause: - on_idle_timeout() directly calls brpc::StreamClose() without calling LoadStream::close() - This triggers the destruction chain without calling pre_close() on TabletStreams - If flush tasks are still running, TabletStream may be destroyed in pool thread Solution: - Add IndexStream::~IndexStream() to ensure wait_for_flush_tasks() is called on all TabletStreams - Add TabletStream::wait_for_flush_tasks() to wait for all flush tasks to complete - This ensures _flush_token is properly handled before TabletStream destruction - Revert commit c18ef17 (shared_from_this) as it is no longer needed
3e507a7 to
2a1ef37
Compare
|
run buildall |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
|
skip buildall |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
…ait_for_flush_tasks is called before destruction apache#60284 (apache#60285) pick from:apache#60284
pick from:#60284