[SPARK-10210] [STREAMING] Filter out non-existent blocks before creating BlockRDD #8405
Closed
tdas wants to merge 4 commits into
Test build #41495 has finished for PR 8405 at commit
Test build #41497 has finished for PR 8405 at commit
Member
Is it worth adding a warning log here? I think the user may forget to enable the receiver log.
Member
LGTM except one minor comment
Test build #41507 has finished for PR 8405 at commit
Member
LGTM
Test build #41521 has finished for PR 8405 at commit
Test build #41523 has finished for PR 8405 at commit
Contributor
Author
Thanks @zsxwing for reviewing. Merging this to master and 1.5
asfgit pushed a commit that referenced this pull request Aug 25, 2015
…ing BlockRDD

When the write-ahead log (WAL) is not enabled, a recovered streaming driver still tries to run jobs using pre-failure block IDs. These jobs fail because the blocks no longer exist in memory (and cannot be recovered, as the receiver WAL is not enabled). This occurs because the driver-side WAL of ReceivedBlockTracker recovers the past block information, and ReceiverInputDStream creates BlockRDDs even if those blocks do not exist.

The solution in this PR is to filter out block IDs that do not exist before creating the BlockRDD. In addition, it adds unit tests to verify other logic in ReceiverInputDStream.

Author: Tathagata Das <tathagata.das1565@gmail.com>

Closes #8405 from tdas/SPARK-10210.

(cherry picked from commit 1fc3758)
Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
When the write-ahead log (WAL) is not enabled, a recovered streaming driver still tries to run jobs using pre-failure block IDs. These jobs fail because the blocks no longer exist in memory (and cannot be recovered, as the receiver WAL is not enabled).
This occurs because the driver-side WAL of ReceivedBlockTracker recovers the past block information, and ReceiverInputDStream creates BlockRDDs even if those blocks do not exist.
The solution in this PR is to filter out block IDs that do not exist before creating the BlockRDD. In addition, it adds unit tests to verify other logic in ReceiverInputDStream.
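
The filtering step described above can be illustrated with a small, self-contained model. This is only a sketch in plain Python, not the Scala code in this PR: `InMemoryBlockStore`, `select_valid_block_ids`, and the block ID strings are hypothetical names invented for illustration, and the warning message is an assumption based on the review comment about logging.

```python
import logging

logger = logging.getLogger("receiver_input_sketch")


class InMemoryBlockStore:
    """Hypothetical stand-in for the store that holds received blocks."""

    def __init__(self, block_ids):
        self._block_ids = set(block_ids)

    def contains(self, block_id):
        return block_id in self._block_ids


def select_valid_block_ids(recovered_block_ids, store):
    """Keep only the block IDs that still exist in the store.

    recovered_block_ids models the block information recovered from the
    driver-side log after a restart; some of those blocks may be gone.
    """
    valid = [bid for bid in recovered_block_ids if store.contains(bid)]
    missing = len(recovered_block_ids) - len(valid)
    if missing:
        # Assumed wording: warn that data was lost and hint at the likely cause.
        logger.warning(
            "%d recovered block(s) no longer exist and will be skipped; "
            "enabling the receiver write-ahead log may prevent this.",
            missing,
        )
    return valid


# Three block IDs were recovered, but only one block survived the failure.
recovered = ["input-0-1", "input-0-2", "input-0-3"]
store = InMemoryBlockStore(["input-0-2"])
valid_ids = select_valid_block_ids(recovered, store)
print(valid_ids)  # ['input-0-2']
```

In this model, a job would then be built only from `valid_ids`, so it never references a block that has disappeared. If no recovered block survives, the list is simply empty.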