Skip to content

[SPARK-1880] [SQL] Eliminate unnecessary job executions.#825

Closed
ueshin wants to merge 2 commits into
apache:masterfrom
ueshin:issues/SPARK-1880
Closed

[SPARK-1880] [SQL] Eliminate unnecessary job executions.#825
ueshin wants to merge 2 commits into
apache:masterfrom
ueshin:issues/SPARK-1880

Conversation

@ueshin

@ueshin ueshin commented May 19, 2014

Copy link
Copy Markdown
Member

There are unnecessary job executions in BroadcastNestedLoopJoin.

When Innner or LeftOuter join, preparation of rightOuterMatches for RightOuter or FullOuter join is not neccessary.
And when RightOuter or FullOuter, it should use not count and then reduce but fold.

@AmplabJenkins

Copy link
Copy Markdown

Merged build triggered.

@AmplabJenkins

Copy link
Copy Markdown

Merged build started.

@AmplabJenkins

Copy link
Copy Markdown

Merged build finished. All automated tests passed.

@AmplabJenkins

Copy link
Copy Markdown

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15079/

@rxin

rxin commented May 20, 2014

Copy link
Copy Markdown
Contributor

Thanks for submitting this. I actually think this thing is going away to be replaced by broadcast hash join. #734

Maybe we should wait for that to be merged? Otherwise a lot of unnecessary conflicts.

@rxin

rxin commented May 20, 2014

Copy link
Copy Markdown
Contributor

Would be great if you can review that code, @ueshin.

@ueshin

ueshin commented May 20, 2014

Copy link
Copy Markdown
Member Author

@rxin Thank you for your comment.
I checked the code #734, not deeply yet, though.
It seems like broadcast hash join is used only for Inner join so broadcast nested loop join will not be replaced.

@rxin

rxin commented May 21, 2014

Copy link
Copy Markdown
Contributor

We can easily add right outer join support to the hash join though. In general, the nested loop join performs very unfavorably compared with a hash join implementation.

@ueshin

ueshin commented May 21, 2014

Copy link
Copy Markdown
Member Author

@rxin Ah, you mean that we should add right/full outer join support in addition to #734?
I agree with the unfavorable performance of the nested loop join, so we should wait for being merged and then add the right/full outer join support at another issue.

@ueshin

ueshin commented May 21, 2014

Copy link
Copy Markdown
Member Author

@rxin BTW, speaking of performance, could you please review the code #836?
I think this is a kind of blocker issue of join strategy.

@rxin

rxin commented May 21, 2014

Copy link
Copy Markdown
Contributor

Done & merged #836.

@ueshin

ueshin commented May 22, 2014

Copy link
Copy Markdown
Member Author

@rxin Thanks a lot!
Well, should I close this PR?

@rxin

rxin commented May 22, 2014

Copy link
Copy Markdown
Contributor

Probably makes sense to close this one now. Thanks!

@ueshin

ueshin commented May 23, 2014

Copy link
Copy Markdown
Member Author

Ok, close this. Thanks!

@ueshin ueshin closed this May 23, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants