[SQL]Extract the joinkeys from join condition#1190

Closed
chenghao-intel wants to merge 4 commits into apache:master from chenghao-intel:extract_join_keys

@chenghao-intel

Contributor

Extract the join keys from the equality conditions of a join, so that the join can be evaluated as an equi-join.

@chenghao-intel changed the title from "Extract the joinkeys from join condition" to "[SQL]Extract the joinkeys from join condition" on Jun 24, 2014

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

Contributor

nit: dot should precede its operator immediately

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16049/

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16051/

@chenghao-intel

Contributor Author

Jenkins, retest this please.

@chenghao-intel

Contributor Author

@rxin, can you ask Jenkins to retest this? It seems he isn't answering me. :)

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@chenghao-intel

Contributor Author

Oh, Jenkins is working. :)

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16104/

@marmbrus

Contributor

I'm not sure what the point of this change is. It is only serving to make the planner more brittle and tied to the specifics of the current implementation of the optimizer.

If the current pattern for hash joins is correct and more general, I think we should keep it.

@chenghao-intel

Contributor Author

The join/where predicate pushdown is already done by PushPredicateThroughJoin in the logical plan optimizer, so I don't think we need to do it again here. Hence I wrote a new pattern, ExtractEquiJoinKeys, that only extracts the join keys, which should be more specific.
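
As an illustration of what "extract the join keys only" can mean, here is a minimal Python sketch. It is not the Scala code from this PR: the expression classes and the function `extract_equi_join_keys` are hypothetical names invented for this example. It splits a join condition into its conjuncts, keeps each equality whose two sides come from opposite join inputs as a key pair, and returns everything else as a residual condition. Requiring both sides to reference at least one column is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple, Union


# Hypothetical, minimal expression tree (not Catalyst's classes).
@dataclass(frozen=True)
class Col:
    table: str
    name: str


@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class Eq:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Gt:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


Expr = Union[Col, Lit, Eq, Gt, And]


def split_conjuncts(cond: Expr) -> List[Expr]:
    """Flatten nested And nodes into a list of conjuncts."""
    if isinstance(cond, And):
        return split_conjuncts(cond.left) + split_conjuncts(cond.right)
    return [cond]


def references(expr: Expr) -> Set[str]:
    """Return the set of table names an expression refers to."""
    if isinstance(expr, Col):
        return {expr.table}
    if isinstance(expr, Lit):
        return set()
    return references(expr.left) | references(expr.right)


def extract_equi_join_keys(
    cond: Expr, left_tables: Set[str], right_tables: Set[str]
) -> Tuple[List[Expr], List[Expr], List[Expr]]:
    """Split cond into (left keys, right keys, residual conjuncts)."""
    left_keys: List[Expr] = []
    right_keys: List[Expr] = []
    residual: List[Expr] = []
    for conjunct in split_conjuncts(cond):
        if isinstance(conjunct, Eq):
            l_refs, r_refs = references(conjunct.left), references(conjunct.right)
            # Assumption of this sketch: both sides must reference a column.
            if l_refs and r_refs:
                if l_refs <= left_tables and r_refs <= right_tables:
                    left_keys.append(conjunct.left)
                    right_keys.append(conjunct.right)
                    continue
                if l_refs <= right_tables and r_refs <= left_tables:
                    # Sides are swapped relative to the join inputs.
                    left_keys.append(conjunct.right)
                    right_keys.append(conjunct.left)
                    continue
        residual.append(conjunct)
    return left_keys, right_keys, residual


# Usage: a.id = b.id AND a.x > 1 AND b.k = a.k
condition = And(
    Eq(Col("a", "id"), Col("b", "id")),
    And(Gt(Col("a", "x"), Lit(1)), Eq(Col("b", "k"), Col("a", "k"))),
)
left_keys, right_keys, residual = extract_equi_join_keys(condition, {"a"}, {"b"})
```

The sketch does no predicate pushdown of its own. The residual conjuncts are simply handed back to the caller, which matches the division of labor described above, where pushdown is left to the optimizer.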

@chenghao-intel

Contributor Author

BTW, if I followed the current implementation pattern, I would also have to handle predicate pushdown for outer joins, as is done for inner joins. That would duplicate code that already lives in the optimizer and make it confusing.
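
To make the outer join concern concrete, here is a small Python sketch. It is an illustration only, not code from this PR, and `left_outer_hash_join` is a hypothetical helper. It shows a general property of outer joins: for a left outer join, a non-equality join condition has to be evaluated inside the join, because turning it into a filter after the join drops the null-extended rows. For an inner join the two placements give the same result, which is why the inner join handling cannot be reused unchanged.

```python
from collections import defaultdict


def left_outer_hash_join(left, right, left_key, right_key, condition=None):
    """Hash join that keeps every left row; unmatched rows pair with None."""
    # Build phase: index the right side by its join key.
    table = defaultdict(list)
    for r in right:
        table[right_key(r)].append(r)

    # Probe phase: look up each left row and apply the residual condition.
    out = []
    for l in left:
        matches = [
            r
            for r in table.get(left_key(l), [])
            if condition is None or condition(l, r)
        ]
        if matches:
            out.extend((l, r) for r in matches)
        else:
            out.append((l, None))  # null-extended row
    return out


left = [{"id": 1, "x": 5}, {"id": 2, "x": 0}]
right = [{"id": 1, "y": 10}, {"id": 2, "y": 20}]
key = lambda row: row["id"]
cond = lambda l, r: l["x"] > 1  # extra join condition besides id = id

# Condition evaluated inside the join: the second left row survives,
# paired with None.
inside = left_outer_hash_join(left, right, key, key, cond)

# Condition applied as a filter after the join: the second left row is lost.
plain = left_outer_hash_join(left, right, key, key)
post_filter = [(l, r) for l, r in plain if cond(l, r)]
```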

@marmbrus

Contributor

Okay, you've convinced me with the outer join argument. Please remove HashFilteredJoin, as it's pretty redundant with your pattern.

@marmbrus

Contributor

and please rebase to master.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@chenghao-intel

Contributor Author

Thank you @marmbrus, updated. Let's see the test results.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16136/

Contributor

Maybe the annotation on line 48 can be modified as well.

Contributor

Good catch.

@chenghao-intel

Contributor Author

Thanks, updated.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16177/

@marmbrus

Contributor

Thanks, merged into master.

@asfgit closed this in 981bde9 on Jun 27, 2014
@chenghao-intel deleted the extract_join_keys branch on June 27, 2014 05:04
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request on Sep 4, 2014:

Extract the join keys from equality conditions, that can be evaluated using equi-join.

Author: Cheng Hao <hao.cheng@intel.com>

Closes apache#1190 from chenghao-intel/extract_join_keys and squashes the following commits:

4a1060a [Cheng Hao] Fix some of the small issues
ceb4924 [Cheng Hao] Remove the redundant pattern of join keys extraction
cec34e8 [Cheng Hao] Update the code style issues
dcc4584 [Cheng Hao] Extract the joinkeys from join condition