[SPARK-2518][SQL] Fix foldability of Substring expression.#1432
Closed
ueshin wants to merge 1 commit into
Contributor
LGTM
QA tests have started for PR 1432. This patch merges cleanly.
Contributor
LGTM
QA results for PR 1432:
Contributor
Merging in master and branch-1.0. Thanks.
asfgit pushed a commit that referenced this pull request Jul 16, 2014
This is a follow-up of #1428.
Author: Takuya UESHIN <ueshin@happy-camper.st>
Closes #1432 from ueshin/issues/SPARK-2518 and squashes the following commits:
37d1ace [Takuya UESHIN] Fix foldability of Substring expression.
(cherry picked from commit cc965ee)
Signed-off-by: Reynold Xin <rxin@apache.org>
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
This is a follow-up of apache#1428.
Author: Takuya UESHIN <ueshin@happy-camper.st>
Closes apache#1432 from ueshin/issues/SPARK-2518 and squashes the following commits:
37d1ace [Takuya UESHIN] Fix foldability of Substring expression.
cloud-fan pushed a commit that referenced this pull request Nov 19, 2020
…id unnecessary sort operations
### What changes were proposed in this pull request?
This pull request tries to normalize the SortOrder properly to prevent unnecessary sort operators. Currently the sameOrderExpressions are not normalized as part of AliasAwareOutputOrdering.
Example: consider this join of three tables:
"""
|SELECT t2id, t3.id as t3id
|FROM (
| SELECT t1.id as t1id, t2.id as t2id
| FROM t1, t2
| WHERE t1.id = t2.id
|) t12, t3
|WHERE t1id = t3.id
""".
The plan for this looks like:
*(8) Project [t2id#1059L, id#1004L AS t3id#1060L]
+- *(8) SortMergeJoin [t2id#1059L], [id#1004L], Inner
   :- *(5) Sort [t2id#1059L ASC NULLS FIRST ], false, 0 <-----------------------------
   :  +- *(5) Project [id#1000L AS t2id#1059L]
   :     +- *(5) SortMergeJoin [id#996L], [id#1000L], Inner
   :        :- *(2) Sort [id#996L ASC NULLS FIRST ], false, 0
   :        :  +- Exchange hashpartitioning(id#996L, 5), true, [id=#1426]
   :        :     +- *(1) Range (0, 10, step=1, splits=2)
   :        +- *(4) Sort [id#1000L ASC NULLS FIRST ], false, 0
   :           +- Exchange hashpartitioning(id#1000L, 5), true, [id=#1432]
   :              +- *(3) Range (0, 20, step=1, splits=2)
   +- *(7) Sort [id#1004L ASC NULLS FIRST ], false, 0
      +- Exchange hashpartitioning(id#1004L, 5), true, [id=#1443]
         +- *(6) Range (0, 30, step=1, splits=2)
In this plan, the marked sort node could have been avoided as the data is already sorted on "t2.id" by the lower SortMergeJoin.
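The effect of normalizing sameOrderExpressions can be illustrated with a small, self-contained toy model. This is only a sketch and not Spark code: the `SortOrder` case class below merely borrows the field names, expressions are plain strings, and `normalizeChildOnly` / `normalizeAll` are hypothetical helpers. It also assumes that the inner SortMergeJoin reports its output as ordered by id#996L with id#1000L as an equivalent expression.

```scala
// Toy model; field names echo Spark's SortOrder, but this is NOT Spark code.
final case class SortOrder(child: String, sameOrderExpressions: Seq[String]) {
  // True if data sorted by this order is also sorted by `expr`.
  def satisfies(expr: String): Boolean =
    child == expr || sameOrderExpressions.contains(expr)
}

object AliasNormalization {
  // Hypothetical helper: rewrites only the primary expression through the
  // alias map, leaving the equivalent expressions untouched.
  def normalizeChildOnly(o: SortOrder, aliases: Map[String, String]): SortOrder =
    o.copy(child = aliases.getOrElse(o.child, o.child))

  // Hypothetical helper: rewrites the primary expression and every
  // equivalent expression through the alias map.
  def normalizeAll(o: SortOrder, aliases: Map[String, String]): SortOrder =
    SortOrder(
      aliases.getOrElse(o.child, o.child),
      o.sameOrderExpressions.map(e => aliases.getOrElse(e, e)))

  // Assumed output ordering of the inner SortMergeJoin [id#996L], [id#1000L].
  val joinOrdering: SortOrder = SortOrder("id#996L", Seq("id#1000L"))

  // Project [id#1000L AS t2id#1059L]
  val aliases: Map[String, String] = Map("id#1000L" -> "t2id#1059L")
}
```

In this model, `normalizeChildOnly(joinOrdering, aliases)` does not satisfy an ordering requirement on t2id#1059L, which corresponds to the extra Sort in the plan, while `normalizeAll(joinOrdering, aliases)` does.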
### Why are the changes needed?
To remove unneeded Sort operators.
### Does this PR introduce any user-facing change?
No
### How was this patch tested?
New UT added.
Closes #30302 from prakharjain09/SPARK-33400-sortorder.
Authored-by: Prakhar Jain <prakharjain09@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
This is a follow-up of #1428.
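The diff itself is not reproduced on this page, so the following is only a sketch of the general idea behind foldability, under the assumption that a Substring should count as foldable (evaluable once at planning time) exactly when all of its children are. The `Expr`, `Literal`, `Attribute`, and `Substring` types here are simplified stand-ins, not Catalyst's real classes.

```scala
// Toy expression tree; NOT Catalyst's real classes.
sealed trait Expr { def foldable: Boolean }

// A constant can always be evaluated ahead of time.
final case class Literal(value: Any) extends Expr {
  val foldable: Boolean = true
}

// A column reference depends on the input row, so it is never foldable.
final case class Attribute(name: String) extends Expr {
  val foldable: Boolean = false
}

// Assumption: Substring is foldable only when every child is foldable.
final case class Substring(str: Expr, pos: Expr, len: Expr) extends Expr {
  def foldable: Boolean = str.foldable && pos.foldable && len.foldable
}

object FoldabilityDemo {
  // All children are literals, so the whole expression is foldable.
  val constant: Expr = Substring(Literal("example"), Literal(0), Literal(2))

  // The string argument is a column, so the expression is not foldable.
  val perRow: Expr = Substring(Attribute("name"), Literal(0), Literal(2))
}
```

Deriving the flag from the children, rather than hard-coding it, keeps a constant-folding rule from treating a per-row expression as a constant.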