SPARK-2407: Added internal implementation of SQL SUBSTR()#1359
Conversation
|
Merged build started. |
|
Merged build triggered. |
|
QA tests have started for PR 1359. This patch merges cleanly. |
There was a problem hiding this comment.
Hey Will -- I think we need to check if resolved is true here to not break Catalyst's contract. Similar to here.
|
Hey @willb - thanks for working on this, which is going to be very useful for Spark SQL! I left a couple minor comments. Another general concern is the performance of |
|
Thanks for the comments, @concretevitamin. I've made the changes from your comments and will think about reducing branch overhead before pushing an update. |
|
@concretevitamin Inlining |
|
This sounds pretty cool! |
|
QA results for PR 1359: |
|
Merged build finished. All automated tests passed. |
|
All automated tests passed. |
|
Merged build triggered. |
|
Merged build started. |
|
QA tests have started for PR 1359. This patch merges cleanly. |
|
QA results for PR 1359: |
|
Merged build finished. All automated tests passed. |
|
All automated tests passed. |
There was a problem hiding this comment.
Would it be possible to do this without view bounds? I think those are going to disappear soon: https://issues.scala-lang.org/browse/SI-7629
There was a problem hiding this comment.
Yes, I think I can do it with implicit parameters. I'll push an update once I've run the Catalyst suite against my change.
|
QA tests have started for PR 1359. This patch merges cleanly. |
|
QA results for PR 1359: |
|
QA tests have started for PR 1359. This patch merges cleanly. |
|
QA results for PR 1359: |
|
Hey Will, this looks great. Thanks for implementing! Mind fixing the conflict with master so I can merge? |
This replaces the Hive UDF for SUBSTR(ING) with an implementation in Catalyst and adds tests to verify correct operation. Squashes 8969d038 and e6419b4e.
* orders imports in stringOperations.scala * Substring.dataType throws exception if children are unresolved * inlines Substring.slice (~11.5% performance improvement on microbenchmark runs) * adds a special `toString` case for two-argument SUBSTR expressions * removes spurious I_ prefix to SUBSTR(ING) in HiveQL.scala Thanks to @concretevitamin for prompt and useful feedback!
|
add to whitelist |
|
QA tests have started for PR 1359. This patch merges cleanly. |
|
QA results for PR 1359: |
|
I'm going to go ahead and merge this. Someone can make a follow-up with nullability narrowing. |
|
Thanks! Merged into master and 1.0. |
This replaces the Hive UDF for SUBSTR(ING) with an implementation in Catalyst and adds tests to verify correct operation. Author: William Benton <willb@redhat.com> Closes #1359 from willb/internalSqlSubstring and squashes the following commits: ccedc47 [William Benton] Fixed too-long line. a30a037 [William Benton] replace view bounds with implicit parameters ec35c80 [William Benton] Adds fixes from review: 4f3bfdb [William Benton] Added internal implementation of SQL SUBSTR() (cherry picked from commit 61de65b) Signed-off-by: Michael Armbrust <michael@databricks.com>
This is a follow-up of #1359 with nullability narrowing. Author: Takuya UESHIN <ueshin@happy-camper.st> Closes #1426 from ueshin/issues/SPARK-2504 and squashes the following commits: 5157832 [Takuya UESHIN] Remove unnecessary white spaces. 80958ac [Takuya UESHIN] Fix nullability of Substring expression. (cherry picked from commit 632fb3d) Signed-off-by: Reynold Xin <rxin@apache.org>
This is a follow-up of #1359 with nullability narrowing. Author: Takuya UESHIN <ueshin@happy-camper.st> Closes #1426 from ueshin/issues/SPARK-2504 and squashes the following commits: 5157832 [Takuya UESHIN] Remove unnecessary white spaces. 80958ac [Takuya UESHIN] Fix nullability of Substring expression.
|
hi, it is really very useful for us, i tried this implementation from @willb , in spark-shell, i still got java.lang.UnsupportedOperationException by Query Plan, i made some change in SqlParser: chutium@1de83a7 |
|
Awesome! Please submit a pull request with that addition.
|
|
PR submitted #1442 |
follow-up of #1359 Author: chutium <teng.qiu@gmail.com> Closes #1442 from chutium/master and squashes the following commits: b49cc8a [chutium] SPARK-2407: Added Parser of SQL SUBSTRING() #1442 9a60ccf [chutium] SPARK-2407: Added Parser of SQL SUBSTR() #1442 06e933b [chutium] Merge https://github.com/apache/spark c870172 [chutium] Merge https://github.com/apache/spark 094f773 [chutium] Merge https://github.com/apache/spark 88cb37d [chutium] Merge https://github.com/apache/spark 1de83a7 [chutium] SPARK-2407: Added Parse of SQL SUBSTR()
This replaces the Hive UDF for SUBSTR(ING) with an implementation in Catalyst and adds tests to verify correct operation. Author: William Benton <willb@redhat.com> Closes apache#1359 from willb/internalSqlSubstring and squashes the following commits: ccedc47 [William Benton] Fixed too-long line. a30a037 [William Benton] replace view bounds with implicit parameters ec35c80 [William Benton] Adds fixes from review: 4f3bfdb [William Benton] Added internal implementation of SQL SUBSTR()
This is a follow-up of apache#1359 with nullability narrowing. Author: Takuya UESHIN <ueshin@happy-camper.st> Closes apache#1426 from ueshin/issues/SPARK-2504 and squashes the following commits: 5157832 [Takuya UESHIN] Remove unnecessary white spaces. 80958ac [Takuya UESHIN] Fix nullability of Substring expression.
follow-up of apache#1359 Author: chutium <teng.qiu@gmail.com> Closes apache#1442 from chutium/master and squashes the following commits: b49cc8a [chutium] SPARK-2407: Added Parser of SQL SUBSTRING() apache#1442 9a60ccf [chutium] SPARK-2407: Added Parser of SQL SUBSTR() apache#1442 06e933b [chutium] Merge https://github.com/apache/spark c870172 [chutium] Merge https://github.com/apache/spark 094f773 [chutium] Merge https://github.com/apache/spark 88cb37d [chutium] Merge https://github.com/apache/spark 1de83a7 [chutium] SPARK-2407: Added Parse of SQL SUBSTR()
This replaces the Hive UDF for SUBSTR(ING) with an implementation in Catalyst
and adds tests to verify correct operation.