[SPARK-26503][CORE] Get rid of spark.sql.legacy.timeParser.enabled#23495
[SPARK-26503][CORE] Get rid of spark.sql.legacy.timeParser.enabled#23495srowen wants to merge 2 commits into
Conversation
|
Test build #100949 has finished for PR 23495 at commit
|
|
retest this please |
|
Test build #100955 has finished for PR 23495 at commit
|
| @@ -1452,105 +1452,103 @@ class JsonSuite extends QueryTest with SharedSQLContext with TestJsonData { | |||
| } | |||
|
|
|||
| test("backward compatibility") { | |||
There was a problem hiding this comment.
As far as I remember this test requires https://github.com/apache/spark/pull/23495/files#diff-7f589e01d3e5e5ea284c1622527d4984L85 . I don't think it is possible to pass the test on new parser.
|
Test build #100972 has finished for PR 23495 at commit
|
|
Merged to master |
…rser.enabled ## What changes were proposed in this pull request? The SQL config `spark.sql.legacy.timeParser.enabled` was removed by #23495. The PR cleans up the SQL migration guide and the comment for `UnixTimestamp`. Closes #23529 from MaxGekk/get-rid-off-legacy-parser-followup. Authored-by: Maxim Gekk <max.gekk@gmail.com> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
## What changes were proposed in this pull request? Per discussion in apache#23391 (comment) this proposes to just remove the old pre-Spark-3 time parsing behavior. This is a rebase of apache#23411 ## How was this patch tested? Existing tests. Closes apache#23495 from srowen/SPARK-26503.2. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>
…rser.enabled ## What changes were proposed in this pull request? The SQL config `spark.sql.legacy.timeParser.enabled` was removed by apache#23495. The PR cleans up the SQL migration guide and the comment for `UnixTimestamp`. Closes apache#23529 from MaxGekk/get-rid-off-legacy-parser-followup. Authored-by: Maxim Gekk <max.gekk@gmail.com> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
| // in the JSON object. | ||
| // - For Spark before 1.5.1, we do not generate UDTs. So, we manually added the UDT value to | ||
| // JSON objects generated by those Spark versions (col17). | ||
| // - If the type is NullType, we do not write data out. |
There was a problem hiding this comment.
What is the reason we removed this test case? This test case sounds very critical.
There was a problem hiding this comment.
We have to capture all such changes before the final release of Spark 3.0. We can delay the release but we have to capture all such changes. Please let us know if you are aware of any similar change we made in the upcoming 3.0 release.
There was a problem hiding this comment.
I think we need to add more such test cases for ensuring we do not break the backward compatibility for all the built-in data sources.
There was a problem hiding this comment.
The test is gone because the old behavior is gone; that's all that's going on here.
See the OP with a link to the actual change. The key discussions were:
#23391 (comment)
#23391 (comment)
I think the TL;DR is that the legacy behavior is error-prone and already susceptible to getting wrong answers for old dates. That seems worth 'fixing' despite the forced behavior change.
There was a problem hiding this comment.
Based on the latest discussions in #27710 (comment), we can't silently return the wrong results. The backward compatibility is very critical.
There was a problem hiding this comment.
I defer to @MaxGekk and @cloud-fan on that thread. It's trading one set of problems for another but it could be the right thing. We will never get rid of the legacy behavior now, I'm pretty sure :)
What changes were proposed in this pull request?
Per discussion in #23391 (comment) this proposes to just remove the old pre-Spark-3 time parsing behavior.
This is a rebase of #23411
How was this patch tested?
Existing tests.