[SPARK-10289] [SQL] A direct write API for testing Parquet #8454
Closed
liancheng wants to merge 2 commits into
|
Test build #41613 has started for PR 8454 at commit |
149c23c to
85747e4
|
Test build #41618 has finished for PR 8454 at commit
|
Contributor
Why the specific imports?
Contributor
Author
I thought we should be explicit and avoid wildcard imports according to our style guide. But just realized it's OK to have them for implicit methods.
Contributor
|
Seems useful, and only touches test code, so I'm gonna merge into master and 1.5 |
Contributor
|
Actually does not apply cleanly to branch-1.5, so I'll hold off. |
Contributor
Author
|
It's OK not to have this merged into branch-1.5. I've resolved SPARK-10289. |
asfgit
pushed a commit
that referenced
this pull request
Sep 9, 2015
…or nested structs

We used to work around SPARK-10301 with a quick fix in branch-1.5 (PR #8515), but it doesn't cover the case described in SPARK-10428. So this PR backports PR #8509, which had once been considered too big a change to be merged into branch-1.5 at the last minute, to fix both SPARK-10301 and SPARK-10428 for Spark 1.5. Also added more test cases for SPARK-10428.

This PR looks big, but the essential change is only ~200 loc. All other changes are for testing. In particular, PR #8454 is also backported here because the `ParquetInteroperabilitySuite` introduced in PR #8515 depends on it. This should be safe since #8454 only touches testing code.

Author: Cheng Lian <lian@databricks.com>

Closes #8583 from liancheng/spark-10301/for-1.5.
ashangit
pushed a commit
to ashangit/spark
that referenced
this pull request
Oct 19, 2016
…or nested structs

We used to work around SPARK-10301 with a quick fix in branch-1.5 (PR apache#8515), but it doesn't cover the case described in SPARK-10428. So this PR backports PR apache#8509, which had once been considered too big a change to be merged into branch-1.5 at the last minute, to fix both SPARK-10301 and SPARK-10428 for Spark 1.5. Also added more test cases for SPARK-10428.

This PR looks big, but the essential change is only ~200 loc. All other changes are for testing. In particular, PR apache#8454 is also backported here because the `ParquetInteroperabilitySuite` introduced in PR apache#8515 depends on it. This should be safe since apache#8454 only touches testing code.

Author: Cheng Lian <lian@databricks.com>

Closes apache#8583 from liancheng/spark-10301/for-1.5.

(cherry picked from commit fca16c5)

Conflicts: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystReadSupport.scala
This PR introduces a direct write API for testing Parquet. It's a DSL-flavored version of the `writeDirect` method that comes with the parquet-avro testing code. With this API, it's much easier to construct arbitrary Parquet structures. It's especially useful when adding regression tests for various compatibility corner cases.

Sample usage of this API can be found in the new test case added in `ParquetThriftCompatibilitySuite`.
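
For readers who haven't seen the pattern, here is a rough sketch of what a DSL-flavored direct write helper can look like. This is not the code from this PR: `LoggingConsumer` is a hypothetical stand-in that only mimics a few method names of Parquet's `RecordConsumer` and records the calls, and `message`, `group`, `field`, and `demo` are illustrative names assumed for this sketch. The point is that each helper brackets its body with the matching start/end call, so nested Parquet structures are written as nested blocks.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a Parquet record consumer: it writes nothing,
// it just logs the calls it receives so the nesting can be inspected.
class LoggingConsumer {
  val events = ArrayBuffer.empty[String]
  def startMessage(): Unit = events += "startMessage"
  def endMessage(): Unit = events += "endMessage"
  def startGroup(): Unit = events += "startGroup"
  def endGroup(): Unit = events += "endGroup"
  def startField(name: String, index: Int): Unit = events += s"startField($name, $index)"
  def endField(name: String, index: Int): Unit = events += s"endField($name, $index)"
  def addInteger(value: Int): Unit = events += s"addInteger($value)"
}

object DirectWriteSketch {
  // Each helper emits the start call, runs the body, then emits the end call.
  def message(c: LoggingConsumer)(body: => Unit): Unit = {
    c.startMessage(); body; c.endMessage()
  }

  def group(c: LoggingConsumer)(body: => Unit): Unit = {
    c.startGroup(); body; c.endGroup()
  }

  def field(c: LoggingConsumer, name: String, index: Int)(body: => Unit): Unit = {
    c.startField(name, index); body; c.endField(name, index)
  }

  // Writes one record shaped like: message { f: group { a: 1 } }
  def demo(): List[String] = {
    val c = new LoggingConsumer
    message(c) {
      field(c, "f", 0) {
        group(c) {
          field(c, "a", 0) {
            c.addInteger(1)
          }
        }
      }
    }
    c.events.toList
  }
}
```

Calling `DirectWriteSketch.demo()` returns the logged calls in order, which makes it easy to check that every start call is closed by its matching end call. For the actual API, see the test case in `ParquetThriftCompatibilitySuite` mentioned above.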