Skip to content

[SPARK-10304][SQL] Partition discovery should throw an exception if the dir structure is invalid#8840

Closed
viirya wants to merge 7 commits into
apache:masterfrom
viirya:detect_invalid_part_dir
Closed

[SPARK-10304][SQL] Partition discovery should throw an exception if the dir structure is invalid#8840
viirya wants to merge 7 commits into
apache:masterfrom
viirya:detect_invalid_part_dir

Conversation

@viirya

@viirya viirya commented Sep 19, 2015

Copy link
Copy Markdown
Member

JIRA: https://issues.apache.org/jira/browse/SPARK-10304

This patch detects if the structure of partition directories is not valid.

The test cases are from #8547. Thanks @zhzhan.

cc @liancheng

@SparkQA

SparkQA commented Sep 19, 2015

Copy link
Copy Markdown

Test build #42708 has finished for PR 8840 at commit 2d13156.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya

viirya commented Sep 19, 2015

Copy link
Copy Markdown
Member Author

retest this please.

@SparkQA

SparkQA commented Sep 19, 2015

Copy link
Copy Markdown

Test build #42718 has finished for PR 8840 at commit 2d13156.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen

Copy link
Copy Markdown
Contributor

Jenkins, retest this please.

@JoshRosen

Copy link
Copy Markdown
Contributor

@chenghao-intel, it looks like this PR's error message improvements are similar to the ones that you added as part of #8026. @viirya, could you take a look at @chenghao-intel's PR to see which approach you like better?

@SparkQA

SparkQA commented Oct 20, 2015

Copy link
Copy Markdown

Test build #44001 has finished for PR 8840 at commit 2d13156.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya

viirya commented Oct 21, 2015

Copy link
Copy Markdown
Member Author

@JoshRosen ok, I will take a look at it.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this assert out of the resolvePartitions, since it's not so tight with the original purpose of this function, and we can avoid the code change for this function.

@SparkQA

SparkQA commented Oct 24, 2015

Copy link
Copy Markdown

Test build #44295 has finished for PR 8840 at commit 779fbd2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 25, 2015

Copy link
Copy Markdown

Test build #44316 has finished for PR 8840 at commit cdf6dc4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya

viirya commented Oct 26, 2015

Copy link
Copy Markdown
Member Author

@chenghao-intel @JoshRosen any comments? Is this patch ready?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we writhe these like this?

val (basePaths, pathsWithPartitionValues) = paths.flatMap {
}.unzip

@davies

davies commented Nov 3, 2015

Copy link
Copy Markdown
Contributor

LGTM

@SparkQA

SparkQA commented Nov 3, 2015

Copy link
Copy Markdown

Test build #44888 has finished for PR 8840 at commit 3db4226.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya

viirya commented Nov 3, 2015

Copy link
Copy Markdown
Member Author

retest this please.

@SparkQA

SparkQA commented Nov 3, 2015

Copy link
Copy Markdown

Test build #44906 has finished for PR 8840 at commit 3db4226.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@asfgit asfgit closed this in d6035d9 Nov 3, 2015
@rxin

rxin commented Nov 3, 2015

Copy link
Copy Markdown
Contributor

Should this one be backported to branch-1.5?

cc @liancheng @yhuai

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give a case that we will hit this branch? What will basePaths be at here?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which case is for the one I mentioned in the jira?

@yhuai

yhuai commented Nov 3, 2015

Copy link
Copy Markdown
Contributor

@rxin I think it is fine to not backport it. Without this, a user will get the DataFrame and when he/she queries it, it will fail (and the error message does not say what's the cause). Since, it does not really let users make any real progress, I think we do not really need to backport it.

@rxin

rxin commented Nov 3, 2015

Copy link
Copy Markdown
Contributor

OK great. @viirya can you submit a followup pr to address @yhuai's feedback.

@viirya

viirya commented Nov 4, 2015

Copy link
Copy Markdown
Member Author

@rxin @yhuai OK. Submitted a following pr at #9459.

asfgit pushed a commit that referenced this pull request Nov 4, 2015
…ition discovery

This patch follows up #8840.

Author: Liang-Chi Hsieh <viirya@appier.com>

Closes #9459 from viirya/detect_invalid_part_dir_following.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need !columns.isEmpty?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i see. It is for something like table/a=1/_temporary/something, right?

@viirya viirya deleted the detect_invalid_part_dir branch December 27, 2023 18:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants