[SPARK-32162][PYTHON][TESTS] Improve error message of Pandas grouped map test with window #28987

Closed
BryanCutler wants to merge 3 commits into apache:master from BryanCutler:pandas-grouped-map-test-output-SPARK-32162

Conversation

@BryanCutler

Member

What changes were proposed in this pull request?

Improve the error message in test GroupedMapInPandasTests.test_grouped_over_window_with_key to show the incorrect values.

Why are the changes needed?

This test failure has come up often in Arrow testing because it tests a struct with timestamp values through a Pandas UDF. The current error message is not helpful as it doesn't show the incorrect values, only that it failed. This change will instead raise an assertion error with the incorrect values on a failure.

Before:

======================================================================
FAIL: test_grouped_over_window_with_key (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/spark/python/pyspark/sql/tests/test_pandas_grouped_map.py", line 588, in test_grouped_over_window_with_key
    self.assertTrue(all([r[0] for r in result]))
AssertionError: False is not true

After:

======================================================================
ERROR: test_grouped_over_window_with_key (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests)
----------------------------------------------------------------------
...
AssertionError: {'start': datetime.datetime(2018, 3, 20, 0, 0), 'end': datetime.datetime(2018, 3, 25, 0, 0)}, != {'start': datetime.datetime(2020, 3, 20, 0, 0), 'end': datetime.datetime(2020, 3, 25, 0, 0)}
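
To illustrate the difference between the two outputs, here is a minimal, standalone sketch. It is not the actual diff from this PR: the helper names (`check_opaque`, `check_verbose`, `failure_message`) are hypothetical, and the sample values are only borrowed from the "After" output above. It assumes nothing beyond the Python standard library.

```python
import datetime
import unittest

# Illustrative expected and observed window keys (hypothetical data that
# mirrors the values in the "After" output).
expected = {"start": datetime.datetime(2018, 3, 20, 0, 0),
            "end": datetime.datetime(2018, 3, 25, 0, 0)}
actual = {"start": datetime.datetime(2020, 3, 20, 0, 0),
          "end": datetime.datetime(2020, 3, 25, 0, 0)}


def check_opaque(pairs):
    """Old style: collapse every comparison into a single boolean."""
    tc = unittest.TestCase()
    tc.assertTrue(all(e == a for e, a in pairs))


def check_verbose(pairs):
    """New style: stop at the first mismatch and report both values."""
    for e, a in pairs:
        assert e == a, "{} != {}".format(e, a)


def failure_message(check, pairs):
    """Run a check and return the text of its AssertionError, or None."""
    try:
        check(pairs)
    except AssertionError as exc:
        return str(exc)
    return None


pairs = [(expected, actual)]
print(failure_message(check_opaque, pairs))   # generic message, no values
print(failure_message(check_verbose, pairs))  # message contains both dicts
```

The first check can only say that something was false, while the second names the mismatching values, which is what makes a failure in an Arrow integration run diagnosable from the log alone.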

Does this PR introduce any user-facing change?

No

How was this patch tested?

Improved existing test

@BryanCutler

Member Author

This is currently being looked at in apache/arrow#7604

Comment thread python/pyspark/sql/tests/test_pandas_grouped_map.py Outdated
@BryanCutler

Member Author

ping @HyukjinKwon please take a look, thanks!

@HyukjinKwon HyukjinKwon left a comment

Member

LGTM

@SparkQA

SparkQA commented Jul 5, 2020

Test build #124929 has finished for PR 28987 at commit 70da8b5.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon

Member

retest this please

@SparkQA

SparkQA commented Jul 6, 2020

Test build #124992 has finished for PR 28987 at commit 70da8b5.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon

Member

retest this please

@SparkQA

SparkQA commented Jul 6, 2020

Test build #125000 has finished for PR 28987 at commit 70da8b5.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon

Member

retest this please

@SparkQA

SparkQA commented Jul 6, 2020

Test build #125011 has finished for PR 28987 at commit 70da8b5.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon

Member

retest this please

@SparkQA

SparkQA commented Jul 6, 2020

Test build #125020 has finished for PR 28987 at commit 70da8b5.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@BryanCutler

Member Author

retest this please

@SparkQA

SparkQA commented Jul 6, 2020

Test build #125028 has finished for PR 28987 at commit 70da8b5.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon

Member

I think the machines became very slow for some reason.

@HyukjinKwon

Member

retest this please

@SparkQA

SparkQA commented Jul 6, 2020

Test build #125031 has finished for PR 28987 at commit 70da8b5.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon

Member

retest this please

@SparkQA

SparkQA commented Jul 6, 2020

Test build #125050 has finished for PR 28987 at commit 70da8b5.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon

Member

retest this please

@SparkQA

SparkQA commented Jul 6, 2020

Test build #125061 has finished for PR 28987 at commit 70da8b5.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon

Member

retest this please

@SparkQA

SparkQA commented Jul 6, 2020

Test build #125068 has finished for PR 28987 at commit 70da8b5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon

Member

Woah, finally

@HyukjinKwon

Member

Merged to master and branch-3.0.

HyukjinKwon pushed a commit that referenced this pull request Jul 6, 2020
…map test with window

Closes #28987 from BryanCutler/pandas-grouped-map-test-output-SPARK-32162.

Authored-by: Bryan Cutler <cutlerb@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
@BryanCutler

Member Author

Wow, Jenkins must have been slow getting back to work after the long weekend. Thanks for the help, @HyukjinKwon!

@BryanCutler BryanCutler deleted the pandas-grouped-map-test-output-SPARK-32162 branch July 6, 2020 20:57