Run doctests.#9815
Conversation
|
All tests are now passing locally so I'm opening the draft for review. Thanks so much to everyone who contributed a fix for one or more of the issues this caught! |
Codecov Report
@@ Coverage Diff @@
## branch-22.02 #9815 +/- ##
================================================
- Coverage 10.49% 10.41% -0.08%
================================================
Files 119 119
Lines 20305 20541 +236
================================================
+ Hits 2130 2139 +9
- Misses 18175 18402 +227
Continue to review full report at Codecov.
|
vyasr
left a comment
There was a problem hiding this comment.
Looks pretty good to me! I just have a few small questions.
| >>> df = df.set_index([3, 2, 1, 0]) | ||
| ... 'y': [1.0, 3.3, 2.2, 4.4], | ||
| ... 'z': ['a', 'b', 'c', 'd']}) | ||
| >>> df = df.set_index(cudf.Series([3, 2, 1, 0])) |
There was a problem hiding this comment.
Is this necessary?
| >>> df = df.set_index(cudf.Series([3, 2, 1, 0])) | |
| >>> df = df.set_index([3, 2, 1, 0]) |
There was a problem hiding this comment.
Still confused by what caused this to be introduced.
There was a problem hiding this comment.
Missed this comment in the previous round. Without cudf.Series(...), this raises:
KeyError: 'None of [3, 2, 1, 0] are in the columns'
cudf/python/cudf/cudf/core/dataframe.py
Lines 2397 to 2398 in b01c846
Seems like a bug to me.
There was a problem hiding this comment.
Nope, pandas does the same.
>>> import pandas as pd
>>> df = pd.DataFrame({"a": [1, 2, 3, 4]})
>>> df
a
0 1
1 2
2 3
3 4
>>> df.set_index([3, 2, 0, 1])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File ".../site-packages/pandas/util/_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File ".../site-packages/pandas/core/frame.py", line 5451, in set_index
raise KeyError(f"None of {missing} are in the columns")
KeyError: 'None of [3, 2, 0, 1] are in the columns'There was a problem hiding this comment.
Wow that is a terrible API. Passing a Series makes it become the index, while passing an iterable is assumed to mean you're passing an iterable of strings indicating the names of the columns that should become the new index. Well I see why the change is necessary here.
vyasr
left a comment
There was a problem hiding this comment.
There's one outstanding question from my last review, but nothing worth holding this up over. Address if you can. Nice work!
|
@shwina @vyasr FYI, I pushed two more commits that were needed to fix some issues. Feel free to review:
|
|
|
||
| # Capture stdout and include failing outputs in the traceback. | ||
| doctest_stdout = io.StringIO() | ||
| with contextlib.redirect_stdout(doctest_stdout): |
There was a problem hiding this comment.
Do we also need to redirect stderr, or does runner.run automatically do a 2>&1 internally?
There was a problem hiding this comment.
It probably doesn’t matter, the part I want to capture is always sent to stdout.
There was a problem hiding this comment.
The doctest runner just writes directly to stdout when failures occur, rather than raising or providing a helpful object with the failure info. Getting that into a traceback requires the capture.
|
@bdice the changes look good to me, just one question about output capturing. |
|
@gpucibot merge |
Similar to rapidsai/cudf#9815, this change uses doctest to test that the pylibraft example docstrings run without issue. This caught several errors in the example docstrings, that are also fixed in this PR: * a missing ‘device_ndarray’ import in kmeans fit when the centroids weren’t explicitly passed in * an error in the fused_l2_nn_argmin docstring where output wasn’t defined * An `AttributeError: module 'pylibraft.neighbors.ivf_pq' has no attribute 'np'` error in ivf_pq Closes rapidsai#981
Similar to rapidsai/cudf#9815, this change uses doctest to test that the pylibraft example docstrings run without issue. This caught several errors in the example docstrings, that are also fixed in this PR: * a missing ‘device_ndarray’ import in kmeans fit when the centroids weren’t explicitly passed in * an error in the fused_l2_nn_argmin docstring where output wasn’t defined * An `AttributeError: module 'pylibraft.neighbors.ivf_pq' has no attribute 'np'` error in ivf_pq Closes #981 Authors: - Ben Frederickson (https://github.com/benfred) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: #1073
Similar to rapidsai/cudf#9815, this change uses doctest to test that the pylibraft example docstrings run without issue. This caught several errors in the example docstrings, that are also fixed in this PR: * a missing ‘device_ndarray’ import in kmeans fit when the centroids weren’t explicitly passed in * an error in the fused_l2_nn_argmin docstring where output wasn’t defined * An `AttributeError: module 'pylibraft.neighbors.ivf_pq' has no attribute 'np'` error in ivf_pq Closes rapidsai#981 Authors: - Ben Frederickson (https://github.com/benfred) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#1073
This PR adds doctests and resolves #9513. Several issues were found by running doctests that have now been resolved:
Series.applymap#9821topin categorical column #9825