-
Notifications
You must be signed in to change notification settings - Fork 17
separate cupy import from rapids #211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
karlhigley
merged 7 commits into
NVIDIA-Merlin:main
from
jperez999:change-dispatch-imports
Feb 13, 2023
Merged
Changes from 6 commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
2cdb6af
separate cupy import from rapids
jperez999 95f3454
change ... to pass for better readability
jperez999 f04d237
make other the priority
jperez999 14595f5
fix reference to cudf without ensuring cudf import
jperez999 fd322e5
remove changes for incorrectly added commit
jperez999 8453dbe
Merge branch 'main' into change-dispatch-imports
jperez999 41da7a2
Merge branch 'main' into change-dispatch-imports
karlhigley File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing this line will change the meaning of this
HAS_GPUvariable. Currently it represents a combination both whether or not a GPU is available and whether cudf is installed and available.And I think this is expected by it's use both here in Core, NVTabular and possibly other places.
We've tried changing this line to
passin #99 , but then subsequently reverted in #112 after various errors started showing up in places where the assumptions about this variable started to breakdown.I think changing this line or changing the name of this variable would be an improvement, since it's name does not capture the full sense of what it is intended to capture. A clearer name would be something like
HAS_GPU_AND_CUDF. However, making this change will require furtther changes to where this variable is referenced across our libraries.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good insight, but I have not seen this breakdown. And I think it is ok because we have a change for dispatch make_df that checks both HAS_GPU and if cudf is not None. So by breaking it up, it makes it more clear I think. Where to use cudf you have to have both HAS_GPU and cudf needs to be available. Maybe you can show me where it actually trips up. I think in dispatch (and anywhere else we use it) we should check for both HAS_GPU and cudf before using cudf. to do something. We should really not be using cudf directly anywhere since, making a DF or a Series is covered in this dispatch. I have run some secondary tests to check the other libraries and I have not hit any errors. It would be great if you could tell me where you hit the errors that required #112 that would be helpful maybe we can fix those in other parts of the code. But I believe that HAS_GPU, in its current state in merlin.core.compat only detects if a GPU is available. So whether or not you have CUDF, that parameter should always represent if GPUs are available in the environment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One example from this file are these lines where we check this HAS_GPU variable and assume that cudf will be availble if it's True:
https://github.com/NVIDIA-Merlin/core/blob/v0.10.0/merlin/core/dispatch.py#L79-L81
And one other use as you mentioned is the
make_dffunction with the device parameter.In general the environment we need to check against is where GPUs exist (found by NVML device get count - resulting in the compat HAS_GPU variable being True), and where cudf is not installed. This is a setup we don't currently test against. If we can get all the libraries to work in this environment with this change then we should be ok to make this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was an excellent example. I have gone ahead and changed the code to match. Fortunately this is just used for typing and it is not referenced in dispatch or any other part of core. It seems to be used heavily by models and nvtabular for typing. I did a quick search and couldnt find any other locations in merlin where an import of cudf changes HAS_GPU or where they depend on each other. Let me know if you can find anymore.