Support Dataset cpu-mode in environment with GPUs that have not been detected #236
karlhigley left a comment
Everything outside dispatch.py LGTM, but I have concerns about using the GPU classes there
python -m pytest --cov-report term --cov merlin -rxs tests/unit

[testenv:test-gpu-not-visible]
Co-authored-by: Karl Higley <kmhigley@gmail.com>
output_col_types: List[Type] = []

- if cp:
+ if cp and HAS_GPU:
Is there a way we could put the HAS_GPU in merlin.core.compat so we don't have to repeat it in so many places?
Are you suggesting making cp be None when HAS_GPU is False, or something else?
I think so? Can we import CuPy successfully if we don't have a GPU?
That depends on what "don't have a GPU" means.
If CuPy or cudf fail to find the relevant CUDA drivers, they raise an ImportError, which we catch in compat, and cp will be None. However, if we set CUDA_VISIBLE_DEVICES= in an environment that has the relevant CUDA drivers and libs available, then the import succeeds and cp holds the cupy module, but trying to use it, e.g. cupy.array(...), raises a CUDARuntimeError.
Is it possible that this is a misuse of CUDA_VISIBLE_DEVICES and not a scenario we would like to support?
In the latter case, is HAS_GPU False? If so, then it seems like we might be able to use it in compat.
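For illustration, the idea discussed in this thread might look roughly like the sketch below. It is not the actual merlin.core.compat code, and it does not reflect how HAS_GPU is really computed there. The helper names optional_import and gpu_is_usable are hypothetical; the only CuPy call used is cupy.cuda.runtime.getDeviceCount(). The sketch treats "imported but unusable" (for example with CUDA_VISIBLE_DEVICES set to an empty value) the same as "not installed", so that callers only need to check cp.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def gpu_is_usable(cp_module):
    """Hypothetical check: True only if the module can see at least one device.

    With no visible devices, cupy.cuda.runtime.getDeviceCount() is expected
    to raise a CUDARuntimeError, which is treated here as "no GPU".
    """
    if cp_module is None:
        return False
    try:
        return cp_module.cuda.runtime.getDeviceCount() > 0
    except Exception:
        return False


cp = optional_import("cupy")
HAS_GPU = gpu_is_usable(cp)

# Assumption for this sketch: hide the module when no device is usable,
# so `if cp:` alone is enough and `if cp and HAS_GPU:` need not be repeated.
if not HAS_GPU:
    cp = None
```

With something like this in one place, call sites could keep the shorter `if cp:` check. Whether hiding an importable module is desirable is exactly the open question in the thread.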
This test environment looks like it breaks some of the Dask cluster tests.
Support Dataset cpu-mode in an environment with GPUs that have not been detected, such as when the CUDA visible devices environment variable is empty (CUDA_VISIBLE_DEVICES="").
Currently you'll get an error like the following in the case where cudf has imported successfully while HAS_GPU is False.