While attempting to benchmark NVIDIA-Merlin/NVTabular#1687, I discovered that the dask-criteo benchmark does not work with the latest version of NVTabular/Merlin-core.
As far as I can tell, the problem is that #98 added the following logic to detect GPU availability: `HAS_GPU = len(cuda.gpus.lst) > 0`. This logic works just fine within a local process, but it breaks Dask-CUDA device pinning when it is executed by a top-level import (or anywhere in the global context of the program). In other words, code like this shouldn't run as a side effect of an import statement like `from merlin.core.compat import HAS_GPU`.
The problem becomes apparent in a simple (Merlin-free) reproducer:
```python
# reproducer.py
from dask_cuda import LocalCUDACluster
from numba import cuda  # This is fine

HAS_GPU = len(cuda.gpus.lst) > 0  # This is not fine

if __name__ == "__main__":
    cluster = LocalCUDACluster()
```

If you execute `python ./reproducer.py`, you will see warnings like:
```
/.../distributed/distributed/comm/ucx.py:67: UserWarning: Worker with process ID 49507 should have a CUDA context assigned to device 1, but instead the CUDA context is on device 0. This is often the result of a CUDA-enabled library calling a CUDA runtime function before Dask-CUDA can spawn worker processes. Please make sure any such function calls don't happen at import time or in the global scope of a program.
```
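As the warning suggests, one way to avoid the problem is to defer the device query until it is actually needed, rather than running it at import time. The following is a minimal sketch (not Merlin's actual fix; the module and function names are hypothetical) of wrapping the same check in a cached function, so that importing the module never touches the CUDA runtime:

```python
# safe_compat.py -- hypothetical sketch: defer GPU detection so that merely
# importing this module never initializes a CUDA context.
import functools


@functools.lru_cache(maxsize=None)
def has_gpu() -> bool:
    """Return True if at least one CUDA device is visible.

    The numba import and the device query happen only on the first call,
    never at import time, so Dask-CUDA workers can pin their devices
    before any CUDA context is created.
    """
    try:
        from numba import cuda  # imported lazily, inside the function

        return len(cuda.gpus.lst) > 0
    except Exception:
        # No CUDA toolkit, no driver, or no visible devices.
        return False
```

Callers would then use `has_gpu()` instead of a module-level `HAS_GPU` constant, keeping the check out of the global scope of any program that imports the module.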