Run (most) CUDA tests in pixi #9285

Merged
crusaderky merged 1 commit into dask:main from crusaderky:cuda
Jun 4, 2026

@crusaderky

Collaborator

Add a pixi task to run CUDA tests.
Note that this task does not run on GitHub Actions, as there are no CUDA-enabled VMs in the free tier.

Known issues

  • Could not test dask-cuda and dask-cudf, as they contain an exact pin of dask-core, which conflicts with version 2099.0.0 set by ../dask/pixi.toml. A workaround would have been to write ad-hoc pixi-build recipes for these two projects, but that felt like overengineering to me.
  • distributed/diagnostics/tests/test_nvml.py::test_visible_devices_uuid fails on my RTX 3080 on Linux 64 due to an upstream issue in pynvml:
>>> import pynvml
>>> pynvml.nvmlInit()
>>> h = pynvml.nvmlDeviceGetHandleByIndex(0)
>>> pynvml.nvmlDeviceGetSerial(h)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    pynvml.nvmlDeviceGetSerial(h)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
  File "/home/crusaderky/github/distributed/.pixi/envs/cuda/lib/python3.14/site-packages/pynvml.py", line 2946, in wrapper
    res = func(*args, **kwargs)
  File "/home/crusaderky/github/distributed/.pixi/envs/cuda/lib/python3.14/site-packages/pynvml.py", line 3345, in nvmlDeviceGetSerial
    _nvmlCheckReturn(ret)
    ~~~~~~~~~~~~~~~~^^^^^
  File "/home/crusaderky/github/distributed/.pixi/envs/cuda/lib/python3.14/site-packages/pynvml.py", line 1098, in _nvmlCheckReturn
    raise NVMLError(ret)
pynvml.NVMLError_NotSupported: Not Supported
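
For illustration only, a caller could tolerate the `Not Supported` error above along these lines. This is a minimal sketch, not the code in `distributed` and not part of this PR: `get_gpu_serial` is a hypothetical helper name, and returning `None` on failure is an assumed policy. It relies only on pynvml calls that appear in the traceback, plus `nvmlShutdown` and the `NVMLError` base class.

```python
def get_gpu_serial(index=0):
    """Return the serial of GPU `index`, or None if it cannot be queried.

    Hypothetical helper for illustration; not part of distributed.
    """
    try:
        import pynvml
    except ImportError:
        return None  # pynvml is not installed in this environment

    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        return None  # NVML library or driver is unavailable

    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(index)
        return pynvml.nvmlDeviceGetSerial(handle)
    except pynvml.NVMLError_NotSupported:
        # Some devices do not expose a serial number, as in the traceback above
        return None
    finally:
        # Release NVML after a successful nvmlInit()
        pynvml.nvmlShutdown()


if __name__ == "__main__":
    print(get_gpu_serial(0))
```

Other `NVMLError` subclasses raised while querying the device are deliberately left to propagate, so that only the unsupported-query case is silenced.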

@crusaderky crusaderky requested a review from fjetter as a code owner June 4, 2026 14:20
@crusaderky crusaderky requested review from quasiben and rjzamora June 4, 2026 14:21
@crusaderky

Collaborator Author

FYI @rjzamora @quasiben

@github-actions

github-actions Bot commented Jun 4, 2026

Contributor

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

3 files (-1), 3 suites (-1), 0s ⏱️ (±0s)
4 101 tests (-4): 3 902 ✅ (-35), 196 💤 (+32), 3 ❌ (-1)
6 778 runs (-2 731): 6 405 ✅ (-2 636), 370 💤 (-94), 3 ❌ (-1)

For more details on these failures, see this check.

Results for commit 9cba932. ± Comparison against base commit 4578e5d.

This pull request removes 23 and adds 19 tests. Note that renamed tests count towards both.
distributed.diagnostics.tests.test_memray ‑ test_all_workers
distributed.diagnostics.tests.test_memray ‑ test_basic_integration_scheduler
distributed.diagnostics.tests.test_memray ‑ test_basic_integration_scheduler_report_args[False]
distributed.diagnostics.tests.test_memray ‑ test_basic_integration_scheduler_report_args[report_args0]
distributed.diagnostics.tests.test_memray ‑ test_basic_integration_workers[1]
distributed.diagnostics.tests.test_memray ‑ test_basic_integration_workers[False]
distributed.diagnostics.tests.test_memray ‑ test_basic_integration_workers[True]
distributed.diagnostics.tests.test_memray ‑ test_basic_integration_workers_report_args[False]
distributed.diagnostics.tests.test_memray ‑ test_basic_integration_workers_report_args[report_args0]
distributed.diagnostics.tests.test_nvml ‑ test_2_visible_devices[0,1]
…
distributed.cli.tests.test_dask_scheduler ‑ test_signal_handling[Signals.SIGINT]
distributed.cli.tests.test_dask_scheduler ‑ test_signal_handling[Signals.SIGTERM]
distributed.cli.tests.test_dask_spec ‑ test_signal_handling_scheduler[Signals.SIGINT]
distributed.cli.tests.test_dask_spec ‑ test_signal_handling_scheduler[Signals.SIGTERM]
distributed.cli.tests.test_dask_spec ‑ test_signal_handling_worker[Signals.SIGINT-Nanny]
distributed.cli.tests.test_dask_spec ‑ test_signal_handling_worker[Signals.SIGINT-Worker]
distributed.cli.tests.test_dask_spec ‑ test_signal_handling_worker[Signals.SIGTERM-Nanny]
distributed.cli.tests.test_dask_spec ‑ test_signal_handling_worker[Signals.SIGTERM-Worker]
distributed.cli.tests.test_dask_worker ‑ test_signal_handling[Signals.SIGINT---nanny]
distributed.cli.tests.test_dask_worker ‑ test_signal_handling[Signals.SIGINT---no-nanny]
…
This pull request removes 12 skipped tests and adds 13 skipped tests. Note that renamed tests count towards both.
distributed.diagnostics.tests.test_nvml ‑ test_2_visible_devices[0,1]
distributed.diagnostics.tests.test_nvml ‑ test_2_visible_devices[1,0]
distributed.diagnostics.tests.test_nvml ‑ test_gpu_metrics
distributed.diagnostics.tests.test_nvml ‑ test_gpu_monitoring_range_query
distributed.diagnostics.tests.test_nvml ‑ test_gpu_monitoring_recent
distributed.diagnostics.tests.test_nvml ‑ test_has_cuda_context
distributed.diagnostics.tests.test_nvml ‑ test_one_time
distributed.diagnostics.tests.test_nvml ‑ test_one_visible_devices
distributed.diagnostics.tests.test_nvml ‑ test_visible_devices_bad_uuid
distributed.diagnostics.tests.test_nvml ‑ test_visible_devices_uuid
…
distributed.cli.tests.test_dask_scheduler ‑ test_signal_handling[Signals.SIGINT]
distributed.cli.tests.test_dask_scheduler ‑ test_signal_handling[Signals.SIGTERM]
distributed.cli.tests.test_dask_spec ‑ test_signal_handling_scheduler[Signals.SIGINT]
distributed.cli.tests.test_dask_spec ‑ test_signal_handling_scheduler[Signals.SIGTERM]
distributed.cli.tests.test_dask_spec ‑ test_signal_handling_worker[Signals.SIGINT-Nanny]
distributed.cli.tests.test_dask_spec ‑ test_signal_handling_worker[Signals.SIGINT-Worker]
distributed.cli.tests.test_dask_spec ‑ test_signal_handling_worker[Signals.SIGTERM-Nanny]
distributed.cli.tests.test_dask_spec ‑ test_signal_handling_worker[Signals.SIGTERM-Worker]
distributed.cli.tests.test_dask_worker ‑ test_signal_handling[Signals.SIGINT---nanny]
distributed.cli.tests.test_dask_worker ‑ test_signal_handling[Signals.SIGINT---no-nanny]
…
This pull request skips 37 and un-skips 6 tests.
distributed.cli.tests.test_dask_scheduler ‑ test_uvloop
distributed.cli.tests.test_dask_spec ‑ test_uvloop
distributed.cli.tests.test_dask_worker ‑ test_uvloop[--nanny]
distributed.cli.tests.test_dask_worker ‑ test_uvloop[--no-nanny]
distributed.tests.test_client ‑ test_allow_restrictions
distributed.tests.test_client ‑ test_client_num_fds
distributed.tests.test_client ‑ test_client_replicate_host
distributed.tests.test_client ‑ test_file_descriptors_dont_leak[Nanny]
distributed.tests.test_client ‑ test_file_descriptors_dont_leak[Worker]
distributed.tests.test_client ‑ test_mixed_compression
…
distributed.dashboard.tests.test_scheduler_bokeh ‑ test_counters
distributed.dashboard.tests.test_worker_bokeh ‑ test_counters
distributed.deploy.tests.test_subprocess ‑ test_raise_on_windows
distributed.shuffle.tests.test_shuffle ‑ test_handle_null_partitions_2
distributed.tests.test_core ‑ test_tick_logging
distributed.tests.test_core ‑ test_ticks

@crusaderky crusaderky merged commit 3b14fd8 into dask:main Jun 4, 2026
48 of 52 checks passed
@crusaderky crusaderky deleted the cuda branch June 4, 2026 15:39
@crusaderky crusaderky mentioned this pull request Jun 5, 2026
6 tasks