Use cuda::std::array in histogram APIs #3973

Merged
bernhardmgruber merged 4 commits into NVIDIA:main from bernhardmgruber:ref_hist_array
Mar 5, 2025

Conversation

@bernhardmgruber
Contributor

@bernhardmgruber bernhardmgruber commented Mar 1, 2025

Fixes: #1765

@copy-pr-bot
Contributor

copy-pr-bot bot commented Mar 1, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Progress in CCCL Mar 1, 2025
@bernhardmgruber bernhardmgruber marked this pull request as ready for review March 3, 2025 08:28
@bernhardmgruber bernhardmgruber requested a review from a team as a code owner March 3, 2025 08:28
@bernhardmgruber bernhardmgruber requested a review from elstehle March 3, 2025 08:28
@cccl-authenticator-app cccl-authenticator-app bot moved this from In Progress to In Review in CCCL Mar 3, 2025
@github-actions
Contributor

github-actions bot commented Mar 3, 2025

🟨 CI finished in 1h 21m: Pass: 55%/93 | Total: 21h 43m | Avg: 14m 01s | Max: 1h 17m | Hits: 93%/84792
  • 🟨 cub: Pass: 8%/45 | Total: 10h 55m | Avg: 14m 33s | Max: 1h 17m | Hits: 82%/4348

    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 11m 44s | Avg:  5m 52s | Max:  6m 17s | Hits:  98%/2100  
      🔍 nvcc               Pass:   4%/43  | Total: 10h 43m | Avg: 14m 57s | Max:  1h 17m | Hits:  67%/2248  
    🟨 ctk
      🟥 12.0               Pass:   0%/5   | Total:  1h 11m | Avg: 14m 18s | Max:  1h 01m
      🟩 12.5               Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 05m | Hits:  67%/2248  
      🟨 12.8               Pass:   5%/38  | Total:  7h 34m | Avg: 11m 57s | Max:  1h 17m | Hits:  98%/2100  
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 11m 44s | Avg:  5m 52s | Max:  6m 17s | Hits:  98%/2100  
      🟥 nvcc12.0           Pass:   0%/5   | Total:  1h 11m | Avg: 14m 18s | Max:  1h 01m
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 05m | Hits:  67%/2248  
      🟥 nvcc12.8           Pass:   0%/36  | Total:  7h 22m | Avg: 12m 17s | Max:  1h 17m
    🟨 cxx
      🟥 Clang14            Pass:   0%/4   | Total: 20m 13s | Avg:  5m 03s | Max:  7m 49s
      🟥 Clang15            Pass:   0%/2   | Total: 13m 53s | Avg:  6m 56s | Max:  7m 18s
      🟥 Clang16            Pass:   0%/2   | Total: 14m 11s | Avg:  7m 05s | Max:  7m 07s
      🟥 Clang17            Pass:   0%/2   | Total: 13m 56s | Avg:  6m 58s | Max:  7m 16s
      🟨 Clang18            Pass:  28%/7   | Total: 31m 56s | Avg:  4m 33s | Max:  7m 01s | Hits:  98%/2100  
      🟥 GCC7               Pass:   0%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  7m 30s
      🟥 GCC8               Pass:   0%/1   | Total:  6m 37s | Avg:  6m 37s | Max:  6m 37s
      🟥 GCC9               Pass:   0%/2   | Total:  9m 45s | Avg:  4m 52s | Max:  7m 23s
      🟥 GCC10              Pass:   0%/2   | Total: 14m 35s | Avg:  7m 17s | Max:  7m 18s
      🟥 GCC11              Pass:   0%/2   | Total: 15m 52s | Avg:  7m 56s | Max:  8m 07s
      🟥 GCC12              Pass:   0%/2   | Total: 15m 02s | Avg:  7m 31s | Max:  7m 33s
      🟥 GCC13              Pass:   0%/11  | Total:  1h 17m | Avg:  7m 00s | Max: 42m 25s
      🟥 MSVC14.29          Pass:   0%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 12m
      🟥 MSVC14.42          Pass:   0%/2   | Total:  2h 29m | Avg:  1h 14m | Max:  1h 17m
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 05m | Hits:  67%/2248  
    🟨 cxx_family
      🟨 Clang              Pass:  11%/17  | Total:  1h 34m | Avg:  5m 32s | Max:  7m 49s | Hits:  98%/2100  
      🟥 GCC                Pass:   0%/22  | Total:  2h 28m | Avg:  6m 46s | Max: 42m 25s
      🟥 MSVC               Pass:   0%/4   | Total:  4h 42m | Avg:  1h 10m | Max:  1h 17m
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 05m | Hits:  67%/2248  
    🟨 cpu
      🟨 amd64              Pass:   9%/43  | Total: 10h 41m | Avg: 14m 55s | Max:  1h 17m | Hits:  82%/4348  
      🟥 arm64              Pass:   0%/2   | Total: 13m 34s | Avg:  6m 47s | Max:  7m 12s
    🟨 gpu
      🟥 h100               Pass:   0%/3   | Total: 12m 31s | Avg:  4m 10s | Max: 12m 31s
      🟨 rtx2080            Pass:  11%/34  | Total: 10h 28m | Avg: 18m 28s | Max:  1h 17m | Hits:  82%/4348  
      🟥 rtxa6000           Pass:   0%/8   | Total: 14m 22s | Avg:  1m 47s | Max:  7m 21s
    🟨 jobs
      🟨 Build              Pass:  10%/37  | Total: 10h 55m | Avg: 17m 42s | Max:  1h 17m | Hits:  82%/4348  
      🟥 DeviceLaunch       Pass:   0%/1  
      🟥 GraphCapture       Pass:   0%/1  
      🟥 HostLaunch         Pass:   0%/3  
      🟥 TestGPU            Pass:   0%/3  
    🟥 sm
      🟥 90                 Pass:   0%/3   | Total: 12m 31s | Avg:  4m 10s | Max: 12m 31s
      🟥 90;90a;100         Pass:   0%/1   | Total: 42m 25s | Avg: 42m 25s | Max: 42m 25s
    🟨 std
      🟨 17                 Pass:  10%/20  | Total:  6h 13m | Avg: 18m 40s | Max:  1h 12m | Hits:  82%/2174  
      🟨 20                 Pass:   8%/25  | Total:  4h 41m | Avg: 11m 15s | Max:  1h 17m | Hits:  82%/2174  
    
  • 🟩 thrust: Pass: 100%/45 | Total: 9h 35m | Avg: 12m 47s | Max: 49m 55s | Hits: 94%/80136

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 16m 47s | Avg:  8m 23s | Max: 11m 07s | Hits:  99%/3564  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  9h 26m | Avg: 13m 09s | Max: 49m 55s | Hits:  93%/76573 
      🟩 arm64              Pass: 100%/2   | Total:  9m 48s | Avg:  4m 54s | Max:  5m 17s | Hits:  99%/3563  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 00m | Avg: 12m 04s | Max: 40m 24s | Hits:  94%/8901  
      🟩 12.5               Pass: 100%/2   | Total:  1h 38m | Avg: 49m 21s | Max: 49m 55s | Hits:  65%/3562  
      🟩 12.8               Pass: 100%/38  | Total:  6h 56m | Avg: 10m 58s | Max: 45m 20s | Hits:  95%/67673 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 47s | Avg:  5m 23s | Max:  5m 24s | Hits:  99%/3562  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 00m | Avg: 12m 04s | Max: 40m 24s | Hits:  94%/8901  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 38m | Avg: 49m 21s | Max: 49m 55s | Hits:  65%/3562  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  6h 45m | Avg: 11m 16s | Max: 45m 20s | Hits:  95%/64111 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 47s | Avg:  5m 23s | Max:  5m 24s | Hits:  99%/3562  
      🟩 nvcc               Pass: 100%/43  | Total:  9h 25m | Avg: 13m 08s | Max: 49m 55s | Hits:  93%/76574 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 20m 15s | Avg:  5m 03s | Max:  5m 21s | Hits:  99%/7124  
      🟩 Clang15            Pass: 100%/2   | Total: 11m 01s | Avg:  5m 30s | Max:  5m 46s | Hits:  99%/3562  
      🟩 Clang16            Pass: 100%/2   | Total: 11m 35s | Avg:  5m 47s | Max:  5m 53s | Hits:  99%/3562  
      🟩 Clang17            Pass: 100%/2   | Total: 11m 06s | Avg:  5m 33s | Max:  5m 45s | Hits:  99%/3562  
      🟩 Clang18            Pass: 100%/7   | Total: 43m 14s | Avg:  6m 10s | Max: 10m 16s | Hits:  99%/12467 
      🟩 GCC7               Pass: 100%/2   | Total: 10m 25s | Avg:  5m 12s | Max:  5m 24s | Hits:  99%/3564  
      🟩 GCC8               Pass: 100%/1   | Total:  5m 14s | Avg:  5m 14s | Max:  5m 14s | Hits:  99%/1782  
      🟩 GCC9               Pass: 100%/2   | Total: 10m 53s | Avg:  5m 26s | Max:  5m 40s | Hits:  99%/3564  
      🟩 GCC10              Pass: 100%/2   | Total: 11m 16s | Avg:  5m 38s | Max:  5m 52s | Hits:  99%/3564  
      🟩 GCC11              Pass: 100%/2   | Total: 11m 48s | Avg:  5m 54s | Max:  6m 05s | Hits:  99%/3564  
      🟩 GCC12              Pass: 100%/2   | Total: 12m 25s | Avg:  6m 12s | Max:  6m 15s | Hits:  99%/3564  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 56m | Avg: 11m 38s | Max: 33m 12s | Hits:  95%/17820 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 21m | Avg: 40m 33s | Max: 40m 43s | Hits:  70%/3550  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  2h 00m | Avg: 40m 08s | Max: 45m 20s | Hits:  70%/5325  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 38m | Avg: 49m 21s | Max: 49m 55s | Hits:  65%/3562  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  1h 37m | Avg:  5m 43s | Max: 10m 16s | Hits:  99%/30277 
      🟩 GCC                Pass: 100%/21  | Total:  2h 58m | Avg:  8m 29s | Max: 33m 12s | Hits:  97%/37422 
      🟩 MSVC               Pass: 100%/5   | Total:  3h 21m | Avg: 40m 18s | Max: 45m 20s | Hits:  70%/8875  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 38m | Avg: 49m 21s | Max: 49m 55s | Hits:  65%/3562  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 29m 29s | Avg: 14m 44s | Max: 18m 06s | Hits:  88%/3564  
      🟩 rtx2080            Pass: 100%/33  | Total:  6h 45m | Avg: 12m 17s | Max: 49m 55s | Hits:  94%/58769 
      🟩 rtx4090            Pass: 100%/10  | Total:  2h 20m | Avg: 14m 03s | Max: 44m 16s | Hits:  94%/17803 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total:  8h 05m | Avg: 12m 46s | Max: 49m 55s | Hits:  93%/67671 
      🟩 TestCPU            Pass: 100%/3   | Total: 46m 25s | Avg: 15m 28s | Max: 30m 49s | Hits:  90%/5338  
      🟩 TestGPU            Pass: 100%/4   | Total: 44m 14s | Avg: 11m 03s | Max: 11m 28s | Hits:  99%/7127  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 29m 29s | Avg: 14m 44s | Max: 18m 06s | Hits:  88%/3564  
      🟩 90;90a;100         Pass: 100%/1   | Total: 33m 12s | Avg: 33m 12s | Max: 33m 12s | Hits:  77%/1782  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  4h 23m | Avg: 13m 09s | Max: 48m 47s | Hits:  93%/35611 
      🟩 20                 Pass: 100%/23  | Total:  4h 56m | Avg: 12m 52s | Max: 49m 55s | Hits:  93%/40961 
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 15m 08s | Avg: 7m 34s | Max: 12m 31s | Hits: 98%/308

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 15m 08s | Avg:  7m 34s | Max: 12m 31s | Hits:  98%/308   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 15m 08s | Avg:  7m 34s | Max: 12m 31s | Hits:  98%/308   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 15m 08s | Avg:  7m 34s | Max: 12m 31s | Hits:  98%/308   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 15m 08s | Avg:  7m 34s | Max: 12m 31s | Hits:  98%/308   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 15m 08s | Avg:  7m 34s | Max: 12m 31s | Hits:  98%/308   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 15m 08s | Avg:  7m 34s | Max: 12m 31s | Hits:  98%/308   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 15m 08s | Avg:  7m 34s | Max: 12m 31s | Hits:  98%/308   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 37s | Avg:  2m 37s | Max:  2m 37s | Hits:  98%/154   
      🟩 Test               Pass: 100%/1   | Total: 12m 31s | Avg: 12m 31s | Max: 12m 31s | Hits:  98%/154   
    
  • 🟩 python: Pass: 100%/1 | Total: 57m 49s | Avg: 57m 49s | Max: 57m 49s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 57m 49s | Avg: 57m 49s | Max: 57m 49s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 57m 49s | Avg: 57m 49s | Max: 57m 49s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 57m 49s | Avg: 57m 49s | Max: 57m 49s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 57m 49s | Avg: 57m 49s | Max: 57m 49s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 57m 49s | Avg: 57m 49s | Max: 57m 49s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 57m 49s | Avg: 57m 49s | Max: 57m 49s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 57m 49s | Avg: 57m 49s | Max: 57m 49s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 57m 49s | Avg: 57m 49s | Max: 57m 49s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 93)

# Runner
66 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-arm64-cpu16
3 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1

@bernhardmgruber bernhardmgruber requested a review from a team as a code owner March 3, 2025 17:44
@github-actions
Contributor

github-actions bot commented Mar 3, 2025

🟨 CI finished in 1h 30m: Pass: 79%/93 | Total: 21h 59m | Avg: 14m 11s | Max: 1h 20m | Hits: 92%/110728
  • 🟨 cub: Pass: 57%/45 | Total: 11h 35m | Avg: 15m 26s | Max: 1h 20m | Hits: 84%/30464

    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 11m 26s | Avg:  5m 43s | Max:  5m 46s | Hits:  98%/2104  
      🔍 nvcc               Pass:  55%/43  | Total: 11h 23m | Avg: 15m 53s | Max:  1h 20m | Hits:  83%/28360 
    🔍 cxx_family: GCC 🔍
      🟩 Clang              Pass: 100%/17  | Total:  2h 31m | Avg:  8m 53s | Max: 24m 47s | Hits:  98%/20382 
      🔍 GCC                Pass:  13%/22  | Total:  1h 56m | Avg:  5m 17s | Max:  8m 18s | Hits:  98%/3660  
      🟩 MSVC               Pass: 100%/4   | Total:  4h 49m | Avg:  1h 12m | Max:  1h 20m | Hits:  15%/4168  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 17m | Avg:  1h 08m | Max:  1h 09m | Hits:  60%/2254  
    🟨 ctk
      🟨 12.0               Pass:  80%/5   | Total:  1h 30m | Avg: 18m 00s | Max:  1h 03m | Hits:  80%/4702  
      🟩 12.5               Pass: 100%/2   | Total:  2h 17m | Avg:  1h 08m | Max:  1h 09m | Hits:  60%/2254  
      🟨 12.8               Pass:  52%/38  | Total:  7h 47m | Avg: 12m 18s | Max:  1h 20m | Hits:  87%/23508 
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 11m 26s | Avg:  5m 43s | Max:  5m 46s | Hits:  98%/2104  
      🟨 nvcc12.0           Pass:  80%/5   | Total:  1h 30m | Avg: 18m 00s | Max:  1h 03m | Hits:  80%/4702  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 17m | Avg:  1h 08m | Max:  1h 09m | Hits:  60%/2254  
      🟨 nvcc12.8           Pass:  50%/36  | Total:  7h 35m | Avg: 12m 39s | Max:  1h 20m | Hits:  86%/21404 
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 27m 08s | Avg:  6m 47s | Max:  7m 28s | Hits:  98%/4880  
      🟩 Clang15            Pass: 100%/2   | Total: 14m 10s | Avg:  7m 05s | Max:  7m 06s | Hits:  98%/2436  
      🟩 Clang16            Pass: 100%/2   | Total: 15m 05s | Avg:  7m 32s | Max:  7m 50s | Hits:  98%/2436  
      🟩 Clang17            Pass: 100%/2   | Total: 15m 00s | Avg:  7m 30s | Max:  7m 46s | Hits:  98%/2436  
      🟩 Clang18            Pass: 100%/7   | Total:  1h 19m | Avg: 11m 23s | Max: 24m 47s | Hits:  98%/8194  
      🟩 GCC7               Pass: 100%/2   | Total: 14m 04s | Avg:  7m 02s | Max:  7m 32s | Hits:  98%/2440  
      🟩 GCC8               Pass: 100%/1   | Total:  6m 56s | Avg:  6m 56s | Max:  6m 56s | Hits:  98%/1220  
      🟥 GCC9               Pass:   0%/2   | Total: 14m 04s | Avg:  7m 02s | Max:  7m 20s
      🟥 GCC10              Pass:   0%/2   | Total: 14m 21s | Avg:  7m 10s | Max:  7m 17s
      🟥 GCC11              Pass:   0%/2   | Total: 14m 50s | Avg:  7m 25s | Max:  7m 47s
      🟥 GCC12              Pass:   0%/2   | Total: 15m 40s | Avg:  7m 50s | Max:  8m 09s
      🟥 GCC13              Pass:   0%/11  | Total: 36m 29s | Avg:  3m 19s | Max:  8m 18s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 10m | Hits:  15%/2084  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  2h 35m | Avg:  1h 17m | Max:  1h 20m | Hits:  15%/2084  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 17m | Avg:  1h 08m | Max:  1h 09m | Hits:  60%/2254  
    🟨 cpu
      🟨 amd64              Pass:  58%/43  | Total: 11h 21m | Avg: 15m 50s | Max:  1h 20m | Hits:  83%/29246 
      🟨 arm64              Pass:  50%/2   | Total: 14m 02s | Avg:  7m 01s | Max:  7m 25s | Hits:  98%/1218  
    🟨 gpu
      🟥 h100               Pass:   0%/3   | Total:  5m 01s | Avg:  1m 40s | Max:  5m 01s
      🟨 rtx2080            Pass:  67%/34  | Total: 10h 27m | Avg: 18m 27s | Max:  1h 20m | Hits:  82%/26810 
      🟨 rtxa6000           Pass:  37%/8   | Total:  1h 02m | Avg:  7m 46s | Max: 24m 47s | Hits:  99%/3654  
    🟨 jobs
      🟨 Build              Pass:  64%/37  | Total: 10h 48m | Avg: 17m 30s | Max:  1h 20m | Hits:  83%/28028 
      🟥 DeviceLaunch       Pass:   0%/1  
      🟥 GraphCapture       Pass:   0%/1  
      🟨 HostLaunch         Pass:  33%/3   | Total: 24m 47s | Avg:  8m 15s | Max: 24m 47s | Hits: 100%/1218  
      🟨 TestGPU            Pass:  33%/3   | Total: 22m 12s | Avg:  7m 24s | Max: 22m 12s | Hits: 100%/1218  
    🟥 sm
      🟥 90                 Pass:   0%/3   | Total:  5m 01s | Avg:  1m 40s | Max:  5m 01s
      🟥 90;90a;100         Pass:   0%/1   | Total:  7m 50s | Avg:  7m 50s | Max:  7m 50s
    🟨 std
      🟨 17                 Pass:  70%/20  | Total:  6h 37m | Avg: 19m 51s | Max:  1h 20m | Hits:  79%/16277 
      🟨 20                 Pass:  48%/25  | Total:  4h 58m | Avg: 11m 55s | Max:  1h 14m | Hits:  89%/14187 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 9h 09m | Avg: 12m 12s | Max: 53m 39s | Hits: 95%/79956

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 17m 40s | Avg:  8m 50s | Max: 11m 13s | Hits:  99%/3556  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  9h 00m | Avg: 12m 33s | Max: 53m 39s | Hits:  94%/76401 
      🟩 arm64              Pass: 100%/2   | Total:  9m 35s | Avg:  4m 47s | Max:  5m 08s | Hits:  99%/3555  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 06m | Avg: 13m 23s | Max: 47m 25s | Hits:  94%/8881  
      🟩 12.5               Pass: 100%/2   | Total:  1h 43m | Avg: 51m 39s | Max: 53m 39s | Hits:  64%/3554  
      🟩 12.8               Pass: 100%/38  | Total:  6h 19m | Avg:  9m 59s | Max: 44m 48s | Hits:  96%/67521 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 10s | Avg:  5m 05s | Max:  5m 10s | Hits: 100%/3554  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 06m | Avg: 13m 23s | Max: 47m 25s | Hits:  94%/8881  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 43m | Avg: 51m 39s | Max: 53m 39s | Hits:  64%/3554  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  6h 09m | Avg: 10m 15s | Max: 44m 48s | Hits:  96%/63967 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 10s | Avg:  5m 05s | Max:  5m 10s | Hits: 100%/3554  
      🟩 nvcc               Pass: 100%/43  | Total:  8h 59m | Avg: 12m 32s | Max: 53m 39s | Hits:  94%/76402 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 20m 14s | Avg:  5m 03s | Max:  5m 36s | Hits: 100%/7108  
      🟩 Clang15            Pass: 100%/2   | Total: 11m 46s | Avg:  5m 53s | Max:  5m 56s | Hits: 100%/3554  
      🟩 Clang16            Pass: 100%/2   | Total: 11m 05s | Avg:  5m 32s | Max:  5m 50s | Hits: 100%/3554  
      🟩 Clang17            Pass: 100%/2   | Total: 11m 31s | Avg:  5m 45s | Max:  5m 46s | Hits: 100%/3554  
      🟩 Clang18            Pass: 100%/7   | Total: 43m 03s | Avg:  6m 09s | Max: 10m 17s | Hits: 100%/12439 
      🟩 GCC7               Pass: 100%/2   | Total: 10m 05s | Avg:  5m 02s | Max:  5m 19s | Hits:  99%/3556  
      🟩 GCC8               Pass: 100%/1   | Total:  5m 39s | Avg:  5m 39s | Max:  5m 39s | Hits:  99%/1778  
      🟩 GCC9               Pass: 100%/2   | Total: 11m 23s | Avg:  5m 41s | Max:  5m 58s | Hits:  99%/3556  
      🟩 GCC10              Pass: 100%/2   | Total: 11m 24s | Avg:  5m 42s | Max:  5m 51s | Hits:  99%/3556  
      🟩 GCC11              Pass: 100%/2   | Total: 11m 13s | Avg:  5m 36s | Max:  5m 47s | Hits:  99%/3556  
      🟩 GCC12              Pass: 100%/2   | Total: 11m 34s | Avg:  5m 47s | Max:  5m 49s | Hits:  99%/3556  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 16m | Avg:  7m 37s | Max: 11m 26s | Hits:  99%/17780 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 30m | Avg: 45m 00s | Max: 47m 25s | Hits:  70%/3542  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  2h 01m | Avg: 40m 23s | Max: 44m 48s | Hits:  70%/5313  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 43m | Avg: 51m 39s | Max: 53m 39s | Hits:  64%/3554  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  1h 37m | Avg:  5m 44s | Max: 10m 17s | Hits: 100%/30209 
      🟩 GCC                Pass: 100%/21  | Total:  2h 17m | Avg:  6m 32s | Max: 11m 26s | Hits:  99%/37338 
      🟩 MSVC               Pass: 100%/5   | Total:  3h 31m | Avg: 42m 14s | Max: 47m 25s | Hits:  70%/8855  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 43m | Avg: 51m 39s | Max: 53m 39s | Hits:  64%/3554  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 15m 32s | Avg:  7m 46s | Max: 10m 53s | Hits:  99%/3556  
      🟩 rtx2080            Pass: 100%/33  | Total:  6h 30m | Avg: 11m 49s | Max: 53m 39s | Hits:  95%/58637 
      🟩 rtx4090            Pass: 100%/10  | Total:  2h 23m | Avg: 14m 22s | Max: 44m 48s | Hits:  94%/17763 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total:  7h 37m | Avg: 12m 02s | Max: 53m 39s | Hits:  94%/67519 
      🟩 TestCPU            Pass: 100%/3   | Total: 47m 59s | Avg: 15m 59s | Max: 32m 35s | Hits:  90%/5326  
      🟩 TestGPU            Pass: 100%/4   | Total: 43m 49s | Avg: 10m 57s | Max: 11m 26s | Hits:  99%/7111  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 15m 32s | Avg:  7m 46s | Max: 10m 53s | Hits:  99%/3556  
      🟩 90;90a;100         Pass: 100%/1   | Total:  6m 24s | Avg:  6m 24s | Max:  6m 24s | Hits:  99%/1778  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  4h 30m | Avg: 13m 32s | Max: 49m 39s | Hits:  93%/35531 
      🟩 20                 Pass: 100%/23  | Total:  4h 21m | Avg: 11m 20s | Max: 53m 39s | Hits:  95%/40869 
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 15m 04s | Avg: 7m 32s | Max: 12m 45s | Hits: 98%/308

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 15m 04s | Avg:  7m 32s | Max: 12m 45s | Hits:  98%/308   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 15m 04s | Avg:  7m 32s | Max: 12m 45s | Hits:  98%/308   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 15m 04s | Avg:  7m 32s | Max: 12m 45s | Hits:  98%/308   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 15m 04s | Avg:  7m 32s | Max: 12m 45s | Hits:  98%/308   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 15m 04s | Avg:  7m 32s | Max: 12m 45s | Hits:  98%/308   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 15m 04s | Avg:  7m 32s | Max: 12m 45s | Hits:  98%/308   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 15m 04s | Avg:  7m 32s | Max: 12m 45s | Hits:  98%/308   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 19s | Avg:  2m 19s | Max:  2m 19s | Hits:  98%/154   
      🟩 Test               Pass: 100%/1   | Total: 12m 45s | Avg: 12m 45s | Max: 12m 45s | Hits:  98%/154   
    
  • 🟩 python: Pass: 100%/1 | Total: 59m 29s | Avg: 59m 29s | Max: 59m 29s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 59m 29s | Avg: 59m 29s | Max: 59m 29s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 59m 29s | Avg: 59m 29s | Max: 59m 29s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 59m 29s | Avg: 59m 29s | Max: 59m 29s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 59m 29s | Avg: 59m 29s | Max: 59m 29s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 59m 29s | Avg: 59m 29s | Max: 59m 29s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 59m 29s | Avg: 59m 29s | Max: 59m 29s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 59m 29s | Avg: 59m 29s | Max: 59m 29s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 59m 29s | Avg: 59m 29s | Max: 59m 29s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 93)

# Runner
66 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-arm64-cpu16
3 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1

Contributor

@fbusato fbusato left a comment

looks good. My only concern is if the changes in the API are considered breaking

@github-project-automation github-project-automation bot moved this from In Review to In Progress in CCCL Mar 3, 2025
@bernhardmgruber
Contributor Author

> looks good. My only concern is if the changes in the API are considered breaking

Yes, this is an API-breaking change intended for CCCL 3.0. But maybe this is a big ask, and we should instead add the new APIs as overloads, leaving the old ones deprecated throughout CCCL 3.x.
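
To make the overload idea concrete, here is a minimal sketch of how an array-based overload and a deprecated C-array overload could coexist. It is an illustration under assumptions, not CUB's actual code: `multi_histogram` is a hypothetical host-side function, and `std::array` stands in for `cuda::std::array` so the snippet builds with any C++17 compiler.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a multi-channel histogram entry point; not a real CUB API.
// New-style overload: per-channel level counts arrive as a fixed-size array,
// so the channel count is part of the type.
template <std::size_t NumChannels>
int multi_histogram(const std::array<int, NumChannels>& num_levels)
{
  int total_bins = 0;
  for (int levels : num_levels)
  {
    total_bins += levels - 1; // N bin boundaries (levels) delimit N - 1 bins
  }
  return total_bins;
}

// Old-style overload: accepts a C array, copies it into a std::array, and
// forwards to the new overload. Callers get a deprecation warning but still compile.
template <std::size_t NumChannels>
[[deprecated("pass a std::array instead of a C array")]]
int multi_histogram(const int (&num_levels)[NumChannels])
{
  std::array<int, NumChannels> levels{};
  for (std::size_t i = 0; i < NumChannels; ++i)
  {
    levels[i] = num_levels[i];
  }
  return multi_histogram(levels);
}
```

With this shape, existing callers that pass a C array keep compiling and only see a warning, while new code can pass an array object directly. The deprecated overload could then be removed in a later major release.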

@bernhardmgruber bernhardmgruber added the cub For all items related to CUB label Mar 3, 2025
@github-actions
Contributor

github-actions bot commented Mar 3, 2025

🟨 CI finished in 1h 39m: Pass: 58%/93 | Total: 1d 21h | Avg: 29m 17s | Max: 1h 12m | Hits: 75%/86536
  • 🟨 cub: Pass: 13%/45 | Total: 22h 50m | Avg: 30m 27s | Max: 1h 12m | Hits: 34%/6272

    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 56m | Avg: 58m 19s | Max: 58m 35s | Hits:  74%/2104  
      🔍 nvcc               Pass:   9%/43  | Total: 20h 54m | Avg: 29m 10s | Max:  1h 12m | Hits:  14%/4168  
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 56m | Avg: 58m 19s | Max: 58m 35s | Hits:  74%/2104  
      🟨 nvcc12.0           Pass:  20%/5   | Total:  3h 16m | Avg: 39m 23s | Max:  1h 03m | Hits:  14%/1042  
      🟥 nvcc12.5           Pass:   0%/2   | Total:  1h 05m | Avg: 32m 49s | Max: 33m 16s
      🟨 nvcc12.8           Pass:   8%/36  | Total: 16h 31m | Avg: 27m 32s | Max:  1h 12m | Hits:  14%/3126  
    🟨 cxx
      🟥 Clang14            Pass:   0%/4   | Total:  2h 06m | Avg: 31m 35s | Max: 32m 42s
      🟥 Clang15            Pass:   0%/2   | Total:  1h 02m | Avg: 31m 21s | Max: 31m 43s
      🟥 Clang16            Pass:   0%/2   | Total:  1h 05m | Avg: 32m 33s | Max: 34m 04s
      🟥 Clang17            Pass:   0%/2   | Total:  1h 01m | Avg: 30m 32s | Max: 30m 37s
      🟨 Clang18            Pass:  28%/7   | Total:  3h 37m | Avg: 31m 04s | Max: 58m 35s | Hits:  74%/2104  
      🟥 GCC7               Pass:   0%/2   | Total:  1h 06m | Avg: 33m 12s | Max: 36m 38s
      🟥 GCC8               Pass:   0%/1   | Total: 30m 55s | Avg: 30m 55s | Max: 30m 55s
      🟥 GCC9               Pass:   0%/2   | Total:  1h 02m | Avg: 31m 04s | Max: 32m 29s
      🟥 GCC10              Pass:   0%/2   | Total:  1h 01m | Avg: 30m 42s | Max: 31m 28s
      🟥 GCC11              Pass:   0%/2   | Total:  1h 00m | Avg: 30m 20s | Max: 30m 38s
      🟥 GCC12              Pass:   0%/2   | Total:  1h 03m | Avg: 31m 46s | Max: 32m 55s
      🟥 GCC13              Pass:   0%/11  | Total:  2h 32m | Avg: 13m 54s | Max: 40m 34s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 11m | Hits:  14%/2084  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  2h 19m | Avg:  1h 09m | Max:  1h 12m | Hits:  14%/2084  
      🟥 NVHPC24.7          Pass:   0%/2   | Total:  1h 05m | Avg: 32m 49s | Max: 33m 16s
    🟨 cxx_family
      🟨 Clang              Pass:  11%/17  | Total:  8h 52m | Avg: 31m 20s | Max: 58m 35s | Hits:  74%/2104  
      🟥 GCC                Pass:   0%/22  | Total:  8h 18m | Avg: 22m 38s | Max: 40m 34s
      🟩 MSVC               Pass: 100%/4   | Total:  4h 34m | Avg:  1h 08m | Max:  1h 12m | Hits:  14%/4168  
      🟥 NVHPC              Pass:   0%/2   | Total:  1h 05m | Avg: 32m 49s | Max: 33m 16s
    🟨 cpu
      🟨 amd64              Pass:  13%/43  | Total: 21h 30m | Avg: 30m 00s | Max:  1h 12m | Hits:  34%/6272  
      🟥 arm64              Pass:   0%/2   | Total:  1h 20m | Avg: 40m 25s | Max: 40m 34s
    🟨 ctk
      🟨 12.0               Pass:  20%/5   | Total:  3h 16m | Avg: 39m 23s | Max:  1h 03m | Hits:  14%/1042  
      🟥 12.5               Pass:   0%/2   | Total:  1h 05m | Avg: 32m 49s | Max: 33m 16s
      🟨 12.8               Pass:  13%/38  | Total: 18h 28m | Avg: 29m 10s | Max:  1h 12m | Hits:  38%/5230  
    🟨 gpu
      🟥 h100               Pass:   0%/3   | Total: 11m 59s | Avg:  3m 59s | Max: 11m 59s
      🟨 rtx2080            Pass:  17%/34  | Total: 21h 36m | Avg: 38m 07s | Max:  1h 12m | Hits:  34%/6272  
      🟥 rtxa6000           Pass:   0%/8   | Total:  1h 02m | Avg:  7m 48s | Max: 32m 24s
    🟨 jobs
      🟨 Build              Pass:  16%/37  | Total: 22h 50m | Avg: 37m 03s | Max:  1h 12m | Hits:  34%/6272  
      🟥 DeviceLaunch       Pass:   0%/1  
      🟥 GraphCapture       Pass:   0%/1  
      🟥 HostLaunch         Pass:   0%/3  
      🟥 TestGPU            Pass:   0%/3  
    🟥 sm
      🟥 90                 Pass:   0%/3   | Total: 11m 59s | Avg:  3m 59s | Max: 11m 59s
      🟥 90;90a;100         Pass:   0%/1   | Total: 37m 30s | Avg: 37m 30s | Max: 37m 30s
    🟨 std
      🟨 17                 Pass:  20%/20  | Total: 12h 44m | Avg: 38m 13s | Max:  1h 11m | Hits:  29%/4178  
      🟨 20                 Pass:   8%/25  | Total: 10h 06m | Avg: 24m 15s | Max:  1h 12m | Hits:  44%/2094  
    
  • 🟩 thrust: Pass: 100%/45 | Total: 21h 17m | Avg: 28m 23s | Max: 54m 57s | Hits: 78%/79956

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 50m 08s | Avg: 25m 04s | Max: 29m 10s | Hits:  71%/3556  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 20h 26m | Avg: 28m 31s | Max: 54m 57s | Hits:  78%/76401 
      🟩 arm64              Pass: 100%/2   | Total: 50m 57s | Avg: 25m 28s | Max: 26m 36s | Hits:  79%/3555  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  2h 44m | Avg: 32m 56s | Max: 50m 59s | Hits:  75%/8881  
      🟩 12.5               Pass: 100%/2   | Total:  1h 35m | Avg: 47m 59s | Max: 48m 08s | Hits:  72%/3554  
      🟩 12.8               Pass: 100%/38  | Total: 16h 56m | Avg: 26m 45s | Max: 54m 57s | Hits:  79%/67521 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 45m 15s | Avg: 22m 37s | Max: 22m 41s | Hits:  79%/3554  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  2h 44m | Avg: 32m 56s | Max: 50m 59s | Hits:  75%/8881  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 35m | Avg: 47m 59s | Max: 48m 08s | Hits:  72%/3554  
      🟩 nvcc12.8           Pass: 100%/36  | Total: 16h 11m | Avg: 26m 59s | Max: 54m 57s | Hits:  79%/63967 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 45m 15s | Avg: 22m 37s | Max: 22m 41s | Hits:  79%/3554  
      🟩 nvcc               Pass: 100%/43  | Total: 20h 32m | Avg: 28m 39s | Max: 54m 57s | Hits:  78%/76402 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  1h 48m | Avg: 27m 13s | Max: 29m 11s | Hits:  79%/7108  
      🟩 Clang15            Pass: 100%/2   | Total: 53m 20s | Avg: 26m 40s | Max: 27m 08s | Hits:  79%/3554  
      🟩 Clang16            Pass: 100%/2   | Total: 56m 14s | Avg: 28m 07s | Max: 28m 13s | Hits:  79%/3554  
      🟩 Clang17            Pass: 100%/2   | Total: 54m 32s | Avg: 27m 16s | Max: 27m 18s | Hits:  79%/3554  
      🟩 Clang18            Pass: 100%/7   | Total:  2h 20m | Avg: 20m 02s | Max: 26m 45s | Hits:  85%/12439 
      🟩 GCC7               Pass: 100%/2   | Total: 54m 56s | Avg: 27m 28s | Max: 27m 42s | Hits:  79%/3556  
      🟩 GCC8               Pass: 100%/1   | Total: 29m 46s | Avg: 29m 46s | Max: 29m 46s | Hits:  79%/1778  
      🟩 GCC9               Pass: 100%/2   | Total: 58m 30s | Avg: 29m 15s | Max: 30m 03s | Hits:  79%/3556  
      🟩 GCC10              Pass: 100%/2   | Total: 56m 24s | Avg: 28m 12s | Max: 28m 38s | Hits:  79%/3556  
      🟩 GCC11              Pass: 100%/2   | Total: 56m 09s | Avg: 28m 04s | Max: 28m 13s | Hits:  79%/3556  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 04m | Avg: 32m 08s | Max: 32m 34s | Hits:  79%/3556  
      🟩 GCC13              Pass: 100%/10  | Total:  3h 32m | Avg: 21m 13s | Max: 30m 29s | Hits:  83%/17780 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 39m | Avg: 49m 56s | Max: 50m 59s | Hits:  56%/3542  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  2h 16m | Avg: 45m 22s | Max: 54m 57s | Hits:  60%/5313  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 35m | Avg: 47m 59s | Max: 48m 08s | Hits:  72%/3554  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  6h 53m | Avg: 24m 18s | Max: 29m 11s | Hits:  82%/30209 
      🟩 GCC                Pass: 100%/21  | Total:  8h 52m | Avg: 25m 20s | Max: 32m 34s | Hits:  81%/37338 
      🟩 MSVC               Pass: 100%/5   | Total:  3h 56m | Avg: 47m 12s | Max: 54m 57s | Hits:  58%/8855  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 35m | Avg: 47m 59s | Max: 48m 08s | Hits:  72%/3554  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 27m 18s | Avg: 13m 39s | Max: 15m 48s | Hits:  89%/3556  
      🟩 rtx2080            Pass: 100%/33  | Total: 17h 05m | Avg: 31m 05s | Max: 54m 57s | Hits:  77%/58637 
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 44m | Avg: 22m 26s | Max: 52m 09s | Hits:  82%/17763 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total: 19h 38m | Avg: 31m 01s | Max: 54m 57s | Hits:  75%/67519 
      🟩 TestCPU            Pass: 100%/3   | Total: 44m 49s | Avg: 14m 56s | Max: 29m 02s | Hits:  90%/5326  
      🟩 TestGPU            Pass: 100%/4   | Total: 53m 54s | Avg: 13m 28s | Max: 20m 58s | Hits:  98%/7111  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 27m 18s | Avg: 13m 39s | Max: 15m 48s | Hits:  89%/3556  
      🟩 90;90a;100         Pass: 100%/1   | Total: 30m 29s | Avg: 30m 29s | Max: 30m 29s | Hits:  79%/1778  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 10h 50m | Avg: 32m 31s | Max: 54m 57s | Hits:  75%/35531 
      🟩 20                 Pass: 100%/23  | Total:  9h 36m | Avg: 25m 04s | Max: 52m 09s | Hits:  82%/40869 
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 15m 04s | Avg: 7m 32s | Max: 12m 45s | Hits: 98%/308

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 15m 04s | Avg:  7m 32s | Max: 12m 45s | Hits:  98%/308   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 15m 04s | Avg:  7m 32s | Max: 12m 45s | Hits:  98%/308   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 15m 04s | Avg:  7m 32s | Max: 12m 45s | Hits:  98%/308   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 15m 04s | Avg:  7m 32s | Max: 12m 45s | Hits:  98%/308   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 15m 04s | Avg:  7m 32s | Max: 12m 45s | Hits:  98%/308   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 15m 04s | Avg:  7m 32s | Max: 12m 45s | Hits:  98%/308   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 15m 04s | Avg:  7m 32s | Max: 12m 45s | Hits:  98%/308   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 19s | Avg:  2m 19s | Max:  2m 19s | Hits:  98%/154   
      🟩 Test               Pass: 100%/1   | Total: 12m 45s | Avg: 12m 45s | Max: 12m 45s | Hits:  98%/154   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 00m | Avg: 1h 00m | Max: 1h 00m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 93)

# Runner
66 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-arm64-cpu16
3 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1

Comment on lines +53 to 59
_CCCL_SUPPRESS_DEPRECATED_PUSH
DECLARE_TMPL_LAUNCH_WRAPPER(cub::DeviceHistogram::MultiHistogramEven,

Of course, the warning suppression doesn't work for the _lid1 targets. What do the reviewers think: do we need to cover the deprecated APIs in the unit tests at all? If not, this problem would no longer need solving.
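For readers unfamiliar with the pattern under discussion, here is a minimal host-only sketch of push/pop deprecation suppression around a call to a deprecated overload. Everything in it is an assumption made for illustration: the `DEMO_SUPPRESS_DEPRECATED_*` macros are hypothetical stand-ins for `_CCCL_SUPPRESS_DEPRECATED_PUSH` and its counterpart, `sum_levels` is an invented function rather than a CUB API, and `std::array` stands in for `cuda::std::array`. It does not reproduce the launch-wrapper macro from the test or the behavior of the _lid1 targets.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-ins for the suppression macros used in the test file.
// Only host-compiler diagnostics are handled here; this is an assumption of the sketch.
#if defined(__clang__)
#  define DEMO_SUPPRESS_DEPRECATED_PUSH \
    _Pragma("clang diagnostic push")    \
    _Pragma("clang diagnostic ignored \"-Wdeprecated-declarations\"")
#  define DEMO_SUPPRESS_DEPRECATED_POP _Pragma("clang diagnostic pop")
#elif defined(__GNUC__)
#  define DEMO_SUPPRESS_DEPRECATED_PUSH \
    _Pragma("GCC diagnostic push")      \
    _Pragma("GCC diagnostic ignored \"-Wdeprecated-declarations\"")
#  define DEMO_SUPPRESS_DEPRECATED_POP _Pragma("GCC diagnostic pop")
#elif defined(_MSC_VER)
#  define DEMO_SUPPRESS_DEPRECATED_PUSH __pragma(warning(push)) __pragma(warning(disable : 4996))
#  define DEMO_SUPPRESS_DEPRECATED_POP  __pragma(warning(pop))
#else
#  define DEMO_SUPPRESS_DEPRECATED_PUSH
#  define DEMO_SUPPRESS_DEPRECATED_POP
#endif

// Hypothetical deprecated overload: the caller passes a pointer and a length,
// so the number of elements is not part of the parameter type.
[[deprecated("use the std::array overload")]]
inline int sum_levels(const int* num_levels, int n)
{
  int sum = 0;
  for (int i = 0; i < n; ++i)
  {
    sum += num_levels[i];
  }
  return sum;
}

// Hypothetical replacement overload: the element count is part of the type.
inline int sum_levels(const std::array<int, 3>& num_levels)
{
  int sum = 0;
  for (int v : num_levels)
  {
    sum += v;
  }
  return sum;
}

// The deprecated overload is still exercised, with the warning silenced
// only for the code between push and pop.
DEMO_SUPPRESS_DEPRECATED_PUSH
inline int call_deprecated()
{
  const int levels[3] = {2, 3, 4};
  return sum_levels(levels, 3);
}
DEMO_SUPPRESS_DEPRECATED_POP

// Small self-check that both overloads agree on the same input.
inline void self_check()
{
  assert(call_deprecated() == sum_levels(std::array<int, 3>{{2, 3, 4}}));
}
```

The suppression only covers code that lies textually between the push and the pop, which is why where the deprecated call ends up after macro or template expansion matters.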

@github-actions

github-actions bot commented Mar 4, 2025

🟨 CI finished in 1h 22m: Pass: 79%/93 | Total: 18h 16m | Avg: 11m 47s | Max: 1h 05m | Hits: 93%/110728
  • 🟨 cub: Pass: 57%/45 | Total: 10h 16m | Avg: 13m 42s | Max: 1h 05m | Hits: 86%/30464

    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 12m 04s | Avg:  6m 02s | Max:  6m 16s | Hits:  98%/2104  
      🔍 nvcc               Pass:  55%/43  | Total: 10h 04m | Avg: 14m 04s | Max:  1h 05m | Hits:  85%/28360 
    🔍 cxx_family: GCC 🔍
      🟩 Clang              Pass: 100%/17  | Total:  2h 29m | Avg:  8m 48s | Max: 23m 10s | Hits:  98%/20382 
      🔍 GCC                Pass:  13%/22  | Total:  1h 59m | Avg:  5m 25s | Max:  8m 21s | Hits:  98%/3660  
      🟩 MSVC               Pass: 100%/4   | Total:  4h 12m | Avg:  1h 03m | Max:  1h 05m | Hits:  15%/4168  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 35m | Avg: 47m 46s | Max: 47m 51s | Hits:  83%/2254  
    🟨 ctk
      🟨 12.0               Pass:  80%/5   | Total:  1h 29m | Avg: 17m 48s | Max:  1h 01m | Hits:  80%/4702  
      🟩 12.5               Pass: 100%/2   | Total:  1h 35m | Avg: 47m 46s | Max: 47m 51s | Hits:  83%/2254  
      🟨 12.8               Pass:  52%/38  | Total:  7h 12m | Avg: 11m 22s | Max:  1h 05m | Hits:  87%/23508 
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 12m 04s | Avg:  6m 02s | Max:  6m 16s | Hits:  98%/2104  
      🟨 nvcc12.0           Pass:  80%/5   | Total:  1h 29m | Avg: 17m 48s | Max:  1h 01m | Hits:  80%/4702  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 35m | Avg: 47m 46s | Max: 47m 51s | Hits:  83%/2254  
      🟨 nvcc12.8           Pass:  50%/36  | Total:  7h 00m | Avg: 11m 40s | Max:  1h 05m | Hits:  86%/21404 
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 28m 13s | Avg:  7m 03s | Max:  7m 35s | Hits:  98%/4880  
      🟩 Clang15            Pass: 100%/2   | Total: 14m 23s | Avg:  7m 11s | Max:  7m 12s | Hits:  98%/2436  
      🟩 Clang16            Pass: 100%/2   | Total: 14m 08s | Avg:  7m 04s | Max:  7m 08s | Hits:  98%/2436  
      🟩 Clang17            Pass: 100%/2   | Total: 15m 35s | Avg:  7m 47s | Max:  7m 54s | Hits:  98%/2436  
      🟩 Clang18            Pass: 100%/7   | Total:  1h 17m | Avg: 11m 04s | Max: 23m 10s | Hits:  98%/8194  
      🟩 GCC7               Pass: 100%/2   | Total: 14m 37s | Avg:  7m 18s | Max:  7m 29s | Hits:  98%/2440  
      🟩 GCC8               Pass: 100%/1   | Total:  6m 55s | Avg:  6m 55s | Max:  6m 55s | Hits:  98%/1220  
      🟥 GCC9               Pass:   0%/2   | Total: 14m 41s | Avg:  7m 20s | Max:  7m 34s
      🟥 GCC10              Pass:   0%/2   | Total: 15m 12s | Avg:  7m 36s | Max:  7m 57s
      🟥 GCC11              Pass:   0%/2   | Total: 15m 11s | Avg:  7m 35s | Max:  7m 52s
      🟥 GCC12              Pass:   0%/2   | Total: 15m 40s | Avg:  7m 50s | Max:  8m 06s
      🟥 GCC13              Pass:   0%/11  | Total: 37m 02s | Avg:  3m 22s | Max:  8m 21s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 02m | Avg:  1h 01m | Max:  1h 01m | Hits:  15%/2084  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 05m | Hits:  15%/2084  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 35m | Avg: 47m 46s | Max: 47m 51s | Hits:  83%/2254  
    🟨 cpu
      🟨 amd64              Pass:  58%/43  | Total: 10h 02m | Avg: 14m 00s | Max:  1h 05m | Hits:  85%/29246 
      🟨 arm64              Pass:  50%/2   | Total: 14m 17s | Avg:  7m 08s | Max:  7m 38s | Hits:  98%/1218  
    🟨 gpu
      🟥 h100               Pass:   0%/3   | Total:  5m 18s | Avg:  1m 46s | Max:  5m 18s
      🟨 rtx2080            Pass:  67%/34  | Total:  9h 12m | Avg: 16m 15s | Max:  1h 05m | Hits:  84%/26810 
      🟨 rtxa6000           Pass:  37%/8   | Total: 58m 58s | Avg:  7m 22s | Max: 23m 10s | Hits:  99%/3654  
    🟨 jobs
      🟨 Build              Pass:  64%/37  | Total:  9h 33m | Avg: 15m 29s | Max:  1h 05m | Hits:  84%/28028 
      🟥 DeviceLaunch       Pass:   0%/1  
      🟥 GraphCapture       Pass:   0%/1  
      🟨 HostLaunch         Pass:  33%/3   | Total: 23m 10s | Avg:  7m 43s | Max: 23m 10s | Hits: 100%/1218  
      🟨 TestGPU            Pass:  33%/3   | Total: 20m 21s | Avg:  6m 47s | Max: 20m 21s | Hits: 100%/1218  
    🟥 sm
      🟥 90                 Pass:   0%/3   | Total:  5m 18s | Avg:  1m 46s | Max:  5m 18s
      🟥 90;90a;100         Pass:   0%/1   | Total:  7m 51s | Avg:  7m 51s | Max:  7m 51s
    🟨 std
      🟨 17                 Pass:  70%/20  | Total:  5h 53m | Avg: 17m 41s | Max:  1h 05m | Hits:  81%/16277 
      🟨 20                 Pass:  48%/25  | Total:  4h 23m | Avg: 10m 31s | Max:  1h 04m | Hits:  91%/14187 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 6h 44m | Avg: 8m 59s | Max: 34m 20s | Hits: 96%/79956

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 17m 32s | Avg:  8m 46s | Max: 11m 33s | Hits:  99%/3556  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  6h 35m | Avg:  9m 11s | Max: 34m 20s | Hits:  96%/76401 
      🟩 arm64              Pass: 100%/2   | Total:  9m 36s | Avg:  4m 48s | Max:  5m 08s | Hits:  99%/3555  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 42m 54s | Avg:  8m 34s | Max: 23m 06s | Hits:  94%/8881  
      🟩 12.5               Pass: 100%/2   | Total: 35m 11s | Avg: 17m 35s | Max: 18m 39s | Hits:  96%/3554  
      🟩 12.8               Pass: 100%/38  | Total:  5h 26m | Avg:  8m 35s | Max: 34m 20s | Hits:  96%/67521 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 12s | Avg:  5m 06s | Max:  5m 23s | Hits: 100%/3554  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 42m 54s | Avg:  8m 34s | Max: 23m 06s | Hits:  94%/8881  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 35m 11s | Avg: 17m 35s | Max: 18m 39s | Hits:  96%/3554  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  5h 16m | Avg:  8m 47s | Max: 34m 20s | Hits:  96%/63967 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 12s | Avg:  5m 06s | Max:  5m 23s | Hits: 100%/3554  
      🟩 nvcc               Pass: 100%/43  | Total:  6h 34m | Avg:  9m 10s | Max: 34m 20s | Hits:  96%/76402 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 20m 14s | Avg:  5m 03s | Max:  5m 22s | Hits: 100%/7108  
      🟩 Clang15            Pass: 100%/2   | Total: 10m 46s | Avg:  5m 23s | Max:  5m 25s | Hits: 100%/3554  
      🟩 Clang16            Pass: 100%/2   | Total: 10m 38s | Avg:  5m 19s | Max:  5m 22s | Hits: 100%/3554  
      🟩 Clang17            Pass: 100%/2   | Total: 10m 50s | Avg:  5m 25s | Max:  5m 30s | Hits: 100%/3554  
      🟩 Clang18            Pass: 100%/7   | Total: 43m 34s | Avg:  6m 13s | Max: 10m 40s | Hits: 100%/12439 
      🟩 GCC7               Pass: 100%/2   | Total: 10m 20s | Avg:  5m 10s | Max:  5m 33s | Hits:  99%/3556  
      🟩 GCC8               Pass: 100%/1   | Total:  5m 08s | Avg:  5m 08s | Max:  5m 08s | Hits:  99%/1778  
      🟩 GCC9               Pass: 100%/2   | Total: 10m 55s | Avg:  5m 27s | Max:  5m 29s | Hits:  99%/3556  
      🟩 GCC10              Pass: 100%/2   | Total: 10m 48s | Avg:  5m 24s | Max:  5m 29s | Hits:  99%/3556  
      🟩 GCC11              Pass: 100%/2   | Total: 11m 15s | Avg:  5m 37s | Max:  5m 42s | Hits:  99%/3556  
      🟩 GCC12              Pass: 100%/2   | Total: 12m 09s | Avg:  6m 04s | Max:  6m 17s | Hits:  99%/3556  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 16m | Avg:  7m 41s | Max: 11m 52s | Hits:  99%/17780 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 47m 26s | Avg: 23m 43s | Max: 24m 20s | Hits:  70%/3542  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  1h 28m | Avg: 29m 32s | Max: 34m 20s | Hits:  70%/5313  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 35m 11s | Avg: 17m 35s | Max: 18m 39s | Hits:  96%/3554  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  1h 36m | Avg:  5m 38s | Max: 10m 40s | Hits: 100%/30209 
      🟩 GCC                Pass: 100%/21  | Total:  2h 17m | Avg:  6m 32s | Max: 11m 52s | Hits:  99%/37338 
      🟩 MSVC               Pass: 100%/5   | Total:  2h 16m | Avg: 27m 12s | Max: 34m 20s | Hits:  70%/8855  
      🟩 NVHPC              Pass: 100%/2   | Total: 35m 11s | Avg: 17m 35s | Max: 18m 39s | Hits:  96%/3554  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 16m 46s | Avg:  8m 23s | Max: 11m 52s | Hits:  99%/3556  
      🟩 rtx2080            Pass: 100%/33  | Total:  4h 19m | Avg:  7m 51s | Max: 25m 52s | Hits:  97%/58637 
      🟩 rtx4090            Pass: 100%/10  | Total:  2h 08m | Avg: 12m 52s | Max: 34m 20s | Hits:  94%/17763 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total:  5h 09m | Avg:  8m 09s | Max: 28m 26s | Hits:  96%/67519 
      🟩 TestCPU            Pass: 100%/3   | Total: 49m 32s | Avg: 16m 30s | Max: 34m 20s | Hits:  90%/5326  
      🟩 TestGPU            Pass: 100%/4   | Total: 45m 14s | Avg: 11m 18s | Max: 11m 52s | Hits:  99%/7111  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 16m 46s | Avg:  8m 23s | Max: 11m 52s | Hits:  99%/3556  
      🟩 90;90a;100         Pass: 100%/1   | Total:  6m 33s | Avg:  6m 33s | Max:  6m 33s | Hits:  99%/1778  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  2h 55m | Avg:  8m 47s | Max: 25m 52s | Hits:  95%/35531 
      🟩 20                 Pass: 100%/23  | Total:  3h 31m | Avg:  9m 11s | Max: 34m 20s | Hits:  97%/40869 
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 15m 16s | Avg: 7m 38s | Max: 13m 03s | Hits: 98%/308

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 15m 16s | Avg:  7m 38s | Max: 13m 03s | Hits:  98%/308   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 15m 16s | Avg:  7m 38s | Max: 13m 03s | Hits:  98%/308   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 15m 16s | Avg:  7m 38s | Max: 13m 03s | Hits:  98%/308   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 15m 16s | Avg:  7m 38s | Max: 13m 03s | Hits:  98%/308   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 15m 16s | Avg:  7m 38s | Max: 13m 03s | Hits:  98%/308   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 15m 16s | Avg:  7m 38s | Max: 13m 03s | Hits:  98%/308   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 15m 16s | Avg:  7m 38s | Max: 13m 03s | Hits:  98%/308   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 13s | Avg:  2m 13s | Max:  2m 13s | Hits:  98%/154   
      🟩 Test               Pass: 100%/1   | Total: 13m 03s | Avg: 13m 03s | Max: 13m 03s | Hits:  98%/154   
    
  • 🟩 python: Pass: 100%/1 | Total: 59m 30s | Avg: 59m 30s | Max: 59m 30s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 59m 30s | Avg: 59m 30s | Max: 59m 30s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 59m 30s | Avg: 59m 30s | Max: 59m 30s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 59m 30s | Avg: 59m 30s | Max: 59m 30s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 59m 30s | Avg: 59m 30s | Max: 59m 30s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 59m 30s | Avg: 59m 30s | Max: 59m 30s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 59m 30s | Avg: 59m 30s | Max: 59m 30s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 59m 30s | Avg: 59m 30s | Max: 59m 30s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 59m 30s | Avg: 59m 30s | Max: 59m 30s
    


@github-actions

github-actions bot commented Mar 4, 2025

🟩 CI finished in 1h 08m: Pass: 100%/93 | Total: 17h 01m | Avg: 10m 59s | Max: 59m 45s | Hits: 95%/133878
  • 🟩 cub: Pass: 100%/45 | Total: 9h 04m | Avg: 12m 05s | Max: 32m 57s | Hits: 93%/53614

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  8h 50m | Avg: 12m 20s | Max: 32m 57s | Hits:  92%/51178 
      🟩 arm64              Pass: 100%/2   | Total: 13m 40s | Avg:  6m 50s | Max:  7m 16s | Hits:  99%/2436  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 57m 30s | Avg: 11m 30s | Max: 30m 47s | Hits:  84%/5922  
      🟩 12.5               Pass: 100%/2   | Total: 24m 40s | Avg: 12m 20s | Max: 12m 25s | Hits:  98%/2254  
      🟩 12.8               Pass: 100%/38  | Total:  7h 42m | Avg: 12m 09s | Max: 32m 57s | Hits:  93%/45438 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 11m 20s | Avg:  5m 40s | Max:  5m 40s | Hits:  99%/2104  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 57m 30s | Avg: 11m 30s | Max: 30m 47s | Hits:  84%/5922  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 24m 40s | Avg: 12m 20s | Max: 12m 25s | Hits:  98%/2254  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  7h 30m | Avg: 12m 31s | Max: 32m 57s | Hits:  93%/43334 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 11m 20s | Avg:  5m 40s | Max:  5m 40s | Hits:  99%/2104  
      🟩 nvcc               Pass: 100%/43  | Total:  8h 52m | Avg: 12m 23s | Max: 32m 57s | Hits:  92%/51510 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 27m 37s | Avg:  6m 54s | Max:  7m 27s | Hits:  99%/4880  
      🟩 Clang15            Pass: 100%/2   | Total: 13m 39s | Avg:  6m 49s | Max:  6m 56s | Hits:  99%/2436  
      🟩 Clang16            Pass: 100%/2   | Total: 13m 47s | Avg:  6m 53s | Max:  6m 55s | Hits:  99%/2436  
      🟩 Clang17            Pass: 100%/2   | Total: 14m 48s | Avg:  7m 24s | Max:  7m 44s | Hits:  99%/2436  
      🟩 Clang18            Pass: 100%/7   | Total:  1h 12m | Avg: 10m 22s | Max: 21m 47s | Hits:  99%/8194  
      🟩 GCC7               Pass: 100%/2   | Total: 14m 19s | Avg:  7m 09s | Max:  7m 39s | Hits:  99%/2440  
      🟩 GCC8               Pass: 100%/1   | Total:  7m 06s | Avg:  7m 06s | Max:  7m 06s | Hits:  99%/1220  
      🟩 GCC9               Pass: 100%/2   | Total: 14m 00s | Avg:  7m 00s | Max:  7m 12s | Hits:  99%/2440  
      🟩 GCC10              Pass: 100%/2   | Total: 15m 48s | Avg:  7m 54s | Max:  8m 12s | Hits:  99%/2440  
      🟩 GCC11              Pass: 100%/2   | Total: 15m 05s | Avg:  7m 32s | Max:  7m 50s | Hits:  99%/2436  
      🟩 GCC12              Pass: 100%/2   | Total: 14m 52s | Avg:  7m 26s | Max:  7m 33s | Hits:  99%/2436  
      🟩 GCC13              Pass: 100%/11  | Total:  2h 49m | Avg: 15m 24s | Max: 23m 55s | Hits:  99%/13398 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 01m | Avg: 30m 40s | Max: 30m 47s | Hits:  15%/2084  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  1h 05m | Avg: 32m 37s | Max: 32m 57s | Hits:  15%/2084  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 24m 40s | Avg: 12m 20s | Max: 12m 25s | Hits:  98%/2254  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  2h 22m | Avg:  8m 22s | Max: 21m 47s | Hits:  99%/20382 
      🟩 GCC                Pass: 100%/22  | Total:  4h 10m | Avg: 11m 23s | Max: 23m 55s | Hits:  99%/26810 
      🟩 MSVC               Pass: 100%/4   | Total:  2h 06m | Avg: 31m 39s | Max: 32m 57s | Hits:  15%/4168  
      🟩 NVHPC              Pass: 100%/2   | Total: 24m 40s | Avg: 12m 20s | Max: 12m 25s | Hits:  98%/2254  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total: 53m 15s | Avg: 17m 45s | Max: 23m 55s | Hits:  99%/3654  
      🟩 rtx2080            Pass: 100%/34  | Total:  5h 49m | Avg: 10m 17s | Max: 32m 57s | Hits:  90%/40216 
      🟩 rtxa6000           Pass: 100%/8   | Total:  2h 21m | Avg: 17m 39s | Max: 23m 31s | Hits:  99%/9744  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  6h 09m | Avg:  9m 59s | Max: 32m 57s | Hits:  91%/43870 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 21m 42s | Avg: 21m 42s | Max: 21m 42s | Hits:  99%/1218  
      🟩 GraphCapture       Pass: 100%/1   | Total: 17m 28s | Avg: 17m 28s | Max: 17m 28s | Hits:  99%/1218  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 09m | Avg: 23m 04s | Max: 23m 55s | Hits:  99%/3654  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 05m | Avg: 21m 59s | Max: 23m 45s | Hits:  99%/3654  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 53m 15s | Avg: 17m 45s | Max: 23m 55s | Hits:  99%/3654  
      🟩 90;90a;100         Pass: 100%/1   | Total:  8m 11s | Avg:  8m 11s | Max:  8m 11s | Hits:  99%/1218  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  3h 39m | Avg: 10m 58s | Max: 32m 18s | Hits:  88%/23591 
      🟩 20                 Pass: 100%/25  | Total:  5h 24m | Avg: 12m 59s | Max: 32m 57s | Hits:  96%/30023 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 6h 42m | Avg: 8m 56s | Max: 35m 19s | Hits: 96%/79956

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 17m 06s | Avg:  8m 33s | Max: 11m 10s | Hits:  99%/3556  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  6h 32m | Avg:  9m 07s | Max: 35m 19s | Hits:  96%/76401 
      🟩 arm64              Pass: 100%/2   | Total:  9m 45s | Avg:  4m 52s | Max:  5m 13s | Hits:  99%/3555  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 45m 28s | Avg:  9m 05s | Max: 26m 02s | Hits:  94%/8881  
      🟩 12.5               Pass: 100%/2   | Total: 28m 56s | Avg: 14m 28s | Max: 15m 29s | Hits:  99%/3554  
      🟩 12.8               Pass: 100%/38  | Total:  5h 27m | Avg:  8m 37s | Max: 35m 19s | Hits:  96%/67521 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 28s | Avg:  5m 14s | Max:  5m 25s | Hits: 100%/3554  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 45m 28s | Avg:  9m 05s | Max: 26m 02s | Hits:  94%/8881  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 28m 56s | Avg: 14m 28s | Max: 15m 29s | Hits:  99%/3554  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  5h 17m | Avg:  8m 48s | Max: 35m 19s | Hits:  96%/63967 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 28s | Avg:  5m 14s | Max:  5m 25s | Hits: 100%/3554  
      🟩 nvcc               Pass: 100%/43  | Total:  6h 31m | Avg:  9m 06s | Max: 35m 19s | Hits:  96%/76402 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 19m 59s | Avg:  4m 59s | Max:  5m 18s | Hits: 100%/7108  
      🟩 Clang15            Pass: 100%/2   | Total: 10m 59s | Avg:  5m 29s | Max:  5m 35s | Hits: 100%/3554  
      🟩 Clang16            Pass: 100%/2   | Total: 10m 33s | Avg:  5m 16s | Max:  5m 17s | Hits: 100%/3554  
      🟩 Clang17            Pass: 100%/2   | Total: 10m 32s | Avg:  5m 16s | Max:  5m 22s | Hits: 100%/3554  
      🟩 Clang18            Pass: 100%/7   | Total: 43m 10s | Avg:  6m 10s | Max: 10m 10s | Hits: 100%/12439 
      🟩 GCC7               Pass: 100%/2   | Total: 10m 13s | Avg:  5m 06s | Max:  5m 21s | Hits:  99%/3556  
      🟩 GCC8               Pass: 100%/1   | Total:  5m 18s | Avg:  5m 18s | Max:  5m 18s | Hits:  99%/1778  
      🟩 GCC9               Pass: 100%/2   | Total: 11m 00s | Avg:  5m 30s | Max:  5m 51s | Hits:  99%/3556  
      🟩 GCC10              Pass: 100%/2   | Total: 11m 44s | Avg:  5m 52s | Max:  6m 14s | Hits:  99%/3556  
      🟩 GCC11              Pass: 100%/2   | Total: 11m 25s | Avg:  5m 42s | Max:  5m 54s | Hits:  99%/3556  
      🟩 GCC12              Pass: 100%/2   | Total: 11m 18s | Avg:  5m 39s | Max:  5m 47s | Hits:  99%/3556  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 16m | Avg:  7m 37s | Max: 11m 57s | Hits:  99%/17780 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 51m 53s | Avg: 25m 56s | Max: 26m 02s | Hits:  70%/3542  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  1h 28m | Avg: 29m 39s | Max: 35m 19s | Hits:  70%/5313  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 28m 56s | Avg: 14m 28s | Max: 15m 29s | Hits:  99%/3554  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  1h 35m | Avg:  5m 36s | Max: 10m 10s | Hits: 100%/30209 
      🟩 GCC                Pass: 100%/21  | Total:  2h 17m | Avg:  6m 31s | Max: 11m 57s | Hits:  99%/37338 
      🟩 MSVC               Pass: 100%/5   | Total:  2h 20m | Avg: 28m 10s | Max: 35m 19s | Hits:  70%/8855  
      🟩 NVHPC              Pass: 100%/2   | Total: 28m 56s | Avg: 14m 28s | Max: 15m 29s | Hits:  99%/3554  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 16m 35s | Avg:  8m 17s | Max: 11m 57s | Hits:  99%/3556  
      🟩 rtx2080            Pass: 100%/33  | Total:  4h 18m | Avg:  7m 49s | Max: 26m 41s | Hits:  97%/58637 
      🟩 rtx4090            Pass: 100%/10  | Total:  2h 07m | Avg: 12m 44s | Max: 35m 19s | Hits:  94%/17763 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total:  5h 06m | Avg:  8m 04s | Max: 26m 57s | Hits:  96%/67519 
      🟩 TestCPU            Pass: 100%/3   | Total: 50m 30s | Avg: 16m 50s | Max: 35m 19s | Hits:  90%/5326  
      🟩 TestGPU            Pass: 100%/4   | Total: 44m 50s | Avg: 11m 12s | Max: 11m 57s | Hits:  99%/7111  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 16m 35s | Avg:  8m 17s | Max: 11m 57s | Hits:  99%/3556  
      🟩 90;90a;100         Pass: 100%/1   | Total:  6m 00s | Avg:  6m 00s | Max:  6m 00s | Hits:  99%/1778  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  3h 00m | Avg:  9m 02s | Max: 26m 41s | Hits:  95%/35531 
      🟩 20                 Pass: 100%/23  | Total:  3h 24m | Avg:  8m 52s | Max: 35m 19s | Hits:  97%/40869 
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 15m 44s | Avg: 7m 52s | Max: 13m 23s | Hits: 98%/308

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 15m 44s | Avg:  7m 52s | Max: 13m 23s | Hits:  98%/308   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 15m 44s | Avg:  7m 52s | Max: 13m 23s | Hits:  98%/308   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 15m 44s | Avg:  7m 52s | Max: 13m 23s | Hits:  98%/308   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 15m 44s | Avg:  7m 52s | Max: 13m 23s | Hits:  98%/308   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 15m 44s | Avg:  7m 52s | Max: 13m 23s | Hits:  98%/308   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 15m 44s | Avg:  7m 52s | Max: 13m 23s | Hits:  98%/308   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 15m 44s | Avg:  7m 52s | Max: 13m 23s | Hits:  98%/308   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 21s | Avg:  2m 21s | Max:  2m 21s | Hits:  98%/154   
      🟩 Test               Pass: 100%/1   | Total: 13m 23s | Avg: 13m 23s | Max: 13m 23s | Hits:  98%/154   
    
  • 🟩 python: Pass: 100%/1 | Total: 59m 45s | Avg: 59m 45s | Max: 59m 45s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 59m 45s | Avg: 59m 45s | Max: 59m 45s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 59m 45s | Avg: 59m 45s | Max: 59m 45s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 59m 45s | Avg: 59m 45s | Max: 59m 45s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 59m 45s | Avg: 59m 45s | Max: 59m 45s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 59m 45s | Avg: 59m 45s | Max: 59m 45s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 59m 45s | Avg: 59m 45s | Max: 59m 45s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 59m 45s | Avg: 59m 45s | Max: 59m 45s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 59m 45s | Avg: 59m 45s | Max: 59m 45s
    


@bernhardmgruber bernhardmgruber enabled auto-merge (squash) March 4, 2025 18:33
@github-project-automation github-project-automation bot moved this from In Progress to In Review in CCCL Mar 5, 2025
@bernhardmgruber bernhardmgruber merged commit a8b3675 into NVIDIA:main Mar 5, 2025
106 of 109 checks passed
@github-project-automation github-project-automation bot moved this from In Review to Done in CCCL Mar 5, 2025
@bernhardmgruber bernhardmgruber deleted the ref_hist_array branch March 5, 2025 09:14
davebayer pushed a commit to davebayer/cccl that referenced this pull request Apr 7, 2025

Labels

cub For all items related to CUB

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

CUB histogram API signatures are misleading

4 participants