Build CUDA benchmarks once, but run in parallel #8489

Merged
AdamGS merged 3 commits into develop from adamg/unifrom-codspeed-gpu-build
Jun 18, 2026

CodSpeed HQ / CodSpeed Performance Analysis failed Jun 18, 2026 in 0s

Performance Regression: -14.25%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 1 improved benchmark
❌ 4 regressed benchmarks
✅ 1576 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | take_10k_random | 197.9 µs | 255.8 µs | -22.63% |
| Simulation | take_10k_contiguous | 218.5 µs | 276.4 µs | -20.94% |
| Simulation | patched_take_10k_contiguous_patches | 232.2 µs | 291 µs | -20.18% |
| Simulation | patched_take_10k_random | 244.2 µs | 303 µs | -19.41% |
| WallTime | cuda/bitpacked_u8/unpack/3bw[100M] | 352.6 µs | 299.3 µs | +17.8% |
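
As a sanity check on the table above, the Efficiency column looks consistent with `BASE / HEAD - 1`, expressed as a percentage. This formula is inferred from the reported numbers, not taken from CodSpeed's documentation, and the helper name `efficiency_pct` is hypothetical. Small differences from the reported values are expected because the displayed timings are rounded.

```python
# Minimal sketch: recompute the Efficiency column from the rounded BASE/HEAD
# timings shown in the report. Assumption (inferred from the numbers, not
# confirmed by CodSpeed): efficiency = base / head - 1.

def efficiency_pct(base_us: float, head_us: float) -> float:
    """Return the relative change in percent; negative means HEAD is slower."""
    if base_us <= 0 or head_us <= 0:
        raise ValueError("timings must be positive")
    return (base_us / head_us - 1.0) * 100.0


# (benchmark, BASE in µs, HEAD in µs, reported Efficiency in %) from the report.
ROWS = [
    ("take_10k_random", 197.9, 255.8, -22.63),
    ("take_10k_contiguous", 218.5, 276.4, -20.94),
    ("patched_take_10k_contiguous_patches", 232.2, 291.0, -20.18),
    ("patched_take_10k_random", 244.2, 303.0, -19.41),
    ("cuda/bitpacked_u8/unpack/3bw[100M]", 352.6, 299.3, 17.8),
]

if __name__ == "__main__":
    for name, base, head, reported in ROWS:
        computed = efficiency_pct(base, head)
        print(f"{name}: computed {computed:+.2f}%, reported {reported:+.2f}%")
```

Under this reading, a HEAD time longer than BASE gives a negative value (a regression), and a shorter HEAD time gives a positive one (an improvement).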

Tip

Investigate this regression by commenting `@codspeedbot fix this regression` on this PR, or directly use the CodSpeed MCP with your agent.


Comparing adamg/unifrom-codspeed-gpu-build (9ed22a8) with develop (d020924)