I'm curious if there were any recent discussions on migrating the storage type from cuco::pair_type<cuda::atomic<key_type>, cuda::atomic<mapped_type>>* to cuco::pair_type<key_type, mapped_type>* and then using libcu++'s new cuda::atomic_ref<T> for thread-safe table manipulation when needed. (also addressed in NVIDIA/libcudacxx#110)
Let's start this thread to collect everything related to this topic in one place.
I will update the top post regularly so people don't have to scroll through everything.
Constraints imposed by the standard:
The lifetime of an object must exceed the lifetime of all atomic_refs that reference the object. While any atomic_ref instance referencing an object exists, the object must be accessed exclusively through these atomic_ref instances. No subobject of an object referenced by an atomic_ref object may be concurrently referenced by any other atomic_ref object.
Atomic operations applied to an object through an atomic_ref are atomic with respect to atomic operations applied through any other atomic_ref referencing the same object.
- We need to refactor some components, e.g.,
  - probe_sequence defines the slot type as a pair of cuda::atomic objects, although it does not appear to use any of the atomic operations.
As I am currently refactoring the static_reduction_map PR (#98), I would suggest using this as an opportunity to test this atomic_ref approach in our codebase and see if we run into any problems before refactoring all of the data structures.
Starting with release 1.9.0, libcu++'s atomic_ref will support floating point types (see "Add atomics for floating point types", libcudacxx#286), as well as types smaller than 4 bytes (these should already be available, iirc). FP16 support is still WIP.
Large type support (>8B) is currently blocked by "Enable user-provided lock table for atomic_ref<T>" (cccl#990) and "Added support for most of <mutex>" (libcudacxx#113). The latter is planned to land in 1.9.0.