Expose ivf-flat centers to python/c#888
Expose ivf-flat centers to python/c#888rapids-bot[bot] merged 13 commits intorapidsai:branch-25.06from
Conversation
Similar to rapidsai#881 - also expose centers for ivf-flat as well as ivf-pq
cpp/src/neighbors/ivf_flat_c.cpp
Outdated
| RAFT_EXPECTS(src.extent(0) == dst.extent(0), "Output centers has incorrect number of rows"); | ||
| RAFT_EXPECTS(src.extent(1) == dst.extent(1), "Output centers has incorrect number of cols"); | ||
|
|
||
| cudaMemcpyAsync(dst.data_handle(), |
There was a problem hiding this comment.
Can we use raft::copy or raft::copy_async in these calls instead of direct calls to cudaMemcpyAsync?
There was a problem hiding this comment.
sure! I updated in the last commit to use the raft::copy from raft/util/cudart_utils.hpp .
(Fwiw, I originally tried to use the copy functions from raft/core/copy.hpp and raft/core/copy.cuh but couldn't - since we can't use the '.cuh' version inside the C-api, and the .hpp version was complaining about needing a cuda kernel for the D2H copy iirc, which is why I was using cudaMemCpyAsync directly here).
There was a problem hiding this comment.
I am seeing that nn_descent_c.cpp also uses cudaMemcpyAsync. If there is a reason we have avoided the cudart header, we can stick to cudaMemcpyAsync. My understanding was that having the cudaMemcpyAsync call means that the file would have to be compiled with nvcc anyway so we should be reusing raft functions instead.
(cc @cjnolet)
There was a problem hiding this comment.
My understanding was that having the cudaMemcpyAsync call means that the file would have to be compiled with nvcc anyway
@tarang-jain cudaMemCpyXX() is part of the CUDA runtime API, so it should not require nvcc to compile. Only the lower-level device function routines will need nvcc. Otherwise, it's just linking against the pre-compiled CUDA routines in the runtime API (kind of similar to what end-users do when they use cuVS C/C++ APIs).
The reason why raft::copy ends up requiring nvcc is because there were recently some device functions added to raft to work specifically with mdspan... to be honest, I'd be in favor of separating those out from the ones that only require the runtime API for this very purpose.
There was a problem hiding this comment.
I updated nn_descent_c.cpp in the last commit to use raft::copy.
the cudaMemcpyAsync code is fine (as is copy functions from raft/util/cudart_utils.hpp) - I just couldn't use this code https://github.com/rapidsai/raft/blob/c2dc3124ce3fbcb5ff2ccabd88d7f57570b6aea9/cpp/include/raft/core/copy.cuh#L57-L61 from raft .
fwiw, one nice thing about using raft::copy is that it fixes one issue that this code used to have (wasn't checking the return value from cudaMemCpyAsync , which was a stupid oversight on my part =) ).
mythrocks
left a comment
There was a problem hiding this comment.
Some trivial nitpicks. Still coming to terms with the code.
The C++ side looks good to my eye. +1 non-binding.
| if (index->dtype.code == kDLFloat && index->dtype.bits == 32) { | ||
| auto index_ptr = | ||
| reinterpret_cast<cuvs::neighbors::ivf_flat::index<float, int64_t>*>(index->addr); | ||
| return index_ptr->n_lists(); | ||
| } else if (index->dtype.code == kDLFloat && index->dtype.bits == 16) { | ||
| auto index_ptr = | ||
| reinterpret_cast<cuvs::neighbors::ivf_flat::index<half, int64_t>*>(index->addr); | ||
| return index_ptr->n_lists(); | ||
| } else if (index->dtype.code == kDLInt && index->dtype.bits == 8) { | ||
| auto index_ptr = | ||
| reinterpret_cast<cuvs::neighbors::ivf_flat::index<int8_t, int64_t>*>(index->addr); | ||
| return index_ptr->n_lists(); | ||
| } else if (index->dtype.code == kDLUInt && index->dtype.bits == 8) { | ||
| auto index_ptr = | ||
| reinterpret_cast<cuvs::neighbors::ivf_flat::index<uint8_t, int64_t>*>(index->addr); | ||
| return index_ptr->n_lists(); |
There was a problem hiding this comment.
This is a recurring pattern in the code. One wonders if libcudf's type-dispatch pattern might be of value here.
We might consider exploring at a later date.
| sizeof(float) * index.dim_ext(), | ||
| sizeof(float) * index.dim(), | ||
| index.n_lists(), | ||
| cudaMemcpyDefault, |
There was a problem hiding this comment.
TIL cudaMemcpyDefault. I didn't know/realize that the direction could be inferred.
| auto res_ptr = reinterpret_cast<raft::resources*>(res); | ||
| auto index_ptr = reinterpret_cast<cuvs::neighbors::ivf_flat::index<T, IdxT>*>(index.addr); | ||
| auto dst = cuvs::core::from_dlpack<output_mdspan_type>(centers); | ||
| auto src = index_ptr->centers(); |
There was a problem hiding this comment.
Would there be value in making any of these const?
I don't have familiarity with the code yet, so I haven't grokked the semantics of raft::copy. Apologies, if this is noise.
Co-authored-by: MithunR <mythrocks@gmail.com>
|
/merge |
Similar to rapidsai#881 - also expose centers for ivf-flat as well as ivf-pq Authors: - Ben Frederickson (https://github.com/benfred) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Tarang Jain (https://github.com/tarang-jain) - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#888
Similar to #881 - also expose centers for ivf-flat as well as ivf-pq