
Expose ivf-flat centers to python/c #888

Merged
rapids-bot[bot] merged 13 commits into rapidsai:branch-25.06 from benfred:python_ivf_flat_centers
May 27, 2025

Conversation

@benfred
Member

@benfred benfred commented May 13, 2025

Similar to #881 - also expose centers for ivf-flat as well as ivf-pq

@benfred benfred requested review from a team as code owners May 13, 2025 00:17
@benfred benfred added the improvement (Improves an existing functionality) and non-breaking (Introduces a non-breaking change) labels May 13, 2025
RAFT_EXPECTS(src.extent(0) == dst.extent(0), "Output centers has incorrect number of rows");
RAFT_EXPECTS(src.extent(1) == dst.extent(1), "Output centers has incorrect number of cols");

cudaMemcpyAsync(dst.data_handle(),
Contributor

Can we use raft::copy or raft::copy_async in these calls instead of direct calls to cudaMemcpyAsync?

Member Author

Sure! I updated it in the last commit to use raft::copy from raft/util/cudart_utils.hpp.

(FWIW, I originally tried to use the copy functions from raft/core/copy.hpp and raft/core/copy.cuh but couldn't: we can't use the '.cuh' version inside the C API, and the .hpp version was complaining about needing a CUDA kernel for the D2H copy, IIRC, which is why I was using cudaMemcpyAsync directly here.)

Contributor

I am seeing that nn_descent_c.cpp also uses cudaMemcpyAsync. If there is a reason we have avoided the cudart header, we can stick to cudaMemcpyAsync. My understanding was that having the cudaMemcpyAsync call means that the file would have to be compiled with nvcc anyway so we should be reusing raft functions instead.
(cc @cjnolet)

Member
@cjnolet cjnolet May 13, 2025

> My understanding was that having the cudaMemcpyAsync call means that the file would have to be compiled with nvcc anyway

@tarang-jain cudaMemcpyXX() is part of the CUDA runtime API, so it should not require nvcc to compile. Only the lower-level device function routines need nvcc; otherwise, it's just linking against the pre-compiled CUDA routines in the runtime API (similar to what end users do when they use the cuVS C/C++ APIs).

The reason raft::copy ends up requiring nvcc is that some device functions were recently added to raft to work specifically with mdspan... to be honest, I'd be in favor of separating those out from the ones that only require the runtime API, for this very purpose.

Member Author

I updated nn_descent_c.cpp in the last commit to use raft::copy.

The cudaMemcpyAsync code is fine (as are the copy functions from raft/util/cudart_utils.hpp); I just couldn't use this code https://github.com/rapidsai/raft/blob/c2dc3124ce3fbcb5ff2ccabd88d7f57570b6aea9/cpp/include/raft/core/copy.cuh#L57-L61 from raft.

FWIW, one nice thing about using raft::copy is that it fixes an issue this code used to have: it wasn't checking the return value from cudaMemcpyAsync, which was a stupid oversight on my part =).
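The return-value point above can be illustrated with a host-only C++ sketch. Note this is not cuVS/raft code: `fake_error_t` and `fake_memcpy_async` are hypothetical stand-ins for `cudaError_t`/`cudaMemcpyAsync`, and `CHECK_STATUS` mirrors the kind of status check that raft performs internally (via its `RAFT_CUDA_TRY` macro) so a failed copy can't be silently ignored.

```cpp
#include <cstddef>
#include <cstring>
#include <sstream>
#include <stdexcept>

// Hypothetical stand-ins for cudaError_t / cudaMemcpyAsync (host-only illustration).
enum class fake_error_t { success, invalid_value };

inline fake_error_t fake_memcpy_async(void* dst, const void* src, std::size_t bytes)
{
  if (dst == nullptr || src == nullptr) { return fake_error_t::invalid_value; }
  std::memcpy(dst, src, bytes);
  return fake_error_t::success;
}

// Turns a silently-ignorable status code into an exception; this is the
// behavior you lose when the return value of an async copy is discarded.
#define CHECK_STATUS(call)                                        \
  do {                                                            \
    fake_error_t status_ = (call);                                \
    if (status_ != fake_error_t::success) {                       \
      std::ostringstream msg_;                                    \
      msg_ << "call failed at " << __FILE__ << ":" << __LINE__;   \
      throw std::runtime_error(msg_.str());                       \
    }                                                             \
  } while (0)

inline bool run_demo()
{
  float src[4] = {1.f, 2.f, 3.f, 4.f};
  float dst[4] = {};
  CHECK_STATUS(fake_memcpy_async(dst, src, sizeof(src)));  // succeeds

  bool threw = false;
  try {
    CHECK_STATUS(fake_memcpy_async(nullptr, src, sizeof(src)));  // fails loudly
  } catch (const std::runtime_error&) {
    threw = true;
  }
  return dst[3] == 4.f && threw;
}
```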

@benfred benfred self-assigned this May 13, 2025
Contributor
@mythrocks mythrocks left a comment

Some trivial nitpicks. Still coming to terms with the code.

The C++ side looks good to my eye. +1 non-binding.

Comment on lines +387 to +402
if (index->dtype.code == kDLFloat && index->dtype.bits == 32) {
auto index_ptr =
reinterpret_cast<cuvs::neighbors::ivf_flat::index<float, int64_t>*>(index->addr);
return index_ptr->n_lists();
} else if (index->dtype.code == kDLFloat && index->dtype.bits == 16) {
auto index_ptr =
reinterpret_cast<cuvs::neighbors::ivf_flat::index<half, int64_t>*>(index->addr);
return index_ptr->n_lists();
} else if (index->dtype.code == kDLInt && index->dtype.bits == 8) {
auto index_ptr =
reinterpret_cast<cuvs::neighbors::ivf_flat::index<int8_t, int64_t>*>(index->addr);
return index_ptr->n_lists();
} else if (index->dtype.code == kDLUInt && index->dtype.bits == 8) {
auto index_ptr =
reinterpret_cast<cuvs::neighbors::ivf_flat::index<uint8_t, int64_t>*>(index->addr);
return index_ptr->n_lists();
Contributor

This is a recurring pattern in the code. One wonders if libcudf's type-dispatch pattern might be of value here.

We might consider exploring this at a later date.
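For reference, a host-only sketch of what that dispatch could look like. The types here are hypothetical simplifications: `dl_dtype` stands in for DLPack's `DLDataType`, and the functor-based `dispatch_dtype` imitates the shape of libcudf's `cudf::type_dispatcher` (the `half` case is omitted since it needs CUDA headers).

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Simplified stand-in for DLPack's DLDataType (illustrative only).
struct dl_dtype {
  uint8_t code;  // kInt / kUInt / kFloat
  uint8_t bits;
};

constexpr uint8_t kInt = 0, kUInt = 1, kFloat = 2;

// libcudf-style dispatcher: maps a runtime (code, bits) pair onto a single
// templated functor call, so the if/else ladder lives in exactly one place.
template <typename Functor>
auto dispatch_dtype(dl_dtype dt, Functor f)
{
  if (dt.code == kFloat && dt.bits == 32) { return f.template operator()<float>(); }
  if (dt.code == kInt && dt.bits == 8) { return f.template operator()<int8_t>(); }
  if (dt.code == kUInt && dt.bits == 8) { return f.template operator()<uint8_t>(); }
  throw std::invalid_argument("unsupported dtype");
}

// Example functor: reports sizeof(T). A real one would reinterpret_cast
// index->addr to ivf_flat::index<T, int64_t>* and call n_lists().
struct element_size {
  template <typename T>
  std::size_t operator()() const
  {
    return sizeof(T);
  }
};
```

With something like this, each ladder collapses to a single `dispatch_dtype(index->dtype, some_functor{...})` call.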

sizeof(float) * index.dim_ext(),
sizeof(float) * index.dim(),
index.n_lists(),
cudaMemcpyDefault,
Contributor

TIL cudaMemcpyDefault. I didn't know/realize that the direction could be inferred.

Comment on lines +147 to +150
auto res_ptr = reinterpret_cast<raft::resources*>(res);
auto index_ptr = reinterpret_cast<cuvs::neighbors::ivf_flat::index<T, IdxT>*>(index.addr);
auto dst = cuvs::core::from_dlpack<output_mdspan_type>(centers);
auto src = index_ptr->centers();
Contributor

Would there be value in making any of these const?

I'm not familiar with the code yet, so I haven't grokked the semantics of raft::copy. Apologies if this is noise.

@cjnolet
Member

cjnolet commented May 27, 2025

/merge

@rapids-bot rapids-bot bot merged commit b0f45bb into rapidsai:branch-25.06 May 27, 2025
75 checks passed
mythrocks pushed a commit to mythrocks/cuvs that referenced this pull request Jun 3, 2025
Similar to rapidsai#881 - also expose centers for ivf-flat as well as ivf-pq

Authors:
  - Ben Frederickson (https://github.com/benfred)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Tarang Jain (https://github.com/tarang-jain)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#888

Labels

cpp, improvement (Improves an existing functionality), non-breaking (Introduces a non-breaking change), Python


4 participants