[Java] New off-heap Dataset support for CAGRA and Bruteforce #902

Merged
rapids-bot[bot] merged 19 commits into rapidsai:branch-25.06 from SearchScale:ishan/new-dataset-method
May 29, 2025

Contributor

@chatman chatman commented May 15, 2025

As reported in #698, the current withDataset(float[][] arr) requires the entire dataset to be held on the heap first, before it is written out to a MemorySegment.

This PR introduces a new Dataset (interface and implementation) with an addVector(float[] vector) method that adds vectors to the MemorySegment one at a time, without needing to load them all at once.
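
To make the idea concrete, here is a minimal sketch of a row-by-row off-heap dataset. It is not the code from this PR: the class name `OffHeapDatasetSketch`, its constructor, and the `get` and `size` helpers are hypothetical, and it assumes the finalized `java.lang.foreign` API (Java 22 or later). Only the `addVector(float[] vector)` name comes from the description above.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

/** Hypothetical illustration only; not the cuVS Java Dataset implementation. */
final class OffHeapDatasetSketch implements AutoCloseable {
    private final Arena arena = Arena.ofConfined(); // owns the native allocation
    private final MemorySegment segment;            // rows * dims floats, row-major
    private final int rows;
    private final int dims;
    private int current = 0;                        // index of the next row to fill

    OffHeapDatasetSketch(int rows, int dims) {
        this.rows = rows;
        this.dims = dims;
        long bytes = (long) rows * dims * ValueLayout.JAVA_FLOAT.byteSize();
        this.segment = arena.allocate(bytes, ValueLayout.JAVA_FLOAT.byteAlignment());
    }

    /** Copies one vector into the next free row of the native segment. */
    void addVector(float[] vector) {
        if (current >= rows) {
            throw new IllegalStateException("dataset is full");
        }
        if (vector.length != dims) {
            throw new IllegalArgumentException("expected " + dims + " dimensions");
        }
        long byteOffset = (long) current * dims * ValueLayout.JAVA_FLOAT.byteSize();
        MemorySegment.copy(vector, 0, segment, ValueLayout.JAVA_FLOAT, byteOffset, dims);
        current++;
    }

    /** Reads a single element back from native memory. */
    float get(int row, int col) {
        return segment.getAtIndex(ValueLayout.JAVA_FLOAT, (long) row * dims + col);
    }

    /** Number of vectors added so far. */
    int size() {
        return current;
    }

    @Override
    public void close() {
        arena.close(); // frees the native memory
    }

    public static void main(String[] args) {
        try (OffHeapDatasetSketch ds = new OffHeapDatasetSketch(2, 3)) {
            ds.addVector(new float[] {1f, 2f, 3f});
            ds.addVector(new float[] {4f, 5f, 6f});
            System.out.println(ds.size() + " vectors stored off-heap");
        }
    }
}
```

The point of the sketch is that only one `float[]` needs to live on the heap at a time; the caller can stream vectors from a file or iterator instead of materializing a full `float[][]`.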


copy-pr-bot bot commented May 15, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Contributor Author

chatman commented May 15, 2025

FYI, @ChrisHegarty @narangvivek10.

@cjnolet added the improvement, non-breaking, and Java labels May 15, 2025
@chatman chatman marked this pull request as ready for review May 23, 2025 16:27
@narangvivek10
Contributor

Also, maybe update the title to [Java] New off-heap Dataset support for CAGRA and Bruteforce. Thanks!

@chatman changed the title from "Java: New off-heap Dataset support for CAGRA and Brute Force" to "[Java]: New off-heap Dataset support for CAGRA and Bruteforce" May 27, 2025
@chatman changed the title from "[Java]: New off-heap Dataset support for CAGRA and Bruteforce" to "[Java] New off-heap Dataset support for CAGRA and Bruteforce" May 27, 2025
Contributor Author

chatman commented May 27, 2025

@narangvivek10 @punAhuja Incorporated your feedback, thanks! Can you approve the PR?
@cjnolet I think this is ready for CI testing/merge now.

Member

cjnolet commented May 27, 2025

/ok to test 7a75d9a

Contributor Author

chatman commented May 27, 2025

Thanks @mythrocks, I've incorporated your suggestions.

I think we need to add a style check for the Java project and standardize tabs vs spaces in a separate PR. This codebase is a mix right now :-(

Contributor Author

chatman commented May 27, 2025

@mythrocks can you please trigger a CI run?

Contributor Author

chatman commented May 28, 2025

@cjnolet Can you please review and merge this?

Contributor

@mythrocks mythrocks left a comment

LGTM.

Member

cjnolet commented May 28, 2025

/ok to test 6148cff

Member

cjnolet commented May 28, 2025

/merge

@rapids-bot rapids-bot bot merged commit e8dbb88 into rapidsai:branch-25.06 May 29, 2025
75 checks passed
Contributor

@ChrisHegarty ChrisHegarty left a comment

Belated LGTM. ❤️

mythrocks pushed a commit to mythrocks/cuvs that referenced this pull request Jun 3, 2025
…i#902)

Authors:
  - Ishan Chattopadhyaya (https://github.com/chatman)
  - Vivek Narang (https://github.com/narangvivek10)

Approvers:
  - MithunR (https://github.com/mythrocks)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#902
Contributor Author

chatman commented Jun 13, 2025 via email

rapids-bot bot pushed a commit that referenced this pull request Jul 6, 2025
This PR is a follow-up from #902.
Still WIP (see self-comments on the changes) but I'd like some early feedback.

Authors:
  - Lorenzo Dematté (https://github.com/ldematte)
  - MithunR (https://github.com/mythrocks)

Approvers:
  - Chris Hegarty (https://github.com/ChrisHegarty)
  - MithunR (https://github.com/mythrocks)

URL: #1024
punAhuja pushed a commit to SearchScale/cuvs that referenced this pull request Jul 15, 2025
URL: rapidsai#1024
punAhuja pushed a commit to SearchScale/cuvs that referenced this pull request Jul 16, 2025
URL: rapidsai#1024
rapids-bot bot pushed a commit that referenced this pull request Jul 26, 2025
In #902 and #1034 we introduced a `Dataset` interface to support on-heap and off-heap ("native") memory seamlessly as inputs for cagra and bruteforce index building.

As we expand the functionality of cuvs-java, we realized we have similar needs for outputs (see e.g. #1105 / #1102 or #1104).

This PR extends `Dataset` to support being used as an output, wrapping native (off-heap) memory in a convenient and efficient way, and providing common utilities to transform to and from on-heap memory.
This work is inspired by the existing raft `mdspan` and `DLTensor` data structures, but tailored to our needs (2d only, just 3 data types, etc.). The PR keeps the current implementation simple and minimal on purpose, but structured in a way that is simple to extend.

By itself, the PR is just a refactoring to extend the `Dataset` implementation and reorganize the implementation classes; its real usefulness will be in using it in the PRs mentioned above (in fact, this PR has been extracted from #1105).
The implementation class hierarchy is designed with future extensions in mind: at the moment we have one `HostMemoryDatasetImpl`, but we are already considering a corresponding `DeviceMemoryDatasetImpl` that will wrap and manage views on GPU memory, to avoid (in some cases) extra copies of data from GPU memory to CPU memory made only to process them or forward them to another algorithm (e.g. quantization followed by indexing).

Future work will also include adding support for allocating and managing GPU memory and DLTensors, possibly with some refactoring (e.g. working better with, or refactoring, `prepareTensor`).
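
As a rough illustration of the "transform to and from on-heap memory" utilities mentioned above, the following sketch round-trips a `float[][]` through a row-major native segment. The class `HostMatrixSketch` and its methods `toNative` and `toHeap` are hypothetical names chosen for this example; they are not the cuvs-java API, and the sketch assumes the finalized `java.lang.foreign` API (Java 22 or later).

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

/** Hypothetical helpers; names are illustrative and not part of cuvs-java. */
final class HostMatrixSketch {
    private HostMatrixSketch() {}

    /** Copies a non-empty, rectangular on-heap matrix into a row-major native segment. */
    static MemorySegment toNative(Arena arena, float[][] data) {
        if (data.length == 0 || data[0].length == 0) {
            throw new IllegalArgumentException("matrix must be non-empty");
        }
        int rows = data.length;
        int cols = data[0].length;
        long rowBytes = (long) cols * ValueLayout.JAVA_FLOAT.byteSize();
        MemorySegment segment =
            arena.allocate(rows * rowBytes, ValueLayout.JAVA_FLOAT.byteAlignment());
        for (int r = 0; r < rows; r++) {
            if (data[r].length != cols) {
                throw new IllegalArgumentException("row " + r + " has a different length");
            }
            // Heap array -> native memory, one row at a time.
            MemorySegment.copy(data[r], 0, segment, ValueLayout.JAVA_FLOAT, r * rowBytes, cols);
        }
        return segment;
    }

    /** Copies a row-major native segment back into a freshly allocated on-heap matrix. */
    static float[][] toHeap(MemorySegment segment, int rows, int cols) {
        long rowBytes = (long) cols * ValueLayout.JAVA_FLOAT.byteSize();
        float[][] out = new float[rows][cols];
        for (int r = 0; r < rows; r++) {
            // Native memory -> heap array, one row at a time.
            MemorySegment.copy(segment, ValueLayout.JAVA_FLOAT, r * rowBytes, out[r], 0, cols);
        }
        return out;
    }

    public static void main(String[] args) {
        float[][] input = {{1f, 2f}, {3f, 4f}, {5f, 6f}};
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment nativeCopy = toNative(arena, input);
            float[][] back = toHeap(nativeCopy, input.length, input[0].length);
            System.out.println(java.util.Arrays.deepToString(back));
        }
    }
}
```

A device-memory variant, as anticipated in the description, would need a different copy path and is out of scope for this sketch.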

Authors:
  - Lorenzo Dematté (https://github.com/ldematte)
  - MithunR (https://github.com/mythrocks)

Approvers:
  - MithunR (https://github.com/mythrocks)

URL: #1111
lowener pushed a commit to lowener/cuvs that referenced this pull request Aug 11, 2025
URL: rapidsai#1111
enp1s0 pushed a commit to enp1s0/cuvs that referenced this pull request Aug 22, 2025
URL: rapidsai#1111
Labels

improvement, Java, non-breaking

7 participants