Refactor JIT LTO kernel generation #1812

Merged
rapids-bot[bot] merged 22 commits into rapidsai:main from KyleFromNVIDIA:refactor-jit-lto
Feb 23, 2026

@KyleFromNVIDIA

Add an algorithm that computes a matrix product, and add a generic
CMake function that uses this algorithm to generate a matrix of kernels
with desired parameters.
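
As a rough illustration of what a "matrix product" over kernel parameters could look like, here is a minimal Python sketch. It assumes the matrix is a flat mapping from parameter name to a list of allowed values and expands it with `itertools.product`. The function name `expand_matrix` and the sample keys are hypothetical; they are not taken from this PR's script or from interleaved_scan_matrix.json, and the actual algorithm may handle more structure than this.

```python
import itertools


def expand_matrix(matrix):
    """Yield one dict per combination of the values listed in `matrix`.

    `matrix` maps a parameter name to the list of values it may take.
    """
    keys = list(matrix)
    for combo in itertools.product(*(matrix[k] for k in keys)):
        yield dict(zip(keys, combo))


# Hypothetical parameter axes, for illustration only.
example = {
    "data_type": ["float", "half"],
    "veclen": [1, 4],
}

# One entry per kernel variant that would be generated.
combinations = list(expand_matrix(example))
```

Each resulting dict would correspond to one kernel instantiation, so a build step could loop over `combinations` and emit one source file per entry.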

@KyleFromNVIDIA KyleFromNVIDIA requested review from a team as code owners February 17, 2026 15:40
@KyleFromNVIDIA KyleFromNVIDIA added the improvement (Improves an existing functionality) and non-breaking (Introduces a non-breaking change) labels Feb 17, 2026

@dantegd dantegd left a comment

Minor comments, otherwise this looks great!

@divyegala divyegala left a comment

I think we can normalize how keys are named in the JSON file. If the key ends with:

  1. _name: Actual name of the template
  2. _type: Always used to instantiate the actual kernel function template
  3. _type_tag: Template parameters for registerAlgorithm. The JSON should have the whole tag as the value, e.g. tag_type_ui.
  4. _type_val: Value for the types used in constructor of registerAlgorithm as _name + _type_val
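
To make the proposed convention concrete, here is a small Python sketch of how a generator script might classify JSON keys by suffix. It is only an assumption about how such a check could be written: `classify_key`, `SUFFIX_ROLES`, and the sample key names are hypothetical and do not come from this PR.

```python
# Suffixes from the proposal above, paired with a short description of each role.
# More specific suffixes are listed first so they are matched before shorter ones.
SUFFIX_ROLES = (
    ("_type_tag", "template parameter tag for registerAlgorithm"),
    ("_type_val", "value used in the registerAlgorithm constructor"),
    ("_name", "name of the template"),
    ("_type", "type used to instantiate the kernel function template"),
)


def classify_key(key):
    """Return (suffix, role) for a JSON key, or (None, 'unrecognized')."""
    for suffix, role in SUFFIX_ROLES:
        if key.endswith(suffix):
            return suffix, role
    return None, "unrecognized"


# Hypothetical keys, for illustration only.
for key in ("metric_name", "data_type", "data_type_tag", "veclen"):
    suffix, role = classify_key(key)
    print(f"{key}: {role}")
```

A check like this could run before kernel generation to flag keys that do not follow the convention.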

@KyleFromNVIDIA

I think we can normalize how keys are named in the JSON file.

It's not entirely clear to me what this would look like, or if it would even be possible with the current algorithm. Please post a suggestion with your specific renaming recommendations for interleaved_scan_matrix.json.

@divyegala divyegala left a comment

I think the dev team can normalize a naming convention after this PR is merged!

@divyegala divyegala mentioned this pull request Feb 19, 2026
8 tasks
@jameslamb jameslamb requested review from jameslamb and removed request for AyodeAwe February 20, 2026 19:08

@jameslamb jameslamb left a comment

I didn't look closely at the source files in src/neighbors/ivf_flat, trusting other reviewers to be much more knowledgeable about those than me.

Left a few comments about the general setup though. Most are minor but one is big... are we SURE it's okay to require a Python interpreter to build libcuvs?

Leaving a non-blocking "Comment" review, I'll come back and approve once those questions are answered.

@jameslamb jameslamb left a comment

Approving, but please do consider the other comments I left about the strictness and placement of this Python code.

@KyleFromNVIDIA

are we SURE it's okay to require a Python interpreter to build libcuvs?

I already discussed this with @divyegala back in December. One of cuvs's dependencies already requires Python to build, so this isn't introducing a new dependency.

Other projects that want this matrix product algorithm may not have the same luxury. Once CMake 4.3 with its string(JSON) improvements comes out, and once we require CMake 4.3, it will be possible to port the algorithm to CMake script and ditch the Python script.

@KyleFromNVIDIA

/merge

@rapids-bot rapids-bot bot merged commit d017502 into rapidsai:main Feb 23, 2026
147 of 151 checks passed
bdice added a commit to bdice/cugraph that referenced this pull request Feb 23, 2026
bdice added a commit to rlratzel/cugraph that referenced this pull request Feb 23, 2026
KyleFromNVIDIA added a commit to KyleFromNVIDIA/cuvs that referenced this pull request Feb 23, 2026
rapids-bot bot pushed a commit that referenced this pull request Feb 23, 2026
tfeher pushed a commit to Stardust-SJF/cuvs_rabitq that referenced this pull request Mar 3, 2026
Add an algorithm that computes a matrix product, and add a generic
CMake function that uses this algorithm to generate a matrix of kernels
with desired parameters.

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)
  - Divye Gala (https://github.com/divyegala)
  - James Lamb (https://github.com/jameslamb)
  - Bradley Dice (https://github.com/bdice)

URL: rapidsai#1812
tfeher pushed a commit to Stardust-SJF/cuvs_rabitq that referenced this pull request Mar 3, 2026
Labels

improvement (Improves an existing functionality), non-breaking (Introduces a non-breaking change)

6 participants