Skip to content

Fix potential OOB access in CAGRA search when graph size < dataset size#1780

Open
irina-resh-nvda wants to merge 16 commits intorapidsai:mainfrom
irina-resh-nvda:add-max-node-id-parameter
Open

Fix potential OOB access in CAGRA search when graph size < dataset size#1780
irina-resh-nvda wants to merge 16 commits intorapidsai:mainfrom
irina-resh-nvda:add-max-node-id-parameter

Conversation

@irina-resh-nvda
Copy link
Contributor

Random seed node selection incorrectly used dataset_desc.size to generate indices. When graph.extent(0) < dataset.size, this caused OOB access by attempting to index graph rows that don't exist.

This PR fixes an out-of-bounds memory access bug in CAGRA search that occurs when the graph has fewer nodes than the dataset (e.g., during iterative CAGRA-Q builds with compression).

Solution

Thread graph.extent(0) through the search kernel call chain as a graph_size parameter. Random seeds are now correctly constrained to [0, graph.extent(0)) instead of [0, dataset.size).

Purely internal fix with no API changes. Existing behavior unchanged for normal CAGRA usage where graph and dataset sizes match.

Testing

Added test that builds an index on 5,000 points, expands the dataset to 10,000 points via update_dataset(), then runs search. This reproduces the exact scenario where the bug would occur. Test verifies both SINGLE_CTA and MULTI_CTA algorithms complete without OOB access.

…rs to constrain random seed node selection to a subset of the dataset. This is useful when the graph is smaller than the dataset, such as during iterative build with compression.
…oved max_node_id from the search parameters structure
@irina-resh-nvda irina-resh-nvda requested review from a team as code owners February 6, 2026 17:00
@irina-resh-nvda irina-resh-nvda self-assigned this Feb 6, 2026
@irina-resh-nvda irina-resh-nvda added bug Something isn't working non-breaking Introduces a non-breaking change labels Feb 6, 2026
Copy link
Contributor

@mfoerste4 mfoerste4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, only minor suggestions.

uint32_t* const num_executed_iterations, /* stats */
SAMPLE_FILTER_T sample_filter)
SAMPLE_FILTER_T sample_filter,
const typename DATASET_DESCRIPTOR_T::INDEX_T graph_size = 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the default value used anywhere besides tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the default is used everywhere but the tests and thefuture iterative cagra q implementation

@divyegala divyegala mentioned this pull request Feb 16, 2026
8 tasks
Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Irina, the PR looks good to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working non-breaking Introduces a non-breaking change

Projects

Development

Successfully merging this pull request may close these issues.

3 participants