-
Notifications
You must be signed in to change notification settings - Fork 160
Automatic adjustment of itopk size according to filtering rate #509
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
6223fd2
8ff6991
37e26c1
3665d45
018e792
ab1130b
bedd224
ea8c273
5025481
b61126a
588bd0c
228a1ae
776f2f5
9d262f7
192c0a9
d19a6c4
b5c31b3
81e4b39
cdc4bc4
dd371dc
e769ca7
ce93427
c133c8b
baa3c0c
06d404c
23852af
7e7ff1e
b05f174
33eda3a
6566a08
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -336,11 +336,13 @@ void search(raft::resources const& res, | |
| const cuvs::neighbors::filtering::base_filter& sample_filter_ref) | ||
| { | ||
| try { | ||
| using none_filter_type = cuvs::neighbors::filtering::none_sample_filter; | ||
| auto& sample_filter = dynamic_cast<const none_filter_type&>(sample_filter_ref); | ||
| using none_filter_type = cuvs::neighbors::filtering::none_sample_filter; | ||
| auto& sample_filter = dynamic_cast<const none_filter_type&>(sample_filter_ref); | ||
| search_params params_copy = params; | ||
| if (params.filtering_rate < 0.0) { params_copy.filtering_rate = 0.0; } | ||
| auto sample_filter_copy = sample_filter; | ||
| return search_with_filtering<T, IdxT, none_filter_type>( | ||
| res, params, idx, queries, neighbors, distances, sample_filter_copy); | ||
| res, params_copy, idx, queries, neighbors, distances, sample_filter_copy); | ||
| return; | ||
| } catch (const std::bad_cast&) { | ||
| } | ||
|
|
@@ -349,9 +351,18 @@ void search(raft::resources const& res, | |
| auto& sample_filter = | ||
| dynamic_cast<const cuvs::neighbors::filtering::bitset_filter<uint32_t, int64_t>&>( | ||
| sample_filter_ref); | ||
| search_params params_copy = params; | ||
| if (params.filtering_rate < 0.0) { | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh I see- the filtering rate set to "auto" will automatically calculate the sparsity of the bitset while a user can skip having to compute this if they set the filtering rate directly. I like this.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That is correct. By default, it calculates the filtering rate, but if the user specifies the filtering rate, that calculation is skipped. |
||
| const auto num_set_bits = sample_filter.bitset_view_.count(res); | ||
| auto filtering_rate = (float)(idx.data().n_rows() - num_set_bits) / idx.data().n_rows(); | ||
| const float min_filtering_rate = 0.0; | ||
| const float max_filtering_rate = 0.999; | ||
| params_copy.filtering_rate = | ||
| std::min(std::max(filtering_rate, min_filtering_rate), max_filtering_rate); | ||
| } | ||
| auto sample_filter_copy = sample_filter; | ||
| return search_with_filtering<T, IdxT, decltype(sample_filter_copy)>( | ||
| res, params, idx, queries, neighbors, distances, sample_filter_copy); | ||
| res, params_copy, idx, queries, neighbors, distances, sample_filter_copy); | ||
| } catch (const std::bad_cast&) { | ||
| RAFT_FAIL("Unsupported sample filter type"); | ||
| } | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is interesting. So is specifying this value an alternative to explicitly computing the sparsity of the pre-filtering bitset (which would introduce additional compute into the search)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be clearer to write "automatically computed" rather than "automatically set" here. I will fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that. "automatically compted" is perfect.