feat: support batch vector queries #6828

Open
LeoReeYang wants to merge 15 commits into lance-format:main from LeoReeYang:leo/batch-vector-query-6821

Contributor

@LeoReeYang LeoReeYang commented May 18, 2026

Summary

  • Extends the existing Scanner::nearest API to accept batched query vectors for fixed-size vector columns (no separate nearest_batch API).
  • Implements shared flat batch KNN in KNNVectorDistanceExec: each data batch is loaded once, all query vectors are evaluated against it, and results are returned in one stream with up to m * k rows.
  • Adds query_index to batch results so callers can split top-k rows per input query (LanceDB-compatible name, not _query_index).
  • When use_index=true and a vector index is available, batch queries run through the indexed path (per-query ANN search, union, and query_index tagging) instead of forcing flat search.
  • Honors batch query parameters: distance_range is applied before per-query top-k selection on the flat path; indexed batch respects the same bounds.
  • fast_search with batch queries and no vector index returns an empty result whose schema still includes query_index.
  • Batch vs multivector is determined by the vector column type (FixedSizeList → batch of single-vector queries; List multivector column → one multivector query).
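
The indexed path described above (per-query ANN search, union, query_index tagging) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `search_one` is a hypothetical stand-in for a single-query index search, and `toy_search` is an exact brute-force function that exists only so the sketch runs.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Hit = Tuple[int, float]        # (row_id, distance) from a single-query search
Row = Tuple[int, int, float]   # (query_index, row_id, distance)


def indexed_batch_search(
    queries: Sequence[Vector],
    k: int,
    search_one: Callable[[Vector, int], List[Hit]],
) -> List[Row]:
    """Run one search per query vector and tag each hit with its query_index."""
    rows: List[Row] = []
    for query_index, query in enumerate(queries):
        # Hypothetical single-query search; keep at most k hits per query.
        for row_id, distance in search_one(query, k)[:k]:
            rows.append((query_index, row_id, distance))
    return rows


# Toy stand-in for an index: exact squared-L2 search over three 1-D vectors.
DATA = {0: [0.0], 1: [1.0], 2: [10.0]}


def toy_search(query: Vector, k: int) -> List[Hit]:
    hits = [
        (row_id, sum((a - b) ** 2 for a, b in zip(query, vec)))
        for row_id, vec in DATA.items()
    ]
    return sorted(hits, key=lambda hit: hit[1])[:k]


rows = indexed_batch_search([[0.0], [9.0]], k=2, search_one=toy_search)
```

Each query contributes at most k rows, and the first element of every row says which input query it belongs to.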

Closes #6821.

API contract

  • Input: pass multiple query vectors via a list-like / 2-D query for a FixedSizeList embedding column.
  • Output: up to k rows per query vector, plus query_index (0-based index into the input query batch).
  • Flat path: shared scan/decode across queries; requires _rowid in the scan plan.
  • Indexed path (default when index exists): runs one indexed search per query vector and merges results.
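
To make the flat-path contract concrete, here is a pure-Python sketch of shared flat batch KNN. It is an illustration under assumptions, not the Rust implementation: `batch_flat_knn` and its row layout are hypothetical, L2 is used as the metric, and `distance_range` is assumed to be a half-open `[lower, upper)` interval. Each data batch is visited once, every query is scored against it, and a bounded heap per query keeps the top k.

```python
import heapq
import math
from typing import Iterable, List, Optional, Sequence, Tuple

Vector = Sequence[float]
Row = Tuple[int, int, float]  # (query_index, row_id, distance)
Range = Tuple[Optional[float], Optional[float]]


def l2(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_flat_knn(
    data_batches: Iterable[List[Tuple[int, Vector]]],
    queries: Sequence[Vector],
    k: int,
    distance_range: Optional[Range] = None,
) -> List[Row]:
    lower, upper = distance_range if distance_range is not None else (None, None)
    # One bounded heap per query. Entries are (-distance, -row_id), so the
    # worst kept candidate (largest distance) is always at heap[0].
    heaps: List[list] = [[] for _ in queries]
    for batch in data_batches:  # each data batch is loaded once
        for row_id, vec in batch:
            for query_index, query in enumerate(queries):
                dist = l2(query, vec)
                # Apply the range filter before top-k selection.
                if lower is not None and dist < lower:
                    continue
                if upper is not None and dist >= upper:
                    continue
                item = (-dist, -row_id)
                heap = heaps[query_index]
                if len(heap) < k:
                    heapq.heappush(heap, item)
                elif item > heap[0]:
                    heapq.heapreplace(heap, item)
    rows: List[Row] = []
    for query_index, heap in enumerate(heaps):
        # Largest (-distance) first means nearest first.
        for neg_dist, neg_row_id in sorted(heap, reverse=True):
            rows.append((query_index, -neg_row_id, -neg_dist))
    return rows


batches = [
    [(0, [0.0, 0.0]), (1, [1.0, 0.0])],
    [(2, [5.0, 5.0]), (3, [6.0, 5.0])],
]
result = batch_flat_knn(batches, queries=[[0.0, 0.0], [5.0, 5.0]], k=2)
```

With m=2 queries and k=2 the result has m * k = 4 rows, grouped by query_index.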

Benchmark

Python benchmark command:

uv run --extra benchmarks pytest python/benchmarks/test_search.py::test_batch_flat_knn

Dataset size, dimensionality, query count, batch size, and rounds are declared in the benchmark's @pytest.mark.parametrize values. Adjust those parameters in python/benchmarks/test_search.py to reproduce the scaling rows below.

Dataset: random float32 vectors written to a real local .lance dataset. No memory:// dataset and no throttled/simulated object store latency. OS page cache effects are accepted.

Query Count Scaling

Fixed dataset: 1,000,000 rows, dim=512, k=10. This is about 1.9 GiB of raw vector values.

| rows | dim | query count (m) | separate queries mean | batch query mean | time saved | speedup |
|---|---|---|---|---|---|---|
| 1,000,000 | 512 | 2 | 224.19 ms | 180.68 ms | 43.51 ms | 1.24x |
| 1,000,000 | 512 | 5 | 573.84 ms | 310.23 ms | 263.61 ms | 1.85x |
| 1,000,000 | 512 | 10 | 1.1241 s | 524.05 ms | 600.05 ms | 2.15x |

Batching becomes more valuable as m increases because shared scan/decode work is amortized over more query vectors.

Dataset Size Scaling

Fixed query count: m=10, dim=512, k=10.

| rows | raw vector size | query count (m) | separate queries mean | batch query mean | time saved | speedup |
|---|---|---|---|---|---|---|
| 100,000 | ~0.19 GiB | 10 | 121.74 ms | 50.318 ms | 71.42 ms | 2.42x |
| 500,000 | ~0.95 GiB | 10 | 579.07 ms | 261.21 ms | 317.86 ms | 2.22x |
| 1,000,000 | ~1.91 GiB | 10 | 1.1241 s | 524.05 ms | 600.05 ms | 2.15x |

On local disk with OS page cache, relative speedup is not strictly monotonic with row count because both plans become increasingly dominated by cached vector decoding and distance-compute work. The robust trend here is absolute time saved, which grows from ~71 ms to ~600 ms as dataset size grows.
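
For readers who want to re-derive the table columns, the arithmetic is small enough to script. The helper names below are made up for this note; the inputs are the means reported above, and float32 (4 bytes per value) is assumed for the raw vector size.

```python
GIB = 1024 ** 3


def raw_vector_gib(rows: int, dim: int, bytes_per_value: int = 4) -> float:
    """Raw size of the vector column in GiB, assuming float32 values."""
    return rows * dim * bytes_per_value / GIB


def summarize(separate_ms: float, batch_ms: float) -> tuple:
    """Return (time saved in ms, speedup), both rounded to two decimals."""
    return round(separate_ms - batch_ms, 2), round(separate_ms / batch_ms, 2)


# (separate mean, batch mean) in milliseconds, keyed by query count m at 1M rows.
query_scaling = {2: (224.19, 180.68), 5: (573.84, 310.23), 10: (1124.1, 524.05)}
# Same measurements keyed by row count at m=10.
size_scaling = {
    100_000: (121.74, 50.318),
    500_000: (579.07, 261.21),
    1_000_000: (1124.1, 524.05),
}

for m, (separate, batch) in query_scaling.items():
    saved, speedup = summarize(separate, batch)
    print(f"m={m}: saved {saved} ms, speedup {speedup}x")

for rows, (separate, batch) in size_scaling.items():
    saved, speedup = summarize(separate, batch)
    size = round(raw_vector_gib(rows, 512), 2)
    print(f"rows={rows}: ~{size} GiB, saved {saved} ms, speedup {speedup}x")
```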

Test plan

  • cargo test -p lance test_batch_knn
  • cargo test -p lance fast_search_without
  • uv run pytest python/tests/test_vector_index.py -k batch
  • uv run --extra benchmarks pytest --collect-only python/benchmarks/test_search.py::test_batch_flat_knn
  • cargo clippy -p lance --tests -- -D warnings
  • uv run ruff format --check python/lance/dataset.py python/tests/test_vector_index.py python/benchmarks/test_search.py

@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@github-actions
Contributor

ACTION NEEDED
Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

Comment thread rust/lance/src/dataset/scanner.rs Outdated
Comment thread rust/lance/src/dataset/scanner.rs Outdated
Comment thread rust/lance/src/dataset/scanner.rs Outdated
Comment thread rust/lance/src/io/exec/knn.rs Outdated

codecov Bot commented May 18, 2026

Codecov Report

❌ Patch coverage is 87.85358% with 73 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| rust/lance/src/io/exec/knn.rs | 80.88% | 40 Missing and 12 partials ⚠️ |
| rust/lance/src/dataset/scanner.rs | 93.61% | 10 Missing and 11 partials ⚠️ |

@LeoReeYang LeoReeYang changed the title from "Support batch flat vector queries" to "feat: support batch flat vector queries" May 18, 2026
@LeoReeYang
Contributor Author

Updated based on review feedback:

  • Removed the separate nearest_batch API; batched fixed-size vector queries now go through Scanner::nearest.
  • Removed the separate KNNBatchVectorDistanceExec type; batch flat KNN is handled inside KNNVectorDistanceExec.
  • Restored simple fast_search / use_index setters and route batch nearest queries to the flat path internally.
  • Updated the benchmark to create a real local disk .lance dataset instead of using memory:// or throttled simulated latency.

Local disk benchmark result for 8 queries, 50k rows, dim=4: separate mean 3.8834 ms vs batch mean 3.3045 ms, about 1.17x speedup. The gain is modest on local disk because the repeated reads are served from OS page cache.

@github-actions github-actions Bot added the enhancement New feature or request label May 18, 2026
@LeoReeYang
Contributor Author

Updated the benchmark scale per feedback:

  • Dataset is now 1,000,000 rows x 512 dimensions, with 10 batched query vectors and k=10.
  • The benchmark writes a real local .lance dataset under the system temp directory, about 1.9 GiB of raw vector values.
  • Data generation now streams batches into Dataset::write instead of materializing the full benchmark dataset in memory first.

Local result with OS cache accepted:

  • separate_queries/10: mean 1.1166 s
  • batch_query/10: mean 508.05 ms
  • Speedup: about 2.20x

This is meaningfully higher than the previous small-data local run (~1.17x), which matches the expectation that larger scan workloads show more benefit from sharing read/decode work across queries.

@LeoReeYang
Contributor Author

Added a controlled benchmark matrix to make the trend clearer.

Query-count scaling at 1M rows x 512d:

| m | separate mean | batch mean | saved | speedup |
|---|---|---|---|---|
| 2 | 224.19 ms | 180.68 ms | 43.51 ms | 1.24x |
| 5 | 573.84 ms | 310.23 ms | 263.61 ms | 1.85x |
| 10 | 1.1241 s | 524.05 ms | 600.05 ms | 2.15x |

Dataset-size scaling at m=10, 512d:

| rows | raw vector size | separate mean | batch mean | saved | speedup |
|---|---|---|---|---|---|
| 100k | ~0.19 GiB | 121.74 ms | 50.318 ms | 71.42 ms | 2.42x |
| 500k | ~0.95 GiB | 579.07 ms | 261.21 ms | 317.86 ms | 2.22x |
| 1M | ~1.91 GiB | 1.1241 s | 524.05 ms | 600.05 ms | 2.15x |

So the relative speedup clearly increases with m. For dataset size, the absolute time saved grows from ~71 ms to ~600 ms while relative speedup stays above 2x on local disk with OS cache effects accepted. The benchmark is now parameterized with env vars so these rows can be reproduced without editing source.

Comment thread rust/lance/src/io/exec/knn.rs Outdated
Comment thread rust/lance/src/io/exec/knn.rs Outdated
Comment thread rust/lance/src/io/exec/knn.rs Outdated
Comment thread rust/lance/benches/vector_index.rs Outdated
);
}

fn bench_batch_flat_knn(c: &mut Criterion) {
Contributor

can we port this to be in Python?

Contributor Author

it has been ported to: python/python/benchmarks/test_search.py:227

DataType::List(_) | DataType::FixedSizeList(_, _) => {
if !matches!(vector_type, DataType::List(_)) {
return Err(Error::invalid_input(format!(
"Query is multivector but column {}({})is not multivector",
Contributor

@BubbleCal BubbleCal May 18, 2026

Can you explain more how this distinguishes between multivector query and query batch?

Contributor Author

Batch-vs-multivector is distinguished by the vector column type: list-like q + List column means one multivector query; list-like q + FixedSizeList column means a batch of single-vector queries.
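
A rough sketch of that rule, for illustration only: `ColumnKind`, `QueryKind`, and `classify_query` are hypothetical names, not identifiers from this PR, and the handling of a flat (non-list-like) query is a simplification.

```python
from enum import Enum
from typing import Sequence


class ColumnKind(Enum):
    FIXED_SIZE_LIST = "FixedSizeList"  # one vector per row
    LIST = "List"                      # multivector column: several vectors per row


class QueryKind(Enum):
    SINGLE_VECTOR = "single vector"
    BATCH = "batch of single-vector queries"
    MULTIVECTOR = "one multivector query"


def classify_query(query: Sequence, column_kind: ColumnKind) -> QueryKind:
    """Decide how a query is interpreted from its shape and the column type."""
    if len(query) == 0:
        raise ValueError("query must not be empty")
    # "List-like q" here means a sequence whose items are themselves vectors.
    list_like = all(isinstance(item, (list, tuple)) for item in query)
    if not list_like:
        return QueryKind.SINGLE_VECTOR
    if column_kind is ColumnKind.LIST:
        return QueryKind.MULTIVECTOR
    return QueryKind.BATCH


two_vectors = [[1.0, 2.0], [3.0, 4.0]]
print(classify_query(two_vectors, ColumnKind.FIXED_SIZE_LIST).value)
print(classify_query(two_vectors, ColumnKind.LIST).value)
```

The same list-like input lands in a different mode depending only on the column type, which is the point of the rule.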

Contributor Author

batch-vs-multivector is decided by vector column type with comments added in Scanner::nearest lines 1467-1475

Comment thread python/python/benchmarks/test_search.py Outdated
@LeoReeYang LeoReeYang requested a review from BubbleCal May 18, 2026 14:09
Contributor

@BubbleCal BubbleCal left a comment

  1. The distance_range param is lost if it's a batch query.
  2. This forces the query to be executed by flat KNN even when there's an index. We still need to use the index if there is one (just query the index for each query vector).

Please add tests to verify they are really fixed.

@BubbleCal
Contributor

If the query has:

  • fast_search=True
  • batch query
  • no index

it's expected to return an empty result, but the schema should still contain the query_index column. Right now it doesn't.
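
A tiny sketch of the expected shape, with a table modeled as a dict of column name to values. `empty_fast_search_result` is a hypothetical helper, and `_distance` is assumed to be the name of the distance column.

```python
from typing import Dict, List


def empty_fast_search_result(
    projected_columns: List[str], is_batch: bool
) -> Dict[str, list]:
    """Build an empty result that still carries the full output schema."""
    columns = list(projected_columns) + ["_distance"]
    if is_batch:
        # Batch queries keep query_index even when there are no rows.
        columns.append("query_index")
    return {name: [] for name in columns}


result = empty_fast_search_result(["id", "vector"], is_batch=True)
print(list(result))
```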

Comment thread python/python/lance/dataset.py Outdated
In that case Lance runs a flat batch KNN query, returns up to ``k`` rows
for each query vector, and adds ``query_index`` to identify the source
query for each result row. Indexed/ANN batch search is not used in this
first implementation.
Contributor

This comment doesn't look correct.

Contributor Author

  • These comments were outdated.
  • Batch results will include a query_index column.

Comment thread python/python/lance/dataset.py Outdated
q: QueryVectorLike
The query vector.
The query vector. For fixed-size vector columns, this may be a 2-D
array-like batch of query vectors. Batch queries run flat KNN, apply
Contributor

and this

Contributor Author

These outdated comments were committed with the old batch implementation and were not updated by later commits. By the way, the indexed batch and distance_range descriptions have been made consistent, as with the comments above.

@LeoReeYang LeoReeYang requested a review from BubbleCal May 21, 2026 00:10
@LeoReeYang LeoReeYang changed the title from "feat: support batch flat vector queries" to "feat: support batch vector queries" May 21, 2026
.into_iter()
.flat_map(BinaryHeap::into_vec)
.collect::<Vec<_>>();
results.sort_by(|left, right| {
Contributor

I think we need to rethink how to implement this. Right now the results will be truncated because SortExec has limit=k.

Say for a batch query with 2 vectors and k=10, this would return 20 rows, but SortExec will keep only 10 of them and we will lose the rest.
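
The truncation concern can be reproduced in a few lines of plain Python. This is only a model of the behavior being discussed, not the plan code: `global_top_k` mimics a single sort with limit=k over all rows, `per_query_top_k` keeps k rows for each query_index, and both names are made up here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[int, int, float]  # (query_index, row_id, distance)


def global_top_k(rows: List[Row], k: int) -> List[Row]:
    """One sort with limit=k across all queries: at most k rows survive."""
    return sorted(rows, key=lambda row: row[2])[:k]


def per_query_top_k(rows: List[Row], k: int) -> List[Row]:
    """Keep the k nearest rows separately for each query_index."""
    by_query: Dict[int, List[Row]] = defaultdict(list)
    for row in rows:
        by_query[row[0]].append(row)
    kept: List[Row] = []
    for query_index in sorted(by_query):
        kept.extend(sorted(by_query[query_index], key=lambda row: row[2])[:k])
    return kept


# Two queries, 12 candidate rows each, with made-up distances.
candidates = [
    (query_index, row_id, row_id + 0.5 * query_index)
    for query_index in range(2)
    for row_id in range(12)
]

truncated = global_top_k(candidates, k=10)
kept = per_query_top_k(candidates, k=10)
print(len(truncated), len(kept))
```

With 2 queries and k=10, the global limit keeps 10 rows in total while the per-query selection keeps 20.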

Contributor Author

Introduced Scanner::is_batch_nearest and skipped SortExec on the batch flat path. Per-query top-k is now handled by KNNVectorDistanceExec::execute_batch.

Contributor

I don't think this is fixed. Please add a test to verify it.

Comment thread rust/lance/src/io/exec/knn.rs Outdated
match t {
DisplayFormatType::Default | DisplayFormatType::Verbose => {
write!(f, "KNNVectorDistance: metric={}", self.distance_type,)
if self.query_count > 1 {
Contributor

I think all self.query_count > 1 checks need to be replaced by self.is_batch_query(), which should check the query shape, not the query count.

Say the query is a list of vectors but with only 1 vector: it's still a batch query. Otherwise the behavior will be hard to predict.

Contributor Author

  1. Changed how batch queries are detected.
  2. Scanner::is_batch_nearest + KNNVectorDistanceExec::is_batch replace self.query_count > 1.
  3. Indexed batch still leverages the existing single-query ANN path with is_batch_nearest=false.
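
The difference between the two checks shows up on a batch that holds a single vector. A small sketch, with hypothetical helper names and a fixed-size vector column assumed:

```python
from typing import List, Sequence, Tuple


def as_query_vectors(query: Sequence) -> Tuple[List[list], bool]:
    """Return (vectors, list_like) for a flat vector or a list of vectors."""
    list_like = len(query) > 0 and all(
        isinstance(item, (list, tuple)) for item in query
    )
    vectors = [list(v) for v in query] if list_like else [list(query)]
    return vectors, list_like


def is_batch_by_count(query: Sequence) -> bool:
    """Old-style check: batch only when more than one vector is present."""
    vectors, _ = as_query_vectors(query)
    return len(vectors) > 1


def is_batch_by_shape(query: Sequence) -> bool:
    """Shape-based check: any list of vectors is a batch, even of length 1."""
    _, list_like = as_query_vectors(query)
    return list_like


single_vector_batch = [[0.1, 0.2]]  # shape (1, dim)
print(is_batch_by_count(single_vector_batch), is_batch_by_shape(single_vector_batch))
```

Only the shape-based check treats the (1, dim) input as a batch, so only that check would give it a query_index column.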

@LeoReeYang LeoReeYang requested a review from BubbleCal May 21, 2026 10:37
@LeoReeYang LeoReeYang force-pushed the leo/batch-vector-query-6821 branch from 99de360 to bbe534b Compare May 21, 2026 23:37
@LeoReeYang
Contributor Author

Test dedup only — no production or indexed-batch behavior changes.

  • Rust: 7 batch KNN tests → 4 via batch_knn_two_queries() and assert_batch_matches_single_queries().
    • test_batch_knn_flat: flat plan, m×k rows, query_index, parity vs single queries, single-vector (1, dim) batch shape, no global SortExec TopK(k).
    • test_batch_knn_flat_respects_distance_range: shared helper + distance_range.
    • test_batch_knn_indexed: indexed plan + distance_range parity (merged former split tests).
  • Python:
    • @pytest.mark.parametrize on flat batch test (2-query / 1-query).
    • removed redundant single-vector test.
    • _assert_batch_matches_single_queries checks distance_range only when provided.

Net: fewer overlapping assertions, same coverage.

LeoReeYang and others added 7 commits May 22, 2026 15:28
Add a flat KNN batch query path so callers can submit multiple query vectors and share scan work while preserving per-query top-k results.

Co-authored-by: Cursor <cursoragent@cursor.com>
Fold batch flat KNN into the existing nearest and KNN execution paths so the public API and plan nodes stay consistent with reviewer feedback.

Co-authored-by: Cursor <cursoragent@cursor.com>
Use a larger local-disk dataset and stream benchmark data generation so batch query gains are measured under a more realistic scan workload.

Co-authored-by: Cursor <cursoragent@cursor.com>
Allow the local-disk batch KNN benchmark to vary row count, dimensionality, and query count so PR results can show scaling trends.

Co-authored-by: Cursor <cursoragent@cursor.com>
Use the LanceDB-compatible query_index result column and move the batch flat KNN benchmark to Python so benchmark scaling can be reproduced from the binding API.

Co-authored-by: Cursor <cursoragent@cursor.com>
Apply rustfmt output expected by CI for the batch query binding change.

Co-authored-by: Cursor <cursoragent@cursor.com>
Move batch flat KNN benchmark configuration into pytest parameters so review and reproduction do not rely on environment variables.

Co-authored-by: Cursor <cursoragent@cursor.com>
LeoReeYang and others added 7 commits May 22, 2026 15:30
Route batched queries through vector indices when available and apply distance range bounds before per-query top-k selection on the flat path.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Update nearest/search docstrings to describe indexed batch queries and
add Python tests that batch distance_range matches per-query searches.

Co-authored-by: Cursor <cursoragent@cursor.com>
When fast_search is used with a batch nearest query and no vector index,
return an empty result whose schema still contains query_index.

Co-authored-by: Cursor <cursoragent@cursor.com>
Use is_batch_nearest based on list-like queries on fixed-size vector columns
instead of query_count > 1, so single-vector batch queries still get query_index
and avoid SortExec TopK(fetch=k) truncating m*k results to k rows.

Co-authored-by: Cursor <cursoragent@cursor.com>
Consolidate overlapping Rust/Python batch nearest tests via shared helpers.
No production changes; merge with main deferred.

Co-authored-by: Cursor <cursoragent@cursor.com>
Keep the main-based batch vector query branch compiling cleanly after conflict resolution.

Co-authored-by: Cursor <cursoragent@cursor.com>
@LeoReeYang LeoReeYang force-pushed the leo/batch-vector-query-6821 branch from bbe534b to fc0e7f0 Compare May 22, 2026 07:48
Comment thread rust/lance/src/dataset/scanner.rs Outdated
let vector_expr = expressions::col(DIST_COL, current_schema)?;
output_expr.push((vector_expr, DIST_COL.to_string()));
}
if self.is_batch_nearest && output_expr.iter().all(|(_, name)| name != QUERY_INDEX_COL)
Contributor

I think the query_index column shouldn't be added in autoproject_scoring_columns. It's a little bit confusing.

Add a regression test that batch flat KNN returns k rows per query instead of
being truncated by SortExec, and keep query_index autoprojection separate from
scoring-column autoprojection.

Co-authored-by: Cursor <cursoragent@cursor.com>
@LeoReeYang LeoReeYang requested a review from BubbleCal May 22, 2026 14:24
Labels

enhancement New feature or request python

Development

Successfully merging this pull request may close these issues.

Support batch vector query API and shared flat KNN scan

2 participants