feat: support batch vector queries by LeoReeYang · Pull Request #6828 · lance-format/lance

LeoReeYang · 2026-05-18T08:38:32Z

Summary

Extends the existing Scanner::nearest API to accept batched query vectors for fixed-size vector columns (no separate nearest_batch API).
Implements shared flat batch KNN in KNNVectorDistanceExec: each data batch is loaded once, all query vectors are evaluated against it, and results are returned in one stream with up to m * k rows.
Adds query_index to batch results so callers can split top-k rows per input query (LanceDB-compatible name, not _query_index).
When use_index=true and a vector index is available, batch queries run through the indexed path (per-query ANN search, union, and query_index tagging) instead of forcing flat search.
Honors batch query parameters: distance_range is applied before per-query top-k selection on the flat path; indexed batch respects the same bounds.
fast_search with batch queries and no vector index returns an empty result whose schema still includes query_index.
Batch vs multivector is determined by the vector column type (FixedSizeList → batch of single-vector queries; List multivector column → one multivector query).
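
As a rough illustration of how a caller might use query_index to split a batch result back into per-query groups, here is a minimal pure-Python sketch. The helper name split_by_query_index, the dict-shaped rows, and the id / _distance fields are hypothetical stand-ins for a materialized result; only the query_index column name comes from this PR.

```python
from collections import defaultdict


def split_by_query_index(rows, num_queries):
    """Group result rows by their 0-based query_index.

    `rows` is any iterable of dicts with a "query_index" key; it stands in
    for a materialized batch result (an assumption for this sketch).
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row["query_index"]].append(row)
    # Queries without any hit still get an (empty) entry.
    return [groups.get(i, []) for i in range(num_queries)]


rows = [
    {"id": 7, "_distance": 0.1, "query_index": 0},
    {"id": 3, "_distance": 0.4, "query_index": 1},
    {"id": 9, "_distance": 0.2, "query_index": 0},
]
per_query = split_by_query_index(rows, num_queries=3)
```

Here per_query[0] holds the two rows for the first query vector, and per_query[2] is empty because the third query produced no rows.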

Closes #6821.

API contract

Input: pass multiple query vectors via a list-like / 2-D query for a FixedSizeList embedding column.
Output: up to k rows per query vector, plus query_index (0-based index into the input query batch).
Flat path: shared scan/decode across queries; requires _rowid in the scan plan.
Indexed path (default when index exists): runs one indexed search per query vector and merges results.
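
The flat-path contract above can be sketched in plain Python. This is not the KNNVectorDistanceExec implementation, only a toy model of the idea: each data batch is visited once, every query is scored against it, and one bounded heap per query keeps that query's top-k. The function names, the squared-L2 metric, the _distance field, and the lower-inclusive / upper-exclusive reading of distance_range are assumptions made for the example.

```python
import heapq


def l2_squared(a, b):
    # Squared Euclidean distance; the metric choice is an assumption.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def batch_flat_knn(data_batches, queries, k, distance_range=None):
    """Shared-scan flat KNN: each data batch is visited once for all queries.

    data_batches: iterable of lists of (row_id, vector) pairs.
    distance_range: optional (lower, upper); treated here as
    lower-inclusive / upper-exclusive (an assumption).
    """
    # One bounded heap per query, storing (-distance, row_id) so the
    # current worst candidate sits at heap[0].
    heaps = [[] for _ in queries]
    for batch in data_batches:  # the batch is "loaded" once
        for row_id, vector in batch:
            for qi, q in enumerate(queries):
                d = l2_squared(q, vector)
                if distance_range is not None:
                    lo, hi = distance_range
                    if (lo is not None and d < lo) or (hi is not None and d >= hi):
                        continue  # range filter runs before top-k selection
                item = (-d, row_id)
                if len(heaps[qi]) < k:
                    heapq.heappush(heaps[qi], item)
                elif item > heaps[qi][0]:
                    heapq.heapreplace(heaps[qi], item)
    rows = []
    for qi, heap in enumerate(heaps):
        # Closest rows first within each query.
        for neg_d, row_id in sorted(heap, reverse=True):
            rows.append({"_rowid": row_id, "_distance": -neg_d, "query_index": qi})
    return rows


data_batches = [
    [(0, [0.0, 0.0]), (1, [1.0, 0.0])],
    [(2, [0.0, 2.0]), (3, [5.0, 5.0])],
]
queries = [[0.0, 0.0], [5.0, 4.0]]
result = batch_flat_knn(data_batches, queries, k=2)
ranged = batch_flat_knn(data_batches, queries, k=2, distance_range=(0.5, 30.0))
```

The output has at most len(queries) * k rows, and applying the range filter before the heap means a query can still fill its k slots with in-range rows.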

Benchmark

Python benchmark command:

uv run --extra benchmarks pytest python/benchmarks/test_search.py::test_batch_flat_knn

Dataset size, dimensionality, query count, batch size, and rounds are declared in the benchmark's @pytest.mark.parametrize values. Adjust those parameters in python/benchmarks/test_search.py to reproduce the scaling rows below.

Dataset: random float32 vectors written to a real local .lance dataset. No memory:// dataset and no throttled/simulated object store latency. OS page cache effects are accepted.

Query Count Scaling

Fixed dataset: 1,000,000 rows, dim=512, k=10. This is about 1.9 GiB of raw vector values.

rows	dim	query count (`m`)	separate queries mean	batch query mean	time saved	speedup
1,000,000	512	2	224.19 ms	180.68 ms	43.51 ms	1.24x
1,000,000	512	5	573.84 ms	310.23 ms	263.61 ms	1.85x
1,000,000	512	10	1.1241 s	524.05 ms	600.05 ms	2.15x

Batching becomes more valuable as m increases because shared scan/decode work is amortized over more query vectors.
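
For readers checking the table, the "time saved" and "speedup" columns follow directly from the two means. A small sketch of that arithmetic, with the timings copied from the rows above (the helper name summarize is made up for the example):

```python
# (m, separate_ms, batch_ms) copied from the query-count table above.
measurements = [
    (2, 224.19, 180.68),
    (5, 573.84, 310.23),
    (10, 1124.1, 524.05),
]


def summarize(separate_ms, batch_ms):
    """Return (time saved in ms, speedup factor), both rounded to 2 places."""
    saved = round(separate_ms - batch_ms, 2)
    speedup = round(separate_ms / batch_ms, 2)
    return saved, speedup


summary = {m: summarize(sep, batch) for m, sep, batch in measurements}
```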

Dataset Size Scaling

Fixed query count: m=10, dim=512, k=10.

rows	raw vector size	query count (`m`)	separate queries mean	batch query mean	time saved	speedup
100,000	~0.19 GiB	10	121.74 ms	50.318 ms	71.42 ms	2.42x
500,000	~0.95 GiB	10	579.07 ms	261.21 ms	317.86 ms	2.22x
1,000,000	~1.91 GiB	10	1.1241 s	524.05 ms	600.05 ms	2.15x

On local disk with OS page cache, relative speedup is not strictly monotonic with row count because both plans become increasingly dominated by cached vector decoding and distance-compute work. The robust trend here is absolute time saved, which grows from ~71 ms to ~600 ms as dataset size grows.
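
The "raw vector size" column is just rows x dim x 4 bytes for float32, expressed in GiB. A short sketch of that calculation (the helper name is illustrative):

```python
BYTES_PER_FLOAT32 = 4
GIB = 2 ** 30


def raw_vector_gib(num_rows, dim):
    """Size of the raw float32 vector values, ignoring any file overhead."""
    return num_rows * dim * BYTES_PER_FLOAT32 / GIB


sizes = {n: round(raw_vector_gib(n, 512), 2) for n in (100_000, 500_000, 1_000_000)}
```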

Test plan

cargo test -p lance test_batch_knn
cargo test -p lance fast_search_without
uv run pytest python/tests/test_vector_index.py -k batch
uv run --extra benchmarks pytest --collect-only python/benchmarks/test_search.py::test_batch_flat_knn
cargo clippy -p lance --tests -- -D warnings
uv run ruff format --check python/lance/dataset.py python/tests/test_vector_index.py python/benchmarks/test_search.py

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

github-actions · 2026-05-18T08:38:50Z

ACTION NEEDED
Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

codecov · 2026-05-18T09:13:06Z

Codecov Report

❌ Patch coverage is 87.85358% with 73 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
rust/lance/src/io/exec/knn.rs	80.88%	40 Missing and 12 partials ⚠️
rust/lance/src/dataset/scanner.rs	93.61%	10 Missing and 11 partials ⚠️

📢 Thoughts on this report? Let us know!

LeoReeYang · 2026-05-18T10:01:32Z

Updated based on review feedback:

Removed the separate nearest_batch API; batched fixed-size vector queries now go through Scanner::nearest.
Removed the separate KNNBatchVectorDistanceExec type; batch flat KNN is handled inside KNNVectorDistanceExec.
Restored simple fast_search / use_index setters and route batch nearest queries to the flat path internally.
Updated the benchmark to create a real local disk .lance dataset instead of using memory:// or throttled simulated latency.

Local disk benchmark result for 8 queries, 50k rows, dim=4: separate mean 3.8834 ms vs batch mean 3.3045 ms, about 1.17x speedup. The gain is modest on local disk because the repeated reads are served from OS page cache.

LeoReeYang · 2026-05-18T10:16:49Z

Updated the benchmark scale per feedback:

Dataset is now 1,000,000 rows x 512 dimensions, with 10 batched query vectors and k=10.
The benchmark writes a real local .lance dataset under the system temp directory, about 1.9 GiB of raw vector values.
Data generation now streams batches into Dataset::write instead of materializing the full benchmark dataset in memory first.

Local result with OS cache accepted:

separate_queries/10: mean 1.1166 s
batch_query/10: mean 508.05 ms
Speedup: about 2.20x

This is meaningfully higher than the previous small-data local run (~1.17x), which matches the expectation that larger scan workloads show more benefit from sharing read/decode work across queries.

LeoReeYang · 2026-05-18T10:32:00Z

Added a controlled benchmark matrix to make the trend clearer.

Query-count scaling at 1M rows x 512d:

m	separate mean	batch mean	saved	speedup
2	224.19 ms	180.68 ms	43.51 ms	1.24x
5	573.84 ms	310.23 ms	263.61 ms	1.85x
10	1.1241 s	524.05 ms	600.05 ms	2.15x

Dataset-size scaling at m=10, 512d:

rows	raw vector size	separate mean	batch mean	saved	speedup
100k	~0.19 GiB	121.74 ms	50.318 ms	71.42 ms	2.42x
500k	~0.95 GiB	579.07 ms	261.21 ms	317.86 ms	2.22x
1M	~1.91 GiB	1.1241 s	524.05 ms	600.05 ms	2.15x

So the relative speedup clearly increases with m. For dataset size, the absolute time saved grows from ~71 ms to ~600 ms while relative speedup stays above 2x on local disk with OS cache effects accepted. The benchmark is now parameterized with env vars so these rows can be reproduced without editing source.

BubbleCal · 2026-05-18T10:39:58Z

    );
 }

+fn bench_batch_flat_knn(c: &mut Criterion) {


Can we port this benchmark to Python?

It has been ported to python/python/benchmarks/test_search.py:227.

BubbleCal · 2026-05-18T10:40:30Z

            DataType::List(_) | DataType::FixedSizeList(_, _) => {
-                if !matches!(vector_type, DataType::List(_)) {
-                    return Err(Error::invalid_input(format!(
-                        "Query is multivector but column {}({})is not multivector",


Can you explain more how this distinguishes between multivector query and query batch?

Batch-vs-multivector is distinguished by the vector column type: list-like q + List column means one multivector query; list-like q + FixedSizeList column means a batch of single-vector queries.

Batch vs. multivector is decided by the vector column type; explanatory comments were added in Scanner::nearest (lines 1467-1475).
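
To make the rule in this thread concrete, here is a small decision-table sketch. It is not the Scanner::nearest code: the enum, the function name, and the use of plain strings for column types are all hypothetical, and single-vector query handling is left outside the sketch.

```python
from enum import Enum


class QueryKind(Enum):
    SINGLE = "single-vector query"
    BATCH = "batch of single-vector queries"
    MULTIVECTOR = "one multivector query"


def classify_query(column_type, query_is_list_of_vectors):
    """Decide how a query is interpreted from the vector column type.

    column_type: "FixedSizeList" for a plain embedding column, or
    "List" for a multivector column (strings are a simplification).
    """
    if not query_is_list_of_vectors:
        return QueryKind.SINGLE
    if column_type == "FixedSizeList":
        return QueryKind.BATCH
    if column_type == "List":
        return QueryKind.MULTIVECTOR
    raise ValueError(f"unsupported vector column type: {column_type}")


kinds = (
    classify_query("FixedSizeList", True),
    classify_query("List", True),
)
```

The same list-like input therefore means different things depending only on the column it is searched against.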

BubbleCal

The distance_range parameter is lost when the query is a batch query.
This also forces the query to run through flat KNN even when there is an index. We still need to use the index if there is one (just query the index once per query vector).

Please add tests verifying that both issues are really fixed.
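
The "query the index for each query vector" suggestion amounts to a per-query search followed by a tagged union. A minimal sketch, assuming a hypothetical search_one(query, k) callable that stands in for the existing single-query ANN path (here replaced by a brute-force toy over 1-D values):

```python
def indexed_batch_search(search_one, queries, k):
    """Run one search per query and tag each hit with its query_index.

    search_one(query, k) -> list of row dicts; it is a placeholder for
    the single-query indexed search, not a real Lance API.
    """
    rows = []
    for qi, q in enumerate(queries):
        for row in search_one(q, k):
            rows.append({**row, "query_index": qi})
    return rows


# Toy "index": brute force over four 1-D values, for illustration only.
values = [0.0, 1.0, 2.0, 10.0]


def toy_search(query, k):
    ranked = sorted(range(len(values)), key=lambda i: abs(values[i] - query))
    return [{"id": i, "_distance": abs(values[i] - query)} for i in ranked[:k]]


hits = indexed_batch_search(toy_search, [0.2, 9.0], k=2)
```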

BubbleCal · 2026-05-19T07:04:45Z

If the query combines:

fast_search=True
batch query
no index

it is expected to return an empty result whose schema still contains the query_index column, but currently it does not.

BubbleCal · 2026-05-19T07:05:32Z

+            In that case Lance runs a flat batch KNN query, returns up to ``k`` rows
+            for each query vector, and adds ``query_index`` to identify the source
+            query for each result row. Indexed/ANN batch search is not used in this
+            first implementation.


This comment does not look correct.

Outdated comments.

There will be a query_index column for batch queries.

BubbleCal · 2026-05-19T07:05:51Z

    q: QueryVectorLike
-        The query vector.
+        The query vector. For fixed-size vector columns, this may be a 2-D
+        array-like batch of query vectors. Batch queries run flat KNN, apply


These outdated comments were committed with the old batch implementation and were not updated by later commits. As in the thread above, the comments have now been made consistent with the indexed batch path and distance_range handling.

BubbleCal · 2026-05-21T05:30:04Z

+            .into_iter()
+            .flat_map(BinaryHeap::into_vec)
+            .collect::<Vec<_>>();
+        results.sort_by(|left, right| {


I think we need to rethink how to implement this: the results will now be truncated because SortExec has limit=k.

For example, a batch query with 2 vectors and k=10 should return 20 rows, but SortExec keeps only 10 of them, so the remaining results are lost.

Introduced Scanner::is_batch_nearest and skipped SortExec on the batch flat path; per-query top-k is now handled by KNNVectorDistanceExec::execute_batch.

I don't think this is fixed. Please add a test to verify it.
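
The truncation concern can be reproduced in a few lines. The sketch below is a toy model, not the DataFusion plan: global_top_k mimics a plan-level sort with fetch=k over all rows, while per_query_top_k selects k rows per query_index. The function names and row layout are made up for the example.

```python
def global_top_k(rows, k):
    # Mimics a single plan-level sort with fetch=k applied to every row.
    return sorted(rows, key=lambda r: r["_distance"])[:k]


def per_query_top_k(rows, k):
    # Selects the k closest rows separately for each query_index.
    out = []
    for qi in sorted({r["query_index"] for r in rows}):
        hits = [r for r in rows if r["query_index"] == qi]
        out.extend(sorted(hits, key=lambda r: r["_distance"])[:k])
    return out


m, k = 2, 10
# 15 candidate rows per query, so each query has more than k candidates.
candidates = [
    {"query_index": qi, "_distance": float(i)}
    for qi in range(m)
    for i in range(15)
]
truncated = global_top_k(candidates, k)
complete = per_query_top_k(candidates, k)
```

With m=2 and k=10, the global variant keeps 10 rows in total while the per-query variant keeps 20, which is the property a regression test would need to assert.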

BubbleCal · 2026-05-21T05:32:00Z

        match t {
            DisplayFormatType::Default | DisplayFormatType::Verbose => {
-                write!(f, "KNNVectorDistance: metric={}", self.distance_type,)
+                if self.query_count > 1 {


I think every self.query_count > 1 check needs to be replaced by self.is_batch_query(), which should check the query shape, not the query count.

For example, a query that is a list of vectors containing only 1 vector is still a batch query; otherwise the behavior will be hard to predict.

Changed how batch queries are detected.

Scanner::is_batch_nearest and KNNVectorDistanceExec::is_batch replace self.query_count > 1.

Indexed batch queries still use the existing single-query ANN path with is_batch_nearest=false.
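
The difference between a count-based and a shape-based check shows up for a (1, dim) query. A minimal sketch using nested Python lists as a stand-in for the query input; both helper names are hypothetical and are not the Rust methods mentioned above:

```python
def is_batch_by_shape(q):
    """True when q is a list of vectors, even if it holds a single vector."""
    return len(q) > 0 and all(isinstance(v, (list, tuple)) for v in q)


def is_batch_by_count(q):
    """The count-based check under review: misses a one-vector batch."""
    vectors = q if is_batch_by_shape(q) else [q]
    return len(vectors) > 1


single = [0.1, 0.2, 0.3]       # shape (dim,)
one_row_batch = [[0.1, 0.2, 0.3]]  # shape (1, dim)

by_shape = (is_batch_by_shape(single), is_batch_by_shape(one_row_batch))
by_count = (is_batch_by_count(single), is_batch_by_count(one_row_batch))
```

Only the shape-based check treats the one-row batch as a batch, so only it would consistently attach query_index for that input.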

LeoReeYang · 2026-05-22T00:08:56Z

Test dedup only — no production or indexed-batch behavior changes.

Rust: 7 batch KNN tests → 4 via batch_knn_two_queries() and assert_batch_matches_single_queries().
- test_batch_knn_flat: flat plan, m×k rows, query_index, parity vs single queries, single-vector (1, dim) batch shape, no global SortExec TopK(k).
- test_batch_knn_flat_respects_distance_range: shared helper + distance_range.
- test_batch_knn_indexed: indexed plan + distance_range parity (merged former split tests).
Python:
- @pytest.mark.parametrize on flat batch test (2-query / 1-query).
- removed redundant single-vector test.
- _assert_batch_matches_single_queries checks distance_range only when provided.

Net: fewer overlapping assertions, same coverage.

Use is_batch_nearest based on list-like queries on fixed-size vector columns instead of query_count > 1, so single-vector batch queries still get query_index and avoid SortExec TopK(fetch=k) truncating m*k results to k rows. Co-authored-by: Cursor <cursoragent@cursor.com>

BubbleCal · 2026-05-22T08:27:58Z

                let vector_expr = expressions::col(DIST_COL, current_schema)?;
                output_expr.push((vector_expr, DIST_COL.to_string()));
            }
+            if self.is_batch_nearest && output_expr.iter().all(|(_, name)| name != QUERY_INDEX_COL)


I think the query_index column shouldn't be added in autoproject_scoring_columns; it's a little confusing.

claude Bot reviewed May 18, 2026

github-actions Bot added the python label May 18, 2026

BubbleCal requested changes May 18, 2026

LeoReeYang changed the title ~~Support batch flat vector queries~~ feat: support batch flat vector queries May 18, 2026

github-actions Bot added the enhancement New feature or request label May 18, 2026

BubbleCal reviewed May 18, 2026

BubbleCal reviewed May 18, 2026

LeoReeYang requested a review from BubbleCal May 18, 2026 14:09

BubbleCal requested changes May 18, 2026

BubbleCal requested changes May 19, 2026

LeoReeYang requested a review from BubbleCal May 21, 2026 00:10

LeoReeYang changed the title ~~feat: support batch flat vector queries~~ feat: support batch vector queries May 21, 2026

BubbleCal requested changes May 21, 2026

LeoReeYang requested a review from BubbleCal May 21, 2026 10:37

LeoReeYang force-pushed the leo/batch-vector-query-6821 branch from 99de360 to bbe534b Compare May 21, 2026 23:37

LeoReeYang and others added 7 commits May 22, 2026 15:28

feat: support batch flat vector queries

9b87776

Add a flat KNN batch query path so callers can submit multiple query vectors and share scan work while preserving per-query top-k results. Co-authored-by: Cursor <cursoragent@cursor.com>

refactor: align batch vector query with nearest API

264e0b2

Fold batch flat KNN into the existing nearest and KNN execution paths so the public API and plan nodes stay consistent with reviewer feedback. Co-authored-by: Cursor <cursoragent@cursor.com>

bench: scale batch flat vector query benchmark

10a5687

Use a larger local-disk dataset and stream benchmark data generation so batch query gains are measured under a more realistic scan workload. Co-authored-by: Cursor <cursoragent@cursor.com>

bench: parameterize batch vector query benchmark

beddd3d

Allow the local-disk batch KNN benchmark to vary row count, dimensionality, and query count so PR results can show scaling trends. Co-authored-by: Cursor <cursoragent@cursor.com>

fix: align batch query review feedback

e1a66c3

Use the LanceDB-compatible query_index result column and move the batch flat KNN benchmark to Python so benchmark scaling can be reproduced from the binding API. Co-authored-by: Cursor <cursoragent@cursor.com>

fix: format python dataset binding

42fe7b9

Apply rustfmt output expected by CI for the batch query binding change. Co-authored-by: Cursor <cursoragent@cursor.com>

bench: use pytest params for batch knn benchmark

4217c77

Move batch flat KNN benchmark configuration into pytest parameters so review and reproduction do not rely on environment variables. Co-authored-by: Cursor <cursoragent@cursor.com>

LeoReeYang and others added 7 commits May 22, 2026 15:30

fix: respect batch vector query parameters

f4b3f25

Route batched queries through vector indices when available and apply distance range bounds before per-query top-k selection on the flat path. Co-authored-by: Cursor <cursoragent@cursor.com>

test: assert indexed batch KNN matches single-query distance_range

82ab937

Co-authored-by: Cursor <cursoragent@cursor.com>

docs: align batch vector query Python docs with implementation

dfd4532

Update nearest/search docstrings to describe indexed batch queries and add Python tests that batch distance_range matches per-query searches. Co-authored-by: Cursor <cursoragent@cursor.com>

fix: include query_index in empty batch fast_search results

69013a0

When fast_search is used with a batch nearest query and no vector index, return an empty result whose schema still contains query_index. Co-authored-by: Cursor <cursoragent@cursor.com>

test: deduplicate batch KNN tests

584c07b

Consolidate overlapping Rust/Python batch nearest tests via shared helpers. No production changes; merge with main deferred. Co-authored-by: Cursor <cursoragent@cursor.com>

fix: align batch query branch with main

fc0e7f0

Keep the main-based batch vector query branch compiling cleanly after conflict resolution. Co-authored-by: Cursor <cursoragent@cursor.com>

LeoReeYang force-pushed the leo/batch-vector-query-6821 branch from bbe534b to fc0e7f0 Compare May 22, 2026 07:48

BubbleCal requested changes May 22, 2026

fix: address latest batch KNN review feedback

e14e04a

Add a regression test that batch flat KNN returns k rows per query instead of being truncated by SortExec, and keep query_index autoprojection separate from scoring-column autoprojection. Co-authored-by: Cursor <cursoragent@cursor.com>

LeoReeYang requested a review from BubbleCal May 22, 2026 14:24
