perf: use roaring's range iter to speedup mask_to_offset_ranges by westonpace · Pull Request #6871 · lance-format/lance

westonpace · 2026-05-20T12:45:08Z

The function mask_to_offset_ranges is used at scan planning time to determine which rows to read from the file. This was a bottleneck when the mask was the result of a zonal index search because the old implementation materialized all of the offsets only to convert them back into ranges.

Luckily, roaring recently implemented a range-based iterator. Using this we can skip the materialization step. On my zonemap benchmark this doubles the speed of the search and, perhaps more importantly, removes a penalty I observed when the index is used even on queries that are not highly selective.
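To make the cost of the old path concrete, here is a minimal sketch of "materialize every offset, then convert back into ranges". It uses only the standard library; `coalesce_offsets` is a hypothetical helper written for illustration and is not the code in Lance or roaring. The point is that the loop body runs once per selected row, even when the result is a single range.

```rust
use std::ops::Range;

/// Coalesce a sorted, de-duplicated stream of offsets into half-open ranges.
/// Hypothetical helper for illustration only (not Lance's implementation).
fn coalesce_offsets(offsets: impl IntoIterator<Item = u64>) -> Vec<Range<u64>> {
    let mut ranges: Vec<Range<u64>> = Vec::new();
    for offset in offsets {
        // One iteration per selected row: this is the per-row cost.
        if let Some(last) = ranges.last_mut() {
            if last.end == offset {
                // Adjacent to the current range, so extend it by one.
                last.end = offset + 1;
                continue;
            }
        }
        // Not adjacent (or first offset): start a new single-element range.
        ranges.push(offset..offset + 1);
    }
    ranges
}
```

A range-based iterator avoids this loop entirely when the underlying data is already stored as runs, because each run can be emitted as one range without visiting its members.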

Generated with the assistance of Claude Code.

Add a criterion benchmark suite targeting RowAddrMask / RowAddrTreeMap that quantifies the cost of operations whose work is fundamentally range-shaped but currently goes through per-row Partial(RoaringBitmap) representation.

Six groups:

- `insert_range_single_run` - producer cost: insert one range
- `into_addr_iter_single_run` - consumer cost: walk every row addr
- `next_range_iter_single_run` - achievable cost via Iter::next_range
- `intersect_two_runs` - set op on two range-shaped masks
- `mask_to_offset_ranges_inner_loop` - end-to-end slow path observed in IS NULL trace (495 ms / 889 ms)
- `insert_runs_constant_cardinality` - many small runs vs one big run

Each varies dataset size while holding number-of-ranges fixed at 1, so linear scaling in N reveals where row count dominates the cost.

Headline finding (10M-row inputs):

- into_addr_iter: 19.4 ms per-bit walk
- next_range iter: 1.72 us per-run walk (~11000x faster)

The next_range/iter delta represents the speedup an alternate range-aware iterator could surface to callers. The roaring crate already represents the data as run-encoded containers; the RowAddrMask public API does not expose them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
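The scaling argument above (one run, growing N) can be illustrated without any timing by counting iterator steps. The sketch below uses a toy `RunSet` type as a hypothetical stand-in for a run-encoded container; it is not roaring's or Lance's data structure, and it counts steps rather than measuring wall-clock time as the criterion suite does.

```rust
use std::ops::RangeInclusive;

/// Toy run-encoded set: sorted, non-overlapping inclusive runs.
/// Hypothetical stand-in for a run-encoded bitmap container.
struct RunSet {
    runs: Vec<RangeInclusive<u32>>,
}

impl RunSet {
    /// Per-bit walk: yields every member of every run.
    fn iter_bits(&self) -> impl Iterator<Item = u32> + '_ {
        self.runs.iter().flat_map(|r| r.clone())
    }

    /// Per-run walk: yields each run as a single item.
    fn iter_runs(&self) -> impl Iterator<Item = RangeInclusive<u32>> + '_ {
        self.runs.iter().cloned()
    }
}

/// Returns (per_bit_steps, per_run_steps) for one run covering 0..n.
fn walk_costs(n: u32) -> (usize, usize) {
    assert!(n > 0, "n must be at least 1");
    let set = RunSet { runs: vec![0..=n - 1] };

    let mut bit_steps = 0usize;
    for _ in set.iter_bits() {
        bit_steps += 1; // grows linearly with n
    }

    let mut run_steps = 0usize;
    for _ in set.iter_runs() {
        run_steps += 1; // stays at 1 while the number of runs is fixed
    }

    (bit_steps, run_steps)
}
```

With the number of runs held at 1, the per-bit count equals N while the per-run count stays constant, which is the shape the benchmark groups are designed to expose.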

Adds `RowAddrTreeMap::iter_runs()`, a range-shaped consumer that walks roaring's run-encoded containers via `Iter::next_range` instead of yielding individual bits. Rewrites the `U64Segment::Range` arm of `mask_to_offset_ranges` to use it, eliminating the per-bit walk that dominated the IS NULL hot path documented in 1b9d7c0.

Benchmark deltas at 10M rows (single contiguous run, vs the bench commit's `into_addr_iter` baseline):

Consumer iteration:

| N | into_addr_iter | iter_runs | speedup |
|---|---|---|---|
| 10K | 19.4 µs | 17.6 ns | 1,100x |
| 100K | 191 µs | 28.4 ns | 6,800x |
| 1M | 1.92 ms | 181 ns | 10,400x |
| 10M | 19.5 ms | 1.68 µs | 11,600x |

`mask_to_offset_ranges_inner_loop` (end-to-end hot path):

| N | into_addr_iter | iter_runs | speedup |
|---|---|---|---|
| 10K | 19.7 µs | 132 ns | 150x |
| 100K | 194 µs | 262 ns | 775x |
| 1M | 1.93 ms | 1.92 µs | 1,000x |
| 10M | 19.3 ms | 20.1 µs | 960x |

Within ~3x of a dedicated Vec<RangeInclusive>-backed representation at 10M rows, but both are in the microseconds while the original was in the milliseconds, so the gap is irrelevant in the context of a query that takes hundreds of ms.

The new method is ~70 lines (method + 2 tests + bench wiring) vs the ~700-line Runs-variant alternative, and adds no new public enum variant or representation switch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
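As a rough illustration of what a range-shaped consumer buys for a contiguous segment, the sketch below maps a stream of mask runs onto offsets within a segment of consecutive row ids. It is a standard-library-only sketch under stated assumptions: `runs_to_offset_ranges` and its signature are hypothetical, the runs are assumed to arrive sorted and non-overlapping, and the real `mask_to_offset_ranges` may handle more cases than this (other segment kinds, for example).

```rust
use std::ops::{Range, RangeInclusive};

/// Map inclusive row-id runs onto half-open offset ranges within a
/// contiguous row-id segment. Hypothetical helper, not Lance's API.
fn runs_to_offset_ranges(
    segment: Range<u64>,
    runs: impl IntoIterator<Item = RangeInclusive<u64>>,
) -> Vec<Range<u64>> {
    let mut out = Vec::new();
    for run in runs {
        // Convert the inclusive run to half-open and clip it to the segment.
        let start = (*run.start()).max(segment.start);
        let end = (*run.end()).saturating_add(1).min(segment.end);
        if start < end {
            // Offsets are positions relative to the start of the segment.
            out.push((start - segment.start)..(end - segment.start));
        }
    }
    out
}
```

The work here is proportional to the number of runs rather than the number of selected rows, which is consistent with the drop from milliseconds to microseconds reported above for a single contiguous run.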

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

codecov · 2026-05-20T13:20:17Z

Codecov Report

❌ Patch coverage is 95.91837% with 2 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| rust/lance-core/src/utils/mask.rs | 96.96% | 1 Missing ⚠️ |
| rust/lance-table/src/rowids.rs | 93.75% | 0 Missing and 1 partial ⚠️ |


wjones127

Nice optimization!

Xuanwo

Nice change!

westonpace and others added 2 commits May 16, 2026 18:51

claude Bot reviewed May 20, 2026


github-actions Bot added the performance label May 20, 2026

wjones127 approved these changes May 20, 2026


Xuanwo approved these changes May 21, 2026


westonpace merged commit 3d85fc7 into lance-format:main May 21, 2026
28 checks passed

westonpace merged 2 commits into lance-format:main from westonpace:perf-mask-next-range