feat: add early termination for compaction plan with max_compaction_bytes option by Jay-ju · Pull Request #6890 · lance-format/lance

Jay-ju · 2026-05-21T12:05:29Z

Summary

Add budget-based early termination to DefaultCompactionPlanner to prevent OOM when planning compaction on datasets with many fragments (e.g., hundreds of thousands).

Closes: #6039

Problem

When a dataset has hundreds of thousands of fragments, plan_compaction collects metrics for all fragments before producing the plan. This leads to:

- OOM risk: all fragment metadata and metrics are held in memory simultaneously
- Excessive I/O: each fragment requires a read of its deletion file
- Large serialized plans: 10K fragments → ~2.3MB JSON; 300K fragments → ~70MB JSON
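
The two plan sizes quoted above are consistent with each other if plan size grows roughly linearly with fragment count. The following is a small back-of-the-envelope check, not a measurement; the linear-growth assumption and the helper name `estimated_plan_mb` are illustrative only.

```python
# Figures quoted in the PR description.
plan_bytes_10k = 2.3e6   # ~2.3 MB of JSON for 10,000 fragments
fragments_10k = 10_000

# Assumption: serialized plan size scales linearly with fragment count.
bytes_per_fragment = plan_bytes_10k / fragments_10k


def estimated_plan_mb(num_fragments: int) -> float:
    """Extrapolate the serialized plan size in MB (10**6 bytes)."""
    return num_fragments * bytes_per_fragment / 1e6


print(round(bytes_per_fragment))          # roughly 230 bytes of JSON per fragment
print(round(estimated_plan_mb(300_000)))  # roughly 69 MB, in line with the ~70MB figure
```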

The existing max_source_fragments option was a post-hoc truncation — it collected all metrics first, then truncated the output. This did not reduce planning time or memory.

Benchmark data (10K fragments, no deletions):

| max_source_fragments | plan_time_ms | plan_json_size |
|---|---|---|
| None (unlimited) | 51 | 2.3MB |
| 100 | 56 | 408B |
| 500 | 56 | 408B |

Plan time barely changed because all metrics were still collected.
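
To make the difference concrete, here is a minimal sketch that counts how many metric reads each strategy performs. It is not the Lance planner: `plan_post_hoc`, `plan_early_stop`, and `CountingCollector` are hypothetical names, and the sketch assumes every fragment is a compaction candidate.

```python
class CountingCollector:
    """Stand-in for the per-fragment metrics read; counts how often it runs."""

    def __init__(self):
        self.calls = 0

    def __call__(self, fragment_id):
        self.calls += 1
        return {"id": fragment_id}


def plan_post_hoc(fragment_ids, collect_metrics, max_source_fragments=None):
    # Old behaviour as described above: collect everything, then truncate.
    metrics = [collect_metrics(f) for f in fragment_ids]
    if max_source_fragments is not None:
        metrics = metrics[:max_source_fragments]
    return metrics


def plan_early_stop(fragment_ids, collect_metrics, max_source_fragments=None):
    # Early termination: stop reading metrics once the limit is reached.
    metrics = []
    for f in fragment_ids:
        if max_source_fragments is not None and len(metrics) >= max_source_fragments:
            break
        metrics.append(collect_metrics(f))
    return metrics


post_hoc, early = CountingCollector(), CountingCollector()
plan_post_hoc(range(10_000), post_hoc, max_source_fragments=100)
plan_early_stop(range(10_000), early, max_source_fragments=100)
print(post_hoc.calls, early.calls)  # the post-hoc variant reads all 10,000 fragments
```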

Solution

Refactor max_source_fragments from post-hoc truncation to in-loop early termination, and add a new max_compaction_bytes option. The planner now tracks total_candidate_fragments and total_candidate_bytes during the metrics collection loop and breaks out as soon as either budget is exceeded.

Key changes:

- max_source_fragments: Now terminates metrics collection early (was post-hoc truncation)
- max_compaction_bytes: New option to limit by cumulative fragment byte size
- exceeds_budget(): Helper method checking both limits during the planning loop
- Preserves existing parallel I/O (.buffered(io_parallelism())), unlike PR #6095 (feat: support bounded compaction planner), which used serial I/O
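
The budget check described above can be sketched as follows. This is an illustration in Python rather than the Rust implementation: the `Budget` class and `select_candidates` function are hypothetical, and the choice to test the totals before admitting a fragment is an assumption about the intended semantics.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Budget:
    max_source_fragments: Optional[int] = None
    max_compaction_bytes: Optional[int] = None

    def exceeds_budget(self, total_fragments: int, total_bytes: int) -> bool:
        """True if either configured limit is exceeded by the given totals."""
        if self.max_source_fragments is not None and total_fragments > self.max_source_fragments:
            return True
        if self.max_compaction_bytes is not None and total_bytes > self.max_compaction_bytes:
            return True
        return False


def select_candidates(fragments: Iterable[Tuple[int, int]], budget: Budget) -> List[int]:
    """Walk (fragment_id, size_bytes) pairs, stopping once the budget would be exceeded."""
    selected: List[int] = []
    total_fragments = 0
    total_bytes = 0
    for fragment_id, size_bytes in fragments:
        # Assumption: check the totals that would result from adding this fragment.
        if budget.exceeds_budget(total_fragments + 1, total_bytes + size_bytes):
            break
        selected.append(fragment_id)
        total_fragments += 1
        total_bytes += size_bytes
    return selected


fragments = [(i, 100) for i in range(10)]  # ten fragments of 100 bytes each
print(select_candidates(fragments, Budget(max_source_fragments=3)))
print(select_candidates(fragments, Budget(max_compaction_bytes=250)))
```

With neither limit set, `exceeds_budget` always returns `False`, so the loop visits every fragment as before.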

Design Rationale

This approach follows hamersaw's review feedback on PR #6095: extending CompactionOptions rather than adding a new BoundedCompactionPlanner type. Users configure limits directly without needing to choose a planner implementation.

Changes

Rust

- CompactionOptions: Add max_compaction_bytes: Option<usize> field
- DefaultCompactionPlanner::plan(): Replace post-hoc truncation with in-loop early termination
- DefaultCompactionPlanner::exceeds_budget(): New helper method
- CompactionOptions::apply_dataset_config(): Support lance.compaction.max_compaction_bytes
- Tests: 3 functional tests + 3 benchmark tests
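
As a rough illustration of reading the limit from dataset configuration, the sketch below parses the `lance.compaction.max_compaction_bytes` key from a string-to-string config map. Only the key name comes from this PR. The function name, the rule that an explicitly passed option takes precedence, and the rejection of negative values are assumptions, not the behaviour of `apply_dataset_config()`.

```python
from typing import Dict, Optional

# Key name taken from the PR description.
CONFIG_KEY = "lance.compaction.max_compaction_bytes"


def max_compaction_bytes_from_config(
    config: Dict[str, str], explicit: Optional[int] = None
) -> Optional[int]:
    """Resolve the byte budget from an explicit option or the dataset config."""
    # Assumption: an explicitly passed option wins over the dataset config.
    if explicit is not None:
        return explicit
    raw = config.get(CONFIG_KEY)
    if raw is None:
        return None  # no limit configured
    value = int(raw)  # raises ValueError on non-numeric input
    if value < 0:
        raise ValueError(f"{CONFIG_KEY} must be non-negative, got {value}")
    return value


print(max_compaction_bytes_from_config({CONFIG_KEY: "1073741824"}))
```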

Python

- CompactionOptions TypedDict: Add max_compaction_bytes field with docs
- PyO3 binding: Handle max_compaction_bytes key
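
For readers unfamiliar with the TypedDict pattern, a minimal sketch of an options dictionary with the two budget fields might look like this. `CompactionOptionsSketch` and `check_budget_options` are hypothetical and show only the two fields discussed in this PR; the real `CompactionOptions` TypedDict has other fields that are omitted here.

```python
from typing import TypedDict


class CompactionOptionsSketch(TypedDict, total=False):
    """Illustrative subset of compaction options; every key is optional."""

    # Maximum number of source fragments to include in the plan.
    max_source_fragments: int
    # Maximum cumulative size, in bytes, of the fragments in the plan.
    max_compaction_bytes: int


def check_budget_options(options: CompactionOptionsSketch) -> None:
    """Hypothetical validation: both limits must be non-negative integers."""
    for key in ("max_source_fragments", "max_compaction_bytes"):
        value = options.get(key)
        if value is not None and (not isinstance(value, int) or value < 0):
            raise ValueError(f"{key} must be a non-negative integer")


options: CompactionOptionsSketch = {"max_compaction_bytes": 10 * 1024**3}
check_budget_options(options)
print(options["max_compaction_bytes"])  # 10 GiB expressed in bytes
```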

Usage

```python
# Limit by fragment count
dataset.optimize.compact_files(max_source_fragments=1000)

# Limit by total bytes
dataset.optimize.compact_files(max_compaction_bytes=10 * 1024**3)  # 10GB

# Both limits combined
dataset.optimize.compact_files(
    max_source_fragments=1000,
    max_compaction_bytes=10 * 1024**3,
)
```

```rust
let options = CompactionOptions {
    max_source_fragments: Some(1000),
    max_compaction_bytes: Some(10 * 1024 * 1024 * 1024),
    ..Default::default()
};
```

Comparison with PR #6095

| Dimension | PR #6095 | This PR |
|---|---|---|
| Architecture | New `BoundedCompactionPlanner` type | Extend existing `DefaultCompactionPlanner` |
| I/O pattern | Serial (one-at-a-time) | Parallel (preserved) |
| User API | `planner="bounded"` + limits | Direct `max_*` options |
| Maintainer feedback | Design not accepted | Follows maintainer preference |

…ytes option

Add budget-based early termination to DefaultCompactionPlanner to prevent OOM when planning compaction on datasets with many fragments (e.g., hundreds of thousands).

Changes:
- Add max_compaction_bytes option to CompactionOptions
- Refactor max_source_fragments from post-hoc truncation to in-loop early termination, stopping fragment metrics collection once budget is exceeded
- Add exceeds_budget() helper checking both fragment count and byte limits during the planning loop
- Update Python bindings and TypedDict docs
- Add functional tests for early termination behavior
- Add benchmark tests for plan performance at scale

Closes: lance-format#6039

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

…imits

- Add apply_budget_limits() for strict post-hoc truncation on task list
- Move early termination check before fragment is added to bin
- Guarantee at least 1 task is always included
- Fix test_max_source_fragments CI failure

- Fix Issue 1: Remove first-task exemption in apply_budget_limits, budget is now a strict hard limit (0 tasks if first task exceeds it)
- Fix Issue 2: Early termination now tracks effective (non-noop) candidate fragments only, preventing budget waste on bins that will be filtered by is_noop()
- Fix Issue 3: Mark benchmark tests as #[ignore] to reduce CI cost
- Update docs to clarify hard-limit semantics
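
The hard-limit semantics described in this commit can be sketched as below. This is a hypothetical Python model, not the Rust `apply_budget_limits`: the `Task` class and the rule of stopping at the first task that does not fit (rather than skipping it and trying later tasks) are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    fragment_sizes: List[int]  # size in bytes of each source fragment in the task


def apply_budget_limits(
    tasks: List[Task],
    max_source_fragments: Optional[int] = None,
    max_compaction_bytes: Optional[int] = None,
) -> List[Task]:
    """Keep the leading tasks that fit the budget; no first-task exemption."""
    kept: List[Task] = []
    total_fragments = 0
    total_bytes = 0
    for task in tasks:
        next_fragments = total_fragments + len(task.fragment_sizes)
        next_bytes = total_bytes + sum(task.fragment_sizes)
        if max_source_fragments is not None and next_fragments > max_source_fragments:
            break
        if max_compaction_bytes is not None and next_bytes > max_compaction_bytes:
            break
        kept.append(task)
        total_fragments, total_bytes = next_fragments, next_bytes
    return kept


tasks = [Task([100, 100]), Task([50]), Task([10, 10, 10])]
print(len(apply_budget_limits(tasks, max_source_fragments=3)))  # first two tasks fit
print(len(apply_budget_limits(tasks, max_source_fragments=1)))  # first task is too big: nothing kept
```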

Jay-ju · 2026-05-21T13:31:19Z

@claude review

codecov · 2026-05-21T14:01:00Z

Codecov Report

❌ Patch coverage is 44.48161% with 166 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
rust/lance/src/dataset/optimize.rs	44.48%	165 Missing and 1 partial ⚠️


The test uses IVF with 2 partitions but default nprobes=1, which only probes 1 partition per segment. With delta indices (2 segments), the search may miss the partition containing ID 0 in the first segment, causing the assertion to fail non-deterministically (e.g., returning [889, 1000] instead of [0, 1000]). Setting nprobes=2 ensures all partitions are probed, making the search exhaustive and the test deterministic.
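
The failure mode can be reproduced with a toy one-dimensional IVF model. This is a sketch for intuition only, not Lance's index: `ivf_search`, the centroid positions, and the stored values are made up, with row IDs 0 and 889 borrowed from the example above.

```python
def ivf_search(partitions, query, nprobes, k=1):
    """Probe the `nprobes` partitions whose centroids are closest to the query.

    `partitions` is a list of (centroid, [(row_id, value), ...]) pairs.
    """
    probed = sorted(partitions, key=lambda p: abs(p[0] - query))[:nprobes]
    candidates = [
        (abs(value - query), row_id)
        for _, rows in probed
        for row_id, value in rows
    ]
    return [row_id for _, row_id in sorted(candidates)[:k]]


# One index segment with two partitions. Row 0 (value 4.9) belongs to the
# partition centred at 0.0; row 889 (value 9.0) to the one centred at 10.0.
segment = [(0.0, [(0, 4.9)]), (10.0, [(889, 9.0)])]
query = 5.2  # slightly closer to centroid 10.0, yet row 0 is the true nearest row

print(ivf_search(segment, query, nprobes=1))  # only the 10.0 partition is probed
print(ivf_search(segment, query, nprobes=2))  # both partitions probed: exhaustive
```

With `nprobes=1` the search returns row 889 and misses row 0; with `nprobes=2` every partition is scanned and row 0 is found.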

Jay-ju · 2026-05-22T02:21:00Z

Hi @hamersaw. Fragment planning takes a long time on large datasets. I discussed this with @zhangyue19921010. Based on the discussion in #6039, we changed the original logic from full planning followed by trimming to on-demand planning: planning now stops once the threshold is reached.

Could you take a look when you have time?

yanghua · 2026-05-25T09:05:00Z

@claude review

claude

⚠️ Code review skipped — your organization has reached its monthly code review spending cap.

An organization admin can view or raise the cap at claude.ai/admin-settings/claude-code. The cap resets at the start of the next billing period.

Once the cap resets or is raised, comment @claude review on this pull request to trigger a review.

yanghua

Left two comments.

yanghua · 2026-05-25T14:00:30Z

                (Some(candidacy), Some(bin)) => {
-                    // We cannot mix "indexed" and "non-indexed" fragments and so we only consider
-                    // the existing bin if it contains the same indices
                    if bin.indices == indices {


If we match this branch, early termination never happens.

yanghua · 2026-05-25T14:07:07Z

+    }
+
+    #[tokio::test]
+    #[ignore]


There are many ignored test cases. Are they used for benchmarking?

…e benchmark tests

The early termination check was missing in the branch where a fragment is appended to an existing bin with matching indices. In the common case of an unindexed dataset, all fragments share the same (empty) indices list and always take this path, so the budget was never enforced during the loop -- only post-hoc via apply_budget_limits, defeating the PR's goal of reducing I/O and memory.

Now, after appending a fragment to an existing bin, we compute the running fragment/byte counts and break immediately if the budget is exceeded.

Also removes three #[ignore] benchmark tests that are not suitable for CI per project guidelines.
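
The fix described in this commit can be illustrated with a simplified binning loop that runs the budget check on both paths: when a new bin is opened and when a fragment joins an existing bin. The `Bin` class and `bin_fragments` function are hypothetical, and the sketch checks after appending, so it may admit one fragment beyond the limit; in the PR a separate strict pass (apply_budget_limits) trims the final task list.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Tuple


@dataclass
class Bin:
    indices: Tuple[str, ...]
    fragment_ids: List[int] = field(default_factory=list)


def bin_fragments(
    fragments: Iterable[Tuple[int, Tuple[str, ...], int]],
    max_source_fragments: Optional[int] = None,
    max_compaction_bytes: Optional[int] = None,
):
    """Group (fragment_id, indices, size_bytes) triples into bins.

    Returns the bins and the number of fragments visited before stopping.
    """
    bins: List[Bin] = []
    visited = running_fragments = running_bytes = 0
    for fragment_id, indices, size_bytes in fragments:
        visited += 1
        if bins and bins[-1].indices == indices:
            current = bins[-1]      # same index set: extend the open bin
        else:
            current = Bin(indices)  # different index set: start a new bin
            bins.append(current)
        current.fragment_ids.append(fragment_id)
        running_fragments += 1
        running_bytes += size_bytes
        # The check runs after either branch, so unindexed datasets
        # (which always take the existing-bin path) also stop early.
        over_count = max_source_fragments is not None and running_fragments > max_source_fragments
        over_bytes = max_compaction_bytes is not None and running_bytes > max_compaction_bytes
        if over_count or over_bytes:
            break
    return bins, visited


unindexed = [(i, (), 100) for i in range(1000)]  # every fragment has empty indices
bins, visited = bin_fragments(unindexed, max_source_fragments=3)
print(len(bins), visited)  # one bin; the loop stops long before fragment 1000
```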

claude Bot reviewed May 21, 2026


github-actions Bot added the enhancement and python labels May 21, 2026

Jay-ju added 2 commits May 21, 2026 21:02

claude Bot reviewed May 25, 2026


yanghua reviewed May 25, 2026


Jay-ju wants to merge 5 commits into lance-format:main from Jay-ju:feat/compaction-plan-early-termination
