Rework merge task for both compactor and indexer merge flows #6464

Merged
nadav-govari merged 6 commits into nadav/merge-feature-split-merges from nadav/fix-merge-operation
May 29, 2026

@nadav-govari

Collaborator

Description

In the feature branch, I reworked the MergeScratch used in the merge pipeline with an eye toward deprecating MergeTask, since I thought it was going away. Because the existing flow stays for now, we still need the merge task, which tracks merge permits per pipeline for that flow. This PR reworks it into an enum that's either a merge task or a merge operation.
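
A minimal sketch of the enum shape described above. The variant names `MergeSource::Task` and `MergeSource::Operation` follow the snippet quoted in the review on this PR. The `MergeTask` and `MergeOperation` structs, their fields, and the two helper methods are hypothetical stand-ins for illustration, not the actual Quickwit definitions.

```rust
/// Hypothetical stand-in for a merge operation: just the splits to merge.
#[derive(Debug, PartialEq)]
struct MergeOperation {
    split_ids: Vec<String>,
}

/// Hypothetical stand-in for a scheduled merge task. In the existing flow
/// the task also tracks a merge permit; `permit_id` is a placeholder for it.
#[derive(Debug, PartialEq)]
struct MergeTask {
    merge_operation: MergeOperation,
    permit_id: u32,
}

/// Either a scheduled task (existing flow) or a bare operation.
#[derive(Debug, PartialEq)]
enum MergeSource {
    Task(MergeTask),
    Operation(MergeOperation),
}

impl MergeSource {
    /// Both variants expose the underlying operation (assumed helper).
    fn merge_operation(&self) -> &MergeOperation {
        match self {
            MergeSource::Task(task) => &task.merge_operation,
            MergeSource::Operation(operation) => operation,
        }
    }

    /// Only the `Task` variant carries a task to forward downstream.
    fn into_merge_task(self) -> Option<MergeTask> {
        match self {
            MergeSource::Task(task) => Some(task),
            MergeSource::Operation(_) => None,
        }
    }
}
```

With this shape, code that only needs the list of splits can treat both flows uniformly, while code that cares about permits has to handle the `Operation` case explicitly.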

How was this PR tested?

Unit tests updated.

@nadav-govari nadav-govari requested a review from a team as a code owner May 22, 2026 18:36

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a9fa5cabc0

Comment on lines +168 to +170
let merge_task_opt = match merge_source {
    MergeSource::Task(task) => Some(task),
    MergeSource::Operation(_) => None,

P2: Keep the merge task alive through publish

When this branch forwards a scheduled MergeTask, the task eventually lands in SplitsUpdate::merge_task. The logs publisher, however, destructures SplitsUpdate with .. at quickwit/quickwit-indexing/src/actors/log_publisher_impl.rs:68-75, which drops that field before publish_splits is awaited. For the old merge/delete pipelines, this releases the scheduler permit and removes the operation from the planner/delete inventory before the replacement is actually committed. A slow or failed publish can then let another merge be scheduled and make known_split_ids GC treat the input splits as no longer in flight. If this forwarding is meant to preserve the merge-task lifetime, hold the task and drop it only after publish succeeds, as the parquet publisher does.
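
The ordering concern can be illustrated with a small synchronous sketch. Everything here is a hypothetical stand-in: `MergeTask` is a guard whose `Drop` records "permit released", `SplitsUpdate` is trimmed to two fields, and `publish_splits` is a plain function rather than the real async call. The sketch does not reproduce the publisher's actual code; it only contrasts holding the task across the publish with releasing it beforehand.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type EventLog = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical merge task: dropping it stands in for releasing the permit.
struct MergeTask {
    log: EventLog,
}

impl Drop for MergeTask {
    fn drop(&mut self) {
        self.log.borrow_mut().push("permit released");
    }
}

/// Hypothetical, trimmed-down update message.
struct SplitsUpdate {
    new_split_ids: Vec<String>,
    merge_task: Option<MergeTask>,
}

/// Stand-in for the publish step (synchronous in this sketch).
fn publish_splits(log: &EventLog, _split_ids: &[String]) -> Result<(), String> {
    log.borrow_mut().push("published");
    Ok(())
}

/// Binds `merge_task` explicitly and keeps it alive until publish returns.
/// On a publish error, `?` returns early and the task is dropped then,
/// still after the publish attempt.
fn publish_holding_task(log: &EventLog, update: SplitsUpdate) -> Result<(), String> {
    let SplitsUpdate { new_split_ids, merge_task } = update;
    publish_splits(log, &new_split_ids)?;
    drop(merge_task);
    Ok(())
}

/// Releases the task before publishing: the ordering the review warns about.
fn publish_releasing_early(log: &EventLog, update: SplitsUpdate) -> Result<(), String> {
    let SplitsUpdate { new_split_ids, merge_task } = update;
    drop(merge_task);
    publish_splits(log, &new_split_ids)
}

/// Runs one strategy and returns the recorded event order.
fn run(publish: fn(&EventLog, SplitsUpdate) -> Result<(), String>) -> Vec<&'static str> {
    let log: EventLog = Rc::new(RefCell::new(Vec::new()));
    let update = SplitsUpdate {
        new_split_ids: vec!["split-1".to_string()],
        merge_task: Some(MergeTask { log: log.clone() }),
    };
    publish(&log, update).unwrap();
    let events = log.borrow().clone();
    events
}
```

Calling `run(publish_holding_task)` records the publish before the release, while `run(publish_releasing_early)` records the release first. Binding the field by name makes the point at which the task is dropped explicit in the code.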

@nadav-govari nadav-govari merged commit 7054e1d into nadav/merge-feature-split-merges May 29, 2026
5 checks passed
@nadav-govari nadav-govari deleted the nadav/fix-merge-operation branch May 29, 2026 12:34