backport: Allow dynamic filter pushdown for left join #3

Merged
evenyag merged 2 commits into GreptimeTeam:greptimedb-53.1.0-function-signature-exec-error from discord9:allow-left-join-dynamic-filter on Jun 9, 2026

discord9 commented Jun 3, 2026

Summary

Backport upstream DataFusion's HashJoin dynamic filter support for join types whose probe side is safe for ON-clause filter pushdown.

This follows the original DataFusion feature behavior from apache#20447: HashJoinExec now gates self-generated dynamic filters with JoinType::on_lr_is_preserved() instead of a Greptime-specific Inner | Left rule.

The dynamic filter is still built from the left/build side join keys and applied to the right/probe side.
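
To make the direction of the filter concrete, here is a minimal sketch, not the DataFusion implementation. `KeySetFilter`, `from_build_keys`, and `apply` are hypothetical names, and a plain `HashSet` of integer keys stands in for whatever representation the real dynamic filter uses.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a dynamic filter: the distinct join-key
/// values observed on the left/build side.
pub struct KeySetFilter {
    keys: HashSet<i64>,
}

impl KeySetFilter {
    /// Collect the join keys from the left/build side.
    pub fn from_build_keys(build_keys: &[i64]) -> Self {
        Self {
            keys: build_keys.iter().copied().collect(),
        }
    }

    /// Keep only the right/probe rows whose key appears on the build side.
    /// Input order is preserved.
    pub fn apply<'a>(&self, probe_rows: &[(i64, &'a str)]) -> Vec<(i64, &'a str)> {
        probe_rows
            .iter()
            .copied()
            .filter(|(key, _)| self.keys.contains(key))
            .collect()
    }
}
```

The filter is derived only from build-side keys and only ever removes probe-side rows; it never touches the build side.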

Note on null-aware LeftAnti

This backport intentionally keeps the behavior aligned with upstream DataFusion and does not add Greptime-specific handling for null-aware LeftAnti / NOT IN semantics.

In particular, this PR does not introduce an extra null_aware guard in HashJoinExec::allow_join_dynamic_filter_pushdown(). Any semantic caveats around null-aware anti joins are kept consistent with the original upstream feature and are considered out of scope for this backport.

What changed

  • Add JoinType::on_lr_is_preserved() to datafusion-common.
  • Reuse that helper from the optimizer's ON-clause filter-pushdown logic.
  • Change HashJoinExec::allow_join_dynamic_filter_pushdown() to check whether the right/probe side is preserved for ON-clause pushdown.
  • Update dynamic-filter sqllogictest expectations for the newly enabled join types.
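
The shape of the new gate can be sketched as follows. This is an illustration under assumptions, not the patched code: `SketchJoinType` is a local enum covering only four join types, the `(left, right)` tuple return is assumed, and the free function merely mirrors the name of `HashJoinExec::allow_join_dynamic_filter_pushdown()`. The table itself follows standard join semantics.

```rust
/// Simplified local stand-in for a join-type enum (not the real `JoinType`).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SketchJoinType {
    Inner,
    Left,
    Right,
    Full,
}

impl SketchJoinType {
    /// Returns `(left_preserved, right_preserved)`: whether an ON-clause
    /// filter may be pushed to that input without changing the join result.
    /// The tuple shape is an assumption made for this sketch.
    pub fn on_lr_is_preserved(self) -> (bool, bool) {
        match self {
            SketchJoinType::Inner => (true, true),
            SketchJoinType::Left => (false, true),
            SketchJoinType::Right => (true, false),
            SketchJoinType::Full => (false, false),
        }
    }
}

/// The dynamic filter targets the right/probe side, so the gate only
/// looks at the right-hand flag.
pub fn allow_join_dynamic_filter_pushdown(join_type: SketchJoinType) -> bool {
    let (_left_preserved, right_preserved) = join_type.on_lr_is_preserved();
    right_preserved
}
```

Under this sketch `Inner` and `Left` stay enabled, as with the previous `Inner | Left` rule, while `Right` and `Full` remain disabled because their probe-side rows must reach the output even without a match.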

Why this is safe

The self-generated HashJoin dynamic filter only filters the probe side by build-side join-key values.

For a physical LEFT JOIN, this is safe because rows on the right/probe side whose keys are not present on the left/build side cannot contribute to the output, while unmatched left rows are still preserved.
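
The argument can be checked on a toy example. The sketch below uses a naive nested-loop LEFT JOIN over in-memory rows; `Row`, `left_join`, and `prefilter_probe` are hypothetical helpers written for illustration and are unrelated to the HashJoinExec code path.

```rust
use std::collections::HashSet;

/// A toy row: (join key, payload).
pub type Row = (i64, &'static str);

/// Naive LEFT JOIN on the key; unmatched left rows are emitted with `None`.
pub fn left_join(left: &[Row], right: &[Row]) -> Vec<(i64, &'static str, Option<&'static str>)> {
    let mut out = Vec::new();
    for &(left_key, left_val) in left {
        let mut matched = false;
        for &(right_key, right_val) in right {
            if left_key == right_key {
                out.push((left_key, left_val, Some(right_val)));
                matched = true;
            }
        }
        if !matched {
            // The left/build row is preserved even without a match.
            out.push((left_key, left_val, None));
        }
    }
    out
}

/// Drop right/probe rows whose key never appears on the left/build side,
/// which is what a key-based dynamic filter would do.
pub fn prefilter_probe(left: &[Row], right: &[Row]) -> Vec<Row> {
    let keys: HashSet<i64> = left.iter().map(|&(key, _)| key).collect();
    right
        .iter()
        .copied()
        .filter(|(key, _)| keys.contains(key))
        .collect()
}
```

Joining `left` against `prefilter_probe(left, right)` gives the same rows as joining against `right` directly, because the dropped probe rows could never have matched. The same pre-filter would be wrong for a RIGHT JOIN, where an unmatched right row must still appear in the output.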

More generally, this backport keeps the upstream DataFusion rule: enable the feature when the probe side is preserved according to on_lr_is_preserved().

Validation

  • cargo fmt --all
  • cargo test -p datafusion-physical-plan test_allow_join_dynamic_filter_pushdown_gate -- --nocapture
  • cargo test -p datafusion-physical-plan test_on_lr_is_preserved -- --nocapture
  • cargo test -p datafusion-sqllogictest --test sqllogictests -- dynamic_filter_pushdown_config
  • GreptimeDB integration check with this DataFusion patch:
    cargo test -p tests-integration remote_dyn_filter_test -- --nocapture

The GreptimeDB integration test verifies:

  • HashJoinExec: mode=CollectLeft, join_type=Left
  • right-side MergeScanExec receives non-empty dyn_filters
  • unmatched left row is preserved in LEFT JOIN output

Signed-off-by: discord9 <discord9@163.com>
discord9 changed the title from "Allow dynamic filter pushdown for left join" to "backport: Allow dynamic filter pushdown for left join" on Jun 9, 2026
evenyag merged commit d171145 into GreptimeTeam:greptimedb-53.1.0-function-signature-exec-error on Jun 9, 2026