From 920bb3802abf007df5878929d10644b6e5e745fe Mon Sep 17 00:00:00 2001 From: Yan Yan Date: Wed, 28 Jan 2026 14:26:52 -0800 Subject: [PATCH] [SPARK-41398][SQL][FOLLOWUP] Update runtime filtering javadoc to reflect relaxed partition constraints Update the javadoc for `SupportsRuntimeV2Filtering.filter()` and `SupportsRuntimeFiltering.filter()` to reflect the changes made in SPARK-41398, which relaxed the constraint on partition values during runtime filtering. After that change, scans can now either: - Replace partitions with no matching data with empty InputPartitions, or - Report only a subset of the original partition values (omitting those with no data) The previous documentation stated that the "overall number of partitions" must be preserved, which is no longer required. The only constraint is that new partition values not present in the original partitioning cannot be introduced. Co-Authored-By: Claude Opus 4.5 --- .../spark/sql/connector/read/SupportsRuntimeFiltering.java | 6 ++++-- .../sql/connector/read/SupportsRuntimeV2Filtering.java | 7 ++++--- 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeFiltering.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeFiltering.java index 0921a90ac22a7..34bce404f375d 100644 --- a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeFiltering.java +++ b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeFiltering.java @@ -51,8 +51,10 @@ public interface SupportsRuntimeFiltering extends SupportsRuntimeV2Filtering { * the originally reported partitioning during runtime filtering. While applying runtime filters, * the scan may detect that some {@link InputPartition}s have no matching data. It can omit * such partitions entirely only if it does not report a specific partitioning. Otherwise, - * the scan can replace the initially planned {@link InputPartition}s that have no matching - * data with empty {@link InputPartition}s but must preserve the overall number of partitions. + * the scan can either replace the initially planned {@link InputPartition}s that have no + * matching data with empty {@link InputPartition}s, or report only a subset of the original + * partition values (omitting those with no data). The scan must not report new partition values + * that were not present in the original partitioning. *

* Note that Spark will call {@link Scan#toBatch()} again after filtering the scan at runtime. * diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeV2Filtering.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeV2Filtering.java index 7c238bde969b2..1bec81fe8184f 100644 --- a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeV2Filtering.java +++ b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeV2Filtering.java @@ -55,9 +55,10 @@ public interface SupportsRuntimeV2Filtering extends Scan { * the originally reported partitioning during runtime filtering. While applying runtime * predicates, the scan may detect that some {@link InputPartition}s have no matching data. It * can omit such partitions entirely only if it does not report a specific partitioning. - * Otherwise, the scan can replace the initially planned {@link InputPartition}s that have no - * matching data with empty {@link InputPartition}s but must preserve the overall number of - * partitions. + * Otherwise, the scan can either replace the initially planned {@link InputPartition}s that + * have no matching data with empty {@link InputPartition}s, or report only a subset of the + * original partition values (omitting those with no data). The scan must not report new + * partition values that were not present in the original partitioning. *

* Note that Spark will call {@link Scan#toBatch()} again after filtering the scan at runtime. *