Several insta snapshot tests in datafusion/core/tests/physical_optimizer/ capture RepartitionExec: partitioning=RoundRobinBatch(N), input_partitions=M where N is the host's CPU core count. On hosts whose core count differs from the snapshot environment, the assertions fail despite the optimizer behavior being correct.
Reproduction
On a 24-core / 24-thread machine, against current main (HEAD 2453bec66):
cargo test -p datafusion --test core_integration \
physical_optimizer::ensure_requirements::test_filter_over_multi_partition_sort_limit
Fails with:
- MockMultiPartitionExec
+ RepartitionExec: partitioning=RoundRobinBatch(24), input_partitions=16
Workaround
Setting DATAFUSION_EXECUTION_TARGET_PARTITIONS=4 (or whatever count the snapshot was captured against) makes the affected tests pass.
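To make the mechanism concrete, here is a minimal std-only Rust sketch of why an override stabilizes the captured plan line. It is a simplified model, not DataFusion's actual resolution code: `resolve_target_partitions` and `repartition_line` are hypothetical names, and the assumed behavior is that an explicit override wins while the host's core count is the fallback.

```rust
/// Simplified model (an assumption, not DataFusion's real code) of how the
/// partition count that ends up in the snapshot is chosen: a valid explicit
/// override wins, otherwise the host's core count is used.
fn resolve_target_partitions(override_value: Option<&str>, host_cores: usize) -> usize {
    override_value
        .and_then(|v| v.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(host_cores)
}

/// Renders a plan line in the shape the snapshots capture (illustration only).
fn repartition_line(target_partitions: usize, input_partitions: usize) -> String {
    format!(
        "RepartitionExec: partitioning=RoundRobinBatch({}), input_partitions={}",
        target_partitions, input_partitions
    )
}
```

In this model, a 24-core host with an override of `"4"` resolves to 4 and renders `RoundRobinBatch(4)`, while the same host without an override renders `RoundRobinBatch(24)` and would no longer match a snapshot captured at 4.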
Suggested fix
Either:
1. Pin a fixed target_partitions in the test's SessionConfig so the captured plan doesn't inherit the host's CPU count, or
2. Add a CPU-count guard / #[ignore] for environments where target_partitions exceeds the value the snapshot was captured against.
Happy to send a PR for approach (1) if there's consensus.
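For comparison, the two approaches above could be sketched roughly as follows, using only the Rust standard library. All function names here are hypothetical, and the real fix for approach (1) would go through the test's `SessionConfig` (for example `SessionConfig::with_target_partitions`) rather than a helper like this.

```rust
use std::thread;

/// Host parallelism as reported by the standard library, falling back to 1
/// if it cannot be determined.
fn host_parallelism() -> usize {
    thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
}

/// Approach (1), modelled: a pinned value makes the result independent of
/// the host, so the snapshot is stable everywhere.
fn pinned_or_host(pinned: Option<usize>, host: usize) -> usize {
    pinned.unwrap_or(host)
}

/// Approach (2), modelled: skip the snapshot assertion when the host offers
/// more parallelism than the snapshot was captured with.
fn should_skip_snapshot(host: usize, snapshot_partitions: usize) -> bool {
    host > snapshot_partitions
}
```

A test using the guard would return early when `should_skip_snapshot(host_parallelism(), 4)` is true, where 4 is only the example value from the workaround. Pinning looks preferable because the test still runs, and still checks the optimizer, on every host.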
Other examples of the same class
- test_filter_over_multi_partition_sort_limit, added in "Add EnsureRequirements: merged EnforceDistribution + EnforceSorting with idempotent pushdown_sorts" #21976 (2026-05-24)
- explain_analyze.slt:103 (output_rows_skew, added in "feat(metric): Add output skewness metric to detect skewed plans easier" #21211): same root cause, manifests in SLT