Snapshot tests in physical_optimizer are not deterministic across CPU-count environments #22543

@crm26

Description

Several insta snapshot tests in datafusion/core/tests/physical_optimizer/ capture RepartitionExec: partitioning=RoundRobinBatch(N), input_partitions=M where N is the host's CPU core count. On hosts whose core count differs from the snapshot environment, the assertions fail despite the optimizer behavior being correct.

Reproduction

On a 24-core / 24-thread machine, against current main (HEAD 2453bec66):

cargo test -p datafusion --test core_integration \
  physical_optimizer::ensure_requirements::test_filter_over_multi_partition_sort_limit

Fails with:

-        MockMultiPartitionExec
+        RepartitionExec: partitioning=RoundRobinBatch(24), input_partitions=16

Other examples of the same class

Workaround

Setting DATAFUSION_EXECUTION_TARGET_PARTITIONS=4 (or whatever count the snapshot was captured against) makes the affected tests pass.

Suggested fix

Either:

  1. Pin a fixed target_partitions in the test's SessionConfig so the captured plan doesn't inherit the host's CPU count, or
  2. Add a CPU-count guard / #[ignore] for environments where target_partitions exceeds the value the snapshot was captured against.
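For reference, approach (2) could look roughly like the std-only sketch below. It is not DataFusion code: the constant SNAPSHOT_TARGET_PARTITIONS and the helpers host_parallelism and should_skip_snapshot are hypothetical names, the value 4 is just the example count from the workaround above, and the sketch assumes the default target_partitions follows the host's logical CPU count with no environment override in effect.

```rust
use std::thread::available_parallelism;

/// Partition count the snapshots are assumed to have been captured with.
/// Hypothetical value, taken from the workaround example in this issue.
const SNAPSHOT_TARGET_PARTITIONS: usize = 4;

/// Number of logical CPUs available to this process,
/// falling back to 1 if it cannot be determined.
fn host_parallelism() -> usize {
    available_parallelism().map(|n| n.get()).unwrap_or(1)
}

/// True when the host has more CPUs than the snapshot was captured with,
/// i.e. when the captured plan would be expected to differ.
fn should_skip_snapshot(host_cpus: usize, snapshot_partitions: usize) -> bool {
    host_cpus > snapshot_partitions
}

fn main() {
    let host = host_parallelism();
    if should_skip_snapshot(host, SNAPSHOT_TARGET_PARTITIONS) {
        // A real test would return early here instead of asserting the snapshot.
        println!(
            "skipping snapshot assertion: host has {host} CPUs, snapshot assumes {SNAPSHOT_TARGET_PARTITIONS}"
        );
        return;
    }
    println!("running snapshot assertion with {host} CPUs");
}
```

A guard like this silently reduces coverage on larger machines, which is one reason approach (1) seems preferable: pinning the partition count makes the captured plan independent of the host.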

Happy to send a PR for approach (1) if there's consensus.
