What happened?
We'd been on Vortex 0.69 for a while and, upgrading to 0.74, hit a regression caused by the removal of ScalarFnConstantRule in #7575.
can_prune evaluates a predicate's stat-falsification expression over the one-row file-stats array, but it only accepted a constant-folded result (Columnar::Constant) and treated a materialized one-row boolean (Columnar::Canonical) as "cannot prove".
After #7575 removed bottom-up constant folding, composite falsifications no longer fold to a constant: boolean trees (and/or) and eq (whose falsification is internally or(min > lit, lit > max)) now execute to a one-row Canonical, so can_prune discards the result and stops pruning. Bare gt/lt comparisons still fold through the compare kernel's one-level constant fast path, so only composite and eq predicates regressed.
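To make the failure mode concrete, here is a minimal, self-contained sketch of the result-shape check described above. It is a toy model, not Vortex's code: the `Columnar` enum below holds plain booleans rather than real arrays, and both function names are hypothetical. The second function shows one possible way to accept a one-row boolean, assuming that treating it like a constant is sound for the one-row stats array.

```rust
/// Simplified stand-in for the shape of an executed falsification expression.
/// Assumption: the real Vortex `Columnar` wraps arrays, not plain booleans.
#[derive(Debug, Clone, PartialEq)]
enum Columnar {
    /// Constant-folded scalar (`None` models a null result).
    Constant(Option<bool>),
    /// Materialized boolean array, one entry per row.
    Canonical(Vec<Option<bool>>),
}

/// Models the behavior described in this issue: only a constant `true`
/// counts as proof; any materialized result means "cannot prove".
fn can_prune_constant_only(result: &Columnar) -> bool {
    matches!(result, Columnar::Constant(Some(true)))
}

/// Hypothetical alternative: also accept a materialized result when it has
/// exactly one row and that row is a non-null `true`.
fn can_prune_accepting_one_row(result: &Columnar) -> bool {
    match result {
        Columnar::Constant(value) => *value == Some(true),
        Columnar::Canonical(rows) => matches!(rows.as_slice(), [Some(true)]),
    }
}
```

In this model, `Columnar::Canonical(vec![Some(true)])` is rejected by `can_prune_constant_only` and accepted by `can_prune_accepting_one_row`, which mirrors the difference between the regressed and the expected behavior for composite and eq predicates.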
Steps to reproduce
- Write a Vortex file containing a struct column whose statistics bound the values — e.g. an age column with values [15, 18, 22, 25] (min 15, max 25) and a price column [120, 130, 140, 150].
- Open the file (SESSION.open_options().open_buffer(buf)?).
- Call file.can_prune(&eq(col("age"), lit(5)))? ... 5 is outside the [15, 25] min/max, so the file provably contains no match. Expected Ok(true) (prunable); actual Ok(false).
- Call file.can_prune(&and(gt(col("age"), lit(30)), lt(col("price"), lit(100))))? ... both branches are falsified by the stats. Expected Ok(true); actual Ok(false).
- Call file.can_prune(&or(gt(col("age"), lit(30)), lt(col("age"), lit(10))))? ... both branches falsified. Expected Ok(true); actual Ok(false).
- As a control, call file.can_prune(&gt(col("age"), lit(30)))? ... a bare comparison, which still folds through the compare kernel's fast path. Returns Ok(true) correctly, showing only composite/eq predicates regressed.
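The expected results in the steps above can be checked by hand with a small model of min/max falsification. This sketch does not use the Vortex API: `Stats`, `Pred`, `falsified`, and `file_stats` are hypothetical names, values are plain `i64`, and the eq rule is the `or(min > lit, lit > max)` form quoted earlier. The gt and lt rules are the usual bounds checks and are an assumption about how such falsification works in general.

```rust
use std::collections::HashMap;

/// Per-column min/max statistics (toy model, integers only).
#[derive(Clone, Copy)]
struct Stats {
    min: i64,
    max: i64,
}

/// A tiny predicate language covering the shapes used in the steps above.
enum Pred {
    Gt(&'static str, i64),
    Lt(&'static str, i64),
    Eq(&'static str, i64),
    And(Box<Pred>, Box<Pred>),
    Or(Box<Pred>, Box<Pred>),
}

/// Returns true when the stats prove that no row can satisfy `pred`.
/// A column without stats can never be falsified.
fn falsified(pred: &Pred, stats: &HashMap<&str, Stats>) -> bool {
    match pred {
        // col > lit is impossible when max <= lit.
        Pred::Gt(c, lit) => stats.get(*c).map_or(false, |s| s.max <= *lit),
        // col < lit is impossible when min >= lit.
        Pred::Lt(c, lit) => stats.get(*c).map_or(false, |s| s.min >= *lit),
        // col == lit is impossible when lit lies outside [min, max].
        Pred::Eq(c, lit) => stats
            .get(*c)
            .map_or(false, |s| s.min > *lit || *lit > s.max),
        // A conjunction fails as soon as either side is impossible.
        Pred::And(a, b) => falsified(a, stats) || falsified(b, stats),
        // A disjunction fails only when both sides are impossible.
        Pred::Or(a, b) => falsified(a, stats) && falsified(b, stats),
    }
}

/// Stats matching the reproduction file: age in [15, 25], price in [120, 150].
fn file_stats() -> HashMap<&'static str, Stats> {
    HashMap::from([
        ("age", Stats { min: 15, max: 25 }),
        ("price", Stats { min: 120, max: 150 }),
    ])
}
```

With these stats, `falsified` returns true for all four predicates in the steps (eq on age with 5, the and of age > 30 with price < 100, the or of age > 30 with age < 10, and the bare age > 30), which is why every call is expected to return Ok(true). A predicate such as `Pred::Eq("age", 18)` is not falsified, since 18 lies inside [15, 25].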
Environment
Additional context
No response