fix: simplifier on leaf nodes returns null#22368
Conversation
apache#22367) A leaf PhysicalExpr that is neither a Literal nor a Column (nor volatile) was being const-folded because `all` over its empty child list is vacuously true. The simplifier then evaluated the node against a dummy 1-row null batch, producing a scalar unrelated to its real runtime semantics and silently corrupting predicates. Reject zero-children expressions up front: a leaf has no children to derive constness from, so const-folding is unsound unless the leaf is already known to be a Literal (handled by the dedicated check above). This is the narrow defensive fix from step 1 of issue apache#22367; the broader FFI reconstruction work is tracked separately. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
| let children = expr.children(); | ||
| if children.is_empty() || !children.iter().all(|c| c.is::<Literal>()) { |
There was a problem hiding this comment.
This is the core of the PR.
|
@mbutrovich I know you're planning to start rc for 54 soon, but I'd very much like to get this fix in to it. |
adriangb
left a comment
There was a problem hiding this comment.
I understand the reasoning here and can't see what cases this breaks.
I do remember (maybe when I was writing this simplifier bit?) being surprised that there is no API on PhysicalExpr to determine constness and that we (1) downcast match Literal (i.e. there can't be a custom literal) and (2) assume anything with no columns in it's subtree and non volatile is literal. I remember thinking of exactly an expression like your OpaqueLeaf. That is to say: this may be a bug somewhere else as well.
It's in the linked issue, but we have cases where via FFI we have things like a |
Which issue does this PR close?
Addresses the immediate problem identified in apache/datafusion-python#1551 and the first recommendation in #22367
Rationale for this change
We expect Literal and Column expressions and they get
Transform::noas expected. If we have other PhysicalExpr that similarly should not get through because they have no children then they should likewise getTransform::no. This was found when a FFI wrapped Column expression failed in simplification.What changes are included in this PR?
Essentially a two line change to see if it is a leaf node (no children) instead of just checking that all of the children are literals.
Are these changes tested?
Unit test added.
Are there any user-facing changes?
No user facing changes.