Skip to content
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -497,6 +497,15 @@ object BooleanSimplification extends Rule[LogicalPlan] with PredicateHelper {
}
}

case not: Not => simplifyNot(not)
}

/**
* Simplify Not expressions. This method is extracted to allow recursive calls for
* Not(a Or b) and Not(a And b) cases, which avoids calling the entire actualExprTransformer
* and only focuses on Not simplification.
*/
private def simplifyNot(not: Not): Expression = not match {
case Not(TrueLiteral) => FalseLiteral
case Not(FalseLiteral) => TrueLiteral

Expand All @@ -506,22 +515,18 @@ object BooleanSimplification extends Rule[LogicalPlan] with PredicateHelper {
case Not(a LessThan b) => GreaterThanOrEqual(a, b)
case Not(a LessThanOrEqual b) => GreaterThan(a, b)

// SPARK-54881: push down the NOT operators on children, before attaching the junction Node
// to the main tree. This ensures idempotency in an optimal way and avoids an extra rule
// iteration.
// De Morgan's Laws can create new Not expressions that need further simplification.
case Not(a Or b) =>
And(Not(a), Not(b)).transformDownWithPruning(_.containsPattern(NOT), ruleId) {
actualExprTransformer
}
And(simplifyNot(Not(a)), simplifyNot(Not(b)))
case Not(a And b) =>
Or(Not(a), Not(b)).transformDownWithPruning(_.containsPattern(NOT), ruleId) {
actualExprTransformer
}
Or(simplifyNot(Not(a)), simplifyNot(Not(b)))

case Not(Not(e)) => e

case Not(IsNull(e)) => IsNotNull(e)
case Not(IsNotNull(e)) => IsNull(e)

case _ => not
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have 2 comments:

  1. Have found recursive calls to be less efficient, as compared to getting the recursive behaviour via call on an instance, where the instance itself is changing at every step.. like it is happening in transformDown
  2. Any future modification , say adding new case etc, would require analysis whether it should go in both the places ( original transform down, and new simplifyNot function, or only in once place... atleast this is what I feel.. though I may be wrong.
    But if that identification would be straightforward and any hypothetical new case added would be needed only in one of the functions, then this distinction is good. Though recursive call on method will be a concern for complex expressions....

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously it was O(n^2): actualExprTransformer is called in transformExpressionsUpWithPruning, and itself also calls transformDownWithPruning. We repeatedly transform the sub-expression tree.

I don't think recursion is a concern, as these transform APIs are all recursive under the hood.

Expand Down