Add support for filter pushdown rule #924
…ers from TableScan struct
…filter-pushdown-rule
Codecov Report
@@ Coverage Diff @@
## main #924 +/- ##
==========================================
+ Coverage 75.81% 76.04% +0.22%
==========================================
Files 73 73
Lines 4065 4082 +17
Branches 737 739 +2
==========================================
+ Hits 3082 3104 +22
+ Misses 823 814 -9
- Partials 160 164 +4
…k-sql into support-filter-pushdown-rule
…filter-pushdown-rule
[
    [("a", "==", 1), ("b", "<", 10)],
    [("a", "==", 1), ("b", ">", 5)],
    [("b", ">", 5), ("b", "<", 10)],
    [("a", "==", 1)],
],
The latest version of datafusion applies some conversions that rewrite filters into a CNF-like format. So the DNF here,
(b > 5 AND b < 10) OR a = 1, gets remapped by datafusion to
(b > 5 OR a = 1) AND (b < 10 OR a = 1), which becomes this in DNF:
(b > 5 AND a = 1) OR (b < 10 AND a = 1) OR (b > 5 AND b < 10) OR (a = 1).
I don't know if there's an easy way to simplify these predicate expressions, but some cases might lead to a larger number of redundant filters because of this.
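To make the expansion described above concrete, here is a minimal sketch of distributing a CNF expression back into DNF, plus one possible simplification (the absorption law, which drops any conjunction that strictly contains another). The helper names `cnf_to_dnf` and `absorb` are hypothetical and are not part of dask-sql or datafusion; predicates use the same `(column, op, value)` tuples as the test expectation.

```python
from itertools import product


def cnf_to_dnf(cnf):
    """Distribute AND over OR: take one predicate from every CNF clause."""
    dnf = []
    for combo in product(*cnf):
        conjunction = []
        for pred in combo:
            if pred not in conjunction:  # (a AND a) collapses to a
                conjunction.append(pred)
        if conjunction not in dnf:
            dnf.append(conjunction)
    return dnf


def absorb(dnf):
    """Drop conjunctions that strictly contain another one (x OR (x AND y) == x)."""
    sets = [frozenset(c) for c in dnf]
    return [c for c, s in zip(dnf, sets) if not any(other < s for other in sets)]


# (b > 5 OR a = 1) AND (b < 10 OR a = 1), written as a list of OR-clauses
cnf = [
    [("b", ">", 5), ("a", "==", 1)],
    [("b", "<", 10), ("a", "==", 1)],
]

expanded = cnf_to_dnf(cnf)    # four conjunctions, as in the expected test output
simplified = absorb(expanded)  # back to the original two conjunctions
```

Absorption only removes conjunctions that are strict supersets of another one; it does not reason about ranges (for example, `b > 5` implying `b > 3`), so it is a partial answer to the redundancy concern at best.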
14,
16,
18,
21,
I looked into both of these queries. Both seem to have previously failed with metadata inference errors when trying to compare a datetime64 column with a string object.
I'll open another issue to investigate, but in the meantime pushing those filters down to the IO layer seems to fix the queries.
Digging into this before, it seemed like the underlying issue was that we were at some point casting a column to Date32 and trying to compare it to a Date32 scalar, but their dtypes didn't match up. Not sure if that's helpful for making a simple reproducer for this.
rerun tests
This PR enables using the `filter_pushdown_rule` from datafusion and allows passing filters down to the tablescan operation.
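As a rough illustration of what a pushed-down filter means at the scan, the sketch below evaluates a DNF filter (a list of AND-groups that are OR-ed together, using the same `(column, op, value)` tuples as the test expectation in this PR) against plain Python rows. The `row_matches` helper and the sample rows are hypothetical, and this is not how dask-sql or datafusion apply the filters internally.

```python
import operator

# Comparison operators that may appear in a (column, op, value) predicate
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def row_matches(row, dnf_filters):
    """Return True if the row satisfies at least one AND-group."""
    return any(
        all(OPS[op](row[col], value) for col, op, value in conjunction)
        for conjunction in dnf_filters
    )


rows = [
    {"a": 1, "b": 20},  # matches a == 1
    {"a": 2, "b": 7},   # matches b > 5 AND b < 10
    {"a": 2, "b": 12},  # matches neither group
]

# (b > 5 AND b < 10) OR a = 1
filters = [[("b", ">", 5), ("b", "<", 10)], [("a", "==", 1)]]

kept = [row for row in rows if row_matches(row, filters)]
```

Applying the predicate while reading means rows like the third one never reach the later operators.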