We want to push down scalar (and, in the future, aggregate) functions that are part of SELECT into Vortex.
Example: SELECT strlen(col) doesn't need to decompress the strings.
DuckDB PR for type pushdown, which uses the same mechanism: duckdb/duckdb#22788
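A minimal sketch of why the strlen example can skip decompression, assuming a dictionary-encoded string column. `DictStrings` and both function names are hypothetical and are not Vortex's API; nulls are ignored for brevity.

```rust
/// Toy dictionary-encoded string column: `codes[i]` indexes into `values`.
/// Hypothetical type for illustration only; not Vortex's dict array.
struct DictStrings {
    values: Vec<String>,
    codes: Vec<u32>,
}

/// Baseline: materialize every row's string, then measure it.
fn byte_length_decompressed(col: &DictStrings) -> Vec<usize> {
    col.codes
        .iter()
        .map(|&c| col.values[c as usize].clone()) // one string copy per row
        .map(|s| s.len())
        .collect()
}

/// Pushed down: measure each distinct value once, then gather by code.
fn byte_length_pushed_down(col: &DictStrings) -> Vec<usize> {
    let lengths: Vec<usize> = col.values.iter().map(|v| v.len()).collect();
    col.codes.iter().map(|&c| lengths[c as usize]).collect()
}
```

Both functions return the same lengths, but the pushed-down version only touches each distinct value once and never builds per-row strings.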
Future work:
- Push down chains of functions f_1(...f_n(col)...) if the leaf child is a BoundColumnRef.
- Push down multiple expressions for the same column if we can push all of them, i.e. allow pushing down (strlen(col), col). This requires a change to the projection ids passed to Vortex: in this example we need to pass projection_ids=[0, 0].
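The two items above can be sketched on a toy expression tree. Everything here is hypothetical: `Expr` stands in for DuckDB's bound expressions, and `pushable_chain` and `projection_ids` are illustrative names. The sketch also assumes only unary functions are pushable, which the list above does not state.

```rust
/// Toy bound-expression tree; a hypothetical stand-in for DuckDB's bound expressions.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    ColumnRef(usize),
    Constant(i64),
    Function { name: String, args: Vec<Expr> },
}

/// If `expr` is a chain of unary functions whose leaf is a column reference,
/// returns the function names (outermost first) and the column index.
/// A bare column reference counts as an empty chain.
fn pushable_chain(expr: &Expr) -> Option<(Vec<&str>, usize)> {
    let mut names = Vec::new();
    let mut cur = expr;
    loop {
        match cur {
            Expr::ColumnRef(idx) => return Some((names, *idx)),
            Expr::Function { name, args } if args.len() == 1 => {
                names.push(name.as_str());
                cur = &args[0];
            }
            // Constants and multi-argument functions are not handled in this sketch.
            _ => return None,
        }
    }
}

/// One projection id per SELECT expression, or None if any expression
/// cannot be pushed down. The same column id may repeat.
fn projection_ids(select_list: &[Expr]) -> Option<Vec<usize>> {
    select_list
        .iter()
        .map(|e| pushable_chain(e).map(|(_, col)| col))
        .collect()
}
```

For the select list (strlen(col), col) over column 0, `projection_ids` yields `[0, 0]`, matching the example in the list; a single non-pushable expression makes the whole result `None`.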
A separate future extension is aggregate function pushdown.
Issues to solve:
- The cast reduce rule for dict evaluated validity(), causing decompression. Solved by adding validity() to byte_length(); see "Constant comparison and byte_length OnPair kernels" (#8371).
- This boils down to whether we want to push an expression or a subexpression tree to the Dict values cache to avoid the cost of recanonicalization; see "Push down some expressions to Dict layout reader's cached values" (#8341).