We would like to support most or all of the features that the DuckDB Parquet extension enables.
In other words: where, and why, are we not yet better than Parquet on the benchmarks we check?
clickbench: q23[1][2], q35, q40, q41, q42.
statpopgen: q2[5][6], q3[6], q4[6], q5[5][6], q6[5], q9[6], q10[6].
tpch S3 sf=1: q6, q17, q21.
tpch NVME sf=10: q4, q12 (vortex-compact).
tpch NVME sf=100: q20, q21.
tpch S3 sf=100: q6, q17, q21.
tpcds NVME sf=1: many; notable ones are q2, q3 (big regression), q5, q6, q31.
clickbench q14-q18: these measure DuckDB performance, scan takes 5% of CPU time; q32-q34: scan takes 2.2% of CPU time; q40-q42: scan takes 2% of CPU time.
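To put the scan shares above in perspective, here is a minimal sketch that applies Amdahl's law to them. It assumes the quoted percentages are the scan's fraction of total CPU time and that everything else in the query is unaffected by scan changes; the helper name `max_speedup` and the dictionary layout are illustrative only, not part of any benchmark tooling.

```python
def max_speedup(scan_fraction: float) -> float:
    """Upper bound on whole-query speedup if the scan cost dropped to zero.

    Amdahl's law: only the scan fraction can shrink, the rest stays fixed.
    """
    if not 0.0 <= scan_fraction < 1.0:
        raise ValueError("scan_fraction must be in [0, 1)")
    return 1.0 / (1.0 - scan_fraction)


# Scan share of CPU time, taken from the clickbench numbers quoted above.
scan_share = {
    "q14-q18": 0.05,
    "q32-q34": 0.022,
    "q40-q42": 0.02,
}

bounds = {queries: max_speedup(share) for queries, share in scan_share.items()}

for queries, bound in bounds.items():
    print(f"{queries}: at most {bound:.3f}x faster with a free scan")
```

Under these assumptions, even an infinitely fast scan would improve q14-q18 by only about 5%, and the other two groups by about 2%, so gaps on these queries are unlikely to be explained by scan cost alone.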
Features that will solve this:
Things proven to have none/marginal impact:
Related: