parquet: Speed up BitReader/DeltaBitPackDecoder #325
From a quick test, this speeds up reading delta-packed int columns by over 30%.
From a quick test, it seems to decode around 10% faster overall.
753019c to 10d8390
@kornholi thank you for looking into
I restarted the CI checks on this PR, as the failure in the Windows tests seemed unrelated to your changes.
    );
    assert!(loaded == self.values_current_mini_block);
} else {
    self.deltas_in_mini_block.clear();
I don't understand the need for this change. Was calling `clear()` a major bottleneck, or was it having to reinitialize the entire `deltas_in_mini_block` to `default()` in the `self.use_batch` branch?
In this case, the resize is expensive even though it optimizes down to mostly a memset (the array had only 4 elements in my tests). It makes around a 5% difference in throughput.
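To make the trade-off in this thread concrete, here is a minimal sketch of the two ways a scratch buffer can be prepared before a batch read. It is not the code from this PR: the function names, the `i64` element type, and the `fill_batch` stand-in are all hypothetical. It relies only on the documented behaviour of `Vec::clear` and `Vec::resize`.

```rust
/// Hypothetical strategy A: clear, then resize.
/// After `clear()` the length is 0, so `resize` has to write the default
/// value into all `n` slots again, even though the allocation is reused.
fn prepare_clear_then_resize(buf: &mut Vec<i64>, n: usize) {
    buf.clear();
    buf.resize(n, 0);
}

/// Hypothetical strategy B: resize only.
/// Slots that already exist are left untouched; only newly added slots
/// (if the buffer grows) are filled with the default value.
fn prepare_resize_only(buf: &mut Vec<i64>, n: usize) {
    buf.resize(n, 0);
}

/// Stand-in for a batch read that overwrites every slot it is given,
/// which is why the previous contents do not need to be reset first.
/// Returns the number of values written.
fn fill_batch(buf: &mut [i64], start: i64) -> usize {
    for (i, slot) in buf.iter_mut().enumerate() {
        *slot = start + i as i64;
    }
    buf.len()
}
```

When the reader overwrites the whole buffer anyway, strategy B skips the per-mini-block reinitialization. Whether that matters in practice depends on the workload; the roughly 5% figure above is the author's own measurement.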
* parquet: Avoid temporary `BufferPtr`s in `BitReader`

  From a quick test, this speeds up reading delta-packed int columns by over 30%.

* parquet: Avoid some allocations in `DeltaBitPackDecoder`

  From a quick test, it seems to decode around 10% faster overall.

Co-authored-by: Kornelijus Survila <kornholijo@gmail.com>
This PR removes some reference counting in `BitReader` and a few allocations in `DeltaBitPackDecoder`. At least for the datasets I tested with, delta-encoded integer columns decode around 50% faster. A `SELECT AVG(foo)` through datafusion was about 30% faster as well.
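As a rough illustration of what "removing reference counting" can mean in a hot read path, the sketch below contrasts a reader that creates a temporary reference-counted handle on every read with one that reads straight from a borrowed slice. Both reader types are hypothetical, and `Arc<Vec<u8>>` is only an assumed stand-in for a shared buffer handle. This is not the `BitReader` implementation from the parquet crate.

```rust
use std::sync::Arc;

/// Hypothetical reader that makes a temporary shared handle for each read.
struct CountedReader {
    data: Arc<Vec<u8>>,
    pos: usize,
}

impl CountedReader {
    fn read_u32(&mut self) -> Option<u32> {
        // Cloning the Arc performs an atomic increment of the reference count.
        let handle = Arc::clone(&self.data);
        let b = handle.get(self.pos..self.pos + 4)?;
        let v = u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
        self.pos += 4;
        Some(v)
        // `handle` is dropped here, which performs an atomic decrement.
    }
}

/// Hypothetical reader that borrows the bytes, so a read touches no counter.
struct SliceReader<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> SliceReader<'a> {
    fn read_u32(&mut self) -> Option<u32> {
        // Returns None if fewer than 4 bytes remain.
        let b = self.data.get(self.pos..self.pos + 4)?;
        let v = u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
        self.pos += 4;
        Some(v)
    }
}
```

Both readers return the same values. The difference is only the per-call increment and decrement, which can add up when a decoder issues many small reads.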