Inline Kleene boolean kernels#8478
Conversation
Merging this PR will improve performance by 50.5%
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ❌ | Simulation | take_10k_random |
197.9 µs | 255.8 µs | -22.63% |
| ❌ | Simulation | take_10k_contiguous |
218.3 µs | 276.3 µs | -21% |
| ❌ | Simulation | patched_take_10k_contiguous_patches |
232.3 µs | 290.9 µs | -20.16% |
| ❌ | Simulation | patched_take_10k_random |
244.2 µs | 303 µs | -19.38% |
| ❌ | WallTime | cuda/bitpacked_u8/unpack/3bw[100M] |
299 µs | 347.7 µs | -14% |
| ⚡ | Simulation | or_false_constant |
69.8 µs | 15 µs | ×4.7 |
| ⚡ | Simulation | and_true_constant |
69.3 µs | 15 µs | ×4.6 |
| ⚡ | Simulation | or_true_constant |
69.4 µs | 16.1 µs | ×4.3 |
| ⚡ | Simulation | and_false_constant |
69.5 µs | 17.5 µs | ×4 |
| ⚡ | Simulation | or_bool_constant |
69 µs | 33.9 µs | ×2 |
| ⚡ | Simulation | or_null_constant |
92.3 µs | 49.8 µs | +85.23% |
| ⚡ | Simulation | or_null_constant_aligned |
92.4 µs | 49.9 µs | +85.16% |
| ⚡ | Simulation | and_null_constant |
92.5 µs | 50.1 µs | +84.71% |
| ⚡ | Simulation | and_null_constant_aligned |
92.5 µs | 50.2 µs | +84.18% |
| ⚡ | Simulation | chunked_bool_canonical_into[(1000, 10)] |
35.3 µs | 20.3 µs | +73.73% |
| ⚡ | Simulation | and_null_constant_shifted |
99.4 µs | 69.5 µs | +43.06% |
| ⚡ | Simulation | or_null_constant_shifted |
98.6 µs | 69.3 µs | +42.36% |
| ⚡ | Simulation | or_bool_nonnull_arrays |
40.8 µs | 32.7 µs | +24.79% |
| ⚡ | Simulation | and_bool_nonnull_arrays |
40.9 µs | 33 µs | +24.12% |
| ⚡ | Simulation | or_bool_nullable_arrays_shifted |
108.2 µs | 90.1 µs | +20.06% |
| ... | ... | ... | ... | ... | ... |
ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.
Tip
Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.
Comparing ngates/kleene-bool-kernels (225076f) with develop (c2119d9)
|
Could the perf regression be due to alignment from this new boolean buffer allocated here? |
|
Is there list a? Do we have a grasp of the performance of this change (ignoring constant), which would likely a mistake on our part previously? I what would we expect here? |
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
be5e9d4 to
65cedb5
Compare
robert3005
left a comment
There was a problem hiding this comment.
I think apart form some weird structure that looks like refactor done half way this is good
| } | ||
|
|
||
| fn truthy_bit_buffer(array: ArrayView<'_, ByteBool>) -> BitBuffer { | ||
| BitBuffer::from_iter(array.truthy_bytes().iter().map(|byte| *byte != 0)) |
| let nullability = lhs.dtype().nullability() | rhs.dtype().nullability(); | ||
| let lhs_values = truthy_bit_buffer(lhs); | ||
|
|
||
| if let Some(rhs) = rhs.as_opt::<vortex_array::arrays::Constant>() { |
There was a problem hiding this comment.
Would be nice to import this
| } | ||
|
|
||
| /// Execute a Kleene boolean operation from boolean value bitmaps and validity values. | ||
| pub fn boolean_buffers( |
There was a problem hiding this comment.
How about we name it something better? I was a bit puzzled why these aren't methods on bitbuffer but these need validity as well. How about having kleene_or/kleene_and? kleene_boolean?
| let values = match operator { | ||
| Operator::And => lhs_values & &rhs_values, | ||
| Operator::Or => lhs_values | &rhs_values, | ||
| other => return Err(vortex_err!("Not a boolean operator: {other}")), |
| } | ||
|
|
||
| /// Execute a Kleene boolean operation between boolean value bits and a scalar. | ||
| pub fn boolean_buffer_scalar( |
There was a problem hiding this comment.
this is no longer used?
There was a problem hiding this comment.
No, it's used. Should we replace this with constant_array_boolean? Am I missing something?
| } | ||
| } | ||
|
|
||
| fn fused_boolean_word_sources( |
There was a problem hiding this comment.
should this be two methods? There's little logic shared between the two branches
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Next in our list of Arrow dependencies to remove