-
Notifications
You must be signed in to change notification settings - Fork 1.9k
perf: Optimize hash joins with an empty build side #16716
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
13be88a
7498e75
2742fcd
00b5425
0e14e5f
d000bb1
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -36,12 +36,13 @@ pub use super::join_filter::JoinFilter; | |
| pub use super::join_hash_map::JoinHashMapType; | ||
| pub use crate::joins::{JoinOn, JoinOnRef}; | ||
|
|
||
| use arrow::array::BooleanArray; | ||
| use arrow::array::{ | ||
| builder::UInt64Builder, downcast_array, new_null_array, Array, ArrowPrimitiveType, | ||
| BooleanBufferBuilder, NativeAdapter, PrimitiveArray, RecordBatch, RecordBatchOptions, | ||
| UInt32Array, UInt32Builder, UInt64Array, | ||
| }; | ||
| use arrow::buffer::NullBuffer; | ||
| use arrow::buffer::{BooleanBuffer, NullBuffer}; | ||
| use arrow::compute; | ||
| use arrow::datatypes::{ | ||
| ArrowNativeType, Field, Schema, SchemaBuilder, UInt32Type, UInt64Type, | ||
|
|
@@ -928,6 +929,55 @@ pub(crate) fn build_batch_from_indices( | |
| Ok(RecordBatch::try_new(Arc::new(schema.clone()), columns)?) | ||
| } | ||
|
|
||
| /// Returns a new [RecordBatch] resulting of a join where the build/left side is empty. | ||
| /// The resulting batch has [Schema] `schema`. | ||
| pub(crate) fn build_batch_empty_build_side( | ||
| schema: &Schema, | ||
| build_batch: &RecordBatch, | ||
| probe_batch: &RecordBatch, | ||
| column_indices: &[ColumnIndex], | ||
| join_type: JoinType, | ||
| ) -> Result<RecordBatch> { | ||
| match join_type { | ||
| // these join types only return data if the left side is not empty, so we return an | ||
| // empty RecordBatch | ||
| JoinType::Inner | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. LGTM, how about
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Cross joins with an empty relation already appear to run well in the Here is the
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes this makes sense, cross join is not a join type that would go through creating hash table
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thinking about this, I think a more generic version of this would be switching small left sides (e.g < 10 rows) to using cross join 🤔
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Is this including for equijoin conditions? I think the performance seemed slow when there was a larger right table for doing this with nested loop join which follows a similar algorithm. It is probably a memory issue due to the cartesian product.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it should be relatively fast to do a cross join / NLJ instead of a hash join for those cases, but of course depends how the nested loop join is implemented, probably there is more room for optimization of the nested loop join.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was thinking of opening a proposal to make nested loop join faster, there are definitely some issues to work on there. I'll try to get to that when I have the time |
||
| | JoinType::Left | ||
| | JoinType::LeftSemi | ||
| | JoinType::RightSemi | ||
| | JoinType::LeftAnti | ||
| | JoinType::LeftMark => Ok(RecordBatch::new_empty(Arc::new(schema.clone()))), | ||
|
|
||
| // the remaining joins will return data for the right columns and null for the left ones | ||
| JoinType::Right | JoinType::Full | JoinType::RightAnti | JoinType::RightMark => { | ||
| let num_rows = probe_batch.num_rows(); | ||
| let mut columns: Vec<Arc<dyn Array>> = | ||
| Vec::with_capacity(schema.fields().len()); | ||
|
|
||
| for column_index in column_indices { | ||
| let array = match column_index.side { | ||
| // left -> null array | ||
| JoinSide::Left => new_null_array( | ||
| build_batch.column(column_index.index).data_type(), | ||
| num_rows, | ||
| ), | ||
| // right -> respective right array | ||
| JoinSide::Right => Arc::clone(probe_batch.column(column_index.index)), | ||
| // right mark -> unset boolean array as there are no matches on the left side | ||
| JoinSide::None => Arc::new(BooleanArray::new( | ||
| BooleanBuffer::new_unset(num_rows), | ||
| None, | ||
| )), | ||
| }; | ||
|
|
||
| columns.push(array); | ||
| } | ||
|
|
||
| Ok(RecordBatch::try_new(Arc::new(schema.clone()), columns)?) | ||
| } | ||
| } | ||
| } | ||
|
|
||
| /// The input is the matched indices for left and right and | ||
| /// adjust the indices according to the join type | ||
| pub(crate) fn adjust_indices_by_join_type( | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we would check the left side being empty before retrieving probe batches, we could also remove hash repartition 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can do this in a follow up pr wdyt @nuno-faria?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so. Can you point out where the probe repartition is being triggered? In the
process_probe_batchitself I think we can also skip creating the hashes when the build side is empty, but I measured and it didn't have a relatively big impact on performance.