ARROW-11086: [Rust] Extend take implementation to more index types#9057

Closed
Dandandan wants to merge 2 commits into apache:master from Dandandan:take_index

@Dandandan

Dandandan commented Dec 31, 2020

Contributor

Context

The context of this PR is that I want to experiment with a simplified implementation of the hash join in DataFusion that can index directly into the build-side array instead of keeping a list of batches. This array could grow beyond 2^32 (about 4.29 billion) elements, so it would need indexes of type UInt64 rather than UInt32.

Implementation

In this PR I just extend the public take to accept any IndexType that implements ArrowNumericType and ToPrimitive.
I am not sure what the earlier consideration was for restricting take to only UInt32Array.
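
To illustrate the idea of a take that is generic over the index type, here is a minimal sketch on plain slices. It is not the code from this PR: `take_values` is a hypothetical name, `Option` stands in for Arrow's null handling, and the standard library's `TryInto<usize>` stands in for the `ToPrimitive` bound mentioned above.

```rust
use std::convert::TryInto;

/// Hypothetical, simplified "take": gathers `values[i]` for every index `i`.
/// `None` indices model null entries and produce `None` in the output.
/// The index type `I` is generic, so u32 and u64 indices both work.
fn take_values<T, I>(values: &[T], indices: &[Option<I>]) -> Result<Vec<Option<T>>, String>
where
    T: Clone,
    I: Copy + TryInto<usize>,
{
    let mut out = Vec::with_capacity(indices.len());
    for idx in indices {
        match idx {
            // A null index yields a null output slot.
            None => out.push(None),
            Some(i) => {
                // Convert the index to usize; this fails for negative values
                // or values too large for the platform's usize.
                let pos: usize = (*i)
                    .try_into()
                    .map_err(|_| "index does not fit in usize".to_string())?;
                // Bounds-check instead of panicking on a bad index.
                let value = values.get(pos).ok_or_else(|| {
                    format!("index {} out of bounds for length {}", pos, values.len())
                })?;
                out.push(Some(value.clone()));
            }
        }
    }
    Ok(out)
}
```

Calling `take_values(&values, &indices)` with `indices` of type `Vec<Option<u64>>` or `Vec<Option<u32>>` goes through the same function, which is the property the hash join experiment needs once the build side can exceed the UInt32 range.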

@alamb

alamb commented Dec 31, 2020

Contributor

The full set of Rust CI tests did not run on this PR :(

Can you please rebase this PR against apache/master to pick up the changes in #9056 so that they do?

I apologize for the inconvenience.

@Dandandan

Contributor Author

Rebased

@codecov-io

codecov-io commented Dec 31, 2020


Codecov Report

Merging #9057 (571bd65) into master (4b7cdcb) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #9057   +/-   ##
=======================================
  Coverage   82.61%   82.61%           
=======================================
  Files         203      204    +1     
  Lines       50140    50140           
=======================================
  Hits        41422    41422           
  Misses       8718     8718           
| Impacted Files | Coverage Δ |
|---|---|
| rust/arrow/src/compute/kernels/take.rs | 95.21% <100.00%> (ø) |
| rust/datafusion/src/physical_plan/hash_join.rs | 89.53% <100.00%> (ø) |
| rust/arrow/src/csv/writer.rs | 78.82% <0.00%> (-0.56%) ⬇️ |
| rust/arrow/src/util/serialization.rs | 100.00% <0.00%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4b7cdcb...571bd65.

@alamb alamb left a comment

Contributor

Looks good to me

@alamb

alamb commented Jan 1, 2021

Contributor

The clippy failures in https://github.com/apache/arrow/pull/9057/checks?check_run_id=1630788725 seem unrelated to your change -- let me check that out...

@alamb

alamb commented Jan 1, 2021

Contributor

Ah, I hadn't yet seen #9061 which appears to fix the clippy errors -- thanks @Dandandan !

@Dandandan

Contributor Author

FYI @jorgecarleitao @andygrove this (small) PR is needed to finish #9070

@andygrove andygrove left a comment

Member

LGTM

@jorgecarleitao

Member

Clippy missing xD

@Dandandan

Contributor Author

@jorgecarleitao restarted the CI 😄
