Join APIs that return gathermaps #7454
Merged
rapids-bot merged 183 commits into rapidsai:branch-0.19 from shwina:gathermap-based-join-apis on Mar 30, 2021
4a4b4af
Merge branch 'branch-0.17' into branch-0.18
shwina 223f2b5
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into b…
shwina abd6ad2
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into b…
shwina 18863b5
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into b…
shwina 0fbdd31
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into b…
shwina dc9b943
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into b…
shwina d586aa7
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into b…
shwina 996fda8
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into b…
shwina 2808a5c
Add a compute_hash_join_indices that returns just the join indices
shwina ef0baee
Don't need common_columns stuff for join that returns a gathermap
shwina 18f3074
Add hash_join_impl methods that return gathermaps
shwina 70abf48
Add overloads to public hash_join class
shwina 13dff67
Add top-level join APIs that return gathermaps
shwina 3300fe1
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into g…
shwina 7ed694c
Use device_uvector instead of device_vector in join
shwina 636c2ea
Undo some API changes
shwina b79da68
Add join_result
shwina 380aa59
Add APIs that return join_result
shwina 3cbb2b4
Remove column_in_common
shwina 53ae7c9
Add an inner join API that returns gathermaps
shwina fde172b
Add remaining APIs to return gathermaps
shwina 4a286dd
Add gathermap join test
shwina c756db9
Replace -1 with INT_MIN
shwina 6a3d23e
Make join_result columns instead of column_views
shwina 5dfc2a0
Replace join_result with a pair of columns
shwina 362829b
Add gathermap test for outer join
shwina 4e4380c
Add and pass full join gathermap test
shwina 339a13d
Begin Python-side refactor
shwina 2b07802
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into g…
shwina 0d5a19c
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into g…
shwina fdbdc12
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into g…
shwina 5dd5d29
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into g…
shwina 6b20429
Merge branch 'branch-0.19' into gathermap-based-join-apis
shwina 044eac1
Add left_semi and left_anti join APIs that return gathermaps
shwina 555d5ec
Add Cython bindings
shwina 56ae616
full -> outer
shwina dd05121
Merge branch 'branch-0.19' of https://github.com/rapidsai/cudf into g…
shwina d447924
Progress
shwina 484512e
More progress on py refactor
shwina 5227582
Remove breakpoint
shwina 9cd870e
Fix neg index handling
shwina 8e4f193
Use nullify gather in join
shwina 29fe140
Handle outer joins better
shwina b634055
Fix index construction
shwina cd53d6c
Fix sorting behaviour
shwina 75f1efd
Fix Index.join
shwina 1f5d6ad
Progress on semi/anti joins
shwina de30520
Add simple join test
shwina 66a0de5
Semi-join fix
shwina ca72295
Only combine key columns in outer join if they have the same name
shwina ee2242d
Handle when both _on and _index are provided
shwina e531725
Fix sorting join result
shwina c8b4948
Merge branch 'branch-0.19' of https://github.com/rapidsai/cudf into g…
shwina 674095c
whitespace
shwina cbd9dc3
Make construct_join_output_df work with column views
shwina 3f3c3cb
Get rid of hash_join::left_join
shwina 01415fc
More join C++ cleanup
shwina 6185492
Even more cleaning
shwina d736d1c
More join tests
shwina b58591d
Fix all join tests
shwina be560bb
Python regressions
shwina efb60d6
Revert
shwina fe6d0b8
Invalid -> Unkown
shwina 547027c
Don't mutate lhs/rhs
shwina 5f93d23
Fix join tests
shwina b7bf821
Fix semi/anti join trivial cases
shwina 50a2fb2
When testing join results, use a helper that sorts values
shwina ff0ae79
Totally broken commit
shwina 07cd052
Cleanup
shwina bd6bf77
Warnings
shwina a40063e
Cleanup
shwina ccef9d0
Cleanup
shwina 210244b
Cleanup
shwina b57348c
Add typing for join helpers
shwina 5c2c9b3
Typing for Join class
shwina 558aa15
Simplify joiner API
shwina 3184896
Example doc
shwina d3535dc
Refactor join APIs to return a device_uvector
shwina 3b0a2a5
Merge tag 'branch-0.19-latest' of https://github.com/rapidsai/cudf in…
shwina b82181d
docs
shwina 77d2bfd
Finish up docs?
shwina 0bf34e8
Merge branch 'branch-0.19' of https://github.com/rapidsai/cudf into g…
shwina 26a3fb0
Fix join tests
shwina 8a60d62
Refactor join APIs to work with unique_ptr<rmm::device_uvector>>
shwina 387a953
Update join Cython
shwina 6cd6433
Need to resize the gathermap
shwina c67dcce
Doc
shwina 30c22ed
Changelog
shwina f73199d
Add helper to convert gather_map_type->Column
shwina 393c06a
Update python/cudf/cudf/core/frame.py
shwina e91f554
Cannot specify both column and index
shwina 0185896
Vaildate how
shwina b232f85
Merge branch 'gathermap-based-join-apis' of github.com:shwina/cudf in…
shwina 1eb495d
Can't use a set
shwina 4f1f072
Avoid function local import
shwina 4aa8fec
False -> NotImplementedError
shwina ae0e5f9
Update cpp/include/cudf/join.hpp
shwina f47cf7e
Reuse some join logic
shwina 2a201c3
Merge branch 'gathermap-based-join-apis' of github.com:shwina/cudf in…
shwina 230ca08
Formatting
shwina 498a621
Update cpp/include/cudf/join.hpp
shwina 2de26f3
Docs?
shwina d6f128c
Merge branch 'gathermap-based-join-apis' of github.com:shwina/cudf in…
shwina b7d8d8a
Use mr
shwina 9efc761
Docs
shwina 8779bc7
Simplify suffix handling
shwina 4c651ac
Simplify joiner requirements
shwina b4f4d7c
Do less work in SemiJoin._merge_results
shwina d353c92
Doc
shwina 580a346
Doc
shwina 328dafd
Return None from semi_join
shwina 297d20a
Init common_type
shwina e388dd6
Merge branch 'branch-0.19' of https://github.com/rapidsai/cudf into g…
shwina 935648b
Move validation directly into set_by_label and use a raw dict to stor…
vyasr 806a3ef
Remove all references to OrderedColumnDict.
vyasr 40a7b17
Move validation to separate method and use in both set_by_label and c…
vyasr a1c576e
Format with black.
vyasr 788d9d6
Expose parameter to make validation optional.
vyasr 6a64285
Coerce constructor input to dict before calling items.
vyasr e7d0981
Make construction safe.
vyasr c39932c
Final cleanup and documentation.
vyasr 4ff09fc
Address style issues.
vyasr 35c63ec
Merge branch 'branch-0.19' of https://github.com/rapidsai/cudf into g…
shwina 9433582
Merge branch 'branch-0.19' of https://github.com/rapidsai/cudf into f…
shwina 74f2884
Merge remote-tracking branch 'origin/branch-0.19' into feature/optimi…
vyasr 0178127
CA fix
shwina 5c0f202
Merge branch 'feature/optimize_accessor_copy' into join-bench
shwina c8d2364
Don't validate on gathers
shwina efea63d
Prioritize numeric columns
shwina 898a3d8
Merge branch 'feature/optimize_accessor_copy' into join-bench
shwina c3b6444
Lazily compute and delete column length on demand.
vyasr 01b2cf5
Remove redundant clear cache in setitem.
vyasr 8899258
Remove mypy annotation for column length.
vyasr c6cd415
Optimize casting logic
shwina 3507785
Merge branch 'feature/optimize_accessor_copy' of github.com:vyasr/cud…
shwina 7f8e1cd
Undo
shwina f2e4609
Don't validate when copying type metadata
shwina 5d378c2
Merge branch 'feature/optimize_accessor_copy' into join-bench
shwina 83cc407
ImportError
shwina 72598fb
Prioritize numeric dtypes in is_numerical_dtype
shwina fa220b6
Add unsafe CA ctor
shwina 6572cd3
Merge branch 'feature/optimize_accessor_copy' into join-bench
shwina f7dc417
Revert "Prioritize numeric dtypes in is_numerical_dtype"
shwina 3760077
Revert "Prioritize numeric dtypes in is_numerical_dtype"
shwina 01cdfcf
Merge branch 'feature/optimize_accessor_copy' into join-bench
shwina de9ca28
Change error message back so that tests pass.
vyasr e35d03b
Faster is_numerical_dtype
shwina e2fd533
Faster is_numerical_dtype
shwina 9044d62
Merge branch 'feature/optimize_accessor_copy' into join-bench
shwina 64ca702
Even faster is_numerical_dtype
shwina 749edf1
Enable fast path for constructing a Buffer from a DeviceBuffer
shwina 7526e4a
Merge branch 'feature/optimize_accessor_copy' into join-bench
shwina ca772b8
Small fix
shwina 739ec57
Add validation option to insert and standardize error message.
vyasr 498b70e
Fix style.
vyasr 3cd012b
Merge remote-tracking branch 'vyasr/feature/optimize_accessor_copy' i…
shwina 660afa6
Merge branch 'various-py-optimizations' into join-bench
shwina f8ac22f
Merge branch 'gathermap-based-join-apis' into join-bench
shwina c28866c
Merge branch 'branch-0.19' of https://github.com/rapidsai/cudf into v…
shwina 01e13fa
Undo formatting change
shwina 89a0301
Add TODO
shwina 26f4cc8
Merge branch 'branch-0.19' of https://github.com/rapidsai/cudf into g…
shwina f2036eb
Merge branch 'various-py-optimizations' into join-bench
shwina 5e73de7
init->create + doc
shwina e0c50b5
Merge branch 'various-py-optimizations' into gathermap-based-join-apis
shwina fa880c1
Merge branch 'branch-0.19' of https://github.com/rapidsai/cudf into g…
shwina 58bdecd
Merge branch 'branch-0.19' of https://github.com/rapidsai/cudf into g…
shwina ed1b434
Merge branch 'join-bench' into gathermap-based-join-apis
shwina ca116a3
Only gather the index if necessary
shwina ce03918
Don't copy type metadata for the index unless we need to
shwina b7c6b19
Use validate=False in a few more places
shwina 671a0e0
Import
shwina 797087b
Review
shwina 5ad531f
Coerce to tuple first
shwina f7e94fb
Replace hasattr with isinstance
shwina 1cb9448
Handle renamed indexes
shwina cc89360
Fix to names setter
shwina 4ca1238
Merge branch 'branch-0.19' of https://github.com/rapidsai/cudf into g…
shwina 9cebf2e
Update cpp/src/join/hash_join.cu
shwina 1584b86
Better example
shwina 3977b79
Remove std::moves
shwina 67919a3
Merge branch 'gathermap-based-join-apis' of github.com:shwina/cudf in…
shwina 7bf6561
Fix formatting error
shwina
Add Cython bindings
commit 555d5ec5ad9ca04142e8b1c6a9448637f9d900e8