Skip to content

GH-43541: [C++] Check accepted device allocation types before executing kernel#43542

Closed
felipecrv wants to merge 11 commits into
apache:mainfrom
felipecrv:check_device_type
Closed

GH-43541: [C++] Check accepted device allocation types before executing kernel#43542
felipecrv wants to merge 11 commits into
apache:mainfrom
felipecrv:check_device_type

Conversation

@felipecrv

@felipecrv felipecrv commented Aug 2, 2024

Copy link
Copy Markdown
Contributor

Rationale for this change

Kernels shouldn't segfault when executing against GPU-allocated (or any other non-CPU device) Arrays.

What changes are included in this PR?

  • A way of declaring which device allocation types a kernel can handle per parameter
  • Removal of some of the checks from the Python code
  • Making drop_null callable when array is CUDA-allocated

Are these changes tested?

Tested from the Python tests.

Are there any user-facing changes?

New member functions to some classes.

@github-actions

github-actions Bot commented Aug 2, 2024

Copy link
Copy Markdown

⚠️ GitHub issue #43541 has been automatically assigned in GitHub to PR creator.

Comment thread cpp/src/arrow/chunked_array.h Outdated
@github-actions github-actions Bot added awaiting committer review Awaiting committer review awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels Aug 2, 2024
Comment thread cpp/src/arrow/compute/kernel.cc Outdated
Comment on lines 438 to 440

@felipecrv felipecrv Aug 2, 2024

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can probably handle these even though they are not passed here.

Comment thread cpp/src/arrow/compute/kernel.cc Outdated

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will probably remove this DCHECK and keep the ones in KernelSignature::MatchesDeviceAllocationTypes.

@github-actions github-actions Bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Aug 2, 2024
@felipecrv felipecrv marked this pull request as ready for review August 27, 2024 00:32
@github-actions github-actions Bot added awaiting changes Awaiting changes and removed awaiting change review Awaiting change review labels Aug 27, 2024
@felipecrv felipecrv requested a review from danepitkin August 27, 2024 00:35
@felipecrv

Copy link
Copy Markdown
Contributor Author

I know what is wrong with drop_null (failing in tests), but I'm still deciding the best way to fix it.

Comment thread python/pyarrow/array.pxi Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect this one to not change (not using compute kernels under the hood)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handled by __iter__ already. Isn't that what's called in the for x in self sugar?

@felipecrv felipecrv requested a review from bkietz August 27, 2024 13:54

@bkietz bkietz left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general this looks good but I'm not convinced about the changes to InputType; I don't know of any kernels which take non-CPU data so it seems more reasonable to leave the InputType refactor for a follow up. On the other hand, Users may have implemented their own kernels which do operate on non CPU data in which case this change to InputType is breaking (since those kernels were not defined with the new constructor and default to gracefully rejecting non-CPU data). I'll have to think on this some more

Comment thread cpp/src/arrow/device_allocation_type.h Outdated
Comment on lines 61 to 69

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (int i = 1; i <= kDeviceAllocationTypeMax; i++) {
if (device_type_bitset_.test(i)) {
// Skip all the unused values in the enum.
switch (i) {
case 0:
case 5:
case 6:
continue;
}
for (int i : {
DeviceAllocationType::kCPU,
DeviceAllocationType::kCUDA,
DeviceAllocationType::kCUDA_HOST,
DeviceAllocationType::kOPENCL,
DeviceAllocationType::kVULKAN,
DeviceAllocationType::kMETAL,
DeviceAllocationType::kVPI,
DeviceAllocationType::kROCM,
DeviceAllocationType::kROCM_HOST,
DeviceAllocationType::kEXT_DEV,
DeviceAllocationType::kCUDA_MANAGED,
DeviceAllocationType::kONEAPI,
DeviceAllocationType::kWEBGPU,
DeviceAllocationType::kHEXAGON,
}) {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work when another enum entry is added. Mine will work and I put the kDeviceAllocationTypeMax right after the enum so people remember to not leave gaps.

Comment thread cpp/src/arrow/chunked_array.h Outdated
@github-actions github-actions Bot added awaiting change review Awaiting change review awaiting changes Awaiting changes and removed awaiting changes Awaiting changes awaiting change review Awaiting change review labels Aug 27, 2024
@felipecrv

felipecrv commented Aug 27, 2024

Copy link
Copy Markdown
Contributor Author

In general this looks good but I'm not convinced about the changes to InputType; I don't know of any kernels which take non-CPU data so it seems more reasonable to leave the InputType refactor for a follow up.

@bkietz I can leave it to a follow up, but that would force me to put ad-hoc device type checks in the kernel dispatching code instead of doing it in a systematic way.

On the other hand, Users may have implemented their own kernels which do operate on non CPU data in which case this change to InputType is breaking (since those kernels were not defined with the new constructor and default to gracefully rejecting non-CPU data). I'll have to think on this some more

These users were already on thin ice: all it takes is a GetNullCount() call in kernel dispatching logic to crash. The bright side is that I already give them a chance to avoid the check by passing the device types in the type matcher. This is why I didn't go for definitive "is cpu" checks and instead allowed extension of the set.

@felipecrv

Copy link
Copy Markdown
Contributor Author

@bkietz I extracted the basics from this PR into a new PR -> #43853

@apache apache deleted a comment from github-actions Bot Aug 27, 2024

@danepitkin danepitkin left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyArrow bits LGTM!

Comment thread cpp/src/arrow/chunked_array.h Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth cacheing this after the first execution?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might cache the entire set on the constructor. This check is very cheap, so I wouldn't cache in the C++ layer. The set is represented as a 64-bit word under the hood. It's very small.

@github-actions github-actions Bot added awaiting changes Awaiting changes and removed awaiting change review Awaiting change review labels Aug 27, 2024
@felipecrv

Copy link
Copy Markdown
Contributor Author

@danepitkin after @bkietz I extracted some of the commits from here into a smaller PR that only adds the device_types() methods, doesn't constrain kernel execution, and doesn't touch the python code: #43853

@github-actions

Copy link
Copy Markdown

Thank you for your contribution. Unfortunately, this pull request has been marked as stale because it has had no activity in the past 365 days. Please remove the stale label or comment below, or this PR will be closed in 14 days. Feel free to re-open this if it has been closed in error. If you do not have repository permissions to reopen the PR, please tag a maintainer.

@github-actions github-actions Bot added the Status: stale-warning Issues and PRs flagged as stale which are due to be closed if no indication otherwise label Nov 18, 2025
@github-actions github-actions Bot closed this Feb 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting change review Awaiting change review Component: C++ Component: Python Status: stale-warning Issues and PRs flagged as stale which are due to be closed if no indication otherwise

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants