Skip to content

ARROW-7813: [Rust] Remove and fix unsafe code#6395

Closed
Marwes wants to merge 20 commits into
apache:masterfrom
Marwes:unsafe_rust_2
Closed

ARROW-7813: [Rust] Remove and fix unsafe code#6395
Marwes wants to merge 20 commits into
apache:masterfrom
Marwes:unsafe_rust_2

Conversation

@Marwes

@Marwes Marwes commented Feb 10, 2020

Copy link
Copy Markdown

This removes or corrects many instances of unsafe code in the rust crates. This is by no means a complete fix and some of the fixes do not entirely fix the issues with the particular unsafe (see comments), but in all instances it should put the code in a better place than before.

Based on #6256

@github-actions

Copy link
Copy Markdown

@Marwes

Marwes commented Feb 13, 2020

Copy link
Copy Markdown
Author

Removed the commits that changed the simd implemented operators. The code is definitely doing undefined behaviour though as the padding won't be initialized ( #6397 (comment) ).

So either the code could do non-simd operations for the last partial block (if any). Or the padding should be initialized before reading it.

@paddyhoran

Copy link
Copy Markdown
Contributor

So either the code could do non-simd operations for the last partial block (if any). Or the padding should be initialized before reading it.

I plan to initialize the padded region, ARROW-7836 if no-one beats me to it.

@paddyhoran

Copy link
Copy Markdown
Contributor

I'll try and review this tomorrow so we can get it merged.

@sunchao

sunchao commented Feb 15, 2020

Copy link
Copy Markdown
Member

Will take a look too. This needs rebase BTW. And cc @sadikovi .

@paddyhoran

Copy link
Copy Markdown
Contributor

Ping @Marwes can you re-base this so we can review if you get a chance please?

@andygrove

Copy link
Copy Markdown
Member

@Marwes Could you rebase this ?

@Marwes

Marwes commented Apr 1, 2020

Copy link
Copy Markdown
Author

Rebased, CI is failing though. Not sure what to do to fix.

@paddyhoran

Copy link
Copy Markdown
Contributor

This was fixed by #6800. Let me see if I can re-trigger CI.

@paddyhoran

Copy link
Copy Markdown
Contributor

@kszucs how can I re-trigger CI? I tried to force-push to this branch but it has no effect.

@wesm

wesm commented Apr 4, 2020

Copy link
Copy Markdown
Member

Not sure what happened with GitHub (they have been having outage issues?) but I rebased and CI is running again

@nevi-me

nevi-me commented Apr 4, 2020

Copy link
Copy Markdown
Contributor

CI is failing with unnecessary unsafe blocks on parquet\src\arrow\array_reader.rs

@paddyhoran

Copy link
Copy Markdown
Contributor

@Marwes looks like just some basic issues with this, want to take a quick look? I'd like to get this one merged.

@bluss

bluss commented Apr 12, 2020

Copy link
Copy Markdown

(I was "invited" to look at this PR, since I noticed some issues that needed fixing. All my comments will appear a bit random and semi-related to this PR, because of this. I will only comment on important issues.)

Nice, I had named 3 things I saw in arrow 0.16.0 code (not a full review)

Issue (1) about Buffer.ptr being null is being fixed here in this PR. Another common way to solve this in Rust is to use https://doc.rust-lang.org/std/ptr/struct.NonNull.html#method.dangling - an aligned non-null dummy pointer, but I won't review this solution further, Marwes can probably judge it all as well as I can. Using NonNull<T> is certainly an idea, too.
Issue (3) about making Buffer::from_raw_parts unsafe has been merged as a fix in an older PR.

Issue (2) so remains, no check for allocation failure. This produces dangerous mismatches - null pointer and nonzero length - in various places.

This doesn't have to be fixed in this PR, just to note that the issue is not fixed here.

Comment thread rust/arrow/src/buffer.rs Outdated

This comment was marked as resolved.

Comment thread rust/arrow/src/buffer.rs Outdated

This comment was marked as resolved.

Comment thread rust/arrow/src/buffer.rs Outdated

This comment was marked as resolved.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fix has disappeared, I can't see it in the diff anymore 😕

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened ARROW-8479

Markus Westerlind added 7 commits April 14, 2020 09:42
Might be other platforms this works on as well, but those could be added
as they are found
Fixes undefined behaviour that could occur from safe code calling with
`T == Box<i32>` etc.
This may cause panics in code using ByteArray or Int96 but no tests
currently test these paths. Still better than the status quo which would
be undefined behaviour (if the replaced code paths were hit).
@Marwes

Marwes commented Apr 14, 2020

Copy link
Copy Markdown
Author

Issue (1) about Buffer.ptr being null is being fixed here in this PR. Another common way to solve this in Rust is to use https://doc.rust-lang.org/std/ptr/struct.NonNull.html#method.dangling - an aligned non-null dummy pointer, but I won't review this solution further, Marwes can probably judge it all as well as I can. Using NonNull is certainly an idea, too.

That's the better solution. I just spotted the null pointer issue late in this PR and didn't want to complicate it further.

Fixed the review comments.

@bluss

bluss commented Apr 14, 2020

Copy link
Copy Markdown

Thank you @Marwes for doing big work like this for better and safer code

@paddyhoran paddyhoran left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Marwes for doing big work like this for better and safer code

Absolutely, thanks very much @Marwes! LGTM from Arrow (crate) perspective. @sunchao can you take a quick look at the changes to the Parquet crate please?

@wesm

wesm commented Apr 14, 2020

Copy link
Copy Markdown
Member

Would be great to get this into 0.17.0, probably needs to be merged today

cc @kszucs

@sunchao sunchao left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. I only had time to skim through this but it'd be great if we have some perf numbers to show there is no regression.

@kszucs

kszucs commented Apr 15, 2020

Copy link
Copy Markdown
Member

@sunchao Included in the 0.17 release. If you find a significant regression, then we can still fix it by cutting a new release candidate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants