Skip to content

reduction of libc requirement, more direct memory calls#1589

Closed
ryancinsight wants to merge 2 commits into
PyO3:mainfrom
ryancinsight:main
Closed

reduction of libc requirement, more direct memory calls#1589
ryancinsight wants to merge 2 commits into
PyO3:mainfrom
ryancinsight:main

Conversation

@ryancinsight

Copy link
Copy Markdown

reduction of libc requirements, but libc can still be used as a feature, more direct memory calls and reduces compilation by 1 second.

@birkenfeld

Copy link
Copy Markdown
Member

I don't understand this change. Is it really introducing all that conditional compilation complexity to save 1 second of compile time? I don't think it is questionable to use libc in a crate that interfaces with a codebase implemented in C.

BTW, equating usize and size_t is questionable, as the equivalent C type is uintptr_t. bindgen stopped doing it in rust-lang/rust-bindgen#1688

@ryancinsight

Copy link
Copy Markdown
Author

We can remove the conditionals but I thought there may be some that would like to use it.

In terms of usize being size_t, libc does not avoid this, the libc crate has type size_t = usize; so why mask it with libc::size_t.

We switched to just using usize on mimalloc-Rs and snmalloc-Rs because libc adds significant complication time and it doesn't make sense to go from C size_t to libc crate and renaming usize to size_t and every subsequent call to this aliased size_t to go back through libc crate because usize was renamed size_t.

@ryancinsight

Copy link
Copy Markdown
Author

If you want to see what libc crate does for other types than look at this https://docs.rs/libc/0.2.94/src/libc/unix/mod.rs.html#19:

and you will find:
pub type size_t = usize;
pub type ptrdiff_t = isize;
pub type intptr_t = isize;
pub type uintptr_t = usize;
pub type ssize_t = isize;

If you are ok with that than use it as a feature but some like to reduce dependencies, especially those that don’t seem to add much benefit besides changing name.

@birkenfeld

Copy link
Copy Markdown
Member

So which is it, "significant compilation time" or "1 second"? If you want to optimize compilation time of PyO3, this is not the place to start.

I grant that libc is setting size_t = usize, but should that become problematic for some platform libc is the proper place to fix it for all, not in every individual crate. And size_t is not the only definition you're copying-pasting here.

@ryancinsight

Copy link
Copy Markdown
Author

For snmalloc-rs it was 6-7 seconds, from 27 seconds to 21 seconds. The simple allocation tests showed a 0.1 to 0.05 second reduction.

I also made the change to pyoxidizer and was suggested that a transition be done for here for consistency in future with its transition from rust-cpython. Currently, on all rust supported platforms size_t is usize along with ssize_t being isize. If you find the libc crate so important why do I see so many instances of you using std::os::raw calls.

Furthermore, what am I copy-pasting? I in no means change the definition or name of anything, I'm using the primitives directly. For wchar_t I looked them up based on the platform this could be updated in the future when platforms use u64 but currently non supported do so why use a library where unneeded. This is why I set as feature to revert to libc if it is truly needed. The reverse of not using libc as a feature could be implemented that way instead if preferred.

Comment thread Cargo.toml Outdated
@davidhewitt

Copy link
Copy Markdown
Member

Thank you for authoring this PR.

I have to admit that I'm also not enthusiastic about it. In my view not using libc is a soundness risk: I don't know the correct sizes of all types on all platforms Python can be compiled on. Getting this wrong on any platform can violate the unsafe ffi going on here: at best it'll cause an immediate and noisy crash, at worst it'll introduce security flaws.

With this in mind, I'm unconvinced that giving users such a footgun in pursuit of a compile time improvement is worth the risk.

In addition, I think that the changes we yet want to make to PyO3 are likely to have significant impact on compilation times (either way), so adding complexity like this doesn't seem like it's going to have noticeable long term impact other than increasing maintainence overhead.

May I ask why the compilation time of the libc dependency is a significant problem for you? Cargo will cache compilation of libc between multiple runs, and with re-use of the cargo cache in CI environments (e.g.) you can avoid recompilation there too. In addition, libc is a very common dependency; I would think most Rust projects beyond a certain scale will have libc in their dependency graph multiple times.

@ryancinsight

ryancinsight commented May 3, 2021

Copy link
Copy Markdown
Author

Hi David, my issue with libc is that it is being used more often than it probably should be, there are many examples where it is great like when the various OS have diverse calls to the system. However, the memory ones don't seem to vary much and often create a false sense that they are safe, the python versions and operating systems you support now are not going to be changing what I set. Most are concerned about new architectures like CHERI that currently aren't even supported or have limited support by rust. All I said is that it causes significant addition of compilation time on the memory allocators which tend to be small on dependencies in general. I tend test the allocators with pyoxidizer to see which one will best suit my needs and tasks. I also prefer to keep the dependencies lower where needed and the fact I see performance changes in allocation tests just for calling another library that at the moment only changes the names of the rust primitives with a type definition is more of a pet peeve.

However, I believe this crate should choose to use one or the other and if possible pursue not having the rust standard library. libc can do all of the calls that you are doing with std::os::raw and by not using default features this will not use libc. Otherwise, you can change you std::os::raw to use core with nostd flag and remove calls to libc.

In the end, I have already run all the tests you created and passed except one of docs because I forgot that the primitive isize didn't have the min function so I reverted that to the std call with forced-push.

Its up to you but it is strange, to me at least, to choose to use std::os::raw and libc, typically it's preferred to use rust until you can't justify and then use libc for everything if libc has to be used

#[cfg(not(feature = "libc"))]
#[allow(non_camel_case_types)]
#[cfg(target_family = "unix")]
type wchar_t = i32;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing that out, I can't say I've looked at that architecture before. Not sure there is enough interested to make changes though so I'll just close ;)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, so it looks like the non-arch specific unix types are mostly here, so those are probably reasonably safe to use with this technique, but stuff like wchar_t appears to be more complex as you'd want to scope them to specific architectures like target_arch = "x86_64" or something:
https://github.com/rust-lang/libc/blob/0.2.97/src/unix/mod.rs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants