rustc: Stop passing `--allow-undefined` on wasm targets #149868
rust-bors[bot] merged 1 commit into rust-lang:main
This commit updates how the linker is invoked on WebAssembly targets
(all of them) to avoid passing the `--allow-undefined` flag to the
linker. Historically, if I remember this correctly, when `wasm-ld` was
first integrated this was practically required because at the time it
was otherwise impossible to import a function from the host into a wasm
binary. Or, at least, I'm pretty sure that was why this was added.
At the time, as the documentation around this option indicates, it was
known that this was going to be a hazard. This doesn't match behavior on
native, for example, and can easily paper over what should be a linker
error with some sort of other obscure runtime error. An example is that
this program currently compiles and links; it just prints null:

    unsafe extern "C" {
        static nonexistent: u8;
    }

    fn main() {
        println!("{:?}", &raw const nonexistent);
    }

This can easily lead to mistakes like rust-lang/libc#4880 and defer what
should be a compile-time link error to weird or unusual behavior at
runtime. Additionally, in the intervening time since `wasm-ld` was first
introduced here, lots has changed and notably this program works as
expected:

    #[link(wasm_import_module = "host")]
    unsafe extern "C" {
        fn foo();
    }

    fn main() {
        unsafe {
            foo();
        }
    }

This continues to compile without error and the final wasm binary
indeed has an imported function from the host. What this change means,
however, is that the behavior of this program changes:

    unsafe extern "C" {
        fn foo();
    }

    fn main() {
        unsafe {
            foo();
        }
    }

This currently compiles successfully and emits an import from the `env`
module. After this change, however, it will fail to compile with a
link error stating that the `foo` symbol is not defined.
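
For code that relied on the old implicit behavior, one migration path is to name the import module explicitly. The following is a minimal sketch, assuming the host really does provide `foo` under the `env` module (the module that was previously chosen by default). The native stand-in and the `call_foo` wrapper are hypothetical and exist only so the snippet also builds on non-wasm targets:

```rust
// On wasm targets, name the import module explicitly so the linker treats
// `foo` as a host import instead of an undefined symbol.
#[cfg(target_family = "wasm")]
#[link(wasm_import_module = "env")]
unsafe extern "C" {
    fn foo();
}

// Hypothetical native stand-in (not part of the PR) so the sketch also
// builds and runs when not targeting wasm.
#[cfg(not(target_family = "wasm"))]
unsafe fn foo() {}

/// Hypothetical wrapper: calls `foo` and reports that the call returned.
pub fn call_foo() -> bool {
    // SAFETY: assumes `foo` takes no arguments and has no preconditions.
    unsafe { foo() };
    true
}
```

With the attribute in place the declaration no longer depends on the linker tolerating undefined symbols, which is the same mechanism the `host` example above uses.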
These commits modify compiler targets.
I'll clarify that process-wise I'm not exactly sure how to approach this. This is highly likely to break projects in practice but I don't know how to evaluate the quantity or breadth of breakage. If it's possible to do a wasm crater run or something like that I'd be happy to triage the results and/or send PRs to upstream projects. In the meantime though I wanted to get this on the radar as I should have done this years ago and it's long overdue that this change is made for wasm (my fault).
It should be possible to run Crater with a custom target (https://github.com/rust-lang/crater/blob/master/docs/bot-usage.md#specifying-toolchains). I'll kick off a try job with the right artifacts (I think) at least. I'm not sure how much that will actually be helpful though in terms of what % of crates are actually supposed to build and do build for wasm32.

One thought is that we could possibly phase this in for at least the wasi targets with the next preview (p4?) assuming that is coming down the line, or in theory with an edition of the leaf crate, but neither feels like an amazing option.

One other thought is that we could maybe add an FCW lint: run the linker without `--allow-undefined`, and if that fails re-run with it, and if that works then emit a lint? Not sure how reliable the failure is.

@bors try jobs=dist-x86_64-linux,dist-various-*
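
The retry-and-lint idea in the comment above can be sketched roughly as follows. This is only an illustration of the control flow, not how rustc invokes the linker: `run_linker`, `LinkOutcome` and the warning text are hypothetical names, and the closure stands in for an actual linker invocation that takes a flag saying whether `--allow-undefined` is passed.

```rust
/// Hypothetical result of the two-step link attempt.
#[derive(Debug, PartialEq)]
pub enum LinkOutcome {
    /// Linking succeeded without `--allow-undefined`.
    Clean,
    /// Linking only succeeded with `--allow-undefined`; carries the lint text.
    LintedFallback { warning: String },
    /// Linking failed both ways; carries the error from the strict attempt.
    Failed(String),
}

/// Try a strict link first, then fall back to the legacy flag and warn.
/// `run_linker(allow_undefined)` is an assumed stand-in for the real linker.
pub fn link_with_fallback<F>(mut run_linker: F) -> LinkOutcome
where
    F: FnMut(bool) -> Result<(), String>,
{
    match run_linker(false) {
        Ok(()) => LinkOutcome::Clean,
        Err(strict_err) => match run_linker(true) {
            // Only the undefined-symbol tolerance made the link succeed.
            Ok(()) => LinkOutcome::LintedFallback {
                warning: format!("linking relied on `--allow-undefined`: {strict_err}"),
            },
            // Unrelated failure: report the original error unchanged.
            Err(_) => LinkOutcome::Failed(strict_err),
        },
    }
}
```

The open question from the comment still applies to a sketch like this: it assumes a strict-link failure followed by a permissive-link success reliably indicates an undefined-symbol problem.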
rustc: Stop passing `--allow-undefined` on wasm targets
try-job: dist-x86_64-linux
try-job: dist-various-*
Oh nice! I'll attempt that here and see how it goes...

@craterbot run start=master#f5209000832c9d3bc29c91f4daef4ca9f28dc797+target=wasm32-wasip1 end=try#23647e694de8d0904848ad068b2e0ec2dd098c37+target=wasm32-wasip1 mode=build-only

Agreed! I figure we can await the crater results and see if that provides a good signal of which path to take here, as I don't feel I have a great gut feeling myself here.
👌 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem.
🚧 Experiment
🎉 Experiment
Ok going through some of the failures there. First thing I'm seeing is that Crater's dependency analysis won't work for this change because the build failure will always be at the end with some sort of linked artifact as opposed to when the dependency is compiled itself. For example with this report there are undefined symbols in the […]

The next thing I'm seeing is that Crater isn't actually configured for wasm cross-compilation in the sense that if a crate compiles C code that will have worked before by accident. Both before/after Crater is compiling C code with the native […]

With these combined I think unfortunately that the Crater report is too noisy to draw any information from. The real "regression" here in a sense is the risk of someone using […]

@Mark-Simulacrum is there perhaps historical precedent for approaching a crater report like this? Maybe not wasm-specific but one where it's a possibly-breaking change but crater isn't able to effectively diagnose the breadth/scale of the possible breakage.
Hm, I'm not sure. If there's something we can get the compiler to dump to its output (e.g., externs without a wasm import module annotation?) we can do analysis of that offline via the tarballs crater produces which contain all the output files (both for regressed and non-regressed crates).

In your example, how did "env" end up in the import with allow-undefined? Or does it end up as some kind of star module import?
Would it be possible to install wasi-sdk on the crater image that tests are built in? That'd probably be the best way to reduce the noise here because it would at least get properly configured scenarios back-to-working instead of accidentally looking like regressions.

The specific regression this PR might introduce is that, today, this code:

    unsafe extern "C" {
        fn foo();
    }

    fn main() {
        unsafe {
            foo();
        }
    }

introduces, in the final wasm binary, […]

What's happening here on crater is that crates have the definition above, […]
Probably. Crater uses the same images as docs.rs I believe, so that's a PR to https://github.com/rust-lang/crates-build-env.
It seems like the alternative (without requiring source modification to unblock) is for users to pass something like […]

Are there 'popular' wasm projects we could check against to see if their dependency trees have any of the implicit import business?

I think perhaps my question wasn't clear, but I'm wondering why this is […]
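
As one illustration of unblocking a build without touching the `extern` declarations, a package could ask Cargo to pass `--allow-undefined` to the linker again from its build script. This is a sketch under assumptions: the helper names are hypothetical, it relies on Cargo's `cargo:rustc-link-arg` build-script directive and the `CARGO_CFG_TARGET_FAMILY` variable, and it only affects the linked artifacts of the package that owns the build script.

```rust
/// Returns the build-script directives to print for a given
/// `CARGO_CFG_TARGET_FAMILY` value (a comma-separated list of families).
pub fn link_directives(target_family: &str) -> Vec<&'static str> {
    if target_family.split(',').any(|family| family == "wasm") {
        // Restore the pre-change linker behavior for this package only.
        vec!["cargo:rustc-link-arg=--allow-undefined"]
    } else {
        Vec::new()
    }
}

/// Hypothetical body of a `build.rs` `main`: reads the target family that
/// Cargo exposes to build scripts and prints the matching directives.
pub fn emit_link_directives() {
    let family = std::env::var("CARGO_CFG_TARGET_FAMILY").unwrap_or_default();
    for directive in link_directives(&family) {
        println!("{directive}");
    }
}
```

A `build.rs` would call `emit_link_directives()` from its `main`. Opting back in this way also brings back the hazard described in the PR, so it is better suited as a stopgap than as a fix.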
Oh nice! If I send a PR to that repo, and it gets merged, would a rerun in crater automatically pick that up? Or, if we're ok with release notes, should I perhaps do this in parallel to landing this change? And, yes, fixing the issue would be a […]

Oh sorry for misunderstanding! Yes, this is just an LLD default. Without anything else specified it uses […]
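
To check whether an already-built binary picked up implicit imports, the import section of the `.wasm` file can be inspected directly. Below is a minimal, standard-library-only sketch that lists `(module, field)` pairs and filters those from `env`, the module named in the PR description. The function names are hypothetical, and the parser only handles the core MVP encodings of function, table, memory and global imports with 32-bit limits; it is not a full WebAssembly parser.

```rust
/// Reads an unsigned LEB128 `u32` at `*pos`, advancing `*pos` past it.
fn read_u32(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let mut value = 0u32;
    let mut shift = 0u32;
    loop {
        if shift >= 32 {
            return None; // more than five bytes cannot encode a u32
        }
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        value |= u32::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(value);
        }
        shift += 7;
    }
}

/// Reads a length-prefixed UTF-8 name.
fn read_name(bytes: &[u8], pos: &mut usize) -> Option<String> {
    let len = read_u32(bytes, pos)? as usize;
    let end = pos.checked_add(len)?;
    let name = std::str::from_utf8(bytes.get(*pos..end)?).ok()?;
    *pos = end;
    Some(name.to_string())
}

/// Skips a `limits` entry: flag byte, minimum, optional maximum.
fn skip_limits(bytes: &[u8], pos: &mut usize) -> Option<()> {
    let flags = *bytes.get(*pos)?;
    *pos += 1;
    read_u32(bytes, pos)?;
    if flags & 1 != 0 {
        read_u32(bytes, pos)?;
    }
    Some(())
}

/// Lists every `(module, field)` import, or `None` on malformed input.
pub fn list_imports(wasm: &[u8]) -> Option<Vec<(String, String)>> {
    if !wasm.starts_with(b"\0asm\x01\0\0\0") {
        return None; // missing magic number or unexpected version
    }
    let mut imports = Vec::new();
    let mut pos = 8;
    while pos < wasm.len() {
        let id = wasm[pos];
        pos += 1;
        let size = read_u32(wasm, &mut pos)? as usize;
        let end = pos.checked_add(size)?;
        if end > wasm.len() {
            return None;
        }
        if id == 2 {
            // Section id 2 is the import section.
            let mut p = pos;
            for _ in 0..read_u32(wasm, &mut p)? {
                let module = read_name(wasm, &mut p)?;
                let field = read_name(wasm, &mut p)?;
                let kind = *wasm.get(p)?;
                p += 1;
                match kind {
                    0x00 => { read_u32(wasm, &mut p)?; } // function: type index
                    0x01 => { p += 1; skip_limits(wasm, &mut p)?; } // table
                    0x02 => skip_limits(wasm, &mut p)?, // memory
                    0x03 => p += 2, // global: value type + mutability
                    _ => return None, // import kinds this sketch does not handle
                }
                imports.push((module, field));
            }
        }
        pos = end;
    }
    Some(imports)
}

/// Names of all imports whose module is `env`.
pub fn env_imports(wasm: &[u8]) -> Option<Vec<String>> {
    let all = list_imports(wasm)?;
    Some(all.into_iter().filter(|(m, _)| m == "env").map(|(_, f)| f).collect())
}
```

Running `env_imports` over the bytes of a project's build output gives a quick list of symbols worth auditing, though an `env` import may also be entirely intentional.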
…uwer Rollup of 8 pull requests

Successful merges:
- #149868 (rustc: Stop passing `--allow-undefined` on wasm targets)
- #153555 (Clarified docs in std::sync::RwLock + added test to ensure that max reader count is respected)
- #152851 (Fix SGX delayed host lookup via ToSocketAddr)
- #154051 (use libm for acosh and asinh)
- #154581 (More informative `Debug for vec::ExtractIf`)
- #154461 (Edit the docs new_in() and with_capacity_in())
- #154526 (Panic/return false on overflow in no_threads read/try_read impl)
- #154798 (rustdoc-search: match path components on words)
Rollup merge of #149868 - alexcrichton:wasm-no-allow-undefined, r=Mark-Simulacrum
rustc: Stop passing `--allow-undefined` on wasm targets
This is a post announcing the change happening in rust-lang/rust#149868. The intention is to have this published shortly after that PR lands to ensure there's a nightly that folks can test with first. Co-authored-by: Kevin Reid <kpreid@switchb.org>
…nightly irons out the bugs caused by rust-lang/rust#149868
Hmm, ICU4X was using undefined symbols on […]

See: https://github.com/rust-diplomat/diplomat/blob/main/runtime/src/wasm_glue.rs#L48

This breaks that; I'm still investigating and plan to produce a reduced testcase but as far as I can tell this is exactly the use case for […]
@Manishearth does the workaround listed here, to use […]
Ah, somehow I misread that entirely. Thanks!

I found breakage from this, currently suspecting https://github.com/WorldSEnder/wasm-split-prototype, but filed a bug in Leptos here: leptos-rs/cargo-leptos#642
Rust patch rust-lang/rust#149868 introduced some breaking changes by removing `--allow-undefined` from WASM linker defaults. The compiler ignores `#[link(wasm_import_module)]` on non-C ABI, and these were never real WASM imports. The fix was to use `unsafe extern "C"` instead of `unsafe #declared_abi` on WASM.
@alexcrichton Requiring […]

For […] For […]

@alexcrichton for the […] Can confirm that […]

Oh! No yeah that's a good point, I naively assumed that all imports are satisfied by JS/the host but you're right that another wasm module could be used. For that I think it's "just a bug" that […]