Improve the documentation around layout guarantees given by Box#155597
Improve the documentation around layout guarantees given by Box#155597weiznich wants to merge 1 commit into
Box#155597Conversation
|
rustbot has assigned @Mark-Simulacrum. Use Why was this reviewer chosen?The reviewer was selected based on:
|
| //! convert between `Box<A>` and `Box<B>` using unsafe code if its sound to do the same conversion | ||
| //! between `A` and `B`. | ||
| //! | ||
| //! For zero-sized values, the `Box` pointer has to be non-null and sufficiently aligned. The |
There was a problem hiding this comment.
The existing text has two paragraphs covering the two cases "non-zero-sized" and "zero-sized". Please don't break that structure by adding an entirely unrelated paragraph in the middle.
There was a problem hiding this comment.
I put it a bit further down. If that's still a bad location please suggest a better one.
| //! obtained from [`Box::<T>::into_raw`] may be deallocated using the [`Global`] allocator with | ||
| //! [`Layout::for_value(&*value)`]. | ||
| //! | ||
| //! It can be assumed that the layout of `Box<A>` is compatible to the layout of `Box<B>` if |
There was a problem hiding this comment.
I'm a bit concerned that we have never defined what "compatible layout" actually means. I don't think we are using it in existing docs, are we?
There was a problem hiding this comment.
What about going with "The layout is the same for Box<A> and Box<B> if it's the same for A and B instead?
There was a problem hiding this comment.
That uses a different word but doesn't fix the problem that we haven't actually defined the terms you are using here.
Fundamentally the problem is that we have basically no documentation of the sort you want to add, and adding the first pieces of such documentation needs a lot of infrastructure (like term definitions) to be established. Cc @traviscross @ehuss
Also it occurs to me that it seems odd to make this guarantee for Box but not for raw pointers or references.
There was a problem hiding this comment.
This is using the wording out of the nomicon:
When transmuting between different compound types, you have to make sure they are laid out the same way! If layouts differ, the wrong fields are going to get filled with the wrong data, which will make you unhappy and can also be Undefined Behavior (see above).
So how do you know if the layouts are the same? For repr(C) types and repr(transparent) types, layout is precisely defined. But for your run-of-the-mill repr(Rust), it is not. Even different instances of the same generic type can have wildly different layout. Vec and Vec might have their fields in the same order, or they might not. The details of what exactly is and is not guaranteed for data layout are still being worked out over at the UCG WG.
For me that reads like there is some existing usage of "the same layout" defined there, so I'm rather surprised to hear that "same layout" is not defined yet.
There was a problem hiding this comment.
That's sad, given that this is linked from the docs of transmute. Maybe it that link should be removed from there is it's not something you are allowed to rely on.
Otherwise, given that all this above sounds like quite a lot of discussion that will be required and I do not have the capacity to drive this discussion forward I wondered if there is an other way to get the guarantee I'm looking for. So what's for example about just adding a note to transmute itself stating that transmuting from Box<A> to Box<B> is fine if transmuting A to B is fine? Maybe even extend it to references?
That would get the necessary guarantee in the docs without needing to introduce a lot new vocabulary.
There was a problem hiding this comment.
Someone will have to do that work some day if we want to get proper unsafe docs. Rust is developed by volunteers stepping up to tasks like that, the teams usually don't have the capacity to do such work themselves. But I can understand that it may be a bit daunting and that you also may not have the capacity.
Probably the easiest thing to say is something that compares transmuting Box with other operations.
- if a pointer-to-pointer cast from
*mut Tto*mut Uis allowed, and furthermoreTandUhave the same unsized tail, then the transmute between those two types behaves the same as the cast. (That's a raw pointer guarantee. Not sure where the best place for this is.) - and then for the same pairs of types
T,Uwe can say that a transmute ofBoxis also allowed and that it acts like invokinginto_raw, then casting the raw pointer, and then callingfrom_raw.
There was a problem hiding this comment.
I looked into more documentation and it seems like the reference does use the combination "the same layout" as well when it describes #[repr(transparent)]. It also defines type layout, although it doesn't go in detail what "same layout" exactly mean. On the other hand I wonder how relevant it is to define these details, as the relevant fact is that the layouts are "the same" and that on it's own can be stated as fact for certain combinations (like the #[repr(transparent)] case, or types with an otherwise known layout). For the Box<A> -> Box<B> case this might be sufficient, as it's very similar to #[repr(transparent)] or other pointer casts?
Also the linked page states that the type layout "is its size, alignment, and the relative offsets of its fields" (and the discriminant for enums). So it shouldn't be too hard to clarify there that same layout means that all of this is guaranteed to match. It's then the responsibility of each representation (or each specific type like box) to actually document that these guarantees are gives (or not).
And just to be sure as you noted that the Nomicon is not normative: I assume that the reference is normative?
There was a problem hiding this comment.
Yes, the reference is normative. :)
The problem with "same layout" is that on its own it still doesn't say what happens on a transmute. Somehow a value of the old type changes to a value of the new type. How? You can't say "it's the same value" as the values don't even have the same type so that's a meaningless statement. For repr(transparent) it's kind of obvious what to do, but for most other cases here that would take much more care to describe.
That's why I proposed focusing on saying "a transmute is equivalent to the following sequence of operations".
There was a problem hiding this comment.
Thanks for the clarifications. I finally found some time to go over the wording again. As suggested the added documentation now explicitly states that this is equivalent to the pointer cast. I also added some additional sentence to highlight that also the requirements of transmute need to be satisfied, which mostly boils down to matching size, alignment and validity of values. These requirements are directly taken from the transmute docs, so I assume that they are nominative?
Hopefully that now better fits the existing documentation?
b17e07b to
ad7e61c
Compare
ad7e61c to
11797d0
Compare
This comment has been minimized.
This comment has been minimized.
11797d0 to
50953b7
Compare
This comment has been minimized.
This comment has been minimized.
50953b7 to
7d6d433
Compare
This comment has been minimized.
This comment has been minimized.
7d6d433 to
b21dc98
Compare
There was a problem hiding this comment.
@rust-lang/lang @rust-lang/libs-api this PR proposes to make a guarantee for what transmuting between Box does, by saying it is equivalent to a into_raw + raw ptr cast + from_raw chain. That seems like an "obvious" equivalence to me an it means we can refer to existing docs for further details.
I think only lang actually needs to FCP this (since Box is a primitive language type).
| //! Any such conversion therefore needs to uphold the same guarantees as using | ||
| //! a [pointer-to-pointer cast][ptr-cast] to convert between a `*mut T` and a `*mut U`. | ||
| //! In addition such a `core::mem::transmute` call also needs to uphold the requirements | ||
| //! of [`core::mem::transmute`] for a transmute between `T` and `U`, especially | ||
| //! regarding to size, alignment and the validity of values for both types. |
There was a problem hiding this comment.
This section sounds like it exhaustively lists all safety constraints, but it doesn't. It should link to Box::from_raw. In particular, transmute does not actually make any requirements about the alignment of T/U, but this operation obviously does.
There was a problem hiding this comment.
Well, Box::from_raw links back to "here" (as in the memory layout section, so that feels a bit like going in a circle. I add an explicit statement that the pointers need to be well aligned for both T and U, hopefully that's also fine.
There was a problem hiding this comment.
In that case things will get tricky. What we want to say is that both types must have the same size and alignment, but given that the types can be unsized that statement wouldn't make sense.
I think we need docs similar to this, where we spell out the cases for the possible unsized tails.
| //! Converting between any `Box<T>` and `Box<U>` by using [`core::mem::transmute`] | ||
| //! is equivalent to the following sequence of operations: |
There was a problem hiding this comment.
| //! Converting between any `Box<T>` and `Box<U>` by using [`core::mem::transmute`] | |
| //! is equivalent to the following sequence of operations: | |
| //! Converting between any `Box<T>` and `Box<U>` by using [`core::mem::transmute`] where `T` and `U` both have the same unsized tail | |
| //! is equivalent to the following sequence of operations: |
Do we define "unsized tail" somewhere in the Reference? I mean that they must both be sized, or both have a slice tail, or both have a dyn Trait tail. The as cast would not even get accepted otherwise.
There was a problem hiding this comment.
That's the only occurrence of "unsized tail" in the reference: https://doc.rust-lang.org/reference/behavior-considered-undefined.html?highlight=unsized%20tail#r-undefined.validity.wide
The standard library docs use "unsized tail" for std::mem::size_of_val_raw
I'm not sure if that is already sufficient to use given the existing definitions.
That written: The change as currently suggested would restrict the wording from any Box<T> and Box<U> to such where T and/or U are unsized. While its more clear (and already defined) what happens for sized types it still feels like an unnecessary restriction.
There was a problem hiding this comment.
would restrict the wording from any Box and Box to such where T and/or U are unsized
The intention was that if both types are sized, then they also have "the same unsized tail" (namely none).
But if we haven't defined that yet we'll need to do something else. The case I am worried about is that e.g. you can do a raw pointer cast from *const [u8] to *const u8 but the corresponding Box transmute would obviously not work. So the entire paragraph needs to be limited to cases whee
- both types are sized
- or both types are unsized, and a raw pointer cast between them would be permitted
BoxBox
b21dc98 to
be96624
Compare
This comment has been minimized.
This comment has been minimized.
This change extends the memory layout section in the documentation of `Box` to explicitly state that it is sound to convert between `Box<A>` and `Box<B>` as long as a cast between `*mut A` and `*mut B` is valid and the general requirements of `transmute` are satisfied. See https://rust-lang.zulipchat.com/#narrow/channel/136281-t-opsem/topic/Is.20transmuting.20between.20.60Box.3CA.3E.60.20and.20.60Box.3CB.3E.60.20UB.3F/with/585350243 for the relevant discussion.
be96624 to
a4fc9c4
Compare
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
| //! Any such conversion therefore needs to uphold the same guarantees as using | ||
| //! a [pointer-to-pointer cast][ptr-cast] to convert between a `*mut T` and a `*mut U`. |
There was a problem hiding this comment.
This is incorrect. A pointer-to-pointer cast is not necessarily equivalent to a pointer-to-pointer transmute. For example, a pointer-to-pointer cast can perform a trait upcast. Consider for example the following, which in my reading of this this PR, the PR says should work.
trait Super1 {
fn method1(&self) {
println!("method1");
}
}
trait Super2 {
fn method2(&self) {
println!("method2");
}
}
trait Sub: Super1 + Super2 {}
impl Super1 for i32 {}
impl Super2 for i32 {}
impl Sub for i32 {}
fn main() {
// casting pointers works.
let mut data: i32 = 1;
let ptr_sub: *mut dyn Sub = &raw mut data;
let ptr_super: *mut dyn Super2 = ptr_sub as _;
unsafe {
// Works. Prints "method2".
(*ptr_super).method2();
}
let data: Box<i32> = Box::new(1);
let box_sub: Box<dyn Sub> = data;
// This is UB according to miri.
let box_super: Box<dyn Super2> = unsafe { std::mem::transmute(box_sub) };
// Doesn't work. Prints "method1" in my testing.
box_super.method2();
}There was a problem hiding this comment.
I would argue that this is disallowed by the second sentence in this paragraph:
In addition such a
core::mem::transmutecall also needs to uphold the requirements
of [core::mem::transmute] for a transmute betweenTandU, especially
regarding to size, alignment and the validity of values for both types.
as transmuting from dyn Sub to dyn Super2 is also not allowed.
I still can see that this might not be the best way to express when such a transmute is sound. Maybe going via "same layout" is the better solution, maybe combined with a definition of "same layout" by listing combinations that are considered to have the same layout (e.g. #[repr(transparent)] structs and their inner types, #[repr(C)] types with the same fields, certain primitive types, maybe Box if this PR is accepted) and stating everything else doesn't have the same layout? I'm not sure if such an "ad hoc" definition might be acceptable?
There was a problem hiding this comment.
as transmuting from dyn Sub to dyn Super2 is also not allowed.
Transmuting from [u16] to [(u8, u8)] is also not allowed because transmute only works on sized types. I'm afraid you can't refer to transmute here.
There was a problem hiding this comment.
I still can see that this might not be the best way to express when such a transmute is sound. Maybe going via "same layout" is the better solution, maybe combined with a definition of "same layout" by listing combinations that are considered to have the same layout (e.g. #[repr(transparent)] structs and their inner types, #[repr(C)] types with the same fields, certain primitive types, maybe Box if this PR is accepted) and stating everything else doesn't have the same layout? I'm not sure if such an "ad hoc" definition might be acceptable?
Yeah that would be the other option. We'd have to bikeshed that definition for a bit probably, but it seems like a useful definition to have. It's not even really about Box... not sure where the definition should go but the Box docs seem like the wrong place. This sounds more like reference material.
Cc @scottmcm as this seems loosely related to rust-lang/rfcs#3844
There was a problem hiding this comment.
My proposal would be to just have a sentence that says: You can transmute Box<A> to Box<B> for types with the same layout.
"same layout" should then link to a section on the Type Layout reference page with the following content (heading to be bikesheded):
Two types A and B are generally assumed to have a different layout. There are the following cases where the layout of A and B are assumed to be the same:
AandBare#[repr(C)]types with exactly the same fields in regards to type and order of the fieldsAis a#[repr(transparent)]wrapper aroundBAandBare primitive types with the same bit widthAis a primitive type andbis byte array with the same number of bytes as the primitive typeBox<A>andBox<B>ifAandBhave the same layout (only assuming this PR get's accepted)
If A can be assumed to have the same layout as B, it also can be assumed that B has the same layout as A. If A has the same layout as B and B has the same layout as C it can be assumed that A has the same layout as C and vice versa.
If two types A and B have the same layout it is sound to transmute between these types.
This also would allow to link this "same layout" section from the transmute documentation and possible other relevant places as well. The list then can be easily extended by future sound combinations (or anything I already missed)
There was a problem hiding this comment.
This also would allow to link this "same layout" section from the transmute documentation
I don't think that makes sense. It is legal to transmute u32 to (u8, 16) even though they don't have "the same layout" by any reasonable definition.
And conversely, this...
A is a primitive type and b is byte array with the same number of bytes as the primitive type
... is not a legal Box transmute because it can change the alignment. (Unless you never actually drop that Box, then it'd be fine.) Box transmute cares about alignment because it is relevant for drop, regular transmute does not care about alignment.
So thinking about this again, "same layout" seems like a dead end. The conditions are subtly different for different operations so we won't be able to easily re-use a term.
I think part of the problem here is -- we have to figure out what we even want to do. There are two things one can specify for such a transmute:
- when is the transmute sound to expose as a public function?
- when is the transmute itself legal, but might produce a value that could lead to UB later if it is misused?
It is sound to transmute bool to u8. It is legal to transmute &[u8] to &str but giving that string to unknown code can lead to UB. Transmuting u8 to bool is fine if you know the u8 is 0 or 1.
"Same layout" seems to try to capture the former. That's in the realm of the safe-transmute project.
I think for low-level foundational docs such as these, what we want to capture is the latter.
There was a problem hiding this comment.
Turns out that we already guarantee some stuff: https://doc.rust-lang.org/std/primitive.fn.html#abi-compatibility
*const T,*mut T,&T,&mut T,Box<T>(specifically, onlyBox<T, Global>), and
NonNull<T>are all ABI-compatible with each other for allT. They are also ABI-compatible
with each other for differentTif they have the same metadata type (<T as Pointee>::Metadata).
Note the lack of T: Sized requirements
There was a problem hiding this comment.
All that says is that having such a caller-callee type mismatch behaves equivalently to a transmute from the caller type to the callee type. It says nothing about what that transmute does, which is the subject of this PR.
There was a problem hiding this comment.
Thanks for the clarification, that is really helpful for my understanding as well.
There are two things one can specify for such a transmute:
I tried mostly to target the first point, so that might be a source of misunderstanding.
Let me try summarize the requirements for a transmute::<Box<A>, Box<B>() to verify that I have the correct understanding of the current state:
- Requires
AandBto have the same alignment - For Sized types: Requires
AandBto have the same size, needs to verify that all possible values (at call side) ofAare valid forBas well - For Unsized types: Requires
AandBto have the same unsized tail- If
AandBare slicesAandBis required to have the same size to keep the length valid - If
AandBare trait objects they need to have the same vtable (however you would verify that on stable rust?) and the data stored in the trait object needs to fulfill the requirements for sized types? (Same size, alignment and validity of values) - Any other type not allowed for now?
- If
As far as I can see that doesn't really cover transmutes between unsized types and new type wrappers around them yet, right? That might still need an extra point to allow that as explicit exception?
It is legal to transmute u32 to (u8, 16) even though they don't have "the same layout" by any reasonable definition.
I'm not sure if that's correct. For tuples the reference states that they are #[repr(Rust)] so technically the compiler would be free to insert more than 1 byte of padding between the fields, which then would result in both types having a different size.
| //! of [`core::mem::transmute`] for a transmute between `T` and `U`, especially | ||
| //! regarding to size, alignment and the validity of values for both types. Finally |
There was a problem hiding this comment.
mem::transmute makes no requirements about the alignment of T or U, so this does not work the way you intended.
There was a problem hiding this comment.
Also, you cannot transmute between unsized types. So deferring to transmute here will not work.
I think you're writing the docs for the first ever operation that allows transmuting between unsized values, that won't be easy unfortunately.
There was a problem hiding this comment.
Can we just say that transmuting a Box<T> to a Box<U> is equivalent to: using Box::into_raw to get a *mut T, transmuting that into a *mut U, and then using Box::from_raw to get a Box<U>?
There was a problem hiding this comment.
We haven't defined much about transmuting between raw pointers so that doesn't help much.
There was a problem hiding this comment.
That sounds to me like we ought to start there, instead of defining transmuting boxes 😅
There was a problem hiding this comment.
Thinking about this some more I think it's actually not that bad. Basically we should require that
- it is valid to call
size_of_val_raw::<U>on the raw pointer that you get fromBox::into_raw - and
size_of_val_raw::<U>gives the same result assize_of_val_raw::<T> - (and the same for
align_of_val_raw)
The main problem with this approach is that size_of_val_raw is not stable yet...
EDIT: Let's make it stable. :) #157572
View all comments
This change extends the memory layout section in the documentation of
Boxto explicitly state that it is sound to convert betweenBox<A>andBox<B>as long a cast between*mut Aand*mut Bis valid andthe general requirements of
transmuteare satisfied.See
https://rust-lang.zulipchat.com/#narrow/channel/136281-t-opsem/topic/Is.20transmuting.20between.20.60Box.3CA.3E.60.20and.20.60Box.3CB.3E.60.20UB.3F/with/585350243 for the relevant discussion.