Stabilize RandomSource and DefaultRandomSource#157168
Conversation
This comment has been minimized.
This comment has been minimized.
This provides enough of an interface for people to obtain random bytes. The `Distribution` trait and the `random` function remain unstable; those don't need to block stabilization of `RandomSource` and `DefaultRandomSource`. Similarly, this leaves a `fill_buf` function using `BorrowedCursor` as future work.
18dd02e to
eb9d7c8
Compare
|
If it’s only the first call that can fail, could we put |
|
@jdahlstrom That would force every caller to deal with it, albeit only once. If we (in the future) provide a fallible |
|
I'm un-marking this as a draft. Based on experiments with As for |
|
rustbot has assigned @Mark-Simulacrum. Use Why was this reviewer chosen?The reviewer was selected based on:
|
|
Personally, I do not support this stabilization. The most pressing needs can be alleviated by stabilizing a free-standing (potentially panicking)
IMO they should be named
I don't think that added
This does not apply to HW-based RNGs used in cryptography. Not only they are IO-based, but also commonly use internal security checks. The same somewhat applies to RNGs built-in into CPUs. For example, RDRAND may in theory fail at any moment and some buggy AMD CPUs are known to produce bad values (e.g. after hybernation) which are guarded against with runtime checks. In some niche cases it's also important to prove absence of panics and the suggested potentially panicking behavior will be an annoying hindrance. Checking for errors could also be useful in scenarios where we mix entropy from different sources where failure of one source does not stop the system. |
| /// A source of randomness. | ||
| #[unstable(feature = "random", issue = "130703")] | ||
| #[stable(feature = "random_source", since = "CURRENT_RUSTC_VERSION")] | ||
| pub trait RandomSource { |
There was a problem hiding this comment.
Regarding next_u32/next_u64, while I really want DefaultRandomSource.fill_bytes (in some form) stabilized ASAP, I have reservations about leaving it at "it's not clear we need them for performance and we can add them later". Personally I'd rather err on the side of adding these methods, unless we're quite sure we will never need them, or it's clear that we can't resolve the question in a reasonable time frame.
Adding the methods after stabilization has a cost (even besides opportunity cost). As @dhardy pointed out in the past, adding provided method later means existing implementers that want to offer reproducibility (as in stability of produced values) can't override the provided methods without breaking reproducibility for users who started using those methods. And for libraries that use RNGs to sample some distribution and want to promise reproducibility of that sampling, the same problem applies if they're first written against fill_bytes and later want to use next_uN.
Another (smaller) reason to err on the side of including these is to ease the ecosystem's transition from rand traits (which have always had next_u32/u64) to the std trait. If std doesn't have the methods at first and adds them later, that's two unnecessarily transitions (rand::Rng::next_uN -> fill_bytes + uN::from_*e_bytes -> RandomSource::next_uN). Stabilizing some subset of distributions would avoid this, but the distributions are far from ready for stabilization.
Finally, while the benchmarks in #157193 and on Zulip don't have a smoking gun that the methods are necessary for performance, it's also not clear that we won't want them. Even those benchmarks show a benefit for dyn RandomSource (the only argument is whether you consider that compatible with "cares about performance"), and @dhardy previously mentioned that rand has benchmarks justifying the methods in rand's context. At minimum we should look at those benchmarks as well and see if the fill_bytes semantics (which I think matches rand's) actually works for those benchmarks as well.
There was a problem hiding this comment.
IIRC, it's possible to use inlining and https://doc.rust-lang.org/std/intrinsics/fn.is_val_statically_known.html to perform these optimisations without needing the API surface.
There was a problem hiding this comment.
Inlining doesn't work for dyn RandomSource. And is_val_statically_known only helps when the two implementations that have the same behavior, but in this case, some potentially desirable optimizations change behavior. (Also, the intrinsic doesn't seem to have a clear path to being exposed on stable.)
|
I oppose this stabilization, as I've mentioned before I don't think we are at a point where we want to stabilize traits or anything that represents or implies a canonical "way to do random number generation". The current proposal with There is one real need from the standard library: a (no_std overridable) source of random bytes. This should simply be a function without further baggage or API precedent. Only once we have a clear view of what an opinionated |
To quote the
This is rather vague. Would a report of a defective implementation be considered a security issue? E.g. But my biggest concern is what happens on unsupported platforms, e.g. |
This was my understanding as well, but when I sat down and worked through it, I couldn’t come up with a benchmark that shows a difference (between Maybe this has changed over time as LLVM has improved? The way rand derives |
The specific problem with |
|
@hanna-kruppe I tried benchmarking Xoshiro256++, Sfc32 and Sfc64 using If there's desire to use only a single method, I would consider using |
|
Are there any code examples in the docs? If not it'd be great to add them before stabilisation. |
|
From zulip discussion, there seems to be some tension between goals
|
An alternative to panicking is to seed an infallible RNG from a fallible RNG. This at least defers the error condition to something that happens once up-front, and is avoidable thereafter. I'd probably only recommend that for bare metal embedded use cases though. Anywhere you have a proper kernel entropy pool (and potentially have to worry about forking) you're better off using that. |
Can you provide the benchmarking code? I'd love to see if we can optimize that in the style of hanna-kruppe/chacha8rand#1 . |
|
@joshtriplett here's a diff against rand_pcg code. |
Stabilization report
This partial stabilization provides enough of an interface for people to obtain random bytes, which is a common need in the ecosystem, currently fulfilled via the
getrandomcrate.There have been many requests for a
fill_bytesinterface in the standard library. Per previous libs-api discussions,DefaultRandomSource.fill_bytescan serve that function, rather than adding a separate free function.🚲 Bikeshedding 🚲
We could call this
Rng/DefaultRngrather thanRandomSource/DefaultRandomSource.Alternatives and Future Work
Uninitialized buffers
We're likely to add a
fill_buffunction to fill aBorrowedCursor<'_, u8>. We can do so onceBorrowedBuf/BorrowedCursoris stable. Deferring this means we will need to support trait impls that providefill_bytesbut notfill_buf, which we might not need to if we waited until afterBorrowedBuf/BorrowedCursoris stable. However, that isn't any worse of a problem than we already have withio::Read, and we don't necessarily want to couple the stabilization ofBorrowedBuf/BorrowedCursorwithDistributions
The
Distributiontrait and therandomfunction remain unstable; those don't need to block stabilization ofRandomSourceandDefaultRandomSource.Optimized paths for
u32/u64Some RNGs can provide faster results for generating a whole
u32/u64rather than individual bytes.The definition and documentation of
fill_bytessays:We hope that this will allow RNGs that can generate whole words to do so efficiently as a fast path in
fill_bytes/fill_buf. If dedicatednext_u32/next_u64functions still end up being substantially faster, we can always add them as optional trait methods in the future.Resultversus panickingThere's been extensive discussion about whether the function should return a
Resultrather than panicking, or providing an additional such function. The previous conclusion from libs-api was that while it's possible for the first such call to fail (e.g. because the OS or sandbox provides no access to randomness at all), subsequent calls should never fail, and user code will not be prepared to deal with such failure.Furthermore, an API returning
Resultwould propagate throughout higher-level calls, forcing operations as simple as "roll a d20" to either returnResultor callexpect/unwrap. And even providing atryvariant will lead to higher-level APIs having to consider which variant to call. We should, instead, make the guarantee that a well-behaved underlying OS won't panic after the first call.Note, in particular, that
HashMapalready fails via panic if it can't get data from itsRandomState.If there's a need to allow error recovery for the "no OS/sandbox support" case, we could provide a one-time call to check for an error. Or, such users could continue using
getrandomor the underlying OS APIs.If we did want to make every call fallible, we have the capability, using upcoming language features ("supertrait auto impl"), to add a
TryRandomSourcesupertrait without breaking backwards compatibility.