From 13a3ee1971f899dde29e7957a8f383332ad68989 Mon Sep 17 00:00:00 2001 From: Ted Kaminski Date: Mon, 13 Sep 2021 22:09:09 +0000 Subject: [PATCH 1/4] initial draft of 2nd tutorial section --- rmc-docs/src/SUMMARY.md | 5 +- rmc-docs/src/tutorial-kinds-of-failure.md | 161 ++++++++++++++++++ .../src/tutorial/kinds-of-failure/Cargo.toml | 13 ++ .../kinds-of-failure/tests/bounds-check.rs | 48 ++++++ .../tests/overflow-quicksort.rs | 25 +++ .../kinds-of-failure/tests/overflow.rs | 40 +++++ 6 files changed, 291 insertions(+), 1 deletion(-) create mode 100644 rmc-docs/src/tutorial-kinds-of-failure.md create mode 100644 rmc-docs/src/tutorial/kinds-of-failure/Cargo.toml create mode 100644 rmc-docs/src/tutorial/kinds-of-failure/tests/bounds-check.rs create mode 100644 rmc-docs/src/tutorial/kinds-of-failure/tests/overflow-quicksort.rs create mode 100644 rmc-docs/src/tutorial/kinds-of-failure/tests/overflow.rs diff --git a/rmc-docs/src/SUMMARY.md b/rmc-docs/src/SUMMARY.md index 2235929fee1c..fc52bbeaeeda 100644 --- a/rmc-docs/src/SUMMARY.md +++ b/rmc-docs/src/SUMMARY.md @@ -6,8 +6,11 @@ - [RMC on a single file]() - [RMC on a crate]() - [Debugging failures]() - + - [Debugging non-termination]() + - [RMC tutorial](./rmc-tutorial.md) - [First steps with RMC](./tutorial-first-steps.md) + - [Failures that RMC can spot](./tutorial-kinds-of-failure.md) + - [Loops, unwinding, and bounds]() - [RMC developer documentation]() diff --git a/rmc-docs/src/tutorial-kinds-of-failure.md b/rmc-docs/src/tutorial-kinds-of-failure.md new file mode 100644 index 000000000000..f2dc0bbb5ea1 --- /dev/null +++ b/rmc-docs/src/tutorial-kinds-of-failure.md @@ -0,0 +1,161 @@ +# Failures that RMC can spot + +In the [last section](./tutorial-first-steps.md) we saw RMC spot two major kinds of failures: assertions and panics. +If the proof harness allows some program trace that results in a panic, then RMC will report that as a failure. +We additionally saw very briefly a couple of other kinds of failures, like null pointer dereferences and overflow. +In this section, we're going to expand on these additional checks, to give you an idea of what other problems RMC will find. + +## Bounds checking and pointers + +Rust is safe by default, and so includes dynamic (run-time) bounds checking where needed. +Consider this Rust code (which can be found under [`rmc-docs/src/tutorial/kinds-of-failure`](https://github.com/model-checking/rmc/tree/main/rmc-docs/src/tutorial/kinds-of-failure/)): + +```rust +{{#include tutorial/kinds-of-failure/tests/bounds-check.rs:code}} +``` + +We can again write a simple property test against this code: + +```rust +{{#include tutorial/kinds-of-failure/tests/bounds-check.rs:proptest}} +``` + +This property test will immediately find the failing case because of this dynamic check. + +But what if we change this function to use unsafe Rust: + +```rust +return unsafe { *a.get_unchecked(i % a.len() + 1) }; +``` + +Now the error becomes invisible to this test: + +``` +# cargo test +[...] +test tests::doesnt_crash ... ok +``` + +But we're able to check this unsafe code with RMC: + +```rust +{{#include tutorial/kinds-of-failure/tests/bounds-check.rs:rmc}} +``` + +``` +# rmc tests/bounds-check.rs +[...] +** 1 of 468 failed (2 iterations) +VERIFICATION FAILED +``` + +Notice there were a *lot* of verification conditions being checked: 468! +This is a result of using the standard library `Vec` implementation, which means our harness actually used quite a bit of code, short as it looks. +Let's narrow that output down a bit: + +``` +# rmc tests/bounds-check.rs | grep FAIL +[get_wrapped.pointer_dereference.5] line 10 dereference failure: pointer outside object bounds in *var_5: FAILURE +VERIFICATION FAILED +``` + +Notice that, for RMC, this has gone from a simple bounds-checking problem to a pointer-checking problem. +RMC will check arbitrary operations on pointers to ensure they're not potentially invalid memory accesses. +Any unsafe code that manipulates pointers will, as we see here, raise failures if its behavior is actually unsafe. + +Consider trying a few more small exercises with this example: + +1. Exercise: Switch back to the normal/safe indexing operation and re-try RMC. What changes compared to the unsafe operation and why? +(Try predicting the answer, then seeing if you got it right.) +2. Exercise: Remember how to get a trace from RMC? Find out what inputs it failed on. +3. Exercise: Fix the error, run RMC, and see a successful verification. + +## Overflow and math errors + +Consider a different variant on the above function: + +```rust +fn get_wrapped(i: usize, a: &[u32]) -> u32 { + return a[i % a.len()]; +} +``` + +We've corrected the out-of-bounds access, but now we've omitted the "base case": what to return on an empty list. +RMC will spot this not as a bound error, but as a mathematical error: on an empty list the modulus operator (`%`) will cause a division by zero. + +1. Exercise: Try to run RMC on the above, to see what this kind of failure looks like. + +Rust also performs runtime safety checks for integer overflows, much like it does for bounds checks. +Consider this code (from `tests/overflow.rs`): + +```rust +{{#include tutorial/kinds-of-failure/tests/overflow.rs:code}} +``` + +A trivial function, but if we write a property test for it, we immediately find inputs where it fails, thanks to Rust's dynamic checks. +Here's the output from RMC: + +``` +# rmc tests/overflow.rs +[...] +** Results: +./tests/overflow.rs function simple_addition +[simple_addition.assertion.1] line 6 attempt to compute `move _3 + move _4`, which would overflow: FAILURE +[simple_addition.overflow.1] line 6 arithmetic overflow on unsigned + in var_3 + var_4: FAILURE + +** 2 of 2 failed (2 iterations) +VERIFICATION FAILED +``` + +Notice the two failures: the Rust-inserted overflow check (`simple_addition.assertion.1`) and RMC's explicit overflow check (`simple_addition.overflow.1`). + +> **NOTE:** You could attempt to fix this issue by using Rust's alternative mathematical functions with explicit overflow behavior. +For instance, instead of `a + b` write `a.wrapping_add(b)`. +> +> However, [at the present time](https://github.com/model-checking/rmc/issues/480), while this disables the dynamic assertion that Rust inserts, it does not disable the additional RMC overflow check. +> As a result, this currently still fails in RMC. + + + +### Exercise: Classic overflow failure + +One of the classic subtle bugs that persisted in many implementations for a very long time is finding the midpoint in quick sort. +This often naively looks like this (from `tests/overflow-quicksort.rs`): + +```rust +{{#include tutorial/kinds-of-failure/tests/overflow-quicksort.rs:code}} +``` + +RMC immediately spots the bug in the above code. + +1. Exercise: Fix this function so it no longer overflows. +(Hint: depending on which approach you take, you may need to add the assumption that `high > low` to your proof harness. +Don't add that right away, see what happens if you don't. Just keep it in mind.) +2. Exercise: Prove your new implementation actually finds the midpoint correctly by adding an assertion to the test harness. + +## Future work + +RMC notably does not check the following: + +1. Concurrency bugs, deadlocks, or data races. +It's possible RMC may be extended in the future to find such issues. + +2. Rust type invariants. +For example, it's undefined behavior in Rust to produce a value of type `bool` that isn't `0` or `1`. +RMC will not spot this error (in presumably unsafe code), yet. + +3. Fully generic functions. +To write a proof harness and call functions, they must be fully "monomorphized." +This means we can't currently check a generic function (`foo`) generically. +Proof harnesses have to be written specializing type parameters (`T`) to concrete types (e.g. `u32`), and check those instead. + + +## Summary + +In this section: + +1. We saw RMC spot potential bounds check errors. +2. We saw RMC spot actually-unsafe dereferencing of a raw pointer to invalid memory. +3. We saw RMC spot a division by zero error. +4. We saw RMC spot overflowing addition. +5. As an exercise, we tried proving an assertion (finding the midpoint) that was not completely trivial. diff --git a/rmc-docs/src/tutorial/kinds-of-failure/Cargo.toml b/rmc-docs/src/tutorial/kinds-of-failure/Cargo.toml new file mode 100644 index 000000000000..0b5fdb15edcb --- /dev/null +++ b/rmc-docs/src/tutorial/kinds-of-failure/Cargo.toml @@ -0,0 +1,13 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# SPDX-License-Identifier: Apache-2.0 OR MIT +[package] +name = "kinds-of-failure" +version = "0.1.0" +edition = "2018" + +[dependencies] + +[dev-dependencies] +proptest = "1.0.0" + +[workspace] diff --git a/rmc-docs/src/tutorial/kinds-of-failure/tests/bounds-check.rs b/rmc-docs/src/tutorial/kinds-of-failure/tests/bounds-check.rs new file mode 100644 index 000000000000..9038c92f05e2 --- /dev/null +++ b/rmc-docs/src/tutorial/kinds-of-failure/tests/bounds-check.rs @@ -0,0 +1,48 @@ +// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +// SPDX-License-Identifier: Apache-2.0 OR MIT + +// ANCHOR: code +fn get_wrapped(i: usize, a: &[u32]) -> u32 { + if a.len() == 0 { + return 0; + } + return a[i % a.len() + 1]; +} +// ANCHOR_END: code + +// Alternative unsafe return for the above function: +// return unsafe { *a.get_unchecked(i % a.len() + 1) }; + +#[cfg(test)] +mod tests { + use super::*; + use proptest::prelude::*; + + // ANCHOR: proptest + proptest! { + #[test] + fn doesnt_crash(i: usize, a: Vec) { + get_wrapped(i, &a); + } + } + // ANCHOR_END: proptest +} + +fn __nondet() -> T { + unimplemented!() +} +fn __VERIFIER_assume(cond: bool) { + unimplemented!() +} + +// ANCHOR: rmc +#[cfg(rmc)] +#[no_mangle] +fn main() { + let size: usize = __nondet(); + __VERIFIER_assume(size < 4096); + let index: usize = __nondet(); + let array: Vec = vec![0; size]; + get_wrapped(index, &array); +} +// ANCHOR_END: rmc diff --git a/rmc-docs/src/tutorial/kinds-of-failure/tests/overflow-quicksort.rs b/rmc-docs/src/tutorial/kinds-of-failure/tests/overflow-quicksort.rs new file mode 100644 index 000000000000..dc9ec4e47cc0 --- /dev/null +++ b/rmc-docs/src/tutorial/kinds-of-failure/tests/overflow-quicksort.rs @@ -0,0 +1,25 @@ +// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +// SPDX-License-Identifier: Apache-2.0 OR MIT + +// ANCHOR: code +fn find_midpoint(low: u32, high: u32) -> u32 { + return (low + high) / 2; +} +// ANCHOR_END: code + +fn __nondet() -> T { + unimplemented!() +} +fn __VERIFIER_assume(cond: bool) { + unimplemented!() +} + +// ANCHOR: rmc +#[cfg(rmc)] +#[no_mangle] +fn main() { + let a: u32 = __nondet(); + let b: u32 = __nondet(); + find_midpoint(a, b); +} +// ANCHOR_END: rmc diff --git a/rmc-docs/src/tutorial/kinds-of-failure/tests/overflow.rs b/rmc-docs/src/tutorial/kinds-of-failure/tests/overflow.rs new file mode 100644 index 000000000000..32e1b8180f8e --- /dev/null +++ b/rmc-docs/src/tutorial/kinds-of-failure/tests/overflow.rs @@ -0,0 +1,40 @@ +// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +// SPDX-License-Identifier: Apache-2.0 OR MIT + +// ANCHOR: code +fn simple_addition(a: u32, b: u32) -> u32 { + return a + b; +} +// ANCHOR_END: code + +#[cfg(test)] +mod tests { + use super::*; + use proptest::prelude::*; + + // ANCHOR: proptest + proptest! { + #[test] + fn doesnt_crash(a: u32, b: u32) { + simple_addition(a, b); + } + } + // ANCHOR_END: proptest +} + +fn __nondet() -> T { + unimplemented!() +} +fn __VERIFIER_assume(cond: bool) { + unimplemented!() +} + +// ANCHOR: rmc +#[cfg(rmc)] +#[no_mangle] +fn main() { + let a: u32 = __nondet(); + let b: u32 = __nondet(); + simple_addition(a, b); +} +// ANCHOR_END: rmc From 5c100155c455bd8ed640e67c5b41df08e631f56c Mon Sep 17 00:00:00 2001 From: Ted Kaminski Date: Tue, 14 Sep 2021 22:20:14 +0000 Subject: [PATCH 2/4] add exercise solutions, fix minor issues --- rmc-docs/src/tutorial-kinds-of-failure.md | 54 +++++++++++++++++++++-- 1 file changed, 50 insertions(+), 4 deletions(-) diff --git a/rmc-docs/src/tutorial-kinds-of-failure.md b/rmc-docs/src/tutorial-kinds-of-failure.md index f2dc0bbb5ea1..7a95e88fda3b 100644 --- a/rmc-docs/src/tutorial-kinds-of-failure.md +++ b/rmc-docs/src/tutorial-kinds-of-failure.md @@ -49,8 +49,9 @@ But we're able to check this unsafe code with RMC: VERIFICATION FAILED ``` -Notice there were a *lot* of verification conditions being checked: 468! +Notice there were a *lot* of verification conditions being checked: in the above output, 468! (It may change for you.) This is a result of using the standard library `Vec` implementation, which means our harness actually used quite a bit of code, short as it looks. +RMC is inserting a lot more checks than appear as asserts in our code, so the output can be large. Let's narrow that output down a bit: ``` @@ -60,7 +61,7 @@ VERIFICATION FAILED ``` Notice that, for RMC, this has gone from a simple bounds-checking problem to a pointer-checking problem. -RMC will check arbitrary operations on pointers to ensure they're not potentially invalid memory accesses. +RMC will check operations on pointers to ensure they're not potentially invalid memory accesses. Any unsafe code that manipulates pointers will, as we see here, raise failures if its behavior is actually unsafe. Consider trying a few more small exercises with this example: @@ -69,6 +70,25 @@ Consider trying a few more small exercises with this example: (Try predicting the answer, then seeing if you got it right.) 2. Exercise: Remember how to get a trace from RMC? Find out what inputs it failed on. 3. Exercise: Fix the error, run RMC, and see a successful verification. +4. Exercise: Try switching back to the unsafe code (now with the error fixed) and re-run RMC. It should still successfully verify. + +
+Click to see explanation for exercise 1 + +Having switched back to the safe indexing operation, RMC reports two failures instead of just one: + +``` +# rmc tests/bounds-check.rs | grep FAIL +[get_wrapped.assertion.3] line 9 index out of bounds: the length is move _12 but the index is _5: FAILURE +[get_wrapped.pointer_dereference.5] line 9 dereference failure: pointer outside object bounds in a.data[var_5]: FAILURE +VERIFICATION FAILED +``` + +The first is Rust's implicit assertion for the safe indexing operation. +The second is RMC's check to ensure the pointer operation is actually safe. +This pattern (two checks for similar issues in safe Rust code) is common, and we'll see it again in the next section. + +
## Overflow and math errors @@ -93,6 +113,7 @@ Consider this code (from `tests/overflow.rs`): ``` A trivial function, but if we write a property test for it, we immediately find inputs where it fails, thanks to Rust's dynamic checks. +RMC will find these failures as well. Here's the output from RMC: ``` @@ -115,8 +136,6 @@ For instance, instead of `a + b` write `a.wrapping_add(b)`. > However, [at the present time](https://github.com/model-checking/rmc/issues/480), while this disables the dynamic assertion that Rust inserts, it does not disable the additional RMC overflow check. > As a result, this currently still fails in RMC. - - ### Exercise: Classic overflow failure One of the classic subtle bugs that persisted in many implementations for a very long time is finding the midpoint in quick sort. @@ -133,6 +152,33 @@ RMC immediately spots the bug in the above code. Don't add that right away, see what happens if you don't. Just keep it in mind.) 2. Exercise: Prove your new implementation actually finds the midpoint correctly by adding an assertion to the test harness. +
+Click to see solutions for these exercises + +A very common approach for resolving the overflow issue looks like this: + +```rust +return low + (high - low) / 2; +``` + +But if you naively try this (try it!), you'll find a new underflow error: `high - low` might result in a negative number, but has type `u32`. +Hence, the need to add an assumption that would make that impossible. +(Adding an assumption, though, means there's a new way to "use it wrong." Perhaps we'd like to avoid that!) + +After that, you might wonder how to "prove your new implementation correct." +After all, what does "correct" even mean? +Often we're using a good approximation of correct, such as the equivalence of two implementations (often one much "simpler" than the other somehow). +Here's one possible asssertion to make that obvious: + +```rust +assert!(result as u64 == (a as u64 + b as u64) / 2); +``` + +Since this implementation is just the original one, but cast to a wider unsigned integer type, it should have the same result but without overflowing. +When RMC tells us both of these methods yield the same exact result, that gives us additional confidence that we haven't overlooked something. + +
+ ## Future work RMC notably does not check the following: From 7d1a5fc1146e42deda592310d19d8bf5e8d79795 Mon Sep 17 00:00:00 2001 From: Ted Kaminski Date: Wed, 15 Sep 2021 14:27:01 +0000 Subject: [PATCH 3/4] typo --- rmc-docs/src/tutorial-kinds-of-failure.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/rmc-docs/src/tutorial-kinds-of-failure.md b/rmc-docs/src/tutorial-kinds-of-failure.md index 7a95e88fda3b..ff701f211645 100644 --- a/rmc-docs/src/tutorial-kinds-of-failure.md +++ b/rmc-docs/src/tutorial-kinds-of-failure.md @@ -168,7 +168,7 @@ Hence, the need to add an assumption that would make that impossible. After that, you might wonder how to "prove your new implementation correct." After all, what does "correct" even mean? Often we're using a good approximation of correct, such as the equivalence of two implementations (often one much "simpler" than the other somehow). -Here's one possible asssertion to make that obvious: +Here's one possible assertion to make that obvious: ```rust assert!(result as u64 == (a as u64 + b as u64) / 2); From c8ff27a868915534505208dd41336031343a4682 Mon Sep 17 00:00:00 2001 From: Ted Kaminski Date: Wed, 15 Sep 2021 16:15:54 +0000 Subject: [PATCH 4/4] word smithing --- rmc-docs/src/tutorial-kinds-of-failure.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/rmc-docs/src/tutorial-kinds-of-failure.md b/rmc-docs/src/tutorial-kinds-of-failure.md index ff701f211645..4dae879fd363 100644 --- a/rmc-docs/src/tutorial-kinds-of-failure.md +++ b/rmc-docs/src/tutorial-kinds-of-failure.md @@ -181,7 +181,7 @@ When RMC tells us both of these methods yield the same exact result, that gives ## Future work -RMC notably does not check the following: +RMC notably does not currently check the following: 1. Concurrency bugs, deadlocks, or data races. It's possible RMC may be extended in the future to find such issues.