Unsafe Rust Study
Tock uses unsafe Rust extensively. As an embedded operating system, it must interact directly with memory-mapped hardware registers, interrupt handlers, FFI boundaries, and more, so much of the unsafe code is fundamentally required. However, this is restricted to the core kernel code, since capsules (drivers) forbid unsafe Rust.
One of the most common unsafe patterns in Tock is the use of StaticRef, which wraps a raw address cast to a pointer to a memory-mapped register block.
unsafe { StaticRef::new(0x4000_5400 as *const I2CRegisters) };
Rust cannot verify that the address is valid, so this category of unsafe code cannot realistically be eliminated.
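As a rough illustration of why the constructor has to be unsafe, here is a minimal sketch of a StaticRef-like wrapper. RegRef, FakeRegisters, FAKE_DEVICE, and fake_device are hypothetical names made up for this example, not Tock's actual definitions, and an ordinary static stands in for a hardware address so the code can run on a host machine.

```rust
use core::ops::Deref;

/// Hypothetical, simplified stand-in for a StaticRef-like wrapper.
/// It stores a raw pointer and hands out shared references to it.
pub struct RegRef<T> {
    ptr: *const T,
}

impl<T> RegRef<T> {
    /// # Safety
    /// `ptr` must be non-null, aligned, and point to a `T` that stays
    /// valid for the rest of the program. The compiler cannot check this.
    pub const unsafe fn new(ptr: *const T) -> Self {
        RegRef { ptr }
    }
}

impl<T> Deref for RegRef<T> {
    type Target = T;

    fn deref(&self) -> &T {
        // SAFETY: validity of `ptr` was promised by the caller of `new`.
        unsafe { &*self.ptr }
    }
}

/// Hypothetical register layout. Real hardware registers would also need
/// volatile reads and writes, which this sketch leaves out.
#[repr(C)]
pub struct FakeRegisters {
    pub control: u32,
    pub status: u32,
}

/// An ordinary static used in place of a real memory-mapped address.
static FAKE_DEVICE: FakeRegisters = FakeRegisters {
    control: 0x1,
    status: 0x80,
};

pub fn fake_device() -> RegRef<FakeRegisters> {
    // SAFETY: the pointer comes from a static, so it is aligned and
    // valid for the whole program.
    unsafe { RegRef::new(&FAKE_DEVICE as *const FakeRegisters) }
}
```

The unsafe obligation sits entirely on the call to new: once the address has been vouched for, every later dereference is safe code.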
Tock also declares many unsafe extern "C" functions, which use the C ABI for FFI and for interrupt and exception handlers.
Example:
pub trait CortexMVariant {
const GENERIC_ISR: unsafe extern "C" fn();
const SYSTICK_HANDLER: unsafe extern "C" fn();
const SVC_HANDLER: unsafe extern "C" fn();
const HARD_FAULT_HANDLER: unsafe extern "C" fn();
}
These definitions are inherently unsafe and cannot realistically be made fully safe.
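The shape of such a trait can be tried out on a host machine with the sketch below. HandlerTable, DemoChip, count_tick, and simulate_interrupt are hypothetical names for illustration only; on real hardware the function pointer would be placed in a vector table and invoked by the CPU rather than called from Rust.

```rust
use core::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical trait with one handler slot, modelled on the idea of
/// associating C-ABI handler functions with a chip variant.
pub trait HandlerTable {
    const TICK_HANDLER: unsafe extern "C" fn();
}

/// Counts how many times the demo handler has run.
static TICKS: AtomicUsize = AtomicUsize::new(0);

/// Demo handler. It is marked unsafe because, on real hardware, it would
/// only be sound to run in the right processor state.
unsafe extern "C" fn count_tick() {
    TICKS.fetch_add(1, Ordering::Relaxed);
}

pub struct DemoChip;

impl HandlerTable for DemoChip {
    const TICK_HANDLER: unsafe extern "C" fn() = count_tick;
}

/// Calls the handler once and returns the tick count observed afterwards.
pub fn simulate_interrupt<V: HandlerTable>() -> usize {
    // SAFETY: the demo handler only touches an atomic counter, so calling
    // it from ordinary code is fine. A real handler would not be.
    unsafe { (V::TICK_HANDLER)() };
    TICKS.load(Ordering::Relaxed)
}
```

Because the handler's preconditions are about processor state rather than Rust values, the type system has nothing to check, which is why the unsafe marker stays.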
Tock also uses a lot of unsafe code around process memory management.
Example from kernel/src/process_standard.rs:
unsafe {
let (app_memory, kernel_memory) =
raw_slice_split_at_mut(allocated_padded_memory, app_memory_start_offset);
let (grant_ptrs_region, kernel_memory) =
raw_slice_split_at_mut(kernel_memory, grant_ptrs_size);
}
This code is unsafe because raw_slice_split_at_mut performs unchecked pointer arithmetic.
It might be possible to rewrite this with safer slice APIs, but doing so would introduce additional runtime checks and overhead during process creation.
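For comparison, a checked version of the same two-step split could look like the sketch below. It assumes the memory is available as an ordinary &mut [u8], which may not hold for Tock's process memory, and carve_regions and its parameter names are hypothetical.

```rust
/// Splits `memory` into (app, grant pointers, remaining kernel) regions.
/// Returns `None` instead of invoking undefined behaviour when a
/// requested length does not fit.
pub fn carve_regions(
    memory: &mut [u8],
    app_len: usize,
    grant_ptrs_len: usize,
) -> Option<(&mut [u8], &mut [u8], &mut [u8])> {
    // First runtime check: the app region must fit in the allocation.
    if app_len > memory.len() {
        return None;
    }
    let (app, kernel) = memory.split_at_mut(app_len);

    // Second runtime check: the grant pointer table must fit in the rest.
    if grant_ptrs_len > kernel.len() {
        return None;
    }
    let (grant_ptrs, kernel_rest) = kernel.split_at_mut(grant_ptrs_len);

    Some((app, grant_ptrs, kernel_rest))
}
```

The two length comparisons are the runtime checks mentioned above. split_at_mut also performs its own bounds check internally, so the explicit comparisons here only serve to turn a panic into a None.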
Tock contains more than 100 instances of inline assembly, for instance:
asm!(
"
mrs r0, ipsr
",
out("r0") interrupt_number,
options(nomem, nostack, preserves_flags),
);
This category is fundamentally unavoidable in an OS, and inline assembly can have arbitrary effects on the program or memory that the compiler cannot check.
Unlike most typical Rust applications, Tock requires much lower level access to the machine. As a result, a substantial amount of unsafe Rust is required. In addition, there are many other instances of unsafe code in the project that weren't covered here (the keyword unsafe appears over 2000 times in Rust source files!).
Some smaller unsafe patterns, such as certain uses of NonNull, MaybeUninit, and transmute, could be replaced with safe alternatives, but overall Tock fundamentally requires unsafe Rust as part of its design.
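To give a flavour of what such a small replacement can look like, the sketch below converts a u32 into its bytes once with transmute and once with the standard library's to_ne_bytes. This is a generic illustration, not code taken from Tock, and both function names are hypothetical.

```rust
/// Reinterprets a `u32` as four bytes using `transmute`.
pub fn bytes_via_transmute(word: u32) -> [u8; 4] {
    // SAFETY: `u32` and `[u8; 4]` have the same size, and every bit
    // pattern is valid for both types.
    unsafe { core::mem::transmute::<u32, [u8; 4]>(word) }
}

/// Same result in native byte order, with no unsafe code.
pub fn bytes_via_safe_api(word: u32) -> [u8; 4] {
    word.to_ne_bytes()
}
```

Both functions return the bytes in native endianness, so the safe version is a drop-in replacement in this particular case.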
In the DLR Wasm Interpreter, by contrast, unsafe Rust is almost always used either to mark a contract whose invariants the caller must check carefully or to perform a more efficient operation once those invariants are validated. So, almost all of its unsafe code can be replaced with a safe alternative, requiring only minor runtime or memory overhead. With some small tweaks, I was able to remove almost all unsafe code from this project.
Each unsafe block also has a safety comment explaining its usage or the invariant being relied upon, making tracking unsafe code across this repo much easier.
For example, look at the definition of IdxVec::get in src/core/indices.rs:
impl<I: Idx, T> IdxVec<I, T> {
pub unsafe fn get(&self, index: I) -> &T {
let index = index.into_inner().into_usize();
self.inner
.get(index)
.expect("this to be a valid index due to the safety guarantees made by the caller")
}
}
IdxVec is an indexable vector for WASM definitions. Its IdxVec::get method is marked unsafe because it assumes the caller has already validated the index.
The code currently uses the safe get and expect methods instead of the unsafe get_unchecked (a comment exists to change it to get_unchecked once all calls are validated), but IdxVec::get remains unsafe because the index validity guarantee is the caller's responsibility.
In src/validation/code.rs the parser first validates indices and then uses the unsafe calls:
let type_idx = *unsafe { c_funcs.get(func_idx) };
let func_ty = unsafe { fn_types.get(type_idx) };
...
let tab = unsafe { c_tables.get(table_idx) };
Those calls are accompanied by safety comments that explain the guarantee established by prior validation. This way, the caller is reminded to carefully check the invariants before calling the method, even though there is no unsafe code within IdxVec::get.
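The validate-then-call pattern can be reproduced in a few lines. In this sketch CheckedVec, is_valid, and lookup_validated are hypothetical names, and a plain usize replaces the project's typed indices.

```rust
/// Hypothetical stand-in for an index-validated vector.
pub struct CheckedVec<T> {
    inner: Vec<T>,
}

impl<T> CheckedVec<T> {
    pub fn new(inner: Vec<T>) -> Self {
        Self { inner }
    }

    /// The validation step that callers are expected to run first.
    pub fn is_valid(&self, index: usize) -> bool {
        index < self.inner.len()
    }

    /// # Safety
    /// The caller must have validated `index` for this vector. The body
    /// is safe code, so a broken promise panics rather than causing UB.
    pub unsafe fn get(&self, index: usize) -> &T {
        self.inner
            .get(index)
            .expect("index to have been validated by the caller")
    }
}

/// Validates the index first, then relies on that check at the call site.
pub fn lookup_validated(values: &CheckedVec<u32>, index: usize) -> Option<u32> {
    if !values.is_valid(index) {
        return None;
    }
    // SAFETY: `index` was validated against `values` just above.
    Some(*unsafe { values.get(index) })
}
```

Here unsafe acts purely as a contract marker: removing the keyword would not change behaviour, only remove the prompt to write a safety comment at each call site.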
Consider this example from src/execution/store/linear_memory.rs:
let dst = unsafe { lock_guard.get_unchecked(i + index) };
dst.store(byte, Ordering::Relaxed);
Since it's already known that i+index is a valid index into lock_guard (and thus our invariant is validated), the code uses the unsafe get_unchecked method to avoid repeated bounds checking.
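How such an invariant might be established before the loop is sketched below: the whole target range is checked once, after which each per-byte access skips its bounds check. The function store_bytes and the use of a plain slice of AtomicU8 are assumptions for illustration, not the project's actual code.

```rust
use core::sync::atomic::{AtomicU8, Ordering};

/// Writes `bytes` into `mem` starting at `index`.
/// Returns `false` without writing anything if the range does not fit.
pub fn store_bytes(mem: &[AtomicU8], index: usize, bytes: &[u8]) -> bool {
    // Validate the whole range once, guarding against overflow as well.
    match index.checked_add(bytes.len()) {
        Some(end) if end <= mem.len() => {}
        _ => return false,
    }

    for (i, &byte) in bytes.iter().enumerate() {
        // SAFETY: `i < bytes.len()` and `index + bytes.len() <= mem.len()`
        // was checked above, so `i + index` is in bounds.
        let dst = unsafe { mem.get_unchecked(i + index) };
        dst.store(byte, Ordering::Relaxed);
    }
    true
}
```

If the range check were wrong or missing, the get_unchecked call would be undefined behaviour rather than a panic, which is the risk the unsafe block makes visible.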
The get_unchecked call can be replaced with the safe alternative:
let dst = lock_guard.get(i + index).expect("lock guard index should be valid");
Another example of this appears in src/execution/value_stack.rs:
Rust's core::mem::MaybeUninit is used to avoid initializing fields that are invalid for the base call frame.
pub(crate) struct CallFrame {
pub return_func_addr: MaybeUninit<FuncAddr>,
pub return_addr: MaybeUninit<usize>,
pub return_stp: MaybeUninit<usize>,
}
Instead, this can be rewritten safely with Option<T>:
pub(crate) struct CallFrame {
pub return_func_addr: Option<FuncAddr>,
pub return_addr: Option<usize>,
pub return_stp: Option<usize>,
}
In the original MaybeUninit version, the invariant that only non-base frames contain initialized return metadata is validated before calling the unsafe assume_init method:
if self.call_frame_count() == 0 {
None
} else {
unsafe {
Some((
return_func_addr.assume_init(),
return_addr.assume_init(),
return_stp.assume_init(),
))
}
}
Although the unsafe code is sound because the invariant is checked, with Option fields the assume_init calls can be replaced with safe expect calls:
if self.call_frame_count() == 0 {
None
} else {
Some((
return_func_addr.expect("return function address should be valid"),
return_addr.expect("return address should be valid"),
return_stp.expect("return stack pointer should be valid"),
))
}
A small amount of additional memory and branching overhead is required in exchange for eliminating MaybeUninit-based invariants from the stack frame logic, and those fields still do not have to be given meaningful values for the base frame. Another option would be to store them as T instead of Option<T> and give the base frame default values.
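A self-contained version of the Option-based frame is sketched below. FuncAddr is aliased to usize purely as a stand-in, and the constructors base and nested and the accessor return_info are hypothetical helpers, not the project's API.

```rust
/// Hypothetical stand-in for the interpreter's function address type.
pub type FuncAddr = usize;

pub struct CallFrame {
    pub return_func_addr: Option<FuncAddr>,
    pub return_addr: Option<usize>,
    pub return_stp: Option<usize>,
}

impl CallFrame {
    /// The base frame has nowhere to return to, so every field is `None`.
    pub fn base() -> Self {
        CallFrame {
            return_func_addr: None,
            return_addr: None,
            return_stp: None,
        }
    }

    /// A nested frame always carries complete return metadata.
    pub fn nested(func: FuncAddr, addr: usize, stp: usize) -> Self {
        CallFrame {
            return_func_addr: Some(func),
            return_addr: Some(addr),
            return_stp: Some(stp),
        }
    }

    /// Returns the metadata if all three fields are present.
    /// Unlike `expect`, this propagates absence as `None`.
    pub fn return_info(&self) -> Option<(FuncAddr, usize, usize)> {
        Some((self.return_func_addr?, self.return_addr?, self.return_stp?))
    }
}
```

The memory cost is concrete: usize has no spare bit pattern for None, so Option<usize> is larger than MaybeUninit<usize>. Grouping the three fields into one struct behind a single Option would reduce that to one discriminant, at the price of a slightly different frame layout.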
Throughout the entire codebase, the only unsafe code that cannot be eliminated is the spinlock in src/core/rw_spinlock.rs:
impl<T> Deref for WriteLockGuard<'_, T> {
type Target = T;
fn deref(&self) -> &T {
unsafe { &*self.lock.inner.get() }
}
}
Here, since the spinlock must hand out mutable access to data shared across multiple threads, it cannot rely on Rust's compile-time guarantee of exclusive mutability and has to enforce exclusivity dynamically instead. So, implementing the lock itself cannot be done with safe code alone.
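To show where the unsafe code inevitably ends up, here is a minimal write-only spinlock sketch. It is not the project's rw_spinlock implementation: SpinLock and WriteGuard are hypothetical, there is no reader support, and the Send + Sync bound on T is a deliberately conservative choice.

```rust
use core::cell::UnsafeCell;
use core::hint::spin_loop;
use core::ops::{Deref, DerefMut};
use core::sync::atomic::{AtomicBool, Ordering};

pub struct SpinLock<T> {
    locked: AtomicBool,
    inner: UnsafeCell<T>,
}

// SAFETY: access to `inner` is only possible through a guard, and the
// `locked` flag ensures at most one guard exists at a time.
unsafe impl<T: Send + Sync> Sync for SpinLock<T> {}

impl<T> SpinLock<T> {
    pub const fn new(value: T) -> Self {
        SpinLock {
            locked: AtomicBool::new(false),
            inner: UnsafeCell::new(value),
        }
    }

    /// Spins until the lock is free, then returns an exclusive guard.
    pub fn write(&self) -> WriteGuard<'_, T> {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            spin_loop();
        }
        WriteGuard { lock: self }
    }
}

pub struct WriteGuard<'a, T> {
    lock: &'a SpinLock<T>,
}

impl<T> Deref for WriteGuard<'_, T> {
    type Target = T;

    fn deref(&self) -> &T {
        // SAFETY: holding the guard means the lock flag is set by us.
        unsafe { &*self.lock.inner.get() }
    }
}

impl<T> DerefMut for WriteGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        // SAFETY: as above, and `&mut self` rules out aliasing via the guard.
        unsafe { &mut *self.lock.inner.get() }
    }
}

impl<T> Drop for WriteGuard<'_, T> {
    fn drop(&mut self) {
        // Release the lock so the next spinner can acquire it.
        self.lock.locked.store(false, Ordering::Release);
    }
}
```

The exclusivity that the borrow checker would normally prove is enforced here by the atomic flag at runtime, and the two dereferences of the UnsafeCell pointer are the places where the programmer, not the compiler, vouches for it.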
The DLR Wasm Interpreter project uses unsafe Rust in a controlled way. Unsafe blocks are either:
- a safe contract boundary around an invariant (often index validity), or
- an unsafe faster operation after an invariant has already been checked (although improper validation of the invariant can cause UB or crashes).
Both kinds can be replaced with safe alternatives for little to no overhead.
The only exception to this rule is code within src/core/rw_spinlock.rs, which controls the lock primitive.