Shuxian Wang pic
Senior Engineer, Antithesis

Formal methods for the unsafe side of the Force

Talks

In a recent blog post, Michael Gibson talked about the framework we set up to safely bridge between a single-threaded C++ codebase and a multi-threaded Rust component. This framework gives us an ergonomic way to interact with C++ types that are not fully thread-safe in multi-threaded async Rust.

In the implementation, we effectively reduce all the thread-safety arguments down to the existence of two special primitive types:

  1. SendWrapper<T>: Send allows you to wrap a T: !Send and smuggle it across threads.
  2. MainThreadToken is a witness of execution that exists solely on the main thread.

Both play some unusual tricks on the Send and Sync traits that are not expressible in safe Rust. But we can’t just switch to unsafe and gain unlimited power. We are now burdened with the obligation of showing no runtime abnormalities will occur:

Palpatine: The [strikethrough: dark] unsafe side of [strikethrough: the Force] Rust is a pathway to many abilities some consider to be unnatural… Anakin: is it possible to learn this power

By the end, we’ll formally prove the correctness of our primitives in the RustBelt logical framework.1 And along the way, we’ll gradually build up intuitions on the formal semantics of Rust types and what it really means for them to be Send and Sync.

Send & Sync, semi-formally

Let’s start by discussing the Send and Sync bounds a bit more formally to build up intuitions. Paraphrasing the Rustonomicon slightly:

  • A type T is Send if it is safe to expose exclusive access to T (as T or &mut T) across different threads.2
  • A type T is Sync if it is safe to expose shared access to T (as &T) across different threads.

What do we mean by “safe to expose exclusive/shared access”? Well, the end goal here is making T sound to use for external safe code if we add the Send/Sync bounds. This means that if we expose exclusive/shared access across different threads, and interleave that ability with any sequence of public safe operations we expose on T, we should still be unable to trigger any undefined behavior at run-time.

Thinking through all possible external interactions on T and checking each one against the big list of undefined behaviors is a daunting task. Fortunately, by formalizing and reasoning about invariants of T, we can reduce this task into smaller, more manageable steps:

  1. Identify some predicate P to serve as an invariant over T.
  2. Prove that P really is an invariant:
    1. Show that constructing a value of T establishes P. That is, for functions U -> T that map into T, try to prove P assuming U’s invariant is true.
    2. Dually, show that every safe operation you expose on T preserves the invariant P. That is, for functions T -> U that map out of T, you can now assume P and try to prove U’s invariant is true.
  3. Looking only at P, deduce whether P is logically compatible with cross-thread access. If it is, and since we’ve shown individual safe operations preserve P, we know any sequence of safe operations will still be compatible with cross-thread access.

Owning and sharing

We can use the method of the last section to formalize the semantics of Rust types. The most straightforward way of writing down invariants of a type T is to provide a formal model. In most languages, it means providing a predicate that decides what values are (and aren’t) valid inhabitants of T.

In Rust, however, with its ownership model and attention to low-level programming, modelling a type T instead requires you to demarcate which raw byte patterns in the memory of an executing thread can be considered an owned value of T. Formally, we say every Rust type T has an associated owning predicate to encode these semantics. In formal logic pseudo-code, similar to that used by RustBelt, we can write this as:

T.own(tid, val) := ...

This predicate specifies whether the thread tid can have the memory bytes val as a valid owned value of T. For example, the owning predicate of bool is true exactly when the memory value is an initialized single byte with either the bit pattern 0x00 or 0x01; and the owning predicate of u8 is true exactly when the memory value is an initialized single byte (and all bit patterns are valid). Expressed formally, this looks like:

bool.own(tid, val) := (val = [0x00]) ∨ (val = [0x01])
u8.own(tid, val) := val.length = 1

Interestingly, since Rust allows creation of shared borrows on types (&T), each type T also has an associated sharing predicate:

T.share('a, tid, val) := ...

This predicate decides whether the thread tid can have shared access for lifetime 'a to the memory bytes val as a value of T.

As it turns out, sharing predicates are related to, but not completely determined by the owning predicates. That means one has to specify both the owning and sharing predicates to completely model the semantics of a type. In fact, for example, Cell<u8> and u8 have the exact same owning predicates but very different sharing predicates. That difference allows you to use .set(new_val) to mutate through a &Cell<u8>, but not through a &u8.

Send & Sync, formally

The owning and sharing predicates on T help us capture exactly what it means to have exclusive access (T) and shared access (&T). Under this formalism, we have the following definition of whan a type is Send:

A type T is Send exactly when its owning predicate T.own(tid, val) is independent from the tid parameter.

This means that whether you can validly own a value of T does not depend on which thread you are on.

Symmetrically, for Sync we have:

A type T is Sync exactly when its sharing predicate T.share('a, tid, val) is independent from the thread parameter tid.3

For example, when T internally owns some U, usually you can only claim T: Send when U: Send. Since T.own(tid, _) likely refers to U.own(tid, _) in some way, T.own is thread-independent only if U.own is.

Similarly, when T internally owns some reference &U, you usually have T: Send only when U: Sync. Since T.own(tid, _) now refers to U.share(tid, _), and T.own is thread-independent only if U.share is.

Formally modelling SendWrapper<T>

Now let’s apply this modelling process to SendWrapper<T>. Here’s its public interface:

unsafe impl<T> Send for SendWrapper<T> {} // ??
unsafe impl<T> Sync for SendWrapper<T> {} // ??
pub struct SendWrapper<T>(T);

impl<T> SendWrapper<T> {
    pub fn new(val: T) -> Self;
    /// # Safety
    /// In general only safe to call from the same thread that originally constructed `self`.
    pub unsafe fn unwrap_unchecked(self) -> T;
}

impl<T: Sync> Deref for SendWrapper<T> {
    type Target = T;
    fn deref(&self) -> &T;
}

impl<T> Drop for SendWrapper<T> {
    fn drop(&mut self) {
        panic!("Hey, no dropping here!");
    }
}

First, what are the owning/sharing predicates of SendWrapper<T>? We know SendWrapper<T> has to, in some form, own a T, because we rely on that to expose accessor methods like unwrap_unchecked and the Deref impl. But we can’t directly say:

SendWrapper<T>.own(tid, val) := T.own(tid, val)

since we promise SendWrapper<T>: Send even when T: !Send. Instead, we introduce an existential to unbind the tid. We can’t claim the current thread owns T, but we can claim that there exists some thread that owns T. We do the same for sharing:

SendWrapper<T>.own(tid, val) := ∃ tid'. T.own(tid', val)
SendWrapper<T>.share('a, tid, val) := ∃ tid'. T.share('a, tid', val)

This results in somewhat curious semantics: owning a SendWrapper<T> on the current thread does not give you full rights to own T. Instead, you get a certificate that some other thread (maybe you, we don’t know) can fully own T. You are only allowed to extract the T from the wrapper if you can prove that it is valid for your thread to directly own the underlying inhabitant of T.4

After we decoupled the owning thread of SendWrapper<T> and that of T, the invariants we specify are trivially thread-independent. That’s why we can claim SendWrapper<T>: Send + Sync. What remains is to show they are compatible with the operations we expose:

  1. First, in Rust you can always obtain a shared reference from an owned value by borrowing it. That is, we need to start from SendWrapper<T>.own(tid, val) and derive SendWrapper<T>.share('a, tid, val). This can be done since knowing some thread can own T means that same thread can perform a borrow and satisfies T.share.
  2. new is the only public constructor, and is effectively the true implication T.own(tid, val) -> ∃ tid'. T.own(tid', val).
  3. unwrap_unchecked is effectively the implication ∃ tid'. T.own(tid', val) ∧ tid = tid' -> T.own(tid, val), with the tid = tid' condition provided by the safety requirement.
  4. The Deref implementation requires us to obtain a &T on the current thread when we only know some thread can borrow T. But remember we also require T to be Sync, and that means if some thread can have &T, all threads (including the current thread) can as well!
  5. Finally the Drop on SendWrapper<T> has to be banned by a run-time panic. Otherwise the default destructor will effectively unwrap the existential without checking and violates our invariants.

The real machine-checkable version of this formalism in RustBelt is done through a separation logic framework implemented in Rocq. There are some more technical details, but the overall idea remains the same. Expand to see the full proof for SendWrapper<T>:

Rocq formal proof

From lrust.typing Require Import typing lib.panic.

Section sw.

Context `{!typeGS Σ}.

-- The owning & sharing predicates of `SendWrapper<T>`
Program Definition send_wrapper (ty : type) :=
  {|
    ty_size := ty.(ty_size);
    ty_own tid vl := ( tid', ty.(ty_own) tid' vl);
    ty_shr κ tid l := ( tid', ty.(ty_shr) κ tid' l);
  |}%I.

  Next Obligation. iIntros (ty _ vl) "[% H]". iApply (ty.(ty_size_eq) with "H"). Qed.

  -- Goal #1
  Next Obligation.
    iIntros (ty E κ l _ q ?) "#LFT Hb Htok".
    iAssert (&{κ}( tid' vl, l ↦∗ vl ∗ ty.(ty_own) tid' vl))%I with "[Hb]" as "Hb".
    { iApply (bor_iff with "[] Hb"). iModIntro. iModIntro. iSplit.
      - iIntros "[% [$ [% $]]]".
      - iIntros "[% [% [$ $]]]". }
    iMod (bor_exists with "LFT Hb") as (tid') "Hb"; first done.
    iMod (ty.(ty_share) with "LFT Hb Htok") as "[$ $]"; done.
  Qed.

  Next Obligation.
    iIntros (ty k k' _ l) "#SUB [%tid H]".
    iExists tid. iApply (ty.(ty_shr_mono) with "SUB H").
  Qed.

Global Instance sw_wf ty `{!TyWf ty} : TyWf (send_wrapper ty) :=
{ ty_lfts := ty_lfts ty; ty_wf_E := ty_wf_E ty }.

-- Proving `SendWrapper<T>: Send`
Global Instance sw_send ty : Send (send_wrapper ty).
Proof. by unfold send_wrapper, Send. Qed.

-- Proving `SendWrapper<T>: Sync`
Global Instance sw_sync ty : Sync (send_wrapper ty).
Proof. by unfold send_wrapper, Sync. Qed.

End sw.

Section code.
  Context `{!typeGS Σ, !rcG Σ}.

  Definition sw_new ty `{!TyWf ty} : val :=
    fn: ["x"] := return: ["x"].

  -- Goal #2
  -- Proving `SendWrapper<T>::new : T -> SendWrapper<T>` is well-typed
  Lemma sw_new_type ty `{!TyWf ty} :
    typed_val (sw_new ty) (fn(∅; ty)  send_wrapper ty).
  Proof.
    iIntros (E L). iApply type_fn; [solve_typing..|]. iIntros "/= !>".
    iIntros ([] ϝ ret p). inv_vec p=>x. simpl_subst.
    iApply type_jump; [solve_typing..|].
    iIntros (tid qmax q2) "#LFT #HE $ HT !>".
    iApply type_incl_tctx_incl; try iFrame.
    iApply own_type_incl; try solve_typing.
    iSplit; try solve_typing.
    iSplit; iIntros "!> !> * $".
  Qed.

  Definition sw_deref ty `{!TyWf ty, !Sync ty} : val :=
    fn: ["sw"] := return: ["sw"].

  Lemma shr_incl A B α : (□  tid l, (ty_shr A α tid l -∗ ty_shr B α tid l))%I -∗ type_incl (&shr{α} A) (&shr{α} B).
  Proof.
    iIntros "#H". iSplit; try solve_typing. iSplit.
    - iIntros "!>" (tid [|[[]|][]]) "Ho //". iApply ("H" with "Ho").
    - iIntros "!>" (κ tid l) "[%vl [$ Hb]]".
      destruct vl as [|[[]|][]]; try iFrame. iApply ("H" with "Hb").
  Qed.

  -- Goal #4
  -- Proving `SendWrapper<T: Sync>::deref : &SendWrapper<T> -> &T` is well-typed
  Lemma sw_deref_type ty `{!TyWf ty, !Sync ty} :
    typed_val (sw_deref ty) (fn( α, ∅; &shr{α}(send_wrapper ty))  &shr{α}ty).
  Proof.
    iIntros (E L). iApply type_fn; [solve_typing..|]. iIntros "/= !>".
    iIntros (α ϝ ret p). inv_vec p=>sw. simpl_subst.
    iApply type_jump; [solve_typing..|].
    iIntros (tid qmax q2) "#LFT #HE $ HT !>".
    iApply type_incl_tctx_incl; try iFrame.
    iApply own_type_incl; try solve_typing.
    iApply shr_incl.
    iIntros "!> !>" (? ?) "[% H]".
    iApply sync_change_tid. iFrame.
  Qed.

  Definition sw_drop ty `{!TyWf ty} : val :=
    fn: ["sw"] :=
      let: "panic" := panic in
      letcall: "bot" := "panic" [] in
      case: !"bot" of [].

  -- Goal #5
  -- Proving `drop : SendWrapper<T> -> ()` is well-typed with panicking.
  Lemma sw_drop_type ty `{!TyWf ty} :
    typed_val (sw_drop ty) (fn(∅; (sendwrap ty))  unit).
  Proof.
    iIntros (E L). iApply type_fn; [solve_typing..|]. iIntros "/= !>".
    iIntros ([] ϝ ret p). inv_vec p=>x. simpl_subst.
    iApply type_let; [iApply panic_type|solve_typing|].
    iIntros (panic); simpl_subst.
    iApply (type_letcall ()); [solve_typing..|].
    iIntros (bot); simpl_subst.
    iApply type_case_own; solve_typing.
  Qed.

End code.

Formally modelling MainThreadToken

Now let’s apply the same modelling process to MainThreadToken. MainThreadToken has a smaller interface:

#[derive(Clone, Copy)]
pub struct MainThreadToken(PhantomData<*mut ()>);
// Owning a `*mut ()` here ensures !Send and !Sync

impl MainThreadToken {
    /// # Safety
    /// This function must be called from the main fuzzer thread.
    pub unsafe fn new() -> Self;
}

The owning predicate encodes the requirement that MainThreadToken is only supposed to live on the main thread, and MainThreadToken itself is merely a zero-sized type:

MainThreadToken.own(tid, val) := (tid = MAIN_THREAD_ID) ∧ (val = [])

For convenience, we also mark MainThreadToken as Copy, which allows us to freely duplicate them from a shared borrow &MainThreadToken. This requires the sharing predicate to also bear the “only living on the main thread” semantics:

MainThreadToken.share('a, tid, val) := (tid = MAIN_THREAD_ID)

This time those predicates’ truth value depends on the exact value of the current thread id tid, and that explains why we have to mark MainThreadToken: !Send + !Sync.

Expand to see the full Rocq proof for the safety of MainThreadToken:

Rocq formal proof

From lrust.typing Require Import typing.

Section thtk.

Context `{!typeGS Σ}.

Parameter (main_thread_id: thread_id).

-- The owning & sharing predicates of `MainThreadToken`
Program Definition main_thread_token :=
  {|
    ty_size := 0;
    ty_own tid vl := ⌜tid = main_thread_id⌝ ∗ ⌜vl = []⌝;
    ty_shr κ tid l := ⌜tid = main_thread_id⌝;
  |}%I.

  Next Obligation. iIntros (? ?) "[_ %]". by subst. Qed.

  Next Obligation.
    iIntros (E κ l tid q ?) "#LFT Hb Htok".
    iMod (bor_exists with "LFT Hb") as "[% Hb]"; first done.
    iMod (bor_sep with "LFT Hb") as "[_ Hb]"; first done.
    iMod (bor_sep with "LFT Hb") as "[Hb _]"; first done.
    iMod (bor_persistent with "LFT Hb Htok") as "[>$ $]"; done.
  Qed.

  Next Obligation. iIntros "**". done. Qed.

Global Instance main_thread_token_wf : TyWf main_thread_token :=
{ ty_lfts := [] ; ty_wf_E := [] }.

-- Proving `MainThreadToken: Copy`
Global Instance main_thread_token_copy : Copy main_thread_token.
Proof.
split; first by apply _.
iIntros (????????) "_ #Hshr Htok $".
iDestruct (na_own_acc with "Htok") as "[$ Htok]"; first solve_ndisj.
iExists 1%Qp. iModIntro. iSplitR.
{ iExists []. iSplit; first by rewrite heap_pointsto_vec_nil. iSplit; done. }
iIntros "Htok2 _". iApply "Htok". done.
Qed.

End thtk.

A few unnatural abilities

Phew, that’s quite a lot of sweat and tears for us to finally claim that SendWrapper and MainThreadToken are safe abstractions!

This wasn’t just about the formal proofs we got at the end, though. Thinking through how to model Send and Sync at this level of detail has given us a better understanding of thread safety in Rust.

One fact we came to appreciate along the way is that in safe Rust, the compiler is essentially doing all the proofs that we’re manually verifying here. A good strategy for working with unsafe interfaces, then, is to express that with as much safe code as you can and concentrate the dark side of the Force into a few unnatural abilities reduce the unsafeness to the existence of a few primitive types. This way, we can focus our energy more effectively on proving the correctness of a smaller target.


If you’ve made it through this second post as well, and are still thinking “what I don’t like about this post series is that it’s too short and doesn’t go into the weeds enough,”… well, that’s all we have for you for now, but we can point you in the direction of more good things.

If you want to learn even more about formal Rust semantics, the original RustBelt paper and Ralf Jung’s PhD thesis are great starting points. For formal verification of low-level system programming with concurrency, a general separation logic framework like Iris might be the current best tool.