Audit · tokio@1.52.3

cargo : tokio @ 1.52.3

Patrick Elsen, signed 2026-05-28, published 2026-05-28

Claims

has-binaries, has-build-exec, has-fuzz-tests, has-install-exec, has-integration-tests, has-property-tests, has-unit-tests, impl-algorithm, impl-concurrency, impl-crypto, impl-datastructure, impl-interpreter, impl-jit, impl-parser, impl-protocol, is-benign, uses-concurrency, uses-crypto, uses-environment, uses-exec, uses-filesystem, uses-interpreter, uses-jit, uses-network, uses-unsafe

Summary

tokio 1.52.3, Rust's async runtime (work-stealing scheduler, mio reactor, timers, sync primitives). Scoped audit: contents are byte-equivalent to upstream git, there is no build/install-time execution, and all I/O is feature-gated and off by default. The work-stealing queue, task vtable, and refcount state machine were read in full and found to be sound and documented. No findings. An exhaustive unsafe review and full loom verification across the ~1,036 unsafe sites were not performed, so those claims are left unasserted.

Report

Subject

tokio 1.52.3 is the dominant asynchronous runtime for Rust. It implements a multi-threaded work-stealing scheduler and a current-thread scheduler, an mio-based reactor (the I/O driver), a hierarchical timer wheel, asynchronous TCP/UDP/Unix-socket and filesystem types, child-process management, OS-signal handling, and a full set of async synchronization primitives (Mutex, RwLock, Semaphore, mpsc/oneshot/broadcast/watch channels, Notify, Barrier). The public surface is large and feature-gated; default = [], so a bare dependency pulls in no I/O. It is the foundation most of the async Rust ecosystem builds on.

Methodology

Tools: openvet 0.6.0, ripgrep, diff, git, cargo metadata. The source is roughly 104K lines across 373 .rs files, with about 1,036 unsafe occurrences and 59 unsafe impl Send/Sync sites. There is no build.rs (build = false), no proc-macro lib, and no FFI extern block in contents/src; platform calls go through the optional libc/windows-sys dependencies.

VCS comparison: diff -rq contents vcs shows only the expected differences: Cargo.toml (cargo normalization), the publish-excluded CHANGELOG.md, and the included Cargo.lock. No source file diverges between the published contents and the git checkout (head fd9e8e7).

I read the highest-risk unsafe surfaces in full: the work-stealing run queue (runtime/scheduler/multi_thread/queue.rs), the RawTask/Vtable type-erasure layer (runtime/task/raw.rs), the packed lifecycle-plus-refcount state machine (runtime/task/state.rs), and the loom UnsafeCell abstraction (loom/std/unsafe_cell.rs). I surveyed the capability surface, the dependency graph, and benign-code indicators (no base64 blobs, no obfuscation, no hardcoded network endpoints, no telemetry).

Scope. Given the size (about 104K LOC, ~1,036 unsafe occurrences across 373 files), the following claims were not evaluated and are left unasserted; they must not be read as either satisfied or violated: unsafe-safe, unsafe-documented, unsafe-minimal, and concurrency-impl-correct. Exhaustive loom/TSan model-checking of every primitive is out of scope. This audit verifies supply-chain integrity (VCS byte-equivalence), the capability surface (uses-*), the absence of build/install-time execution, the implementation categorization (impl-*), dependency enumeration, and the correctness and documentation of the representative high-risk unsafe surfaces it read.

Results

The published contents are byte-equivalent to the upstream git tree apart from cargo normalization, supporting is-benign. No build script and no proc-macro library exist, so the crate executes nothing at build or install time, which settles has-build-exec and has-install-exec as false; it ships no compiled artifacts, so has-binaries is false as well.

The runtime does real I/O, all of it feature-gated and off by default: TCP/UDP/Unix sockets via mio/socket2 (uses-network), async filesystem operations (uses-filesystem), child-process spawning (uses-exec), and reading of tuning environment variables such as TOKIO_WORKER_THREADS (uses-environment). It pervasively uses unsafe and concurrency (uses-unsafe, uses-concurrency). It performs no cryptography, no JIT, and no interpretation, so uses-crypto, uses-jit, and uses-interpreter are false.

tokio implements the synchronization primitives, the schedulers, intrusive linked lists, the bounded work-stealing deque, and the timer wheel itself, so impl-concurrency, impl-datastructure, and impl-algorithm hold; it is not a parser, interpreter, JIT, cryptography library, or network-protocol implementation, so impl-parser, impl-interpreter, impl-jit, impl-crypto, and impl-protocol are false.

The unsafe I read is sound and documented. The run queue is a single-producer multi-consumer ring buffer that coordinates the producer and stealers through one packed AtomicU64 head (a steal and a real cursor) with a claim-before-touch protocol; unsync_load of tail is justified because the producer is its sole writer, and unsafe impl Send/Sync for Inner rests on that atomic coordination. The RawTask vtable uses the std RawWaker manual-vtable idiom with per-monomorphization field offsets. The reference count uses the same relaxed-increment, isize::MAX overflow-abort pattern as std Arc. Interior mutability routes through a loom-swappable UnsafeCell, which is how tokio model-checks this code under loom. These observations support uses-unsafe and impl-concurrency.

Test coverage is extensive: 221 #[test]/#[cfg(test)] annotations in src (has-unit-tests), 171 integration-test files under tests/ (has-integration-tests), and proptest property tests in the timer-wheel and interval code (has-property-tests). There is no in-tree fuzz harness in the published crate, so has-fuzz-tests is false.

No findings were recorded.

Conclusion

The audit read tokio 1.52.3's highest-risk unsafe surfaces (the work-stealing run queue, the task vtable and reference-counting state machine, and the loom interior-mutability layer) and found them sound and documented, matching established lock-free patterns. The published contents are byte-equivalent to the upstream git tree, there is no build- or install-time code execution, and all network, filesystem, process, and signal capability is optional and disabled by default. Per the scope note, an exhaustive unsafe-safe/unsafe-documented/unsafe-minimal review across all ~1,036 unsafe occurrences and full loom verification of every primitive were not performed, and those claims are left unasserted. No security, safety, correctness, or quality findings were recorded.

Findings

No findings.

Annotations (5)

`Cargo.toml`

Normalized manifest. build = false confirms there is no build script, and [lib] is an ordinary library (not proc-macro), so the crate runs no code at build or install time. This justifies the has-build-exec and has-install-exec verdicts. default = [] means all capability-bearing features (net, fs, process, signal, rt-multi-thread) are opt-in, and all network/fs/process/signal dependencies (mio, libc, socket2, signal-hook-registry, windows-sys) are optional and feature-gated. This justifies uses-network, uses-filesystem, and uses-exec.

`src/loom/std/unsafe_cell.rs`

`src/loom/std/unsafe_cell.rs`, line 1-18

#[derive(Debug)]
pub(crate) struct UnsafeCell<T>(std::cell::UnsafeCell<T>);

impl<T> UnsafeCell<T> {
    pub(crate) const fn new(data: T) -> UnsafeCell<T> {
        UnsafeCell(std::cell::UnsafeCell::new(data))
    }

    #[inline(always)]
    pub(crate) fn with<R>(&self, f: impl FnOnce(*const T) -> R) -> R {
        f(self.0.get())
    }

    #[inline(always)]
    pub(crate) fn with_mut<R>(&self, f: impl FnOnce(*mut T) -> R) -> R {
        f(self.0.get())
    }
}

The loom abstraction layer. In normal builds UnsafeCell is a thin newtype over std::cell::UnsafeCell whose with/with_mut hand a raw pointer to a closure; under cfg(all(test, loom)) it is replaced by loom's instrumented cell so the concurrency model checker can detect data races. Every internal data structure routes its interior mutability through this type, which is what lets tokio run its unsafe concurrency code under loom. Justifies uses-unsafe and uses-concurrency.
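To make the closure-based access pattern concrete, here is a minimal sketch of a caller routing slot reads and writes through with/with_mut. The UnsafeCell below is a std-only stand-in with the same shape as the excerpt; Slot, put, and take are hypothetical names for illustration, not tokio types.

```rust
use std::mem::MaybeUninit;

/// Stand-in with the same shape as the wrapper quoted above (std-only build).
pub struct UnsafeCell<T>(std::cell::UnsafeCell<T>);

impl<T> UnsafeCell<T> {
    pub const fn new(data: T) -> Self {
        UnsafeCell(std::cell::UnsafeCell::new(data))
    }

    /// Hands a read-only raw pointer to the closure.
    pub fn with<R>(&self, f: impl FnOnce(*const T) -> R) -> R {
        f(self.0.get())
    }

    /// Hands a mutable raw pointer to the closure.
    pub fn with_mut<R>(&self, f: impl FnOnce(*mut T) -> R) -> R {
        f(self.0.get())
    }
}

/// Hypothetical one-element slot (not a tokio type). It never drops its
/// contents on its own; the caller is expected to `take` what it `put`.
pub struct Slot<T> {
    cell: UnsafeCell<MaybeUninit<T>>,
}

impl<T> Slot<T> {
    pub fn new() -> Self {
        Slot { cell: UnsafeCell::new(MaybeUninit::uninit()) }
    }

    /// Safety: the caller must have exclusive access and the slot must be empty.
    pub unsafe fn put(&self, value: T) {
        // MaybeUninit<T> has the same layout as T, so the cast is valid.
        self.cell.with_mut(|ptr| unsafe { ptr.cast::<T>().write(value) });
    }

    /// Safety: the caller must have exclusive access and the slot must be full.
    pub unsafe fn take(&self) -> T {
        self.cell.with(|ptr| unsafe { ptr.cast::<T>().read() })
    }
}
```

Because every access passes through a closure, an instrumented cell can record when each read or write begins and ends, which a bare `get()` returning a pointer would not allow.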

`src/runtime/scheduler/multi_thread/queue.rs`

`src/runtime/scheduler/multi_thread/queue.rs`, line 41-186

    /// of stealing values. It represents the first value being stolen in the
    /// batch. The `UnsignedShort` indices are intentionally wider than strictly
    /// required for buffer indexing in order to provide ABA mitigation and make
    /// it possible to distinguish between full and empty buffers.
    ///
    /// When both `UnsignedShort` values are the same, there is no active
    /// stealer.
    ///
    /// Tracking an in-progress stealer prevents a wrapping scenario.
    head: AtomicUnsignedLong,

    /// Only updated by producer thread but read by many threads.
    tail: AtomicUnsignedShort,

    /// Elements
    buffer: Box<[UnsafeCell<MaybeUninit<task::Notified<T>>>; LOCAL_QUEUE_CAPACITY]>,
}

unsafe impl<T> Send for Inner<T> {}
unsafe impl<T> Sync for Inner<T> {}

#[cfg(not(loom))]
const LOCAL_QUEUE_CAPACITY: usize = 256;

// Shrink the size of the local queue when using loom. This shouldn't impact
// logic, but allows loom to test more edge cases in a reasonable a mount of
// time.
#[cfg(loom)]
const LOCAL_QUEUE_CAPACITY: usize = 4;

const MASK: usize = LOCAL_QUEUE_CAPACITY - 1;

// Constructing the fixed size array directly is very awkward. The only way to
// do it is to repeat `UnsafeCell::new(MaybeUninit::uninit())` 256 times, as
// the contents are not Copy. The trick with defining a const doesn't work for
// generic types.
fn make_fixed_size<T>(buffer: Box<[T]>) -> Box<[T; LOCAL_QUEUE_CAPACITY]> {
    assert_eq!(buffer.len(), LOCAL_QUEUE_CAPACITY);

    // safety: We check that the length is correct.
    unsafe { Box::from_raw(Box::into_raw(buffer).cast()) }
}

/// Create a new local run-queue
pub(crate) fn local<T: 'static>() -> (Steal<T>, Local<T>) {
    let mut buffer = Vec::with_capacity(LOCAL_QUEUE_CAPACITY);

    for _ in 0..LOCAL_QUEUE_CAPACITY {
        buffer.push(UnsafeCell::new(MaybeUninit::uninit()));
    }

    let inner = Arc::new(Inner {
        head: AtomicUnsignedLong::new(0),
        tail: AtomicUnsignedShort::new(0),
        buffer: make_fixed_size(buffer.into_boxed_slice()),
    });

    let local = Local {
        inner: inner.clone(),
    };

    let remote = Steal(inner);

    (remote, local)
}

impl<T> Local<T> {
    /// Returns the number of entries in the queue
    pub(crate) fn len(&self) -> usize {
        let (_, head) = unpack(self.inner.head.load(Acquire));
        // safety: this is the **only** thread that updates this cell.
        let tail = unsafe { self.inner.tail.unsync_load() };
        len(head, tail)
    }

    /// How many tasks can be pushed into the queue
    pub(crate) fn remaining_slots(&self) -> usize {
        let (steal, _) = unpack(self.inner.head.load(Acquire));
        // safety: this is the **only** thread that updates this cell.
        let tail = unsafe { self.inner.tail.unsync_load() };

        LOCAL_QUEUE_CAPACITY - len(steal, tail)
    }

    pub(crate) fn max_capacity(&self) -> usize {
        LOCAL_QUEUE_CAPACITY
    }

    /// Returns false if there are any entries in the queue
    ///
    /// Separate to `is_stealable` so that refactors of `is_stealable` to "protect"
    /// some tasks from stealing won't affect this
    pub(crate) fn has_tasks(&self) -> bool {
        self.len() != 0
    }

    /// Pushes a batch of tasks to the back of the queue. All tasks must fit in
    /// the local queue.
    ///
    /// # Panics
    ///
    /// The method panics if there is not enough capacity to fit in the queue.
    pub(crate) fn push_back(&mut self, tasks: impl ExactSizeIterator<Item = task::Notified<T>>) {
        let len = tasks.len();
        assert!(len <= LOCAL_QUEUE_CAPACITY);

        if len == 0 {
            // Nothing to do
            return;
        }

        let head = self.inner.head.load(Acquire);
        let (steal, _) = unpack(head);

        // safety: this is the **only** thread that updates this cell.
        let mut tail = unsafe { self.inner.tail.unsync_load() };

        if tail.wrapping_sub(steal) <= (LOCAL_QUEUE_CAPACITY - len) as UnsignedShort {
            // Yes, this if condition is structured a bit weird (first block
            // does nothing, second returns an error). It is this way to match
            // `push_back_or_overflow`.
        } else {
            panic!()
        }

        for task in tasks {
            let idx = tail as usize & MASK;

            self.inner.buffer[idx].with_mut(|ptr| {
                // Write the task to the slot
                //
                // Safety: There is only one producer and the above `if`
                // condition ensures we don't touch a cell if there is a
                // value, thus no consumer.
                unsafe {
                    ptr::write((*ptr).as_mut_ptr(), task);
                }
            });

            tail = tail.wrapping_add(1);
        }

        self.inner.tail.store(tail, Release);
    }

    /// Pushes a task to the back of the local queue, if there is not enough

The multi-thread work-stealing run queue: a bounded (256-slot) single-producer, multi-consumer ring buffer. The producer owns tail and reads it via unsync_load because it is the only writer (push_back, pop, steal_into, drop_tasks). Stealers and the producer coordinate through a single packed AtomicU64 head holding a steal cursor and a real cursor; a steal claims slots by advancing real while leaving steal behind, which blocks other stealers and lets the producer detect concurrent steals. Slot reads and writes through UnsafeCell are gated by this claim-before-touch protocol, so no slot is read while uninitialised or written while live. unsafe impl Send/Sync for Inner (lines 59-60) is sound because all cross-thread access goes through the atomic cursors. The Acquire/Release/AcqRel orderings pair tail stores with steal-side loads. Justifies uses-unsafe, uses-concurrency, impl-concurrency, impl-datastructure, impl-algorithm.
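The packed-head coordination described above can be sketched as follows. This is a simplified illustration under stated assumptions, not tokio's code: it fixes the cursors at u32 inside an AtomicU64, omits the buffer entirely, and the names try_claim and finish_claim and the max parameter are hypothetical.

```rust
use std::sync::atomic::{AtomicU64, Ordering::{AcqRel, Acquire}};

/// Split the packed head into its (steal, real) cursors.
pub fn unpack(n: u64) -> (u32, u32) {
    let real = n & u32::MAX as u64;
    let steal = n >> 32;
    (steal as u32, real as u32)
}

/// Join the two cursors back into one word.
pub fn pack(steal: u32, real: u32) -> u64 {
    (real as u64) | ((steal as u64) << 32)
}

/// Entries between a head cursor and the tail. Wrapping subtraction keeps
/// the result correct after the cursors overflow.
pub fn len(head: u32, tail: u32) -> usize {
    tail.wrapping_sub(head) as usize
}

/// Hypothetical stealer-side claim: advance `real` by up to `max` entries
/// while leaving `steal` behind. Returns the first claimed index and the
/// number of entries claimed, or None if nothing can be claimed.
pub fn try_claim(head: &AtomicU64, tail: u32, max: u32) -> Option<(u32, u32)> {
    let mut prev = head.load(Acquire);
    loop {
        let (steal, real) = unpack(prev);
        if steal != real {
            // Cursors differ: another stealer holds a claim.
            return None;
        }
        let n = tail.wrapping_sub(real).min(max);
        if n == 0 {
            return None;
        }
        let next = pack(steal, real.wrapping_add(n));
        match head.compare_exchange(prev, next, AcqRel, Acquire) {
            Ok(_) => return Some((real, n)),
            Err(actual) => prev = actual,
        }
    }
}

/// Hypothetical release step: once the claimed slots have been read, move
/// `steal` up to `real` so the producer may reuse them.
pub fn finish_claim(head: &AtomicU64) {
    let mut prev = head.load(Acquire);
    loop {
        let (_, real) = unpack(prev);
        match head.compare_exchange(prev, pack(real, real), AcqRel, Acquire) {
            Ok(_) => return,
            Err(actual) => prev = actual,
        }
    }
}
```

Between try_claim and finish_claim the two cursors differ, which is the signal that a steal is in flight; the producer's capacity check in the excerpt measures against steal, so it cannot overwrite slots that a stealer is still reading.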

`src/runtime/task/raw.rs`

`src/runtime/task/raw.rs`, line 47-139

    pub(super) trailer_offset: usize,

    /// The number of bytes that the `scheduler` field is offset from the header.
    pub(super) scheduler_offset: usize,

    /// The number of bytes that the `id` field is offset from the header.
    pub(super) id_offset: usize,

    /// The number of bytes that the `spawned_at` field is offset from the header.
    #[cfg(tokio_unstable)]
    pub(super) spawn_location_offset: usize,
}

/// Get the vtable for the requested `T` and `S` generics.
pub(super) fn vtable<T: Future, S: Schedule>() -> &'static Vtable {
    &Vtable {
        poll: poll::<T, S>,
        schedule: schedule::<S>,
        dealloc: dealloc::<T, S>,
        try_read_output: try_read_output::<T, S>,
        drop_join_handle_slow: drop_join_handle_slow::<T, S>,
        drop_abort_handle: drop_abort_handle::<T, S>,
        shutdown: shutdown::<T, S>,
        trailer_offset: OffsetHelper::<T, S>::TRAILER_OFFSET,
        scheduler_offset: OffsetHelper::<T, S>::SCHEDULER_OFFSET,
        id_offset: OffsetHelper::<T, S>::ID_OFFSET,
        #[cfg(tokio_unstable)]
        spawn_location_offset: OffsetHelper::<T, S>::SPAWN_LOCATION_OFFSET,
    }
}

/// Calling `get_trailer_offset` directly in vtable doesn't work because it
/// prevents the vtable from being promoted to a static reference.
///
/// See this thread for more info:
/// <https://users.rust-lang.org/t/custom-vtables-with-integers/78508>
struct OffsetHelper<T, S>(T, S);
impl<T: Future, S: Schedule> OffsetHelper<T, S> {
    // Pass `size_of`/`align_of` as arguments rather than calling them directly
    // inside `get_trailer_offset` because trait bounds on generic parameters
    // of const fn are unstable on our MSRV.
    const TRAILER_OFFSET: usize = get_trailer_offset(
        std::mem::size_of::<Header>(),
        std::mem::size_of::<Core<T, S>>(),
        std::mem::align_of::<Core<T, S>>(),
        std::mem::align_of::<Trailer>(),
    );

    // The `scheduler` is the first field of `Core`, so it has the same
    // offset as `Core`.
    const SCHEDULER_OFFSET: usize = get_core_offset(
        std::mem::size_of::<Header>(),
        std::mem::align_of::<Core<T, S>>(),
    );

    const ID_OFFSET: usize = get_id_offset(
        std::mem::size_of::<Header>(),
        std::mem::align_of::<Core<T, S>>(),
        std::mem::size_of::<S>(),
        std::mem::align_of::<Id>(),
    );

    #[cfg(tokio_unstable)]
    const SPAWN_LOCATION_OFFSET: usize = get_spawn_location_offset(
        std::mem::size_of::<Header>(),
        std::mem::align_of::<Core<T, S>>(),
        std::mem::size_of::<S>(),
        std::mem::align_of::<Id>(),
        std::mem::size_of::<Id>(),
        std::mem::align_of::<&'static Location<'static>>(),
    );
}

/// Compute the offset of the `Trailer` field in `Cell<T, S>` using the
/// `#[repr(C)]` algorithm.
///
/// Pseudo-code for the `#[repr(C)]` algorithm can be found here:
/// <https://doc.rust-lang.org/reference/type-layout.html#reprc-structs>
const fn get_trailer_offset(
    header_size: usize,
    core_size: usize,
    core_align: usize,
    trailer_align: usize,
) -> usize {
    let mut offset = header_size;

    let core_misalign = offset % core_align;
    if core_misalign > 0 {
        offset += core_align - core_misalign;
    }
    offset += core_size;

    let trailer_misalign = offset % trailer_align;

The task system erases each Future/Schedule pair behind a RawTask (a NonNull<Header>) plus a per-monomorphization Vtable. The vtable stores the byte offsets of the Trailer, scheduler, and Id fields (computed once in vtable()), and field access reads through ptr.byte_add(offset).cast(). This is the manual-vtable idiom std uses for RawWaker; the offsets are derived from the concrete Cell<T,S> layout so the casts target valid, live fields. The unsafe fn vtable entries carry mutual-exclusion or ownership preconditions documented on the wrapper methods (for example poll requires exclusive access). Justifies uses-unsafe and impl-concurrency.
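As an illustration of the offset technique, the sketch below recomputes a trailing field's offset with the #[repr(C)] layout rule and uses it to reach that field from a pointer to the first field. The Header, Core, Trailer, and Cell types here are hypothetical stand-ins with made-up fields, and align_up, third_field_offset, and trailer_from_header are names invented for the example.

```rust
use std::mem::{align_of, size_of};

/// Round `offset` up to the next multiple of `align`.
pub const fn align_up(offset: usize, align: usize) -> usize {
    let misalign = offset % align;
    if misalign > 0 {
        offset + (align - misalign)
    } else {
        offset
    }
}

/// Offset of the third field of a three-field #[repr(C)] struct, given the
/// sizes and alignments of its fields (the first field sits at offset 0).
pub const fn third_field_offset(
    first_size: usize,
    second_size: usize,
    second_align: usize,
    third_align: usize,
) -> usize {
    let second_offset = align_up(first_size, second_align);
    align_up(second_offset + second_size, third_align)
}

// Hypothetical stand-ins for the real task types.
#[repr(C)]
pub struct Header { pub state: usize, pub flag: u8 }
#[repr(C)]
pub struct Core { pub id: u64, pub stage: [u8; 3] }
#[repr(C)]
pub struct Trailer { pub waker: usize }
#[repr(C)]
pub struct Cell { pub header: Header, pub core: Core, pub trailer: Trailer }

/// Computed at compile time, as the vtable constants in the excerpt are.
pub const TRAILER_OFFSET: usize = third_field_offset(
    size_of::<Header>(),
    size_of::<Core>(),
    align_of::<Core>(),
    align_of::<Trailer>(),
);

/// Safety: `header` must point at the `header` field of a live `Cell` and
/// must have been derived from a pointer to the whole `Cell`.
pub unsafe fn trailer_from_header(header: *const Header) -> *const Trailer {
    unsafe { header.byte_add(TRAILER_OFFSET).cast::<Trailer>() }
}
```

The computed constant can be checked against the compiler's own answer with `std::mem::offset_of!(Cell, trailer)`, which is a cheap way to confirm that a hand-written layout calculation agrees with the real one.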

`src/runtime/task/state.rs`

`src/runtime/task/state.rs`, line 208-232


    /// Transitions the state to `NOTIFIED`.
    ///
    /// If no task needs to be submitted, a ref-count is consumed.
    ///
    /// If a task needs to be submitted, the ref-count is incremented for the
    /// new Notified.
    pub(super) fn transition_to_notified_by_val(&self) -> TransitionToNotifiedByVal {
        self.fetch_update_action(|mut snapshot| {
            let action;

            if snapshot.is_running() {
                // If the task is running, we mark it as notified, but we should
                // not submit anything as the thread currently running the
                // future is responsible for that.
                snapshot.set_notified();
                snapshot.ref_dec();

                // The thread that set the running bit also holds a ref-count.
                assert!(snapshot.ref_count() > 0);

                action = TransitionToNotifiedByVal::DoNothing;
            } else if snapshot.is_complete() || snapshot.is_notified() {
                // We do not need to submit any notifications, but we have to
                // decrement the ref-count.

Task lifecycle and reference counting are packed into one AtomicUsize: the low bits hold the RUNNING/NOTIFIED/COMPLETE/CANCELLED lifecycle flags, and the upper bits (shift 6) hold the reference count. State transitions use compare-exchange retry loops. ref_inc increments with Relaxed ordering (sound because a new reference can only derive from an existing one, the Boost-documented pattern) and aborts the process if the prior count exceeds isize::MAX, mirroring std Arc overflow protection. Justifies uses-concurrency and impl-concurrency.
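A minimal sketch of this packing, assuming a layout chosen for illustration: the flag values and shift below are assumptions of the sketch rather than a transcription of state.rs, and set_notified is a reduced, hypothetical transition that only sets one bit.

```rust
use std::sync::atomic::{AtomicUsize, Ordering::{AcqRel, Acquire, Relaxed}};

// Assumed flag positions for this sketch.
pub const RUNNING: usize = 0b0001;
pub const NOTIFIED: usize = 0b0100;
/// Assumed number of low bits reserved for flags.
pub const REF_COUNT_SHIFT: u32 = 6;
/// Adding this constant bumps the reference count by one.
pub const REF_ONE: usize = 1 << REF_COUNT_SHIFT;

pub struct State(AtomicUsize);

impl State {
    /// Starts with no flags set and one reference.
    pub fn new() -> Self {
        State(AtomicUsize::new(REF_ONE))
    }

    pub fn ref_count(&self) -> usize {
        self.0.load(Acquire) >> REF_COUNT_SHIFT
    }

    /// Relaxed is enough here: a new reference is always created from an
    /// existing one, so the count cannot concurrently reach zero.
    pub fn ref_inc(&self) {
        let prev = self.0.fetch_add(REF_ONE, Relaxed);
        if prev > isize::MAX as usize {
            // Same overflow guard as std's Arc.
            std::process::abort();
        }
    }

    /// Returns true if this call released the last reference.
    pub fn ref_dec(&self) -> bool {
        let prev = self.0.fetch_sub(REF_ONE, AcqRel);
        assert!(prev >> REF_COUNT_SHIFT >= 1);
        prev >> REF_COUNT_SHIFT == 1
    }

    /// Compare-exchange retry loop that sets NOTIFIED. Returns true if this
    /// call set the bit, false if it was already set.
    pub fn set_notified(&self) -> bool {
        let mut curr = self.0.load(Acquire);
        loop {
            if curr & NOTIFIED != 0 {
                return false;
            }
            match self.0.compare_exchange(curr, curr | NOTIFIED, AcqRel, Acquire) {
                Ok(_) => return true,
                Err(actual) => curr = actual,
            }
        }
    }
}
```

Keeping flags and count in one word means a single compare-exchange can change both atomically, which is what allows a transition such as the one in the excerpt to set NOTIFIED and drop a reference in one step.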
