Audit · memchr@2.8.0

cargo : memchr @ 2.8.0

Patrick Elsen · signed 2026-05-28 · published 2026-05-28

Claims

algorithm-impl-bounds, algorithm-impl-correct, algorithm-impl-safe, algorithm-impl-tested, has-binaries, has-build-exec, has-fuzz-tests, has-install-exec, has-integration-tests, has-property-tests, has-unit-tests, impl-algorithm, impl-concurrency, impl-crypto, impl-datastructure, impl-interpreter, impl-jit, impl-parser, impl-protocol, is-benign, unsafe-documented, unsafe-minimal, unsafe-safe, unsafe-tested, uses-concurrency, uses-crypto, uses-environment, uses-exec, uses-filesystem, uses-interpreter, uses-jit, uses-network, uses-unsafe

Summary

memchr 2.8.0 implements SIMD-accelerated byte search (1/2/3 needles, forward/reverse) and substring search for arbitrary byte slices. The 325 unsafe blocks are confined to SIMD intrinsics, pointer-walking loops, and CPU dispatch; each carries a SAFETY comment and the AVX2 path is gated behind a runtime feature check. Quickcheck property tests and upstream cargo-fuzz targets cover all search variants. No findings.

Report

Subject

memchr 2.8.0, authored by Andrew Gallant (BurntSushi), implements SIMD-accelerated byte search and substring search routines for arbitrary byte slices. The public API exposes memchr/memrchr (1-, 2-, 3-needle byte search, forward and reverse), plus a memmem submodule for substring search with Finder and FinderRev types. The crate is no_std compatible, supports SSE2, AVX2 (x86_64), NEON (aarch64), and wasm-simd128, and falls back to a word-at-a-time scalar implementation on unsupported targets.
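To illustrate what a word-at-a-time scalar fallback looks like, the following is a minimal sketch of the well-known SWAR "has zero byte" technique. It is not the crate's code: the names `word_contains` and `memchr_swar` are hypothetical, and the sketch uses safe slice indexing rather than the raw pointers the crate works with.

```rust
use std::mem::size_of;

/// Returns true if any byte of `word` equals `needle`.
/// Uses the classic SWAR zero-byte test; the test is exact for
/// "does a matching byte exist", which is all this sketch needs.
fn word_contains(word: usize, needle: u8) -> bool {
    const LO: usize = usize::MAX / 255; // 0x0101...01
    const HI: usize = LO << 7; // 0x8080...80
    // XOR with the needle repeated in every byte: matching bytes become 0.
    let x = word ^ (LO * needle as usize);
    (x.wrapping_sub(LO) & !x & HI) != 0
}

/// Forward byte search that examines one machine word per iteration.
/// Hypothetical helper for illustration only.
fn memchr_swar(needle: u8, haystack: &[u8]) -> Option<usize> {
    const W: usize = size_of::<usize>();
    let mut i = 0;
    while i + W <= haystack.len() {
        let mut buf = [0u8; W];
        buf.copy_from_slice(&haystack[i..i + W]);
        if word_contains(usize::from_ne_bytes(buf), needle) {
            break; // a match lies somewhere inside this word
        }
        i += W;
    }
    // Byte-at-a-time scan of the matching word, or of the short tail.
    haystack[i..].iter().position(|&b| b == needle).map(|p| i + p)
}
```

Calling `memchr_swar(b'w', b"hello world")` returns `Some(6)`. Byte order does not matter here because the word test only reports whether a match exists; the exact position is recovered by the final scalar scan.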

Methodology

The published crate contents were compared against the upstream Git repository at the commit recorded in .cargo_vcs_info.json using diff -rq. Source files were read in full. Tools used: diff (macOS 15.5), grep (BSD grep), and openvet 0.6.0. The 15,791 lines of Rust source across 45 files were reviewed, with particular focus on the 325 unsafe blocks concentrated in src/arch/generic/memchr.rs, src/arch/generic/packedpair.rs, src/vector.rs, src/ext.rs, and all architecture-specific facades. The unsafe_ifunc! macro and runtime CPU dispatch mechanism were read in full. Fuzz infrastructure lives in vcs/fuzz/ (excluded from the published crate as expected via exclude in Cargo.toml).

Results

The diff shows only expected differences: Cargo.toml normalization, .cargo_vcs_info.json added, Cargo.lock added (library crate with workspace lock present), and the directories listed under exclude (/.github, /benchmarks, /fuzz, /scripts) absent from the published package. No source file differs in content between contents/ and vcs/. The crate contains no binary artifacts (has-binaries), no build.rs (has-build-exec), and no install hooks (has-install-exec).

The crate is no_std and does not touch the network (uses-network), filesystem (uses-filesystem), environment variables (uses-environment), process execution (uses-exec), or cryptography (uses-crypto), and it implements no cryptographic primitives (impl-crypto). It does not use or implement a JIT compiler (uses-jit, impl-jit) or interpreter (uses-interpreter, impl-interpreter), and it does not implement a parser (impl-parser), protocol (impl-protocol), or standalone data structure (impl-datastructure). The crate has no integration tests outside of the src/tests inline module (has-integration-tests). The codebase was reviewed for obfuscated code, network endpoints, and suspicious payloads; none were found (is-benign). It does not implement concurrency primitives (impl-concurrency). It does use an AtomicPtr to cache a function pointer in the x86_64 dispatch macro, but this is a lock-free load and store of a single pointer and is not a concurrency abstraction exposed to users. The crate does not spawn threads or use an async runtime, so uses-concurrency is false. The unsafe impl Send/Sync for Iter is correctly justified in a SAFETY comment at src/arch/generic/memchr.rs:1013-1022.

The dominant use of unsafe (uses-unsafe) is the SIMD search loops in src/arch/generic/memchr.rs and src/arch/generic/packedpair.rs, the platform Vector trait implementations in src/vector.rs, the pointer arithmetic utilities in src/ext.rs, and the CPU dispatch macro in src/arch/x86_64/memchr.rs. Every unsafe fn and unsafe block carries a // SAFETY: comment documenting the invariants it relies on (unsafe-documented). The unsafe surface is minimal: intrinsics and pointer arithmetic are the only use, and both are strictly necessary for SIMD (unsafe-minimal).

The runtime CPU feature detection in src/arch/x86_64/avx2/memchr.rs goes through is_available(), which calls std::is_x86_feature_detected!("avx2") when std is enabled and otherwise falls back to compile-time #[cfg(target_feature)]; the AVX2 implementation is dispatched to only after this check passes. The NEON and wasm-simd128 paths use only compile-time #[cfg(target_arch)] and #[cfg(target_feature)] gates. This is sufficient because code behind such a gate is compiled only when the feature is enabled for the whole build, so no runtime check is needed. Alignment and bounds invariants in the aligned-load loops are asserted with debug_assert! in debug builds, and in all builds the loop condition cur <= end.sub(LOOP_SIZE) keeps every load in bounds. The tail handling with an overlapping unaligned load is correct because the load stays within the haystack and the bytes it re-reads were already covered by a preceding load that found no match (unsafe-safe).
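The loop-bound and overlapping-tail argument can be illustrated with a safe, index-based sketch. This is not the crate's code: `CHUNK` stands in for V::BYTES, `find_in_chunk` stands in for one vector compare, and both names (as well as `find_chunked`) are hypothetical.

```rust
const CHUNK: usize = 16; // stand-in for V::BYTES

/// Stand-in for a single vector compare: offset of the first match in a chunk.
fn find_in_chunk(chunk: &[u8], needle: u8) -> Option<usize> {
    debug_assert_eq!(chunk.len(), CHUNK);
    chunk.iter().position(|&b| b == needle)
}

/// Chunked forward search with an overlapping final load.
fn find_chunked(needle: u8, haystack: &[u8]) -> Option<usize> {
    if haystack.len() < CHUNK {
        // Too short for a full chunk: plain scalar scan.
        return haystack.iter().position(|&b| b == needle);
    }
    let mut cur = 0;
    // `cur + CHUNK <= len` plays the role of `cur <= end.sub(CHUNK)`:
    // every chunk read here lies fully inside the haystack.
    while cur + CHUNK <= haystack.len() {
        if let Some(i) = find_in_chunk(&haystack[cur..cur + CHUNK], needle) {
            return Some(cur + i);
        }
        cur += CHUNK;
    }
    if cur < haystack.len() {
        // Fewer than CHUNK bytes remain. Re-read the last CHUNK bytes instead
        // of scanning byte by byte. Bytes in `last..cur` were already checked
        // and did not match, so any hit reported here is at index >= cur.
        let last = haystack.len() - CHUNK;
        return find_in_chunk(&haystack[last..], needle).map(|i| last + i);
    }
    None
}
```

The design point is that the final read never extends past the end of the haystack; it moves backwards into territory that has already been ruled out, which is why the first match it reports is still the first match overall.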

The crate implements byte-search and substring-search algorithms (impl-algorithm), including: a classic SIMD vector search algorithm, the Two-Way string matching algorithm (O(n) search, O(m) preprocessing, O(1) space), Rabin-Karp, a Shift-Or implementation, and a "packed pair" heuristic prefilter. The algorithm-level correctness (algorithm-impl-correct) is supported by a quickcheck property test suite that cross-checks every architecture-specific implementation against a naive reference for random inputs. The VCS repository contains 8 cargo-fuzz targets covering all public-facing routines (has-fuzz-tests, has-property-tests, algorithm-impl-tested). Linear worst-case search time is guaranteed by the Two-Way fallback path (algorithm-impl-bounds). Memory safety of the algorithms follows from the unsafe-block analysis above (algorithm-impl-safe).
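As a reminder of how one of the listed algorithms works, here is a minimal Rabin-Karp sketch with a rolling hash and candidate verification. It is an illustration under stated assumptions, not the crate's implementation: the base-2 wrapping hash and the function name `rabin_karp` are choices made for this sketch.

```rust
/// Substring search with a rolling hash (Rabin-Karp).
/// Hash of a window s[0..m] is sum(s[k] * 2^(m-1-k)) with wrapping u32 math.
fn rabin_karp(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    fn hash(s: &[u8]) -> u32 {
        s.iter()
            .fold(0u32, |h, &b| h.wrapping_mul(2).wrapping_add(b as u32))
    }

    let (n, m) = (haystack.len(), needle.len());
    if m == 0 {
        return Some(0);
    }
    if m > n {
        return None;
    }
    // Weight of the oldest byte in a window: 2^(m-1), wrapping.
    let mut pow = 1u32;
    for _ in 1..m {
        pow = pow.wrapping_mul(2);
    }
    let target = hash(needle);
    let mut h = hash(&haystack[..m]);
    let mut i = 0;
    loop {
        // Equal hashes are only a hint, so confirm with a real comparison.
        if h == target && &haystack[i..i + m] == needle {
            return Some(i);
        }
        if i + m >= n {
            return None;
        }
        // Roll the window: drop haystack[i], append haystack[i + m].
        h = h
            .wrapping_sub(pow.wrapping_mul(haystack[i] as u32))
            .wrapping_mul(2)
            .wrapping_add(haystack[i + m] as u32);
        i += 1;
    }
}
```

Because hash collisions force a verification step, Rabin-Karp alone does not give a linear worst case, which is consistent with the report attributing the worst-case bound to the Two-Way path.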

No findings were recorded.

Conclusion

The crate's unsafe code is concentrated at the SIMD intrinsic boundary and is uniformly documented with SAFETY comments. The CPU feature gating is correct: AVX2 code is dispatched only after a runtime or compile-time availability check. Pointer arithmetic in the search loops is bounded by loop conditions, with tail cases handled by overlapping loads rather than out-of-bounds accesses. The test infrastructure includes a quickcheck property suite applied to every architecture-specific implementation and a cargo-fuzz harness in the upstream repository. No issues were found across any class.

Findings

No findings.

Annotations (4)

`src/arch/generic/memchr.rs`

The One::find_raw, Two::find_raw, and Three::find_raw functions (and their rfind_raw counterparts) in src/arch/generic/memchr.rs are the load-bearing unsafe SIMD search loops. The bounds arguments to all of these functions are documented with a detailed SAFETY contract: start < end, a distance of at least V::BYTES between them, and both pointers valid, initialized, in the same allocation, with no address-space wraparound. Internally, aligned loads are gated by debug_assert_eq!(0, cur.as_usize() % V::BYTES). The tail of the haystack is handled by an overlapping unaligned load rather than a byte-at-a-time loop, which is correct because if the preceding load found no match, any bytes in the overlap region are known not to match. The pointer arithmetic (start.add(V::BYTES - (start.as_usize() & V::ALIGN))) correctly computes the next aligned address. The unsafe impl Send/Sync for Iter is justified by a SAFETY comment noting that there are no interior mutations and that the haystack pointer is bounded by a lifetime. Justifies unsafe-safe and unsafe-documented.
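The alignment expression can be checked on plain integers. The sketch below assumes V::BYTES is 16 and that V::ALIGN equals V::BYTES - 1 (a mask of the low address bits); both constants and the function name `next_aligned` are assumptions made for illustration.

```rust
const BYTES: usize = 16; // assumed vector width, stand-in for V::BYTES
const ALIGN: usize = BYTES - 1; // assumed low-bit mask, stand-in for V::ALIGN

/// Integer version of `start.add(V::BYTES - (start.as_usize() & V::ALIGN))`.
/// The result is the first BYTES-aligned address strictly greater than `addr`.
fn next_aligned(addr: usize) -> usize {
    addr + (BYTES - (addr & ALIGN))
}
```

Under these assumptions, `next_aligned(0x1003)` is `0x1010`, and an already aligned input such as `0x1010` advances by a full vector to `0x1020`. Skipping a full vector in the aligned case loses nothing as long as an initial unaligned load has already covered the first BYTES bytes.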

`src/arch/x86_64/memchr.rs`

`src/arch/x86_64/memchr.rs`, lines 58-159

macro_rules! unsafe_ifunc {
    (
        $memchrty:ident,
        $memchrfind:ident,
        $fnty:ty,
        $retty:ty,
        $hay_start:ident,
        $hay_end:ident,
        $($needle:ident),+
    ) => {{
        #![allow(unused_unsafe)]

        use core::sync::atomic::{AtomicPtr, Ordering};

        type Fn = *mut ();
        type RealFn = $fnty;
        static FN: AtomicPtr<()> = AtomicPtr::new(detect as Fn);

        #[cfg(target_feature = "sse2")]
        #[target_feature(enable = "sse2", enable = "avx2")]
        unsafe fn find_avx2(
            $($needle: u8),+,
            $hay_start: *const u8,
            $hay_end: *const u8,
        ) -> $retty {
            use crate::arch::x86_64::avx2::memchr::$memchrty;
            $memchrty::new_unchecked($($needle),+)
                .$memchrfind($hay_start, $hay_end)
        }

        #[cfg(target_feature = "sse2")]
        #[target_feature(enable = "sse2")]
        unsafe fn find_sse2(
            $($needle: u8),+,
            $hay_start: *const u8,
            $hay_end: *const u8,
        ) -> $retty {
            use crate::arch::x86_64::sse2::memchr::$memchrty;
            $memchrty::new_unchecked($($needle),+)
                .$memchrfind($hay_start, $hay_end)
        }

        unsafe fn find_fallback(
            $($needle: u8),+,
            $hay_start: *const u8,
            $hay_end: *const u8,
        ) -> $retty {
            use crate::arch::all::memchr::$memchrty;
            $memchrty::new($($needle),+).$memchrfind($hay_start, $hay_end)
        }

        unsafe fn detect(
            $($needle: u8),+,
            $hay_start: *const u8,
            $hay_end: *const u8,
        ) -> $retty {
            let fun = {
                #[cfg(not(target_feature = "sse2"))]
                {
                    debug!(
                        "no sse2 feature available, using fallback for {}",
                        stringify!($memchrty),
                    );
                    find_fallback as RealFn
                }
                #[cfg(target_feature = "sse2")]
                {
                    use crate::arch::x86_64::{sse2, avx2};
                    if avx2::memchr::$memchrty::is_available() {
                        debug!("chose AVX2 for {}", stringify!($memchrty));
                        find_avx2 as RealFn
                    } else if sse2::memchr::$memchrty::is_available() {
                        debug!("chose SSE2 for {}", stringify!($memchrty));
                        find_sse2 as RealFn
                    } else {
                        debug!("chose fallback for {}", stringify!($memchrty));
                        find_fallback as RealFn
                    }
                }
            };
            FN.store(fun as Fn, Ordering::Relaxed);
            // SAFETY: The only thing we need to uphold here is the
            // `#[target_feature]` requirements. Since we check is_available
            // above before using the corresponding implementation, we are
            // guaranteed to only call code that is supported on the current
            // CPU.
            fun($($needle),+, $hay_start, $hay_end)
        }

        // SAFETY: By virtue of the caller contract, RealFn is a function
        // pointer, which is always safe to transmute with a *mut (). Also,
        // since we use $memchrty::is_available, it is guaranteed to be safe
        // to call $memchrty::$memchrfind.
        unsafe {
            let fun = FN.load(Ordering::Relaxed);
            core::mem::transmute::<Fn, RealFn>(fun)(
                $($needle),+,
                $hay_start,
                $hay_end,
            )
        }
    }};

The unsafe_ifunc! macro in src/arch/x86_64/memchr.rs implements a function pointer dispatch mechanism that amortizes runtime CPU feature detection across calls. On first invocation, detect() checks avx2::$memchrty::is_available() and sse2::$memchrty::is_available() before selecting an implementation; the selected function pointer is stored in an AtomicPtr<()> with Ordering::Relaxed and subsequent calls transmute and invoke it directly. The is_available() functions use std::is_x86_feature_detected!("avx2") (gated on the std feature) or compile-time #[cfg(target_feature)] checks; AVX2 code is never called unless is_available() returned true. This justifies uses-unsafe in the dispatch layer.
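The dispatch pattern described here can be reduced to a small standalone sketch. This is a simplified illustration, not the macro itself: it uses safe function signatures over slices, the names `FindFn`, `find_fast`, `find_fallback`, `fast_path_available`, and `find` are hypothetical, and both implementations are scalar stand-ins rather than real SIMD code.

```rust
use std::sync::atomic::{AtomicPtr, Ordering};

type FindFn = fn(u8, &[u8]) -> Option<usize>;

// Starts out pointing at `detect`; replaced by the chosen implementation.
static FN: AtomicPtr<()> = AtomicPtr::new(detect as *mut ());

#[cfg(target_arch = "x86_64")]
fn fast_path_available() -> bool {
    std::is_x86_feature_detected!("avx2")
}

#[cfg(not(target_arch = "x86_64"))]
fn fast_path_available() -> bool {
    false
}

// Stand-in for the accelerated implementation (scalar in this sketch).
fn find_fast(needle: u8, hay: &[u8]) -> Option<usize> {
    hay.iter().position(|&b| b == needle)
}

// Stand-in for the portable fallback.
fn find_fallback(needle: u8, hay: &[u8]) -> Option<usize> {
    hay.iter().position(|&b| b == needle)
}

/// Runs on the first call: picks an implementation, caches it, then calls it.
fn detect(needle: u8, hay: &[u8]) -> Option<usize> {
    let fun: FindFn = if fast_path_available() {
        find_fast
    } else {
        find_fallback
    };
    FN.store(fun as *mut (), Ordering::Relaxed);
    fun(needle, hay)
}

/// Public entry point: load the cached pointer and call through it.
pub fn find(needle: u8, hay: &[u8]) -> Option<usize> {
    let ptr = FN.load(Ordering::Relaxed);
    // SAFETY: FN only ever holds pointers obtained from functions with the
    // `FindFn` signature (`detect`, `find_fast`, or `find_fallback`).
    let fun = unsafe { std::mem::transmute::<*mut (), FindFn>(ptr) };
    fun(needle, hay)
}
```

In this sketch, relaxed ordering is enough because the stored value is a pointer to static code and carries no other data that would need to be published. If two threads race on the first call, both run `detect` and store the same result.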

`src/tests`

The test module (src/tests/) contains 84 #[test] items and a comprehensive quickcheck-based property test suite (src/tests/memchr/prop.rs, src/tests/substring/prop.rs) that compares every vectorized implementation against a naive reference implementation across random inputs. The define_memchr_quickcheck! macro is applied to every architecture-specific module (aarch64/neon, x86_64/sse2, x86_64/avx2, wasm32/simd128, all/). The VCS repository also contains eight cargo-fuzz targets (fuzz/fuzz_targets/) covering all public-facing search routines. Justifies has-unit-tests, has-property-tests, has-fuzz-tests, and unsafe-tested.
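The idea behind comparing an optimized routine against a naive reference on random inputs can be sketched without any external crate. The example below is not taken from the crate's test suite: the small generator replaces quickcheck's input generation, and `candidate_memchr`, `naive_memchr`, and `check_against_naive` are hypothetical names.

```rust
/// Tiny deterministic generator (an LCG), standing in for quickcheck.
struct Lcg(u64);

impl Lcg {
    fn next_u64(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0 >> 33
    }
}

/// Reference implementation: obviously correct, not fast.
fn naive_memchr(needle: u8, hay: &[u8]) -> Option<usize> {
    hay.iter().position(|&b| b == needle)
}

/// Hypothetical implementation under test: scans in chunks of 8 bytes.
fn candidate_memchr(needle: u8, hay: &[u8]) -> Option<usize> {
    for (ci, chunk) in hay.chunks(8).enumerate() {
        if let Some(i) = chunk.iter().position(|&b| b == needle) {
            return Some(ci * 8 + i);
        }
    }
    None
}

/// Runs `cases` random comparisons; returns the first failing input, if any.
fn check_against_naive(cases: usize, seed: u64) -> Result<(), (u8, Vec<u8>)> {
    let mut rng = Lcg(seed);
    for _ in 0..cases {
        let len = (rng.next_u64() % 64) as usize;
        // A four-symbol alphabet makes matches frequent.
        let hay: Vec<u8> = (0..len).map(|_| (rng.next_u64() % 4) as u8).collect();
        // Needle value 4 never occurs, which exercises the no-match path.
        let needle = (rng.next_u64() % 5) as u8;
        if candidate_memchr(needle, &hay) != naive_memchr(needle, &hay) {
            return Err((needle, hay));
        }
    }
    Ok(())
}
```

Returning the failing input rather than panicking makes it easy to turn a counterexample into a regression test.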

`src/vector.rs`

The Vector and MoveMask traits in src/vector.rs abstract over SSE2 (__m128i), AVX2 (__m256i), NEON (uint8x16_t) and wasm-simd128 (v128) vector types. Each platform impl delegates directly to the corresponding compiler intrinsic. The NeonMoveMask variant encodes match positions in a 64-bit value with 4 bits per lane, and both first_offset and last_offset correctly divide the trailing/leading zero counts by 4. The aarch64 implementation of load_aligned falls back to an unaligned load because, as commented, ARM handles unaligned loads from aligned addresses at no extra cost; this is sound in any case, since an unaligned load places no alignment requirement on the pointer. All methods are marked unsafe and callers are required to uphold the target-feature precondition. Justifies unsafe-minimal: the unsafe surface is confined to the platform-intrinsic boundary and not extended beyond what is necessary.
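The divide-by-4 offset computation can be modeled on plain integers. The sketch below is a scalar model, not the NEON code: it assumes each of the 16 lanes owns one 4-bit group of a u64 and that a matching lane sets the top bit of its group. The function names are hypothetical.

```rust
/// Builds a 64-bit mask with 4 bits per lane; a matching lane sets the
/// highest bit of its 4-bit group (an assumption of this sketch).
fn neon_style_mask(lanes: [bool; 16]) -> u64 {
    let mut mask = 0u64;
    for (i, &hit) in lanes.iter().enumerate() {
        if hit {
            mask |= 0x8u64 << (4 * i);
        }
    }
    mask
}

/// Lane index of the first match. Requires `mask != 0`.
fn first_offset(mask: u64) -> usize {
    (mask.trailing_zeros() >> 2) as usize
}

/// Lane index of the last match. Requires `mask != 0`.
fn last_offset(mask: u64) -> usize {
    15 - (mask.leading_zeros() >> 2) as usize
}
```

For a match in lane i, the lowest set bit sits at position 4i + 3, so shifting the trailing-zero count right by 2 yields i. The same shift works if a matching lane sets all four bits of its group, since the trailing-zero count is then 4i.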