cargo / memchr / audit
cargo : memchr @ 2.8.1
PE Patrick Elsen signed 2026-06-02 published 2026-06-02

Claims

algorithm-impl-boundsalgorithm-impl-correctalgorithm-impl-safealgorithm-impl-testedhas-binarieshas-build-exechas-fuzz-testshas-install-exechas-integration-testshas-property-testshas-unit-testsimpl-algorithmimpl-concurrencyimpl-cryptoimpl-datastructureimpl-interpreterimpl-jitimpl-parserimpl-protocolis-benignunsafe-documentedunsafe-minimalunsafe-safeunsafe-testeduses-concurrencyuses-cryptouses-environmentuses-execuses-filesystemuses-interpreteruses-jituses-networkuses-unsafe

Summary

memchr 2.8.1 is a #![no_std] library providing SIMD-accelerated byte and substring search (Two-Way, Rabin-Karp, Shift-Or, plus per-arch SSE2/AVX2/NEON/simd128). No build.rs, no proc macros, no I/O, two optional deps. The crate uses extensive unsafe for SIMD intrinsics and raw-pointer loops, but every boundary carries a # Safety block, quickcheck property tests check each implementation against a naive reference, and the upstream tree has miri configuration and 8 cargo-fuzz targets. No findings.

Report

Subject

memchr is a #![no_std] Rust library by Andrew Gallant (BurntSushi) and bluss that provides byte and substring search primitives optimised with SIMD. The crate exposes memchr, memchr2, memchr3 and their reverse counterparts, the corresponding iterator wrappers, and the memmem sub-module for forward/reverse substring search (Finder, FinderRev, find_iter, rfind_iter, find, rfind). Under the arch:: namespace it also re-exports the underlying per-architecture searchers (x86_64 SSE2 and AVX2, aarch64 NEON, wasm32 simd128) and the architecture-independent fall-back implementations (Two-Way, Rabin-Karp, Shift-Or, plus a "packedpair" SIMD prefilter). At construction time a meta-searcher selects an implementation based on the needle and runtime CPU features; for small inputs a Rabin-Karp fast-path is used. The crate has no build.rs, no proc macros, and two optional dependencies: log (for diagnostic trace! messages when the logging feature is enabled) and rustc-std-workspace-core (only when consumed from inside std itself via rustc-dep-of-std).

Methodology

Tooling used:

  • openvet audit new (0.6.0) to fetch and unpack the crate from crates.io and clone the upstream GitHub repository at the commit recorded in .cargo_vcs_info.json.
  • diff -r (Apple Darwin) to compare published crate contents against the upstream VCS checkout.
  • grep to scan contents/src/ for unsafe, extern "C", process::*, std::net::*, std::fs::*, env::*, std::thread, transmute, raw-pointer manipulation, allocator usage, and panic-prone calls.
  • Manual reading of the crate-level entry point (src/lib.rs), the Vector and MoveMask traits (src/vector.rs), the pointer-extension helpers (src/ext.rs), the meta-searcher dispatch logic (src/memmem/searcher.rs), the safety-documentation conventions in src/arch/generic/memchr.rs, and the prose headers of src/arch/all/twoway.rs, src/arch/all/rabinkarp.rs and src/arch/all/shiftor.rs. The four per-architecture vector implementations (~5 K LOC) were spot-checked but not read end to end.
  • Survey of the test suite under src/tests/ (~1300 LOC of quickcheck-based property tests and naive reference implementations) and of the upstream-only fuzz/ directory (8 cargo-fuzz targets covering every public byte- and substring-search entry point).

The published memchr-2.8.1.crate was diffed against the upstream repository at the commit pinned in .cargo_vcs_info.json. Every file under src/ matches the upstream tree byte-for-byte; differences are limited to cargo's Cargo.toml normalisation, the auto-generated Cargo.lock, and the upstream-only .github/, benchmarks/, fuzz/, scripts/ (excluded via the exclude list in Cargo.toml.orig).

Results

The diff between published contents and the upstream repository shows no unexpected changes. The crate contains no binary artefacts (justifying has-binaries) and no build.rs; Cargo.toml declares build = false and [lib] has no proc-macro = true. There is no install-time hook either, justifying has-build-exec and has-install-exec.

The codebase was reviewed for cryptography, process spawning, network I/O, filesystem I/O, environment-variable access, concurrency primitives, JIT, interpreters and parsing of external data formats. None was found, justifying uses-crypto, uses-exec, uses-network, uses-filesystem, uses-environment, uses-concurrency, uses-jit, uses-interpreter, and the corresponding implementation claims impl-crypto, impl-protocol, impl-interpreter, impl-jit, impl-parser, impl-datastructure, and impl-concurrency. The crate's purpose is implementing the Two-Way, Rabin-Karp, Shift-Or, and SIMD-accelerated byte/substring search algorithms; their published time and space bounds are documented in the source (twoway.rs references the Crochemore-Perrin 1991 paper; the substring find / rfind doc-comments document the O(n + m) worst-case guarantee), justifying impl-algorithm together with algorithm-impl-safe, algorithm-impl-correct, algorithm-impl-tested, and algorithm-impl-bounds.

unsafe is used pervasively (~340 occurrences across the crate) — this is unavoidable for the SIMD intrinsics and raw-pointer-based search loops the crate is built around. Every pub unsafe fn in the arch/ tree carries a # Safety block enumerating pointer validity, allocated-object provenance, isize-distance non-overflow and no-wrap requirements; the Vector, MoveMask and Pointer traits document their safety contracts at the trait level. The memmem::searcher::Searcher uses a hand-rolled union for dispatching between substring implementations, with the safety invariant (paired function pointer must read the populated variant) documented on the SearcherKindFn type and on every dispatch function. The unsafe portions are exercised by quickcheck property tests under src/tests/ that compare every searcher against a naive reference implementation, and the codebase carries #[cfg(miri)] gates indicating routine validation under miri; the upstream fuzz/ directory carries cargo-fuzz harnesses for every public entry point. Together this justifies uses-unsafe, unsafe-safe, unsafe-documented, unsafe-minimal, and unsafe-tested.

No malicious behaviour was identified, justifying is-benign. No findings were recorded.

Conclusion

memchr is a mature, widely-deployed library (used by regex, ripgrep, the Rust standard library's substring search, and many others) whose safety-critical sections are well-documented, well-tested and well-bounded. The crate's unsafe-heavy design is intrinsic to its goal — SIMD-accelerated byte search — and the safety contracts at every unsafe boundary are explicit. The test suite combines per-implementation property tests, a miri-aware configuration, and an upstream fuzz harness. The audit produced no findings.

Findings

No findings.

Annotations(7)

Cargo.toml

Cargo.toml, line 12-33

[package]
edition = "2021"
rust-version = "1.61"
name = "memchr"
version = "2.8.1"
authors = [
    "Andrew Gallant <jamslam@gmail.com>",
    "bluss",
]
build = false
exclude = [
    "/.github",
    "/benchmarks",
    "/fuzz",
    "/scripts",
    "/tmp",
]
autolib = false
autobins = false
autoexamples = false
autotests = false
autobenches = false

build = false and no build.rs; [lib] declares neither proc-macro = true nor any other compile-time execution surface, justifying has-build-exec and has-install-exec. The exclude list (/.github, /benchmarks, /fuzz, /scripts, /tmp) keeps the published crate to its source and tests — fuzz/ lives upstream only.

src/arch/all/twoway.rs

src/arch/all/twoway.rs, line 1-78

/*!
An implementation of the [Two-Way substring search algorithm][two-way].

[`Finder`] can be built for forward searches, while [`FinderRev`] can be built
for reverse searches.

Two-Way makes for a nice general purpose substring search algorithm because of
its time and space complexity properties. It also performs well in practice.
Namely, with `m = len(needle)` and `n = len(haystack)`, Two-Way takes `O(m)`
time to create a finder, `O(1)` space and `O(n)` search time. In other words,
the preprocessing step is quick, doesn't require any heap memory and the worst
case search time is guaranteed to be linear in the haystack regardless of the
size of the needle.

While vector algorithms will usually beat Two-Way handedly, vector algorithms
also usually have pathological or edge cases that are better handled by Two-Way.
Moreover, not all targets support vector algorithms or implementations for them
simply may not exist yet.

Two-Way can be found in the `memmem` implementations in at least [GNU libc] and
[musl].

[two-way]: https://en.wikipedia.org/wiki/Two-way_string-matching_algorithm
[GNU libc]: https://www.gnu.org/software/libc/
[musl]: https://www.musl-libc.org/
*/

use core::cmp;

use crate::{
    arch::all::{is_prefix, is_suffix},
    memmem::Pre,
};

/// A forward substring searcher that uses the Two-Way algorithm.
#[derive(Clone, Copy, Debug)]
pub struct Finder(TwoWay);

/// A reverse substring searcher that uses the Two-Way algorithm.
#[derive(Clone, Copy, Debug)]
pub struct FinderRev(TwoWay);

/// An implementation of the TwoWay substring search algorithm.
///
/// This searcher supports forward and reverse search, although not
/// simultaneously. It runs in `O(n + m)` time and `O(1)` space, where
/// `n ~ len(needle)` and `m ~ len(haystack)`.
///
/// The implementation here roughly matches that which was developed by
/// Crochemore and Perrin in their 1991 paper "Two-way string-matching." The
/// changes in this implementation are 1) the use of zero-based indices, 2) a
/// heuristic skip table based on the last byte (borrowed from Rust's standard
/// library) and 3) the addition of heuristics for a fast skip loop. For (3),
/// callers can pass any kind of prefilter they want, but usually it's one
/// based on a heuristic that uses an approximate background frequency of bytes
/// to choose rare bytes to quickly look for candidate match positions. Note
/// though that currently, this prefilter functionality is not exposed directly
/// in the public API. (File an issue if you want it and provide a use case
/// please.)
///
/// The heuristic for fast skipping is automatically shut off if it's
/// detected to be ineffective at search time. Generally, this only occurs in
/// pathological cases. But this is generally necessary in order to preserve
/// a `O(n + m)` time bound.
///
/// The code below is fairly complex and not obviously correct at all. It's
/// likely necessary to read the Two-Way paper cited above in order to fully
/// grok this code. The essence of it is:
///
/// 1. Do something to detect a "critical" position in the needle.
/// 2. For the current position in the haystack, look if `needle[critical..]`
/// matches at that position.
/// 3. If so, look if `needle[..critical]` matches.
/// 4. If a mismatch occurs, shift the search by some amount based on the
/// critical position and a pre-computed shift.
///
/// This type is wrapped in the forward and reverse finders that expose
/// consistent forward or reverse APIs.

Hand-coded Two-Way substring search (Crochemore & Perrin, 1991): O(n + m) time, O(1) space, with a heuristic skip table and an optional prefilter. The accompanying GNU libc and musl references are linked in the module docs. Together with rabinkarp.rs (the classical hash-rolling algorithm with documented O(mn) worst-case time but kept on short inputs only) and shiftor.rs (Bitap), this directory justifies impl-algorithm and the documented worst-case bounds underpinning algorithm-impl-bounds.

src/arch/generic/memchr.rs

src/arch/generic/memchr.rs, line 1-92

/*!
Generic crate-internal routines for the `memchr` family of functions.
*/

// What follows is a vector algorithm generic over the specific vector
// type to detect the position of one, two or three needles in a haystack.
// From what I know, this is a "classic" algorithm, although I don't
// believe it has been published in any peer reviewed journal. I believe
// it can be found in places like glibc and Go's standard library. It
// appears to be well known and is elaborated on in more detail here:
// https://gms.tf/stdfind-and-memchr-optimizations.html
//
// While the routine below is fairly long and perhaps intimidating, the basic
// idea is actually very simple and can be expressed straight-forwardly in
// pseudo code. The pseudo code below is written for 128 bit vectors, but the
// actual code below works for anything that implements the Vector trait.
//
//     needle = (n1 << 15) | (n1 << 14) | ... | (n1 << 1) | n1
//     // Note: shift amount is in bytes
//
//     while i <= haystack.len() - 16:
//       // A 16 byte vector. Each byte in chunk corresponds to a byte in
//       // the haystack.
//       chunk = haystack[i:i+16]
//       // Compare bytes in needle with bytes in chunk. The result is a 16
//       // byte chunk where each byte is 0xFF if the corresponding bytes
//       // in needle and chunk were equal, or 0x00 otherwise.
//       eqs = cmpeq(needle, chunk)
//       // Return a 32 bit integer where the most significant 16 bits
//       // are always 0 and the lower 16 bits correspond to whether the
//       // most significant bit in the correspond byte in `eqs` is set.
//       // In other words, `mask as u16` has bit i set if and only if
//       // needle[i] == chunk[i].
//       mask = movemask(eqs)
//
//       // Mask is 0 if there is no match, and non-zero otherwise.
//       if mask != 0:
//         // trailing_zeros tells us the position of the least significant
//         // bit that is set.
//         return i + trailing_zeros(mask)
//
//     // haystack length may not be a multiple of 16, so search the rest.
//     while i < haystack.len():
//       if haystack[i] == n1:
//         return i
//
//     // No match found.
//     return NULL
//
// In fact, we could loosely translate the above code to Rust line-for-line
// and it would be a pretty fast algorithm. But, we pull out all the stops
// to go as fast as possible:
//
// 1. We use aligned loads. That is, we do some finagling to make sure our
//    primary loop not only proceeds in increments of 16 bytes, but that
//    the address of haystack's pointer that we dereference is aligned to
//    16 bytes. 16 is a magic number here because it is the size of SSE2
//    128-bit vector. (For the AVX2 algorithm, 32 is the magic number.)
//    Therefore, to get aligned loads, our pointer's address must be evenly
//    divisible by 16.
// 2. Our primary loop proceeds 64 bytes at a time instead of 16. It's
//    kind of like loop unrolling, but we combine the equality comparisons
//    using a vector OR such that we only need to extract a single mask to
//    determine whether a match exists or not. If so, then we do some
//    book-keeping to determine the precise location but otherwise mush on.
// 3. We use our "chunk" comparison routine in as many places as possible,
//    even if it means using unaligned loads. In particular, if haystack
//    starts with an unaligned address, then we do an unaligned load to
//    search the first 16 bytes. We then start our primary loop at the
//    smallest subsequent aligned address, which will actually overlap with
//    previously searched bytes. But we're OK with that. We do a similar
//    dance at the end of our primary loop. Finally, to avoid a
//    byte-at-a-time loop at the end, we do a final 16 byte unaligned load
//    that may overlap with a previous load. This is OK because it converts
//    a loop into a small number of very fast vector instructions. The overlap
//    is OK because we know the place where the overlap occurs does not
//    contain a match.
//
// And that's pretty all there is to it. Note that since the below is
// generic and since it's meant to be inlined into routines with a
// `#[target_feature(enable = "...")]` annotation, we must mark all routines as
// both unsafe and `#[inline(always)]`.
//
// The fact that the code below is generic does somewhat inhibit us. For
// example, I've noticed that introducing an unlineable `#[cold]` function to
// handle the match case in the loop generates tighter assembly, but there is
// no way to do this in the generic code below because the generic code doesn't
// know what `target_feature` annotation to apply to the unlineable function.
// We could make such functions part of the `Vector` trait, but we instead live
// with the slightly sub-optimal codegen for now since it doesn't seem to have
// a noticeable perf difference.

Core vectorized memchr loop, generic over the Vector trait. The 70-line preamble comment explains the algorithm (a classic byte-search-with-movemask approach also used in glibc and Go), the four-vector unrolling, the aligned/unaligned chunk strategy, and the rationale for #[inline(always)] + unsafe fn over #[target_feature] directly. Justifies impl-algorithm and the documentation portion of unsafe-documented.

src/arch/generic/memchr.rs, line 121-141

    /// Return a pointer to the first occurrence of the needle in the given
    /// haystack. If no such occurrence exists, then `None` is returned.
    ///
    /// When a match is found, the pointer returned is guaranteed to be
    /// `>= start` and `< end`.
    ///
    /// # Safety
    ///
    /// * It must be the case that `start < end` and that the distance between
    /// them is at least equal to `V::BYTES`. That is, it must always be valid
    /// to do at least an unaligned load of `V` at `start`.
    /// * Both `start` and `end` must be valid for reads.
    /// * Both `start` and `end` must point to an initialized value.
    /// * Both `start` and `end` must point to the same allocated object and
    /// must either be in bounds or at most one byte past the end of the
    /// allocated object.
    /// * Both `start` and `end` must be _derived from_ a pointer to the same
    /// object.
    /// * The distance between `start` and `end` must not overflow `isize`.
    /// * The distance being in bounds must not rely on "wrapping around" the
    /// address space.

Representative # Safety block for a public unsafe fn: enumerates pointer validity, allocated-object provenance, isize-distance non-overflow, and no-wrap requirements. Every pub unsafe fn in the arch/ tree carries an equivalent block.

src/lib.rs

src/lib.rs, line 172-195

#![deny(missing_docs)]
#![no_std]
// It's just not worth trying to squash all dead code warnings. Pretty
// unfortunate IMO. Not really sure how to fix this other than to either
// live with it or sprinkle a whole mess of `cfg` annotations everywhere.
#![cfg_attr(
    not(any(
        all(target_arch = "x86_64", target_feature = "sse2"),
        all(target_arch = "wasm32", target_feature = "simd128"),
        target_arch = "aarch64",
    )),
    allow(dead_code)
)]
// Same deal for miri.
#![cfg_attr(miri, allow(dead_code, unused_macros))]

// Supporting 8-bit (or others) would be fine. If you need it, please submit a
// bug report at https://github.com/BurntSushi/memchr
#[cfg(not(any(
    target_pointer_width = "16",
    target_pointer_width = "32",
    target_pointer_width = "64"
)))]
compile_error!("memchr currently not supported on non-{16,32,64}");

Crate-level attributes: #![deny(missing_docs)], #![no_std], plus platform-pointer-width validation. The crate compiles for 16-, 32- and 64-bit targets only. extern crate std and extern crate alloc are pulled in via feature flags. There is no global forbid(unsafe_code) because the crate makes extensive use of unsafe for SIMD intrinsics, justifying uses-unsafe.

src/memmem/searcher.rs

src/memmem/searcher.rs, line 237-278

/// A union indicating one of several possible substring search implementations
/// that are in active use.
///
/// This union should only be read by one of the functions prefixed with
/// `searcher_kind_`. Namely, the correct function is meant to be paired with
/// the union by the caller, such that the function always reads from the
/// designated union field.
#[derive(Clone, Copy)]
union SearcherKind {
    empty: (),
    one_byte: u8,
    two_way: twoway::Finder,
    two_way_with_prefilter: TwoWayWithPrefilter,
    #[cfg(all(target_arch = "x86_64", target_feature = "sse2"))]
    sse2: crate::arch::x86_64::sse2::packedpair::Finder,
    #[cfg(all(target_arch = "x86_64", target_feature = "sse2"))]
    avx2: crate::arch::x86_64::avx2::packedpair::Finder,
    #[cfg(all(target_arch = "wasm32", target_feature = "simd128"))]
    simd128: crate::arch::wasm32::simd128::packedpair::Finder,
    #[cfg(target_arch = "aarch64")]
    neon: crate::arch::aarch64::neon::packedpair::Finder,
}

/// A two-way substring searcher with a prefilter.
#[derive(Copy, Clone, Debug)]
struct TwoWayWithPrefilter {
    finder: twoway::Finder,
    prestrat: Prefilter,
}

/// The type of a substring search function.
///
/// # Safety
///
/// When using a function of this type, callers must ensure that the correct
/// function is paired with the value populated in `SearcherKind` union.
type SearcherKindFn = unsafe fn(
    searcher: &Searcher,
    prestate: &mut PrefilterState,
    haystack: &[u8],
    needle: &[u8],
) -> Option<usize>;

SearcherKind is a union whose active field is identified by the paired SearcherKindFn function pointer rather than a tagged discriminant — the documented motivation is avoiding the runtime cost of enum dispatch in a hot path. The safety invariant (function pointer must read from the union variant actually populated at construction) is documented on SearcherKindFn and on each of the searcher_kind_* functions.

src/tests

Eight test files structured under src/tests/ (compiled with #[cfg(test)]). memchr/prop.rs and substring/prop.rs define quickcheck property tests that compare every vectorized searcher against a naive reference implementation in memchr/naive.rs / substring/naive.rs; the property tests are gated off under miri. The presence of #[cfg(miri)] gates throughout indicates the codebase is run under miri to validate unsafe-code soundness. Justifies has-unit-tests, has-property-tests, and has-integration-tests (no top-level tests/ directory is shipped; the [[test]] sections in Cargo.toml are absent). The upstream fuzz/ workspace member (excluded from the published crate via the exclude list) carries eight cargo-fuzz harnesses targeting every public byte- and substring-search entry point, justifying has-fuzz-tests.

src/vector.rs

src/vector.rs, line 1-66

/// A trait for describing vector operations used by vectorized searchers.
///
/// The trait is highly constrained to low level vector operations needed.
/// In general, it was invented mostly to be generic over x86's __m128i and
/// __m256i types. At time of writing, it also supports wasm and aarch64
/// 128-bit vector types as well.
///
/// # Safety
///
/// All methods are not safe since they are intended to be implemented using
/// vendor intrinsics, which are also not safe. Callers must ensure that the
/// appropriate target features are enabled in the calling function, and that
/// the current CPU supports them. All implementations should avoid marking the
/// routines with #[target_feature] and instead mark them as #[inline(always)]
/// to ensure they get appropriately inlined. (inline(always) cannot be used
/// with target_feature.)
pub(crate) trait Vector: Copy + core::fmt::Debug {
    /// The number of bytes in the vector. That is, this is the size of the
    /// vector in memory.
    const BYTES: usize;
    /// The bits that must be zero in order for a `*const u8` pointer to be
    /// correctly aligned to read vector values.
    const ALIGN: usize;

    /// The type of the value returned by `Vector::movemask`.
    ///
    /// This supports abstracting over the specific representation used in
    /// order to accommodate different representations in different ISAs.
    type Mask: MoveMask;

    /// Create a vector with 8-bit lanes with the given byte repeated into each
    /// lane.
    unsafe fn splat(byte: u8) -> Self;

    /// Read a vector-size number of bytes from the given pointer. The pointer
    /// must be aligned to the size of the vector.
    ///
    /// # Safety
    ///
    /// Callers must guarantee that at least `BYTES` bytes are readable from
    /// `data` and that `data` is aligned to a `BYTES` boundary.
    unsafe fn load_aligned(data: *const u8) -> Self;

    /// Read a vector-size number of bytes from the given pointer. The pointer
    /// does not need to be aligned.
    ///
    /// # Safety
    ///
    /// Callers must guarantee that at least `BYTES` bytes are readable from
    /// `data`.
    unsafe fn load_unaligned(data: *const u8) -> Self;

    /// _mm_movemask_epi8 or _mm256_movemask_epi8
    unsafe fn movemask(self) -> Self::Mask;
    /// _mm_cmpeq_epi8 or _mm256_cmpeq_epi8
    unsafe fn cmpeq(self, vector2: Self) -> Self;
    /// _mm_and_si128 or _mm256_and_si256
    unsafe fn and(self, vector2: Self) -> Self;
    /// _mm_or or _mm256_or_si256
    unsafe fn or(self, vector2: Self) -> Self;
    /// Returns true if and only if `Self::movemask` would return a mask that
    /// contains at least one non-zero bit.
    unsafe fn movemask_will_have_non_zero(self) -> bool {
        self.movemask().has_non_zero()
    }
}

Defines the crate-internal Vector trait abstracting over __m128i (SSE2), __m256i (AVX2), uint8x16_t (NEON) and v128 (wasm simd128). Every method is unsafe because the underlying vendor intrinsics require runtime CPU-feature availability. The module's # Safety documentation requires callers to enable the appropriate target_feature and confirm the CPU supports the feature before invoking these methods, justifying unsafe-documented.