Audit · winnow@1.0.3

cargo : winnow @ 1.0.3

Patrick Elsen, signed 2026-05-28, published 2026-05-28

Claims

has-binaries, has-build-exec, has-fuzz-tests, has-install-exec, has-integration-tests, has-property-tests, has-unit-tests, impl-algorithm, impl-concurrency, impl-crypto, impl-datastructure, impl-interpreter, impl-jit, impl-parser, impl-protocol, is-benign, parser-impl-safe, parser-impl-tested, unsafe-documented, unsafe-minimal, unsafe-safe, unsafe-tested, uses-concurrency, uses-crypto, uses-environment, uses-exec, uses-filesystem, uses-interpreter, uses-jit, uses-network, uses-unsafe

Summary

winnow 1.0.3 is a zero-copy parser-combinator library (successor to nom). All 44 unsafe sites are in src/stream/, covering a repr(transparent) transmute and get_unchecked slice operations whose invariants are documented and enforced by callers. Miri runs in upstream CI. No findings.

Report

Subject

winnow 1.0.3 is a byte-oriented, zero-copy parser-combinator library, a fork and successor of nom. It exposes a Parser trait and a Stream trait, with built-in stream types (&[u8], &str, Bytes, BStr, LocatingSlice, Partial, Stateful, TokenSlice, Recoverable) and combinator modules for ASCII text, binary data, and token-sequence parsing. The library is no_std-compatible (with alloc and std feature gates) and targets Rust 1.65+.

Methodology

The published crate was compared against the upstream Git repository at the commit recorded in .cargo_vcs_info.json using diff -rq. Source was surveyed with grep for unsafe blocks, FFI, network, filesystem, process, environment, and crypto patterns. All 44 unsafe sites were identified by grep; the key unsafe implementations in src/stream/mod.rs, src/stream/bstr.rs, src/stream/bytes.rs, src/stream/locating.rs, src/stream/partial.rs, src/stream/stateful.rs, src/stream/token.rs, and src/stream/recoverable.rs were read in full. The token, ascii, binary, and combinator modules were surveyed for call sites of the unchecked stream methods. CI configuration was inspected for Miri and fuzz coverage.

Results

The diff shows no source-file divergence between the published crate and VCS. The only differences are the normalized Cargo.toml, the .cargo_vcs_info.json file added by crates.io, and expected-absent VCS-only files (tests, benches, .github, fuzz). This justifies is-benign and has-binaries.

No build.rs, proc macro, network, filesystem, process, or environment access was found, justifying has-build-exec, has-install-exec, uses-network, uses-filesystem, uses-exec, uses-environment, uses-jit, uses-interpreter. No crypto or concurrency primitives are used or implemented, justifying uses-crypto, impl-crypto, uses-concurrency, impl-concurrency, impl-datastructure, impl-algorithm, impl-jit, impl-interpreter, impl-protocol. No integration test directory is published in the crate, justifying has-integration-tests. The crate implements a parser-combinator framework, justifying impl-parser. parser-impl-correct was not evaluated against a formal specification; winnow is a combinator framework that does not target a specific data-format specification.

All 44 unsafe sites live in src/stream/. They fall into two categories.

First, core::mem::transmute from &[u8] to &BStr and to &Bytes: both target types are #[repr(transparent)] over [u8], so the transmute is valid.

Second, get_unchecked(..offset) calls in the next_slice_unchecked and peek_slice_unchecked implementations for &[T], &str, Bytes, BStr, and the wrapper types (LocatingSlice, Partial, Stateful, TokenSlice, Recoverable). Each carries a // SAFETY: comment referencing the caller-side preconditions documented on the trait: the offset must be in bounds and, for &str, must lie on a UTF-8 character boundary. All concrete implementations also call the safe (panicking) variant first under #[cfg(debug_assertions)].

The unchecked methods are called only within src/stream/ (wrapper impls delegating to inner streams); no combinator, token, or ascii code calls them directly. Callers of the safe next_slice provide offsets computed by iter_offsets, offset_for, offset_at, and eof_offset, all of which return in-bounds, character-boundary-aligned offsets for their respective stream types. This justifies unsafe-safe, unsafe-documented, and unsafe-minimal.
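The get_unchecked pattern, including the debug-build check, can be sketched roughly as follows. This is a minimal illustration and not winnow's code: ByteStream and its methods are hypothetical stand-ins for a stream over &[u8], and the exact shape of the debug check is an assumption based on the description above.

```rust
/// Hypothetical stand-in for a stream over a byte slice (not a winnow type).
pub struct ByteStream<'a> {
    input: &'a [u8],
}

impl<'a> ByteStream<'a> {
    pub fn new(input: &'a [u8]) -> Self {
        Self { input }
    }

    /// Safe variant: panics if `offset` is out of bounds.
    pub fn peek_slice(&self, offset: usize) -> &'a [u8] {
        let input: &'a [u8] = self.input;
        &input[..offset]
    }

    /// Unchecked variant: splits off the first `offset` bytes and advances.
    ///
    /// # Safety
    ///
    /// `offset` must be less than or equal to the remaining input length.
    pub unsafe fn next_slice_unchecked(&mut self, offset: usize) -> &'a [u8] {
        // Assumed shape of the debug check: run the safe, panicking
        // variant first so that a bad offset fails loudly in debug builds.
        #[cfg(debug_assertions)]
        let _ = self.peek_slice(offset);

        let input: &'a [u8] = self.input;
        // SAFETY: the caller guarantees `offset <= input.len()`.
        let head = unsafe { input.get_unchecked(..offset) };
        // SAFETY: same precondition; `offset..` is in bounds when `offset <= len`.
        let tail = unsafe { input.get_unchecked(offset..) };
        self.input = tail;
        head
    }

    /// Input that has not been consumed yet.
    pub fn remaining(&self) -> &'a [u8] {
        self.input
    }
}
```

A caller would obtain the offset from a safe computation (for example a search over the remaining input) before passing it to the unchecked method, which mirrors the division of responsibility described above.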

A minor documentation inconsistency exists: the peek_slice_unchecked impl on &str (line 479 of src/stream/mod.rs) carries a SAFETY comment that omits the UTF-8 boundary requirement, while the next_slice_unchecked impl on the same type states it explicitly. This is cosmetic and does not affect soundness.
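To show why the UTF-8 boundary requirement is a real precondition for &str, the following standard-library-only sketch lists the offsets at which a string may be split. The helper names valid_offsets and checked_prefix are hypothetical and do not appear in winnow.

```rust
/// Hypothetical helper: every byte offset at which `input` may be split,
/// i.e. the start of each character plus the end of the string.
pub fn valid_offsets(input: &str) -> Vec<usize> {
    input
        .char_indices()
        .map(|(i, _)| i)
        .chain(std::iter::once(input.len()))
        .collect()
}

/// Hypothetical helper: the checked counterpart of an unchecked prefix split.
/// Returns `None` when `offset` is out of bounds or falls inside a
/// multi-byte character.
pub fn checked_prefix(input: &str, offset: usize) -> Option<&str> {
    input.get(..offset)
}
```

For the input "héllo", the character é occupies bytes 1 and 2, so offset 2 is in bounds but not on a boundary: checked_prefix returns None there, whereas an unchecked split at that offset would produce an invalid &str.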

The crate ships 270 unit tests and proptest-based property tests covering the token, ascii, binary/bits, and stream modules. Upstream CI runs cargo miri test (with -Zmiri-disable-isolation), and a fuzz target is present in VCS. This justifies has-unit-tests, has-property-tests, has-fuzz-tests, unsafe-tested, parser-impl-safe, and parser-impl-tested.

Conclusion

All 44 unsafe sites are concentrated in src/stream/ and cover two patterns: a sound repr(transparent) transmute and get_unchecked slice operations whose invariants are stated in the trait contract and enforced by callers that compute offsets through safe iteration APIs. Miri runs in upstream CI. No findings were recorded.

Findings

No findings.

Annotations (2)

`src/stream/mod.rs`

`src/stream/mod.rs`, lines 195-225

    ///
    /// </div>
    ///
    /// # Safety
    ///
    /// Callers of this function are responsible that these preconditions are satisfied:
    ///
    /// * Indexes must be within bounds of the original input;
    /// * Indexes must uphold invariants of the stream, like for `str` they must lie on UTF-8
    ///   sequence boundaries.
    ///
    unsafe fn next_slice_unchecked(&mut self, offset: usize) -> Self::Slice {
        // Inherent impl to allow callers to have `unsafe`-free code
        self.next_slice(offset)
    }
    /// Split off a slice of tokens from the input
    fn peek_slice(&self, offset: usize) -> Self::Slice;
    /// Split off a slice of tokens from the input
    ///
    /// # Safety
    ///
    /// Callers of this function are responsible that these preconditions are satisfied:
    ///
    /// * Indexes must be within bounds of the original input;
    /// * Indexes must uphold invariants of the stream, like for `str` they must lie on UTF-8
    ///   sequence boundaries.
    unsafe fn peek_slice_unchecked(&self, offset: usize) -> Self::Slice {
        // Inherent impl to allow callers to have `unsafe`-free code
        self.peek_slice(offset)
    }

The Stream trait defines next_slice_unchecked and peek_slice_unchecked as unsafe methods with documented preconditions: offsets must be in bounds and, for &str, must lie on UTF-8 character boundaries. All implementations in this module (for &[T], &str, Bytes, BStr, and the wrapper types LocatingSlice, Partial, Stateful, TokenSlice, Recoverable) delegate to get_unchecked(..offset) with a // SAFETY: comment referencing the trait's contract. In debug builds, each unchecked override calls the safe (panicking) variant first via #[cfg(debug_assertions)]. The unchecked methods are never called outside the stream module; all public combinators (token, ascii, binary, combinator) use only next_slice and peek_slice. This justifies uses-unsafe, unsafe-safe, unsafe-documented, and unsafe-minimal.

BStr and Bytes use core::mem::transmute(&[u8]) -> &Self. Both types are #[repr(transparent)] over [u8], making this transmute valid.
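The transmute pattern can be illustrated with a small sketch. RawBytes is a hypothetical newtype standing in for a type such as BStr or Bytes; it is not winnow's definition.

```rust
/// Hypothetical newtype over `[u8]` (stand-in for a type like `BStr`).
#[repr(transparent)]
pub struct RawBytes([u8]);

impl RawBytes {
    /// Reinterpret a byte slice as `&RawBytes` without copying.
    pub fn new(bytes: &[u8]) -> &Self {
        // SAFETY: `RawBytes` is `#[repr(transparent)]` over `[u8]`, so
        // `&[u8]` and `&RawBytes` have the same layout, including the
        // length metadata of the fat pointer.
        unsafe { core::mem::transmute::<&[u8], &RawBytes>(bytes) }
    }

    /// View the wrapper as the underlying bytes.
    pub fn as_bytes(&self) -> &[u8] {
        &self.0
    }
}
```

Without #[repr(transparent)], the layout of the wrapper relative to [u8] would not be guaranteed, which is why the attribute is the fact the soundness argument rests on.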

`src/token/tests.rs`

The crate ships 270 unit tests (#[test]) across the token, ascii, binary, stream, and combinator modules, justifying has-unit-tests. The published crate includes proptest-based property tests in src/token/tests.rs, src/ascii/tests.rs, and src/binary/bits/tests.rs, justifying has-property-tests. Upstream CI runs cargo miri test (with -Zmiri-disable-isolation), justifying unsafe-tested. A fuzz target is present in the VCS (fuzz/fuzz_targets/fuzz_arithmetic.rs), justifying has-fuzz-tests. The parser-combinator framework is exercised by the property tests and by the Miri runs in CI, justifying parser-impl-tested.