Subject
wasmparser 0.244.0 is the WebAssembly binary parser and validator from the
bytecodealliance wasm-tools workspace. It is an event-driven, zero-copy
reader: Parser yields a stream of Payload values over a wasm module or
component, and Validator consumes those payloads to type-check the module
against the core wasm and component-model specifications. The public API
exposes the low-level BinaryReader, the section readers in src/readers, the
operator decoder, and the full validator. It is the front-end stage shared by
downstream engines such as wasmtime; it parses and validates but does not
compile or execute wasm. The crate is no_std-capable, with std,
validate, serde, component-model, hash-collections, and simd
features enabled by default.
Methodology
Tools: openvet 0.6.0, ripgrep, diff, git, Read. The published contents/ tree
holds 39,831 lines of Rust across 38 files. I read binary_reader.rs (LEB128
decoders, EOF bounds checks, string and reader length-prefix handling),
limits.rs (resource caps) and parser.rs (section dispatch, the delimited
length-tracking helper) in full or near-full; the validator core
(validator.rs count checks, validator/operators.rs control-flow and operand
stack, validator/func.rs locals, validator/core/canonical.rs subtype-depth
limiting, validator/names.rs for the single unsafe site) in the
load-bearing sections; and the src/readers section-iterator machinery and
src/collections wrappers more selectively. I surveyed the whole tree with
ripgrep for unsafe, FFI, network, filesystem, process, environment, crypto,
RNG, and concurrency primitives. Running diff -rq contents vcs shows that the only
difference is the cargo-normalised Cargo.toml; the source is byte-identical
to the seeded VCS snapshot. I did not fetch the WebAssembly specification, so I
did not evaluate spec conformance.
Results
The source under contents/src is byte-equivalent to vcs/src; only
Cargo.toml differs, as expected from cargo normalisation. The survey found no
network, filesystem, process, environment, cryptographic, RNG, or concurrency
usage anywhere in the crate, so uses-network, uses-filesystem,
uses-environment, uses-exec, uses-crypto, and uses-concurrency are all
false; the sha256 occurrences are component-model import-name string parsing,
not cryptographic operations. There is no build.rs and build = false in the
manifest, no install hooks, and no pre-compiled assets, so has-build-exec,
has-install-exec, and has-binaries are false. The crate neither compiles nor
runs wasm, so uses-jit, uses-interpreter, impl-jit, impl-interpreter,
impl-protocol, and impl-crypto are false.
The crate implements a parser (impl-parser) and the validation algorithms that
type-check the parsed module (impl-algorithm). The collection types in src/collections are
thin wrappers that delegate to indexmap, hashbrown, or BTreeMap rather
than novel storage, so impl-datastructure is false, and the crate uses no
concurrency primitives of its own, so impl-concurrency is false.
The crate contains exactly one unsafe block (uses-unsafe), at
validator/names.rs:38: a core::mem::transmute coercing &str to
&KebabStr, where KebabStr is #[repr(transparent)] over str. The
transparent representation makes the layouts identical, the safety comment
states the invariant, and this is the only unsafe in a crate that otherwise
sets unsafe_code = "deny", so unsafe-safe, unsafe-documented, and
unsafe-minimal hold. I did not run miri or a sanitizer against it, so I did
not assert unsafe-tested.
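The shape of that unsafe site can be sketched as follows. This is a minimal illustration of the #[repr(transparent)] newtype-over-str pattern, not the crate's code: the type name Kebab, its constructor, and the simplified validation rule (lowercase ASCII letters and digits joined by single hyphens) are assumptions made for the example.

```rust
use std::mem;

/// Hypothetical validated-string newtype (illustrative; not the crate's KebabStr).
/// `#[repr(transparent)]` guarantees the same layout as the wrapped `str`.
#[repr(transparent)]
pub struct Kebab(str);

impl Kebab {
    /// Returns `Some` only if `s` passes the (simplified, assumed) kebab-case check.
    pub fn new(s: &str) -> Option<&Kebab> {
        if Self::is_kebab(s) {
            // SAFETY: `Kebab` is `#[repr(transparent)]` over `str`, so `&str`
            // and `&Kebab` have identical layout and pointer metadata.
            Some(unsafe { mem::transmute::<&str, &Kebab>(s) })
        } else {
            None
        }
    }

    /// Borrows the underlying string without copying.
    pub fn as_str(&self) -> &str {
        &self.0
    }

    /// Assumed rule for this sketch: non-empty words of lowercase ASCII
    /// letters or digits, separated by single hyphens.
    fn is_kebab(s: &str) -> bool {
        !s.is_empty()
            && s.split('-').all(|word| {
                !word.is_empty()
                    && word
                        .bytes()
                        .all(|b| b.is_ascii_lowercase() || b.is_ascii_digit())
            })
    }
}
```

The cast is sound only because of the transparent representation; without #[repr(transparent)] the layout of the wrapper would not be guaranteed to match str, and the same transmute would be undefined behaviour.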
Parser safety on adversarial input (parser-impl-safe) rests on three
mechanisms. First, the LEB128 decoders in binary_reader.rs:441-658 reject
overlong encodings (a continuation bit on the final permitted byte) and
out-of-range values (unused high bits set), and every byte read goes through
read_u8, which is EOF-checked, so a truncated integer yields an error rather
than a panic or over-read. Second, section and string lengths are bounds-checked:
read_size/read_string cap lengths against limits.rs constants,
ensure_has_bytes rejects reads past the buffer end, and delimited
(parser.rs:1280) uses checked_sub to keep function-body reads inside the
declared section byte range. Third, attacker-controlled count fields do not
drive eager allocation: SectionLimited (readers.rs:84) stores only the
count and reads items one at a time through its iterator, so an inflated count
hits EOF once the section data is exhausted. The validator additionally enforces the
WebAssembly JS API resource limits in limits.rs (types, functions, imports,
exports, globals, tables, memories, tags, locals, subtyping depth, and the
component-model caps) through check_max, which uses checked arithmetic.
Locals are capped at MAX_WASM_FUNCTION_LOCALS via checked_add
(operators.rs:4601) and stored in a bounded cache plus a binary-searched
overflow list. The control-flow and operand stacks are pushed and popped in
lockstep, pop_ctrl verifies the operand stack returns to the frame height,
branch targets are bounds-checked by jump, and out-of-bounds local indices
are rejected by local; these support algorithm-impl-safe. Subtype-chain
depth is capped at MAX_WASM_SUBTYPING_DEPTH (63) in canonical.rs,
bounding recursion in type matching.
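The mechanisms above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the crate's implementation: Reader, Limited, ReadError, and this check_max signature are hypothetical names chosen for the example, and the item type is simplified to a LEB128 u32.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ReadError {
    UnexpectedEof,
    Overlong,
    OutOfRange,
    TooLarge,
}

/// Minimal cursor over a borrowed byte slice (no copying, no allocation).
pub struct Reader<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> Reader<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Reader { data, pos: 0 }
    }

    /// Every byte read is EOF-checked, so truncated input is an error.
    pub fn read_u8(&mut self) -> Result<u8, ReadError> {
        let byte = *self.data.get(self.pos).ok_or(ReadError::UnexpectedEof)?;
        self.pos += 1;
        Ok(byte)
    }

    /// Decodes an unsigned LEB128 u32 from at most five bytes.
    pub fn read_var_u32(&mut self) -> Result<u32, ReadError> {
        let mut result: u32 = 0;
        let mut shift = 0;
        loop {
            let byte = self.read_u8()?;
            if shift == 28 {
                // Fifth (final permitted) byte: no continuation bit allowed,
                // and only its low four bits fit in a u32.
                if byte & 0x80 != 0 {
                    return Err(ReadError::Overlong);
                }
                if byte & 0x70 != 0 {
                    return Err(ReadError::OutOfRange);
                }
            }
            result |= u32::from(byte & 0x7f) << shift;
            if byte & 0x80 == 0 {
                return Ok(result);
            }
            shift += 7;
        }
    }
}

/// Stores only the declared count; items are decoded one at a time.
pub struct Limited<'a> {
    reader: Reader<'a>,
    remaining: u32,
}

impl<'a> Limited<'a> {
    pub fn new(mut reader: Reader<'a>) -> Result<Self, ReadError> {
        let remaining = reader.read_var_u32()?;
        Ok(Limited { reader, remaining })
    }
}

impl<'a> Iterator for Limited<'a> {
    type Item = Result<u32, ReadError>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            return None;
        }
        let item = self.reader.read_var_u32();
        // Stop after the first error so a bogus count cannot keep iterating.
        self.remaining = if item.is_ok() { self.remaining - 1 } else { 0 };
        Some(item)
    }
}

/// Checked-arithmetic limit test: `current + added` must not exceed `max`.
pub fn check_max(current: usize, added: u32, max: usize) -> Result<(), ReadError> {
    match current.checked_add(added as usize) {
        Some(total) if total <= max => Ok(()),
        _ => Err(ReadError::TooLarge),
    }
}
```

In this sketch a section that declares u32::MAX items but carries only two yields two values and then an UnexpectedEof error, without any allocation proportional to the declared count.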
The crate carries unit tests in #[cfg(test)] mod tests blocks in five source files
(has-unit-tests) and one integration test, tests/big-module.rs, which
exercises hundred-thousand-element index spaces (has-integration-tests). The
published crate ships no fuzz/ directory and no proptest or quickcheck
harness, so has-fuzz-tests and has-property-tests are false; upstream fuzz
targets live in the parent workspace, outside this crate. Because I did not run
the fuzzers or the spec test suite, I did not assert parser-impl-tested,
parser-impl-correct, algorithm-impl-tested, algorithm-impl-correct, or
algorithm-impl-bounds. No malicious or obfuscated code, telemetry, hidden
endpoints, or time-based behaviour was found, so is-benign holds. No findings
were recorded.
Conclusion
wasmparser 0.244.0 parses and validates WebAssembly with a single documented
unsafe transmute whose #[repr(transparent)] invariant holds. Its
adversarial-input handling is layered: EOF-checked LEB128 decoding that rejects
overlong and out-of-range encodings, length and byte-range bounds checks on
sections and strings, lazy per-item section iteration that prevents
count-driven allocation amplification, and validator resource limits enforced
with checked arithmetic against the WebAssembly JS API caps. The crate performs
no I/O of any kind and pulls in four optional dependencies behind features. No
security, safety, or correctness findings were recorded. Spec conformance and
fuzz coverage were not independently re-run, and the corresponding *-correct
and *-tested claims were left unasserted.