cargo : quick-xml @ 0.39.4
Patrick Elsen, signed 2026-05-28, published 2026-05-28

Claims

filesystem-safe, has-binaries, has-build-exec, has-fuzz-tests, has-install-exec, has-integration-tests, has-property-tests, has-unit-tests, impl-algorithm, impl-concurrency, impl-crypto, impl-datastructure, impl-interpreter, impl-jit, impl-parser, impl-protocol, is-benign, parser-impl-correct, parser-impl-safe, parser-impl-tested, uses-concurrency, uses-crypto, uses-environment, uses-exec, uses-filesystem, uses-interpreter, uses-jit, uses-network, uses-unsafe

Summary

quick-xml 0.39.4 is a pull-based XML reader/writer with optional serde and tokio support. The crate uses #![forbid(unsafe_code)], has no network access, and does not resolve external entities or expand DTD-declared entities, eliminating XXE and billion-laughs risks at the library level. No findings were recorded.

Report

Subject

quick-xml is a pull-based XML reader and writer for Rust. It exposes a StAX-style event API (Reader, NsReader) that yields typed events (Start, End, Text, CData, Comment, DocType, GeneralRef, etc.) without allocating unless the caller requests owned data. An optional serialize feature adds serde Deserialize/Serialize support via the de and se modules. An optional async-tokio feature wraps the reader over tokio's AsyncBufRead. The crate positions itself as high-performance and nearly zero-copy.

Methodology

The published crate contents were compared against the upstream Git repository at the commit recorded in .cargo_vcs_info.json using diff -rq. All source files in contents/src/ (36 files, ~39 KLOC) were surveyed. Files central to the security-relevant paths (src/parser/dtd.rs, src/escape.rs, src/reader/mod.rs, src/reader/state.rs, src/reader/buffered_reader.rs, src/de/resolver.rs, src/de/mod.rs, src/lib.rs) were read in full. The serde serializer module (src/se/) and writer (src/writer.rs) were sampled. Surveys were run with grep for unsafe code, FFI, network, filesystem, process execution, environment variable, crypto, and concurrency usage. The VCS repository was checked for fuzz targets.

Results

The diff -rq comparison shows that all source-file differences between the published crate and the VCS checkout are CRLF line-ending differences only. Code is byte-equivalent after stripping carriage returns. Files excluded from publishing (tests, benchmarks, examples, fuzz targets, CI configuration) are absent from the published crate as expected, matching the include directive in Cargo.toml.orig. No unexplained files or binaries are present, justifying has-binaries and is-benign.
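The "byte-equivalent after stripping carriage returns" check can be sketched as follows. This is a minimal std-only illustration, not the tooling used for the audit; the function name eq_ignoring_cr is hypothetical.

```rust
/// Returns true when `a` and `b` are byte-identical once every carriage
/// return (0x0D) is dropped from both inputs.
/// Hypothetical helper for illustration; not part of quick-xml.
pub fn eq_ignoring_cr(a: &[u8], b: &[u8]) -> bool {
    // Compare the two filtered byte streams lazily, without allocating.
    a.iter()
        .filter(|&&c| c != b'\r')
        .eq(b.iter().filter(|&&c| c != b'\r'))
}
```

Applied to each pair of files reported by diff -rq, a true result means the pair differs only in carriage returns; any other difference makes the comparison fail.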

The crate declares #![forbid(unsafe_code)] at src/lib.rs:47, enforced by the compiler. grep confirms zero unsafe blocks in the published source, justifying uses-unsafe. There are no FFI declarations, no network access (uses-network), no process execution (uses-exec), no JIT or interpreter usage (uses-jit, uses-interpreter), no cryptographic operations (uses-crypto), no environment variable reads (uses-environment), and no concurrency primitives (uses-concurrency). There is no build.rs and the crate is not a proc macro, justifying has-build-exec and has-install-exec.

The only filesystem operations are Reader::from_file and NsReader::from_file (src/reader/buffered_reader.rs:575-576), which open a caller-supplied path via std::fs::File::open. The crate does not follow any paths internally; this justifies uses-filesystem and filesystem-safe.

The crate implements a pull XML parser (impl-parser) and does not implement cryptography (impl-crypto), a concurrency primitive (impl-concurrency), a data structure (impl-datastructure), an algorithm (impl-algorithm), an interpreter (impl-interpreter), a JIT (impl-jit), or a protocol (impl-protocol). The DTD parser (src/parser/dtd.rs) skips DTD contents structurally. It recognises "ENTITY", "ELEMENT", "ATTLIST", and "NOTATION" markup to track quoting context, but it does not fetch SYSTEM or PUBLIC URIs, does not extract entity values, and never resolves external entity references. General entity references encountered in document content are surfaced as Event::GeneralRef events and not automatically expanded. This means the parser has no XXE (XML External Entity) exposure by default. Entity expansion is also not recursive: unescape_with in src/escape.rs resolves one level only and errors on unrecognised entities, making billion-laughs entity-expansion attacks inapplicable. These properties justify parser-impl-safe.

The serde Deserializer in src/de/mod.rs uses PredefinedEntityResolver by default, which resolves only the five standard XML entities. Custom entity resolution (for DTD-declared entities) requires an explicit EntityResolver implementation from the caller. parser-impl-correct is set to false: the crate does not demonstrate full conformance to the XML 1.0/1.1 specification; in particular, DTD entity declarations are not parsed from the DTD, external DTD subsets are not fetched, and comment validation is opt-in.
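To make "resolves only the five standard XML entities" concrete, here is a minimal sketch of such a lookup. It is an assumption-laden stand-in written against the standard library only: the function name resolve_predefined is hypothetical, and it is not the crate's PredefinedEntityResolver or its EntityResolver trait.

```rust
/// Maps the five predefined XML entity names to their replacement text and
/// rejects everything else, including any name a DTD might have declared.
/// Hypothetical helper for illustration; not part of quick-xml.
pub fn resolve_predefined(name: &str) -> Option<&'static str> {
    match name {
        "lt" => Some("<"),
        "gt" => Some(">"),
        "amp" => Some("&"),
        "apos" => Some("'"),
        "quot" => Some("\""),
        // Unknown names are refused rather than looked up anywhere else.
        _ => None,
    }
}
```

Because the table is closed, a document cannot introduce new replacement text through a resolver of this shape; anything beyond the five names has to be supplied deliberately by the caller.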

The fuzz targets in the VCS repository (fuzz/fuzz_targets/fuzz_target_1.rs and structured_roundtrip.rs) exercise the core reader and serde round-trip paths, justifying has-fuzz-tests. The crate has 252 in-source unit tests (has-unit-tests). No integration tests or property-based tests exist in the published crate (has-integration-tests, has-property-tests). Combined, the testing coverage justifies parser-impl-tested.

The Arc usage in src/errors.rs is for making Error cloneable, not for shared mutable state; there is no thread spawning or use of concurrency primitives.

Conclusion

The crate has no unsafe code (enforced by #![forbid(unsafe_code)]), no network access, and no external-entity resolution. The parser does not expand DTD entities and surfaces GeneralRef events to the caller unevaluated, eliminating XXE and billion-laughs risks at the library level. The filesystem surface is limited to two from_file convenience constructors. No findings were recorded.

Findings

No findings.

Annotations (4)

src/escape.rs

src/escape.rs, lines 255-305

pub fn unescape_with<'input, 'entity, F>(
    raw: &'input str,
    mut resolve_entity: F,
) -> Result<Cow<'input, str>, EscapeError>
where
    // the lifetime of the output comes from a capture or is `'static`
    F: FnMut(&str) -> Option<&'entity str>,
{
    let bytes = raw.as_bytes();
    let mut unescaped = None;
    let mut last_end = 0;
    let mut iter = memchr2_iter(b'&', b';', bytes);
    while let Some(start) = iter.by_ref().find(|p| bytes[*p] == b'&') {
        match iter.next() {
            Some(end) if bytes[end] == b';' => {
                // append valid data
                if unescaped.is_none() {
                    unescaped = Some(String::with_capacity(raw.len()));
                }
                let unescaped = unescaped.as_mut().expect("initialized");
                unescaped.push_str(&raw[last_end..start]);

                // search for character correctness
                let pat = &raw[start + 1..end];
                if let Some(entity) = pat.strip_prefix('#') {
                    let codepoint = parse_number(entity).map_err(EscapeError::InvalidCharRef)?;
                    unescaped.push_str(codepoint.encode_utf8(&mut [0u8; 4]));
                } else if let Some(value) = resolve_entity(pat) {
                    unescaped.push_str(value);
                } else {
                    return Err(EscapeError::UnrecognizedEntity(
                        start + 1..end,
                        pat.to_string(),
                    ));
                }

                last_end = end + 1;
            }
            _ => return Err(EscapeError::UnterminatedEntity(start..raw.len())),
        }
    }

    if let Some(mut unescaped) = unescaped {
        if let Some(raw) = raw.get(last_end..) {
            unescaped.push_str(raw);
        }
        Ok(Cow::Owned(unescaped))
    } else {
        Ok(Cow::Borrowed(raw))
    }
}

The unescape_with function resolves entity references one level deep only; it does not recursively expand entities. Unrecognised entity names produce an EscapeError::UnrecognizedEntity error rather than silently passing through. The five predefined XML entities (lt, gt, amp, apos, quot) and numeric character references (&#N; and &#xN;) are always handled. The optional escape-html feature extends recognition to the full HTML5 entity table. There is no billion-laughs risk: the crate never interprets entity values as XML or expands them recursively. This evidence supports parser-impl-safe and impl-parser.
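The one-level property can be demonstrated with a deliberately simplified, std-only sketch. It is not the crate's code: the name expand_once is hypothetical, numeric character references are omitted, and errors are plain strings. The point it illustrates is that replacement text is appended verbatim and never rescanned.

```rust
/// Replaces each `&name;` in `raw` with the value returned by `resolve`.
/// The inserted value is never scanned again, so a value that itself
/// contains `&other;` stays literal in the output.
/// Hypothetical helper for illustration; not part of quick-xml.
pub fn expand_once<F>(raw: &str, mut resolve: F) -> Result<String, String>
where
    F: FnMut(&str) -> Option<String>,
{
    let mut out = String::with_capacity(raw.len());
    let mut rest = raw;
    while let Some(amp) = rest.find('&') {
        // Copy the literal text that precedes the reference.
        out.push_str(&rest[..amp]);
        let after = &rest[amp + 1..];
        let semi = after
            .find(';')
            .ok_or_else(|| "unterminated entity".to_string())?;
        let name = &after[..semi];
        let value =
            resolve(name).ok_or_else(|| format!("unrecognized entity: {}", name))?;
        // Appended as-is; scanning resumes in `rest`, not inside `value`.
        out.push_str(&value);
        rest = &after[semi + 1..];
    }
    out.push_str(rest);
    Ok(out)
}
```

With a resolver that maps a to "&b;&b;", the input "x &a; y" yields "x &b;&b; y": the output grows by at most one substitution per reference in the input, which is why the exponential growth of a billion-laughs payload cannot occur in a scheme of this shape.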

src/lib.rs

src/lib.rs, line 47

#![forbid(unsafe_code)]

The crate declares #![forbid(unsafe_code)] at the crate root, which is enforced by the compiler. No unsafe blocks exist anywhere in the published source. This justifies uses-unsafe.

src/parser/dtd.rs

The DTD parser skips DTD contents structurally without resolving or fetching external entities. It recognises the DOCTYPE internal subset markers (ENTITY, ELEMENT, ATTLIST, NOTATION declarations) to track quoting context, but it does not follow SYSTEM or PUBLIC URIs and never reads external resources. This means the parser is not susceptible to XXE (XML External Entity) attacks by default. Entity references encountered outside the DTD are surfaced as Event::GeneralRef events and are not automatically expanded. The serde deserializer resolves only the five predefined XML entities by default; custom entity resolution requires an explicit EntityResolver implementation supplied by the caller.
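What "skipping structurally while tracking quoting context" means can be sketched as a quote-aware scan for the end of a DOCTYPE declaration. This is a simplified illustration under stated assumptions, not the algorithm in src/parser/dtd.rs: the name doctype_end is hypothetical, and comments and processing instructions inside the internal subset are ignored for brevity.

```rust
/// Given the bytes that follow `<!DOCTYPE`, returns the index of the `>`
/// that closes the declaration. Quoted literals and the `[...]` internal
/// subset are stepped over, never interpreted or dereferenced.
/// Hypothetical helper for illustration; not part of quick-xml.
pub fn doctype_end(bytes: &[u8]) -> Option<usize> {
    let mut quote: Option<u8> = None; // the open quote character, if any
    let mut depth = 0usize; // nesting level of `[` ... `]`
    for (i, &b) in bytes.iter().enumerate() {
        match quote {
            // Inside a literal: only the matching quote character matters.
            Some(q) => {
                if b == q {
                    quote = None;
                }
            }
            None => match b {
                b'"' | b'\'' => quote = Some(b),
                b'[' => depth += 1,
                b']' => depth = depth.saturating_sub(1),
                // A `>` outside quotes and outside the subset ends the DOCTYPE.
                b'>' if depth == 0 => return Some(i),
                _ => {}
            },
        }
    }
    None // input ended before the declaration was closed
}
```

A SYSTEM identifier such as "http://example.invalid/x.dtd" is just another quoted literal to this scan: it is consumed as bytes and nothing is fetched.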

src/reader/buffered_reader.rs

src/reader/buffered_reader.rs, lines 573-580

impl Reader<BufReader<File>> {
    /// Creates an XML reader from a file path.
    pub fn from_file<P: AsRef<Path>>(path: P) -> Result<Self> {
        let file = File::open(path)?;
        let reader = BufReader::new(file);
        Ok(Self::from_reader(reader))
    }
}

The Reader::from_file and NsReader::from_file methods open a caller-specified path via File::open. This is the only filesystem operation in the crate; it opens only what the caller passes and does not chase external references. Path traversal safety is the caller's responsibility. This justifies uses-filesystem and filesystem-safe.
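Since path handling is left to the caller, a caller that accepts untrusted file names may want a guard before calling from_file. The following is one possible sketch, assuming the caller wants to confine reads to a base directory; the function name is_confined_relative is hypothetical, and the check is purely lexical, so it does not detect symlinks that point outside the base.

```rust
use std::path::{Component, Path};

/// Accepts only non-empty relative paths built from plain name components.
/// Rejects `..`, absolute paths, and Windows drive or UNC prefixes.
/// Hypothetical caller-side helper for illustration; not part of quick-xml.
pub fn is_confined_relative(path: &Path) -> bool {
    let mut saw_name = false;
    for component in path.components() {
        match component {
            Component::Normal(_) => saw_name = true,
            Component::CurDir => {} // a leading `.` is harmless
            // ParentDir, RootDir, and Prefix could all leave the base directory.
            _ => return false,
        }
    }
    saw_name
}
```

A caller would join an accepted path onto its own base directory and pass the result to Reader::from_file; rejected input never reaches File::open.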