quick-xml


High-performance XML pull reader/writer.

The reader:

- is almost zero-copy (use of Cow whenever possible)
- is easy on memory allocation (the API provides a way to reuse buffers)
- supports various encodings (with the encoding feature), namespace resolution, and special characters.
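To illustrate what "almost zero-copy" means in practice, here is a minimal standard-library sketch of the Cow pattern. The `unescape_amp` helper is hypothetical and is not part of quick-xml; it only handles the `&amp;` entity and is meant to show when a borrow is enough and when an allocation is unavoidable.

```rust
use std::borrow::Cow;

/// Hypothetical helper (not quick-xml's API): replaces only the `&amp;` entity.
fn unescape_amp(input: &str) -> Cow<'_, str> {
    if input.contains("&amp;") {
        // An entity is present, so a new String has to be allocated.
        Cow::Owned(input.replace("&amp;", "&"))
    } else {
        // Nothing to rewrite: return a borrowed view of the input, no copy.
        Cow::Borrowed(input)
    }
}

fn main() {
    let plain = unescape_amp("Test 2");
    let escaped = unescape_amp("Tom &amp; Jerry");

    // Only the second call had to allocate.
    assert!(matches!(plain, Cow::Borrowed(_)));
    assert!(matches!(escaped, Cow::Owned(_)));
    println!("{plain} / {escaped}");
}
```

Callers that need ownership can call `into_owned()` on the result, which copies only in the borrowed case.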

Syntax is inspired by xml-rs.

Example

Reader

```rust
use quick_xml::events::Event;
use quick_xml::reader::Reader;

let xml = r#"<tag1 att1="test">
                <tag2>Test</tag2>
                <tag2>Test 2</tag2>
             </tag1>"#;
let mut reader = Reader::from_str(xml);
reader.trim_text(true);

let mut count = 0;
let mut txt = Vec::new();
let mut buf = Vec::new();

// The `Reader` does not implement `Iterator` because it outputs borrowed data (`Cow`s)
loop {
    // NOTE: this is the generic case when we don't know about the input BufRead.
    // When the input is a &str or a &[u8], we don't actually need to use another
    // buffer, we could directly call `reader.read_event()`
    match reader.read_event_into(&mut buf) {
        Err(e) => panic!("Error at position {}: {:?}", reader.buffer_position(), e),
        // exits the loop when reaching end of file
        Ok(Event::Eof) => break,

        Ok(Event::Start(e)) => {
            match e.name().as_ref() {
                b"tag1" => println!("attributes values: {:?}",
                                    e.attributes().map(|a| a.unwrap().value)
                                    .collect::<Vec<_>>()),
                b"tag2" => count += 1,
                _ => (),
            }
        }
        Ok(Event::Text(e)) => txt.push(e.unescape().unwrap().into_owned()),

        // There are several other `Event`s we do not consider here
        _ => (),
    }
    // if we don't keep a borrow elsewhere, we can clear the buffer to keep memory usage low
    buf.clear();
}
```
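The buffer-reuse pattern in the reader loop can be sketched with the standard library alone. This is an illustration of the idea, not quick-xml's implementation: the `count_chunks` helper is hypothetical and simply splits the input on `>` with `BufRead::read_until`, reusing a single `Vec<u8>` across iterations.

```rust
use std::io::BufRead;

/// Hypothetical helper: reads `>`-terminated chunks into one reused buffer.
/// Returns the number of chunks and the length of the longest one.
fn count_chunks<R: BufRead>(mut input: R) -> std::io::Result<(usize, usize)> {
    let mut buf = Vec::new();
    let mut chunks = 0;
    let mut max_len = 0;
    loop {
        // `read_until` appends to `buf`, so stale data must be cleared between reads.
        let read = input.read_until(b'>', &mut buf)?;
        if read == 0 {
            break; // end of input
        }
        chunks += 1;
        max_len = max_len.max(buf.len());
        // `clear` drops the contents but keeps the allocated capacity,
        // so later iterations rarely need to allocate again.
        buf.clear();
    }
    Ok((chunks, max_len))
}

fn main() -> std::io::Result<()> {
    let (chunks, max_len) = count_chunks("<a><b>x</b></a>".as_bytes())?;
    println!("{chunks} chunks, longest is {max_len} bytes");
    Ok(())
}
```

Because the buffer is cleared rather than recreated, its capacity settles at the size of the largest chunk seen so far.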

Writer

```rust
use quick_xml::events::{Event, BytesEnd, BytesStart};
use quick_xml::reader::Reader;
use quick_xml::writer::Writer;
use std::io::Cursor;

let xml = r#"<this_tag k1="v1" k2="v2"><child>text</child></this_tag>"#;
let mut reader = Reader::from_str(xml);
reader.trim_text(true);
let mut writer = Writer::new(Cursor::new(Vec::new()));
loop {
    match reader.read_event() {
        Ok(Event::Start(e)) if e.name().as_ref() == b"this_tag" => {

            // creates a new element ... alternatively we could reuse `e` by calling
            // `e.into_owned()`
            let mut elem = BytesStart::new("my_elem");

            // copy the existing attributes
            elem.extend_attributes(e.attributes().map(|attr| attr.unwrap()));

            // add a new my-key="some value" attribute
            elem.push_attribute(("my-key", "some value"));

            // writes the event to the writer
            assert!(writer.write_event(Event::Start(elem)).is_ok());
        },
        Ok(Event::End(e)) if e.name().as_ref() == b"this_tag" => {
            assert!(writer.write_event(Event::End(BytesEnd::new("my_elem"))).is_ok());
        },
        Ok(Event::Eof) => break,
        // we can either move or borrow the event to write, depending on your use-case
        Ok(e) => assert!(writer.write_event(e).is_ok()),
        Err(e) => panic!("Error at position {}: {:?}", reader.buffer_position(), e),
    }
}

let result = writer.into_inner().into_inner();
let expected = r#"<my_elem k1="v1" k2="v2" my-key="some value"><child>text</child></my_elem>"#;
assert_eq!(result, expected.as_bytes());
```

Serde

When using the serialize feature, quick-xml can be used with serde's Serialize/Deserialize traits. The mapping between XML and Rust types, and in particular the syntax that allows you to specify the distinction between elements and attributes, is described in detail in the documentation for deserialization.

Credits

This has largely been inspired by serde-xml-rs. quick-xml follows its convention for deserialization, including the $value special name.

Parsing the "value" of a tag

If you have an input of the form <foo abc="xyz">bar</foo>, and you want to get at the bar, you can use either the special name $text, or the special name $value:

```rust,ignore
struct Foo {
    #[serde(rename = "@abc")]
    pub abc: String,
    #[serde(rename = "$text")]
    pub body: String,
}
```

Read about the difference in the documentation.

Performance

Note that although the serde integration does not focus on performance (there are several unnecessary copies), it remains about 10x faster than serde-xml-rs.

Features

Performance

Benchmarking is hard and the results depend on your input file and your machine.

On my particular file, quick-xml is around 50 times faster than the xml-rs crate.

```
// quick-xml benches
test bench_quick_xml            ... bench:     198,866 ns/iter (+/- 9,663)
test bench_quick_xml_escaped    ... bench:     282,740 ns/iter (+/- 61,625)
test bench_quick_xml_namespaced ... bench:     389,977 ns/iter (+/- 32,045)

// same bench with xml-rs
test bench_xml_rs               ... bench:  14,468,930 ns/iter (+/- 321,171)

// serde-xml-rs vs serialize feature
test bench_serde_quick_xml      ... bench:   1,181,198 ns/iter (+/- 138,290)
test bench_serde_xml_rs         ... bench:  15,039,564 ns/iter (+/- 783,485)
```
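To read the listing as speedup factors, divide the slower ns/iter figure by the faster one. A small sketch using the numbers from this particular run (the `speedup` helper is only an illustrative name, and the ratios will differ on other inputs and machines):

```rust
/// Ratio of two ns/iter measurements: how many times faster the quicker run is.
fn speedup(slow_ns: u64, fast_ns: u64) -> f64 {
    slow_ns as f64 / fast_ns as f64
}

fn main() {
    // Figures copied from the benchmark listing.
    let parsing = speedup(14_468_930, 198_866);
    let serde = speedup(15_039_564, 1_181_198);

    println!("xml-rs vs quick-xml: {parsing:.1}x");
    println!("serde-xml-rs vs serialize feature: {serde:.1}x");
}
```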

For a feature and performance comparison, you can also have a look at RazrFalcon's parser comparison table.

Contribute

Any PR is welcome!

License

MIT