Represents an XML 1.0 document as a read-only tree.
```rust
// Find element by id.
let doc = roxmltree::Document::parse("<rect id='rect1'/>").unwrap();
let elem = doc.descendants().find(|n| n.attribute("id") == Some("rect1")).unwrap();
assert!(elem.has_tag_name("rect"));
```
The tree is read-only because in many cases all you need is to retrieve some data from an XML document, and for such cases we can make a lot of optimizations.
As for roxmltree, it's fast not only because it's read-only, but also because it uses [xmlparser], which is many times faster than [xml-rs]. See the Performance section for details.
Sadly, XML can be parsed in many different ways. roxmltree tries to mimic the behavior of Python's lxml. But unlike lxml, roxmltree does support comments outside the root element.
For more details, see docs/parsing.md.
| Feature/Crate                   | roxmltree        | [libxml2]           | [xmltree]        | [elementtree]    | [sxd-document]   | [treexml]        |
| ------------------------------- | :--------------: | :-----------------: | :--------------: | :--------------: | :--------------: | :--------------: |
| Element namespace resolving     | ✔                | ✔                   | ✔                | ✔                | ~<sup>1</sup>    |                  |
| Attribute namespace resolving   | ✔                | ✔                   |                  |                  | ✔                |                  |
| [Entity references]             | ✔                | ✔                   | ⚠                | ⚠                | ⚠                | ⚠                |
| [Character references]          | ✔                | ✔                   | ✔                | ✔                | ✔                | ✔                |
| [Attribute-Value normalization] | ✔                | ✔                   |                  |                  |                  |                  |
| Comments                        | ✔                | ✔                   |                  |                  | ✔                |                  |
| Processing instructions         | ✔                | ✔                   | ⚠                |                  | ✔                |                  |
| UTF-8 BOM                       | ✔                | ✔                   | ⚠                | ⚠                | ⚠                | ⚠                |
| Non UTF-8 input                 |                  | ✔                   |                  |                  |                  |                  |
| Complete DTD support            |                  | ✔                   |                  |                  |                  |                  |
| Position preserving<sup>2</sup> | ✔                | ✔                   |                  |                  |                  |                  |
| HTML support                    |                  | ✔                   |                  |                  |                  |                  |
| Tree modification               |                  | ✔                   | ✔                | ✔                | ✔                | ✔                |
| Writing                         |                  | ✔                   | ✔                | ✔                | ✔                | ✔                |
| No unsafe                       | ✔                |                     | ✔                | ~<sup>3</sup>    |                  | ✔                |
| Language                        | Rust             | C                   | Rust             | Rust             | Rust             | Rust             |
| Size overhead<sup>4</sup>       | ~60KiB           | ~1.4MiB<sup>5</sup> | ~80KiB           | ~96KiB           | ~135KiB          | ~110KiB          |
| Dependencies                    | 1                | ?<sup>5</sup>       | 2                | 18               | 2                | 14               |
| Tested version                  | 0.9.0            | 2.9.8               | 0.10.0           | 0.5.0            | 0.3.0            | 0.7.0            |
| License                         | MIT / Apache-2.0 | MIT                 | MIT              | BSD-3-Clause     | MIT              | MIT              |
Legend:
Notes:
`string_cache` crate.

```text
test largeroxmltree     ... bench:   3,344,633 ns/iter (+/- 9,063)
test largesdxdocument   ... bench:   7,583,625 ns/iter (+/- 20,025)
test largeelementtree   ... bench:  20,636,201 ns/iter (+/- 606,186)
test largexmltree       ... bench:  20,792,783 ns/iter (+/- 523,851)
test largetreexml       ... bench:  21,119,276 ns/iter (+/- 607,112)

test mediumroxmltree    ... bench:     659,865 ns/iter (+/- 7,519)
test mediumsdxdocument  ... bench:   2,510,734 ns/iter (+/- 18,054)
test mediumtreexml      ... bench:   7,598,947 ns/iter (+/- 69,761)
test mediumxmltree      ... bench:   7,678,284 ns/iter (+/- 174,265)
test mediumelementtree  ... bench:   7,899,743 ns/iter (+/- 92,997)

test tinyroxmltree      ... bench:       4,178 ns/iter (+/- 23)
test tinysdxdocument    ... bench:      18,202 ns/iter (+/- 91)
test tinytreexml        ... bench:      28,987 ns/iter (+/- 811)
test tinyelementtree    ... bench:      29,421 ns/iter (+/- 239)
test tinyxmltree        ... bench:      29,425 ns/iter (+/- 877)
```
roxmltree uses [xmlparser] internally, while sxd-document uses its own implementation, and xmltree, elementtree and treexml use the [xml-rs] crate. Here is a comparison between xmlparser, xml-rs and quick-xml:
```text
test largequickxml    ... bench:   1,245,293 ns/iter (+/- 532,460)
test largexmlparser   ... bench:   1,615,152 ns/iter (+/- 11,505)
test largexmlrs       ... bench:  19,024,349 ns/iter (+/- 1,102,255)

test mediumquickxml   ... bench:     246,507 ns/iter (+/- 3,300)
test mediumxmlparser  ... bench:     337,958 ns/iter (+/- 2,465)
test mediumxmlrs      ... bench:   6,944,242 ns/iter (+/- 29,862)

test tinyquickxml     ... bench:       2,328 ns/iter (+/- 67)
test tinyxmlparser    ... bench:       2,578 ns/iter (+/- 931)
test tinyxmlrs        ... bench:      27,343 ns/iter (+/- 3,299)
```
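To put "many times faster" into numbers, the ratio between the xml-rs and xmlparser timings can be computed directly from the figures above. This is only a back-of-the-envelope sketch: the names `slowdown` and `XMLRS_VS_XMLPARSER_NS` are illustrative, and the values are copied from the benchmark output in this section.

```rust
/// (xml-rs ns/iter, xmlparser ns/iter) for the large, medium and tiny inputs,
/// copied from the benchmark output above.
const XMLRS_VS_XMLPARSER_NS: [(u64, u64); 3] = [
    (19_024_349, 1_615_152), // large
    (6_944_242, 337_958),    // medium
    (27_343, 2_578),         // tiny
];

/// Returns how many times slower `slow_ns` is compared to `fast_ns`.
fn slowdown(slow_ns: u64, fast_ns: u64) -> f64 {
    slow_ns as f64 / fast_ns as f64
}

/// Computes the xml-rs / xmlparser ratio for each input size.
fn xmlrs_slowdowns() -> Vec<f64> {
    XMLRS_VS_XMLPARSER_NS
        .iter()
        .map(|&(slow, fast)| slowdown(slow, fast))
        .collect()
}
```

On these particular runs the ratios fall roughly between 10x and 21x, depending on the input size.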
You can try it yourself by running `cargo bench` in the `benches` dir.
Notes:
`xmlReadFile()` will parse only an XML structure, without attribute normalization and similar processing. So it's hard to compare, and we would have to use a separate benchmark utility.

`unsafe` code.

This library uses Rust's idiomatic API based on iterators. If you are more familiar with browser/JS DOM APIs, you can check out tests/dom-api.rs to see how they can be converted into Rust ones.
Licensed under either of the Apache License, Version 2.0 or the MIT license, at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.