xml-rs is an XML library for the Rust programming language. It is heavily inspired by the Java Streaming API for XML (StAX).
This library currently contains a pull parser much like the StAX event reader. It provides an iterator API, so you can leverage Rust's existing iterator library features.
It also provides a streaming document writer much like the StAX event writer. This writer consumes its own set of events, but reader events can be converted to writer events easily, so it is possible to write XML transformation chains in a fairly clean manner.
This parser is mostly full-featured; however, there are limitations:
* only UTF-8 is supported;
* DTD validation is not supported, and `<!DOCTYPE>` declarations are completely ignored, so custom entities are not supported either; internal DTD declarations are likely to cause parsing errors;
* attribute value normalization is not performed, and end-of-line characters are not normalized either.
Other than that, the parser tries to be mostly XML-1.0-compliant.
The writer is also mostly full-featured, with the following limitations:
* no support for encodings other than UTF-8, for the same reason as above;
* no support for emitting `<!DOCTYPE>` declarations;
* more validation of input is needed, for example, checking that namespace prefixes are bound or that comments are well-formed.
What is planned (highest priority first, approximately):
xml-rs uses Cargo, so add it with `cargo add xml-rs` or modify `Cargo.toml`:
```toml
[dependencies]
xml-rs = "0.8"
```
The package exposes a single crate called `xml`.
`xml::reader::EventReader` requires a `Read` instance to read from. It can be a `File` wrapped in `BufReader`, or an in-memory buffer such as a `&[u8]` slice (a `Vec<u8>` can be passed as a slice via `as_slice()`).
`EventReader` implements the `IntoIterator` trait, so you can use it in a `for` loop directly:
```rust,no_run
use std::fs::File;
use std::io::BufReader;

use xml::reader::{EventReader, XmlEvent};

fn main() -> std::io::Result<()> {
    let file = File::open("file.xml")?;
    let file = BufReader::new(file); // Buffering is important for performance

    let parser = EventReader::new(file);
    let mut depth = 0;
    for e in parser {
        match e {
            Ok(XmlEvent::StartElement { name, .. }) => {
                println!("{:spaces$}+{name}", "", spaces = depth * 2);
                depth += 1;
            }
            Ok(XmlEvent::EndElement { name }) => {
                depth -= 1;
                println!("{:spaces$}-{name}", "", spaces = depth * 2);
            }
            Err(e) => {
                eprintln!("Error: {e}");
                break;
            }
            // There's more: https://docs.rs/xml-rs/latest/xml/reader/enum.XmlEvent.html
            _ => {}
        }
    }

    Ok(())
}
```
Document parsing can end normally or with an error. Regardless of the exact cause, the parsing process will stop and the iterator will terminate normally.
You can also have finer control over when to pull the next event from the parser using its own `next()` method:
```rust,ignore
match parser.next() {
    ...
}
```
Upon the end of the document or an error, the parser will remember the last event and will always return it from subsequent `next()` calls. If the iterator is used, it will yield the error or end-of-document event once and will produce `None` afterwards.
It is also possible to tweak the parsing process a little using the `xml::reader::ParserConfig` structure. See its documentation for more information and examples.
You can find a more extensive example of using `EventReader` in `src/analyze.rs`, a small program (built with `cargo build` and runnable afterwards) that shows various statistics about a specified XML document. It can also be used to check XML documents for well-formedness: if a document is not well-formed, the program will exit with an error.
xml-rs also provides a streaming writer much like the StAX event writer. With it you can write an XML document to any `Write` implementor.
```rust,no_run
use std::io;
use xml::writer::{EmitterConfig, XmlEvent};

/// A simple demo syntax where "+foo" makes `<foo>`, "-foo" makes `</foo>`
fn make_event_from_line(line: &str) -> XmlEvent {
    let line = line.trim();
    if let Some(name) = line.strip_prefix("+") {
        XmlEvent::start_element(name).into()
    } else if line.starts_with("-") {
        XmlEvent::end_element().into()
    } else {
        XmlEvent::characters(line).into()
    }
}

fn main() -> io::Result<()> {
    let input = io::stdin();
    let output = io::stdout();
    let mut writer = EmitterConfig::new()
        .perform_indent(true)
        .create_writer(output);

    let mut line = String::new();
    loop {
        line.clear();
        let bytes_read = input.read_line(&mut line)?;
        if bytes_read == 0 {
            break; // EOF
        }

        let event = make_event_from_line(&line);
        if let Err(e) = writer.write(event) {
            panic!("Write error: {e}")
        }
    }
    Ok(())
}
```
The code example above also demonstrates how to create a writer from its configuration. A similar approach works with `EventReader`.
The library provides an XML event building DSL which helps to construct complex events, e.g. ones having namespace definitions. Some examples:
```rust,ignore
//
//
//
```
Of course, one can create `XmlEvent` enum variants directly instead of using the builder DSL. There are more examples in the `xml::writer::XmlEvent` documentation.
The writer has multiple configuration options; see the `EmitterConfig` documentation for more information.
All known issues are tracked on the GitHub issue tracker: https://github.com/kornelski/xml-rs/issues. Feel free to report any problems you find there.
This library is licensed under the MIT license.
Copyright (C) Vladimir Matveev, 2014-2020