WIP - work in progress, use at your own risk
A simple and general-purpose html/xhtml parser, using Pest.
`<cat/>`, `<Cat/>` and `<C4-t/>` are all ok!

If your requirements match any of the above, then you're most likely looking for one of the crates below:
Parse html document
```rust
use html_parser::Dom;

fn main() {
    let html = r#"
        <!doctype html>
        <html lang="en">
            <head>
                <meta charset="utf-8">
                <title>Html parser</title>
            </head>
            <body>
                <h1 id="a" class="b c">Hello world</h1>
                </h1> <!-- comments & dangling elements are ignored -->
            </body>
        </html>"#;

    assert!(Dom::parse(html).is_ok());
}
```
Parse html fragment
```rust
use html_parser::Dom;

fn main() {
    let html = "<div id=cat />";
    assert!(Dom::parse(html).is_ok());
}
```
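Fragments may also use custom element names such as the `<cat/>`, `<Cat/>` and `<C4-t/>` mentioned above. The sketch below gives a rough, stand-alone idea of what a permissive tag-name check could look like. It is not the crate's actual grammar (that lives in Pest) and it does not use the crate at all. The helper `is_plausible_tag_name` and its rule (an ASCII letter followed by ASCII letters, digits or hyphens) are assumptions made purely for illustration.

```rust
/// Hypothetical helper, not part of html_parser.
/// Assumed rule: the name starts with an ASCII letter and continues with
/// ASCII letters, digits or hyphens. The real parser may accept more or less.
fn is_plausible_tag_name(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        // First character must be an ASCII letter; the rest is checked lazily.
        Some(first) if first.is_ascii_alphabetic() => {
            chars.all(|c| c.is_ascii_alphanumeric() || c == '-')
        }
        // Empty names and names starting with anything else are rejected.
        _ => false,
    }
}

fn main() {
    // The three names from the text all pass under the assumed rule.
    assert!(is_plausible_tag_name("cat"));
    assert!(is_plausible_tag_name("Cat"));
    assert!(is_plausible_tag_name("C4-t"));

    // Names that are empty or start with a digit do not.
    assert!(!is_plausible_tag_name(""));
    assert!(!is_plausible_tag_name("4cat"));
}
```

Because the check is case-insensitive and allows digits and hyphens after the first letter, non-standard names pass without needing a list of known html elements.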
Print to json
```rust
use html_parser::{Dom, Result};

fn main() -> Result<()> {
    let html = "<div id=cat />";
    let json = Dom::parse(html)?.to_json_pretty()?;
    println!("{}", json);
    Ok(())
}
```
I would love to get some feedback if you find my little project useful. Please feel free to highlight issues with my code or submit a PR in case you want to improve it.