HTML Streaming Editor

License: MIT | [docs.rs](https://docs.rs/html-streaming-editor/) | Crates.io

Run (simple) manipulations on HTML files, like extracting parts. Use CSS selectors to define which parts of the HTML to operate on, and chain different commands in pipes to perform the desired operations.

Syntax

The basic syntax is:

COMMAND{ SELECTOR } | COMMAND{ SELECTOR }

Some commands use sub-pipelines. There are two kinds of such commands:

- "iterate"/"forEach": For each (sub) node matching the inner selector, the sub-pipeline is processed, but the elements themselves are not changed.

COMMAND{ SELECTOR ↦ COMMAND{ SELECTOR } | COMMAND { SELECTOR } }

COMMAND{ SELECTOR => COMMAND{ SELECTOR } | COMMAND { SELECTOR } }

The SELECTOR is a CSS selector.

Pipeline Types

There are three types of pipelines:

Commands

Currently supported element processing commands:

Currently supported element creating commands:

Currently supported string-value creating commands:

Binary

The binary is called hse and supports the following options:

```
USAGE:
    hse [OPTIONS]

ARGS:
    Single string with the command pipeline to perform. If it starts with an @
    the rest is treated as file name to read the pipeline definition from

OPTIONS:
    -h, --help       Print help information
    -i, --input      File name of the Input. - for stdin (default)
    -o, --output     File name of the Output. - for stdout (default)
    -V, --version    Print version information
```
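
The options above can be combined as in the following sketch. It only uses the flags and the `@` prefix listed in the usage text, and it assumes `hse` is installed and on the `PATH`. The file names `cleanup.hsp`, `cleaned.html` and `content.html` are hypothetical, and writing the pipeline file without a trailing newline is a cautious choice, not a documented requirement.

```shell
# Assumption: hse is installed and on PATH; all file names here are hypothetical.

# Store a pipeline definition in a file (written without a trailing newline).
printf '%s' 'ONLY{main, .main} | WITHOUT{script}' > cleanup.hsp

if command -v hse >/dev/null 2>&1 && [ -f index.html ]; then
  # Read the pipeline from cleanup.hsp and the input from index.html,
  # then write the result to cleaned.html.
  hse -i index.html -o cleaned.html @cleanup.hsp

  # With no -i/-o given, input comes from stdin and output goes to stdout.
  hse 'ONLY{main .content}' < index.html > content.html
fi
```

Keeping longer pipelines in a file avoids shell quoting problems, since the pipeline syntax itself uses quotes, braces and the `|` character.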

Example

```shell
# fetches all elements with CSS class "content" inside a <main> element
hse -i index.html 'ONLY{main .content}'

# fetches the <main> element, or the element with CSS class main, but without any <script> defined inside
hse -i index.html 'ONLY{main, .main} | WITHOUT{script}'

# replaces all elements with placeholder class with the <div class="content"> from a second HTML file
hse -i index.html 'MAP{.placeholder ↤ SOURCE{"other.html"} | ONLY{div.content} }'

# add a new <meta> element to <head> with git version info
hse -i index.html "WITH{head ↦ APPEND-ELEMENT{ NEW{meta} | SET-ATTR{name ↤ 'version'} | SET-ATTR{content ↤ '`git describe --tags`'} } }"

# add a new comment to <body> with git version info
hse -i index.html "WITH{body ↦ APPEND-COMMENT{'`git describe --tags`'}}"

# add an RDF <meta> element with same content as <title>
hse -i input.html "WITH{head ↦ APPEND-ELEMENT{ NEW{meta} | SET-ATTR{name ↤ 'dc:title' } } | WITH{meta[name='dc:title'] ↦ SET-ATTR{content ↤ QUERY-PARENT{title} | GET-TEXT-CONTENT } } }"

# replace non-word characters with an underscore in an attribute
hse -i index.html "EXTRACT-ELEMENT{#target} | SET-ATTR{data-test ↤ USE-ELEMENT | GET-ATTR{data-test} | REGEX-REPLACE{'\W' ↤ '_'} }"

# run the pipeline defined in file file.hsp on content of index.html
hse -i index.html @file.hsp
```