rphtml

An HTML parser written in Rust and compiled to WebAssembly with wasm-pack/wasm-bindgen, published as an npm package.


Use in node

```bash
# npm
npm install rphtml --save

# yarn
yarn add rphtml
```

```javascript
import rphtml from "rphtml";

const htmlCode = `
this is header.
`;

// parse() returns a pointer object, not plain data
const ast = rphtml.parse(htmlCode, {
  allow_self_closing: true,
  allow_fix_unclose: false,
  case_sensitive_tagname: false,
});

// toJson() converts the pointer into a plain object tree
const jsonData = ast.toJson();
console.log(jsonData);
/*
// will output like this
{
  tag_index: 0,
  depth: 0,
  node_type: 'AbstractRoot',
  begin_at: { line_no: 1, col_no: 0, index: 0 },
  end_at: { line_no: 6, col_no: 0, index: 72 },
  childs: [
    {
      tag_index: 0,
      depth: 1,
      node_type: 'SpacesBetweenTag',
      begin_at: [Object],
      end_at: [Object],
      content: [Array]
    },
    {
      tag_index: 1,
      depth: 2,
      node_type: 'Tag',
      begin_at: [Object],
      end_at: [Object],
      end_tag: [Object],
      childs: [Array],
      meta: [Object]
    },
    {
      tag_index: 0,
      depth: 1,
      node_type: 'SpacesBetweenTag',
      begin_at: [Object],
      end_at: [Object],
      content: [Array]
    }
  ]
}
*/

// render() turns the AST back into HTML code
const code = ast.render({
  always_close_void: false,
  lowercase_tagname: true,
  minify_spaces: true,
  remove_attr_quote: false,
  remove_comment: false,
  remove_endtag_space: true,
});
console.log(code);

/*
// output
this is header.
*/
```

API

Methods

parse(content: string, parseOptions?: IJsParseOptions) : IJsNode

Static method that parses an HTML string into an AST. The return value is a pointer object, so you need to call its methods to get usable data.


IJsParseOptions

```typescript
// Options accepted by the static `parse` method.
type IJsParseOptions = {
  allow_self_closing?: boolean;
  allow_fix_unclose?: boolean;
  case_sensitive_tagname?: boolean;
};
```


IJsNode

Methods of IJsNode

render(renderOptions?: IJsRenderOptions) : string

Renders the AST back to HTML code.


IJsRenderOptions

```typescript
type IJsRenderOptions = {
  always_close_void?: boolean;
  lowercase_tagname?: boolean;
  minify_spaces?: boolean;
  remove_attr_quote?: boolean;
  remove_comment?: boolean;
  remove_endtag_space?: boolean;
  inner_html?: boolean;
  decode_entity?: boolean;
};
```
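As a rough illustration of how these flags might be grouped, the sketch below builds option objects from the fields listed above. The helper `renderPreset` and the preset names are hypothetical and not part of rphtml; the returned object is only meant to be passed to render().

```javascript
// Hypothetical helper (not part of rphtml): returns an object in the
// IJsRenderOptions shape, using only fields documented above.
function renderPreset(name) {
  if (name === "minify") {
    // Switch on the size-reducing flags.
    return {
      minify_spaces: true,
      remove_comment: true,
      remove_endtag_space: true,
      lowercase_tagname: true,
    };
  }
  // Any other name: switch the same flags off explicitly.
  return {
    minify_spaces: false,
    remove_comment: false,
    remove_endtag_space: false,
    lowercase_tagname: false,
  };
}

// Usage, assuming `ast` came from rphtml.parse(...):
//   const html = ast.render(renderPreset("minify"));
```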


toJson() : IJsNodeTree

Returns the parse tree as JSON data. Call it on the pointer object returned by the parse method.

```typescript
type IJsNodeTree = {
  uuid?: string; // the tag's uuid, only for element tag nodes
  depth: number; // the node's nesting depth
  node_type: NodeType;
  begin_at: CodePosAt;
  end_at: CodePosAt;
  content?: Array<string>; // content characters
  end_tag?: IJsNodeTree; // the closing tag
  meta?: IJsNodeTagMeta; // tag meta information
  childs?: Array<IJsNodeTree>; // the children of the tag
};
```
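Because the result of toJson() is plain data, it can be traversed with ordinary JavaScript. The following is a minimal sketch of a depth-first walk over the `childs` field. The helper `walkTree` is hypothetical, and the sample tree is hand-written in the documented shape (positions omitted) rather than produced by rphtml.

```javascript
// Depth-first walk over an IJsNodeTree-shaped object.
// Calls `visit` for each node, parents before children.
function walkTree(node, visit) {
  visit(node);
  for (const child of node.childs || []) {
    walkTree(child, visit);
  }
}

// Hand-written sample data; a real tree would come from ast.toJson().
const sampleTree = {
  depth: 0,
  node_type: "AbstractRoot",
  childs: [
    { depth: 1, node_type: "SpacesBetweenTag", content: ["\n"] },
    { depth: 1, node_type: "Tag", childs: [] },
    { depth: 1, node_type: "SpacesBetweenTag", content: ["\n"] },
  ],
};

const seen = [];
walkTree(sampleTree, (node) => seen.push(node.node_type));
console.log(seen.join(" > "));
// AbstractRoot > SpacesBetweenTag > Tag > SpacesBetweenTag
```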


toString() : string

Similar to toJson(), but returns the tree as a JSON string, as if JSON.stringify() had been applied to the toJson() result. Pass the string to JSON.parse() in JavaScript to get the same data that toJson() returns.
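A minimal sketch of that round trip is shown below. The helper `treeFromString` is hypothetical, and `sampleText` is a hand-written stand-in for the string a real toString() call would return.

```javascript
// Turns a JSON string back into a plain object tree.
// Throws if the text is not valid JSON or has no string node_type field.
function treeFromString(jsonText) {
  const tree = JSON.parse(jsonText);
  if (tree === null || typeof tree !== "object" || typeof tree.node_type !== "string") {
    throw new TypeError("not an IJsNodeTree-shaped object");
  }
  return tree;
}

// Stand-in for ast.toString(); a real result would be much larger.
const sampleText = '{"depth":0,"node_type":"AbstractRoot","childs":[]}';
const parsedTree = treeFromString(sampleText);
console.log(parsedTree.node_type); // AbstractRoot
```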


getTagByUuid(uuid:string) : IJsNode

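Returns the tag node with the given uuid. Every tag node carries its own uuid, and calling this method on the root node returns a reference to the matching child node, much like getElementById in the DOM.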
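One way to find a uuid to look up is to read it from the toJson() data first. The sketch below gathers every `uuid` in a tree. The helper `collectUuids` is hypothetical, and the uuid values in the sample are placeholders, not the format rphtml actually generates.

```javascript
// Collects every uuid found in an IJsNodeTree-shaped object, parents first.
function collectUuids(node, found = []) {
  if (typeof node.uuid === "string") found.push(node.uuid);
  for (const child of node.childs || []) collectUuids(child, found);
  return found;
}

// Hand-written sample data with placeholder uuids.
const uuidSample = {
  depth: 0,
  node_type: "AbstractRoot",
  childs: [
    {
      depth: 1,
      node_type: "Tag",
      uuid: "a1",
      childs: [{ depth: 2, node_type: "Tag", uuid: "b2", childs: [] }],
    },
  ],
};

const uuids = collectUuids(uuidSample);
console.log(uuids); // [ 'a1', 'b2' ]

// With a real parse result, a uuid from this list could then be passed on:
//   const node = ast.getTagByUuid(uuids[0]);
```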


isAloneTag: boolean

Property that indicates whether the node is a tag that contains no child tags, e.g. `<div>abc</div>`.
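The same idea can be approximated on toJson() data, which may help when only the JSON tree is at hand. This is a sketch of the definition above, not rphtml's own implementation: the helper `looksAlone` is hypothetical, and the `"Text"` node type in the sample is an assumed name for text content.

```javascript
// Returns true when a Tag node has no direct child whose node_type is "Tag".
// Non-tag nodes always return false.
function looksAlone(node) {
  if (node.node_type !== "Tag") return false;
  return !(node.childs || []).some((child) => child.node_type === "Tag");
}

// Roughly <div>abc</div>; "Text" is an assumed node type name.
const aloneTag = {
  depth: 1,
  node_type: "Tag",
  childs: [{ depth: 2, node_type: "Text", content: ["a", "b", "c"] }],
};

// A tag that wraps another tag.
const wrapperTag = {
  depth: 1,
  node_type: "Tag",
  childs: [{ depth: 2, node_type: "Tag", childs: [] }],
};

console.log(looksAlone(aloneTag)); // true
console.log(looksAlone(wrapperTag)); // false
```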


License

MIT License.