A server-side HTML document syntax and operation library written in Rust. It uses APIs similar to jQuery, leaves out the parts that only work in a browser (e.g. rendering and event-related methods), and uses snake_case names instead of JavaScript's camelCase.
It is helpful not only for web scraping: it also provides useful APIs for operating on text nodes, so you can use it to mix your HTML with dirty HTML segments to keep web scrapers away.
main.rs

```rust
use visdom::Vis;
use std::error::Error;

fn main() -> Result<(), Box<dyn Error>> {
  // ...
  Ok(())
}
```

Static method: `load(html: &str) -> Result<Elements, Box<dyn Error>>`

Load the `html` string into a document `Elements`.

Static method: `load_catch(html: &str, handle: Box<dyn Fn(Box<dyn Error>)>) -> Elements`

Load the `html` string into a document `Elements`, and use the `handle` to deal with errors such as HTML parse errors and invalid selectors. This is useful if you don't want the process to panic on those errors.

Static method: `dom(ele: &BoxDynElement) -> Elements`

Wrap the `ele` node in a single-node `Elements`. This copies `ele`; you don't need it if you only want to call the methods of the `BoxDynElement` itself.

e.g.:

```rust
// continuing from the code above
let texts = lis.map(|_index, ele| {
  let ele = Vis::dom(ele);
  String::from(ele.text())
});
// now `texts` will be a `Vec<String>`: ["Hello,", "Vis", "Dom"]
```

The following APIs are inherited from the library mesdoc.

| Instance | Trait | Inherit | Document |
| :--- | :--- | :--- | :--- |
| `BoxDynNode` | `INodeTrait` | None | INodeTrait Document |
| `BoxDynElement` | `IElementTrait` | `INodeTrait` | IElementTrait Document |
| `BoxDynText` | `ITextTrait` | `INodeTrait` | ITextTrait Document |
| `Box<dyn IDocumentTrait>` | `IDocumentTrait` | None | IDocumentTrait Document |

| Collections | Document |
| :--- | :--- |
| `Elements` | Elements Document |
| `Texts` | Texts Document |

| Selector API | Description | Remarks |
| :-------------------------------------------------------------------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :--------------------------------------------------: |
| The caller `Self` is an `Elements`; returns `Elements` | All of these APIs behave the same as in the jQuery library. | |
| `find(selector: &str)` | Get the descendants of each element in `Self`, filtered by the `selector`. | |
| `filter(selector: &str)` | Reduce `Self` to those elements that match the `selector`. | |
| `filter_by(handle: \|index: usize, ele: &BoxDynElement\| -> bool)` | Reduce `Self` to those elements that pass the `handle` function test. | |
| `filter_in(elements: &Elements)` | Reduce `Self` to those elements that are also in `elements`. | |
| `not(selector: &str)` | Remove the elements that match the `selector` from `Self`. | |
| `not_by(handle: \|index: usize, ele: &BoxDynElement\| -> bool)` | Remove the elements that pass the `handle` function test from `Self`. | |
| `not_in(elements: &Elements)` | Remove the elements that are also in `elements` from `Self`. | |
| `is(selector: &str)` | Check whether at least one element in `Self` matches the `selector`. | |
| `is_by(handle: \|index: usize, ele: &BoxDynElement\| -> bool)` | Check whether the `handle` function returns `true` for at least one element in `Self`. | |
| `is_in(elements: &Elements)` | Check whether at least one element in `Self` is also in `elements`. | |
| `is_all(selector: &str)` | Check whether every element in `Self` matches the `selector`. | |
| `is_all_by(handle: \|index: usize, ele: &BoxDynElement\| -> bool)` | Check whether the `handle` function returns `true` for every element in `Self`. | |
| `is_all_in(elements: &Elements)` | Check whether every element in `Self` is also in `elements`. | |
| `has(selector: &str)` | Reduce `Self` to those elements that have a descendant matching the `selector`. | |
| `has_in(elements: &Elements)` | Reduce `Self` to those elements that have a descendant in `elements`. | |
| `children(selector: &str)` | Get the children of each element in `Self`; when the `selector` is not empty, the result is filtered by the `selector`. | |
| `parent(selector: &str)` | Get the parent of each element in `Self`; when the `selector` is not empty, the result is filtered by the `selector`. | |
| `parents(selector: &str)` | Get the ancestors of each element in `Self`; when the `selector` is not empty, the result is filtered by the `selector`. | |
| `parents_until(selector: &str, filter: &str, contains: bool)` | Get the ancestors of each element in `Self`, up to the ancestor that matches the `selector`. When `contains` is true, the matched ancestor is included, otherwise it is excluded; when the `filter` is not empty, the result is filtered by the `filter`. | |
| `closest(selector: &str)` | Get the first matched element for each element in `Self`, traversing from the element itself up through its ancestors. | |
| `siblings(selector: &str)` | Get the siblings of each element in `Self`; when the `selector` is not empty, the result is filtered by the `selector`. | |
| `next(selector: &str)` | Get the next sibling of each element in `Self`; when the `selector` is not empty, the result is filtered by the `selector`. | |
| `next_all(selector: &str)` | Get all following siblings of each element in `Self`; when the `selector` is not empty, the result is filtered by the `selector`. | |
| `next_until(selector: &str, filter: &str, contains: bool)` | Get all following siblings of each element in `Self`, up to the sibling that matches the `selector`. When `contains` is true, the matched sibling is included, otherwise it is excluded; when the `filter` is not empty, the result is filtered by the `filter`. | |
| `prev(selector: &str)` | Get the previous sibling of each element in `Self`; when the `selector` is not empty, the result is filtered by the `selector`. | |
| `prev_all(selector: &str)` | Get all preceding siblings of each element in `Self`; when the `selector` is not empty, the result is filtered by the `selector`. | |
| `prev_until(selector: &str, filter: &str, contains: bool)` | Get all preceding siblings of each element in `Self`, up to the previous sibling that matches the `selector`. When `contains` is true, the matched previous sibling is included, otherwise it is excluded; when the `filter` is not empty, the result is filtered by the `filter`. | |
| `eq(index: usize)` | Get the element at the specified `index`. | |
| `first()` | Get the first element of the set, equal to `eq(0)`. | |
| `last()` | Get the last element of the set, equal to `eq(len - 1)`. | |
| `slice` | | |
| `add(eles: Elements)` | Get a concatenated element set from `Self` and `eles`. It generates a new element set and takes ownership of the parameter `eles`, but does not affect `Self`. | |

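To make the `contains` and `filter` parameters of `parents_until` concrete, here is a minimal std-only sketch. It is not visdom code: the function is a hypothetical model that treats the ancestors as a list of tag names ordered from the nearest parent outward and compares names by plain equality instead of real selector matching. It also assumes the `filter` is applied to the included stop ancestor as well, which the table does not state.

```rust
/// Hypothetical model of the `parents_until` semantics described above.
/// `ancestors` is ordered from the nearest parent outward.
fn parents_until<'a>(
    ancestors: &[&'a str],
    stop: &str,
    filter: &str,
    contains: bool,
) -> Vec<&'a str> {
    let mut out = Vec::new();
    for &tag in ancestors {
        if tag == stop {
            // The matched ancestor is only kept when `contains` is true.
            if contains {
                out.push(tag);
            }
            break;
        }
        out.push(tag);
    }
    // Assumption: an empty filter means "no filtering".
    if !filter.is_empty() {
        out.retain(|tag| *tag == filter);
    }
    out
}

fn main() {
    // Ancestors of an `a` element nested as html > body > nav > ul > li > a.
    let ancestors = ["li", "ul", "nav", "body", "html"];
    println!("{:?}", parents_until(&ancestors, "body", "", false)); // ["li", "ul", "nav"]
    println!("{:?}", parents_until(&ancestors, "body", "", true)); // ["li", "ul", "nav", "body"]
    println!("{:?}", parents_until(&ancestors, "body", "ul", false)); // ["ul"]
}
```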
| Helper API | Description | Remarks |
| :---------------------------------------------------------------------------------------- | :----------------------------------------------------------------------------------------------- | :--------------------------------------------: |
| `length()` | Get the number of elements in `Self`. | |
| `is_empty()` | Check whether `Self` has no elements, i.e. `length() == 0`. | |
| `for_each(handle: \|index: usize, ele: &mut BoxDynElement\| -> bool)` | Iterate over the elements in `Self`; when the `handle` returns `false`, the iteration stops. | You can also use `each` if you prefer less code. |
| `map<T>(\|index: usize, ele: &BoxDynElement\| -> T) -> Vec<T>` | Get a collection of values by iterating over each element in `Self` and calling the `handle` function. | |

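The early-stop rule of `for_each` and the collecting behaviour of `map` can be illustrated with plain slices. This is a std-only sketch, not visdom's implementation: the two free functions below are hypothetical stand-ins that only mirror the handle signatures from the table.

```rust
/// Hypothetical model: visit items in order and stop as soon as `handle` returns false.
fn for_each<T>(items: &mut [T], mut handle: impl FnMut(usize, &mut T) -> bool) {
    for (index, item) in items.iter_mut().enumerate() {
        if !handle(index, item) {
            break;
        }
    }
}

/// Hypothetical model: call `handle` for every item and collect the results.
fn map<T, R>(items: &[T], mut handle: impl FnMut(usize, &T) -> R) -> Vec<R> {
    items
        .iter()
        .enumerate()
        .map(|(index, item)| handle(index, item))
        .collect()
}

fn main() {
    let mut words = vec![
        String::from("Hello,"),
        String::from("Vis"),
        String::from("Dom"),
    ];
    // Returning false at index 1 stops the iteration, so "Dom" is never visited.
    for_each(&mut words, |index, word| {
        word.make_ascii_uppercase();
        index < 1
    });
    println!("{:?}", words); // ["HELLO,", "VIS", "Dom"]

    let lengths = map(&words, |_index, word| word.len());
    println!("{:?}", lengths); // [6, 3, 3]
}
```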
| Selectors | Description | Remarks |
| :------------------------------------- | :-------------------------------------------------------------------------------------------------------------- | :-------------------------------------------------------------------------: |
| `*` | MDN Universal Selectors | |
| `#id` | MDN Id Selector | |
| `.class` | MDN Class Selector | |
| `p` | MDN Type Selectors | |
| `[attr]` | MDN Attribute Selectors | |
| `[attr=value]` | See the above. | |
| `[attr*=value]` | See the above. | |
| `[attr\|=value]` | See the above. | |
| `[attr~=value]` | See the above. | |
| `[attr^=value]` | See the above. | |
| `[attr$=value]` | See the above. | |
| `[attr!=value]` | Supported by jQuery; matches an element that has an attribute `attr` whose value is not equal to `value`. | |
| `span > a` | MDN Child Combinator | Matches an `a` element whose parent is a `span`. |
| `span a` | MDN Descendant Combinator | |
| `span + a` | MDN Adjacent Sibling Combinator | |
| `span ~ a` | MDN General Sibling Combinator | |
| `span,a` | MDN Selector list | |
| `span.a` | Adjoining Selectors | Matches an element whose tag type is `span` and that also has the class `.a`. |
| `:empty` | MDN `:empty` | Pseudo Selectors |
| `:first-child` | MDN `:first-child` | |
| `:last-child` | MDN `:last-child` | |
| `:only-child` | MDN `:only-child` | |
| `:nth-child(nth)` | MDN `:nth-child()` | `nth` supports the keywords `odd` and `even`. |
| `:nth-last-child(nth)` | MDN `:nth-last-child()` | |
| `:first-of-type` | MDN `:first-of-type` | |
| `:last-of-type` | MDN `:last-of-type` | |
| `:only-of-type` | MDN `:only-of-type` | |
| `:nth-of-type(nth)` | MDN `:nth-of-type()` | |
| `:nth-last-of-type(nth)` | MDN `:nth-last-of-type()` | |
| `:not(selector)` | MDN `:not()` | |
| `:contains(content)` | Matches an element whose `text()` contains the content. | |
| `:header` | All title tags; alias of `h1,h2,h3,h4,h5,h6`. | |
| `:input` | All form input tags; alias of `input,select,textarea,button`. | |
| `:submit` | Form submit buttons; alias of `input[type="submit"],button[type="submit"]`. | |

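As a reminder of what the `odd` and `even` keywords of `:nth-child(nth)` mean, the following std-only sketch checks a 1-based child position against them. The helper `matches_nth` is hypothetical and is not part of visdom; it only handles `odd`, `even`, and a plain number, and ignores every other form that CSS allows.

```rust
/// Hypothetical helper: does the 1-based child `position` match `nth`?
/// Only `odd`, `even`, and a plain number are handled in this sketch.
fn matches_nth(nth: &str, position: usize) -> bool {
    match nth.trim() {
        "odd" => position % 2 == 1,
        "even" => position % 2 == 0,
        other => other.parse::<usize>().map_or(false, |n| n == position),
    }
}

fn main() {
    // Stand-ins for the text of three sibling `li` elements.
    let items = ["Hello,", "Vis", "Dom"];

    // Roughly what `li:nth-child(odd)` would pick: positions 1 and 3.
    let odd: Vec<&str> = items
        .iter()
        .enumerate()
        .filter(|(index, _)| matches_nth("odd", index + 1))
        .map(|(_, text)| *text)
        .collect();
    println!("{:?}", odd); // ["Hello,", "Dom"]
}
```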
| Attribute API | Description | Remarks |
| :------------------------------------------------------------ | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :-----------------------------------------------------------------------------------------------------------------: |
| `attr(attr_name: &str) -> Option<IAttrValue>` | Get the attribute with key `attr_name`. | The return value is an `Option` of the enum `IAttrValue`; `IAttrValue` has `is_true()`, `is_str(&str)`, and `to_list()` methods. |
| `set_attr(attr_name: &str, value: Option<&str>)` | Set the attribute with key `attr_name`. The value is an `Option<&str>`; when the value is `None`, the attribute doesn't have a string value and is a boolean attribute with the value `true`. | |
| `remove_attr(attr_name: &str)` | Remove the attribute with key `attr_name`. | |
| `has_class(class_name: &str) -> bool` | Check whether `Self`'s class list contains `class_name`; multiple classes can be separated by whitespace. | |
| `add_class(class_name: &str)` | Add the class to `Self`'s class list; multiple classes can be separated by whitespace. | |
| `remove_class(class_name: &str)` | Remove the class from `Self`'s class list; multiple classes can be separated by whitespace. | |
| `toggle_class(class_name: &str)` | Toggle the class in `Self`'s class list; multiple classes can be separated by whitespace. | |

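The `None` case of `set_attr` and the whitespace-separated class names are easy to misread, so here is a small std-only model of them. Everything in it is hypothetical: `AttrValue` and `Attrs` are illustrative stand-ins, not visdom's `IAttrValue` or its element type, and the sketch only mirrors the behaviour described in the table.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an attribute value: a bare boolean attribute or a string.
#[derive(Debug, Clone, PartialEq)]
enum AttrValue {
    True,
    Value(String),
}

/// Hypothetical attribute store for a single element.
#[derive(Default)]
struct Attrs(BTreeMap<String, AttrValue>);

impl Attrs {
    /// `None` means "no string value": the attribute is a boolean `true`.
    fn set_attr(&mut self, name: &str, value: Option<&str>) {
        let stored = match value {
            Some(text) => AttrValue::Value(text.to_string()),
            None => AttrValue::True,
        };
        self.0.insert(name.to_string(), stored);
    }

    fn attr(&self, name: &str) -> Option<AttrValue> {
        self.0.get(name).cloned()
    }

    fn remove_attr(&mut self, name: &str) {
        self.0.remove(name);
    }

    /// Each whitespace-separated name is toggled independently.
    fn toggle_class(&mut self, class_names: &str) {
        let mut classes: Vec<String> = match self.0.get("class") {
            Some(AttrValue::Value(text)) => text.split_whitespace().map(String::from).collect(),
            _ => Vec::new(),
        };
        for name in class_names.split_whitespace() {
            if let Some(pos) = classes.iter().position(|c| c.as_str() == name) {
                classes.remove(pos);
            } else {
                classes.push(name.to_string());
            }
        }
        self.0
            .insert("class".to_string(), AttrValue::Value(classes.join(" ")));
    }
}

fn main() {
    let mut attrs = Attrs::default();
    attrs.set_attr("disabled", None); // boolean attribute
    attrs.set_attr("class", Some("a b"));
    attrs.toggle_class("b c"); // removes `b`, adds `c`
    println!("{:?}", attrs.attr("disabled")); // Some(True)
    println!("{:?}", attrs.attr("class")); // Some(Value("a c"))
    attrs.remove_attr("disabled");
    println!("{:?}", attrs.attr("disabled")); // None
}
```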
| Content API | Description | Remarks |
| :---------------------------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :-----: |
| `text() -> &str` | Get the text of each element in `Self`; HTML entities are decoded automatically. | |
| `set_text(content: &str)` | Set the text of `Self`; HTML entities in `content` are encoded automatically. | |
| `html()` | Get the HTML of the first element in `Self`. | |
| `set_html(content: &str)` | Set the HTML of each element in `Self` to `content`. | |
| `outer_html()` | Get the outer HTML of the first element in `Self`. | |
| `texts(limit_depth: u32) -> Texts` | Get the text nodes of each element in `Self`. If `limit_depth` is `0`, all descendant text nodes are returned; if `1`, only the child text nodes are returned. Unlike `Elements`, `Texts` doesn't have the methods of the `IElementTrait` trait, but it has `append_text` and `prepend_text` methods from implementing `ITextTrait`. | |

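The `limit_depth` rule of `texts` can be pictured on a tiny tree. The sketch below is std-only and hypothetical: `Node` and the free function `texts` are illustrative, not visdom types. The table only defines the values `0` and `1`; treating larger values as "descend at most that many levels" is an assumption of this model.

```rust
/// Hypothetical, minimal DOM model: a text node or an element with children.
enum Node {
    Text(String),
    Element(Vec<Node>),
}

/// Collect text nodes below `children`.
/// `limit_depth == 0` means no limit; `1` means direct children only.
fn texts(children: &[Node], limit_depth: u32) -> Vec<&str> {
    let mut out = Vec::new();
    collect(children, limit_depth, 1, &mut out);
    out
}

fn collect<'a>(children: &'a [Node], limit: u32, depth: u32, out: &mut Vec<&'a str>) {
    for child in children {
        match child {
            Node::Text(text) => out.push(text.as_str()),
            Node::Element(kids) => {
                // Only descend while the depth limit has not been reached.
                if limit == 0 || depth < limit {
                    collect(kids, limit, depth + 1, out);
                }
            }
        }
    }
}

fn main() {
    // Models the children of `<p>Hello, <b>Vis</b> Dom</p>`.
    let children = vec![
        Node::Text(String::from("Hello, ")),
        Node::Element(vec![Node::Text(String::from("Vis"))]),
        Node::Text(String::from(" Dom")),
    ];
    println!("{:?}", texts(&children, 0)); // ["Hello, ", "Vis", " Dom"]
    println!("{:?}", texts(&children, 1)); // ["Hello, ", " Dom"]
}
```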
| DOM Insertion and Remove API | Description | Remarks |
| :---------------------------------------------- | :---------------------------------------------------------------------------------- | :-----: |
| `append(elements: &mut Elements)` | Append all `elements` into `Self`, after the last child. | |
| `append_to(elements: &mut Elements)` | The same as the above, but with the caller and the parameter target exchanged. | |
| `prepend(elements: &mut Elements)` | Prepend all `elements` into `Self`, before the first child. | |
| `prepend_to(elements: &mut Elements)` | The same as the above, but with the caller and the parameter target exchanged. | |
| `insert_after(elements: &mut Elements)` | Insert all `elements` after `Self`. | |
| `after(elements: &mut Elements)` | The same as the above, but with the caller and the parameter target exchanged. | |
| `insert_before(elements: &mut Elements)` | Insert all `elements` before `Self`. | |
| `before(elements: &mut Elements)` | The same as the above, but with the caller and the parameter target exchanged. | |
| `remove()` | Remove `Self`; this takes ownership of `Self`, so you can't use it again. | |
| `empty()` | Remove all children of each element in `Self`. | |

```rust
let html = r##"
  <div class="second-child"></div>
  <div id="container">
    <div class="first-child"></div>
  </div>
"##;
let root = Vis::load(html)?;
let mut container = root.find("#container");
let mut second_child = root.find(".second-child");
// append the `second-child` element to the `container`
container.append(&mut second_child);
// the document then becomes:
/*
  <div id="container">
    <div class="first-child"></div>
    <div class="second-child"></div>
  </div>
*/
// create a new element with `Vis::load`
let mut third_child = Vis::load(r##"<div class="third-child"></div>"##)?;
container.append(&mut third_child);
// the document then becomes:
/*
  <div id="container">
    <div class="first-child"></div>
    <div class="second-child"></div>
    <div class="third-child"></div>
  </div>
*/
```

You are welcome to open an issue if you have any questions, bug reports, or suggestions.