Requests2

Requests2 is a Rust crate that helps you write code similar to what you would write with Python's Requests library, together with HTML parsing in the style of Python's BS4 library.

Version 0.1.41 Update Intro

Example:

```rust
#[derive(DBfile)]
#[dbnote(table_name = "b1", driver = "postgres", primary_key = "href")]
struct Item {
    href: String, // the primary key should not appear at the end of the structure
    // ...
}
```

`Item {}.to_db()` writes the data to Postgres, using `href` as the primary key. `Item {}.create_table()` creates a table named `b1`.

If `driver` and `primary_key` are not provided, you can use `Item {}.to_csv()` to write the data to a CSV file. The file name is taken from `table_name`, so you can set `table_name = "table.csv"`.

The content of the config file looks like this: postgres=

Version 0.1.3 Update Intro

| Supported CSS selectors |
| ---- |
| .class.class |
| .class |
| #id |
| element.class |
| element element |
| [attr=value] |
| [attr~value] |
| element |

Version 0.1.31 adds CSS selectors

| Added CSS selectors |
| ---- |
| div#id div |
| [attr~value] |

In `[attr~value]`, the value supports regular expressions.

Once you add `#[derive(DBfile)]` to a struct, you can use the function `DBStore::to_csv` to write the data to a CSV file.

Example code

```rust
let data = Cache::new();
let client = Requests::new(&data);
let rq = client.connect("https://www.qq.com/", Headers::Default);

#[derive(DBfile, Debug)]
struct Link<'a> {
    href: &'a str,
    link_name: String,
}


rq.free_parse(|mut p| {
    p.free_select("li.nav-item a",|n| {
        let links = n.iter().map(|x| {
            Link { href: x.attr("href").expect("extra href error"), link_name: x.text() }
        }).collect::<Vec<Link>>();

        DBStore::to_csv(links, "D:\\links.csv", "a", true);

    });

});

```

Open links.csv to view the result:

```csv
href,link_name
http://news.qq.com/,新闻
https://v.qq.com/?isoldly=1,视频
http://gongyi.qq.com/,公益
https://new.qq.com/ch/milite/,军事
https://sports.qq.com/,体育
https://sports.qq.com/nba/,NBA
https://new.qq.com/ch/ent/,娱乐
https://new.qq.com/ch/finance/,财经
https://new.qq.com/ch/tech/,科技
https://new.qq.com/ch/fashion/,时尚
https://new.qq.com/ch/auto/,汽车
http://house.qq.com/,房产
https://new.qq.com/ch/edu/,教育
https://new.qq.com/ch/cul/,文化
https://new.qq.com/ch/astro/,星座
https://new.qq.com/ch/games/,游戏
http://book.qq.com/,文学
https://v.qq.com/tv/,热剧
https://new.qq.com/ch/antip/,抗肺炎
http://new.qq.com/ch/history/,历史
http://sports.qq.com/premierleague/,英超
http://sports.qq.com/cba/,CBA
https://new.qq.com/ch2/star,明星
https://new.qq.com/ch/finance_licai/,理财
https://new.qq.com/ch/kepu/,科普
https://new.qq.com/ch/health/,健康
https://auto.qq.com/car_public/index.shtml,车型
http://www.jia360.com,家居
https://new.qq.com/ch/baby/,育儿
https://new.qq.com/ch/emotion/,情感
https://new.qq.com/ch/comic/,动漫
https://new.qq.com/omv/,享看
http://tianqi.qq.com/index.htm,天气
https://new.qq.com/omn/author/5107513,较真
https://v.qq.com/channel/variety,综艺
https://new.qq.com/ch/cul_ru/,新国风
https://new.qq.com/ch/world/,国际
http://sports.qq.com/csocce/csl/,中超
http://fans.sports.qq.com/#/,社区
http://v.qq.com/movie/,电影
https://new.qq.com/ch/finance_stock/,证券
https://new.qq.com/ch/digi/,数码
https://new.qq.com/ch2/makeup,美容
https://new.qq.com/ch/topic/,话题
https://new.qq.com/ch/life/,生活
http://kid.qq.com/,儿童
http://www.qq.com/map/,全部
```
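The header line and the one-row-per-record layout of the CSV output follow directly from the struct's field names and values. As a rough illustration of that mapping, the sketch below writes by hand the kind of code a derive such as `DBfile` might generate. The trait `CsvRow` and the function `to_csv_string` are hypothetical names invented for this example and are not part of the crate; quoting of special characters is left out.

```rust
/// Hypothetical trait standing in for what a derive such as `DBfile` might generate.
pub trait CsvRow {
    /// Column names, in field order.
    fn header() -> Vec<&'static str>;
    /// Field values of one record, in the same order as `header`.
    fn row(&self) -> Vec<String>;
}

pub struct Link {
    pub href: String,
    pub link_name: String,
}

// Written by hand here; a derive macro would produce the equivalent from the fields.
impl CsvRow for Link {
    fn header() -> Vec<&'static str> {
        vec!["href", "link_name"]
    }

    fn row(&self) -> Vec<String> {
        vec![self.href.clone(), self.link_name.clone()]
    }
}

/// Renders records as CSV text, optionally preceded by a header line.
/// Escaping of commas or quotes inside values is deliberately omitted.
pub fn to_csv_string<T: CsvRow>(rows: &[T], with_header: bool) -> String {
    let mut out = String::new();
    if with_header {
        out.push_str(&T::header().join(","));
        out.push('\n');
    }
    for r in rows {
        out.push_str(&r.row().join(","));
        out.push('\n');
    }
    out
}
```

Calling `to_csv_string(&links, true)` on a slice of `Link` values yields the header line followed by one `href,link_name` line per record.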

Example

```rust
let data = Cache::new();
let client = Requests::new(&data);
let mut rq = client.connect("https://www.qq.com/", Headers::Default);

rq.parser(|p| {
    p.find_all("a", |x| {
        x.attr("href").map_or(false, |v| v.starts_with("http"))
    }, "href")
}, "href");
// data.print()
```

Call `data.print()` to view the value stored under the `href` key. The stored value is a `Value` enum, which covers most common data types.

Headers

`Headers` defines three kinds of request headers. `Headers::Default` sends only a user agent, `Headers::None` sends no headers at all, and `Headers::JSON` builds the headers from a JSON string. The following code builds a request header containing `user-agent` and `host`:

```rust
let headers = r#"{"user-agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36", "host": "www.qq.com"}"#;
let store = Cache::new();
let client = Requests::new(&store);
let mut p = client.connect("https://www.qq.com", Headers::JSON(headers));
```

If you need more request header fields, add the corresponding fields in headers.rs.

Parser

After connecting to a URL with the `connect` method, use the `parser` method to do the HTML parsing. The parser currently provides the `find` and `find_all` methods.

The parser closure must return a `Value`.

Using `find` or `find_all` looks like this:

```rust
rq.parser(|p| {
    p.find_all("a", |x| {
        x.attr("href").map_or(false, |v| v.starts_with("http"))
    }, "href")
}, "href")
```

The first parameter of `parser` is a closure, and the values it collects are automatically saved as a `Value::LIST`.
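The filter closure passed to `find_all` relies on `Option::map_or` so that an element without the attribute counts as a non-match instead of causing an error. The sketch below reproduces that pattern with only the standard library. `Node` and `collect_attr` are hypothetical stand-ins written for this illustration; they are not the crate's types, and the crate's own implementation may differ.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a parsed HTML element (not the crate's node type).
pub struct Node {
    name: String,
    attrs: HashMap<String, String>,
}

impl Node {
    pub fn new(name: &str, attrs: &[(&str, &str)]) -> Self {
        Node {
            name: name.to_string(),
            attrs: attrs
                .iter()
                .map(|(k, v)| (k.to_string(), v.to_string()))
                .collect(),
        }
    }

    /// Returns the attribute value, or `None` when the attribute is absent.
    pub fn attr(&self, key: &str) -> Option<&str> {
        self.attrs.get(key).map(|v| v.as_str())
    }
}

/// Collects `attr` from every element named `name` that passes `keep`.
pub fn collect_attr<F>(nodes: &[Node], name: &str, keep: F, attr: &str) -> Vec<String>
where
    F: Fn(&Node) -> bool,
{
    nodes
        .iter()
        .filter(|n| n.name == name && keep(*n))
        .filter_map(|n| n.attr(attr).map(str::to_string))
        .collect()
}
```

With this helper, `collect_attr(&nodes, "a", |n| n.attr("href").map_or(false, |v| v.starts_with("http")), "href")` keeps only absolute links and skips anchors that have no `href` at all.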

More often, you will need to handle the parsing manually and build the returned `Value` yourself. Use the `p.select` function for that, as in this example:

```rust
let data = Cache::new();
let client = Requests::new(&data);
let mut parser = client.connect("https://www.qq.com", Headers::Default);
parser.parser(|p| {
    let mut result = HashMap::new();

    let navs = p.select("li.nav-item", |nodes| {
        let navs = nodes.into_iter().map(|n| {
            let mut item = HashMap::new();
            n.find(Name("a")).next().map_or(HashMap::from([("".to_string(), Value::NULL)]), |a| {
                let nav_name = a.text();
                let nav_href = a.attr("href").map_or(String::from(""), |x| x.to_string());
                item.insert("nav_name".to_string(), Value::STR(nav_name));
                item.insert("nav_href".to_string(), Value::STR(nav_href));
                item
            })
        }).collect::<Vec<HashMap<String, Value>>>();

        Value::VECMAP(navs)
    });

    let news = p.select("ul.yw-list", |nodes| {
        let mut news = Vec::new();

        for node in nodes {
            for n in node.find(Class("news-top")) {
                for a in n.find(Name("a")) {
                    let title = a.text();
                    news.push(title);
                }
            }
        }

        Value::LIST(news)
    });

    result.insert("titles".to_owned(), news);
    result.insert("nav".to_owned(), navs);

    Value::MAP(result)
}, "index");

data.print();

```

Value

```rust
pub enum Value {
    /// String
    STR(String),
    /// List of strings
    LIST(Vec<String>),
    INT(i32),
    /// Empty value
    NULL,
    /// Boolean
    BOOL(bool),
    /// List of maps
    VECMAP(Vec<HashMap<String, Value>>),
    /// Map
    MAP(HashMap<String, Value>),
}
```

Add any other data types you need in Value.rs.
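Because stored results come back as a `Value`, reading them means matching on the variant. The sketch below redeclares the enum locally so that it compiles on its own (assuming `LIST` holds a `Vec<String>`), and adds a hypothetical helper `nav_names`, which is not part of the crate, to pull the `nav_name` strings out of a `Value::VECMAP` like the one built in the `select` example.

```rust
use std::collections::HashMap;

// Local copy of the enum so the sketch compiles without the crate.
// Assumption: `LIST` holds `Vec<String>`.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    STR(String),
    LIST(Vec<String>),
    INT(i32),
    NULL,
    BOOL(bool),
    VECMAP(Vec<HashMap<String, Value>>),
    MAP(HashMap<String, Value>),
}

/// Hypothetical helper: returns every `nav_name` string found in a `VECMAP`.
/// Any other variant, or an entry without a string `nav_name`, yields nothing.
pub fn nav_names(value: &Value) -> Vec<String> {
    match value {
        Value::VECMAP(items) => items
            .iter()
            .filter_map(|item| match item.get("nav_name") {
                Some(Value::STR(name)) => Some(name.clone()),
                _ => None,
            })
            .collect(),
        _ => Vec::new(),
    }
}
```

Returning an empty vector for unexpected variants keeps the caller free of panics; a stricter design could return a `Result` instead.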

Concurrency support

Concurrency has been tested with the rayon library. Here is a simple example:

```rust
let data = Cache::new();
let client = Requests::new(&data);
let urls = ["https://www.baidu.com", "https://www.qq.com", "https://www.163.com"];
let _ = urls.par_iter().map(|url| {
    let mut p = client.connect(url, Headers::Default);
    p.parser(|p| {
        p.find_all("a", |f| f.attr("href").map_or(false, |v| v.starts_with("http://")), "href")
    }, format!("{}_link", url).as_str());

    p.parser(|p| {
        p.find("title", |f| f.text() != "", "text")
    }, format!("{}_title", url).as_str());
})
.map(|_| String::from("")).collect::<String>();


match data.get("https://www.qq.com_title") {
    Value::STR(i) => assert_eq!(i, "腾讯首页"),
    _ => panic!("")
};

if let Value::STR(i) = data.get("https://www.163.com_title") {
    assert_eq!(i, "网易");
}

if let Value::STR(i) = data.get("https://www.baidu.com_title") {
    assert_eq!(i, "百度一下,你就知道");
}

```
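The example works because every parallel task writes its result into the shared cache under a key derived from its own URL, so tasks never overwrite each other. The sketch below shows the same keying scheme with only the standard library, using `std::thread::scope` and a `Mutex<HashMap>` in place of rayon and the crate's `Cache`. `SharedStore`, `fetch_title`, and `collect_titles` are hypothetical names, and `fetch_title` fakes the network call.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for the crate's `Cache`: a map guarded by a mutex.
#[derive(Default)]
pub struct SharedStore {
    inner: Mutex<HashMap<String, String>>,
}

impl SharedStore {
    pub fn insert(&self, key: String, value: String) {
        self.inner.lock().unwrap().insert(key, value);
    }

    pub fn get(&self, key: &str) -> Option<String> {
        self.inner.lock().unwrap().get(key).cloned()
    }
}

/// Placeholder for "connect and parse the <title>"; no network access here.
fn fetch_title(url: &str) -> String {
    format!("title of {}", url)
}

/// Runs one scoped thread per URL and stores each result under "{url}_title".
pub fn collect_titles(urls: &[&str], store: &SharedStore) {
    thread::scope(|s| {
        for url in urls {
            s.spawn(move || {
                store.insert(format!("{}_title", url), fetch_title(url));
            });
        }
    });
}
```

All threads are joined when `thread::scope` returns, so the store can be read immediately afterwards, much like `data.get(...)` is called after the parallel iterator finishes.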