Maman

Maman is a Rust Web Crawler saving pages on Redis.

Pages are send to list <MAMAN_ENV>:queue:maman using Sidekiq job format

json { "class": "Maman", "jid": "b4a577edbccf1d805744efa9", "retry": true, "created_at": 1461789979, "enqueued_at": 1461789979, "args": { "document":"<html><body><a href='#' /><a href='/new' /></html>", "urls": ["https://example.net/new"], "headers": {"content-type": "text/html"}, "url": "https://example.net/" } }

Dependencies

Installation

With cargo

~~~ cargo install maman ~~~

With make

~~~ PREFIX=~/.local make install ~~~

Usage

~~~ maman URL [LIMIT] [MIME_TYPES] ~~~

LIMIT must be an integer or 0 is the default, meaning no limit.

Environment variables

Defaults

Others

LICENSE

The MIT License

Copyright (c) 2016-2018 Laurent Arnoud laurent@spkdev.net


Build Version Documentation License Project status Dependency status