Crawls websites to gather all possible URLs.
Make sure to have Rust installed.
Make sure to create a .env file and add CRAWL_URL=http://0.0.0.0:8080/api/website-crawl.
Replace CRAWL_URL with your production endpoint that accepts the results; a valid endpoint to accept the hook is required for the crawler to work. An example .env is shown below.
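A minimal example .env, assuming the local endpoint above (swap in your own results endpoint for production):

# .env
# endpoint that receives the crawl results hook
CRAWL_URL=http://0.0.0.0:8080/api/website-crawl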
curl --proto '=https' --tlsv1.2 https://sh.rustup.rs -sSf | sh
cargo run
You can start the service with Docker by running docker build -t crawler . && docker run -dp 8000:8000 crawler (see the example below for passing the .env file to the container).
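A sketch of building and running the container with the .env file passed through, assuming the file sits in the project root (--env-file is a standard docker run option):

docker build -t crawler .
# --env-file makes CRAWL_URL available inside the container
docker run --env-file .env -dp 8000:8000 crawler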
Or use the Docker image jeffmendez19/crawler, as in the example below.
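A minimal sketch of pulling and running the published image, assuming the same port mapping and .env file as the local build:

docker pull jeffmendez19/crawler
docker run --env-file .env -dp 8000:8000 jeffmendez19/crawler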
You can also install the program as a crate (see the sketch below).
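A sketch of installing with cargo; the crate name below is a placeholder assumption, so check the project's crates.io listing for the actual name:

# "crawler" is a hypothetical crate name used for illustration
cargo install crawler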
crawl - asynchronously determine all URLs in a website and send the results to the post hook
POST http://localhost:8000/crawl
Body: { "url": "https://www.a11ywatch.com", "id": 0 }
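For example, assuming the service is running locally on port 8000, the request could be sent with curl:

curl -X POST http://localhost:8000/crawl \
  -H 'Content-Type: application/json' \
  -d '{ "url": "https://www.a11ywatch.com", "id": 0 }'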
CARGO_RELEASE=false # determine if prod/dev build
ROCKET_ENV=dev # determine api env
CRAWL_URL="http://api:8080/api/website-crawl-background" # endpoint to send results
Check the license file in the root of the project.