# Rust Web Crawler

Finds every page, image, and script on a website (and downloads it).

| Field | Value |
| --- | --- |
| crate | web-crawler |
| version | 0.1.3 |
| repository | https://github.com/Antosser/web-crawler |
| created_at | 2023-05-27 15:25:54 |
| updated_at | 2024-10-12 09:55:00 |
| size | 54,553 |
```
Usage: web-crawler [OPTIONS] <URL>

Arguments:
  <URL>

Options:
  -d, --download
          Download all files
  -c, --crawl-external
          Whether or not to crawl other websites it finds links to. Might result in downloading the entire internet
  -m, --max-url-length <MAX_URL_LENGTH>
          Maximum URL length allowed; pages whose URL reaches this limit are ignored [default: 300]
  -e, --exclude <EXCLUDE>
          Ignore paths that start with these strings (comma-separated)
      --export <EXPORT>
          Where to export found URLs
      --export-internal <EXPORT_INTERNAL>
          Where to export internal URLs
      --export-external <EXPORT_EXTERNAL>
          Where to export external URLs
  -t, --timeout <TIMEOUT>
          Timeout between requests in milliseconds [default: 100]
  -h, --help
          Print help
  -V, --version
          Print version
```
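For example, a run that downloads a site, skips its blog section, saves every discovered URL to a file, and waits 250 ms between requests (the URL and paths here are placeholders; the flags are the ones documented above):

```sh
web-crawler --download --exclude /blog --export urls.txt --timeout 250 https://example.com
```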
Build from source (the binary lands in `target/release`):

```sh
cargo build -r
```

or install directly from crates.io:

```sh
cargo install web-crawler
```
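Internally, a crawler like this can be pictured as a breadth-first walk over a queue of URLs. The sketch below illustrates that idea only; it is not the crate's actual source, and it assumes the `reqwest` (with the `blocking` feature), `scraper`, and `url` crates, with `https://example.com/` as a placeholder start page:

```rust
// Minimal breadth-first crawl sketch (not the crate's real code).
// Assumed dependencies: reqwest (blocking feature), scraper, url.
use std::collections::{HashSet, VecDeque};

use scraper::{Html, Selector};
use url::Url;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let start = Url::parse("https://example.com/")?; // placeholder start URL
    let mut seen: HashSet<Url> = HashSet::new();
    let mut queue: VecDeque<Url> = VecDeque::from([start.clone()]);

    while let Some(url) = queue.pop_front() {
        // Skip URLs that were already visited.
        if !seen.insert(url.clone()) {
            continue;
        }
        let body = match reqwest::blocking::get(url.clone()).and_then(|r| r.text()) {
            Ok(body) => body,
            Err(_) => continue, // ignore unreachable pages in this sketch
        };
        let doc = Html::parse_document(&body);
        // Pages, images, and scripts, mirroring what the crate reports.
        for (tag, attr) in [("a", "href"), ("img", "src"), ("script", "src")] {
            let selector = Selector::parse(tag).unwrap();
            for element in doc.select(&selector) {
                if let Some(link) = element.value().attr(attr) {
                    if let Ok(next) = url.join(link) {
                        // Stay on the starting host by default.
                        if next.host_str() == start.host_str() {
                            queue.push_back(next);
                        }
                    }
                }
            }
        }
        println!("{url}");
    }
    Ok(())
}
```

The host check in the sketch corresponds to the tool's default behaviour of staying on one site; per the help text above, the `-c, --crawl-external` flag is what lifts that restriction.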