web-crawler

Crate: web-crawler (crates.io)
Version: 0.1.3
Description: Finds every page, image, and script on a website (and downloads it)
Repository: https://github.com/Antosser/web-crawler
Documentation: https://github.com/Antosser/web-crawler#readme
Author: Anton Aparin (Antosser)
Created: 2023-05-27
Updated: 2024-10-12
Size: 54,553 bytes

README

Web Crawler

Finds every page, image, and script on a website (and downloads it)
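At its core, a crawler like this is a breadth-first traversal over discovered links with a visited set. The following is a minimal sketch of that idea, not the crate's actual code: a `HashMap` stands in for HTTP fetches, and `links_of`, `crawl`, and the URL-length cutoff (mirroring `--max-url-length`) are illustrative names.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Hypothetical stand-in for fetching a page and extracting its links.
fn links_of(site: &HashMap<&str, Vec<&str>>, url: &str) -> Vec<String> {
    site.get(url)
        .map(|v| v.iter().map(|s| s.to_string()).collect())
        .unwrap_or_default()
}

// Breadth-first crawl: visit each URL once, skipping URLs at or past the length limit.
fn crawl(site: &HashMap<&str, Vec<&str>>, start: &str, max_url_len: usize) -> Vec<String> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut queue: VecDeque<String> = VecDeque::new();
    seen.insert(start.to_string());
    queue.push_back(start.to_string());

    let mut order = Vec::new();
    while let Some(url) = queue.pop_front() {
        order.push(url.clone());
        for link in links_of(site, &url) {
            // Analogue of --max-url-length: ignore URLs that reach the limit.
            if link.len() >= max_url_len {
                continue;
            }
            if seen.insert(link.clone()) {
                queue.push_back(link);
            }
        }
    }
    order
}

fn main() {
    let mut site = HashMap::new();
    site.insert("https://example.com/", vec!["https://example.com/a", "https://example.com/b"]);
    site.insert("https://example.com/a", vec!["https://example.com/"]);
    let pages = crawl(&site, "https://example.com/", 300);
    assert_eq!(pages.len(), 3); // start page plus /a and /b, each visited once
    println!("{:?}", pages);
}
```

The real crate additionally downloads files, rate-limits requests, and distinguishes internal from external links, but the visited-set traversal above is the shape of the problem.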

Usage

Rust Web Crawler

Usage: web-crawler [OPTIONS] <URL>

Arguments:
  <URL>

Options:
  -d, --download
          Download all files
  -c, --crawl-external
          Whether to crawl other websites it finds links to. Might result in downloading the entire internet
  -m, --max-url-length <MAX_URL_LENGTH>
          Maximum URL length allowed. Pages whose URL reaches this limit are ignored [default: 300]
  -e, --exclude <EXCLUDE>
          Will ignore paths that start with these strings (comma-separated)
      --export <EXPORT>
          Where to export found URLs
      --export-internal <EXPORT_INTERNAL>
          Where to export internal URLs
      --export-external <EXPORT_EXTERNAL>
          Where to export external URLs
  -t, --timeout <TIMEOUT>
          Timeout between requests in milliseconds [default: 100]
  -h, --help
          Print help
  -V, --version
          Print version
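Putting a few of these options together, a typical invocation might look like this (the URL, excluded paths, and output file name are placeholders, not defaults):

```shell
web-crawler --download --exclude /admin,/login --export-internal internal.txt --timeout 250 https://example.com
```

This crawls example.com, skips any path starting with /admin or /login, waits 250 ms between requests, downloads what it finds, and writes the internal URLs to internal.txt.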

How to compile it yourself

  1. Install Rust
  2. Run cargo build -r
  3. The executable will be in target/release

or

  1. Install Rust
  2. Install it with cargo install web-crawler