surly-spider

Crates.io: surly-spider
lib.rs: surly-spider
version: 1.0.2
created_at: 2025-03-05 17:12:54.983925+00
updated_at: 2025-03-05 17:14:57.716968+00
description: A command line interface for crawling websites
homepage:
repository: https://github.com/thesurlydev/surly-spider/
max_upload_size:
id: 1579251
size: 64,215
Shane Witbeck (thesurlydev)

documentation

README

spider

A command line interface for crawling websites and storing their content.

usage

USAGE:
    ss [FLAGS] [OPTIONS] --domain <DOMAIN>

FLAGS:
    -h, --help              Prints help information
    -r, --respect-robots    Respect robots.txt and do not scrape disallowed files
    -V, --version           Prints version information
    -v, --verbose           Turn verbose logging on

OPTIONS:
    -c, --concurrency <NUM>                 How many requests can run simultaneously
    -d, --domain <DOMAIN>                   Domain to crawl
    -p, --polite-delay <DELAY_IN_MILLIS>    Polite crawling delay in milliseconds
    -m, --max-depth <DEPTH>                 Maximum crawl depth from the starting URL
    -t, --timeout <SECONDS>                 Timeout for HTTP requests in seconds
    -u, --user-agent <USER_AGENT>           Custom User-Agent string for HTTP requests
    -o, --output-dir <OUTPUT_DIR>           Directory to store output (default: ./spider-output)
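
To illustrate how the flags above combine, here is a minimal Rust sketch that assembles an `ss` invocation with `std::process::Command` without running it. The `CrawlOptions` struct, its field names, and the helper functions are hypothetical and not part of the crate; only the binary name `ss` and the long flag names come from the usage text. The value `example.com` is a placeholder, and whether `--domain` expects a bare host or a full URL is an assumption to check against the tool itself.

```rust
use std::process::Command;

/// Options mirroring a subset of the flags in the usage text.
/// This struct is illustrative only; it is not part of surly-spider.
struct CrawlOptions<'a> {
    domain: &'a str,
    max_depth: Option<u32>,
    concurrency: Option<u32>,
    polite_delay_ms: Option<u64>,
    respect_robots: bool,
    output_dir: Option<&'a str>,
}

/// Build (but do not spawn) an `ss` invocation from the given options.
fn build_crawl_command(opts: &CrawlOptions<'_>) -> Command {
    let mut cmd = Command::new("ss");
    // --domain is the only option the usage line marks as required.
    cmd.arg("--domain").arg(opts.domain);
    if let Some(depth) = opts.max_depth {
        cmd.arg("--max-depth").arg(depth.to_string());
    }
    if let Some(n) = opts.concurrency {
        cmd.arg("--concurrency").arg(n.to_string());
    }
    if let Some(ms) = opts.polite_delay_ms {
        cmd.arg("--polite-delay").arg(ms.to_string());
    }
    if opts.respect_robots {
        cmd.arg("--respect-robots");
    }
    if let Some(dir) = opts.output_dir {
        cmd.arg("--output-dir").arg(dir);
    }
    cmd
}

/// Render the command's arguments as strings, e.g. for logging.
fn args_as_strings(cmd: &Command) -> Vec<String> {
    cmd.get_args()
        .map(|a| a.to_string_lossy().into_owned())
        .collect()
}
```

Calling `build_crawl_command` and then `.status()` on the result would launch the crawl, provided the `ss` found first on `PATH` is this crate's binary. Keeping construction separate from execution makes it easy to log or inspect the arguments first.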
Commit count: 7
