| Crates.io | surly-spider |
| lib.rs | surly-spider |
| version | 1.0.2 |
| created_at | 2025-03-05 17:12:54.983925+00 |
| updated_at | 2025-03-05 17:14:57.716968+00 |
| description | A command line interface for crawling websites |
| homepage | |
| repository | https://github.com/thesurlydev/surly-spider/ |
| max_upload_size | |
| id | 1579251 |
| size | 64,215 |
A command line interface for crawling websites and storing their content.
USAGE:
ss [FLAGS] [OPTIONS] --domain <DOMAIN>
FLAGS:
-h, --help Prints help information
-r, --respect-robots Respect robots.txt and do not scrape disallowed files
-V, --version Prints version information
-v, --verbose Turn verbose logging on
OPTIONS:
-c, --concurrency <NUM> How many requests can be run simultaneously
-d, --domain <DOMAIN> Domain to crawl
-p, --polite-delay <DELAY_IN_MILLIS> Polite crawling delay in milliseconds
-m, --max-depth <DEPTH> Maximum crawl depth from the starting URL
-t, --timeout <SECONDS> Timeout for HTTP requests in seconds
-u, --user-agent <USER_AGENT> Custom User-Agent string for HTTP requests
-o, --output-dir <OUTPUT_DIR> Directory to store output (default: ./spider-output)
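As a rough illustration of how the options above combine, a crawl of a single domain with polite delays might be invoked like the following (the domain, concurrency, delay, and depth values here are placeholders chosen for the example, not defaults documented by the project):
ss --domain example.com --concurrency 4 --polite-delay 500 --max-depth 3 --respect-robots --output-dir ./spider-output --verbose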