| Crates.io | crawn |
| lib.rs | crawn |
| version | 0.1.0 |
| created_at | 2026-01-25 20:41:53.123228+00 |
| updated_at | 2026-01-25 20:41:53.123228+00 |
| description | A utility for web crawling and scraping |
| homepage | |
| repository | https://github.com/Tahaa-Dev/crawn |
| max_upload_size | |
| id | 2069498 |
| size | 89,011 |
Fast async web crawler with smart keyword filtering
Install from crates.io (requires cargo):
cargo install crawn
Or build from source:
git clone https://github.com/Tahaa-Dev/crawn.git
cd crawn
cargo build --release
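Basic crawl, writing results to output.ndjson:
crawn -o output.ndjson https://example.com
Write logs to a file (-l):
crawn -o output.ndjson -l crawler.log https://example.com
Verbose mode (-v):
crawn -o output.ndjson -v https://example.com
Set the maximum crawl depth (-m):
crawn -o output.ndjson -m 3 https://example.com
Include the full HTML of each page:
crawn -o output.ndjson --include-content https://example.com
Include the extracted text of each page:
crawn -o output.ndjson --include-text https://example.com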
Results are written as NDJSON (newline-delimited JSON):
{"url":"https://example.com","title":"Example Domain","depth":0}
{"url":"https://example.com/about","title":"About Us","depth":1}
{"url":"https://example.com/contact","title":"Contact","depth":1}
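With --include-text, each record also carries a "text" field:
{"url":"https://example.com","title":"Example Domain","depth":0,"text":"Example Domain\nThis domain is..."}
With --include-content, each record also carries a "content" field holding the page HTML:
{"url":"https://example.com","title":"Example Domain","depth":0,"content":"<!DOCTYPE html>\n<html>..."}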
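Because each line is a standalone JSON object, the output can be processed one record at a time. The following is a minimal sketch in Python (standard library only) that groups crawled URLs by depth. The field names `url` and `depth` are taken from the sample records above; the function name `group_urls_by_depth` is hypothetical and not part of crawn.

```python
import json
from collections import defaultdict


def group_urls_by_depth(lines):
    """Group URLs by their "depth" field, skipping blank lines.

    `lines` is any iterable of NDJSON lines (a list or an open file).
    """
    by_depth = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)  # one JSON object per line
        by_depth[record["depth"]].append(record["url"])
    return dict(by_depth)


# Sample records copied from the output shown above.
sample = [
    '{"url":"https://example.com","title":"Example Domain","depth":0}',
    '{"url":"https://example.com/about","title":"About Us","depth":1}',
    '{"url":"https://example.com/contact","title":"Contact","depth":1}',
]

for depth, urls in sorted(group_urls_by_depth(sample).items()):
    print(depth, urls)
```

To read a real results file, pass the open file object instead of the list, for example `with open("output.ndjson", encoding="utf-8") as f: group_urls_by_depth(f)`.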
Example log output:
2026-01-24 02:37:40.351 [INFO]:
Sent request to URL: https://example.com
2026-01-24 02:37:41.123 [WARN]:
Failed to fetch URL: https://example.com/broken-link
Caused by: HTTP 404 Not Found
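To filter a log for warnings, the entries can be split on their header lines. The sketch below assumes the log format is exactly as in the sample above: a header line with a timestamp and a bracketed level, followed by one or more message lines. The actual format may differ between versions, and the name `parse_log` is hypothetical.

```python
import re

# Assumed header shape, based on the sample: "2026-01-24 02:37:40.351 [INFO]:"
HEADER = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \[(\w+)\]:\s*$"
)


def parse_log(lines):
    """Split log lines into entries of time, level, and message lines."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        match = HEADER.match(line)
        if match:
            # A header line starts a new entry.
            entries.append(
                {"time": match.group(1), "level": match.group(2), "message": []}
            )
        elif entries and line.strip():
            # Any other non-blank line belongs to the most recent entry.
            entries[-1]["message"].append(line.strip())
    return entries


# Sample lines copied from the log output shown above.
sample_log = [
    "2026-01-24 02:37:40.351 [INFO]:",
    "Sent request to URL: https://example.com",
    "2026-01-24 02:37:41.123 [WARN]:",
    "Failed to fetch URL: https://example.com/broken-link",
    "Caused by: HTTP 404 Not Found",
]

warnings = [e for e in parse_log(sample_log) if e["level"] == "WARN"]
for entry in warnings:
    print(entry["time"], " | ".join(entry["message"]))
```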
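Crawl the Rust book:
crawn -o rust-docs.ndjson https://doc.rust-lang.org/book/
Log to a file with verbose mode enabled:
crawn -o output.ndjson -l crawler.log -v https://example.com
Shallow crawl with a maximum depth of 2:
crawn -o shallow.ndjson -m 2 https://example.com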