| Crates.io | webcrawler |
| lib.rs | webcrawler |
| version | 0.1.2 |
| created_at | 2025-01-25 04:18:32.531548+00 |
| updated_at | 2025-01-29 14:10:04.086766+00 |
| description | A simple Rust web crawler |
| homepage | |
| repository | https://github.com/Mintype/webcrawler |
| max_upload_size | |
| id | 1530220 |
| size | 49,864 |
This Rust library is a simple web crawler that checks the validity of routes on a given website. It reads a list of routes from a file, constructs full URLs by appending the routes to a base URL, and sends HTTP GET requests to check whether the routes are valid.
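For illustration, a routes file could contain one route per line (the exact file format is an assumption here, and the paths below are placeholders, not routes shipped with the crate):

/
/about
/contact
/blog/post-1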
Adding webcrawler to Your Cargo.toml
If you're using this library in your own Rust project, add the following to your Cargo.toml under [dependencies]:
[dependencies]
webcrawler = { version = "0.1.2" }
reqwest = { version = "0.11", features = ["blocking"] }
futures-io = "0.3.30"
url = "2.2"
In your main.rs or any other Rust file, import the web crawler library:
use webcrawler::WebCrawler;
Here is an example of how to use the WebCrawler to check the validity of routes:
use webcrawler::WebCrawler;

fn main() {
    let base_url = "http://example.com"; // Replace with your base URL
    let file_path = "src/routes.txt"; // Replace with the path to your routes file

    let mut crawler = WebCrawler::new();

    match crawler.check_valid_routes(base_url, file_path) {
        Ok(valid_routes) => {
            println!("Number of valid routes: {}", valid_routes);
        }
        Err(e) => {
            println!("Error: {}", e);
        }
    }
}
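As a rough, non-authoritative sketch of what a check like this could do internally (this is not the crate's actual code; it assumes one route per line in the file and uses the reqwest blocking client and url crate from the dependency list above; the helper name count_valid_routes is hypothetical):

use std::fs;
use url::Url;

// Hypothetical sketch: count how many routes from `file_path` respond with a
// success status when joined onto `base_url`. Not the crate's implementation.
fn count_valid_routes(base_url: &str, file_path: &str) -> Result<usize, Box<dyn std::error::Error>> {
    let base = Url::parse(base_url)?;
    let routes = fs::read_to_string(file_path)?;
    let client = reqwest::blocking::Client::new();

    let mut valid = 0;
    for route in routes.lines().map(str::trim).filter(|r| !r.is_empty()) {
        // Join the route onto the base URL, e.g. "http://example.com" + "/about".
        let url = base.join(route)?;
        // Treat a route as valid if the GET request completes with a 2xx status.
        if let Ok(resp) = client.get(url).send() {
            if resp.status().is_success() {
                valid += 1;
            }
        }
    }
    Ok(valid)
}

Counting only 2xx responses is one possible definition of "valid"; the crate may treat redirects or other statuses differently.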
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! If you have any improvements, bug fixes, or feature suggestions, feel free to open an issue or submit a pull request.
You can find this crate and the latest version on crates.io.