Crates.io | goodreads-metadata-scraper |
lib.rs | goodreads-metadata-scraper |
version | 0.2.0 |
source | src |
created_at | 2024-11-15 08:09:57.828683 |
updated_at | 2024-11-17 10:53:48.710096 |
description | A simple library to scrape book metadata from Goodreads |
homepage | https://github.com/tiago-cos/goodreads-metadata-scraper |
repository | https://github.com/tiago-cos/goodreads-metadata-scraper |
id | 1448870 |
size | 69,632 |
An async Rust library to fetch and scrape book metadata from Goodreads by using an ISBN, Goodreads book ID, or a combination of title and author. This library is useful for applications that need book data from Goodreads without access to an official API.
Add this crate to your Cargo.toml:
[dependencies]
goodreads-metadata-scraper = "0.2.0"
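Here are a few examples of how to use the library. The primary entry point is MetadataRequestBuilder, which lets you specify search criteria before calling execute to fetch metadata. The package is published as goodreads-metadata-scraper but is imported in code as grscraper. The snippets use .await and ?, so they must run inside an async function that returns a Result.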
use grscraper::MetadataRequestBuilder;

// Look up a book by its ISBN; expect() panics if no book is found.
let isbn = "9780141381473";
let metadata = MetadataRequestBuilder::default()
    .with_isbn(isbn)
    .execute()
    .await?
    .expect("Book not found");
assert_eq!(metadata.title, "The Lightning Thief");
println!("{:?}", metadata);
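Because every lookup costs a request to Goodreads, it can be worth rejecting malformed ISBNs locally before calling with_isbn. The following is a minimal sketch of the standard ISBN-13 check-digit rule. The helper name is_valid_isbn13 is hypothetical and not part of this crate, and the sketch does not cover ISBN-10.

```rust
/// Hypothetical helper (not part of grscraper): returns true if `isbn` is a
/// well-formed ISBN-13, i.e. 13 digits (hyphens and spaces ignored) whose
/// weighted digit sum is a multiple of 10.
fn is_valid_isbn13(isbn: &str) -> bool {
    // Drop the separators that are commonly printed inside ISBNs.
    let chars: Vec<char> = isbn.chars().filter(|c| *c != '-' && *c != ' ').collect();
    if chars.len() != 13 {
        return false;
    }
    let mut sum = 0;
    for (i, c) in chars.iter().enumerate() {
        // Any non-digit character makes the ISBN invalid.
        let digit = match c.to_digit(10) {
            Some(d) => d,
            None => return false,
        };
        // Digits alternate between weight 1 and weight 3, starting with 1.
        sum += if i % 2 == 0 { digit } else { digit * 3 };
    }
    sum % 10 == 0
}
```

The ISBN from the example above, "9780141381473", passes this check, while changing its last digit makes it fail.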
use grscraper::MetadataRequestBuilder;

// Look up a book by its Goodreads book ID.
let goodreads_id = "175254";
let metadata = MetadataRequestBuilder::default()
    .with_id(goodreads_id)
    .execute()
    .await?
    .expect("Book not found");
assert_eq!(metadata.title, "Pride and Prejudice");
println!("{:?}", metadata);
For better accuracy, you can also specify an author along with the title:
use grscraper::MetadataRequestBuilder;

// Search by title, narrowed down by author.
let title = "The Last Magician";
let author = "Lisa Maxwell";
let metadata = MetadataRequestBuilder::default()
    .with_title(title)
    .with_author(author)
    .execute()
    .await?
    .expect("Book not found");
assert_eq!(metadata.title, title);
println!("{:?}", metadata);
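The examples above call expect, which panics when no book is found. Judging from the .await? followed by .expect("Book not found"), the lookup appears to yield a Result wrapping an Option; that shape is an inference from the snippets, not a confirmed signature. Under that assumption, a non-panicking caller might handle the three outcomes as in this sketch. BookMetadata here is a stand-in with only the title field seen above, and summarize is a hypothetical helper.

```rust
/// Stand-in for the crate's metadata type; only the `title` field used in
/// the examples above is modelled here.
#[derive(Debug)]
struct BookMetadata {
    title: String,
}

/// Hypothetical helper: turns a lookup outcome into a message without
/// panicking. `E` stands in for the crate's error type.
fn summarize<E: std::fmt::Display>(outcome: Result<Option<BookMetadata>, E>) -> String {
    match outcome {
        // The request succeeded and a book matched.
        Ok(Some(book)) => format!("found: {}", book.title),
        // The request succeeded but nothing matched the criteria.
        Ok(None) => String::from("no matching book"),
        // The request or the parsing failed.
        Err(err) => format!("lookup failed: {}", err),
    }
}
```

Keeping "not found" separate from "failed" lets an application show a friendly message for unknown books while still logging genuine errors.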
This crate uses a custom error type, ScraperError, which covers errors that may occur while fetching and parsing metadata. ScraperError includes:
FetchError: Errors during HTTP requests (from reqwest)
ParseError: HTML parsing errors (from scraper)
SerializeError: JSON serialization errors (from serde_json)
Note: When running tests, it is highly recommended to pass the --test-threads=1 flag (cargo test -- --test-threads=1) so that tests run one at a time and avoid rate-limiting issues with Goodreads.
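To make the three error categories of ScraperError more concrete, here is a rough sketch of how a caller might tell them apart. SketchError is a hypothetical mirror of those categories with plain strings in place of the reqwest, scraper and serde_json payloads; it is not the crate's actual definition. The retry policy in is_retryable is likewise an assumption for illustration.

```rust
use std::fmt;

/// Hypothetical mirror of the three error categories described above.
/// Plain strings stand in for the underlying library errors.
#[derive(Debug)]
enum SketchError {
    Fetch(String),
    Parse(String),
    Serialize(String),
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::Fetch(msg) => write!(f, "HTTP request failed: {}", msg),
            SketchError::Parse(msg) => write!(f, "HTML parsing failed: {}", msg),
            SketchError::Serialize(msg) => write!(f, "JSON serialization failed: {}", msg),
        }
    }
}

impl std::error::Error for SketchError {}

/// Assumed policy: only request-level failures are treated as worth
/// retrying, since parse and serialization failures are likely to repeat.
fn is_retryable(err: &SketchError) -> bool {
    matches!(err, SketchError::Fetch(_))
}
```

A caller could use a check like is_retryable to back off and retry after a failed request, which fits the rate-limiting concern mentioned in the note above.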
This project is licensed under the GNU General Public License (GPL). See the LICENSE file for more details.
This library provides an accessible alternative for retrieving Goodreads book metadata, enabling developers to integrate Goodreads data without an official API. Feel free to contribute by submitting issues or pull requests!