cvmfs_server_scraper

A scraper for CVMFS servers.

Repository: https://github.com/terjekv/cvmfs-server-scraper-rust
This library scrapes the public metadata sources from a CVMFS server and validates the data. At the server level this includes cvmfs/info/v1/repositories.json (see the backend notes below), and for each repository it fetches the repository's published metadata files as well.
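As a small illustration of where the server-level metadata lives, the URL for a server's repositories.json can be derived from its hostname. This is a sketch only; the scheme (`https`) and the helper name are assumptions, not the crate's internal logic:

```rust
// Sketch: build the URL for a server's public repository metadata.
// The path `cvmfs/info/v1/repositories.json` is the file discussed in
// the backend section below; the `https` scheme is an assumption.
fn repositories_json_url(hostname: &str) -> String {
    format!("https://{}/cvmfs/info/v1/repositories.json", hostname)
}

fn main() {
    let url = repositories_json_url("azure-us-east-s1.eessi.science");
    println!("{}", url);
}
```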
use cvmfs_server_scraper::{Hostname, Server, ServerBackendType, ServerType,
    ScrapedServer, ScraperCommon, Scraper, CVMFSScraperError, DEFAULT_GEOAPI_SERVERS};

#[tokio::main]
async fn main() -> Result<(), CVMFSScraperError> {
    let servers = vec![
        Server::new(
            ServerType::Stratum1,
            ServerBackendType::CVMFS,
            Hostname::try_from("azure-us-east-s1.eessi.science")?,
        ),
        Server::new(
            ServerType::Stratum1,
            ServerBackendType::AutoDetect,
            Hostname::try_from("aws-eu-central-s1.eessi.science")?,
        ),
        Server::new(
            ServerType::SyncServer,
            ServerBackendType::S3,
            Hostname::try_from("aws-eu-west-s1-sync.eessi.science")?,
        ),
    ];

    let repolist = vec!["software.eessi.io", "dev.eessi.io", "riscv.eessi.io"];
    let ignored_repos = vec!["nope.eessi.io"];

    // Build a Scraper and scrape all servers in parallel.
    let scraped_servers = Scraper::new()
        .forced_repositories(repolist)
        .ignored_repositories(ignored_repos)
        .geoapi_servers(DEFAULT_GEOAPI_SERVERS.clone())? // This is the default list.
        .with_servers(servers) // Transitions to a WithServer state.
        .validate()? // Transitions to a ValidatedAndReady state, now immutable.
        .scrape().await; // Perform the scrape and return the servers.

    for server in scraped_servers {
        match server {
            ScrapedServer::Populated(populated_server) => {
                println!("{}", populated_server);
                populated_server.output();
                println!();
            }
            ScrapedServer::Failed(failed_server) => {
                panic!(
                    "Error! {} failed scraping: {:?}",
                    failed_server.hostname, failed_server.error
                );
            }
        }
    }
    Ok(())
}
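The example above panics on the first failed server. If you would rather report all failures after the loop, the results can be split into successes and failures first. A minimal stdlib-only sketch of that pattern, using plain `Result` values to stand in for the crate's `ScrapedServer::Populated`/`Failed` variants:

```rust
// Sketch: separate successful and failed scrape results so that one
// bad server does not abort reporting for the rest. Plain `Result`
// stands in for the crate's ScrapedServer variants here.
fn partition_results(results: Vec<Result<String, String>>) -> (Vec<String>, Vec<String>) {
    let mut ok = Vec::new();
    let mut failed = Vec::new();
    for r in results {
        match r {
            Ok(host) => ok.push(host),
            Err(e) => failed.push(e),
        }
    }
    (ok, failed)
}

fn main() {
    let results = vec![
        Ok("s1.example.org".to_string()),
        Err("s2.example.org: timeout".to_string()),
    ];
    let (ok, failed) = partition_results(results);
    println!("scraped: {:?}, failed: {:?}", ok, failed);
}
```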
There are three valid options for backends for a given server. These are:

CVMFS
: This backend requires cvmfs/info/v1/repositories.json to be present on the server. The scrape fails if it is missing.

S3
: Does not attempt to fetch cvmfs/info/v1/repositories.json. Note that if any server uses the S3 backend, a list of repositories must be passed to the scraper, as there is no other way to determine the list of repositories for S3 servers. Due to the async scraping of all servers, there is currently no support for falling back on repositories detected from other server types (including, possibly, the Stratum0).

AutoDetect
: Attempts to fetch cvmfs/info/v1/repositories.json but does not fail if it is missing. If the scraper fails to fetch the file, the backend is assumed to be S3. If the list of repositories is empty, the scraper returns an empty list. If your S3 server has no repositories, setting the backend to AutoDetect allows the scraper to continue without failing.

For populated servers, the field backend_detected will be set to the detected backend, which for explicit S3 or CVMFS servers will be the same as the requested type.
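The AutoDetect fallback rule described above can be summarized as: a reachable repositories.json means a CVMFS backend, a missing one means S3. The sketch below models only that documented rule; the enum and function names are illustrative, not the crate's internals:

```rust
// Sketch of the documented AutoDetect rule: if
// cvmfs/info/v1/repositories.json cannot be fetched, the backend is
// assumed to be S3 instead of failing the scrape.
#[derive(Debug, PartialEq)]
enum DetectedBackend {
    Cvmfs,
    S3,
}

fn detect_backend(repositories_json_found: bool) -> DetectedBackend {
    if repositories_json_found {
        DetectedBackend::Cvmfs
    } else {
        DetectedBackend::S3
    }
}

fn main() {
    // A server that serves repositories.json is treated as CVMFS...
    assert_eq!(detect_backend(true), DetectedBackend::Cvmfs);
    // ...while a missing file falls back to S3 rather than failing.
    assert_eq!(detect_backend(false), DetectedBackend::S3);
    println!("ok");
}
```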
Licensed under the MIT license. See the LICENSE file for details.