| Crates.io | webpage_quality_analyzer |
| lib.rs | webpage_quality_analyzer |
| version | 1.0.2 |
| created_at | 2025-10-08 21:14:22.382507+00 |
| updated_at | 2025-10-14 08:29:30.563436+00 |
| description | High-performance webpage quality analyzer with 115 comprehensive metrics - Rust library with WASM, C++, and Python bindings |
| homepage | https://github.com/NotGyashu/webpage-quality-analyser |
| repository | https://github.com/NotGyashu/webpage-quality-analyser |
| max_upload_size | |
| id | 1874605 |
| size | 2,022,804 |
High-performance webpage quality analyzer with 115 comprehensive metrics. Analyze web pages for SEO, content quality, technical standards, accessibility, and more - all in milliseconds.
Add this to your Cargo.toml:
[dependencies]
webpage_quality_analyzer = "1.0"
Or use cargo:
cargo add webpage_quality_analyzer
use webpage_quality_analyzer::{analyze, analyze_with_profile};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Analyze with default settings
let report = analyze("https://example.com", None).await?;
println!("Score: {}/100", report.score);
println!("Quality: {}", report.verdict);
println!("Word Count: {}", report.metrics.content_metrics.word_count);
// Analyze with specific profile
let news_report = analyze_with_profile(
"https://example.com",
None,
"news"
).await?;
Ok(())
}
use webpage_quality_analyzer::Analyzer;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Build custom analyzer
let analyzer = Analyzer::builder()
.with_profile_name("blog")?
.with_metric_weight("word_count", 1.5)?
.disable_metric("grammar_score")?
.with_timeout_secs(30)?
.build()?;
let report = analyzer.run("https://example.com", None).await?;
println!("Custom analysis score: {}", report.score);
Ok(())
}
use webpage_quality_analyzer::{from_config_file, analyze_batch_high_performance};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Load from YAML/JSON/TOML config file
let analyzer = from_config_file("config.yaml")?;
let report = analyzer.run("https://example.com", None).await?;
// High-performance batch processing
let urls = vec![
"https://site1.com",
"https://site2.com",
"https://site3.com",
];
let reports = analyze_batch_high_performance(
&urls,
None, // HTML (None = fetch from URLs)
50, // Max concurrent requests
Some("news") // Profile name
).await?;
for report in reports {
println!("{}: {}/100", report.url, report.score);
}
Ok(())
}
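The on-disk schema accepted by `from_config_file` is not shown here. As a rough sketch, a `config.yaml` whose keys mirror the builder options might look like the following (all key names are assumptions, not the documented schema; check the crate docs for the real format):

```yaml
# Hypothetical config.yaml -- key names mirror the builder API
# and may differ from the actual schema.
profile: news
timeout_secs: 30
metric_weights:
  word_count: 1.5
disabled_metrics:
  - grammar_score
```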
let html = r#"
<!DOCTYPE html>
<html lang="en">
<head>
<title>Sample Page</title>
<meta name="description" content="A sample page">
</head>
<body>
<h1>Welcome</h1>
<p>This is a test page with some content.</p>
</body>
</html>
"#;
let report = analyze("https://example.com", Some(html.to_string())).await?;
println!("HTML analysis score: {}", report.score);
All 115 metrics (92 HTML-based + 23 network-based):
See: Complete metrics breakdown
Choose the right profile for your page type (8 built-in profiles):
| Profile | Best For | Content Weight | Key Focus |
|---|---|---|---|
| content_article | Long-form articles | 80% | Word count, structure, comprehensiveness |
| blog | Blog posts | 75% | Content quality, engagement, readability |
| news | News articles | 40% | Content freshness, readability, SEO (30%) |
| general | Any webpage | 35% | Balanced scoring across all categories |
| homepage | Landing pages | 25% | Navigation, structure, balanced (25% each) |
| product | Product pages | 20% | Media (35%), SEO (25%), product details |
| portfolio | Creative showcases | 15% | Media (50%), visual content |
| login_page | Authentication | 10% | Technical (50%), accessibility (20%), security |
Profile Customization: each profile's metric weights, thresholds, penalties, and bonuses can be tuned further through the builder API.
Control optional features via Cargo features:
[dependencies]
webpage_quality_analyzer = { version = "1.0", features = ["async", "linkcheck", "nlp"] }
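Because the `wasm` feature is mutually exclusive with the default `async` feature, WASM builds must opt out of default features. A minimal `Cargo.toml` entry for that case:

```toml
[dependencies]
webpage_quality_analyzer = { version = "1.0", default-features = false, features = ["wasm"] }
```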
Available features:
async (default) - Async runtime with tokio + reqwest
readability (default) - Mozilla Readability content extraction
linkcheck - External link validation
nlp - Language detection and Unicode segmentation
grammar - Grammar checking (via nlprule)
wasm - WebAssembly bindings (mutually exclusive with async)
ffi - C FFI for C++ integration
cli - Command-line tool binary
# Build for npm
wasm-pack build --target bundler --no-default-features --features wasm
# Use in JavaScript/TypeScript
npm install @webpage-quality-analyzer/core
import { WasmAnalyzer } from '@webpage-quality-analyzer/core';
const analyzer = new WasmAnalyzer();
const report = await analyzer.analyze('<html>...</html>');
console.log(`Score: ${report.score}/100`);
#include "webpage_quality_analyzer.hpp"
CAnalyzer* analyzer = wqa_analyzer_new();
CReport* report = wqa_analyze(analyzer, "https://example.com", nullptr);
double score = wqa_report_get_score(report);
# Download binary from releases
wqa analyze https://example.com
wqa batch urls.txt --parallel 10
wqa profiles # List available profiles
let analyzer = Analyzer::builder()
.with_profile_name("blog")?
.with_metric_weight("word_count", 1.5)? // Increase importance
.with_metric_weight("readability_score", 2.0)? // Double weight
.build()?;
let analyzer = Analyzer::builder()
.with_profile_name("blog")?
.set_metric_threshold(
"word_count",
100.0, // min
800.0, // optimal_min
2000.0, // optimal_max
5000.0 // max
)?
.build()?;
use webpage_quality_analyzer::{GlobalPenalty, PenaltyTrigger, PenaltyType};
let analyzer = Analyzer::builder()
.with_profile_name("news")?
.add_penalty(GlobalPenalty {
trigger: PenaltyTrigger::MetricBelow {
metric: "word_count".to_string(),
threshold: 500.0,
},
penalty: PenaltyType::FixedPoints { points: 10.0 },
description: "Content too short".to_string(),
})?
.add_bonus_above("readability_fk", 80.0, 5.0, "Highly readable")?
.build()?;
let analyzer = Analyzer::builder()
.with_profile_name("general")?
.disable_metric("grammar_score")?
.disable_metric("language_detection")?
.build()?;
// Full report (default)
let report = analyzer.run(url, html).await?;
// Compact JSON (20-30% size reduction)
let compact_json = analyzer.run_compact(url, html).await?;
// Minimal output (98.8% size reduction)
let minimal = analyzer.run_with_fields(
url,
html,
vec!["score", "verdict", "url"]
).await?;
// Advanced field selection
use webpage_quality_analyzer::FieldSelector;
let selector = FieldSelector::builder()
.include_sections(vec!["metrics"])
.exclude_section("processed_document")
.build();
let custom = analyzer.run_with_selector(url, html, &selector).await?;
Analysis Speed: concurrent batch processing with Arc<Semaphore> concurrency control.
Output Optimization:
// High-performance batch processing
use webpage_quality_analyzer::analyze_batch_high_performance;
let urls = vec![/* ... 100 URLs ... */];
let json_results = analyze_batch_high_performance(
&urls,
None, // Fetch HTML from URLs
50, // Max 50 concurrent requests
Some("news") // Profile
).await?;
cargo test # Run all tests
cargo test --features linkcheck # With network features
cargo bench # Run benchmarks
Dual licensed under MIT OR Apache-2.0. You can choose either license.
Contributions are welcome! Please feel free to submit issues, feature requests, or pull requests.
@webpage-quality-analyzer/core - JavaScript/TypeScript (WASM)
{
"score": 7.5,
"verdict": "Very Poor",
"url": "https://example.com",
"metrics": {
"content_metrics": {
"word_count": 10,
"paragraph_count": 1,
"avg_sentence_length": 7.5,
"readability_flesch_kincaid": 68.2
},
"technical_metrics": {
"title_length": 14,
"has_meta_description": true,
"html_size_bytes": 12320
},
"seo_metrics": {
"has_og_tags": true,
"has_schema_org": true,
"canonical_url_present": true
}
},
"phase3_scoring": {
"category_scores": {
"Content": 2.3,
"SEO": 68.5,
"Technical": 45.0
}
}
}
Made with ❤️ in Rust | Version 1.0.0 | October 2025