| Crates.io | readability-js-cli |
| lib.rs | readability-js-cli |
| version | 0.1.5 |
| created_at | 2025-09-24 22:25:31.215605+00 |
| updated_at | 2025-10-03 13:07:44.914308+00 |
| description | Command-line interface for readability-js |
| homepage | https://github.com/egemengol/readability-js |
| repository | https://github.com/egemengol/readability-js |
| size | 80,861 |
Extract clean, readable content from web pages using Mozilla's Readability.js algorithm.
This crate provides both a Rust library and CLI tool for extracting main content from HTML documents, removing navigation, ads, and other clutter. It uses the same algorithm that powers Firefox Reader Mode.
Why readability-js?
This crate uses Mozilla's actual Readability.js implementation, the same code that powers Firefox Reader Mode. Creating a Readability instance takes ~30ms and processing a document takes ~10ms, which is fast enough for most applications and a small price for the accuracy benefits.
For more background, check out the blog post.
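To make the timing figures above concrete, here is a rough back-of-the-envelope sketch of why reusing a single Readability instance across documents pays off. It does not call the crate at all. The constants are the approximate ~30ms and ~10ms figures quoted above, not measurements, and the function names are hypothetical, introduced only for this illustration.

```rust
/// Approximate one-time cost of creating a `Readability` instance, in milliseconds.
/// Assumption: the ~30ms figure quoted above, not a measurement made here.
const SETUP_MS: u64 = 30;

/// Approximate cost of processing one document, in milliseconds.
/// Assumption: the ~10ms figure quoted above.
const PARSE_MS: u64 = 10;

/// Estimated total time when one instance is created once and reused
/// for every document (hypothetical helper for illustration).
fn total_ms_reused(documents: u64) -> u64 {
    if documents == 0 {
        // No documents means no instance needs to be created.
        0
    } else {
        SETUP_MS + PARSE_MS * documents
    }
}

/// Estimated total time when a fresh instance is created for each
/// document (hypothetical helper for illustration).
fn total_ms_fresh_each_time(documents: u64) -> u64 {
    (SETUP_MS + PARSE_MS) * documents
}
```

Under these assumed figures, 100 documents come to roughly 1,030 ms with a reused instance versus 4,000 ms when a new instance is created each time. For a single document the two strategies cost the same.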
cargo install readability-js-cli
Add to your Cargo.toml:
[dependencies]
readability-js = "0.1"
# Extract from URL
readable https://example.com/article > article.md
# Process local file
readable article.html > clean.md
# Use in pipelines
curl -s https://news.site/story | readable | less
use readability_js::Readability;
// Create parser (reuse for multiple documents)
let reader = Readability::new()?;
// Extract content
let html = std::fs::read_to_string("article.html")?;
let article = reader.parse_with_url(&html, "https://example.com")?;
println!("Title: {}", article.title);
println!("Author: {}", article.byline.unwrap_or_default());
println!("Content: {}", article.content);
This crate embeds Mozilla's Readability.js library using the QuickJS JavaScript engine. The JavaScript bundle combines:
The extraction process:
This approach ensures we use the exact same algorithm as Firefox while maintaining excellent performance through the lightweight QuickJS engine.
Licensed under the Universal Permissive License v1.0 (UPL-1.0)
This crate bundles the following JavaScript libraries:
Both dependencies use permissive licenses that fully allow bundling unmodified source code. No modifications were made to either library, ensuring full license compliance.