| Crates.io | html-filter |
| lib.rs | html-filter |
| version | 0.2.1 |
| created_at | 2025-03-13 22:12:40.798683+00 |
| updated_at | 2025-11-08 10:33:44.741259+00 |
| description | Crate to parse, filter, search and edit an HTML file. |
| homepage | |
| repository | https://github.com/t-webber/html-filter |
| max_upload_size | |
| id | 1591531 |
| size | 122,637 |
This is a Rust library that parses HTML source files and lets you search and filter the parsed HTML with a specific set of rules.
It is a simple, lightweight HTML parser that converts an HTML file (given as a &str) into a tree representing the HTML tags and text.
You can install it with
cargo add html_filter
then use it like this:
use html_filter::*;
let html: &str = r#"
<!DOCTYPE html>
<html lang="en">
<head>
<title>Html sample</title>
</head>
<body>
<p>This is an html sample.</p>
</body>
</html>
"#;
// Parse your HTML
let tree: Html = Html::parse(html).expect("Invalid HTML");
// Now you can use it!
// Beware, this doesn't always hold, as there are
// different ways to write the same HTML.
assert!(format!("{tree}") == html);
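Since the round-trip assertion above can fail when the same HTML is written differently, a looser comparison may be more robust than strict string equality. The sketch below assumes the differences are limited to whitespace between tokens, which the crate does not guarantee; `normalize_ws` is a hypothetical helper written for this illustration and is not part of html_filter.

```rust
/// Hypothetical helper (not part of `html_filter`): collapses every run of
/// whitespace into a single space and trims both ends.
fn normalize_ws(input: &str) -> String {
    input.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Compares two HTML strings while ignoring differences in the amount of
/// whitespace. Assumption: whitespace is the only thing that may differ.
fn same_modulo_whitespace(left: &str, right: &str) -> bool {
    normalize_ws(left) == normalize_ws(right)
}
```

With this helper, `same_modulo_whitespace(&format!("{tree}"), html)` could replace the strict comparison. It only ignores the amount of whitespace: it still treats `<p>x</p>` and `<p> x </p>` as different, and it knows nothing about attribute order or quoting style.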
You can also use the find and filter methods to manage this HTML. To do this, you need to create your filtering options with the Filter type.
use html_filter::*;
let html: &str = r##"
<section>
<h1>Welcome to My Random Page</h1>
<nav>
<ul>
<li><a href="/home">Home</a></li>
<li><a href="/about">About</a></li>
<li><a href="/services">Services</a></li>
<li><a href="/contact">Contact</a></li>
</ul>
</nav>
</section>
"##;
// Create your filter
let filter = Filter::new().tag_name("li");
// Parse your HTML
let filtered: Html = Html::parse(html).expect("Invalid HTML").filter(&filter);
// `filtered` contains the 4 `<li>`s from the above HTML string
if let Html::Vec(links) = filtered {
assert!(links.len() == 4)
} else {
unreachable!()
}
The find method returns the first element that matches the filter:
use html_filter::*;
let html: &str = r##"
<section>
<h1>Welcome to My Random Page</h1>
<nav>
<ul>
<li><a href="/home">Home</a></li>
<li><a href="/about">About</a></li>
<li><a href="/services">Services</a></li>
<li><a href="/contact">Contact</a></li>
</ul>
</nav>
</section>
"##;
// Create your filter
let filter = Filter::new().tag_name("a");
// Parse your HTML
let link: Html = Html::parse(html).expect("Invalid HTML").find(&filter);
// Check the result: link contains `<a href="/home">Home</a>`
if let Html::Tag { tag, child, .. } = link && let Html::Text(text) = *child {
assert!(tag.as_name() == "a" && text == "Home");
} else {
unreachable!()
}
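To make the difference between the two examples above concrete, here is a minimal std-only sketch of "first match" versus "all matches" on a tree. Everything in it is an assumption made for illustration: `Node`, `find_first`, `collect_all`, `tag`, `text` and `sample_nav` are hypothetical names, the tree is far simpler than the crate's `Html` type, and the depth-first, document-order traversal is a guess at the behaviour rather than a description of the crate's implementation.

```rust
/// Hypothetical, simplified tree; NOT the `Html` type from `html_filter`.
#[derive(Debug, PartialEq)]
enum Node {
    Text(String),
    Tag { name: String, children: Vec<Node> },
}

/// Returns the first tag named `wanted`, visiting nodes depth-first in
/// document order (an assumed traversal order).
fn find_first<'a>(node: &'a Node, wanted: &str) -> Option<&'a Node> {
    match node {
        Node::Text(_) => None,
        Node::Tag { name, .. } if name.as_str() == wanted => Some(node),
        Node::Tag { children, .. } => {
            children.iter().find_map(|child| find_first(child, wanted))
        }
    }
}

/// Pushes every tag named `wanted` into `out`. This sketch does not descend
/// into a node once it has matched.
fn collect_all<'a>(node: &'a Node, wanted: &str, out: &mut Vec<&'a Node>) {
    if let Node::Tag { name, children } = node {
        if name.as_str() == wanted {
            out.push(node);
        } else {
            for child in children {
                collect_all(child, wanted, out);
            }
        }
    }
}

/// Small constructors to keep the sample tree readable.
fn tag(name: &str, children: Vec<Node>) -> Node {
    Node::Tag { name: name.to_owned(), children }
}

fn text(content: &str) -> Node {
    Node::Text(content.to_owned())
}

/// Builds a `<ul>` with four `<li><a>…</a></li>` items, mirroring the
/// navigation list used in the examples above.
fn sample_nav() -> Node {
    let item = |label: &str| tag("li", vec![tag("a", vec![text(label)])]);
    tag("ul", vec![item("Home"), item("About"), item("Services"), item("Contact")])
}
```

On the tree built by `sample_nav()`, `find_first` with "a" stops at the "Home" link, while `collect_all` with "li" gathers four nodes, which matches the results of the find and filter examples above.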
Licensed under either of
at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
[!IMPORTANT] Do not use this parser to check the syntax of your HTML code. Many malformed HTML files are parsed without any error, as the sole objective is to produce a parsed version. Only breaking syntax errors raise errors.
Obviously, all valid HTML files work fine with this crate.