# Html parser A simple and general purpose html/xhtml parser lib/bin, using [Pest](https://pest.rs/). ## Features - Parse html & xhtml (not xml processing instructions) - Parse html-documents - Parse html-fragments - Parse empty documents - Parse with the same api for both documents and fragments - Parse custom, non-standard, elements; ``, `` and `` - Removes comments - Removes dangling elements - Iterate over all nodes in the dom three ## What is it not - It's not a high-performance browser-grade parser - It's not suitable for html validation - It's not a parser that includes element selection or dom manipulation If your requirements matches any of the above, then you're most likely looking for one of the crates below: - [html5ever](https://crates.io/crates/html5ever) - [kuchiki](https://crates.io/crates/kuchiki) - [scraper](https://crates.io/crates/scraper) - or other crates using the `html5ever` parser ## Examples bin Parse html file ```shell html_parser index.html ``` Parse stdin with pretty output ```shell curl | html_parser -p ``` ## Examples lib Parse html document ```rust use html_parser::Dom; fn main() { let html = r#" Html parser

Hello world

"#; assert!(Dom::parse(html).is_ok()); } ``` Parse html fragment ```rust use html_parser::Dom; fn main() { let html = "
"; assert!(Dom::parse(html).is_ok()); } ``` Print to json ```rust use html_parser::{Dom, Result}; fn main() -> Result<()> { let html = "
"; let json = Dom::parse(html)?.to_json_pretty()?; println!("{}", json); Ok(()) } ```