html-extractor

Crates.io: html-extractor
lib.rs: html-extractor
version: 1.0.0
source: src
created_at: 2020-04-26 10:40:15.920812
updated_at: 2020-08-08 08:18:23.893838
description: A Rust crate for extracting data from HTML
repository: https://github.com/mkihr-ojisan/html-extractor
id: 234221
size: 22,932
(mkihr-ojisan)

A Rust crate for extracting data from HTML.

Examples

Extracting a simple value from HTML

use html_extractor::{html_extractor, HtmlExtractor};
html_extractor! {
    #[derive(Debug, PartialEq)]
    Foo {
        foo: usize = (text of "#foo"),
    }
}

fn main() {
    let input = r#"
        <div id="foo">1</div>
    "#;
    let foo = Foo::extract_from_str(input).unwrap();
    assert_eq!(foo, Foo { foo: 1 });
}

Extracting a collection from HTML

use html_extractor::{html_extractor, HtmlExtractor};
html_extractor! {
    #[derive(Debug, PartialEq)]
    Foo {
        foo: Vec<usize> = (text of ".foo", collect),
    }
}

fn main() {
    let input = r#"
        <div class="foo">1</div>
        <div class="foo">2</div>
        <div class="foo">3</div>
        <div class="foo">4</div>
    "#;
    let foo = Foo::extract_from_str(input).unwrap();
    assert_eq!(foo, Foo { foo: vec![1, 2, 3, 4] });
}

Extracting with regex

use html_extractor::{html_extractor, HtmlExtractor};
html_extractor! {
    #[derive(Debug, PartialEq)]
    Foo {
        (foo: usize,) = (text of "#foo", capture with "^foo=(.*)$"),
    }
}

fn main() {
    let input = r#"
        <div id="foo">foo=1</div>
    "#;
    let foo = Foo::extract_from_str(input).unwrap();
    assert_eq!(foo, Foo { foo: 1 });
}

Changelog

v0.4.0

  • Add presence of .. target specifier

v0.3.0

  • Add parser specifier
  • Add inner_html target specifier
  • Change the behavior when extracting text nodes so that spaces at both ends are removed
  • Fix error message
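
The v0.3.0 trimming change matters for examples like the ones above, because Rust's standard integer parsing rejects surrounding whitespace. The following is a minimal std-only sketch of that effect, not the crate's internal code: the helper names `parse_text_node` and `parse_untrimmed` are hypothetical, and it assumes "spaces at both ends" corresponds to what `str::trim` removes.

```rust
/// Hypothetical helper (not part of html-extractor): trims a text node
/// before parsing it, mirroring the behavior described for v0.3.0.
fn parse_text_node(raw: &str) -> Option<usize> {
    // `str::trim` strips leading and trailing whitespace, including newlines.
    raw.trim().parse::<usize>().ok()
}

/// The same parse without trimming, for comparison.
fn parse_untrimmed(raw: &str) -> Option<usize> {
    // `usize::from_str` fails on any non-digit character, whitespace included.
    raw.parse::<usize>().ok()
}
```

With these helpers, a text node such as `"\n    1\n  "` (typical of indented HTML) parses only through the trimming variant; the untrimmed variant returns `None` for it.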

v0.2.1

  • Fix the internal usage of the Rust standard library

v0.2.0

  • Rename "collect specifier" to "collector specifier"
  • Add "optional" collector

v0.1.1

  • Fix the links in the documentation