# htmlq

Like [`jq`](https://stedolan.github.io/jq/), but for HTML. Uses [CSS selectors](https://developer.mozilla.org/en-US/docs/Learn/CSS/Introduction_to_CSS/Selectors) to extract bits of content from HTML files.

## Installation

### [Cargo](https://crates.io/crates/htmlq)

```sh
cargo install htmlq
```

### [Homebrew](https://formulae.brew.sh/formula/htmlq)

```sh
brew install htmlq
```

## Usage

```console
$ htmlq -h
htmlq 0.4.0
Michael Maclean
Runs CSS selectors on HTML

USAGE:
    htmlq [FLAGS] [OPTIONS] [--] [selector]...

FLAGS:
    -B, --detect-base          Try to detect the base URL from the <base> tag in the document. If not found, default to
                               the value of --base, if supplied
    -h, --help                 Prints help information
    -w, --ignore-whitespace    When printing text nodes, ignore those that consist entirely of whitespace
    -p, --pretty               Pretty-print the serialised output
    -t, --text                 Output only the contents of text nodes inside selected elements
    -V, --version              Prints version information

OPTIONS:
    -a, --attribute           Only return this attribute (if present) from selected elements
    -b, --base                Use this URL as the base for links
    -f, --filename            The input file. Defaults to stdin
    -o, --output              The output file. Defaults to stdout
    -r, --remove-nodes ...    Remove nodes matching this expression before output. May be specified multiple times

ARGS:
    <selector>...    The CSS expression to select [default: html]
$
```

## Examples

### Using with cURL to find part of a page by ID

```console
$ curl --silent https://www.rust-lang.org/ | htmlq '#get-help'

Get help!

```

### Find all the links in a page

```console
$ curl --silent https://www.rust-lang.org/ | htmlq --attribute href a
/
/tools/install
/learn
/tools
/governance
/community
https://blog.rust-lang.org/
/learn/get-started
https://blog.rust-lang.org/2019/04/25/Rust-1.34.1.html
https://blog.rust-lang.org/2018/12/06/Rust-1.31-and-rust-2018.html
[...]
```

### Get the text content of a post

```console
$ curl --silent https://nixos.org/nixos/about.html | htmlq --text .main
About NixOS

NixOS is a GNU/Linux distribution that aims to improve the state of the art in system configuration management. In existing distributions, actions such as upgrades are dangerous: upgrading a package can cause other packages to break, upgrading an entire system is much less reliable than reinstalling from scratch, you can’t safely test what the results of a configuration change will be, you cannot easily undo changes to the system, and so on. We want to change that.

NixOS has many innovative features:
[...]
```

### Remove a node before output

There's a big SVG image in this page that I don't need, so here's how to remove it.

```console
$ curl --silent https://nixos.org/ | ./target/debug/htmlq '.whynix' --remove-nodes svg
  • Reproducible

    Nix builds packages in isolation from each other. This ensures that they are reproducible and don't have undeclared dependencies, so if a package works on one machine, it will also work on another.

  • Declarative

    Nix makes it trivial to share development and build environments for your projects, regardless of what programming languages and tools you’re using.

  • Reliable

    Nix ensures that installing or upgrading one package cannot break other packages. It allows you to roll back to previous versions, and ensures that no package is in an inconsistent state during an upgrade.

```

### Pretty print HTML

(This is a bit of a work in progress)

```console
$ curl --silent https://mgdm.net | htmlq --pretty '#posts'

I write about...

  • Debugging network connections on macOS with nettop

    Using nettop to find out what network connections a program is trying to make.

  • [...]
```

### Syntax highlighting with [`bat`](https://github.com/sharkdp/bat)

```console
$ curl --silent example.com | htmlq 'body' | bat --language html
```

> Syntax highlighted output
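
### Read from a file and write to a file

The examples above all pipe a page in from cURL. The help text also lists `--filename` and `--output`, so the following is a minimal sketch of working with local files instead. The `sample.html` page, its `#posts` element, and the `posts.html` output name are hypothetical, made up only for illustration. The `htmlq` commands are skipped if the tool is not on your `PATH`.

```sh
# Hypothetical sample page, created only so the commands below have input.
cat > sample.html <<'EOF'
<html>
  <body>
    <div id="posts">
      <h2>First post</h2>
      <p>Hello from a local file.</p>
    </div>
  </body>
</html>
EOF

if command -v htmlq >/dev/null 2>&1; then
  # --filename reads the page from disk instead of stdin; --text keeps only
  # text nodes, and --ignore-whitespace drops the whitespace-only ones.
  htmlq --filename sample.html --text --ignore-whitespace '#posts'

  # --output writes the result to a file instead of stdout.
  htmlq --filename sample.html --output posts.html '#posts'
else
  echo "htmlq not found on PATH; skipping the htmlq commands" >&2
fi
```

Adding `--ignore-whitespace` to `--text` should keep the indentation between tags out of the output, going by the flag's description in the help text.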