| Crates.io | html-to-markdown-cli |
| lib.rs | html-to-markdown-cli |
| version | 2.23.4 |
| created_at | 2025-10-10 17:53:50.692314+00 |
| updated_at | 2026-01-20 18:38:26.121756+00 |
| description | Command-line interface for html-to-markdown - high-performance HTML to Markdown converter |
| homepage | https://github.com/kreuzberg-dev/html-to-markdown |
| repository | https://github.com/kreuzberg-dev/html-to-markdown |
| max_upload_size | |
| id | 1877284 |
| size | 130,273 |
High-performance HTML → Markdown conversion powered by Rust. Ships as a Rust crate, Python package, PHP extension, Ruby gem, Elixir Rustler NIF, Node.js bindings, WebAssembly, and a standalone CLI, with identical rendering behavior across all runtimes.
Each language binding provides comprehensive documentation with installation instructions, examples, and best practices. Bindings are grouped into scripting languages, JavaScript/TypeScript, compiled languages, native, and command-line.
Extract comprehensive metadata during conversion: title, description, headers, links, images, structured data (JSON-LD, Microdata, RDFa). Use cases: SEO extraction, table-of-contents generation, link validation, accessibility auditing, content migration.
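To make the kinds of metadata listed above concrete, here is a minimal standard-library sketch that collects a title, headings, links, and images from HTML and turns the headings into a table of contents. It does not use html-to-markdown and does not reflect its API; `MetadataCollector`, `extract_metadata`, and `table_of_contents` are hypothetical names introduced only for illustration.

```python
from html.parser import HTMLParser

HEADING_TAGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class MetadataCollector(HTMLParser):
    """Hypothetical helper: collects title, headings, links, and images."""

    def __init__(self):
        super().__init__()
        self.metadata = {"title": "", "headers": [], "links": [], "images": []}
        self._capture = None  # tag whose text is currently being captured
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title" or tag in HEADING_TAGS:
            self._capture, self._buffer = tag, []
        elif tag == "a" and attrs.get("href"):
            self.metadata["links"].append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.metadata["images"].append(
                {"src": attrs["src"], "alt": attrs.get("alt") or ""}
            )

    def handle_data(self, data):
        if self._capture is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._capture:
            return
        text = "".join(self._buffer).strip()
        if tag == "title":
            self.metadata["title"] = text
        else:
            self.metadata["headers"].append({"level": int(tag[1]), "text": text})
        self._capture = None


def extract_metadata(html):
    collector = MetadataCollector()
    collector.feed(html)
    collector.close()
    return collector.metadata


def table_of_contents(metadata):
    """Render the collected headings as an indented Markdown list."""
    return "\n".join(
        "  " * (h["level"] - 1) + "- " + h["text"] for h in metadata["headers"]
    )
```

The same collected structure could feed link validation (iterate over `links`) or an accessibility check (flag `images` entries with an empty `alt`).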
Customize HTML→Markdown conversion with callbacks for specific elements. Intercept links, images, headings, lists, and more. Use cases: domain-specific Markdown dialects (Obsidian, Notion), content filtering, URL rewriting, accessibility validation, analytics.
Supported in: Rust, Python (sync & async), TypeScript/Node.js (sync & async), Ruby, and PHP.
| Binding | Visitor Support | Async Support | Best For |
|---|---|---|---|
| Rust | ✅ Yes | ✅ Tokio | Core library, performance-critical code |
| Python | ✅ Yes | ✅ asyncio | Server-side, bulk processing |
| TypeScript/Node.js | ✅ Yes | ✅ Promise-based | Server-side Node.js/Bun, best performance |
| Ruby | ✅ Yes | ❌ No | Server-side Ruby on Rails, Sinatra |
| PHP | ✅ Yes | ❌ No | Server-side PHP, content management |
| Go | ❌ No | — | Basic conversion only |
| Java | ❌ No | — | Basic conversion only |
| C# | ❌ No | — | Basic conversion only |
| Elixir | ❌ No | — | Basic conversion only |
| WebAssembly | ❌ No | — | Browser, Edge, Deno (FFI limitations) |
For WASM users needing visitor functionality, see WASM Visitor Alternatives for recommended approaches.
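The visitor idea described above (a callback that intercepts an element and may override its rendering) can be sketched with a toy converter. This is only an illustration of the pattern, built on Python's standard library; it is not the html-to-markdown visitor API, and `TinyConverter`, `convert_with_visitor`, and `wiki_links` are hypothetical names. The toy handles only plain text and `<a>` elements.

```python
from html.parser import HTMLParser


class TinyConverter(HTMLParser):
    """Toy converter: plain text and <a> only, with an optional link visitor."""

    def __init__(self, visit_link=None):
        super().__init__()
        self.visit_link = visit_link
        self.out = []          # finished output fragments
        self._href = None      # href of the link being read, if any
        self._link_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._link_text = []

    def handle_data(self, data):
        if self._href is not None:
            self._link_text.append(data)
        else:
            self.out.append(data)

    def handle_endtag(self, tag):
        if tag != "a" or self._href is None:
            return
        text, href = "".join(self._link_text), self._href
        self._href = None
        custom = self.visit_link(href, text) if self.visit_link else None
        # A visitor returning None means "use the default rendering".
        self.out.append(custom if custom is not None else f"[{text}]({href})")


def convert_with_visitor(html, visit_link=None):
    converter = TinyConverter(visit_link)
    converter.feed(html)
    converter.close()
    return "".join(converter.out)


def wiki_links(href, text):
    """Example visitor: render site-relative links as Obsidian-style wiki links."""
    if href.startswith("/"):
        return f"[[{href.strip('/')}|{text}]]"
    return None
```

Returning `None` to fall back to the default keeps visitors small: each one only needs to handle the cases it cares about, such as URL rewriting or dialect-specific link syntax.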
Rust-powered core delivers 150–280 MB/s throughput (10–80× faster than pure Python alternatives). Includes benchmarking tools, memory profiling, streaming strategies, and optimization tips.
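For a sense of scale, at the quoted rates a 1000 MB corpus would take roughly 3.6 to 6.7 seconds to convert. The sketch below shows one way such a figure could be measured for any converter function. It is not the project's benchmarking tooling; `measure_throughput` and `seconds_for` are hypothetical helpers, and 1 MB is assumed to mean 10^6 bytes.

```python
import time


def measure_throughput(convert, html, repeats=5):
    """Best observed throughput of convert(html) in MB/s (1 MB = 10**6 bytes)."""
    size_mb = len(html.encode("utf-8")) / 1_000_000
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        convert(html)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading on very small inputs.
    return size_mb / max(best, 1e-9)


def seconds_for(total_mb, mb_per_second):
    """Time needed to process total_mb at a sustained rate."""
    return total_mb / mb_per_second
```

Passing the converter in as a callable means the same harness can compare any two implementations on identical input, for example `measure_throughput(some_convert_function, large_html_string)`.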
Keep specific HTML tags unconverted when Markdown isn't expressive enough. Useful for tables, SVG, custom elements, or when you need mixed HTML/Markdown output.
See language-specific documentation for preserveTags configuration.
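The effect of preserving tags can be illustrated with a toy converter: elements named in a preserve list are copied through as HTML, while everything else is converted. This is a standard-library sketch of the behavior only, not the library's implementation or its `preserveTags` option; `PreservingConverter` and `convert_preserving` are hypothetical names, and the toy understands only `<p>`, `<strong>`, and `<b>`.

```python
import html as html_lib
from html.parser import HTMLParser


class PreservingConverter(HTMLParser):
    """Toy converter: bold and paragraphs become Markdown; preserved tags stay HTML."""

    def __init__(self, preserve_tags=()):
        super().__init__()
        self.preserve = set(preserve_tags)
        self.out = []
        self._root = None  # name of the preserved element being copied
        self._depth = 0    # nesting depth of that element; 0 means not copying

    def handle_starttag(self, tag, attrs):
        if self._depth:
            self.out.append(self.get_starttag_text())
            if tag == self._root:
                self._depth += 1
        elif tag in self.preserve:
            self._root, self._depth = tag, 1
            self.out.append(self.get_starttag_text())
        elif tag in ("strong", "b"):
            self.out.append("**")

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags are copied only when they are being preserved.
        if self._depth or tag in self.preserve:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self._depth:
            self.out.append(f"</{tag}>")
            if tag == self._root:
                self._depth -= 1
        elif tag in ("strong", "b"):
            self.out.append("**")
        elif tag == "p":
            self.out.append("\n\n")

    def handle_data(self, data):
        # Re-escape text inside preserved elements so the copied HTML stays valid.
        self.out.append(html_lib.escape(data, quote=False) if self._depth else data)


def convert_preserving(source, preserve_tags=()):
    converter = PreservingConverter(preserve_tags)
    converter.feed(source)
    converter.close()
    return "".join(converter.out).strip()
```

With `preserve_tags=["table"]`, a table passes through untouched while the surrounding paragraph is still converted, which is the mixed HTML/Markdown output the paragraph above describes.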
Skip all images during conversion using the skip_images option. Useful for text-only extraction or when you want to filter out visual content.
Rust:
use html_to_markdown_rs::{convert, ConversionOptions};
let options = ConversionOptions {
    skip_images: true,
    ..Default::default()
};
let html = r#"<p>Text with <img src="image.jpg" alt="pic"> image</p>"#;
let markdown = convert(html, Some(options))?;
// Output: "Text with image" (image tags are removed)
Python:
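from html_to_markdown import convert, ConversionOptions

# Same sample input as the Rust example
html = '<p>Text with <img src="image.jpg" alt="pic"> image</p>'
options = ConversionOptions(skip_images=True)
markdown = convert(html, options)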
TypeScript/Node.js:
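import { convert, ConversionOptions } from '@kreuzberg/html-to-markdown-node';

// Same sample input as the Rust example
const html = '<p>Text with <img src="image.jpg" alt="pic"> image</p>';
const options: ConversionOptions = {
  skipImages: true,
};
const markdown = convert(html, options);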
Ruby:
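require 'html_to_markdown'

# Same sample input as the Rust example
html = '<p>Text with <img src="image.jpg" alt="pic"> image</p>'
options = HtmlToMarkdown::ConversionOptions.new(skip_images: true)
markdown = HtmlToMarkdown.convert(html, options)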
PHP:
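use Goldziher\HtmlToMarkdown\HtmlToMarkdown;
use Goldziher\HtmlToMarkdown\Options;

// Same sample input as the Rust example
$html = '<p>Text with <img src="image.jpg" alt="pic"> image</p>';
$options = new Options(['skip_images' => true]);
$markdown = HtmlToMarkdown::convert($html, $options);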
This option is available across all language bindings. When enabled, all <img> tags and their associated markdown image syntax are removed from the output.
Built-in HTML sanitization prevents XSS attacks and malicious content. Powered by ammonia with safe defaults. Configurable via sanitize options.
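The general idea behind allowlist sanitization, the approach ammonia takes, can be sketched as follows: keep only known-safe tags and attributes, drop everything else, and discard the content of script-like elements. This standard-library toy is for illustration only. It is not ammonia, not the library's sanitize options, and not safe for production use; the allowlists and the name `sanitize` are assumptions made for the example.

```python
import html as html_lib
from html.parser import HTMLParser

ALLOWED_TAGS = {"p", "a", "strong", "em", "ul", "ol", "li"}  # illustrative allowlist
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = ("http://", "https://", "mailto:")
DROP_CONTENT = {"script", "style"}  # elements whose text is discarded as well


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__()
        self.out = []
        self._dropping = 0  # >0 while inside a script/style element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self._dropping += 1
        elif not self._dropping and tag in ALLOWED_TAGS:
            kept = [
                f' {name}="{html_lib.escape(value or "")}"'
                for name, value in attrs
                if name in ALLOWED_ATTRS.get(tag, ())
                and (value or "").lower().startswith(SAFE_SCHEMES)
            ]
            self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self._dropping = max(0, self._dropping - 1)
        elif not self._dropping and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._dropping:
            self.out.append(html_lib.escape(data, quote=False))


def sanitize(source):
    sanitizer = AllowlistSanitizer()
    sanitizer.feed(source)
    sanitizer.close()
    return "".join(sanitizer.out)
```

An allowlist fails closed: an unknown tag, an `onclick` attribute, or a `javascript:` URL is dropped because nothing explicitly permits it. Note that this toy also drops relative URLs, since it only accepts the schemes listed in `SAFE_SCHEMES`.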
Contributions are welcome! See CONTRIBUTING.md for guidelines.
All contributions must follow code quality standards enforced via pre-commit hooks (prek).
MIT License – see LICENSE for details. You can use html-to-markdown freely in both commercial and closed-source products. The license has no copyleft ("viral") terms and requires only that the copyright and permission notice be kept with copies of the software.