turndown

Crates.io: turndown
lib.rs: turndown
version: 0.1.4
created_at: 2025-11-29 19:21:29.620657+00
updated_at: 2025-12-27 18:32:15.521085+00
description: An opinionated Rust port of Turndown.js
repository: https://github.com/ravnmail/turndown
id: 1957304
size: 107,717
Michael Wallner (badmike)

License: MIT Crates.io Rust

An opinionated Rust port of Turndown.js, a robust HTML to Markdown converter. This crate provides a fast, reliable way to transform HTML documents into clean, readable Markdown format.

Motivation

While several HTML to Markdown converters exist for Rust, many lack the flexibility and configurability needed to convert the varied, often messy HTML found in email. This port of Turndown aims to fill that gap with a highly customizable conversion process that lets users tailor the output to their needs.

This crate was created to handle the mail conversion needs of RAVN Mail, the open-source email client for digital natives. Star RAVN on GitHub if you like it!

Features

  • Drop-in HTML to Markdown converter with sensible defaults for email content.
  • Highly configurable conversion rules and formatting options.
  • Support for multiple Markdown styles:
    • Heading styles: ATX (# Heading) or Setext (Heading\n=======)
    • Code block styles: Fenced (```) or Indented
    • Link styles: Inlined or Referenced
  • Performs well on email newsletters, marketing emails and human-written emails.
  • Optionally filters tracking pixels and other unnecessary elements commonly found in email HTML (disabled by default).
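
To make the difference between the two heading styles concrete, here is a minimal, standard-library-only sketch. The `HeadingStyle` enum below is a local stand-in that mirrors the names used in this README, and `render_heading` is a hypothetical helper; neither is taken from the crate's source. The fallback to ATX for levels 3 to 6 is an assumption based on Setext syntax only defining two levels.

```rust
/// Local stand-in for the heading styles named in the feature list.
/// This is NOT imported from the crate; it only mirrors the names.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum HeadingStyle {
    Atx,
    Setext,
}

/// Hypothetical helper: renders a heading of the given level (clamped to
/// 1..=6) in the requested style.
pub fn render_heading(text: &str, level: usize, style: HeadingStyle) -> String {
    let level = level.clamp(1, 6);
    match style {
        // Setext syntax only defines underlines for levels 1 and 2.
        HeadingStyle::Setext if level <= 2 => {
            let underline = if level == 1 { "=" } else { "-" };
            // Underline as wide as the text, at least one character.
            let width = text.chars().count().max(1);
            format!("{}\n{}", text, underline.repeat(width))
        }
        // ATX style; Setext levels 3 to 6 fall back to ATX (assumption).
        _ => format!("{} {}", "#".repeat(level), text),
    }
}
```

With this sketch, `render_heading("Hello", 1, HeadingStyle::Atx)` yields `# Hello`, while the Setext variant yields `Hello` followed by a line of five `=` characters.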

Usage

Add this to your Cargo.toml:

[dependencies]
turndown = "0.1"

Configuration

Basic Example

use turndown::Turndown;

let turndown = Turndown::new();
let html = "<h1>Hello</h1><p>This is a <strong>test</strong>.</p>";
let markdown = turndown.convert(html);

println!("{}", markdown);
// Output:
// # Hello
// This is a **test**.

Advanced Configuration

use turndown::{Turndown, TurndownOptions, HeadingStyle, CodeBlockStyle};

let mut options = TurndownOptions::default();
options.heading_style = HeadingStyle::Setext;
options.code_block_style = CodeBlockStyle::Indented;

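let turndown = Turndown::with_options(options);
// Same input as in the basic example
let html = "<h1>Hello</h1><p>This is a <strong>test</strong>.</p>";
let markdown = turndown.convert(html);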

Command-line Tool

This crate includes a command-line tool for converting HTML to Markdown:

# Convert HTML from stdin to Markdown
echo "<h1>Hello</h1>" | turndown
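
A stdin-to-stdout filter like the one above can be wired up in a few lines. The following is a sketch only, not the crate's actual binary: `run_filter` is a hypothetical name, and the converter is passed in as a closure so the example depends on nothing but the standard library.

```rust
use std::io::{self, Read, Write};

/// Hypothetical helper: reads all of `input` as UTF-8 HTML, applies
/// `convert`, and writes the resulting Markdown to `output`.
pub fn run_filter<R: Read, W: Write>(
    mut input: R,
    mut output: W,
    convert: impl Fn(&str) -> String,
) -> io::Result<()> {
    // Buffer the whole document; an HTML parser needs the full input anyway.
    let mut html = String::new();
    input.read_to_string(&mut html)?;

    let markdown = convert(&html);
    output.write_all(markdown.as_bytes())?;
    output.flush()
}
```

In a real binary, `input` and `output` would be the locked stdin and stdout handles, and the closure would call `Turndown::new().convert(html)`, assuming `convert` returns a `String` as the basic example suggests.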

Configuration Options

The TurndownOptions struct provides fine-grained control over the conversion:

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| heading_style | HeadingStyle | Atx | Heading style: Atx (# Heading) or Setext (Heading\n=======) |
| hr | String | * * * | String used to render horizontal rules |
| bullet_list_marker | String | * | Marker used for bullet lists (can be *, +, or -) |
| code_block_style | CodeBlockStyle | Fenced | Code block style: Fenced (```) or Indented |
| fence | String | ``` | Delimiter used for fenced code blocks |
| em_delimiter | String | _ | Delimiter used for emphasis/italics (can be _ or *) |
| strong_delimiter | String | ** | Delimiter used for strong emphasis/bold |
| link_style | LinkStyle | Inlined | Link style: Inlined or Referenced |
| link_reference_style | LinkReferenceStyle | Full | Link reference style: Full, Collapsed, or Shortcut (only for Referenced link style) |
| br | String | Two spaces | String used to represent line breaks in Markdown |
| strip_tracking_images | bool | false | Strip tracking pixels and beacons from output |
| tracking_image_regex | Option<Regex> | Sensible default | Custom regex pattern to identify tracking images |
| strip_images_without_alt | bool | false | Strip images that lack alt attributes |
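
The three `link_reference_style` values correspond to the full, collapsed, and shortcut reference-link forms of Markdown. The sketch below shows what each form looks like. The enum is a local stand-in mirroring the names in the table, `reference_link` is a hypothetical helper, and the use of a numeric id as the label in the `Full` style is an assumption; the README does not say how the crate labels its references.

```rust
/// Local stand-in mirroring the `link_reference_style` values in the table.
/// This is NOT imported from the crate.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum LinkReferenceStyle {
    Full,
    Collapsed,
    Shortcut,
}

/// Hypothetical helper: returns `(inline_part, definition)` for a
/// reference-style link. `id` is only used by the `Full` style.
pub fn reference_link(
    text: &str,
    href: &str,
    id: usize,
    style: LinkReferenceStyle,
) -> (String, String) {
    match style {
        // [text][1] in the body, [1]: url in the definitions.
        LinkReferenceStyle::Full => (
            format!("[{}][{}]", text, id),
            format!("[{}]: {}", id, href),
        ),
        // [text][] in the body, with the link text reused as the label.
        LinkReferenceStyle::Collapsed => (
            format!("[{}][]", text),
            format!("[{}]: {}", text, href),
        ),
        // [text] alone in the body, with the link text reused as the label.
        LinkReferenceStyle::Shortcut => (
            format!("[{}]", text),
            format!("[{}]: {}", text, href),
        ),
    }
}
```

The inline part goes where the link appears in the text; the definition is typically collected and emitted after the surrounding content.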

Configuration Examples

use turndown::{Turndown, TurndownOptions, HeadingStyle, CodeBlockStyle, LinkStyle};

// Use Setext headings and indented code blocks
let mut options = TurndownOptions::default();
options.heading_style = HeadingStyle::Setext;
options.code_block_style = CodeBlockStyle::Indented;

let turndown = Turndown::with_options(options);

// Strip tracking images from email newsletters
let mut options = TurndownOptions::default();
options.strip_tracking_images = true;

let turndown = Turndown::with_options(options);

// Use reference-style links (link_reference_style keeps its default, Full)
let mut options = TurndownOptions::default();
options.link_style = LinkStyle::Referenced;

let turndown = Turndown::with_options(options);

Architecture

The conversion process works in two main stages:

  1. Parsing: HTML is parsed into a DOM tree using the html5ever parser.
  2. Conversion: The DOM tree is traversed and converted to Markdown using a rule-based system.

The rule-based system allows for customization and extension of the conversion logic.
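
As an illustration of the rule-based idea, the sketch below converts a tiny hand-built tree with a list of tag rules. It is a simplified model under stated assumptions, not the crate's implementation: `Node`, `Rule`, `convert`, and `default_rules` are hypothetical names, and the tree is constructed by hand instead of being parsed with html5ever.

```rust
/// A tiny stand-in for a parsed DOM (the crate itself parses with html5ever).
pub enum Node {
    Text(String),
    Element { tag: String, children: Vec<Node> },
}

/// Hypothetical rule: a tag name plus a function that receives the
/// already-converted Markdown of the element's children.
pub struct Rule {
    pub tag: &'static str,
    pub replace: fn(&str) -> String,
}

/// Converts a node depth-first: children are rendered first, then the first
/// rule whose tag matches is applied. Elements without a rule pass their
/// content through unchanged.
pub fn convert(node: &Node, rules: &[Rule]) -> String {
    match node {
        Node::Text(text) => text.clone(),
        Node::Element { tag, children } => {
            let inner: String = children.iter().map(|c| convert(c, rules)).collect();
            match rules.iter().find(|r| r.tag == tag.as_str()) {
                Some(rule) => (rule.replace)(&inner),
                None => inner,
            }
        }
    }
}

/// A small illustrative rule set; customizing the conversion means
/// replacing or extending this list.
pub fn default_rules() -> Vec<Rule> {
    vec![
        Rule { tag: "h1", replace: |s: &str| format!("# {}\n\n", s) },
        Rule { tag: "p", replace: |s: &str| format!("{}\n\n", s) },
        Rule { tag: "strong", replace: |s: &str| format!("**{}**", s) },
        Rule { tag: "em", replace: |s: &str| format!("_{}_", s) },
    ]
}
```

Keeping each rule as plain data (a tag and a function) is what makes this kind of design easy to extend: adding support for a new element is a matter of appending one entry to the list.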

License

Licensed under either of:

  • Apache License, Version 2.0
  • MIT license

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
