| Crates.io | markdown-ppp |
| lib.rs | markdown-ppp |
| version | 2.7.1 |
| created_at | 2025-04-25 19:14:21.161553+00 |
| updated_at | 2025-09-23 19:54:19.799538+00 |
| description | Feature-rich Markdown Parsing and Pretty-Printing library |
| homepage | |
| repository | https://github.com/johnlepikhin/markdown-ppp |
| max_upload_size | |
| id | 1649375 |
| size | 705,669 |
markdown-ppp is a feature-rich, flexible, and lightweight Rust library for parsing and processing Markdown documents.
It provides a clean, well-structured Abstract Syntax Tree (AST) for parsed documents, making it suitable for pretty-printing, analyzing, transforming, or rendering Markdown.
Add the crate using Cargo:
cargo add markdown-ppp
If you want only the AST definitions without parsing functionality, disable default features manually:
[dependencies]
markdown-ppp = { version = "2.6.0", default-features = false }
The main entry point for parsing is the parse_markdown function in the parser module:
pub fn parse_markdown(
state: MarkdownParserState,
input: &str,
) -> Result<Document, nom::Err<nom::error::Error<&str>>>
Example:
use markdown_ppp::ast::Document;
use markdown_ppp::parser::{parse_markdown, MarkdownParserState};

fn main() {
    // The state controls parsing behavior and is passed by value,
    // matching the signature shown above.
    let state = MarkdownParserState::new();
    let input = "# Hello, World!";
    let result: Result<Document, _> = parse_markdown(state, input);
    match result {
        Ok(document) => {
            println!("Parsed document: {:?}", document);
        }
        Err(err) => {
            eprintln!("Failed to parse Markdown: {:?}", err);
        }
    }
}
The MarkdownParserState controls parsing behavior and can be customized.
You can create a default state easily:
use markdown_ppp::parser::config::*;
let state = MarkdownParserState::default();
Alternatively, you can configure it manually by providing a MarkdownParserConfig:
use markdown_ppp::parser::config::*;
let config = MarkdownParserConfig::default()
.with_block_blockquote_behavior(ElementBehavior::Ignore);
let ast = parse_markdown(MarkdownParserState::with_config(config), "hello world")?;
This allows you to control how certain Markdown elements are parsed or ignored.
You can control how individual Markdown elements are parsed at a fine-grained level using the MarkdownParserConfig API.
Each element type (block-level or inline-level) can be configured with an ElementBehavior:
pub enum ElementBehavior<ELT> {
/// The parser will parse the element normally.
Parse,
/// The parser will ignore the element and not parse it. In this case, alternative
/// parsers will be tried.
Ignore,
/// Parse the element but do not include it in the output.
Skip,
/// Parse the element and apply a custom function to it.
Map(ElementMapFn<ELT>),
/// Parse the element and apply a custom function to it which returns an array of elements.
FlatMap(ElementFlatMapFn<ELT>),
}
These behaviors can be set via builder-style methods on the config. For example, to parse thematic breaks but drop them from the output, and to transform blockquotes:
use markdown_ppp::parser::config::*;
use markdown_ppp::ast::Block;
let config = MarkdownParserConfig::default()
.with_block_thematic_break_behavior(ElementBehavior::Skip)
.with_block_blockquote_behavior(ElementBehavior::Map(|_bq: Block| {
// Example transformation: replace all blockquotes with empty paragraphs
Block::Paragraph(vec![])
}));
let ast = parse_markdown(MarkdownParserState::with_config(config), input)?;
This mechanism allows you to override, filter, or completely redefine how each Markdown element is treated during parsing, giving you deep control over the resulting AST.
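To make the five behaviors concrete, here is a minimal, self-contained sketch of how such a policy could be applied to an element that a parser has just recognized. The Behavior enum, apply_behavior, shout, and twice are hypothetical stand-ins written for this illustration. They are not the crate's types, and the real parser may differ in detail. The sketch follows the doc comments above: Ignore reports "no match" so alternative parsers can run, while Skip matches but emits nothing.

```rust
/// Simplified stand-in for `ElementBehavior<ELT>` (hypothetical, illustration only).
enum Behavior<T> {
    Parse,
    Ignore,
    Skip,
    Map(fn(T) -> T),
    FlatMap(fn(T) -> Vec<T>),
}

/// Applies a behavior to an element that was just recognized.
///
/// Returns `None` when the element should count as "not matched", so that
/// alternative parsers get a chance; otherwise returns the elements to emit.
fn apply_behavior<T>(behavior: &Behavior<T>, element: T) -> Option<Vec<T>> {
    match behavior {
        Behavior::Parse => Some(vec![element]),
        Behavior::Ignore => None,
        Behavior::Skip => Some(Vec::new()),
        Behavior::Map(f) => Some(vec![f(element)]),
        Behavior::FlatMap(f) => Some(f(element)),
    }
}

/// Example 1-to-1 mapping: upper-case the element.
fn shout(s: String) -> String {
    s.to_uppercase()
}

/// Example 1-to-many mapping: emit the element twice.
fn twice(s: String) -> Vec<String> {
    vec![s.clone(), s]
}
```

In this sketch the difference between Ignore and Skip is visible in the return value: Ignore yields None, Skip yields an empty list.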
You can also register your own custom block-level or inline-level parsers by providing parser functions via configuration. These parsers are executed before the built-in ones and can be used to support additional syntax or override behavior.
To register a custom block parser:
use markdown_ppp::parser::config::*;
use markdown_ppp::ast::Block;
use std::{rc::Rc, cell::RefCell};
use nom::IResult;
let custom_block: CustomBlockParserFn = Rc::new(RefCell::new(Box::new(|input: &str| {
if input.starts_with("::note") {
let block = Block::Paragraph(vec!["This is a note block".into()]);
Ok(("", block))
} else {
Err(nom::Err::Error(nom::error::Error::new(input, nom::error::ErrorKind::Tag)))
}
})));
let config = MarkdownParserConfig::default()
.with_custom_block_parser(custom_block);
Similarly, to register a custom inline parser:
use markdown_ppp::parser::config::*;
use markdown_ppp::ast::Inline;
use std::{rc::Rc, cell::RefCell};
use nom::IResult;
let custom_inline: CustomInlineParserFn = Rc::new(RefCell::new(Box::new(|input: &str| {
if input.starts_with("@@") {
Ok((&input[2..], Inline::Text("custom-inline".into())))
} else {
Err(nom::Err::Error(nom::error::Error::new(input, nom::error::ErrorKind::Tag)))
}
})));
let config = config.with_custom_inline_parser(custom_inline);
This extensibility allows you to integrate domain-specific syntax and behaviors into the Markdown parser while reusing the base logic and AST structure provided by markdown-ppp.
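The "custom parsers run before the built-in ones" rule is an ordered choice: the first parser that succeeds wins. The sketch below shows that idea with plain function pointers instead of nom. ParserFn, custom_mention, builtin_text, and parse_one are hypothetical names for this illustration and do not mirror the crate's internals.

```rust
/// A parser takes the remaining input and, on success, returns the rest of
/// the input plus the parsed value (a plain `String` label in this sketch).
type ParserFn = for<'a> fn(&'a str) -> Option<(&'a str, String)>;

/// Hypothetical custom parser: recognizes a leading "@@" marker.
fn custom_mention(input: &str) -> Option<(&str, String)> {
    input
        .strip_prefix("@@")
        .map(|rest| (rest, "custom-inline".to_string()))
}

/// Hypothetical built-in parser: consumes one character as plain text.
fn builtin_text(input: &str) -> Option<(&str, String)> {
    let ch = input.chars().next()?;
    Some((&input[ch.len_utf8()..], format!("text:{}", ch)))
}

/// Tries custom parsers first, then built-in ones, and returns the first match.
fn parse_one<'a>(
    custom: &[ParserFn],
    builtin: &[ParserFn],
    input: &'a str,
) -> Option<(&'a str, String)> {
    custom
        .iter()
        .chain(builtin.iter())
        .find_map(|parser| parser(input))
}
```

Because the custom list is consulted first, a custom parser can shadow built-in syntax simply by matching the same prefix.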
The complete Markdown Abstract Syntax Tree (AST) is defined inside the module markdown_ppp::ast.
The Document struct represents the root node, and from there you can traverse the full tree of blocks and inlines, such as headings, paragraphs, lists, emphasis, and more.
You can use the AST independently without the parsing functionality by disabling default features.
The ast_specialized module provides pre-built specialized versions of the generic AST for element identification. This feature is disabled by default and must be enabled via the ast-specialized feature.
Enable the feature in your Cargo.toml:
[dependencies]
markdown-ppp = { version = "2.6.0", features = ["ast-specialized"] }
The module provides specialized data types for element identification:
ElementId - Unique identifier for AST elements
Pre-defined type aliases make it easy to work with specialized AST:
use markdown_ppp::ast_specialized::with_ids;
// AST with element IDs
let doc_with_ids: with_ids::Document = /* ... */;
Helper functions for common operations:
use markdown_ppp::ast_specialized::utilities::id_utils;
use markdown_ppp::parser::parse_markdown;
// Parse and add IDs to all elements
let doc = parse_markdown(state, "# Hello *world*!").unwrap();
let doc_with_ids = id_utils::add_ids_to_document(doc);
use markdown_ppp::ast_specialized::element_id::{ElementId, IdGenerator};
let mut id_gen = IdGenerator::new();
let id1 = id_gen.generate(); // ElementId(1)
let id2 = id_gen.generate(); // ElementId(2)
// Start from specific ID
let mut custom_gen = IdGenerator::starting_from(100);
let id = custom_gen.generate(); // ElementId(100)
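For readers who want to see the counting behavior in isolation, here is a minimal sketch of a sequential ID generator with the same observable behavior as the comments above (first ID is 1 by default, or the given start value). ToyId, ToyIdGenerator, and take_ids are hypothetical names for this illustration. The crate's own ElementId and IdGenerator may be implemented differently.

```rust
/// Hypothetical stand-in for an element identifier.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ToyId(u64);

/// Hands out consecutive IDs.
struct ToyIdGenerator {
    next: u64,
}

impl ToyIdGenerator {
    /// Starts counting at 1.
    fn new() -> Self {
        Self::starting_from(1)
    }

    /// Starts counting at `first`; the first generated ID equals `first`.
    fn starting_from(first: u64) -> Self {
        Self { next: first }
    }

    /// Returns the current ID and advances the counter.
    fn generate(&mut self) -> ToyId {
        let id = ToyId(self.next);
        self.next += 1;
        id
    }
}

/// Convenience helper: `count` consecutive IDs beginning at `first`.
fn take_ids(first: u64, count: usize) -> Vec<ToyId> {
    let mut generator = ToyIdGenerator::starting_from(first);
    (0..count).map(|_| generator.generate()).collect()
}
```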
This module is particularly useful for:
The ast_transform module provides a comprehensive toolkit for modifying, querying, and transforming parsed Markdown documents. It supports both 1-to-1 transformations (traditional) and 1-to-many expandable transformations for advanced use cases like content splitting, duplication, or expansion. Generic AST support allows preserving user-defined metadata during transformations. This feature is disabled by default and must be enabled via the ast-transform feature.
Enable the feature in your Cargo.toml:
[dependencies]
markdown-ppp = { version = "2.6.0", features = ["ast-transform"] }
Then use the transformation API:
use markdown_ppp::parser::parse_markdown;
use markdown_ppp::ast_transform::*;
let state = markdown_ppp::parser::MarkdownParserState::new();
let doc = parse_markdown(state, "# Hello *world*!").unwrap();
// Transform all text to uppercase
let doc = doc.transform_text(|text| text.to_uppercase());
// Remove empty elements and normalize whitespace
let doc = doc.remove_empty_text().normalize_whitespace();
The module provides several powerful patterns for working with AST:
use markdown_ppp::ast_transform::{Transform, FilterTransform};
let doc = doc
.transform_text(|text| text.trim().to_string())
.transform_image_urls(|url| format!("https://cdn.example.com{}", url))
.transform_link_urls(|url| url.replace("http://", "https://"))
.remove_empty_paragraphs()
.normalize_whitespace();
use markdown_ppp::ast::Inline;
use markdown_ppp::ast_transform::{Visitor, VisitWith};
struct LinkCollector {
links: Vec<String>,
}
impl Visitor for LinkCollector {
fn visit_inline(&mut self, inline: &Inline) {
if let Inline::Link(link) = inline {
self.links.push(link.destination.clone());
}
self.walk_inline(inline);
}
}
let mut collector = LinkCollector { links: Vec::new() };
doc.visit_with(&mut collector);
println!("Found {} links", collector.links.len());
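The visitor pattern used above can be tried without the crate. The following sketch uses a tiny hypothetical inline tree (Node) and a NodeVisitor trait with a default walk method, so that an implementation only overrides visit and calls walk to keep descending. All names here are assumptions made for the illustration. The crate's Visitor trait has more hooks and a richer AST.

```rust
/// Tiny stand-in inline tree (hypothetical; far smaller than the crate's `Inline`).
enum Node {
    Text(String),
    Link { destination: String, children: Vec<Node> },
    Emphasis(Vec<Node>),
}

/// Visitor with a default traversal: override `visit`, call `walk` to recurse.
trait NodeVisitor {
    fn visit(&mut self, node: &Node) {
        self.walk(node);
    }

    /// Default recursion into child nodes.
    fn walk(&mut self, node: &Node) {
        match node {
            Node::Text(_) => {}
            Node::Link { children, .. } | Node::Emphasis(children) => {
                for child in children {
                    self.visit(child);
                }
            }
        }
    }
}

/// Collects link destinations in document order.
struct LinkCollector {
    links: Vec<String>,
}

impl NodeVisitor for LinkCollector {
    fn visit(&mut self, node: &Node) {
        if let Node::Link { destination, .. } = node {
            self.links.push(destination.clone());
        }
        // Keep descending so links nested inside other nodes are found too.
        self.walk(node);
    }
}

/// Runs the collector over a tree and returns the destinations it found.
fn collect_links(root: &Node) -> Vec<String> {
    let mut collector = LinkCollector { links: Vec::new() };
    collector.visit(root);
    collector.links
}

/// Sample input: emphasis containing text and a link that wraps another link.
fn sample_tree() -> Node {
    Node::Emphasis(vec![
        Node::Text("see ".to_string()),
        Node::Link {
            destination: "https://a.example".to_string(),
            children: vec![Node::Link {
                destination: "https://b.example".to_string(),
                children: vec![],
            }],
        },
    ])
}
```

Forgetting the call to walk in an overridden visit would stop the traversal at that node, which is why the example in the text calls walk_inline after handling the link.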
use markdown_ppp::ast::{Block, Inline};
use markdown_ppp::ast_transform::Query;
// Find all autolinks
let autolinks = doc.find_all_inlines(|inline| {
matches!(inline, Inline::Autolink(_))
});
// Count code blocks
let code_count = doc.count_blocks(|block| {
matches!(block, Block::CodeBlock(_))
});
// Find first heading
let first_heading = doc.find_first_block(|block| {
matches!(block, Block::Heading(_))
});
use markdown_ppp::ast::Inline;
use markdown_ppp::ast_transform::{Transformer, TransformWith};
struct CodeHighlighter;
impl Transformer for CodeHighlighter {
fn transform_inline(&mut self, inline: Inline) -> Inline {
match inline {
Inline::Code(code) => {
// Add syntax highlighting classes
Inline::Html(format!("<code class=\"highlight\">{}</code>", code))
}
other => self.walk_transform_inline(other),
}
}
}
let doc = doc.transform_with(&mut CodeHighlighter);
use markdown_ppp::ast::{Block, Inline};
use markdown_ppp::ast_transform::{Transformer, ExpandWith};
struct ParagraphSplitter;
impl Transformer for ParagraphSplitter {
fn walk_expand_block(&mut self, block: Block) -> Vec<Block> {
match block {
Block::Paragraph(inlines) => {
// Split paragraphs at "SPLIT" markers
if let Some(split_pos) = inlines.iter().position(|inline| {
matches!(inline, Inline::Text(text) if text.contains("SPLIT"))
}) {
let (first_half, second_half) = inlines.split_at(split_pos);
let second_half = &second_half[1..]; // Skip SPLIT marker
vec![
Block::Paragraph(first_half.to_vec()),
Block::Paragraph(second_half.to_vec()),
]
} else {
vec![Block::Paragraph(inlines)]
}
}
other => vec![self.transform_block(other)],
}
}
}
// Use the expandable transformer
let mut transformer = ParagraphSplitter;
let expanded_docs = doc.expand_with(&mut transformer); // Returns Vec<Document>
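The core of a 1-to-many transformation is a step that takes one node and returns a list of nodes, followed by a flatten. The sketch below shows that shape on a toy block type. ToyBlock, split_at_marker, and expand_blocks are hypothetical names, and unlike the example above this version splits on every fragment that equals the marker and drops empty pieces. It is a simplified illustration, not the crate's ExpandWith implementation.

```rust
/// Toy block type (hypothetical): a paragraph is a list of text fragments.
#[derive(Debug, Clone, PartialEq)]
enum ToyBlock {
    Paragraph(Vec<String>),
    Rule,
}

/// 1-to-many step: one block in, zero or more blocks out.
fn split_at_marker(block: ToyBlock, marker: &str) -> Vec<ToyBlock> {
    match block {
        ToyBlock::Paragraph(parts) => parts
            // Cut the fragment list at every fragment equal to the marker.
            .split(|part| part.as_str() == marker)
            // Drop empty pieces, e.g. when the marker is first or last.
            .filter(|chunk| !chunk.is_empty())
            .map(|chunk| ToyBlock::Paragraph(chunk.to_vec()))
            .collect(),
        // Every other block passes through unchanged.
        other => vec![other],
    }
}

/// Applies the expanding step to every block and flattens the results.
fn expand_blocks(blocks: Vec<ToyBlock>, marker: &str) -> Vec<ToyBlock> {
    blocks
        .into_iter()
        .flat_map(|block| split_at_marker(block, marker))
        .collect()
}
```

Returning a Vec from the step is what makes removal (empty list), pass-through (one element), and splitting (several elements) expressible with a single signature.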
use markdown_ppp::ast::generic::*;
use markdown_ppp::ast_transform::{GenericTransformer, GenericExpandWith};
#[derive(Debug, Clone, Default)]
struct NodeId(u32);
struct IdAssigner {
next_id: u32,
}
impl IdAssigner {
fn next_id(&mut self) -> u32 {
let id = self.next_id;
self.next_id += 1;
id
}
}
impl GenericTransformer<NodeId> for IdAssigner {
fn walk_expand_block(&mut self, block: Block<NodeId>) -> Vec<Block<NodeId>> {
match block {
Block::Paragraph { content, .. } => {
vec![Block::Paragraph {
content: content.into_iter()
.flat_map(|inline| self.walk_expand_inline(inline))
.collect(),
user_data: NodeId(self.next_id()),
}]
}
other => vec![self.transform_block(other)],
}
}
fn walk_expand_inline(&mut self, inline: Inline<NodeId>) -> Vec<Inline<NodeId>> {
match inline {
Inline::Text { content, .. } => {
vec![Inline::Text {
content,
user_data: NodeId(self.next_id()),
}]
}
other => vec![self.transform_inline(other)],
}
}
}
// Transform generic AST while preserving and updating user data
let doc_with_ids: Document<NodeId> = /* ... */;
let mut transformer = IdAssigner { next_id: 1 };
let result = doc_with_ids.expand_with(&mut transformer);
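The idea of a generic AST that carries user data can be shown on a much smaller tree. In the sketch below, Tagged<T> is a hypothetical node type with a user_data slot, and assign_ids rebuilds a Tagged<()> tree as Tagged<u32> by numbering nodes in pre-order. None of these names come from the crate. They only illustrate how changing the type parameter lets a transformation attach metadata while keeping the structure.

```rust
/// Hypothetical generic node: `T` is the user-defined metadata type.
#[derive(Debug, PartialEq)]
struct Tagged<T> {
    text: String,
    user_data: T,
    children: Vec<Tagged<T>>,
}

/// Builds a childless node without metadata.
fn leaf(text: &str) -> Tagged<()> {
    Tagged { text: text.to_string(), user_data: (), children: vec![] }
}

/// Rebuilds the tree with numeric IDs, visiting parents before children.
fn assign_ids(node: Tagged<()>, next: &mut u32) -> Tagged<u32> {
    let id = *next;
    *next += 1;
    Tagged {
        text: node.text,
        user_data: id,
        children: node
            .children
            .into_iter()
            .map(|child| assign_ids(child, next))
            .collect(),
    }
}

/// Numbers a whole tree starting from 1.
fn number_tree(root: Tagged<()>) -> Tagged<u32> {
    let mut next = 1;
    assign_ids(root, &mut next)
}

/// Lists the IDs in pre-order, which is handy for checking the result.
fn ids_in_order(node: &Tagged<u32>) -> Vec<u32> {
    let mut out = vec![node.user_data];
    for child in &node.children {
        out.extend(ids_in_order(child));
    }
    out
}
```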
use markdown_ppp::ast_transform::TransformPipeline;
let result = TransformPipeline::new()
.transform_text(|s| s.trim().to_string())
.transform_image_urls(|url| format!("https://cdn.example.com{}", url))
.when(is_production, |pipeline| {
pipeline.transform_link_urls(|url| url.replace("localhost", "example.com"))
})
.normalize_whitespace()
.remove_empty_paragraphs()
.apply(doc);
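A builder-style pipeline with a conditional when step can be sketched in a few lines. The version below works on plain strings rather than an AST, and TextPipeline, step, and build_pipeline are hypothetical names chosen for this illustration. It shows only the general shape (steps stored in order, when extending the pipeline only if a flag is set) and is not how TransformPipeline is implemented.

```rust
/// Minimal pipeline of string transformations (hypothetical stand-in for the
/// idea behind a transform pipeline; it operates on text, not on an AST).
struct TextPipeline {
    steps: Vec<Box<dyn Fn(String) -> String>>,
}

impl TextPipeline {
    fn new() -> Self {
        Self { steps: Vec::new() }
    }

    /// Appends a step; steps run in insertion order.
    fn step(mut self, f: impl Fn(String) -> String + 'static) -> Self {
        self.steps.push(Box::new(f));
        self
    }

    /// Extends the pipeline only when `condition` holds.
    fn when(self, condition: bool, build: impl FnOnce(Self) -> Self) -> Self {
        if condition {
            build(self)
        } else {
            self
        }
    }

    /// Runs every step over the input, feeding each result into the next.
    fn apply(&self, input: &str) -> String {
        self.steps
            .iter()
            .fold(input.to_string(), |acc, step| step(acc))
    }
}

/// Example configuration: always trim, rewrite hosts only in production.
fn build_pipeline(is_production: bool) -> TextPipeline {
    TextPipeline::new()
        .step(|s| s.trim().to_string())
        .when(is_production, |p| {
            p.step(|s| s.replace("localhost", "example.com"))
        })
}
```

Because when receives and returns the pipeline itself, conditional steps stay inside the same method chain instead of forcing an if/else around the builder.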
Available transformations:
transform_text, transform_code, transform_html
transform_image_urls, transform_link_urls, transform_autolink_urls
remove_empty_paragraphs, remove_empty_text, filter_blocks
normalize_whitespace
transform_with, transform_if
expand_with (via ExpandWith trait)
GenericTransformer<T> and GenericExpandWith<T> traits
You can convert an AST (Document) back into a formatted Markdown string using the render_markdown function from the printer module.
This feature is enabled by default via the printer feature.
use markdown_ppp::printer::render_markdown;
use markdown_ppp::printer::config::Config;
use markdown_ppp::ast::Document;
// Assume you already have a parsed or constructed Document
let document = Document::default();
// Render it back to a Markdown string with default configuration
let markdown_output = render_markdown(&document, Config::default());
println!("{}", markdown_output);
This will format the Markdown with a default line width of 80 characters.
You can control the maximum width of lines in the generated Markdown by customizing the Config:
use markdown_ppp::printer::render_markdown;
use markdown_ppp::printer::config::Config;
use markdown_ppp::ast::Document;
// Set a custom maximum width, e.g., 120 characters
let config = Config::default().with_width(120);
let markdown_output = render_markdown(&Document::default(), config);
println!("{}", markdown_output);
This is useful if you want to control wrapping behavior or generate more compact or expanded Markdown documents.
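To get a feel for what a maximum line width means in practice, here is a plain greedy word-wrap sketch. It is a generic technique shown for intuition only. The function wrap_greedy is a hypothetical helper, and this text makes no claim about which layout algorithm the printer module actually uses.

```rust
/// Greedy word wrap: packs words onto a line until the next word would
/// push the line past `width` characters. A single word longer than
/// `width` is kept whole on its own line.
fn wrap_greedy(text: &str, width: usize) -> Vec<String> {
    let mut lines = Vec::new();
    let mut current = String::new();
    for word in text.split_whitespace() {
        // The `+ 1` accounts for the space that would join the word to the line.
        let needed = current.chars().count() + 1 + word.chars().count();
        if !current.is_empty() && needed > width {
            // Line is full: emit it and start a new one.
            lines.push(std::mem::take(&mut current));
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(word);
    }
    if !current.is_empty() {
        lines.push(current);
    }
    lines
}
```

With a larger width the same input yields fewer, longer lines, which is the effect the width setting has on rendered paragraphs.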
You can convert an AST (Document) into an HTML string using the render_html function from the html_printer module.
This feature is enabled by default via the html-printer feature.
use markdown_ppp::html_printer::render_html;
use markdown_ppp::html_printer::config::Config;
use markdown_ppp::ast::Document;
let config = Config::default();
let ast = markdown_ppp::parser::parse_markdown(
    markdown_ppp::parser::MarkdownParserState::default(),
    "# Hello, World!",
)
.unwrap();
println!("{}", render_html(&ast, config));
You can convert an AST (Document) into LaTeX format using the render_latex function from the latex_printer module.
This feature is disabled by default and must be enabled via the latex-printer feature.
use markdown_ppp::latex_printer::render_latex;
use markdown_ppp::latex_printer::config::Config;
use markdown_ppp::ast::*;
let doc = Document {
blocks: vec![
Block::Heading(Heading {
kind: HeadingKind::Atx(1),
content: vec![Inline::Text("Hello LaTeX".to_string())],
}),
Block::Paragraph(vec![
Inline::Text("This is ".to_string()),
Inline::Strong(vec![Inline::Text("bold".to_string())]),
Inline::Text(" text.".to_string()),
]),
],
};
let config = Config::default();
let latex_output = render_latex(&doc, config);
println!("{}", latex_output);
The LaTeX printer supports various configuration options for different output styles:
use markdown_ppp::latex_printer::config::{Config, TableStyle};
// Use booktabs for professional tables
let config = Config::default().with_table_style(TableStyle::Booktabs);
// Use longtabu for tables that span multiple pages
let config = Config::default().with_table_style(TableStyle::Longtabu);
use markdown_ppp::latex_printer::config::{Config, CodeBlockStyle};
// Use minted for syntax highlighting (requires minted package)
let config = Config::default().with_code_block_style(CodeBlockStyle::Minted);
// Use listings package for code blocks
let config = Config::default().with_code_block_style(CodeBlockStyle::Listings);
let config = Config::default().with_width(100);
let latex_output = render_latex(&doc, config);
| Feature | Description |
|---|---|
| parser | Enables Markdown parsing support. Enabled by default. |
| printer | Enables AST → Markdown string conversion. Enabled by default. |
| html-printer | Enables AST → HTML string conversion. Enabled by default. |
| latex-printer | Enables AST → LaTeX string conversion. Disabled by default. |
| ast-transform | Enables AST transformation, query, and visitor functionality. Disabled by default. |
| ast-specialized | Provides specialized AST types with element IDs. Disabled by default. |
| ast-serde | Adds Serialize and Deserialize traits to all AST types via serde. Disabled by default. |
If you only need the AST types without parsing functionality, you can add the crate without default features:
cargo add --no-default-features markdown-ppp
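If you want to disable Markdown generation (AST → Markdown string conversion), turn off the default features and re-enable only the ones you need. The command below keeps parser only, so the default html-printer feature is dropped along with printer: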
cargo add markdown-ppp --no-default-features --features parser
To enable LaTeX output support:
cargo add markdown-ppp --features latex-printer
Licensed under the MIT License.