rtf-parser-tt

Crates.io: rtf-parser-tt
lib.rs: rtf-parser-tt
version: 0.5.0
created_at: 2025-12-17 16:53:54.436501+00
updated_at: 2025-12-17 16:53:54.436501+00
description: RTF parser with special character support (emdash, smart quotes, etc). Fork of rtf-parser.
repository: https://github.com/TwelveTake-Studios/rtf-parser-tt
id: 1990702
size: 85,244
(david-cilluffo)

documentation

https://docs.rs/rtf-parser-tt

README

rtf-parser-tt

RTF parser with special character support. Fork of rtf-parser by @d0rianb.

Why This Fork?

The upstream rtf-parser crate silently drops special character control words such as \emdash, \endash, and the smart-quote words (\lquote, \rquote, \ldblquote, \rdblquote). This causes data loss when parsing RTF produced by applications such as Scrivener and Microsoft Word.

This fork adds support for these characters.

Installation

[dependencies]
rtf-parser-tt = "0.5"

What's Added

Control Word    Unicode    Character
\emdash         U+2014     —
\endash         U+2013     –
\bullet         U+2022     •
\lquote         U+2018     ‘
\rquote         U+2019     ’
\ldblquote      U+201C     “
\rdblquote      U+201D     ”
\tab            U+0009     (tab)
\line           U+000A     (newline)
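
The table above can be read as a simple lookup from control word to character. The following is a minimal sketch of that lookup, not code taken from the crate: the function name special_char is hypothetical, and the crate's internal representation may differ.

```rust
/// Hypothetical helper (not part of the crate's API): maps an RTF control
/// word, written without its leading backslash, to the character listed
/// in the table above. Unknown control words yield `None`.
fn special_char(control_word: &str) -> Option<char> {
    match control_word {
        "emdash" => Some('\u{2014}'),
        "endash" => Some('\u{2013}'),
        "bullet" => Some('\u{2022}'),
        "lquote" => Some('\u{2018}'),
        "rquote" => Some('\u{2019}'),
        "ldblquote" => Some('\u{201C}'),
        "rdblquote" => Some('\u{201D}'),
        "tab" => Some('\t'),
        "line" => Some('\n'),
        _ => None,
    }
}

fn main() {
    // A formatting control word such as \b (bold) has no character mapping.
    assert_eq!(special_char("b"), None);
    assert_eq!(special_char("emdash"), Some('\u{2014}'));
    println!("{:?}", special_char("rquote"));
}
```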

Usage

use rtf_parser_tt::RtfDocument;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let rtf = r#"{\rtf1\ansi Hello\emdash world}"#;
    let doc = RtfDocument::try_from(rtf)?;
    let text = doc.to_text();
    assert!(text.contains("—")); // Em-dash preserved!
    Ok(())
}

API

The API is identical to rtf-parser. See the original documentation for full details.

Key types:

  • RtfDocument - Parsed RTF document
  • Lexer - Tokenizes RTF input
  • Parser - Converts tokens to document
  • StyleBlock - Text with formatting info
  • Painter - Text style (bold, italic, etc.)
  • Paragraph - Layout info (alignment, spacing)
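
To illustrate how a lexer stage and a parser stage divide the work, here is a toy two-stage pipeline written against the standard library only. It is a sketch under stated assumptions and does not use or reproduce the crate's types: Token, scan, and to_text below are hypothetical names, and the toy lexer ignores hex escapes, escaped backslashes, and negative parameters.

```rust
/// Toy token type for illustration; the crate's own token type is richer.
#[derive(Debug, PartialEq)]
enum Token {
    ControlWord(String),
    Text(String),
}

/// Stage 1 (lexer): split RTF input into control words and plain text.
fn scan(input: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut text = String::new();
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            '\\' => {
                if !text.is_empty() {
                    tokens.push(Token::Text(std::mem::take(&mut text)));
                }
                // Read the alphabetic name of the control word.
                let mut word = String::new();
                while let Some(&next) = chars.peek() {
                    if !next.is_ascii_alphabetic() {
                        break;
                    }
                    word.push(next);
                    chars.next();
                }
                // Skip an optional numeric parameter, e.g. the 1 in \rtf1.
                while chars.peek().map_or(false, |d| d.is_ascii_digit()) {
                    chars.next();
                }
                // One space after a control word is a delimiter, not text.
                if chars.peek() == Some(&' ') {
                    chars.next();
                }
                tokens.push(Token::ControlWord(word));
            }
            // Group braces end any pending text run and are otherwise ignored.
            '{' | '}' => {
                if !text.is_empty() {
                    tokens.push(Token::Text(std::mem::take(&mut text)));
                }
            }
            _ => text.push(c),
        }
    }
    if !text.is_empty() {
        tokens.push(Token::Text(text));
    }
    tokens
}

/// Stage 2 (parser, reduced to text extraction): `lookup` decides which
/// control words produce a character; all others are dropped.
fn to_text(tokens: &[Token], lookup: impl Fn(&str) -> Option<char>) -> String {
    let mut out = String::new();
    for token in tokens {
        match token {
            Token::Text(t) => out.push_str(t),
            Token::ControlWord(w) => {
                if let Some(ch) = lookup(w.as_str()) {
                    out.push(ch);
                }
            }
        }
    }
    out
}

fn main() {
    let tokens = scan(r"{\rtf1\ansi Hello\emdash world}");
    // With no mapping at all, the em dash is silently lost.
    assert_eq!(to_text(&tokens, |_| None), "Helloworld");
    // With a mapping for \emdash, the character survives.
    let with_dash = to_text(&tokens, |w| (w == "emdash").then_some('\u{2014}'));
    assert_eq!(with_dash, "Hello\u{2014}world");
}
```

Passing the lookup as a parameter keeps the lexer unchanged: in this toy model, supporting a new special character only means extending the mapping used in the second stage.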

Upstream Status

A PR has been submitted to the upstream rtf-parser repository. Once merged, this fork may be deprecated in favor of the upstream version.

License

MIT License - same as upstream.
