| Crates.io | turbovault-parser |
| lib.rs | turbovault-parser |
| version | 1.2.6 |
| created_at | 2025-10-24 16:20:13.493205+00 |
| updated_at | 2025-12-16 18:24:09.455756+00 |
| description | Obsidian Flavored Markdown (OFM) parser |
| id | 1898720 |
| size | 219,602 |
Obsidian Flavored Markdown (OFM) parser for extracting structured data from Obsidian vault files.
This crate provides fast, production-ready parsing of Obsidian markdown files, extracting:
- Frontmatter: --- delimited blocks
- Wikilinks: [[Note]], [[folder/Note#Heading]], [[Note#^block]]
- Embeds: ![[Image.png]], ![[OtherNote]]
- Tags: #tag, #parent/child
- Tasks: - [ ] Todo, - [x] Done
- Callouts: > [!NOTE], > [!WARNING]+, etc.
- Headings: # H1 through ###### H6 with anchor generation
Built on battle-tested libraries (pulldown-cmark, nom, regex) with comprehensive test coverage.
The parser uses a multi-pass approach:
All parsing is zero-allocation where possible, returning structured types from turbovault-core.
Frontmatter (parsers/frontmatter_parser.rs)
Extracts YAML metadata from the beginning of files:
use turbovault_parser::parsers::extract_frontmatter;
let content = r#"---
title: My Note
tags: [rust, parser]
---
Content here"#;
let (frontmatter, rest) = extract_frontmatter(content)?;
assert_eq!(frontmatter, Some("title: My Note\ntags: [rust, parser]".to_string()));
assert_eq!(rest, "Content here");
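To illustrate the kind of split shown above, here is a minimal sketch using only the standard library. It is not the crate's implementation (the README mentions a regex-based extractor); the name `split_frontmatter` is hypothetical, and the sketch assumes LF line endings and an opening `---` on the very first line.

```rust
/// Hypothetical helper, not part of turbovault-parser.
/// Splits a leading `---` delimited block from the rest of a document.
/// Returns (frontmatter, rest); frontmatter is None when the document does
/// not start with a `---` line or the block is never closed.
fn split_frontmatter(content: &str) -> (Option<String>, &str) {
    // Assumption: the opening delimiter is the first line and uses LF endings.
    let Some(after_open) = content.strip_prefix("---\n") else {
        return (None, content);
    };
    // Walk line by line, tracking the byte offset, until the closing delimiter.
    let mut offset = 0;
    for line in after_open.split_inclusive('\n') {
        if line.trim_end() == "---" {
            let yaml = after_open[..offset].trim_end_matches('\n');
            let rest = &after_open[offset + line.len()..];
            return (Some(yaml.to_string()), rest);
        }
        offset += line.len();
    }
    // No closing delimiter: treat the whole input as body text.
    (None, content)
}
```

Returning the untouched input when the block is unclosed keeps the function total, so callers never lose content on malformed files.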
Features:
- Returns None if no frontmatter is present
- Recognizes the standard delimiter (---)
Wikilinks (parsers/wikilinks.rs)
Parses Obsidian's internal linking syntax:
use turbovault_parser::Parser;
use std::path::PathBuf;
let parser = Parser::new(PathBuf::from("/vault"));
let content = "See [[Note#Heading]] and [[folder/SubNote]]";
// Links are extracted during parse_file()
// Supports:
// - [[Note]] - Basic link
// - [[folder/Note]] - Folder paths
// - [[Note#Heading]] - Section links
// - [[Note#^block-id]] - Block references
// - [[Note|Display Text]] - Custom display text (not yet implemented)
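The syntax variants listed above can be decomposed with plain string operations. The following is a rough sketch under stated assumptions, not the crate's code: `WikiLinkParts` and `find_wikilinks` are hypothetical names, nested brackets are not handled, and the `|` display-text split is included only for illustration (the README marks it as not yet implemented in the crate).

```rust
/// Hypothetical type: the pieces of one wikilink such as `[[folder/Note#Heading|Text]]`.
#[derive(Debug, PartialEq)]
struct WikiLinkParts {
    target: String,
    /// Text after `#`, e.g. "Heading" or "^block-id".
    fragment: Option<String>,
    /// Text after `|`, if any.
    display: Option<String>,
}

/// Finds every `[[...]]` span in `text`, skipping `![[...]]` embeds.
fn find_wikilinks(text: &str) -> Vec<WikiLinkParts> {
    let mut links = Vec::new();
    let mut pos = 0;
    while let Some(open_rel) = text[pos..].find("[[") {
        let open = pos + open_rel;
        let inner_start = open + 2;
        let Some(close_rel) = text[inner_start..].find("]]") else {
            break; // unterminated link: stop scanning
        };
        let inner = &text[inner_start..inner_start + close_rel];
        pos = inner_start + close_rel + 2;
        // A `!` directly before `[[` marks an embed, which is not a plain link.
        if text[..open].ends_with('!') {
            continue;
        }
        // Split off the optional display text, then the optional fragment.
        let (link, display) = match inner.split_once('|') {
            Some((l, d)) => (l, Some(d.trim().to_string())),
            None => (inner, None),
        };
        let (target, fragment) = match link.split_once('#') {
            Some((t, f)) => (t, Some(f.trim().to_string())),
            None => (link, None),
        };
        links.push(WikiLinkParts {
            target: target.trim().to_string(),
            fragment,
            display,
        });
    }
    links
}
```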
Features:
- Embeds are distinguished by their prefix (![[...]])
The parser classifies links into the following types:
use turbovault_core::LinkType;
// WikiLink - basic wikilink: [[Note]]
// HeadingRef - cross-file heading: [[Note#Heading]] or file.md#section
// BlockRef - block reference: [[Note#^blockid]] or #^blockid
// Anchor - same-document anchor: [[#Heading]] or #section
// Embed - embedded content: ![[Note]]
// MarkdownLink - markdown link to file: [text](./file.md)
// ExternalLink - external URL: [text](https://...)
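One way to read the type list above is as a small decision procedure over the raw link text. The sketch below is an assumption about how such rules could be ordered, not the crate's actual logic; `LinkKind`, `classify`, and `classify_target` are hypothetical names that mirror the variants of `LinkType` without depending on turbovault-core.

```rust
/// Hypothetical stand-in for turbovault_core::LinkType.
#[derive(Debug, PartialEq)]
enum LinkKind {
    WikiLink,
    HeadingRef,
    BlockRef,
    Anchor,
    Embed,
    MarkdownLink,
    ExternalLink,
}

/// Classifies a link target by its `#` fragment, falling back to `default`.
fn classify_target(target: &str, default: LinkKind) -> LinkKind {
    if target.contains("#^") {
        LinkKind::BlockRef // block reference, same file or cross file
    } else if target.starts_with('#') {
        LinkKind::Anchor // fragment only: same-document anchor
    } else if target.contains('#') {
        LinkKind::HeadingRef // file plus heading
    } else {
        default
    }
}

/// Classifies one raw link written as `![[..]]`, `[[..]]`, or `[text](url)`.
/// Returns None when the input matches none of these shapes.
fn classify(raw: &str) -> Option<LinkKind> {
    if raw.starts_with("![[") && raw.ends_with("]]") {
        return Some(LinkKind::Embed);
    }
    if let Some(inner) = raw.strip_prefix("[[").and_then(|s| s.strip_suffix("]]")) {
        return Some(classify_target(inner, LinkKind::WikiLink));
    }
    // Markdown link: take the part between `](` and the closing `)`.
    let url = raw.strip_prefix('[')?.split_once("](")?.1.strip_suffix(')')?;
    if url.starts_with("http://") || url.starts_with("https://") {
        Some(LinkKind::ExternalLink)
    } else {
        Some(classify_target(url, LinkKind::MarkdownLink))
    }
}
```

Checking for `#^` before the leading `#` is what lets a same-document block reference such as `[[#^blockid]]` resolve to a block reference rather than an anchor.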
Detection examples:
- [[Note]] → WikiLink
- [[Note#Heading]] → HeadingRef
- [[Note#^blockid]] → BlockRef
- [[#Heading]] → Anchor (same-document heading)
- [[#^blockid]] → BlockRef (same-document block)
- [text](#section) → Anchor
- [text](file.md#section) → HeadingRef
- [text](note.md#^block) → BlockRef
Embeds (parsers/embeds.rs)
Extracts embedded files and notes:
// Parses: ![[Image.png]], ![[attachments/diagram.svg]], ![[OtherNote]]
let content = "See diagram: ![[assets/architecture.png]]";
// Extracted as Link with type_ = LinkType::Embed
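Since embeds can point at either images or other notes, a consumer may want to tell the two apart by file extension. This is a small sketch of that idea, assuming a fixed extension list; `is_image_embed` and the list itself are hypothetical and not taken from the crate.

```rust
use std::path::Path;

/// Hypothetical helper: true when an embed target looks like an image file.
/// The extension list is an assumption chosen for illustration.
fn is_image_embed(target: &str) -> bool {
    const IMAGE_EXTS: [&str; 6] = ["png", "jpg", "jpeg", "gif", "svg", "webp"];
    Path::new(target)
        .extension()
        .and_then(|ext| ext.to_str())
        // Compare case-insensitively so `Image.PNG` is also recognized.
        .map_or(false, |ext| {
            IMAGE_EXTS.iter().any(|known| ext.eq_ignore_ascii_case(known))
        })
}
```

A target without an extension, such as `OtherNote`, is treated as a note embed.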
Features:
- Image embeds (.png, .jpg, .svg, etc.)
Tags (parsers/tags.rs)
Extracts hashtags and nested tags:
let content = "This is #rust and #project/obsidian code";
// Extracts:
// - Tag { name: "rust", is_nested: false }
// - Tag { name: "project/obsidian", is_nested: true }
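The extraction described above can be approximated by scanning whitespace-separated words. This is a simplified sketch, not the crate's regex-based parser: the local `Tag` struct only mirrors the two fields shown in the comment, `extract_tags` is a hypothetical name, and the allowed tag characters (alphanumerics, `_`, `-`, `/`) are an assumption.

```rust
/// Local stand-in with the two fields shown in the example above.
#[derive(Debug, PartialEq)]
struct Tag {
    name: String,
    is_nested: bool,
}

/// Collects `#tag` and `#parent/child` style tags from `text`.
fn extract_tags(text: &str) -> Vec<Tag> {
    let mut tags = Vec::new();
    for word in text.split_whitespace() {
        // Only words that begin with `#` can be tags; `# Heading` yields an empty body.
        let Some(body) = word.strip_prefix('#') else {
            continue;
        };
        // Keep the leading run of tag characters, dropping trailing punctuation.
        let name: String = body
            .chars()
            .take_while(|&c| c.is_alphanumeric() || matches!(c, '_' | '-' | '/'))
            .collect();
        let name = name.trim_end_matches('/').to_string();
        // Skip empty names and purely numeric ones such as `#123`.
        if name.is_empty() || name.chars().all(|c| c.is_ascii_digit()) {
            continue;
        }
        let is_nested = name.contains('/');
        tags.push(Tag { name, is_nested });
    }
    tags
}
```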
Features:
- Simple tags: #rust
- Nested tags: #parent/child/grandchild
Tasks (parsers/tasks.rs)
Parses task list items with completion status:
let content = r#"
- [ ] Write documentation
- [x] Implement parser
  - [ ] Nested task
"#;
// Extracts TaskItem with:
// - content: "Write documentation"
// - is_completed: false
// - position: SourcePosition
// - due_date: None (TODO)
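A line-oriented scan is enough to show how the fields above could be filled in. The sketch below is illustrative only: `Task` and `extract_tasks` are hypothetical, and a 1-based line number plus an indent width stand in for the crate's `SourcePosition`.

```rust
/// Simplified stand-in for the TaskItem shown above.
#[derive(Debug, PartialEq)]
struct Task {
    content: String,
    is_completed: bool,
    /// 1-based line number (a simplification of SourcePosition).
    line: usize,
    /// Number of leading whitespace bytes, useful for detecting nesting.
    indent: usize,
}

/// Extracts `- [ ]` and `- [x]` items, one per line.
fn extract_tasks(text: &str) -> Vec<Task> {
    text.lines()
        .enumerate()
        .filter_map(|(idx, line)| {
            let trimmed = line.trim_start();
            let indent = line.len() - trimmed.len();
            let rest = trimmed.strip_prefix("- [")?;
            // The single character inside the brackets is the status marker.
            let mut chars = rest.chars();
            let mark = chars.next()?;
            let content = chars.as_str().strip_prefix("] ")?;
            let is_completed = match mark {
                ' ' => false,
                'x' | 'X' => true,
                _ => return None, // other markers are ignored in this sketch
            };
            Some(Task {
                content: content.trim().to_string(),
                is_completed,
                line: idx + 1,
                indent,
            })
        })
        .collect()
}
```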
Features:
- Incomplete tasks: - [ ]
- Completed tasks: - [x]
Callouts (parsers/callouts.rs)
Parses Obsidian's callout/admonition syntax:
let content = r#"
> [!NOTE] Important info
> This is the callout content
> [!WARNING]- Expandable warning
> Hidden by default
"#;
// Supported types:
// NOTE, TIP, INFO, TODO, IMPORTANT, SUCCESS, QUESTION,
// WARNING, FAILURE, DANGER, BUG, EXAMPLE, QUOTE
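The header line of a callout carries the type, an optional fold marker, and a title. As a hedged sketch of how that line could be taken apart (the crate's `CalloutType` enum is replaced here by an uppercase string, and `Fold`, `CalloutHeader`, and `parse_callout_header` are hypothetical names):

```rust
/// Fold state taken from the marker after `[!TYPE]`.
#[derive(Debug, PartialEq)]
enum Fold {
    Static,    // no marker
    Expanded,  // `+`
    Collapsed, // `-`
}

/// Hypothetical result type for one callout header line.
#[derive(Debug, PartialEq)]
struct CalloutHeader {
    kind: String,
    fold: Fold,
    title: String,
}

/// Parses a line such as `> [!WARNING]- Expandable warning`.
/// Returns None for ordinary blockquote lines.
fn parse_callout_header(line: &str) -> Option<CalloutHeader> {
    let rest = line
        .trim_start()
        .strip_prefix('>')?
        .trim_start()
        .strip_prefix("[!")?;
    let (kind, after) = rest.split_once(']')?;
    if kind.is_empty() {
        return None;
    }
    // An optional `+` or `-` directly after `]` sets the fold state.
    let (fold, title) = if let Some(t) = after.strip_prefix('+') {
        (Fold::Expanded, t)
    } else if let Some(t) = after.strip_prefix('-') {
        (Fold::Collapsed, t)
    } else {
        (Fold::Static, after)
    };
    Some(CalloutHeader {
        kind: kind.to_ascii_uppercase(),
        fold,
        title: title.trim().to_string(),
    })
}
```

Lines that follow the header and start with `>` would then be collected as the callout body until the blockquote ends.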
Features:
- Callout types (CalloutType enum)
- Fold markers: [!TYPE]+ (expanded) or [!TYPE]- (collapsed)
Headings (parsers/headings.rs)
Extracts markdown headings with automatic anchor generation:
let content = "# Main Title\n## Sub Section";
// Extracts:
// - Heading { text: "Main Title", level: 1, anchor: Some("main-title") }
// - Heading { text: "Sub Section", level: 2, anchor: Some("sub-section") }
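The anchors in the example above are consistent with a lowercase, hyphen-separated slug. The sketch below shows one such rule as an assumption; the crate's exact slug rules are not documented here, and `Heading`, `make_anchor`, and `parse_heading` are local, hypothetical names.

```rust
/// Local stand-in for the heading shape shown above.
#[derive(Debug, PartialEq)]
struct Heading {
    text: String,
    level: u8,
    anchor: String,
}

/// Lowercases the text and collapses every run of non-alphanumeric
/// characters into a single hyphen (an assumed slug rule).
fn make_anchor(text: &str) -> String {
    let mut slug = String::new();
    for c in text.trim().chars() {
        if c.is_alphanumeric() {
            slug.extend(c.to_lowercase());
        } else if !slug.is_empty() && !slug.ends_with('-') {
            slug.push('-');
        }
    }
    slug.trim_end_matches('-').to_string()
}

/// Parses an ATX heading line (`#` through `######` followed by a space).
fn parse_heading(line: &str) -> Option<Heading> {
    let level = line.chars().take_while(|&c| c == '#').count();
    if !(1..=6).contains(&level) {
        return None;
    }
    // Requiring a space keeps tags such as `#rust` from being read as headings.
    let text = line[level..].strip_prefix(' ')?.trim();
    if text.is_empty() {
        return None;
    }
    Some(Heading {
        text: text.to_string(),
        level: level as u8,
        anchor: make_anchor(text),
    })
}
```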
Features:
- Heading levels 1 to 6 (# through ######)
use turbovault_parser::Parser;
use serde_json::json;
use std::path::PathBuf;
let parser = Parser::new(PathBuf::from("/vault"));
let content = r#"---
title: Project Plan
tags: [project, planning]
---
# Overview
This project uses #rust and #obsidian.
## Tasks
- [ ] Set up repository
- [x] Create parser
See [[Architecture]] for details.
![[diagram.png]]
"#;
let vault_file = parser.parse_file(&PathBuf::from("Plan.md"), content)?;
// Access parsed elements:
assert_eq!(vault_file.frontmatter.unwrap().data.get("title"), Some(&json!("Project Plan")));
assert_eq!(vault_file.headings.len(), 2);
assert_eq!(vault_file.tags.len(), 2);
assert_eq!(vault_file.tasks.len(), 2);
assert_eq!(vault_file.links.len(), 2); // 1 wikilink + 1 embed
use turbovault_parser::Parser;
use turbovault_core::LinkType;
use std::path::PathBuf;
let parser = Parser::new(PathBuf::from("/vault"));
let content = "See [[Note A]] and [[Note B#Section]]";
let file = parser.parse_file(&PathBuf::from("source.md"), content)?;
for link in file.links {
    match link.type_ {
        LinkType::WikiLink => println!("Link to: {}", link.target),
        LinkType::Embed => println!("Embeds: {}", link.target),
        _ => {}
    }
}
let parser = Parser::new(PathBuf::from("/vault"));
let content = r#"
# Daily Tasks
- [x] Morning standup
- [ ] Code review
- [ ] Write tests
"#;
let file = parser.parse_file(&PathBuf::from("daily.md"), content)?;
let completed = file.tasks.iter().filter(|t| t.is_completed).count();
let total = file.tasks.len();
println!("Progress: {}/{} tasks completed", completed, total);
This parser handles Obsidian's unique markdown extensions:
- Embeds: ! prefix for transclusion
- Block references: #^block-id syntax for linking to specific blocks
- Nested tags: / separated hierarchical tags
These features go beyond standard CommonMark and are essential for Obsidian vault operations.
This crate depends on turbovault-core for:
- Data models: VaultFile, Link, Tag, TaskItem, Callout, Heading, Frontmatter
- Result<T, Error> for consistent error handling
- VaultConfig for vault-specific parsing settings
Re-exported Types: For convenience, this crate re-exports commonly used types from turbovault-core:
// No need to depend on turbovault-core separately
use turbovault_parser::{ContentBlock, InlineElement, LinkType, ListItem, TableAlignment};
use turbovault_parser::{LineIndex, SourcePosition};
The parser produces structured data that can be consumed by:
- turbovault-vault: Vault management and indexing
- turbovault-server: MCP server tool implementations
# All tests
cargo test
# Specific parser tests
cargo test --test frontmatter_parser
cargo test --test wikilinks
# With output
cargo test -- --nocapture
Adding a new parser:
- Create src/parsers/my_parser.rs
- Return the parsed items as Vec<MyType>
- Add the new type to turbovault-core/src/models.rs
- Call the new parser from Parser::parse_content() in src/parsers.rs
Implementation notes:
- (.md extension)
- lazy_static for one-time regex compilation
Dependencies:
- pulldown-cmark: CommonMark parsing foundation (currently unused, reserved for future)
- frontmatter-gen: Frontmatter extraction (currently using custom regex)
- regex: Pattern matching for Obsidian syntax
- nom: Parser combinators (available for complex parsing)
- serde, serde_yaml, serde_json: Frontmatter deserialization
Current limitations:
- [[Target|Display]] not yet parsed
Planned improvements:
See workspace license.
- turbovault-core: Core data models and types
- turbovault-vault: Vault management using parsed data
- turbovault-server: MCP server tools built on parser output