yamp

version: 0.1.0
description: Yet Another Minimal Parser - A safe, predictable YAML parser that treats all values as strings
repository: https://github.com/sanjeevprasad/yamp
documentation: https://docs.rs/yamp
owner: sanjeevprasad
created_at: 2025-09-17 06:20:51.046503+00
updated_at: 2025-09-17 06:20:51.046503+00

YAMP - Yet Another Minimal Parser

A lightweight, efficient YAML parser written in Rust that prioritizes comment preservation, simplicity, and avoiding common YAML pitfalls.

Philosophy

YAMP takes a radically simplified approach to YAML parsing with two core principles:

1. Comment Preservation First

Unlike most YAML parsers that discard comments, YAMP treats comments as first-class citizens. Every comment in your YAML file is preserved during parsing and re-emitted exactly where it belongs. This makes YAMP ideal for configuration files that need human-readable documentation.

2. Simplicity Through String-Only Values

  • All scalar values are strings - no implicit type conversion
  • No type confusion - values like NO, 3.10, 0755 are always strings
  • Predictable behavior - what you write is what you get

Features

  • ✅ Full comment preservation - both inline and standalone comments
  • ✅ Comments survive round-trips - parse, modify, and emit without losing documentation
  • ✅ All scalar values treated as strings (no type guessing)
  • ✅ Basic YAML structures (key-value pairs, arrays, nested objects)
  • ✅ Indentation-based structure parsing
  • ✅ Quoted and unquoted strings
  • ✅ Multiline strings (literal | and folded >)
  • ✅ Simple, clean API
  • ✅ Zero dependencies
  • ✅ Avoids YAML's implicit-typing "footguns" by design

What's Supported

  • Simple key-value pairs (values are always strings)
  • Nested objects (maps)
  • Arrays (sequences)
  • Comments (preserved during parsing and emitting)
  • Both quoted and unquoted strings
  • Multiline strings with literal (|) and folded (>) styles
  • Chomping modes for multiline strings (strip -, clip default, keep +)
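
The three chomping modes differ only in how they treat the line breaks at the end of a block scalar. The sketch below illustrates the rule as described in the YAML specification; the names `Chomp` and `chomp` are hypothetical and are not part of YAMP's API, and YAMP's own implementation may differ.

```rust
/// Chomping indicator on a block scalar header (`|-`, `|`, `|+`).
/// Hypothetical type for illustration only.
#[derive(Clone, Copy)]
enum Chomp {
    Strip, // `-`: remove every trailing line break
    Clip,  // default: keep at most one trailing line break
    Keep,  // `+`: keep all trailing line breaks
}

/// Apply a chomping mode to the raw text of a block scalar.
fn chomp(raw: &str, mode: Chomp) -> String {
    let body = raw.trim_end_matches('\n');
    let had_break = raw.len() > body.len();
    match mode {
        Chomp::Strip => body.to_string(),
        // Clip keeps a single final line break, but only if there is content.
        Chomp::Clip if had_break && !body.is_empty() => format!("{}\n", body),
        Chomp::Clip => body.to_string(),
        Chomp::Keep => raw.to_string(),
    }
}

fn main() {
    let raw = "line one\nline two\n\n\n";
    assert_eq!(chomp(raw, Chomp::Strip), "line one\nline two");
    assert_eq!(chomp(raw, Chomp::Clip), "line one\nline two\n");
    assert_eq!(chomp(raw, Chomp::Keep), "line one\nline two\n\n\n");
}
```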

What's NOT Supported

  • Multi-document YAML files (---)
  • Anchors and aliases (&, *)
  • Tags (!!str, !!int, etc.)
  • Flow style collections ({}, [])
  • Complex key types
  • Merge keys (<<)
  • Any form of implicit typing - by design!

Usage

Comment Preservation

YAMP's killer feature is preserving comments through parse/emit cycles:

use yamp::{parse, emit};

fn main() {
    let yaml = r#"# Application configuration
name: MyApp        # Application name
version: 1.2.3     # Semantic version

# Server settings
server:
  host: localhost  # Bind address
  port: 8080       # Listen port
"#;

    // Parse the YAML - all comments are preserved
    let parsed = parse(yaml).expect("Failed to parse");

    // Emit it back - comments remain intact!
    let output = emit(&parsed);
    println!("{}", output);
    // Output includes all the original comments
}
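
To give a feel for what preserving an inline comment involves, here is a minimal sketch that splits one line into its content and its trailing comment. It is not YAMP's implementation: `split_inline_comment` is a hypothetical helper, and it makes simplifying assumptions (a `#` opens a comment only at the start of a line or after whitespace, quotes are tracked naively, and escaped quotes or apostrophes inside unquoted values are not handled).

```rust
/// Split a line into (content, inline comment). Hypothetical helper for
/// illustration; a real parser needs more careful quote handling.
fn split_inline_comment(line: &str) -> (&str, Option<&str>) {
    let mut quote: Option<char> = None;
    // A `#` at the very start of the line also opens a comment.
    let mut prev_is_space = true;
    for (i, ch) in line.char_indices() {
        match quote {
            Some(q) if ch == q => quote = None, // closing quote
            Some(_) => {}                       // inside a quoted string
            None if ch == '"' || ch == '\'' => quote = Some(ch),
            None if ch == '#' && prev_is_space => {
                return (line[..i].trim_end(), Some(line[i + 1..].trim()));
            }
            None => {}
        }
        prev_is_space = ch.is_whitespace();
    }
    (line.trim_end(), None)
}

fn main() {
    let (content, comment) = split_inline_comment("port: 8080       # Listen port");
    assert_eq!(content, "port: 8080");
    assert_eq!(comment, Some("Listen port"));

    // A `#` with no whitespace before it is part of the value.
    assert_eq!(
        split_inline_comment("url: http://example.com/#top"),
        ("url: http://example.com/#top", None)
    );
}
```

A parser that stores the second element alongside each node, instead of discarding it, has what it needs to write the comment back out when emitting.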

String-Only Values

All values are treated as strings, avoiding YAML's type confusion:

use yamp::{parse, emit, YamlValue};
use std::borrow::Cow;

fn main() {
    let yaml = r#"
name: John Doe
age: 30
active: true
version: 3.10
country: NO
permissions: 0755
"#;

    // Parse YAML
    let parsed = parse(yaml).expect("Failed to parse YAML");

    // Easy access with helper methods
    assert_eq!(parsed.get("age").and_then(|n| n.as_str()), Some("30"));
    assert_eq!(parsed.get("active").and_then(|n| n.as_str()), Some("true"));
    assert_eq!(parsed.get("version").and_then(|n| n.as_str()), Some("3.10")); // Not 3.1!
    assert_eq!(parsed.get("country").and_then(|n| n.as_str()), Some("NO")); // Not false!
    assert_eq!(parsed.get("permissions").and_then(|n| n.as_str()), Some("0755")); // Not 493!

    // Or using traditional approach for more control
    if let YamlValue::Object(map) = &parsed.value {
        let age = &map.get(&Cow::Borrowed("age")).unwrap().value;
        assert_eq!(age, &YamlValue::String(Cow::Borrowed("30")));
    }

    // Emit back to YAML
    let output = emit(&parsed);
    println!("{}", output);
}

Multiline Strings

YAMP supports YAML multiline strings while maintaining the all-strings philosophy:

use yamp::{parse, YamlValue};
use std::borrow::Cow;

fn main() {
    let yaml = r#"
description: |
  This is a multiline string
  that preserves line breaks.

summary: >
  This is a folded multiline string
  that joins lines with spaces.
"#;

    let parsed = parse(yaml).expect("Failed to parse YAML");

    if let YamlValue::Object(map) = &parsed.value {
        // Literal style (|) preserves line breaks
        let desc = &map.get(&Cow::Borrowed("description")).unwrap().value;
        if let YamlValue::String(s) = desc {
            assert!(s.contains("\n"));
        }

        // Folded style (>) joins lines with spaces
        let summary = &map.get(&Cow::Borrowed("summary")).unwrap().value;
        if let YamlValue::String(s) = summary {
            // Default (clip) chomping may leave one trailing newline, so trim before checking
            assert!(!s.trim_end().contains("\n"));
            assert!(s.contains("string that joins"));
        }
    }
}
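
The difference between the two styles comes down to how line breaks inside the block are treated. As a rough sketch of folding (not YAMP's code; `fold` is a hypothetical helper), a single line break between two text lines becomes a space, while blank lines become newlines. The sketch ignores more-indented lines and leading or trailing blank lines, which the YAML specification handles separately.

```rust
/// Fold the already-dedented lines of a `>` block scalar.
/// Hypothetical helper for illustration only.
fn fold(lines: &[&str]) -> String {
    let mut out = String::new();
    let mut pending_blank = 0usize;
    for line in lines {
        if line.is_empty() {
            pending_blank += 1;
            continue;
        }
        if !out.is_empty() {
            if pending_blank == 0 {
                // A single line break between text lines folds into a space.
                out.push(' ');
            } else {
                // Each blank line between text lines becomes one newline.
                out.extend(std::iter::repeat('\n').take(pending_blank));
            }
        }
        out.push_str(line);
        pending_blank = 0;
    }
    out
}

fn main() {
    let folded = fold(&[
        "This is a folded multiline string",
        "that joins lines with spaces.",
    ]);
    assert_eq!(
        folded,
        "This is a folded multiline string that joins lines with spaces."
    );
    assert_eq!(fold(&["first paragraph", "", "second paragraph"]),
               "first paragraph\nsecond paragraph");
}
```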

Why No Type System?

YAML's implicit typing, especially under the YAML 1.1 rules that many parsers still follow, leads to surprising behaviors, some of them with security consequences:

The Problems We Avoid

  1. The Norway Problem: Country code NO becomes false
  2. Version Numbers: 3.10 becomes 3.1
  3. Octal Confusion: 0755 becomes 493
  4. Scientific Notation: a hex-like ID such as 2e5 is read as the float 200000 under the YAML 1.2 core schema
  5. Boolean Chaos: in YAML 1.1, yes, on, y, true, True, TRUE all mean true
  6. Null Confusion: a bare ~ (written, say, to mean the home directory) becomes null
  7. Time Values: 12:34:56 becomes 45296 (base-60)
  8. Special Floats: .inf, .nan cause portability issues
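
The numbers quoted above are easy to check by hand. The sketch below reproduces the arithmetic behind three of them using only the Rust standard library; `octal_value` and `sexagesimal_value` are hypothetical helper names that mimic what a YAML 1.1 resolver does, not code from YAMP or from any particular parser.

```rust
/// Value of a leading-zero integer such as `0755` when read as octal.
fn octal_value(text: &str) -> Option<i64> {
    let digits = text.strip_prefix('0')?;
    i64::from_str_radix(digits, 8).ok()
}

/// Value of a colon-separated base-60 integer such as `12:34:56`.
fn sexagesimal_value(text: &str) -> Option<i64> {
    text.split(':').try_fold(0i64, |acc, part| {
        let digit: i64 = part.parse().ok()?;
        Some(acc * 60 + digit)
    })
}

fn main() {
    // 7*64 + 5*8 + 5 = 493
    assert_eq!(octal_value("0755"), Some(493));

    // 12*3600 + 34*60 + 56 = 45296
    assert_eq!(sexagesimal_value("12:34:56"), Some(45296));

    // Once "3.10" is a float, the trailing zero is gone for good.
    let version: f64 = "3.10".parse().unwrap();
    assert_eq!(version.to_string(), "3.1");
}
```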

The YAMP Solution

Every scalar is a string. Period.

If you need typed data, parse the strings in your application where you have full control over the conversion rules. This approach is:

  • Predictable: No surprises based on value format
  • Secure: No type-confusion bugs caused by values silently changing type
  • Portable: Same behavior across all systems
  • Simple: One rule to remember
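
In practice, application-side conversion can be as small as one explicit rule per field. The following is a minimal sketch that assumes the string values have already been read from the parsed document; the `Settings` struct, `parse_bool`, and `to_settings` are hypothetical names chosen for illustration and use only the Rust standard library.

```rust
/// Hypothetical application settings; the field names are illustrative.
#[derive(Debug, PartialEq)]
struct Settings {
    port: u16,
    active: bool,
    mode: u32,
}

/// Accept only the two spellings the application chooses to allow.
fn parse_bool(s: &str) -> Result<bool, String> {
    match s {
        "true" => Ok(true),
        "false" => Ok(false),
        other => Err(format!("not a boolean: {:?}", other)),
    }
}

/// Convert raw string values into typed settings, one explicit rule per field.
fn to_settings(port: &str, active: &str, mode: &str) -> Result<Settings, String> {
    Ok(Settings {
        port: port.parse::<u16>().map_err(|e| format!("port: {}", e))?,
        active: parse_bool(active)?,
        // Reading permissions as octal is the application's own decision.
        mode: u32::from_str_radix(mode, 8).map_err(|e| format!("mode: {}", e))?,
    })
}

fn main() {
    let settings = to_settings("8080", "true", "0755").expect("valid settings");
    assert_eq!(settings, Settings { port: 8080, active: true, mode: 0o755 });

    // "NO" is never silently turned into `false`; it is rejected instead.
    assert!(to_settings("8080", "NO", "0755").is_err());
}
```

Because each rule is written out, a value such as `NO` or `0755` is interpreted only in the way the application asked for, and anything unexpected surfaces as an error.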

Design Decisions

  • Comment preservation: Comments are treated as first-class citizens, not discarded metadata
  • No implicit typing: All scalar values are strings to avoid confusion and bugs
  • Simplicity over completeness: Only the most common YAML features are supported
  • Round-trip fidelity: Parse and emit cycles preserve structure and comments
  • Performance: Efficient parsing with minimal allocations
  • Error messages: Clear, helpful error messages for debugging
  • No external dependencies: Pure Rust implementation
  • Predictable behavior: No surprises with type conversion

When to Use YAMP

YAMP is perfect when you:

  • Need to preserve comments in configuration files
  • Want to programmatically update YAML while keeping human documentation
  • Want predictable, safe YAML parsing
  • Need to preserve exact values (version numbers, permissions, etc.)
  • Are building configuration management tools
  • Are building security-sensitive applications
  • Want to avoid YAML's numerous footguns
  • Prefer explicit type conversion in your application

When NOT to Use YAMP

Don't use YAMP if you:

  • Need full YAML 1.2 specification compliance
  • Require automatic type inference
  • Need anchors/aliases for data deduplication
  • Must parse existing YAML files that rely on implicit typing
  • Need multi-document YAML support

Installation

Add to your Cargo.toml:

[dependencies]
yamp = "0.1.0"

Contributing

Contributions are welcome! However, please note that we will NOT accept PRs that:

  • Add any form of implicit typing or type inference
  • Compromise the simplicity or predictability of the parser
  • Add complex YAML features that increase the attack surface

License

MIT

Acknowledgments

This parser is inspired by the StrictYAML project and the numerous articles documenting YAML's problematic "features".

YAMP was created to solve a specific problem: the need for a YAML parser that preserves comments while avoiding YAML's type system complexities. Most YAML parsers treat comments as throwaway metadata, but in configuration files, comments are often as important as the data itself.
