xml-3dm

Crates.ioxml-3dm
lib.rsxml-3dm
version0.1.0
created_at2025-11-26 21:47:30.415427+00
updated_at2025-11-26 21:47:30.415427+00
description3-way XML diff and merge tool
homepagehttps://github.com/06chaynes/3dm-rs
repositoryhttps://github.com/06chaynes/3dm-rs
max_upload_size
id1952417
size376,124
Christian Haynes (06chaynes)

documentation

https://docs.rs/xml-3dm

README

xml-3dm

A Rust implementation of the 3DM (3-way Diff/Merge) algorithm for structure-aware XML differencing and merging.

Overview

xml-3dm is a library for performing 3-way merges on XML documents. The library operates on XML tree structure and handles the following scenarios:

  • Merging changes when one branch moves content and another edits it
  • Detecting and preserving copy operations across the tree
  • Resolving conflicts at the element and attribute level
  • Generating compact diff representations with copy detection

This is a Rust port of the original 3DM tool created by Tancred Lindholm as part of his Master's thesis at Helsinki University of Technology.

Features

  • Structure-aware merging: Understands XML tree structure, not just text lines
  • Heuristic matching: Uses Q-gram distance and tree structure to match corresponding nodes
  • Copy detection: Identifies copied subtrees and encodes them efficiently
  • Conflict resolution: Provides detailed conflict and warning logs
  • Full XML support: Handles elements, attributes, text nodes, and mixed content

Installation

Add this to your Cargo.toml:

[dependencies]
xml-3dm = "0.1"

Usage

3-Way Merge

use xml_3dm::{parse_file, BaseNodeFactory, BranchNodeFactory, XmlParser, Merge, TriMatching};
use std::io;

// Parse the three XML files
let base_parser = XmlParser::new(BaseNodeFactory);
let branch_parser = XmlParser::new(BranchNodeFactory);

let base = base_parser.parse_file("base.xml")?;
let left = branch_parser.parse_file("branch1.xml")?;
let right = branch_parser.parse_file("branch2.xml")?;

// Build matching between the three trees
let matching = TriMatching::new(left, base, right);

// Perform the merge
let mut merger = Merge::new(matching);
merger.merge(&mut io::stdout())?;

// Check for conflicts
if merger.conflict_log.has_conflicts() {
    eprintln!("Conflicts detected:");
    for conflict in merger.conflict_log.conflicts() {
        eprintln!("  {}", conflict);
    }
}

Diff Generation

use xml_3dm::{parse_file, BaseNodeFactory, BranchNodeFactory, XmlParser, Diff, DiffMatching};
use std::io;

// Parse base and modified versions
let base_parser = XmlParser::new(BaseNodeFactory);
let branch_parser = XmlParser::new(BranchNodeFactory);

let base = base_parser.parse_file("base.xml")?;
let branch = branch_parser.parse_file("modified.xml")?;

// Build matching optimized for diff
let matching = DiffMatching::new(base, branch);

// Generate diff
if let Some(diff) = Diff::new(&matching) {
    diff.diff(&mut io::stdout())?;
}

Patch Application

use xml_3dm::{parse_file, BaseNodeFactory, XmlParser, Patch};
use std::io;

// Parse base and diff
let parser = XmlParser::new(BaseNodeFactory);
let base = parser.parse_file("base.xml")?;
let diff = parser.parse_file("changes.xml")?;

// Apply patch
let patcher = Patch::new();
patcher.patch(&base, &diff, &mut io::stdout())?;

Algorithm

The 3DM algorithm works in several phases:

  1. Parsing: Builds tree representations of the XML documents
  2. Matching: Uses heuristic matching to find corresponding nodes across trees
    • Exact matching by content hash
    • Fuzzy matching using Q-gram distance
    • Structural matching of children
  3. Merge: Combines changes from both branches relative to the base
    • Detects inserts, deletes, moves, and updates
    • Resolves operation conflicts using a conflict table
    • Merges element content and attributes
  4. Output: Serializes the merged tree back to XML

Configuration

Copy Detection Threshold

The minimum size (in information bytes) for copy detection can be configured:

use xml_3dm::TriMatching;

// Disable copy detection
let matching = TriMatching::with_copy_threshold(left, base, right, 0);

// Increase threshold (default is 128)
let matching = TriMatching::with_copy_threshold(left, base, right, 256);

Custom Matching Algorithms

You can provide custom matching implementations:

use xml_3dm::{TriMatching, HeuristicMatching};

let left_matching = HeuristicMatching::new(128);
let right_matching = HeuristicMatching::new(128);

let matching = TriMatching::with_matching(
    left,
    base,
    right,
    left_matching,
    right_matching,
);

Limitations

  • No namespace awareness (attributes treated as simple name-value pairs)
  • Processing instructions are not preserved (comments ARE preserved)
  • Limited support for mixed content (element + text siblings)
  • Requires well-formed XML input

License

Licensed under the GNU Lesser General Public License v2.1 or later (LGPL-2.1-or-later), matching the original 3DM implementation.

Credits

Original 3DM implementation by Tancred Lindholm (Helsinki University of Technology, 2001).

Commit count: 0

cargo fmt