natural-xml-diff

name: natural-xml-diff
created_at: 2022-12-19 15:53:36.03268
updated_at: 2023-01-03 14:44:35.984734
downloads: 168
description: Natural diffing between XML documents
homepage: https://github.com/faassen/natural-xml-diff
repository: http://github.com/faassen/natural-xml-diff
max_upload_size:
id: 741383
Martijn Faassen

documentation

https://docs.rs/natural-xml-diff

readme

# natural-xml-diff

[![Crates.io](https://img.shields.io/crates/v/natural-xml-diff.svg)](https://crates.io/crates/natural-xml-diff)
[![Documentation](https://docs.rs/natural-xml-diff/badge.svg)](https://docs.rs/natural-xml-diff)

The `natural-xml-diff` crate implements a diffing algorithm that attempts to produce correct and human readable differences between two XML documents.

[API Documentation](https://docs.rs/natural-xml-diff)

## Algorithm

The algorithm implemented by this library is based on the paper ["Bridging the gap between tracking and detecting changes on XML"](https://www.researchgate.net/publication/3943331_Detecting_changes_in_XML_documents). It is also implemented by the Java-based [jndiff library](https://jndiff.sourceforge.net/).

## Work in progress

This is still a work in progress!

## Credits

[Paligo](https://paligo.net)

## Structural diffing

Let's consider the following XML document, taken from the "Bridging the Gap" paper:

```xml
Text 1 Text 2 Text 4 Text 5 Text 6 Text 7Text 8 Text 9 Text 10 Text 11 Text 12
```

We'll call that "document A", the "before" of the diffing. Here's the "after", "document B":

```xml
Text 2 Text 4 Text 25 Text 11 Text 6 Text 7Text 8 Text 9 Text 10 Text 12
```

Let's present both as trees with numbered nodes (the root node, 0, is not shown).
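The node numbers in these trees appear to follow document order, that is, a preorder traversal in which each node is numbered before its children. The sketch below illustrates that reading only; the `Node` type and the `number_nodes`, `elem`, `text`, and `first_chapter_numbering` helpers are hypothetical and not part of the crate's API. It numbers the first chapter of document A (a `book` containing a `chapter` with a `title` and a `para`), starting at 1 because 0 is reserved for the root node.

```rust
// Hypothetical minimal tree type for illustration; not the crate's API.
#[derive(Debug)]
enum Node {
    Element { name: String, children: Vec<Node> },
    Text(String),
}

// Small constructors to keep the example readable.
fn elem(name: &str, children: Vec<Node>) -> Node {
    Node::Element { name: name.to_string(), children }
}

fn text(content: &str) -> Node {
    Node::Text(content.to_string())
}

/// Number `node` and its descendants in preorder (document order).
/// `next` holds the next free number; `out` collects (number, label) pairs.
fn number_nodes(node: &Node, next: &mut usize, out: &mut Vec<(usize, String)>) {
    let id = *next;
    *next += 1;
    match node {
        Node::Element { name, children } => {
            out.push((id, name.clone()));
            // Children are numbered after their parent, left to right.
            for child in children {
                number_nodes(child, next, out);
            }
        }
        Node::Text(content) => out.push((id, content.clone())),
    }
}

/// Number the first chapter of document A, starting at 1
/// (0 is the root node, which the trees do not show).
fn first_chapter_numbering() -> Vec<(usize, String)> {
    let book = elem(
        "book",
        vec![elem(
            "chapter",
            vec![
                elem("title", vec![text("Text 1")]),
                elem("para", vec![text("Text 2")]),
            ],
        )],
    );
    let mut next = 1;
    let mut out = Vec::new();
    number_nodes(&book, &mut next, &mut out);
    out
}
```

Calling `first_chapter_numbering()` yields `book` as 1, `chapter` as 2, `title` as 3, `Text 1` as 4, `para` as 5, and `Text 2` as 6.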
Here's document A:

```mermaid
graph TD;
    1[1 book]-->2
    2[2 chapter]-->3
    2-->5
    3[3 title]-->4
    4[4 Text 1]
    5[5 para]-->6
    6[6 Text 2]
    1-->7
    7[7 chapter] --> 8
    8[8 title] --> 9
    9[9 Text 4]
    7 --> 10
    10[10 para] --> 11
    11[11 Text 5]
    1 --> 12
    12[12 chapter] --> 13
    13[13 title] --> 14
    14[14 Text 6]
    12-->15
    15[15 para] --> 16
    15 --> 17
    15 --> 18
    16[16 Text 7]
    17[17 img]
    18[18 Text 8]
    1 --> 19
    19[19 chapter]
    19 --> 20
    20[20 title] --> 21
    21[21 Text 9]
    19 --> 22
    22[22 para] --> 23
    23[23 Text 10]
    1 --> 24
    24[24 chapter] --> 25
    25[25 para] --> 26
    26[26 Text 11]
    24 --> 27
    27[27 para] --> 28
    28[28 Text 12]
```

## Maintaining the tests

Some tests use `test_generator` to generate tests from the `testdata` directory. New tests in that directory aren't picked up automatically, however; you have to force a recompile of the `.rs` files that run the tests. You can do this by making a non-significant whitespace edit in each `.rs` file that uses `test_generator` and saving it. I hope there's a better solution.
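One possible alternative, offered as an untested sketch rather than something this crate does: a `build.rs` file at the crate root can ask Cargo to re-run the build script whenever the `testdata` directory changes, which should in turn cause the crate and its test targets to be rebuilt. The `cargo:rerun-if-changed` directive is standard Cargo behaviour; that it interacts well with `test_generator` here is an assumption.

```rust
// build.rs (hypothetical; not part of this crate as published)
fn main() {
    // Ask Cargo to re-run this build script when anything under
    // `testdata` changes. A re-run of the build script marks the
    // package as dirty, so its targets are recompiled.
    println!("cargo:rerun-if-changed=testdata");
}
```

With this in place, adding a file under `testdata` and running `cargo test` should trigger a rebuild without touching any `.rs` file.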