# natural-xml-diff
[![Crates.io](https://img.shields.io/crates/v/natural-xml-diff.svg)](https://crates.io/crates/natural-xml-diff)
[![Documentation](https://docs.rs/natural-xml-diff/badge.svg)](https://docs.rs/natural-xml-diff)
The `natural-xml-diff` crate implements a diffing algorithm that attempts to
produce correct and human-readable differences between two XML documents.
[API Documentation](https://docs.rs/natural-xml-diff)
## Algorithm
The algorithm implemented by this library is based on the paper ["Bridging the
gap between tracking and detecting changes on
XML"](https://www.researchgate.net/publication/3943331_Detecting_changes_in_XML_documents).
It is also implemented by the Java-based [jndiff
library](https://jndiff.sourceforge.net/).
## Work in progress
This is still a work in progress!
## Credits
[Paligo](https://paligo.net)
## Structural diffing
Let's consider the following XML document, taken from the "Bridging the Gap" paper:
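```xml
<book>
  <chapter>
    <title>Text 1</title>
    <para>Text 2</para>
  </chapter>
  <chapter>
    <title>Text 4</title>
    <para>Text 5</para>
  </chapter>
  <chapter>
    <title>Text 6</title>
    <para>Text 7<img/>Text 8</para>
  </chapter>
  <chapter>
    <title>Text 9</title>
    <para>Text 10</para>
  </chapter>
  <chapter>
    <para>Text 11</para>
    <para>Text 12</para>
  </chapter>
</book>
```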
We'll call that "document A", the "before" of the diffing. Here's the "after", "document B":
```xml
Text 2
Text 4
Text 25
Text 11
Text 6
Text 7Text 8
Text 9
Text 10
Text 12
```
Let's present both as trees with numbered nodes (the root node, 0, is not shown).
Here's document A:
```mermaid
graph TD;
1[1 book]-->2
2[2 chapter]-->3
2-->5
3[3 title]-->4
4[4 Text 1]
5[5 para]-->6
6[6 Text 2]
1-->7
7[7 chapter] --> 8
8[8 title] --> 9
9[9 Text 4]
7 --> 10
10[10 para] --> 11
11[11 Text 5]
1 --> 12
12[12 chapter] --> 13
13[13 title] --> 14
14[14 Text 6]
12-->15
15[15 para] --> 16
15 --> 17
15 --> 18
16[16 Text 7]
17[17 img]
18[18 Text 8]
1 --> 19
19[19 chapter]
19 --> 20
20[20 title] --> 21
21[21 Text 9]
19 --> 22
22[22 para] --> 23
23[23 Text 10]
1 --> 24
24[24 chapter] --> 25
25[25 para] --> 26
26[26 Text 11]
24 --> 27
27[27 para] --> 28
28[28 Text 12]
```
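The numbers in the tree above follow document order (a pre-order walk), starting at 1 because 0 is reserved for the root node. The following is a minimal sketch of that numbering scheme, not the crate's own implementation: the `Node` type and the helpers `number_nodes`, `document_a`, and `numbered_document_a` are hypothetical names introduced only for this illustration.

```rust
/// A minimal XML-like tree (hypothetical, not the crate's data model):
/// elements have a name and children, text nodes hold a string.
#[derive(Debug)]
enum Node {
    Element(&'static str, Vec<Node>),
    Text(&'static str),
}

use Node::{Element, Text};

/// Walk the tree in document (pre-order) order and record `(number, label)` pairs.
fn number_nodes(node: &Node, next: &mut usize, out: &mut Vec<(usize, String)>) {
    let label = match node {
        Element(name, _) => name.to_string(),
        Text(text) => text.to_string(),
    };
    out.push((*next, label));
    *next += 1;
    // Children are numbered after their parent, left to right.
    if let Element(_, children) = node {
        for child in children {
            number_nodes(child, next, out);
        }
    }
}

fn chapter(children: Vec<Node>) -> Node {
    Element("chapter", children)
}

fn title(text: &'static str) -> Node {
    Element("title", vec![Text(text)])
}

fn para(children: Vec<Node>) -> Node {
    Element("para", children)
}

/// Document A, built by hand to mirror the tree shown above.
fn document_a() -> Node {
    Element(
        "book",
        vec![
            chapter(vec![title("Text 1"), para(vec![Text("Text 2")])]),
            chapter(vec![title("Text 4"), para(vec![Text("Text 5")])]),
            chapter(vec![
                title("Text 6"),
                para(vec![Text("Text 7"), Element("img", vec![]), Text("Text 8")]),
            ]),
            chapter(vec![title("Text 9"), para(vec![Text("Text 10")])]),
            chapter(vec![para(vec![Text("Text 11")]), para(vec![Text("Text 12")])]),
        ],
    )
}

/// Number every node of document A, starting at 1 (0 is the unshown root).
fn numbered_document_a() -> Vec<(usize, String)> {
    let mut out = Vec::new();
    let mut next = 1;
    number_nodes(&document_a(), &mut next, &mut out);
    out
}

fn main() {
    for (number, label) in numbered_document_a() {
        println!("{number} {label}");
    }
}
```

Running this lists the 28 nodes of document A with the same numbers as the diagram, for example `17 img` and `28 Text 12`.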
## Maintaining the tests
Some tests use `test_generator` to generate test cases from the `testdata`
directory. New files in that directory aren't picked up automatically,
however: you have to force a recompile of the `.rs` files that run the tests.
One way to do this is to make an insignificant whitespace edit in each `.rs`
file that uses `test_generator` and save it. I hope there's a better solution.
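
One possible alternative, not verified against this repository, is a Cargo build script that asks Cargo to re-run it whenever `testdata` changes. Re-running a build script normally causes the package's targets to be rebuilt, which should re-expand the test-generating macros. The sketch below assumes a `build.rs` next to `Cargo.toml` and a `testdata` directory at the crate root; the `rerun_if_changed` helper is a hypothetical name used only here.

```rust
// build.rs (assumed to sit next to Cargo.toml)

/// Format the Cargo directive that re-runs this build script when `path` changes.
/// For a directory, Cargo scans its contents for modifications.
fn rerun_if_changed(path: &str) -> String {
    format!("cargo:rerun-if-changed={path}")
}

fn main() {
    // Assumption: the generated tests read their inputs from `testdata`.
    println!("{}", rerun_if_changed("testdata"));
    // Also re-run when the build script itself is edited.
    println!("{}", rerun_if_changed("build.rs"));
}
```

Whether this reliably picks up newly added files depends on how Cargo detects changes in the directory, so treat it as something to try, not a confirmed fix.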