LibXDiff
The LibXDiff library implements basic and yet complete
functionalities to create file differences/patches to both binary and
text files. The library uses memory files as file abstraction to
achieve
both performance and portability. For binary files, LibXDiff
implements both (with some modification) the algorithm described in File
System Support for Delta Compression by Joshua P. MacDonald,
and the algorithm described in Fingerprinting
By Random Polynomials by Michael
O. Rabin. While for text files it follows directives described
in An O(ND)
Difference Algorithm and Its Variations by Eugene W. Myers.
Memory files used by the library are basically a collection of buffers
that store the file content. There are two different requirements for
memory files when passed to diff/patch functions. Text files for
diff/patch functions require that a single line do not have to spawn
across two different memory file blocks. Binary diff/patch functions
require memory files to be compact. A compact memory files is a file
whose content is stored inside a single block. Functionalities inside
the library are available to satisfy these rules. Using the XDL_MMF_ATOMIC
memory file flag it is possible to make writes to not split the written
record across different blocks, while the functions xdl_mmfile_iscompact()
, xdl_mmfile_compact() and xdl_mmfile_writeallocate()
are usefull to test if the file is compact and to create a compacted
version of the file itself. The text file differential output uses the
raw unified output format, by omitting the file header since the result
is always relative to a single compare operation (between two files).
The output format of the binary patch file is proprietary (and binary)
and it is basically a collection of copy and insert commands, like
described inside the MacDonald paper. The library is compatible with
almost every Unix implementation (configure script) and it is also
compatible with Windows through custom (nmake) make files. Examples are
available inside the test
subdirectory of the distribution tarball that show how to use the
library. Also, inside the same subdirectory, a regression test in
available that tests the library with random data by requiring a diff
followed by a patch and comparing results. Regression tests ran
successfully for days on my Linux, Solaris, FreeBSD and Windows boxes,
and this makes me believe that the library itself is completely ready
for production (despite the version number).
Documentation
License and Software
LibXDiff is made available through the GNU LGPL license together with the complete sources. Please read carefully the license before using the software. The latest library package is available here :
Links And Docs