LibXDiff

 
The LibXDiff library implements basic yet complete functionality for creating differences and patches for both binary and text files. The library uses memory files as its file abstraction to achieve both performance and portability. For binary files, LibXDiff implements (with some modifications) both the algorithm described in File System Support for Delta Compression by Joshua P. MacDonald and the algorithm described in Fingerprinting By Random Polynomials by Michael O. Rabin. For text files, it follows the approach described in An O(ND) Difference Algorithm and Its Variations by Eugene W. Myers.
Memory files used by the library are essentially a collection of buffers (blocks) that store the file content. The diff/patch functions place two different requirements on the memory files passed to them. The text diff/patch functions require that no single line span two different memory file blocks. The binary diff/patch functions require memory files to be compact. A compact memory file is one whose content is stored inside a single block. The library provides functions to satisfy these rules. The XDL_MMF_ATOMIC memory file flag makes a write keep the written record within a single block instead of splitting it across blocks, while the functions xdl_mmfile_iscompact(), xdl_mmfile_compact() and xdl_mmfile_writeallocate() are useful for testing whether a file is compact and for creating a compacted version of the file.
The text file differential output uses the raw unified output format, omitting the file header, since the result is always relative to a single compare operation (between two files). The output format of the binary patch file is proprietary (and binary); it is basically a collection of copy and insert commands, as described in the MacDonald paper.
The library is compatible with almost every Unix implementation (through a configure script) and also with Windows through custom (nmake) makefiles.
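
To make the block and compactness rules above concrete, here is a minimal sketch of a block-based memory file. It is a toy model only: the toy_* types and functions are hypothetical names invented for this illustration and are not LibXDiff's data structures or API. The atomic field merely imitates the behaviour the text attributes to XDL_MMF_ATOMIC.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy model of a block-based memory file.
   Illustrative names only, NOT LibXDiff's real structures or API. */
typedef struct toy_block {
    struct toy_block *next;
    size_t used, cap;
    char *data;
} toy_block_t;

typedef struct {
    toy_block_t *head, *tail;
    size_t bsize;  /* default capacity of each new block */
    int atomic;    /* nonzero: one write never straddles two blocks */
} toy_mmfile_t;

static void toy_init(toy_mmfile_t *f, size_t bsize, int atomic) {
    assert(bsize > 0);
    f->head = f->tail = NULL;
    f->bsize = bsize;
    f->atomic = atomic;
}

static void toy_free(toy_mmfile_t *f) {
    toy_block_t *b = f->head;
    while (b) {
        toy_block_t *next = b->next;
        free(b->data);
        free(b);
        b = next;
    }
    f->head = f->tail = NULL;
}

/* Append a new empty block of the given capacity to the chain. */
static toy_block_t *toy_grow(toy_mmfile_t *f, size_t cap) {
    toy_block_t *b = malloc(sizeof *b);
    if (!b) return NULL;
    b->data = malloc(cap);
    if (!b->data) { free(b); return NULL; }
    b->next = NULL;
    b->used = 0;
    b->cap = cap;
    if (f->tail) f->tail->next = b; else f->head = b;
    f->tail = b;
    return b;
}

/* Append n bytes. In atomic mode the whole record lands in one block,
   opening a new (possibly oversized) block when the tail lacks room. */
static int toy_write(toy_mmfile_t *f, const char *p, size_t n) {
    while (n > 0) {
        toy_block_t *b = f->tail;
        size_t room = b ? b->cap - b->used : 0;
        if (room == 0 || (f->atomic && room < n)) {
            size_t cap = (f->atomic && n > f->bsize) ? n : f->bsize;
            b = toy_grow(f, cap);
            if (!b) return -1;
            room = cap;
        }
        size_t k = n < room ? n : room;
        memcpy(b->data + b->used, p, k);
        b->used += k;
        p += k;
        n -= k;
    }
    return 0;
}

static size_t toy_blocks(const toy_mmfile_t *f) {
    size_t count = 0;
    for (const toy_block_t *b = f->head; b; b = b->next) count++;
    return count;
}

/* "Compact" in the sense used above: all content sits in a single block. */
static int toy_iscompact(const toy_mmfile_t *f) {
    return toy_blocks(f) <= 1;
}
```

With an 8-byte block size, writing "hello\n" and then "world!\n" without the atomic setting leaves "wo" at the end of the first block and the rest of the line in the second, which is exactly what the text functions forbid. With the atomic setting, the second line starts a fresh block. Neither file is compact, since both occupy two blocks.
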
Examples showing how to use the library are available inside the test subdirectory of the distribution tarball. The same subdirectory also contains a regression test that exercises the library with random data by running a diff followed by a patch and comparing the results. The regression tests ran successfully for days on my Linux, Solaris, FreeBSD and Windows boxes, which makes me believe that the library is fully ready for production use (despite the version number).
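
The diff, patch, compare cycle used by the regression test can be sketched on a toy delta built from copy and insert commands. This is an assumption-laden illustration, not the code shipped in the test subdirectory: the command layout, the toy_* names and the naive prefix/suffix diff are all hypothetical and do not reflect LibXDiff's binary patch format or its algorithms.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical command set, NOT LibXDiff's binary patch format. */
typedef enum { OP_COPY, OP_INSERT } op_kind_t;

typedef struct {
    op_kind_t kind;
    size_t off;               /* OP_COPY: offset into the old buffer */
    size_t len;               /* bytes to copy or insert */
    const unsigned char *lit; /* OP_INSERT: literal bytes (borrowed) */
} delta_op_t;

/* Naive diff: copy the common prefix, insert the differing middle,
   copy the common suffix. Emits at most three commands. */
static size_t toy_bdiff(const unsigned char *old, size_t on,
                        const unsigned char *new_, size_t nn,
                        delta_op_t ops[3]) {
    size_t pre = 0, suf = 0, n = 0;
    assert(old && new_ && ops);
    while (pre < on && pre < nn && old[pre] == new_[pre]) pre++;
    while (suf < on - pre && suf < nn - pre &&
           old[on - 1 - suf] == new_[nn - 1 - suf]) suf++;
    if (pre) ops[n++] = (delta_op_t){OP_COPY, 0, pre, NULL};
    if (nn - pre - suf)
        ops[n++] = (delta_op_t){OP_INSERT, 0, nn - pre - suf, new_ + pre};
    if (suf) ops[n++] = (delta_op_t){OP_COPY, on - suf, suf, NULL};
    return n;
}

/* Replay the commands against the old buffer.
   Returns the number of bytes written, or (size_t)-1 on a bad command. */
static size_t toy_bpatch(const unsigned char *old, size_t on,
                         const delta_op_t *ops, size_t nops,
                         unsigned char *out, size_t cap) {
    size_t w = 0;
    for (size_t i = 0; i < nops; i++) {
        const delta_op_t *op = &ops[i];
        if (op->len > cap - w) return (size_t)-1;
        if (op->kind == OP_COPY) {
            if (op->off > on || op->len > on - op->off) return (size_t)-1;
            memcpy(out + w, old + op->off, op->len);
        } else {
            memcpy(out + w, op->lit, op->len);
        }
        w += op->len;
    }
    return w;
}

/* Returns 1 when patch(old, diff(old, new)) reproduces new exactly. */
static int toy_roundtrip(const unsigned char *old, size_t on,
                         const unsigned char *new_, size_t nn) {
    delta_op_t ops[3];
    size_t nops = toy_bdiff(old, on, new_, nn, ops);
    unsigned char *out = malloc(nn ? nn : 1);
    if (!out) return 0;
    size_t w = toy_bpatch(old, on, ops, nops, out, nn);
    int ok = (w == nn) && memcmp(out, new_, nn) == 0;
    free(out);
    return ok;
}

/* Random-data regression loop: small alphabet so prefixes/suffixes match. */
static int toy_regress(unsigned seed, int rounds) {
    srand(seed);
    for (int r = 0; r < rounds; r++) {
        unsigned char a[64], b[64];
        size_t an = (size_t)(rand() % 65), bn = (size_t)(rand() % 65);
        for (size_t i = 0; i < an; i++) a[i] = (unsigned char)('a' + rand() % 3);
        for (size_t i = 0; i < bn; i++) b[i] = (unsigned char)('a' + rand() % 3);
        if (!toy_roundtrip(a, an, b, bn)) return 0;
    }
    return 1;
}
```

For the inputs "hello world" and "hello brave world", the toy diff yields a copy of the first 6 bytes, an insert of "brave ", and a copy of the last 5 bytes of the old buffer. A real delta encoder finds matches anywhere in the old file rather than only at the ends, but the round-trip property being tested is the same.
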


Documentation


The LibXDiff library man page is available in HTML, TXT and PDF formats.


License and Software

LibXDiff is made available under the GNU LGPL license, together with the complete sources. Please read the license carefully before using the software. The latest library package is available here:

Version 0.23


Links And Docs

LibXDiff FreshMeat Home Page
GNU Diffutils

An O(ND) Difference Algorithm and Its Variations by Eugene W. Myers
File System Support for Delta Compression by Joshua P. MacDonald
Fingerprinting By Random Polynomials by Michael O. Rabin
Fingerprinting Using Polynomial by Calvin Chan and Hahua Lu
Some Applications of Rabin's Fingerprinting Method by Andrei Z. Broder


