Directory tree:

    Makefile
        Builds and runs tests.

    include/
        Public API.

    src/
        Scripts, C implementation and internal headers.

        build/
            Generated object files, executables etc.

    test/
        Test files.

        generated/
            Files generated by tests.

Suggested setup for testing:

    Checkout ghostpdl and mupdf into the same directory.

    Inside ghostpdl:
        ln -s ../mupdf/thirdparty/extract extract

    Then either:
        Inside ghostpdl:
            ./autogen.sh --with-extract-dir=extract
            make -j 8 debug DEBUGDIRPREFIX=debug-extract-

        Inside mupdf:
            make -j 8 debug
    or:
        make test-rebuild-dependent-binaries (for the first time)
        make test-build-dependent-binaries (for incremental builds)

    Then build and run tests from inside mupdf/thirdparty/extract as below.

Build and run tests with:

    make

Conventions:

    Errors:

        Functions return zero on success or -1 with errno set.

    Identifier/symbol names:

        All identifiers that can be seen by client code (generally things
        defined in include/) start with 'extract_'.

        Similarly global symbols in generated .o files all start with
        'extract_'; this is tested by target 'test-obj'.

        Other identifiers and symbols do not have an 'extract_' prefix - not
        necessary because client code cannot see these names.

        Header names in include/ start with 'extract_'.

    Allocation:

        Functions that free a data structure generally take a double pointer
        so that they can set the pointer to NULL before returning, which helps
        avoid stray invalid non-NULL pointers. E.g.:

            extract_span_free(extract_alloc_t* alloc, span_t** pspan);
            /* Frees a span_t, returning with *pspan set to NULL. */

        This double-pointer approach is also used for raw allocation - see
        include/extract_alloc.h.

    Lists:

        Lists of data items are generally implemented using an array of
        pointers and an int 'foo_num' entry, e.g.:

            line_t** lines;
            int      lines_num;
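
Illustration of the conventions:

    The sketch below is not taken from the extract sources. It is a minimal,
    self-contained example of how the error, allocation and list conventions
    described above might fit together. The names item_t, item_list_t,
    item_free(), item_list_append(), item_list_clear() and demo() are all
    hypothetical, and plain malloc()/realloc()/free() stand in for the
    allocator declared in include/extract_alloc.h.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical internal types - not part of the extract API. Internal
   names carry no 'extract_' prefix, as client code cannot see them. */
typedef struct
{
    char* text;
} item_t;

typedef struct
{
    item_t** items;     /* Array of pointers to items. */
    int      items_num; /* Number of entries in items[]. */
} item_list_t;

/* Frees an item_t, returning with *pitem set to NULL. */
static void item_free(item_t** pitem)
{
    if (!*pitem) return;
    free((*pitem)->text);
    free(*pitem);
    *pitem = NULL;
}

/* Appends a copy of <text> to the list. Returns zero on success or -1
   with errno set; on failure the list is left unchanged. */
static int item_list_append(item_list_t* list, const char* text)
{
    item_t*  item;
    item_t** items;

    if (!text)
    {
        errno = EINVAL;
        return -1;
    }
    item = malloc(sizeof(*item));
    if (!item)
    {
        errno = ENOMEM;
        return -1;
    }
    item->text = malloc(strlen(text) + 1);
    if (!item->text)
    {
        item_free(&item);
        errno = ENOMEM;
        return -1;
    }
    strcpy(item->text, text);

    /* Grow the pointer array by one entry. */
    items = realloc(list->items, sizeof(*items) * (size_t) (list->items_num + 1));
    if (!items)
    {
        item_free(&item);
        errno = ENOMEM;
        return -1;
    }
    list->items = items;
    list->items[list->items_num] = item;
    list->items_num += 1;
    return 0;
}

/* Frees every item and the array itself, leaving the list empty. */
static void item_list_clear(item_list_t* list)
{
    int i;
    for (i = 0; i < list->items_num; ++i)
    {
        item_free(&list->items[i]);
    }
    free(list->items);
    list->items = NULL;
    list->items_num = 0;
}

/* Exercises the helpers; returns zero if they behave as described. */
static int demo(void)
{
    item_list_t list = { NULL, 0 };
    int ok;

    if (item_list_append(&list, "hello")) return -1;
    if (item_list_append(&list, "world"))
    {
        item_list_clear(&list);
        return -1;
    }
    ok = (list.items_num == 2 && !strcmp(list.items[1]->text, "world"));

    /* A NULL argument is rejected with -1 and errno set. */
    errno = 0;
    ok = ok && item_list_append(&list, NULL) == -1 && errno == EINVAL;

    item_list_clear(&list);
    ok = ok && list.items == NULL && list.items_num == 0;
    return ok ? 0 : -1;
}
```

    Because item_free() takes a double pointer, a caller is left holding
    NULL rather than a dangling pointer, so a repeated free is harmless.
    The helpers are static so that the object file exports no global
    symbols without the 'extract_' prefix.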