diff mupdf-source/thirdparty/extract/README @ 2:b50eed0cc0ef upstream

ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4. The directory name has changed: no version number in the expanded directory now.
author Franz Glasner <fzglas.hg@dom66.de>
date Mon, 15 Sep 2025 11:43:07 +0200
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/mupdf-source/thirdparty/extract/README	Mon Sep 15 11:43:07 2025 +0200
@@ -0,0 +1,69 @@
+Directory tree:
+
+    Makefile        Builds and runs tests.
+    include/        Public API.
+    src/            Scripts, C implementation and internal headers.
+        build/      Generated object files, executables etc.
+    test/           Test files.
+        generated/  Files generated by tests.
+
+Suggested setup for testing:
+    Check out ghostpdl and mupdf into the same directory.
+    Inside ghostpdl:
+        ln -s ../mupdf/thirdparty/extract extract
+
+    Then either:
+        Inside ghostpdl:
+            ./autogen.sh --with-extract-dir=extract
+            make -j 8 debug DEBUGDIRPREFIX=debug-extract-
+        Inside mupdf:
+            make -j 8 debug
+    or:
+        make test-rebuild-dependent-binaries (for the first time)
+        make test-build-dependent-binaries (for incremental builds)
+
+
+    Then build and run tests from inside mupdf/thirdparty/extract
+    as below.
+
+Build and run tests with:
+    make
+
+Conventions:
+
+    Errors:
+    
+        Functions return zero on success or -1 with errno set.
+
+    Identifier/symbol names:
+
+        All identifiers that can be seen by client code (generally things
+        defined in include/) start with 'extract_'.
+        
+        Similarly global symbols in generated .o files all start with
+        'extract_'; this is tested by target 'test-obj'.
+
+        Other identifiers and symbols do not have an 'extract_' prefix; it is
+        not necessary because client code cannot see these names.
+        
+        Header names in include/ start with 'extract_'.
+
+    Allocation:
+
+        Functions that free a data structure generally take a double pointer
+        so that they can set the pointer to NULL before returning, which helps
+        avoid stray invalid non-NULL pointers. E.g.:
+
+            void extract_span_free(extract_alloc_t* alloc, span_t** pspan);
+            /* Frees a span_t, returning with *pspan set to NULL. */
+
+        This double-pointer approach is also used for raw allocation - see
+        include/extract_alloc.h.
+
+    Lists:
+        Lists of data items are generally implemented using an array of
+        pointers and an int 'foo_num' entry, e.g.:
+
+            line_t**    lines;
+            int         lines_num;
+
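
The Errors, Allocation and Lists conventions in the README above can be
illustrated with a minimal C sketch. Everything in it is an assumption made
for illustration: the demo_* names and types are hypothetical and are not part
of the extract API, and plain malloc()/free() stands in for the library's
extract_alloc_t allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical list element; a stand-in, not the library's line_t. */
typedef struct
{
    int id;
} demo_item_t;

/* Lists convention: an array of pointers plus an int 'foo_num' count. */
typedef struct
{
    demo_item_t**   items;
    int             items_num;
} demo_holder_t;

/* Allocation convention: takes a double pointer and returns with *pitem
set to NULL, so the caller is not left holding a stale pointer. */
void demo_item_free(demo_item_t** pitem)
{
    free(*pitem);   /* free(NULL) is a no-op, so a second call is safe. */
    *pitem = NULL;
}

/* Errors convention: returns zero on success or -1 with errno set. */
int demo_holder_append(demo_holder_t* holder, int id)
{
    demo_item_t**   items;
    demo_item_t*    item;

    if (id < 0)
    {
        errno = EINVAL;
        return -1;
    }
    items = realloc(holder->items, sizeof(*items) * (holder->items_num + 1));
    if (!items)
    {
        errno = ENOMEM;
        return -1;
    }
    holder->items = items;
    item = malloc(sizeof(*item));
    if (!item)
    {
        errno = ENOMEM;
        return -1;
    }
    item->id = id;
    holder->items[holder->items_num] = item;
    holder->items_num += 1;
    return 0;
}

/* Frees every item and then the array, leaving an empty list. */
void demo_holder_clear(demo_holder_t* holder)
{
    int i;
    for (i = 0; i < holder->items_num; ++i)
    {
        demo_item_free(&holder->items[i]);
    }
    free(holder->items);
    holder->items = NULL;
    holder->items_num = 0;
}

/* Exercises all three conventions; returns zero on success or -1. */
int demo_conventions(void)
{
    demo_holder_t   holder = { NULL, 0 };
    int             e = -1;

    if (demo_holder_append(&holder, 1)) goto end;
    if (demo_holder_append(&holder, 2)) goto end;
    assert(holder.items_num == 2);

    /* A rejected append reports -1 and leaves the list unchanged. */
    errno = 0;
    if (demo_holder_append(&holder, -1) != -1 || errno != EINVAL) goto end;
    assert(holder.items_num == 2);

    /* After freeing through the double pointer, the slot reads as NULL. */
    demo_item_free(&holder.items[0]);
    assert(holder.items[0] == NULL);
    e = 0;

    end:
    demo_holder_clear(&holder);
    return e;
}
```

Calling demo_conventions() from a test program and checking for a zero return
is enough to exercise the sketch. The single 'end:' cleanup label is one common
way to pair the "-1 with errno set" convention with resource release; whether
the extract sources use the same pattern is not stated in the README.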