diff mupdf-source/thirdparty/gumbo-parser/DEBUGGING.md @ 2:b50eed0cc0ef upstream

ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4. The directory name has changed: no version number in the expanded directory now.
author Franz Glasner <fzglas.hg@dom66.de>
date Mon, 15 Sep 2025 11:43:07 +0200
parents
children
line wrap: on
line diff
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/mupdf-source/thirdparty/gumbo-parser/DEBUGGING.md	Mon Sep 15 11:43:07 2025 +0200
@@ -0,0 +1,107 @@
+These are a couple of debugging notes that may be helpful for anyone developing
+Gumbo or trying to diagnose a tricky problem.  They will probably not be
+necessary for normal clients of this library - Gumbo is relatively stable, and
+bugs are often rare and obscure.  However, they're handy to have as a reference,
+and may also provide useful Google fodder to people searching for these tools.
+
+Standard disclaimer: I use all of these techniques on my Ubuntu 14.04 computer
+with gcc 4.8.2, clang 3.4, and gtest 1.6.0, but make no warranty about them
+working on other systems.  In particular, they're almost certain not to work on
+Windows.
+
+Debug output
+============
+
+Gumbo has a compile-time switch to dump lots of debug output onto stdout.
+Compile with the GUMBO_DEBUG define enabled:
+
+```bash
+$ make CFLAGS='-DGUMBO_DEBUG'
+```
+
+Note that this spits *a lot* of debug information to the console and makes the
+program run significantly slower, so it's usually helpful to isolate only the
+specific HTML file or fragment that causes the bug.  It lets us trace the
+operation of each of the tokenizer & parser's state machines in depth, though.
+
+Unit tests
+==========
+
+As mentioned in the README, Gumbo relies on [googletest][] for unit tests.
+Unzip the gtest ZIP distribution inside the Gumbo root and rename it 'gtest'.
+'make check' runs the tests, as normal.
+
+```bash
+$ make check
+$ cat test-suite.log
+```
+
+If you need to debug a core dump, you'll probably want to run the test binary
+directly:
+
+```bash
+$ ulimit -c unlimited
+$ make check
+$ .libs/lt-gumbo_test
+$ gdb .libs/lt-gumbo_test core
+```
+
+The same goes for core dumps in other example binaries.
+
+To run only a single unit test, pass the --gtest_filter='TestName' flag to the
+lt-gumbo_test binary.
+
+Assertions
+==========
+
+Gumbo relies pretty heavily on assertions.  By default they're enabled at
+run-time: to turn them off, define NDEBUG:
+
+```bash
+$ make CFLAGS='-DNDEBUG'
+```
+
+ASAN
+====
+
+Google's [address-sanitizer][] is a helpful tool that lets you find memory
+errors with relatively low overhead: enough that you can often run it in
+production.  Enabling it for C/C++ binaries is pretty standard and described on
+the ASAN documentation pages.  It requires Clang >=3.1 or GCC >= 4.8.
+
+```bash
+$ make \
+    CFLAGS='-fsanitize=address -fno-omit-frame-pointer -fno-inline' \
+    LDFLAGS='-fsanitize=address'
+```
+
+ASAN can also be used when Gumbo is compiled as a shared library and linked into
+a scripting language via FFI, but this use-case is unsupported by the ASAN
+authors.  To do it, use LD_PRELOAD to ensure the ASAN runtime support is
+included in the process:
+
+```bash
+$ LD_PRELOAD=libasan.so.0 python -c 'import gumbo; gumbo.parse(problem_text)'
+```
+
+Getting clean stack traces from this requires the use of the llvm-symbolizer
+binary, included with clang:
+
+```bash
+$ export ASAN_SYMBOLIZER_PATH=/usr/bin/llvm-symbolizer-3.4
+$ export ASAN_OPTIONS=symbolize=1
+$ LD_PRELOAD=libasan.so.0 python -c \
+  'import gumbo; gumbo.parse(problem_text)' 2>&1 | head -100
+$ killall llvm-symbolizer-3.4
+$ killall llvm-symbolizer-3.4
+$ killall llvm-symbolizer-3.4
+```
+
+This use case is even less officially supported than using it with dynamic
+shared objects; on my machine, it led to a recursive ASAN error about a
+use-after-free in llvm-symbolizer, effectively fork-bombing the machine.  Have
+the killalls ready, and avoid letting the process run for too long (eg. piping
+it to 'less').
+
+[googletest]: https://code.google.com/p/googletest/
+[address-sanitizer]: https://code.google.com/p/address-sanitizer/