comparison mupdf-source/docs/reference/swig.rst @ 2:b50eed0cc0ef upstream

ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4. The directory name has changed: no version number in the expanded directory now.
author Franz Glasner <fzglas.hg@dom66.de>
date Mon, 15 Sep 2025 11:43:07 +0200
1 .. Copyright (C) 2001-2025 Artifex Software, Inc.
2 .. All Rights Reserved.
3
4
5 .. meta::
6 :description: MuPDF documentation
7 :keywords: MuPDF, pdf, epub
8
9
10 C++, Python, and C#
11 ===============================================================
12
13 ..
14 We define crude substitutions that implement simple expand/contract blocks
15 in html. Unfortunately it doesn't seem possible to pass parameters to
16 substitutions so we can't specify text to be shown next to html's details
17 triangle.
18
19 .. |expand_begin| raw:: html
20
21 <details>
22 <summary><strong>Show/hide</strong></summary>
23
24 .. |expand_end| raw:: html
25
26 </details>
27
28
29 Overview
30 ---------------------------------------------------------------
31
32 Auto-generated abstracted :title:`C++`, :title:`Python` and :title:`C#`
33 versions of the :title:`MuPDF C API` are available.
34
35 *
36 The C++ API is machine-generated from the C API header files and adds various
37 abstractions such as automatic contexts and automatic reference counting.
38
39 *
40 The Python and C# APIs are generated from the C++ API using SWIG, so
41 automatically include the C++ API's abstractions.
42
43 .. graphviz::
44
45 digraph
46 {
47 size="4,4";
48 labeljust=l;
49
50 "MuPDF C API" [shape="rectangle"]
51 "MuPDF C++ API" [shape="rectangle"]
52 "SWIG" [shape="oval"]
53 "MuPDF Python API" [shape="rectangle"]
54 "MuPDF C# API" [shape="rectangle"]
55
56 "MuPDF C API" -> "MuPDF C++ API" [label=" Parse C headers with libclang,\l generate abstractions.\l"]
57
58 "MuPDF C++ API" -> "SWIG" [label=" Parse C++ headers with SWIG."]
59 "SWIG" -> "MuPDF Python API"
60 "SWIG" -> "MuPDF C# API"
61 }
62
63
64 The C++ MuPDF API
65 ---------------------------------------------------------------
66
67 Basics
68 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
69
70 * Auto-generated from the MuPDF C API's header files.
71
72 * Everything is in C++ namespace ``mupdf``.
73
* No functions or methods take ``fz_context*`` arguments.
  (Automatically-generated per-thread contexts are used internally.)
76
77 * All MuPDF ``setjmp()``/``longjmp()``-based exceptions are converted into C++ exceptions.
78
79 Low-level C++ API
80 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
81
82 The MuPDF C API is provided as low-level C++ functions with ``ll_`` prefixes.
83
84 * No ``fz_context*`` arguments.
85
86 * MuPDF exceptions are converted into C++ exceptions.
87
88 Class-aware C++ API
89 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
90
91 C++ wrapper classes wrap most ``fz_*`` and ``pdf_*`` C structs:
92
93 * Class names are camel-case versions of the wrapped struct's
94 name, for example ``fz_document``'s wrapper class is ``mupdf::FzDocument``.
95
96 * Classes automatically handle reference counting of the underlying C structs,
97 so there is no need for manual calls to ``fz_keep_*()`` and ``fz_drop_*()``, and
98 class instances can be treated as values and copied arbitrarily.
99
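As an illustration of the naming rule in the first bullet above, the following
Python sketch converts a C struct name into the expected wrapper class name.
The helper ``wrapper_class_name()`` is hypothetical and only approximates the
rule; the generator itself may treat individual names specially.

.. code-block:: python

    def wrapper_class_name(struct_name):
        """Return the camel-case form of a C struct name.

        Hypothetical helper: splits on underscores and capitalises the first
        letter of each part, leaving the rest of each part unchanged.
        """
        parts = [part for part in struct_name.split('_') if part]
        return ''.join(part[0].upper() + part[1:] for part in parts)


    # Struct names taken from elsewhere in this document.
    for name in ('fz_document', 'fz_stext_page', 'fz_buffer'):
        print(f'{name} -> mupdf::{wrapper_class_name(name)}')
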
100 Class-aware functions and methods take and return wrapper class instances
101 instead of MuPDF C structs:
102
103 * No ``fz_context*`` arguments.
104
105 * MuPDF exceptions are converted into C++ exceptions.
106
107 * Class-aware functions have the same names as the underlying C API function.
108
* Arguments that are pointers to a MuPDF struct become references to
  the corresponding wrapper class.
111
112 * Where a MuPDF function returns a pointer to a struct, the class-aware C++
113 wrapper will return a wrapper class instance by value.
114
115 * Class-aware functions that have a C++ wrapper class as their first parameter
116 are also provided as a member function of the wrapper class, with the same
117 name as the class-aware function.
118
119 * Wrapper classes are defined in ``mupdf/platform/c++/include/mupdf/classes.h``.
120
121 * Class-aware functions are declared in ``mupdf/platform/c++/include/mupdf/classes2.h``.
122
123 *
124 Wrapper classes for reference-counted MuPDF structs:
125
126 *
127 The C++ wrapper classes will have a public ``m_internal`` member that is a
128 pointer to the underlying MuPDF struct.
129
130 *
131 If a MuPDF C function returns a null pointer to a MuPDF struct, the
132 class-aware C++ wrapper will return an instance of the wrapper class with a
133 null ``m_internal`` member.
134
135 *
136 The C++ wrapper class will have an ``operator bool()`` that returns true if
137 the ``m_internal`` member is non-null.
138
139 [Introduced 2024-07-08.]
140
141 Usually it is more convenient to use the class-aware C++ API rather than the
142 low-level C++ API.
143
144 C++ Exceptions
145 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
146
147 C++ exceptions use classes for each ``FZ_ERROR_*`` enum, all derived from a class
148 ``mupdf::FzErrorBase`` which in turn derives from ``std::exception``.
149
150 For example if MuPDF C code does ``fz_throw(ctx, FZ_ERROR_GENERIC,
151 "something failed")``, this will appear as a C++ exception with type
152 ``mupdf::FzErrorGeneric``. Its ``what()`` method will return ``code=2: something
153 failed``, and it will have a public member ``m_code`` set to ``FZ_ERROR_GENERIC``.
154
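The hierarchy can be pictured with the following stand-in model, written in
Python for brevity. It is not the generated code: the class names, ``m_code``
and the message format come from the description above, while the constructors
and the ``describe_failure()`` helper are assumptions made for illustration.

.. code-block:: python

    # Value implied by the 'code=2' text above; it may differ between
    # MuPDF versions.
    FZ_ERROR_GENERIC = 2


    class FzErrorBase(Exception):
        """Stand-in for mupdf::FzErrorBase (which derives from std::exception)."""

        def __init__(self, code, text):
            self.m_code = code
            super().__init__(f'code={code}: {text}')


    class FzErrorGeneric(FzErrorBase):
        """Stand-in for the class generated for FZ_ERROR_GENERIC."""

        def __init__(self, text):
            super().__init__(FZ_ERROR_GENERIC, text)


    def describe_failure():
        """Catch via the base class, as code handling any MuPDF error might."""
        try:
            raise FzErrorGeneric('something failed')
        except FzErrorBase as e:
            return type(e).__name__, e.m_code, str(e)


    print(describe_failure())

Catching the base class handles every error type, while the concrete type and
``m_code`` remain available for finer-grained handling.
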
155 Example wrappers
156 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
157
158 The MuPDF C API function ``fz_new_buffer_from_page()`` is available as these
159 C++ functions/methods:
160
161 .. code-block:: c++
162
163 // MuPDF C function.
164 fz_buffer *fz_new_buffer_from_page(fz_context *ctx, fz_page *page, const fz_stext_options *options);
165
166 // MuPDF C++ wrappers.
167 namespace mupdf
168 {
169 // Low-level wrapper:
170 ::fz_buffer *ll_fz_new_buffer_from_page(::fz_page *page, const ::fz_stext_options *options);
171
172 // Class-aware wrapper:
173 FzBuffer fz_new_buffer_from_page(const FzPage& page, FzStextOptions& options);
174
175 // Method in wrapper class FzPage:
176 struct FzPage
177 {
178 ...
179 FzBuffer fz_new_buffer_from_page(FzStextOptions& options);
180 ...
181 };
182 }
183
184
185 Extensions beyond the basic C API
186 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
187
188 * Some generated classes have extra ``begin()`` and ``end()`` methods to allow
189 standard C++ iteration:
190
191 |expand_begin|
192
193 .. code-block:: c++
194
    #include "mupdf/classes.h"
    #include "mupdf/functions.h"

    #include <iostream>

    void show_stext(mupdf::FzStextPage& page)
    {
        // Range-based for yields the dereferenced iterator value, so each
        // loop variable is a wrapper class instance, not an iterator.
        for (mupdf::FzStextBlock block: page)
        {
            for (mupdf::FzStextLine line: block)
            {
                for (mupdf::FzStextChar stextchar: line)
                {
                    fz_stext_char* c = stextchar.m_internal;
                    using namespace mupdf;
                    std::cout << "FzStextChar("
                            << "c=" << c->c
                            << " color=" << c->color
                            << " origin=" << c->origin
                            << " quad=" << c->quad
                            << " size=" << c->size
                            << " font_name=" << c->font->name
                            << ")\n";
                }
            }
        }
    }
225
226 |expand_end|
227
228 * There are various custom class methods and constructors.
229
230 * There are extra functions for generating a text representation of 'POD'
231 (plain old data) structs and their C++ wrapper classes.
232
233 For example for ``fz_rect`` we provide these functions:
234
235 .. code-block:: c++
236
237 std::ostream& operator<< (std::ostream& out, const fz_rect& rhs);
238 std::ostream& operator<< (std::ostream& out, const FzRect& rhs);
239 std::string to_string_fz_rect(const fz_rect& s);
240 std::string to_string(const fz_rect& s);
241 std::string Rect::to_string() const;
242
243 These each generate text such as: ``(x0=90.51 y0=160.65 x1=501.39 y1=1215.6)``
244
Runtime environment variables
246 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
247
248 All builds
249 """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
250
251 * **MUPDF_mt_ctx**
252
253 Controls support for multi-threading on startup.
254
255 * If set with value ``0``, a single ``fz_context*`` is used for all threads; this
256 might give a small performance increase in single-threaded programmes, but
257 will be unsafe in multi-threaded programmes.
258
259 * Otherwise each thread has its own ``fz_context*``.
260
261 One can instead call ``mupdf::reinit_singlethreaded()`` on startup to force
262 single-threaded mode. This should be done before any other use of MuPDF.
263
264 Debug builds only
265 """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
266
267 Debug builds contain diagnostics/checking code that is activated via these
268 environmental variables:
269
270 * **MUPDF_check_refs**
271
272 If ``1``, generated code checks MuPDF struct reference counts at
273 runtime.
274
275 * **MUPDF_check_error_stack**
276
277 If ``1``, generated code outputs a diagnostic if a MuPDF function changes the
278 current ``fz_context``'s error stack depth.
279
280 * **MUPDF_trace**
281
282 If ``1`` or ``2``, class-aware code outputs a diagnostic each time it calls a
283 MuPDF function (apart from keep/drop functions).
284
285 If ``2``, low-level wrappers output a diagnostic each time they are
286 called. We also show arg POD and pointer values.
287
288 * **MUPDF_trace_director**
289
290 If ``1``, generated code outputs a diagnostic when doing special
291 handling of MuPDF structs containing function pointers.
292
293 * **MUPDF_trace_exceptions**
294
295 If ``1``, generated code outputs diagnostics when it converts MuPDF
296 ``setjmp()``/``longjmp()`` exceptions into C++ exceptions.
297
298 * **MUPDF_trace_keepdrop**
299
300 If ``1``, generated code outputs diagnostics for calls to ``*_keep_*()`` and
301 ``*_drop_*()``.
302
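One way to set these variables for a single run is to launch the client
program from a small wrapper. The sketch below uses only the Python standard
library; the helper names ``mupdf_env()`` and ``run_with_diagnostics()`` are
hypothetical, and the child process here is a stand-in that merely echoes one
variable rather than a real MuPDF client.

.. code-block:: python

    import os
    import subprocess
    import sys


    def mupdf_env(**settings):
        """Return a copy of os.environ with MUPDF_* variables added.

        Keyword names are given without the 'MUPDF_' prefix, for example
        trace=2 sets MUPDF_trace to '2'.
        """
        env = dict(os.environ)
        for name, value in settings.items():
            env[f'MUPDF_{name}'] = str(value)
        return env


    def run_with_diagnostics(argv, **settings):
        """Run a client program with the given MUPDF_* settings."""
        return subprocess.run(argv, env=mupdf_env(**settings), check=True)


    if __name__ == '__main__':
        # Stand-in child; a real client would be a program that uses a
        # debug build of the bindings.
        child = [sys.executable, '-c',
                 "import os; print(os.environ['MUPDF_trace'])"]
        run_with_diagnostics(child, trace=2, check_refs=1)

Setting the variables in the child's environment, rather than inside the
running process, ensures they are in place before the bindings are loaded.
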
303 Limitations
304 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
305
306 * Global instances of C++ wrapper classes are not supported.
307
308 This is because:
309
310 * C++ wrapper class destructors generally call MuPDF functions (for example
311 ``fz_drop_*()``).
312
313 * The C++ bindings use internal thread-local objects to allow per-thread
314 ``fz_context``'s to be efficiently obtained for use with underlying MuPDF
315 functions.
316
317 * C++ globals are destructed *after* thread-local objects are destructed.
318
319 So if a global instance of a C++ wrapper class is created, its destructor
320 will attempt to get a ``fz_context*`` using internal thread-local objects
321 which will have already been destroyed.
322
323 We attempt to display a diagnostic when this happens, but this cannot be
324 relied on as behaviour is formally undefined.
325
326
327 The Python and C# MuPDF APIs
328 ---------------------------------------------------------------
329
330 * A Python module called ``mupdf``.
331 * A C# namespace called ``mupdf``.
332
333 * Auto-generated from the C++ MuPDF API using SWIG, so inherits the abstractions of the C++ API:
334
335 * No ``fz_context*`` arguments.
336 * Automatic reference counting, so no need to call ``fz_keep_*()`` or ``fz_drop_*()``, and we have value-semantics for class instances.
337 * Native Python and C# exceptions.
338 * Output parameters are returned as tuples.
339
340 For example MuPDF C function ``fz_read_best()`` has prototype::
341
342 fz_buffer *fz_read_best(fz_context *ctx, fz_stream *stm, size_t initial, int *truncated);
343
344 The class-aware Python wrapper is::
345
346 mupdf.fz_read_best(stm, initial)
347
348 and returns ``(buffer, truncated)``, where ``buffer`` is a SWIG proxy for a
349 ``mupdf::FzBuffer`` instance and ``truncated`` is an integer.
350
351 * Allows implementation of mutool in Python - see
352 `mupdf:scripts/mutool.py <https://git.ghostscript.com/?p=mupdf.git;a=blob;f=scripts/mutool.py>`_
353 and
354 `mupdf:scripts/mutool_draw.py <https://git.ghostscript.com/?p=mupdf.git;a=blob;f=scripts/mutool_draw.py>`_.
355
356 * Provides text representation of simple 'POD' structs:
357
358 .. code-block:: python
359
360 rect = mupdf.FzRect(...)
361 print(rect) # Will output text such as: (x0=90.51 y0=160.65 x1=501.39 y1=215.6)
362
363 * This works for classes where the C++ API defines a ``to_string()`` method as described above.
364
365 * Python classes will have a ``__str__()` method, and an identical `__repr__()`` method.
366 * C# classes will have a ``ToString()`` method.
367
368 * Uses SWIG Director classes to allow C function pointers in MuPDF structs to call Python code.
369
370
371 Installing the Python mupdf module using ``pip``
372 ---------------------------------------------------------------
373
374 The Python ``mupdf`` module is available on the `Python Package Index (PyPI) website <https://pypi.org/>`_.
375
376 * Install with ``pip install mupdf``.
377 * Pre-built Wheels (binary Python packages) are provided for Windows and Linux.
378 * For more information on the latest release, see changelog below and: https://pypi.org/project/mupdf/
379
380 Doxygen/Pydoc API documentation
381 ---------------------------------------------------------------
382
383 Auto-generated documentation for the C, C++ and Python APIs is available at:
384 https://ghostscript.com/~julian/mupdf-bindings/
385
386 * All content is generated from the comments in MuPDF header files.
387
388 * This documentation is generated from an internal development tree, so may
389 contain features that are not yet publicly available.
390
391 * It is updated only intermittently.
392
393 Example client code
394 ---------------------------------------------------------------
395
396 Using the Python API
397 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
398
399 Minimal Python code that uses the ``mupdf`` module::
400
401 import mupdf
402 document = mupdf.FzDocument('foo.pdf')
403
404 A simple example Python test script (run by ``scripts/mupdfwrap.py -t``) is:
405
406 * `scripts/mupdfwrap_test.py <https://git.ghostscript.com/?p=mupdf.git;a=blob;f=scripts/mupdfwrap_test.py>`_
407
408 More detailed usage of the Python API can be found in:
409
410 * `scripts/mutool.py <https://git.ghostscript.com/?p=mupdf.git;a=blob;f=scripts/mutool.py>`_
411 * `scripts/mutool_draw.py <https://git.ghostscript.com/?p=mupdf.git;a=blob;f=scripts/mutool_draw.py>`_
412
413
414 **Example Python code that shows all available information about a document's Stext blocks, lines and characters**:
415
416 |expand_begin|
417 ::
418
    #!/usr/bin/env python3

    import mupdf

    def log(text):
        # Simple output helper used by show_stext().
        print(text)

    def show_stext(document):
        '''
        Shows all available information about Stext blocks, lines and characters.
        '''
        for p in range(document.fz_count_pages()):
            page = document.fz_load_page(p)
            stextpage = mupdf.FzStextPage(page, mupdf.FzStextOptions())
            for block in stextpage:
                block_ = block.m_internal
                log(f'block: type={block_.type} bbox={block_.bbox}')
                for line in block:
                    line_ = line.m_internal
                    log(f'    line: wmode={line_.wmode}'
                            + f' dir={line_.dir}'
                            + f' bbox={line_.bbox}'
                            )
                    for char in line:
                        char_ = char.m_internal
                        log(f'        char: {chr(char_.c)!r} c={char_.c:4} color={char_.color}'
                                + f' origin={char_.origin}'
                                + f' quad={char_.quad}'
                                + f' size={char_.size:6.2f}'
                                + f' font=('
                                    + f'is_mono={char_.font.flags.is_mono}'
                                    + f' is_bold={char_.font.flags.is_bold}'
                                    + f' is_italic={char_.font.flags.is_italic}'
                                    + f' ft_substitute={char_.font.flags.ft_substitute}'
                                    + f' ft_stretch={char_.font.flags.ft_stretch}'
                                    + f' fake_bold={char_.font.flags.fake_bold}'
                                    + f' fake_italic={char_.font.flags.fake_italic}'
                                    + f' has_opentype={char_.font.flags.has_opentype}'
                                    + f' invalid_bbox={char_.font.flags.invalid_bbox}'
                                    + f' name={char_.font.name}'
                                    + f')'
                                )

    document = mupdf.FzDocument('foo.pdf')
    show_stext(document)
461
462 |expand_end|
463
464 Basic PDF viewers written in Python and C#
465 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
466
467 * `scripts/mupdfwrap_gui.py <https://git.ghostscript.com/?p=mupdf.git;a=blob;f=scripts/mupdfwrap_gui.py>`_
468 * `scripts/mupdfwrap_gui.cs <https://git.ghostscript.com/?p=mupdf.git;a=blob;f=scripts/mupdfwrap_gui.cs>`_
469 * Build and run with:
470
471 * ``./scripts/mupdfwrap.py -b all --test-python-gui``
472 * ``./scripts/mupdfwrap.py -b --csharp all --test-csharp-gui``
473
474
475 Building the C++, Python and C# MuPDF APIs from source
476 ---------------------------------------------------------------
477
478
479 General requirements
480 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
481
482 * Windows, Linux, MacOS or OpenBSD.
483
484 *
485 Build should take place inside a Python `venv <https://docs.python.org/3.8/library/venv.html>`_.
486
487 *
  The `libclang <https://libclang.readthedocs.io/en/latest/index.html>`_ Python interface onto the `clang C/C++ parser <https://clang.llvm.org/>`_.
489
490 * `swig <https://swig.org/>`_, for Python and C# bindings.
491
492 *
493 `Mono <https://www.mono-project.com/>`_, for C# bindings on platforms
494 other than Windows.
495
496
497 Setting up
498 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
499
500 Windows only
501 """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
502
503 * Install Python.
504
505 *
506 Use the Python Windows installer from the python.org website:
507 http://www.python.org/downloads
508
509 * Don't use other installers such as the Microsoft Store Python package.
510
511 *
512 If Microsoft Store Python is already installed, leave it in place and install
513 from python.org on top of it - uninstalling before running the python.org
514 installer has been known to cause problems.
515
516 * A default installation is sufficient.
517
518 * Debug binaries are required for debug builds of the MuPDF Python API.
519
520 *
521 If "Customize Installation" is chosen, make sure to include "py launcher" so
522 that the ``py`` command will be available.
523
524 * Also see: https://docs.python.org/3/using/windows.html
525
526 *
527 Install Visual Studio 2019. Later versions may not work with MuPDF's
528 solution and build files.
529
530
531 All platforms
532 """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
533
534 * Get the latest version of MuPDF in git.
535
536 .. code-block:: shell
537
538 git clone --recursive git://git.ghostscript.com/mupdf.git
539
540 *
541 Create and enter a `Python venv <https://docs.python.org/3.8/library/venv.html>`_ and upgrade pip.
542
543 * Windows.
544
545 .. code-block:: bat
546
547 py -m venv pylocal
548 .\pylocal\Scripts\activate
549 python -m pip install --upgrade pip
550
551 * Linux, MacOS, OpenBSD
552
553 .. code-block:: shell
554
555 python3 -m venv pylocal
556 . pylocal/bin/activate
557 python -m pip install --upgrade pip
558
559
560 General build flags
561 ~~~~~~~~~~~~~~~~~~~
562
In all of the commands below, one can set environment variables to control
the build of the underlying MuPDF C API, for example ``USE_SYSTEM_LIBJPEG=yes``.
565
566 In addition, ``XCXXFLAGS`` can be used to set additional C++ compiler flags when
567 building the C++ and Python bindings (the name is analogous to the ``XCFLAGS``
568 used by MuPDF's makefile when compiling the core library).
569
570
571 Building and installing the Python bindings using ``pip``
572 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
573
574 * Windows, Linux, MacOS.
575
576 .. code-block:: shell
577
578 cd mupdf && pip install -vv .
579
580 * OpenBSD.
581
  Building using ``pip`` is not supported because ``libclang`` is not
  available from pypi.org, so pip will fail to install prerequisites from
  ``pyproject.toml``.
585
586 Instead one can run ``setup.py`` directly:
587
588 .. code-block:: shell
589
    cd mupdf && python setup.py install
591
592
593 Building the Python bindings
594 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
595
596 * Windows, Linux, MacOS.
597
598 .. code-block:: shell
599
600 pip install libclang swig setuptools
601 cd mupdf && python scripts/mupdfwrap.py -b all
602
603 * OpenBSD.
604
605 ``libclang`` is not available from pypi.org, but we can instead use
606 the system ``py3-llvm`` package.
607
608 .. code-block:: shell
609
610 sudo pkg_add py3-llvm
611 pip install swig setuptools
612 cd mupdf && python scripts/mupdfwrap.py -b all
613
614 Building the C++ bindings
615 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
616
617 * Windows, Linux, MacOS.
618
619 .. code-block:: shell
620
621 pip install libclang setuptools
622 cd mupdf && python scripts/mupdfwrap.py -b m01
623
624 * OpenBSD.
625
626 ``libclang`` is not available from pypi.org, but we can instead use
627 the system ``py3-llvm`` package.
628
629 .. code-block:: shell
630
631 sudo pkg_add py3-llvm
632 pip install setuptools
633 cd mupdf && python scripts/mupdfwrap.py -b m01
634
635
636 Building the C# bindings
637 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
638
639 * Windows.
640
641 .. code-block:: shell
642
643 pip install libclang swig setuptools
644 cd mupdf && python scripts/mupdfwrap.py -b --csharp all
645
646 * Linux.
647
648 .. code-block:: shell
649
650 sudo apt install mono-devel
651 pip install libclang swig
652 cd mupdf && python scripts/mupdfwrap.py -b --csharp all
653
654 * MacOS.
655
656 Building the C# bindings on MacOS is not currently supported.
657
658 * OpenBSD.
659
660 .. code-block:: shell
661
662 sudo pkg_add py3-llvm mono
663 pip install swig setuptools
664 cd mupdf && python scripts/mupdfwrap.py -b --csharp all
665
666
667 Using the bindings
668 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
669
670 To use the bindings, one has to tell the OS where to find the MuPDF
671 runtime files.
672
673 * C++ and C# bindings:
674
675 * Windows.
676
677 .. code-block:: shell
678
679 set PATH=.../mupdf/build/shared-release-x64-py3.11;%PATH%
680
681 * Replace ``x64`` with ``x32`` if using 32-bit.
682
683 * Replace ``3.11`` with the appropriate python version number.
684
685
686 * Linux, OpenBSD.
687
688 .. code-block:: shell
689
690 LD_LIBRARY_PATH=.../mupdf/build/shared-release
691
692 (``LD_LIBRARY_PATH`` must be an absolute path.)
693
694 * MacOS.
695
696 .. code-block:: shell
697
698 DYLD_LIBRARY_PATH=.../mupdf/build/shared-release
699
700 * Python bindings:
701
702 If the bindings have been built and installed using ``pip install``,
703 they will already be available within the venv.
704
705 Otherwise:
706
707 * Windows.
708
709 .. code-block:: shell
710
711 PYTHONPATH=.../mupdf/build/shared-release-x64-py3.11
712
713 * Replace ``x64`` with ``x32`` if using 32-bit.
714
715 * Replace ``3.11`` with the appropriate python version number.
716
717 * Linux, MacOS, OpenBSD.
718
719 .. code-block:: shell
720
721 PYTHONPATH=.../mupdf/build/shared-release
722
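The settings above can be collected into one helper when launching client
programs from a script. The following is a minimal sketch rather than part of
MuPDF: the function names and the ``system`` labels are assumptions, and the
directory names are taken from the release-build examples in this section.

.. code-block:: python

    import os


    def runtime_env(mupdf_dir, system, cpu='x64', python_version='3.11'):
        """Return {variable_name: directory} for a release build.

        mupdf_dir should be an absolute path to the MuPDF checkout. system
        is one of 'windows', 'linux', 'openbsd' or 'macos'; these labels
        are specific to this sketch.
        """
        if system == 'windows':
            leaf = f'shared-release-{cpu}-py{python_version}'
            library_variable = 'PATH'
        elif system == 'macos':
            leaf = 'shared-release'
            library_variable = 'DYLD_LIBRARY_PATH'
        elif system in ('linux', 'openbsd'):
            leaf = 'shared-release'
            library_variable = 'LD_LIBRARY_PATH'
        else:
            raise ValueError(f'Unrecognised system: {system!r}')
        build_dir = os.path.join(mupdf_dir, 'build', leaf)
        return {library_variable: build_dir, 'PYTHONPATH': build_dir}


    def prepend_to_environment(env, settings):
        """Return a copy of env with each directory prepended to its variable."""
        result = dict(env)
        for name, directory in settings.items():
            old = result.get(name)
            result[name] = directory + os.pathsep + old if old else directory
        return result


    print(runtime_env('/home/user/mupdf', 'linux'))

The result of ``prepend_to_environment(os.environ, ...)`` can be passed as the
``env`` argument of ``subprocess.run()``. ``PYTHONPATH`` is only needed when
the Python bindings were not installed with ``pip``.
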
723
724 Notes
725 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
726
727 * Running tests.
728
729 Basic tests can be run by appending args to the ``scripts/mupdfwrap.py``
730 command.
731
732 This will also demonstrate how to set environment variables such as
733 ``PYTHONPATH`` or ``LD_LIBRARY_PATH`` to the MuPDF build directory.
734
735 * Python tests.
736
737 * ``--test-python``
738 * ``--test-python-gui``
739
740 * C# tests.
741
742 * ``--test-csharp``
743 * ``--test-csharp-gui``
744
745 * C++ tests.
746
747 * ``--test-cpp``
748
749 * C++ bindings and ``NDEBUG``.
750
751 When building client code that uses the C++ bindings, ``NDEBUG`` must
752 be defined/undefined to match how the C++ bindings were built. By
753 default the C++ bindings are a release build with ``NDEBUG`` defined, so
754 usually client code must also be built with ``NDEBUG`` defined. Otherwise
755 there will be build errors for missing C++ destructors, for example
756 ``mupdf::FzMatrix::~FzMatrix()``.
757
758 [This is because we define some destructors in debug builds only; this allows
759 internal reference counting checks.]
760
761 * Specifying the location of Visual Studio's ``devenv.com`` on Windows.
762
763 ``scripts/mupdfwrap.py`` looks for Visual Studio's ``devenv.com`` in
764 standard locations; this can be overridden with:
765
766 .. code-block:: shell
767
768 python scripts/mupdfwrap.py -b --devenv <devenv.com-location> ...
769
770 * Specifying compilers.
771
772 On non-Windows, we use ``cc`` and ``c++`` as default C and C++ compilers;
773 override by setting environment variables ``$CC`` and ``$CXX``.
774
775 * OpenBSD ``libclang``.
776
777 *
778 ``libclang`` cannot be installed with pip on OpenBSD - wheels are not
779 available and building from source fails.
780
781 However unlike on other platforms, the system python-clang package
782 (``py3-llvm``) is integrated with the system's libclang and can be
783 used directly.
784
785 So the above examples use ``pkg_add py3-llvm``.
786
787 * Alternatives to Python package ``libclang`` generally do not work.
788
789 For example pypi.org's `clang <https://pypi.org/project/clang/>`_, or
790 Debian's `python-clang <https://packages.debian.org/search?keywords=python+clang&searchon=names&suite=stable&section=all>`_.
791
792 These are inconvenient to use because they require explicit setting of
793 ``LD_LIBRARY_PATH`` to point to the correct libclang dynamic library.
794
795 * Debug builds.
796
797 One can specify a debug build using the ``-d <build-directory>`` arg
798 before ``-b``.
799
800 .. code-block:: shell
801
802 python ./scripts/mupdfwrap.py -d build/shared-debug -b ...
803
804 *
805 Debug builds of the Python and C# bindings on Windows have not been
806 tested. There may be issues with requiring a debug version of the Python
807 interpreter, for example ``python311_d.lib``.
808
809 *
810 C# build failure: ``cstring.i not implemented for this target`` and/or
811 ``Unknown directive '%cstring_output_allocate'``.
812
813 This is probably because SWIG does not include support for C#. This
814 has been seen in the past but as of 2023-07-19 pypi.org's default swig
815 seems ok.
816
817 A possible solution is to install SWIG using the system package
818 manager, for example ``sudo apt install swig`` on Linux, or use
819 ``./scripts/mupdfwrap.py --swig-windows-auto ...`` on Windows.
820
821
822 * More information about running ``scripts/mupdfwrap.py``.
823
824 * Run ``python ./scripts/mupdfwrap.py -h``.
825 * Read the doc-string at beginning of ``scripts/wrap/__main__.py+``.
826
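When builds and tests are driven from another script, the command line can be
assembled programmatically. The sketch below is an illustration only: the
helper ``mupdfwrap_command()`` is hypothetical, it uses only options that
appear in this document, and it assumes those options can be combined in the
way shown.

.. code-block:: python

    import sys


    def mupdfwrap_command(build='all', build_dir=None, csharp=False, tests=()):
        """Return an argv list for running scripts/mupdfwrap.py."""
        argv = [sys.executable, 'scripts/mupdfwrap.py']
        if build_dir:
            # '-d <build-directory>' is given before '-b'.
            argv += ['-d', build_dir]
        argv.append('-b')
        if csharp:
            argv.append('--csharp')
        argv.append(build)
        # Test options such as '--test-python' are appended at the end.
        argv += list(tests)
        return argv


    print(' '.join(mupdfwrap_command(tests=['--test-python'])[1:]))

The returned list is intended for ``subprocess.run(..., cwd='mupdf',
check=True)``, so that the script is run from the top of the MuPDF checkout.
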
827
828 How ``scripts/mupdfwrap.py`` builds the APIs
829 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
830
831 Building the MuPDF C API
832 """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
833
834 * On Unix, runs ``make`` on MuPDF's ``Makefile`` with ``shared=yes``.
835
836 * On Windows, runs ``devenv.com`` on ``.sln`` and ``.vcxproj`` files within MuPDF's `platform/win32/ <https://git.ghostscript.com/?p=mupdf.git;a=tree;f=platform/win32>`_
837 directory.
838
839 Generation of the MuPDF C++ API
840 """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
841
842 * Uses clang-python to parse MuPDF's C API.
843
844 * Generates C++ code that wraps the basic C interface, converting MuPDF
845 ``setjmp()``/``longjmp()`` exceptions into C++ exceptions and automatically
846 handling ``fz_context``'s internally.
847
848 * Generates C++ wrapper classes for each ``fz_*`` and ``pdf_*`` struct, and uses various
849 heuristics to define constructors, methods and static methods that call
850 ``fz_*()`` and ``pdf_*()`` functions. These classes' constructors and destructors
851 automatically handle reference counting so class instances can be copied
852 arbitrarily.
853
854 * C header file comments are copied into the generated C++ header files.
855
* Compiles and links the generated C++ code to create shared libraries.
857
858
859 Generation of the MuPDF Python and C# APIs
860 """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
861
862 * Uses SWIG to parse the previously-generated C++ headers and generate C++,
863 Python and C# code.
864
865 *
866 Defines some custom-written Python and C# functions and methods, for
867 example so that out-params are returned as tuples.
868
869 * If SWIG is version 4+, C++ comments are converted into Python doc-comments.
870
* Compiles and links the SWIG-generated C++ code to create shared libraries.
872
873
874 Building auto-generated MuPDF API documentation
875 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
876
877 Build HTML documentation for the C, C++ and Python APIs (using Doxygen and pydoc):
878
879 .. code-block:: shell
880
881 python ./scripts/mupdfwrap.py --doc all
882
883 This will generate the following tree:
884
885 .. code-block:: text
886
887 mupdf/docs/generated/
888 index.html
889 c/
890 c++/
891 python/
892
893 All content is ultimately generated from the MuPDF C header file comments.
894
895 As of 2022-2-5, it looks like ``swig -doxygen`` (swig-4.02) ignores
896 single-line ``/** ... */`` comments, so the generated Python code (and
897 hence also Pydoc documentation) is missing information.
898
899 Generated files
900 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
901
902 All generated files are within the MuPDF checkout.
903
904 * C++ headers for the MuPDF C++ API are in ``platform/c++/include/``.
905
906 * Files required at runtime are in ``build/shared-release/``.
907
908 **Details**
909
910 .. code-block:: text
911
912 mupdf/
913 build/
914 shared-release/ [Unix runtime files.]
915 libmupdf.so [MuPDF C API, not MacOS.]
916 libmupdf.dylib [MuPDF C API, MacOS.]
917 libmupdfcpp.so [MuPDF C++ API.]
918 mupdf.py [MuPDF Python API.]
919 _mupdf.so [MuPDF Python API internals.]
920 mupdf.cs [MuPDF C# API.]
921 mupdfcsharp.so [MuPDF C# API internals.]
922
923 shared-debug/
924 [as shared-release but debug build.]
925
926 shared-release-x*-py*/ [Windows runtime files.]
927 mupdfcpp.dll [MuPDF C and C++ API, x32.]
928 mupdfcpp64.dll [MuPDF C and C++ API, x64.]
929 mupdf.py [MuPDF Python API.]
930 _mupdf.pyd [MuPDF Python API internals.]
931 mupdf.cs [MuPDF C# API.]
932 mupdfcsharp.dll [MuPDF C# API internals.]
933
934 platform/
935 c++/
936 include/ [MuPDF C++ API header files.]
937 mupdf/
938 classes.h
939 classes2.h
940 exceptions.h
941 functions.h
942 internal.h
943
944 implementation/ [MuPDF C++ implementation source files.]
945 classes.cpp
946 classes2.cpp
947 exceptions.cpp
948 functions.cpp
949 internal.cpp
950
951 generated.pickle [Information from clang parse step, used by later stages.]
952 windows_mupdf.def [List of MuPDF public global data, used when linking mupdfcpp.dll.]
953
954 python/ [SWIG Python files.]
955 mupdfcpp_swig.i [SWIG input file.]
956 mupdfcpp_swig.i.cpp [SWIG output file.]
957
958 csharp/ [SWIG C# files.]
959 mupdf.cs [SWIG output file, no out-params helpers.]
960 mupdfcpp_swig.i [SWIG input file.]
961 mupdfcpp_swig.i.cpp [SWIG output file.]
962
963 win32/
964 Release/ [Windows 32-bit .dll, .lib, .exp, .pdb etc.]
965 x64/
966 Release/ [Windows 64-bit .dll, .lib, .exp, .pdb etc.]
967 mupdfcpp64.dll [Copied to build/shared-release*/mupdfcpp64.dll]
968 mupdfpyswig.dll [Copied to build/shared-release*/_mupdf.pyd]
969 mupdfcpp64.lib
970 mupdfpyswig.lib
971
972 win32-vs-upgrade/ [used instead of win32/ if PYMUPDF_SETUP_MUPDF_VS_UPGRADE is '1'.]
973
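After a build, a quick check that the runtime files for the Python bindings
are present can save debugging time. This is a sketch under stated
assumptions: the file names are copied from the tree above, while the helper
names and the ``system`` labels are invented for the example.

.. code-block:: python

    import os
    import tempfile


    def python_runtime_files(system, cpu='x64'):
        """List the files the Python bindings need, per the tree above.

        system is 'windows', 'macos', or anything else for other Unix
        systems (labels specific to this sketch).
        """
        if system == 'windows':
            cpp = 'mupdfcpp64.dll' if cpu == 'x64' else 'mupdfcpp.dll'
            return [cpp, 'mupdf.py', '_mupdf.pyd']
        c_library = 'libmupdf.dylib' if system == 'macos' else 'libmupdf.so'
        return [c_library, 'libmupdfcpp.so', 'mupdf.py', '_mupdf.so']


    def missing_files(build_dir, names):
        """Return the names that do not exist as files in build_dir."""
        return [n for n in names
                if not os.path.isfile(os.path.join(build_dir, n))]


    if __name__ == '__main__':
        # A temporary directory stands in for build/shared-release here.
        with tempfile.TemporaryDirectory() as build_dir:
            for name in ('mupdf.py', '_mupdf.so'):
                open(os.path.join(build_dir, name), 'w').close()
            print(missing_files(build_dir, python_runtime_files('linux')))
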
974
975 Windows-specifics
976 ---------------------------------------------------------------
977
978 Required predefined macros
979 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
980
981 Code that will use the MuPDF DLL must be built with ``FZ_DLL_CLIENT``
982 predefined.
983
984 The MuPDF DLL itself is built with ``FZ_DLL`` predefined.
985
986 DLLs
987 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
988
989 There is no separate C library; instead, the C and C++ APIs are
990 both in ``mupdfcpp.dll``, which is built by running devenv on
991 ``platform/win32/mupdf.sln``.
992
993 The Python SWIG library is called ``_mupdf.pyd`` which, despite the name, is a
994 standard Windows DLL, built from ``platform/python/mupdfcpp_swig.i.cpp``.
995
996 DLL export of functions and data
997 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
998
999 On Windows, ``include/mupdf/fitz/export.h`` defines ``FZ_FUNCTION`` and
1000 ``FZ_DATA`` to ``__declspec(dllexport)`` and/or ``__declspec(dllimport)``
1001 depending on whether ``FZ_DLL`` or ``FZ_DLL_CLIENT`` is defined.
1002
1003 All MuPDF C headers prefix declarations of public global data with ``FZ_DATA``.
1004
1005 In generated C++ code:
1006
1007 * Data declarations and definitions are prefixed with ``FZ_DATA``.
1008 * Function declarations and definitions are prefixed with ``FZ_FUNCTION``.
1009 * Class method declarations and definitions are prefixed with ``FZ_FUNCTION``.
1010
1011 When building ``mupdfcpp.dll`` on Windows we link with the auto-generated
1012 ``platform/c++/windows_mupdf.def`` file; this lists all C public global data.
1013
1014 For reasons that are not fully understood, we don't seem to need to tag
1015 C functions with ``FZ_FUNCTION``, but this is required for C++ functions;
1016 otherwise we get unresolved symbols when building MuPDF client code.
1017
1018 Building the DLLs
1019 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1020
1021 We build Windows binaries by running ``devenv.com`` directly.
1022
1023 Building ``_mupdf.pyd`` is tricky because it needs to be built with a
1024 specific ``Python.h`` and linked with a specific ``python.lib``. This is
1025 done by setting environment variables ``MUPDF_PYTHON_INCLUDE_PATH`` and
1026 ``MUPDF_PYTHON_LIBRARY_PATH`` when running ``devenv.com``, which are referenced
1027 by ``platform/win32/mupdfpyswig.vcxproj``. Thus one cannot easily build
1028 ``_mupdf.pyd`` directly from the Visual Studio GUI.
1029
1030 [In the git history there is code that builds ``_mupdf.pyd`` by running the
1031 Windows compiler and linker ``cl.exe`` and ``link.exe`` directly, which avoids
1032 the complications of going via devenv, at the expense of needing to know where
1033 ``cl.exe`` and ``link.exe`` are.]
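
As a rough illustration of the environment-variable approach described above, the following Python sketch assembles (but does not run) a ``devenv.com`` invocation. The solution path and the two variable names come from the text; the configuration string, the ``/Project`` value and the example directories are assumptions made for illustration and are not taken from the MuPDF build scripts.

```python
import os


def devenv_command(python_include, python_libs,
                   devenv="devenv.com",
                   solution="platform/win32/mupdf.sln",
                   config="Release|x64",      # Assumed configuration name.
                   project="mupdfpyswig"):    # Assumed from mupdfpyswig.vcxproj.
    """Return (argv, env) for building the SWIG Python DLL via devenv."""
    env = dict(os.environ)
    # These two variables are referenced by platform/win32/mupdfpyswig.vcxproj.
    env["MUPDF_PYTHON_INCLUDE_PATH"] = python_include
    env["MUPDF_PYTHON_LIBRARY_PATH"] = python_libs
    argv = [devenv, solution, "/Build", config, "/Project", project]
    return argv, env


# Hypothetical locations of Python.h and python.lib.
argv, env = devenv_command(r"C:\Python312\include", r"C:\Python312\libs")
print(" ".join(argv))
# To actually build, one would call subprocess.run(argv, env=env).
```

Keeping the command construction separate from its execution makes it easy to log exactly what was passed to ``devenv.com`` when a build picks up the wrong ``Python.h``.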
1034
1035
1036 C++ bindings details
1037 ---------------------------------------------------------------
1038
1039 Wrapper functions
1040 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1041
1042 Wrappers for a MuPDF function ``fz_foo()`` are available in multiple forms:
1043
1044 * Functions in the ``mupdf`` namespace.
1045
1046 * ``mupdf::ll_fz_foo()``
1047
1048 * Low-level wrapper:
1049
1050 * Does not take ``fz_context*`` arg.
1051 * Translates MuPDF exceptions into C++ exceptions.
1052 * Takes/returns pointers to MuPDF structs.
1053 * Code that uses these functions will need to make explicit calls to
1054 ``fz_keep_*()`` and ``fz_drop_*()``.
1055
1056 * ``mupdf::fz_foo()``
1057
1058 * High-level class-aware wrapper:
1059
1060 * Does not take ``fz_context*`` arg.
1061 * Translates MuPDF exceptions into C++ exceptions.
1062 * Takes references to C++ wrapper class instances instead of pointers to
1063 MuPDF structs.
1064 * Where applicable, returns C++ wrapper class instances instead of
1065 pointers to MuPDF structs.
1066 * Code that uses these functions does not need to call ``fz_keep_*()``
1067 and ``fz_drop_*()`` - C++ wrapper class instances take care of reference
1068 counting internally.
1069
1070 * Class methods
1071
1072 * Where ``fz_foo()`` has a first arg (ignoring any ``fz_context*`` arg) that
1073 takes a pointer to a MuPDF struct ``foo_bar``, it is generally available as a
1074 member function of the wrapper class ``mupdf::FooBar``:
1075
1076 * ``mupdf::FooBar::fz_foo()``
1077
1078 * Apart from being a member function, this is identical to class-aware
1079 wrapper ``mupdf::fz_foo()``, for example taking references to wrapper classes
1080 instead of pointers to MuPDF structs.
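
The difference between the low-level and class-aware forms can be sketched with a toy model in Python. Nothing below is MuPDF code: ``CStruct``, ``keep()``, ``drop()``, ``ll_new_thing()`` and ``Thing`` are hypothetical stand-ins for a reference-counted MuPDF struct, its ``fz_keep_*()``/``fz_drop_*()`` functions, a ``mupdf::ll_*`` wrapper and a wrapper class.

```python
class CStruct:
    """Stand-in for a reference-counted C struct."""
    def __init__(self):
        self.refs = 1          # A new struct starts with one owner.


def keep(s):
    s.refs += 1
    return s


def drop(s):
    s.refs -= 1


def ll_new_thing():
    """Low-level style: returns the raw struct; the caller must drop() it."""
    return CStruct()


class Thing:
    """Class-aware style: owns one reference for its whole lifetime."""
    def __init__(self, raw=None):
        # Takes ownership of `raw` without calling keep().
        self.m_internal = raw if raw is not None else ll_new_thing()

    def copy(self):
        # Models the C++ copy constructor, which calls fz_keep_*().
        return Thing(keep(self.m_internal))

    def close(self):
        # Models the C++ destructor, which calls fz_drop_*().
        drop(self.m_internal)
        self.m_internal = None


# Low-level usage: explicit bookkeeping.
raw = ll_new_thing()
keep(raw)                 # Second owner.
drop(raw)
drop(raw)
print(raw.refs)           # 0

# Class-aware usage: the wrapper does the bookkeeping.
a = Thing()
b = a.copy()
shared = a.m_internal
print(shared.refs)        # 2
a.close()
b.close()
print(shared.refs)        # 0
```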
1081
1082
1083 Constructors using MuPDF functions
1084 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1085
1086 Wrapper class constructors are created for each MuPDF function that returns an
1087 instance of a MuPDF struct.
1088
1089 Sometimes two such functions do not have different arg types so C++
1090 overloading cannot distinguish between them as constructors (because C++
1091 constructors do not have names).
1092
1093 We cope with this in two ways:
1094
1095 * Create a static method that returns a new instance of the wrapper class
1096 by value.
1097
1098 * This is not possible if the underlying MuPDF struct is not copyable - i.e.
1099 not reference counted and not POD.
1100
1101 * Define an enum within the wrapper class, and provide a constructor that takes
1102 an instance of this enum to specify which MuPDF function to use.
1103
1104
1105 Default constructors
1106 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1107
1108 All wrapper classes have a default constructor.
1109
1110 * For POD classes each member is set to a default value with ``this->foo =
1111 {};``. Arrays are initialised by setting all bytes to zero using
1112 ``memset()``.
1113 * For non-POD classes, class member ``m_internal`` is set to ``nullptr``.
1114 * Some classes' default constructors are customized, for example:
1115
1116 * The default constructor for ``fz_color_params`` wrapper
1117 ``mupdf::FzColorParams`` sets state to a copy of
1118 ``fz_default_color_params``.
1119 * The default constructor for ``fz_md5`` wrapper ``mupdf::FzMd5`` sets
1120 state using ``fz_md5_init()``.
1121 * These are described in class definition comments in
1122 ``platform/c++/include/mupdf/classes.h``.
1123
1124
1125 Raw constructors
1126 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1127
1128 Many wrapper classes have constructors that take a pointer to the underlying
1129 MuPDF C struct. These are usually for internal use only. They do not call
1130 ``fz_keep_*()`` - it is expected that any supplied MuPDF struct is already
1131 owned.
1132
1133
1134 POD wrapper classes
1135 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1136
1137 Class wrappers for MuPDF structs default to having a ``m_internal`` member which
1138 points to an instance of the wrapped struct. This works well for MuPDF structs
1139 which support reference counting, because we can automatically create copy
1140 constructors, ``operator=`` functions and destructors that call the associated
1141 ``fz_keep_*()`` and ``fz_drop_*()`` functions.
1142
1143 However where a MuPDF struct does not support reference counting and contains
1144 simple data, it is not safe to copy a pointer to the struct, so the class
1145 wrapper will be a POD class. This is done in one of two ways:
1146
1147 * ``m_internal`` is an instance of the MuPDF struct, not a pointer.
1148
1149 * Sometimes we provide members that give direct access to fields in
1150 ``m_internal``.
1151
1152 * An 'inline' POD - there is no ``m_internal`` member; instead the wrapper class
1153 contains the same members as the MuPDF struct. This can be a little more
1154 convenient to use.
1155
1156
1157 Extra static methods
1158 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1159
1160 Where relevant, a wrapper class can have static methods that wrap selected MuPDF
1161 functions. For example ``FzMatrix`` does this for ``fz_concat()``, ``fz_scale()`` etc.,
1162 because these return the result by value rather than modifying a ``fz_matrix``
1163 instance.
1164
1165
1166 Miscellaneous custom wrapper classes
1167 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1168
1169 The wrapper for ``fz_outline_item`` does not contain a ``fz_outline_item`` by
1170 value or pointer. Instead it defines C++-style member equivalents to
1171 ``fz_outline_item``'s fields, to simplify usage from C++ and Python/C#.
1172
1173 The fields are initialised from a ``fz_outline_item`` when the wrapper class
1174 is constructed. In this particular case there is no need to hold on to a
1175 ``fz_outline_item``, and the use of ``std::string`` ensures that value semantics
1176 can work.
1177
1178
1179 Extra functions in C++, Python and C#
1180 ---------------------------------------------------------------
1181
1182 [These functions are available as low-level functions, class-aware
1183 functions and class methods.]
1184
1185 .. code-block:: c++
1186
1187 /**
1188 C++ alternative to ``fz_lookup_metadata()`` that returns a ``std::string``
1189 or calls ``fz_throw()`` if not found.
1190 */
1191 FZ_FUNCTION std::string fz_lookup_metadata2(fz_context* ctx, fz_document* doc, const char* key);
1192
1193 /**
1194 C++ alternative to ``pdf_lookup_metadata()`` that returns a ``std::string``
1195 or calls ``fz_throw()`` if not found.
1196 */
1197 FZ_FUNCTION std::string pdf_lookup_metadata2(fz_context* ctx, pdf_document* doc, const char* key);
1198
1199 /**
1200 C++ alternative to ``fz_md5_pixmap()`` that returns the digest by value.
1201 */
1202 FZ_FUNCTION std::vector<unsigned char> fz_md5_pixmap2(fz_context* ctx, fz_pixmap* pixmap);
1203
1204 /**
1205 C++ alternative to fz_md5_final() that returns the digest by value.
1206 */
1207 FZ_FUNCTION std::vector<unsigned char> fz_md5_final2(fz_md5* md5);
1208
1209 /** */
1210 FZ_FUNCTION long long fz_pixmap_samples_int(fz_context* ctx, fz_pixmap* pixmap);
1211
1212 /**
1213 Provides simple (but slow) access to pixmap data from Python and C#.
1214 */
1215 FZ_FUNCTION int fz_samples_get(fz_pixmap* pixmap, int offset);
1216
1217 /**
1218 Provides simple (but slow) write access to pixmap data from Python and
1219 C#.
1220 */
1221 FZ_FUNCTION void fz_samples_set(fz_pixmap* pixmap, int offset, int value);
1222
1223 /**
1224 C++ alternative to fz_highlight_selection() that returns quads in a
1225 std::vector.
1226 */
1227 FZ_FUNCTION std::vector<fz_quad> fz_highlight_selection2(fz_context* ctx, fz_stext_page* page, fz_point a, fz_point b, int max_quads);
1228
1229 struct fz_search_page2_hit
1230 {
1231 fz_quad quad;
1232 int mark;
1233 };
1234
1235 /**
1236 C++ alternative to fz_search_page() that returns information in a std::vector.
1237 */
1238 FZ_FUNCTION std::vector<fz_search_page2_hit> fz_search_page2(fz_context* ctx, fz_document* doc, int number, const char* needle, int hit_max);
1239
1240 /**
1241 C++ alternative to fz_string_from_text_language() that returns information in a std::string.
1242 */
1243 FZ_FUNCTION std::string fz_string_from_text_language2(fz_text_language lang);
1244
1245 /**
1246 C++ alternative to fz_get_glyph_name() that returns information in a std::string.
1247 */
1248 FZ_FUNCTION std::string fz_get_glyph_name2(fz_context* ctx, fz_font* font, int glyph);
1249
1250 /**
1251 Extra struct containing fz_install_load_system_font_funcs()'s args,
1252 which we wrap with virtual_fnptrs set to allow use from Python/C# via
1253 Swig Directors.
1254 */
1255 typedef struct fz_install_load_system_font_funcs_args
1256 {
1257 fz_load_system_font_fn* f;
1258 fz_load_system_cjk_font_fn* f_cjk;
1259 fz_load_system_fallback_font_fn* f_fallback;
1260 } fz_install_load_system_font_funcs_args;
1261
1262 /**
1263 Alternative to fz_install_load_system_font_funcs() that takes args in a
1264 struct, to allow use from Python/C# via Swig Directors.
1265 */
1266 FZ_FUNCTION void fz_install_load_system_font_funcs2(fz_context* ctx, fz_install_load_system_font_funcs_args* args);
1267
1268 /** Internal singleton state to allow Swig Director class to find
1269 fz_install_load_system_font_funcs_args class wrapper instance. */
1270 FZ_DATA extern void* fz_install_load_system_font_funcs2_state;
1271
1272 /** Helper for calling ``fz_document_handler::open`` function pointer via
1273 Swig from Python/C#. */
1274 FZ_FUNCTION fz_document* fz_document_handler_open(fz_context* ctx, const fz_document_handler *handler, fz_stream* stream, fz_stream* accel, fz_archive* dir, void* recognize_state);
1275
1276 /** Helper for calling a ``fz_document_handler::recognize`` function
1277 pointer via Swig from Python/C#. */
1278 FZ_FUNCTION int fz_document_handler_recognize(fz_context* ctx, const fz_document_handler *handler, const char *magic);
1279
1280 /** Swig-friendly wrapper for pdf_choice_widget_options(), returns the
1281 options directly in a vector. */
1282 FZ_FUNCTION std::vector<std::string> pdf_choice_widget_options2(fz_context* ctx, pdf_annot* tw, int exportval);
1283
1284 /** Swig-friendly wrapper for fz_new_image_from_compressed_buffer(),
1285 uses specified ``decode`` and ``colorkey`` if they are not null (in which
1286 case we assert that they have size ``2*fz_colorspace_n(colorspace)``). */
1287 FZ_FUNCTION fz_image* fz_new_image_from_compressed_buffer2(
1288 fz_context* ctx,
1289 int w,
1290 int h,
1291 int bpc,
1292 fz_colorspace* colorspace,
1293 int xres,
1294 int yres,
1295 int interpolate,
1296 int imagemask,
1297 const std::vector<float>& decode,
1298 const std::vector<int>& colorkey,
1299 fz_compressed_buffer* buffer,
1300 fz_image* mask
1301 );
1302
1303 /** Swig-friendly wrapper for pdf_rearrange_pages(). */
1304 void pdf_rearrange_pages2(
1305 fz_context* ctx,
1306 pdf_document* doc,
1307 const std::vector<int>& pages,
1308 pdf_clean_options_structure structure
1309 );
1310
1311 /** Swig-friendly wrapper for pdf_subset_fonts(). */
1312 void pdf_subset_fonts2(fz_context *ctx, pdf_document *doc, const std::vector<int>& pages);
1313
1314 /** Swig-friendly and typesafe way to do fz_snprintf(fmt, value). ``fmt``
1315 must end with one of 'efg' otherwise we throw an exception. */
1316 std::string fz_format_double(fz_context* ctx, const char* fmt, double value);
1317
1318 struct fz_font_ucs_gid
1319 {
1320 unsigned long ucs;
1321 unsigned int gid;
1322 };
1323
1324 /** SWIG-friendly wrapper for fz_enumerate_font_cmap(). */
1325 std::vector<fz_font_ucs_gid> fz_enumerate_font_cmap2(fz_context* ctx, fz_font* font);
1326
1327 /** SWIG-friendly wrapper for pdf_set_annot_callout_line(). */
1328 void pdf_set_annot_callout_line2(fz_context *ctx, pdf_annot *annot, std::vector<fz_point>& callout);
1329
1330 /** SWIG-friendly wrapper for fz_decode_barcode_from_display_list(),
1331 avoiding leak of the returned string. */
1332 std::string fz_decode_barcode_from_display_list2(fz_context *ctx, fz_barcode_type *type, fz_display_list *list, fz_rect subarea, int rotate);
1333
1334 /** SWIG-friendly wrapper for fz_decode_barcode_from_pixmap(), avoiding
1335 leak of the returned string. */
1336 std::string fz_decode_barcode_from_pixmap2(fz_context *ctx, fz_barcode_type *type, fz_pixmap *pix, int rotate);
1337
1338 /** SWIG-friendly wrapper for fz_decode_barcode_from_page(), avoiding
1339 leak of the returned string. */
1340 std::string fz_decode_barcode_from_page2(fz_context *ctx, fz_barcode_type *type, fz_page *page, fz_rect subarea, int rotate);
1341
1342
1343 Python/C# bindings details
1344 ---------------------------------------------------------------
1345
1346 Extra Python functions
1347 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1348
1349 Access to raw C arrays
1350 """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1351
1352 The following functions can be used from Python to get access to raw data:
1353
1354 *
1355 ``mupdf.bytes_getitem(array, index)``: Gives access to individual items
1356 in an array of ``unsigned char``'s, for example in the data returned by
1357 ``mupdf::FzPixmap``'s ``samples()`` method.
1358
1359 *
1360 ``mupdf.floats_getitem(array, index)``: Gives access to individual items in an
1361 array of ``float``'s, for example in ``fz_stroke_state``'s ``float dash_list[32]``
1362 array. Generated with SWIG code ``carrays.i`` and ``array_functions(float,
1363 floats);``.
1364
1365 *
1366 ``mupdf.python_buffer_data(b)``: returns a SWIG wrapper for a ``const unsigned
1367 char*`` pointing to a Python buffer instance's raw data. For example ``b`` can
1368 be a Python ``bytes`` or ``bytearray`` instance.
1369
1370 *
1371 ``mupdf.python_mutable_buffer_data(b)``: returns a SWIG wrapper for an ``unsigned
1372 char*`` pointing to a Python buffer instance's raw data. For example ``b`` can
1373 be a Python ``bytearray`` instance.
1374
1375 [These functions are implemented internally using SWIG's ``carrays.i`` and
1376 ``pybuffer.i``.]
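
The distinction between the two ``*_buffer_data()`` functions mirrors Python's own buffer protocol, where ``bytes`` exposes read-only memory and ``bytearray`` exposes writable memory. The following sketch uses only built-in types and does not call MuPDF; it shows the difference through ``memoryview``.

```python
readonly_view = memoryview(b"abc")
writable_view = memoryview(bytearray(b"abc"))

print(readonly_view.readonly)   # True
print(writable_view.readonly)   # False

# Writing through the writable view changes the underlying bytearray.
writable_view[0] = ord("x")
print(bytes(writable_view))     # b'xbc'

# Writing through the read-only view is rejected.
try:
    readonly_view[0] = ord("x")
except TypeError as e:
    print(f"rejected: {e}")
```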
1377
1378
1379 Python differences from C API
1380 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1381
1382 [The functions described below are also available as class methods.]
1383
1384
1385 Custom methods
1386 """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1387
1388 Python and C# code does not easily handle functions that return raw data, for example
1389 as an ``unsigned char*`` that is not a zero-terminated string. Sometimes we provide a
1390 C++ method that returns a ``std::vector`` by value, so that Python and C# code can
1391 wrap it in a systematic way.
1392
1393 For example ``FzMd5::fz_md5_final2()``.
1394
1395 For all functions described below, there is also a ``ll_*`` variant that
1396 takes/returns raw MuPDF structs instead of wrapper classes.
1397
1398
1399 New functions
1400 """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1401
1402 * ``fz_buffer_extract_copy()``: Returns a copy of the buffer data as a Python ``bytes``.
1403 * ``fz_buffer_storage_memoryview(buffer, writable)``: Returns a read-only or writable Python ``memoryview`` onto ``buffer``.
1404 Relies on ``buffer`` existing and not changing size while the memory view is used.
1405 * ``fz_pixmap_samples_memoryview()``: Returns a Python ``memoryview`` onto ``fz_pixmap`` data.
1406
1407 * ``fz_lookup_metadata2(fzdocument, key)``: Returns the key's value, or raises an exception if not found.
1408 * ``pdf_lookup_metadata2(pdfdocument, key)``: Returns the key's value, or raises an exception if not found.
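
The caveat about the buffer not changing size can be illustrated with plain Python objects. A ``bytearray`` refuses to be resized while a ``memoryview`` onto it exists; the note above implies that a memory view onto MuPDF-owned storage has no equivalent safeguard, so the caller has to keep the buffer alive and unchanged. This sketch uses only built-in types and does not call MuPDF.

```python
storage = bytearray(b"0123456789")
view = memoryview(storage)

try:
    storage.extend(b"more")     # Would resize the memory the view points at.
except BufferError as e:
    print(f"resize refused: {e}")

view.release()                  # Once the view is gone, resizing is allowed.
storage.extend(b"more")
print(len(storage))             # 14
```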
1409
1410 Implemented in Python
1411 """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1412
1413 * ``fz_format_output_path()``
1414 * ``fz_story_positions()``
1415 * ``pdf_dict_getl()``
1416 * ``pdf_dict_putl()``
1417
1418 Non-standard API or implementation
1419 """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1420
1421 * ``fz_buffer_extract()``: Returns a *copy* of the original buffer data as a Python ``bytes``. Still clears the buffer.
1422 * ``fz_buffer_storage()``: Returns ``(size, data)`` where ``data`` is a low-level SWIG representation of the buffer's storage.
1423 * ``fz_convert_color()``: No ``float* fv`` param, instead returns ``(rgb0, rgb1, rgb2, rgb3)``.
1424 * ``fz_fill_text()``: ``color`` arg is tuple/list of 1-4 floats.
1425 * ``fz_lookup_metadata(fzdocument, key)``: Returns the key's value, or ``None`` if not found.
1426 * ``fz_new_buffer_from_copied_data()``: Takes a Python ``bytes`` (or other Python buffer) instance.
1427 * ``fz_set_error_callback()``: Takes a Python callable; no ``void* user`` arg.
1428 * ``fz_set_warning_callback()``: Takes a Python callable; no ``void* user`` arg.
1429 * ``fz_warn()``: Takes single Python ``str`` arg.
1430 * ``pdf_dict_putl_drop()``: Always raises exception because not useful with automatic ref-counts.
1431 * ``pdf_load_field_name()``: Uses extra C++ function ``pdf_load_field_name2()`` which returns ``std::string`` by value.
1432 * ``pdf_lookup_metadata(pdfdocument, key)``: Returns the key's value, or ``None`` if not found.
1433 * ``pdf_set_annot_color()``: Takes single ``color`` arg which must be float or tuple of 1-4 floats.
1434 * ``pdf_set_annot_interior_color()``: Takes single ``color`` arg which must be float or tuple of 1-4 floats.
1435 * ``fz_install_load_system_font_funcs()``: Takes Python callbacks with no ``ctx`` arg,
1436 which can return ``None``, ``fz_font*`` or a ``mupdf.FzFont``.
1437
1438 Example usage (from ``scripts/mupdfwrap_test.py:test_install_load_system_font()``)::
1439
1440 def font_f(name, bold, italic, needs_exact_metrics):
1441 print(f'font_f(): Looking for font: {name=} {bold=} {italic=} {needs_exact_metrics=}.')
1442 return mupdf.fz_new_font_from_file(...)
1443 def f_cjk(name, ordering, serif):
1444 print(f'f_cjk(): Looking for font: {name=} {ordering=} {serif=}.')
1445 return None
1446 def f_fallback(script, language, serif, bold, italic):
1447 print(f'f_fallback(): looking for font: {script=} {language=} {serif=} {bold=} {italic=}.')
1448 return None
1449 mupdf.fz_install_load_system_font_funcs(font_f, f_cjk, f_fallback)
1450
1451
1452 Making MuPDF function pointers call Python code
1453 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1454
1455 Overview
1456 """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1457
1458 For MuPDF structs with function pointers, we provide a second C++ wrapper
1459 class for use by the Python bindings.
1460
1461 * The second wrapper class has a ``2`` suffix, for example ``PdfFilterOptions2``.
1462
1463 * This second wrapper class has a virtual method for each function pointer, so
1464 it can be used as a `SWIG Director class <https://swig.org/Doc4.0/SWIGDocumentation.html#SWIGPlus_target_language_callbacks>`_.
1465
1466 * Overriding a virtual method in Python results in the Python method being
1467 called when MuPDF C code calls the corresponding function pointer.
1468
1469 * One needs to activate the use of a Python method as a callback by calling the
1470 special method ``use_virtual_<method-name>()``. [It might be possible in future
1471 to remove the need to do this.]
1472
1473 * It may be possible to use similar techniques in C# but this has not been
1474 tried.
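
The need for ``use_virtual_<method-name>()`` can be pictured with a small pure-Python model. This is not how the generated bindings are implemented; ``CFilterOptions``, ``FilterOptions2`` and ``run_c_code()`` are hypothetical stand-ins for a C struct of function pointers, its ``2``-suffixed wrapper and MuPDF C code that invokes the pointer.

```python
class CFilterOptions:
    """Stand-in for a C struct; a None slot models a NULL function pointer."""
    def __init__(self):
        self.text_filter = None


class FilterOptions2:
    """Stand-in for a director wrapper with one virtual method."""
    def __init__(self):
        self.m_internal = CFilterOptions()

    def text_filter(self, ctx, text):
        raise NotImplementedError("override in a subclass")

    def use_virtual_text_filter(self):
        # Point the C-level slot at a trampoline that calls the
        # (possibly overridden) Python method.
        self.m_internal.text_filter = lambda ctx, text: self.text_filter(ctx, text)


def run_c_code(options, ctx, text):
    """Stand-in for C code: only calls the pointer if it is non-NULL."""
    if options.text_filter is None:
        return "default"
    return options.text_filter(ctx, text)


class MyFilter(FilterOptions2):
    def text_filter(self, ctx, text):
        return f"python saw {text!r}"


f = MyFilter()
print(run_c_code(f.m_internal, None, "abc"))   # default
f.use_virtual_text_filter()
print(run_c_code(f.m_internal, None, "abc"))   # python saw 'abc'
```

In this model, overriding the method alone changes nothing until the slot is explicitly pointed at the trampoline, which is one way to read the activation step described above.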
1475
1476
1477 Callback args
1478 """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1479
1480 Python callbacks have args that are more low-level than in the rest of the
1481 Python API:
1482
1483 * Callbacks generally have a first arg that is a SWIG representation of a MuPDF
1484 ``fz_context*``.
1485
1486 * Where the underlying MuPDF function pointer has an arg that is a pointer to
1487 a MuPDF struct, unlike elsewhere in the MuPDF bindings we do not translate
1488 this into an instance of the corresponding wrapper class. Instead Python
1489 callbacks will see a SWIG representation of the low-level C pointer.
1490
1491 * It is not safe to construct a Python wrapper class instance directly from
1492 such a SWIG representation of a C pointer, because it will break MuPDF's
1493 reference counting - Python/C++ constructors that take a raw pointer to a
1494 MuPDF struct do not call ``fz_keep_*()`` but the corresponding Python/C++
1495 destructor will call ``fz_drop_*()``.
1496
1497 * It might be safe to create a wrapper class instance using an explicit call
1498 to ``mupdf.fz_keep_*()``, but this has not been tried.
1499
1500 * As of 2023-02-03, exceptions from Python callbacks are propagated back
1501 through the Python, C++, C, C++ and Python layers. The resulting Python
1502 exception will have the original exception text, but the original Python
1503 backtrace is lost.
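
The reference-counting hazard described in this list can be shown with a toy model. ``CStruct``, ``keep()``, ``drop()`` and ``Wrapper`` are hypothetical stand-ins, not MuPDF APIs; the wrapper mimics a raw constructor that does not keep and a destructor that always drops.

```python
class CStruct:
    """Stand-in for a reference-counted C struct owned by the caller."""
    def __init__(self):
        self.refs = 1


def keep(s):
    s.refs += 1
    return s


def drop(s):
    s.refs -= 1


class Wrapper:
    def __init__(self, raw):
        self.raw = raw            # Raw constructor: no keep().

    def close(self):
        drop(self.raw)            # Destructor: always drops.


def callback_unsafe(borrowed):
    w = Wrapper(borrowed)         # Uses a reference the callback does not own.
    w.close()


def callback_with_keep(borrowed):
    w = Wrapper(keep(borrowed))   # Take our own reference first.
    w.close()


owner_a = CStruct()
callback_unsafe(owner_a)
print(owner_a.refs)               # 0: the caller's reference has been dropped.

owner_b = CStruct()
callback_with_keep(owner_b)
print(owner_b.refs)               # 1: the caller's reference is intact.
```

In this model the explicit keep balances the count; whether the same holds with the real bindings is, as noted above, untested.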
1504
1505
1506 Exceptions in callbacks
1507 """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1508
1509 Python exceptions in Director callbacks are propagated back through the
1510 language layers (from Python to C++ to C, then back to C++ and finally to
1511 Python).
1512
1513 For convenience we add a text representation of the original Python backtrace
1514 to the exception text, but the C layer's fz_try/catch exception handling only
1515 holds 256 characters of exception text, so this backtrace information may be
1516 truncated by the time the exception reaches the original Python code's ``except ...`` block.
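
A minimal sketch of the truncation effect, using only the standard library. The 256-character limit is the figure quoted above; ``MAX_TEXT`` and ``to_c_message()`` are hypothetical names, and the real bindings may format the text differently.

```python
import traceback

MAX_TEXT = 256   # Limit quoted in the text above.


def to_c_message(exc):
    """Join exception text and backtrace, then truncate to the C-layer limit."""
    backtrace = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    full = f"{exc}\n{backtrace}"
    return full[:MAX_TEXT], len(full)


def failing_callback():
    raise ValueError("bad value in callback " + "x" * 200)


try:
    failing_callback()
except ValueError as e:
    message, full_length = to_c_message(e)

print(len(message))              # 256
print(full_length > MAX_TEXT)    # True
```

Putting the exception text before the backtrace, as done here, means that the part most useful to the final ``except`` block survives truncation.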
1517
1518 Example
1519 """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1520
1521 Here is an example PDF filter written in Python that removes alternating items:
1522
1523 **Details**
1524
1525 |expand_begin|
1526
1527 .. code-block:: python
1528
1529     import mupdf
1530
1531     def test_filter(path):
1532         class MyFilter( mupdf.PdfFilterOptions2):
1533             def __init__( self):
1534                 super().__init__()
1535                 self.use_virtual_text_filter()
1536                 self.recurse = 1
1537                 self.sanitize = 1
1538                 self.state = 1
1539                 self.ascii = True
1540             def text_filter( self, ctx, ucsbuf, ucslen, trm, ctm, bbox):
1541                 print( f'text_filter(): ctx={ctx} ucsbuf={ucsbuf} ucslen={ucslen} trm={trm} ctm={ctm} bbox={bbox}')
1542                 # Remove every other item.
1543                 self.state = 1 - self.state
1544                 return self.state
1545
1546         filter_ = MyFilter()
1547
1548         document = mupdf.PdfDocument(path)
1549         for p in range(document.pdf_count_pages()):
1550             page = document.pdf_load_page(p)
1551             print( f'Running document.pdf_filter_page_contents on page {p}')
1552             document.pdf_begin_operation('test filter')
1553             document.pdf_filter_page_contents(page, filter_)
1554             document.pdf_end_operation()
1555
1556         document.pdf_save_document('foo.pdf', mupdf.PdfWriteOptions())
1557
1558 |expand_end|
1559
1560
1561
1562
1563
1564
1565
1566
1567 .. External links