comparison mupdf-source/docs/reference/swig.rst @ 2:b50eed0cc0ef upstream
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
The directory name has changed: no version number in the expanded directory now.
| author | Franz Glasner <fzglas.hg@dom66.de> |
|---|---|
| date | Mon, 15 Sep 2025 11:43:07 +0200 |
| parents | |
| children |
| 1:1d09e1dec1d9 | 2:b50eed0cc0ef |
|---|---|
| 1 .. Copyright (C) 2001-2025 Artifex Software, Inc. | |
| 2 .. All Rights Reserved. | |
| 3 | |
| 4 | |
| 5 .. meta:: | |
| 6 :description: MuPDF documentation | |
| 7 :keywords: MuPDF, pdf, epub | |
| 8 | |
| 9 | |
| 10 C++, Python, and C# | |
| 11 =============================================================== | |
| 12 | |
| 13 .. | |
| 14 We define crude substitutions that implement simple expand/contract blocks | |
| 15 in html. Unfortunately it doesn't seem possible to pass parameters to | |
| 16 substitutions so we can't specify text to be shown next to html's details | |
| 17 triangle. | |
| 18 | |
| 19 .. |expand_begin| raw:: html | |
| 20 | |
| 21 <details> | |
| 22 <summary><strong>Show/hide</strong></summary> | |
| 23 | |
| 24 .. |expand_end| raw:: html | |
| 25 | |
| 26 </details> | |
| 27 | |
| 28 | |
| 29 Overview | |
| 30 --------------------------------------------------------------- | |
| 31 | |
| 32 Auto-generated abstracted :title:`C++`, :title:`Python` and :title:`C#` | |
| 33 versions of the :title:`MuPDF C API` are available. | |
| 34 | |
| 35 * | |
| 36 The C++ API is machine-generated from the C API header files and adds various | |
| 37 abstractions such as automatic contexts and automatic reference counting. | |
| 38 | |
| 39 * | |
| 40 The Python and C# APIs are generated from the C++ API using SWIG, so | |
| 41 automatically include the C++ API's abstractions. | |
| 42 | |
| 43 .. graphviz:: | |
| 44 | |
| 45 digraph | |
| 46 { | |
| 47 size="4,4"; | |
| 48 labeljust=l; | |
| 49 | |
| 50 "MuPDF C API" [shape="rectangle"] | |
| 51 "MuPDF C++ API" [shape="rectangle"] | |
| 52 "SWIG" [shape="oval"] | |
| 53 "MuPDF Python API" [shape="rectangle"] | |
| 54 "MuPDF C# API" [shape="rectangle"] | |
| 55 | |
| 56 "MuPDF C API" -> "MuPDF C++ API" [label=" Parse C headers with libclang,\l generate abstractions.\l"] | |
| 57 | |
| 58 "MuPDF C++ API" -> "SWIG" [label=" Parse C++ headers with SWIG."] | |
| 59 "SWIG" -> "MuPDF Python API" | |
| 60 "SWIG" -> "MuPDF C# API" | |
| 61 } | |
| 62 | |
| 63 | |
| 64 The C++ MuPDF API | |
| 65 --------------------------------------------------------------- | |
| 66 | |
| 67 Basics | |
| 68 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 69 | |
| 70 * Auto-generated from the MuPDF C API's header files. | |
| 71 | |
| 72 * Everything is in C++ namespace ``mupdf``. | |
| 73 | |
| 74 * No function or method takes a ``fz_context*`` argument. | |
| 75 (Automatically-generated per-thread contexts are used internally.) | |
| 76 | |
| 77 * All MuPDF ``setjmp()``/``longjmp()``-based exceptions are converted into C++ exceptions. | |
| 78 | |
| 79 Low-level C++ API | |
| 80 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 81 | |
| 82 The MuPDF C API is provided as low-level C++ functions with ``ll_`` prefixes. | |
| 83 | |
| 84 * No ``fz_context*`` arguments. | |
| 85 | |
| 86 * MuPDF exceptions are converted into C++ exceptions. | |
| 87 | |
| 88 Class-aware C++ API | |
| 89 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 90 | |
| 91 C++ wrapper classes wrap most ``fz_*`` and ``pdf_*`` C structs: | |
| 92 | |
| 93 * Class names are camel-case versions of the wrapped struct's | |
| 94 name, for example ``fz_document``'s wrapper class is ``mupdf::FzDocument``. | |
| 95 | |
| 96 * Classes automatically handle reference counting of the underlying C structs, | |
| 97 so there is no need for manual calls to ``fz_keep_*()`` and ``fz_drop_*()``, and | |
| 98 class instances can be treated as values and copied arbitrarily. | |
| 99 | |
| 100 Class-aware functions and methods take and return wrapper class instances | |
| 101 instead of MuPDF C structs: | |
| 102 | |
| 103 * No ``fz_context*`` arguments. | |
| 104 | |
| 105 * MuPDF exceptions are converted into C++ exceptions. | |
| 106 | |
| 107 * Class-aware functions have the same names as the underlying C API function. | |
| 108 | |
| 109 * Arguments that are pointers to a MuPDF struct are changed to references to | |
| 110 the corresponding wrapper class. | |
| 111 | |
| 112 * Where a MuPDF function returns a pointer to a struct, the class-aware C++ | |
| 113 wrapper will return a wrapper class instance by value. | |
| 114 | |
| 115 * Class-aware functions that have a C++ wrapper class as their first parameter | |
| 116 are also provided as a member function of the wrapper class, with the same | |
| 117 name as the class-aware function. | |
| 118 | |
| 119 * Wrapper classes are defined in ``mupdf/platform/c++/include/mupdf/classes.h``. | |
| 120 | |
| 121 * Class-aware functions are declared in ``mupdf/platform/c++/include/mupdf/classes2.h``. | |
| 122 | |
| 123 * | |
| 124 Wrapper classes for reference-counted MuPDF structs: | |
| 125 | |
| 126 * | |
| 127 The C++ wrapper classes will have a public ``m_internal`` member that is a | |
| 128 pointer to the underlying MuPDF struct. | |
| 129 | |
| 130 * | |
| 131 If a MuPDF C function returns a null pointer to a MuPDF struct, the | |
| 132 class-aware C++ wrapper will return an instance of the wrapper class with a | |
| 133 null ``m_internal`` member. | |
| 134 | |
| 135 * | |
| 136 The C++ wrapper class will have an ``operator bool()`` that returns true if | |
| 137 the ``m_internal`` member is non-null. | |
| 138 | |
| 139 [Introduced 2024-07-08.] | |
| 140 | |
| 141 Usually it is more convenient to use the class-aware C++ API rather than the | |
| 142 low-level C++ API. | |
| 143 | |
| 144 C++ Exceptions | |
| 145 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 146 | |
| 147 There is one C++ exception class for each ``FZ_ERROR_*`` enum value, all derived from | |
| 148 ``mupdf::FzErrorBase``, which in turn derives from ``std::exception``. | |
| 149 | |
| 150 For example if MuPDF C code does ``fz_throw(ctx, FZ_ERROR_GENERIC, | |
| 151 "something failed")``, this will appear as a C++ exception with type | |
| 152 ``mupdf::FzErrorGeneric``. Its ``what()`` method will return ``code=2: something | |
| 153 failed``, and it will have a public member ``m_code`` set to ``FZ_ERROR_GENERIC``. | |
| 154 | |
| 155 Example wrappers | |
| 156 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 157 | |
| 158 The MuPDF C API function ``fz_new_buffer_from_page()`` is available as these | |
| 159 C++ functions/methods: | |
| 160 | |
| 161 .. code-block:: c++ | |
| 162 | |
| 163 // MuPDF C function. | |
| 164 fz_buffer *fz_new_buffer_from_page(fz_context *ctx, fz_page *page, const fz_stext_options *options); | |
| 165 | |
| 166 // MuPDF C++ wrappers. | |
| 167 namespace mupdf | |
| 168 { | |
| 169 // Low-level wrapper: | |
| 170 ::fz_buffer *ll_fz_new_buffer_from_page(::fz_page *page, const ::fz_stext_options *options); | |
| 171 | |
| 172 // Class-aware wrapper: | |
| 173 FzBuffer fz_new_buffer_from_page(const FzPage& page, FzStextOptions& options); | |
| 174 | |
| 175 // Method in wrapper class FzPage: | |
| 176 struct FzPage | |
| 177 { | |
| 178 ... | |
| 179 FzBuffer fz_new_buffer_from_page(FzStextOptions& options); | |
| 180 ... | |
| 181 }; | |
| 182 } | |
| 183 | |
| 184 | |
| 185 Extensions beyond the basic C API | |
| 186 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 187 | |
| 188 * Some generated classes have extra ``begin()`` and ``end()`` methods to allow | |
| 189 standard C++ iteration: | |
| 190 | |
| 191 |expand_begin| | |
| 192 | |
| 193 .. code-block:: c++ | |
| 194 | |
| 195 #include "mupdf/classes.h" | |
| 196 #include "mupdf/functions.h" | |
| 197 | |
| 198 #include <iostream> | |
| 199 | |
| 200 void show_stext(mupdf::FzStextPage& page) | |
| 201 { | |
| 202 for (mupdf::FzStextPage::iterator it_page: page) | |
| 203 { | |
| 204 mupdf::FzStextBlock block = *it_page; | |
| 205 for (mupdf::FzStextBlock::iterator it_block: block) | |
| 206 { | |
| 207 mupdf::FzStextLine line = *it_block; | |
| 208 for (mupdf::FzStextLine::iterator it_line: line) | |
| 209 { | |
| 210 mupdf::FzStextChar stextchar = *it_line; | |
| 211 fz_stext_char* c = stextchar.m_internal; | |
| 212 using namespace mupdf; | |
| 213 std::cout << "FzStextChar(" | |
| 214 << "c=" << c->c | |
| 215 << " color=" << c->color | |
| 216 << " origin=" << c->origin | |
| 217 << " quad=" << c->quad | |
| 218 << " size=" << c->size | |
| 219 << " font_name=" << c->font->name | |
| 220 << ")\n"; | |
| 221 } | |
| 222 } | |
| 223 } | |
| 224 } | |
| 225 | |
| 226 |expand_end| | |
| 227 | |
| 228 * There are various custom class methods and constructors. | |
| 229 | |
| 230 * There are extra functions for generating a text representation of 'POD' | |
| 231 (plain old data) structs and their C++ wrapper classes. | |
| 232 | |
| 233 For example for ``fz_rect`` we provide these functions: | |
| 234 | |
| 235 .. code-block:: c++ | |
| 236 | |
| 237 std::ostream& operator<< (std::ostream& out, const fz_rect& rhs); | |
| 238 std::ostream& operator<< (std::ostream& out, const FzRect& rhs); | |
| 239 std::string to_string_fz_rect(const fz_rect& s); | |
| 240 std::string to_string(const fz_rect& s); | |
| 241 std::string FzRect::to_string() const; | |
| 242 | |
| 243 These each generate text such as: ``(x0=90.51 y0=160.65 x1=501.39 y1=1215.6)`` | |
| 244 | |
| 245 Runtime environmental variables | |
| 246 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 247 | |
| 248 All builds | |
| 249 """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
| 250 | |
| 251 * **MUPDF_mt_ctx** | |
| 252 | |
| 253 Controls support for multi-threading on startup. | |
| 254 | |
| 255 * If set with value ``0``, a single ``fz_context*`` is used for all threads; this | |
| 256 might give a small performance increase in single-threaded programmes, but | |
| 257 will be unsafe in multi-threaded programmes. | |
| 258 | |
| 259 * Otherwise each thread has its own ``fz_context*``. | |
| 260 | |
| 261 One can instead call ``mupdf::reinit_singlethreaded()`` on startup to force | |
| 262 single-threaded mode. This should be done before any other use of MuPDF. | |
| 263 | |
| 264 Debug builds only | |
| 265 """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
| 266 | |
| 267 Debug builds contain diagnostics/checking code that is activated via these | |
| 268 environmental variables: | |
| 269 | |
| 270 * **MUPDF_check_refs** | |
| 271 | |
| 272 If ``1``, generated code checks MuPDF struct reference counts at | |
| 273 runtime. | |
| 274 | |
| 275 * **MUPDF_check_error_stack** | |
| 276 | |
| 277 If ``1``, generated code outputs a diagnostic if a MuPDF function changes the | |
| 278 current ``fz_context``'s error stack depth. | |
| 279 | |
| 280 * **MUPDF_trace** | |
| 281 | |
| 282 If ``1`` or ``2``, class-aware code outputs a diagnostic each time it calls a | |
| 283 MuPDF function (apart from keep/drop functions). | |
| 284 | |
| 285 If ``2``, low-level wrappers output a diagnostic each time they are | |
| 286 called. We also show arg POD and pointer values. | |
| 287 | |
| 288 * **MUPDF_trace_director** | |
| 289 | |
| 290 If ``1``, generated code outputs a diagnostic when doing special | |
| 291 handling of MuPDF structs containing function pointers. | |
| 292 | |
| 293 * **MUPDF_trace_exceptions** | |
| 294 | |
| 295 If ``1``, generated code outputs diagnostics when it converts MuPDF | |
| 296 ``setjmp()``/``longjmp()`` exceptions into C++ exceptions. | |
| 297 | |
| 298 * **MUPDF_trace_keepdrop** | |
| 299 | |
| 300 If ``1``, generated code outputs diagnostics for calls to ``*_keep_*()`` and | |
| 301 ``*_drop_*()``. | |
| 302 | |
| 303 Limitations | |
| 304 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 305 | |
| 306 * Global instances of C++ wrapper classes are not supported. | |
| 307 | |
| 308 This is because: | |
| 309 | |
| 310 * C++ wrapper class destructors generally call MuPDF functions (for example | |
| 311 ``fz_drop_*()``). | |
| 312 | |
| 313 * The C++ bindings use internal thread-local objects to allow per-thread | |
| 314 ``fz_context``'s to be efficiently obtained for use with underlying MuPDF | |
| 315 functions. | |
| 316 | |
| 317 * C++ globals are destructed *after* thread-local objects are destructed. | |
| 318 | |
| 319 So if a global instance of a C++ wrapper class is created, its destructor | |
| 320 will attempt to get a ``fz_context*`` using internal thread-local objects | |
| 321 which will have already been destroyed. | |
| 322 | |
| 323 We attempt to display a diagnostic when this happens, but this cannot be | |
| 324 relied on as behaviour is formally undefined. | |
| 325 | |
| 326 | |
| 327 The Python and C# MuPDF APIs | |
| 328 --------------------------------------------------------------- | |
| 329 | |
| 330 * A Python module called ``mupdf``. | |
| 331 * A C# namespace called ``mupdf``. | |
| 332 | |
| 333 * Auto-generated from the C++ MuPDF API using SWIG, so inherits the abstractions of the C++ API: | |
| 334 | |
| 335 * No ``fz_context*`` arguments. | |
| 336 * Automatic reference counting, so no need to call ``fz_keep_*()`` or ``fz_drop_*()``, and we have value-semantics for class instances. | |
| 337 * Native Python and C# exceptions. | |
| 338 * Output parameters are returned as tuples. | |
| 339 | |
| 340 For example MuPDF C function ``fz_read_best()`` has prototype:: | |
| 341 | |
| 342 fz_buffer *fz_read_best(fz_context *ctx, fz_stream *stm, size_t initial, int *truncated); | |
| 343 | |
| 344 The class-aware Python wrapper is:: | |
| 345 | |
| 346 mupdf.fz_read_best(stm, initial) | |
| 347 | |
| 348 and returns ``(buffer, truncated)``, where ``buffer`` is a SWIG proxy for a | |
| 349 ``mupdf::FzBuffer`` instance and ``truncated`` is an integer. | |
| 350 | |
| 351 * Allows implementation of mutool in Python - see | |
| 352 `mupdf:scripts/mutool.py <https://git.ghostscript.com/?p=mupdf.git;a=blob;f=scripts/mutool.py>`_ | |
| 353 and | |
| 354 `mupdf:scripts/mutool_draw.py <https://git.ghostscript.com/?p=mupdf.git;a=blob;f=scripts/mutool_draw.py>`_. | |
| 355 | |
| 356 * Provides text representation of simple 'POD' structs: | |
| 357 | |
| 358 .. code-block:: python | |
| 359 | |
| 360 rect = mupdf.FzRect(...) | |
| 361 print(rect) # Will output text such as: (x0=90.51 y0=160.65 x1=501.39 y1=215.6) | |
| 362 | |
| 363 * This works for classes where the C++ API defines a ``to_string()`` method as described above. | |
| 364 | |
| 365 * Python classes will have a ``__str__()`` method, and an identical ``__repr__()`` method. | |
| 366 * C# classes will have a ``ToString()`` method. | |
| 367 | |
| 368 * Uses SWIG Director classes to allow C function pointers in MuPDF structs to call Python code. | |
| 369 | |
| 370 | |
| 371 Installing the Python mupdf module using ``pip`` | |
| 372 --------------------------------------------------------------- | |
| 373 | |
| 374 The Python ``mupdf`` module is available on the `Python Package Index (PyPI) website <https://pypi.org/>`_. | |
| 375 | |
| 376 * Install with ``pip install mupdf``. | |
| 377 * Pre-built Wheels (binary Python packages) are provided for Windows and Linux. | |
| 378 * For more information on the latest release, see changelog below and: https://pypi.org/project/mupdf/ | |
| 379 | |
| 380 Doxygen/Pydoc API documentation | |
| 381 --------------------------------------------------------------- | |
| 382 | |
| 383 Auto-generated documentation for the C, C++ and Python APIs is available at: | |
| 384 https://ghostscript.com/~julian/mupdf-bindings/ | |
| 385 | |
| 386 * All content is generated from the comments in MuPDF header files. | |
| 387 | |
| 388 * This documentation is generated from an internal development tree, so may | |
| 389 contain features that are not yet publicly available. | |
| 390 | |
| 391 * It is updated only intermittently. | |
| 392 | |
| 393 Example client code | |
| 394 --------------------------------------------------------------- | |
| 395 | |
| 396 Using the Python API | |
| 397 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 398 | |
| 399 Minimal Python code that uses the ``mupdf`` module:: | |
| 400 | |
| 401 import mupdf | |
| 402 document = mupdf.FzDocument('foo.pdf') | |
| 403 | |
| 404 A simple example Python test script (run by ``scripts/mupdfwrap.py -t``) is: | |
| 405 | |
| 406 * `scripts/mupdfwrap_test.py <https://git.ghostscript.com/?p=mupdf.git;a=blob;f=scripts/mupdfwrap_test.py>`_ | |
| 407 | |
| 408 More detailed usage of the Python API can be found in: | |
| 409 | |
| 410 * `scripts/mutool.py <https://git.ghostscript.com/?p=mupdf.git;a=blob;f=scripts/mutool.py>`_ | |
| 411 * `scripts/mutool_draw.py <https://git.ghostscript.com/?p=mupdf.git;a=blob;f=scripts/mutool_draw.py>`_ | |
| 412 | |
| 413 | |
| 414 **Example Python code that shows all available information about a document's Stext blocks, lines and characters**: | |
| 415 | |
| 416 |expand_begin| | |
| 417 :: | |
| 418 | |
| 419 #!/usr/bin/env python3 | |
| 420 | |
| 421 import mupdf | |
| 422 | |
| 423 def show_stext(document): | |
| 424 ''' | |
| 425 Shows all available information about Stext blocks, lines and characters. | |
| 426 ''' | |
| 427 for p in range(document.fz_count_pages()): | |
| 428 page = document.fz_load_page(p) | |
| 429 stextpage = mupdf.FzStextPage(page, mupdf.FzStextOptions()) | |
| 430 for block in stextpage: | |
| 431 block_ = block.m_internal | |
| 432 print(f'block: type={block_.type} bbox={block_.bbox}') | |
| 433 for line in block: | |
| 434 line_ = line.m_internal | |
| 435 print(f' line: wmode={line_.wmode}' | |
| 436 + f' dir={line_.dir}' | |
| 437 + f' bbox={line_.bbox}' | |
| 438 ) | |
| 439 for char in line: | |
| 440 char_ = char.m_internal | |
| 441 print(f' char: {chr(char_.c)!r} c={char_.c:4} color={char_.color}' | |
| 442 + f' origin={char_.origin}' | |
| 443 + f' quad={char_.quad}' | |
| 444 + f' size={char_.size:6.2f}' | |
| 445 + f' font=(' | |
| 446 + f'is_mono={char_.font.flags.is_mono}' | |
| 447 + f' is_bold={char_.font.flags.is_bold}' | |
| 448 + f' is_italic={char_.font.flags.is_italic}' | |
| 449 + f' ft_substitute={char_.font.flags.ft_substitute}' | |
| 450 + f' ft_stretch={char_.font.flags.ft_stretch}' | |
| 451 + f' fake_bold={char_.font.flags.fake_bold}' | |
| 452 + f' fake_italic={char_.font.flags.fake_italic}' | |
| 453 + f' has_opentype={char_.font.flags.has_opentype}' | |
| 454 + f' invalid_bbox={char_.font.flags.invalid_bbox}' | |
| 455 + f' name={char_.font.name}' | |
| 456 + f')' | |
| 457 ) | |
| 458 | |
| 459 document = mupdf.FzDocument('foo.pdf') | |
| 460 show_stext(document) | |
| 461 | |
| 462 |expand_end| | |
| 463 | |
| 464 Basic PDF viewers written in Python and C# | |
| 465 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 466 | |
| 467 * `scripts/mupdfwrap_gui.py <https://git.ghostscript.com/?p=mupdf.git;a=blob;f=scripts/mupdfwrap_gui.py>`_ | |
| 468 * `scripts/mupdfwrap_gui.cs <https://git.ghostscript.com/?p=mupdf.git;a=blob;f=scripts/mupdfwrap_gui.cs>`_ | |
| 469 * Build and run with: | |
| 470 | |
| 471 * ``./scripts/mupdfwrap.py -b all --test-python-gui`` | |
| 472 * ``./scripts/mupdfwrap.py -b --csharp all --test-csharp-gui`` | |
| 473 | |
| 474 | |
| 475 Building the C++, Python and C# MuPDF APIs from source | |
| 476 --------------------------------------------------------------- | |
| 477 | |
| 478 | |
| 479 General requirements | |
| 480 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 481 | |
| 482 * Windows, Linux, MacOS or OpenBSD. | |
| 483 | |
| 484 * | |
| 485 Build should take place inside a Python `venv <https://docs.python.org/3.8/library/venv.html>`_. | |
| 486 | |
| 487 * | |
| 488 `libclang Python interface onto <https://libclang.readthedocs.io/en/latest/index.html>`_ the `clang C/C++ parser <https://clang.llvm.org/>`_. | |
| 489 | |
| 490 * `swig <https://swig.org/>`_, for Python and C# bindings. | |
| 491 | |
| 492 * | |
| 493 `Mono <https://www.mono-project.com/>`_, for C# bindings on platforms | |
| 494 other than Windows. | |
| 495 | |
| 496 | |
| 497 Setting up | |
| 498 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 499 | |
| 500 Windows only | |
| 501 """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
| 502 | |
| 503 * Install Python. | |
| 504 | |
| 505 * | |
| 506 Use the Python Windows installer from the python.org website: | |
| 507 http://www.python.org/downloads | |
| 508 | |
| 509 * Don't use other installers such as the Microsoft Store Python package. | |
| 510 | |
| 511 * | |
| 512 If Microsoft Store Python is already installed, leave it in place and install | |
| 513 from python.org on top of it - uninstalling before running the python.org | |
| 514 installer has been known to cause problems. | |
| 515 | |
| 516 * A default installation is sufficient. | |
| 517 | |
| 518 * Debug binaries are required for debug builds of the MuPDF Python API. | |
| 519 | |
| 520 * | |
| 521 If "Customize Installation" is chosen, make sure to include "py launcher" so | |
| 522 that the ``py`` command will be available. | |
| 523 | |
| 524 * Also see: https://docs.python.org/3/using/windows.html | |
| 525 | |
| 526 * | |
| 527 Install Visual Studio 2019. Later versions may not work with MuPDF's | |
| 528 solution and build files. | |
| 529 | |
| 530 | |
| 531 All platforms | |
| 532 """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
| 533 | |
| 534 * Get the latest version of MuPDF from git. | |
| 535 | |
| 536 .. code-block:: shell | |
| 537 | |
| 538 git clone --recursive git://git.ghostscript.com/mupdf.git | |
| 539 | |
| 540 * | |
| 541 Create and enter a `Python venv <https://docs.python.org/3.8/library/venv.html>`_ and upgrade pip. | |
| 542 | |
| 543 * Windows. | |
| 544 | |
| 545 .. code-block:: bat | |
| 546 | |
| 547 py -m venv pylocal | |
| 548 .\pylocal\Scripts\activate | |
| 549 python -m pip install --upgrade pip | |
| 550 | |
| 551 * Linux, MacOS, OpenBSD | |
| 552 | |
| 553 .. code-block:: shell | |
| 554 | |
| 555 python3 -m venv pylocal | |
| 556 . pylocal/bin/activate | |
| 557 python -m pip install --upgrade pip | |
| 558 | |
| 559 | |
| 560 General build flags | |
| 561 ~~~~~~~~~~~~~~~~~~~ | |
| 562 | |
| 563 In all of the commands below, one can set environmental variables to control | |
| 564 the build of the underlying MuPDF C API, for example ``USE_SYSTEM_LIBJPEG=yes``. | |
| 565 | |
| 566 In addition, ``XCXXFLAGS`` can be used to set additional C++ compiler flags when | |
| 567 building the C++ and Python bindings (the name is analogous to the ``XCFLAGS`` | |
| 568 used by MuPDF's makefile when compiling the core library). | |
| 569 | |
| 570 | |
| 571 Building and installing the Python bindings using ``pip`` | |
| 572 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 573 | |
| 574 * Windows, Linux, MacOS. | |
| 575 | |
| 576 .. code-block:: shell | |
| 577 | |
| 578 cd mupdf && pip install -vv . | |
| 579 | |
| 580 * OpenBSD. | |
| 581 | |
| 582 Building using ``pip`` is not supported because ``libclang`` is not | |
| 583 available from pypi.org so pip will fail to install prerequisites from | |
| 584 ``pyproject.toml``. | |
| 585 | |
| 586 Instead one can run ``setup.py`` directly: | |
| 587 | |
| 588 .. code-block:: shell | |
| 589 | |
| 590 cd mupdf && python setup.py install | |
| 591 | |
| 592 | |
| 593 Building the Python bindings | |
| 594 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 595 | |
| 596 * Windows, Linux, MacOS. | |
| 597 | |
| 598 .. code-block:: shell | |
| 599 | |
| 600 pip install libclang swig setuptools | |
| 601 cd mupdf && python scripts/mupdfwrap.py -b all | |
| 602 | |
| 603 * OpenBSD. | |
| 604 | |
| 605 ``libclang`` is not available from pypi.org, but we can instead use | |
| 606 the system ``py3-llvm`` package. | |
| 607 | |
| 608 .. code-block:: shell | |
| 609 | |
| 610 sudo pkg_add py3-llvm | |
| 611 pip install swig setuptools | |
| 612 cd mupdf && python scripts/mupdfwrap.py -b all | |
| 613 | |
| 614 Building the C++ bindings | |
| 615 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 616 | |
| 617 * Windows, Linux, MacOS. | |
| 618 | |
| 619 .. code-block:: shell | |
| 620 | |
| 621 pip install libclang setuptools | |
| 622 cd mupdf && python scripts/mupdfwrap.py -b m01 | |
| 623 | |
| 624 * OpenBSD. | |
| 625 | |
| 626 ``libclang`` is not available from pypi.org, but we can instead use | |
| 627 the system ``py3-llvm`` package. | |
| 628 | |
| 629 .. code-block:: shell | |
| 630 | |
| 631 sudo pkg_add py3-llvm | |
| 632 pip install setuptools | |
| 633 cd mupdf && python scripts/mupdfwrap.py -b m01 | |
| 634 | |
| 635 | |
| 636 Building the C# bindings | |
| 637 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 638 | |
| 639 * Windows. | |
| 640 | |
| 641 .. code-block:: shell | |
| 642 | |
| 643 pip install libclang swig setuptools | |
| 644 cd mupdf && python scripts/mupdfwrap.py -b --csharp all | |
| 645 | |
| 646 * Linux. | |
| 647 | |
| 648 .. code-block:: shell | |
| 649 | |
| 650 sudo apt install mono-devel | |
| 651 pip install libclang swig | |
| 652 cd mupdf && python scripts/mupdfwrap.py -b --csharp all | |
| 653 | |
| 654 * MacOS. | |
| 655 | |
| 656 Building the C# bindings on MacOS is not currently supported. | |
| 657 | |
| 658 * OpenBSD. | |
| 659 | |
| 660 .. code-block:: shell | |
| 661 | |
| 662 sudo pkg_add py3-llvm mono | |
| 663 pip install swig setuptools | |
| 664 cd mupdf && python scripts/mupdfwrap.py -b --csharp all | |
| 665 | |
| 666 | |
| 667 Using the bindings | |
| 668 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 669 | |
| 670 To use the bindings, one has to tell the OS where to find the MuPDF | |
| 671 runtime files. | |
| 672 | |
| 673 * C++ and C# bindings: | |
| 674 | |
| 675 * Windows. | |
| 676 | |
| 677 .. code-block:: shell | |
| 678 | |
| 679 set PATH=.../mupdf/build/shared-release-x64-py3.11;%PATH% | |
| 680 | |
| 681 * Replace ``x64`` with ``x32`` if using 32-bit. | |
| 682 | |
| 683 * Replace ``3.11`` with the appropriate python version number. | |
| 684 | |
| 685 | |
| 686 * Linux, OpenBSD. | |
| 687 | |
| 688 .. code-block:: shell | |
| 689 | |
| 690 LD_LIBRARY_PATH=.../mupdf/build/shared-release | |
| 691 | |
| 692 (``LD_LIBRARY_PATH`` must be an absolute path.) | |
| 693 | |
| 694 * MacOS. | |
| 695 | |
| 696 .. code-block:: shell | |
| 697 | |
| 698 DYLD_LIBRARY_PATH=.../mupdf/build/shared-release | |
| 699 | |
| 700 * Python bindings: | |
| 701 | |
| 702 If the bindings have been built and installed using ``pip install``, | |
| 703 they will already be available within the venv. | |
| 704 | |
| 705 Otherwise: | |
| 706 | |
| 707 * Windows. | |
| 708 | |
| 709 .. code-block:: shell | |
| 710 | |
| 711 PYTHONPATH=.../mupdf/build/shared-release-x64-py3.11 | |
| 712 | |
| 713 * Replace ``x64`` with ``x32`` if using 32-bit. | |
| 714 | |
| 715 * Replace ``3.11`` with the appropriate python version number. | |
| 716 | |
| 717 * Linux, MacOS, OpenBSD. | |
| 718 | |
| 719 .. code-block:: shell | |
| 720 | |
| 721 PYTHONPATH=.../mupdf/build/shared-release | |
| 722 | |
| 723 | |
| 724 Notes | |
| 725 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 726 | |
| 727 * Running tests. | |
| 728 | |
| 729 Basic tests can be run by appending args to the ``scripts/mupdfwrap.py`` | |
| 730 command. | |
| 731 | |
| 732 This will also demonstrate how to set environment variables such as | |
| 733 ``PYTHONPATH`` or ``LD_LIBRARY_PATH`` to the MuPDF build directory. | |
| 734 | |
| 735 * Python tests. | |
| 736 | |
| 737 * ``--test-python`` | |
| 738 * ``--test-python-gui`` | |
| 739 | |
| 740 * C# tests. | |
| 741 | |
| 742 * ``--test-csharp`` | |
| 743 * ``--test-csharp-gui`` | |
| 744 | |
| 745 * C++ tests. | |
| 746 | |
| 747 * ``--test-cpp`` | |
| 748 | |
| 749 * C++ bindings and ``NDEBUG``. | |
| 750 | |
| 751 When building client code that uses the C++ bindings, ``NDEBUG`` must | |
| 752 be defined/undefined to match how the C++ bindings were built. By | |
| 753 default the C++ bindings are a release build with ``NDEBUG`` defined, so | |
| 754 usually client code must also be built with ``NDEBUG`` defined. Otherwise | |
| 755 there will be build errors for missing C++ destructors, for example | |
| 756 ``mupdf::FzMatrix::~FzMatrix()``. | |
| 757 | |
| 758 [This is because we define some destructors in debug builds only; this allows | |
| 759 internal reference counting checks.] | |
| 760 | |
| 761 * Specifying the location of Visual Studio's ``devenv.com`` on Windows. | |
| 762 | |
| 763 ``scripts/mupdfwrap.py`` looks for Visual Studio's ``devenv.com`` in | |
| 764 standard locations; this can be overridden with: | |
| 765 | |
| 766 .. code-block:: shell | |
| 767 | |
| 768 python scripts/mupdfwrap.py -b --devenv <devenv.com-location> ... | |
| 769 | |
| 770 * Specifying compilers. | |
| 771 | |
| 772 On non-Windows, we use ``cc`` and ``c++`` as default C and C++ compilers; | |
| 773 override by setting environment variables ``$CC`` and ``$CXX``. | |
| 774 | |
| 775 * OpenBSD ``libclang``. | |
| 776 | |
| 777 * | |
| 778 ``libclang`` cannot be installed with pip on OpenBSD - wheels are not | |
| 779 available and building from source fails. | |
| 780 | |
| 781 However unlike on other platforms, the system python-clang package | |
| 782 (``py3-llvm``) is integrated with the system's libclang and can be | |
| 783 used directly. | |
| 784 | |
| 785 So the above examples use ``pkg_add py3-llvm``. | |
| 786 | |
| 787 * Alternatives to the Python package ``libclang`` are generally impractical. | |
| 788 | |
| 789 For example pypi.org's `clang <https://pypi.org/project/clang/>`_, or | |
| 790 Debian's `python-clang <https://packages.debian.org/search?keywords=python+clang&searchon=names&suite=stable&section=all>`_. | |
| 791 | |
| 792 These are inconvenient to use because they require explicit setting of | |
| 793 ``LD_LIBRARY_PATH`` to point to the correct libclang dynamic library. | |
| 794 | |
| 795 * Debug builds. | |
| 796 | |
| 797 One can specify a debug build using the ``-d <build-directory>`` arg | |
| 798 before ``-b``. | |
| 799 | |
| 800 .. code-block:: shell | |
| 801 | |
| 802 python ./scripts/mupdfwrap.py -d build/shared-debug -b ... | |
| 803 | |
| 804 * | |
| 805 Debug builds of the Python and C# bindings on Windows have not been | |
| 806 tested. There may be issues with requiring a debug version of the Python | |
| 807 interpreter, for example ``python311_d.lib``. | |
| 808 | |
| 809 * | |
| 810 C# build failure: ``cstring.i not implemented for this target`` and/or | |
| 811 ``Unknown directive '%cstring_output_allocate'``. | |
| 812 | |
| 813 This is probably because SWIG does not include support for C#. This | |
| 814 has been seen in the past but as of 2023-07-19 pypi.org's default swig | |
| 815 seems ok. | |
| 816 | |
| 817 A possible solution is to install SWIG using the system package | |
| 818 manager, for example ``sudo apt install swig`` on Linux, or use | |
| 819 ``./scripts/mupdfwrap.py --swig-windows-auto ...`` on Windows. | |
| 820 | |
| 821 | |
| 822 * More information about running ``scripts/mupdfwrap.py``. | |
| 823 | |
| 824 * Run ``python ./scripts/mupdfwrap.py -h``. | |
| 825 * Read the docstring at the beginning of ``scripts/wrap/__main__.py``. | |
| 826 | |
| 827 | |
| 828 How ``scripts/mupdfwrap.py`` builds the APIs | |
| 829 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 830 | |
| 831 Building the MuPDF C API | |
| 832 """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
| 833 | |
| 834 * On Unix, runs ``make`` on MuPDF's ``Makefile`` with ``shared=yes``. | |
| 835 | |
| 836 * On Windows, runs ``devenv.com`` on ``.sln`` and ``.vcxproj`` files within MuPDF's `platform/win32/ <https://git.ghostscript.com/?p=mupdf.git;a=tree;f=platform/win32>`_ | |
| 837 directory. | |
| 838 | |
| 839 Generation of the MuPDF C++ API | |
| 840 """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
| 841 | |
| 842 * Uses clang-python to parse MuPDF's C API. | |
| 843 | |
| 844 * Generates C++ code that wraps the basic C interface, converting MuPDF | |
| 845 ``setjmp()``/``longjmp()`` exceptions into C++ exceptions and automatically | |
| 846 handling ``fz_context``'s internally. | |
| 847 | |
| 848 * Generates C++ wrapper classes for each ``fz_*`` and ``pdf_*`` struct, and uses various | |
| 849 heuristics to define constructors, methods and static methods that call | |
| 850 ``fz_*()`` and ``pdf_*()`` functions. These classes' constructors and destructors | |
| 851 automatically handle reference counting so class instances can be copied | |
| 852 arbitrarily. | |
| 853 | |
| 854 * C header file comments are copied into the generated C++ header files. | |
| 855 | |
| 856 * Compiles and links the generated C++ code to create shared libraries. | |
| 857 | |
| 858 | |
| 859 Generation of the MuPDF Python and C# APIs | |
| 860 """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
| 861 | |
| 862 * Uses SWIG to parse the previously-generated C++ headers and generate C++, | |
| 863 Python and C# code. | |
| 864 | |
| 865 * | |
| 866 Defines some custom-written Python and C# functions and methods, for | |
| 867 example so that out-params are returned as tuples. | |
| 868 | |
| 869 * If SWIG is version 4+, C++ comments are converted into Python doc-comments. | |
| 870 | |
| 871 * Compiles and links the SWIG-generated C++ code to create shared libraries. | |
| 872 | |
| 873 | |
| 874 Building auto-generated MuPDF API documentation | |
| 875 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 876 | |
| 877 Build HTML documentation for the C, C++ and Python APIs (using Doxygen and pydoc): | |
| 878 | |
| 879 .. code-block:: shell | |
| 880 | |
| 881 python ./scripts/mupdfwrap.py --doc all | |
| 882 | |
| 883 This will generate the following tree: | |
| 884 | |
| 885 .. code-block:: text | |
| 886 | |
| 887 mupdf/docs/generated/ | |
| 888 index.html | |
| 889 c/ | |
| 890 c++/ | |
| 891 python/ | |
| 892 | |
| 893 All content is ultimately generated from the MuPDF C header file comments. | |
| 894 | |
| 895 As of 2022-02-05, it looks like ``swig -doxygen`` (swig-4.0.2) ignores | |
| 896 single-line ``/** ... */`` comments, so the generated Python code (and | |
| 897 hence also Pydoc documentation) is missing information. | |
| 898 | |
| 899 Generated files | |
| 900 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 901 | |
| 902 All generated files are within the MuPDF checkout. | |
| 903 | |
| 904 * C++ headers for the MuPDF C++ API are in ``platform/c++/include/``. | |
| 905 | |
| 906 * Files required at runtime are in ``build/shared-release/``. | |
| 907 | |
| 908 **Details** | |
| 909 | |
| 910 .. code-block:: text | |
| 911 | |
| 912 mupdf/ | |
| 913 build/ | |
| 914 shared-release/ [Unix runtime files.] | |
| 915 libmupdf.so [MuPDF C API, not MacOS.] | |
| 916 libmupdf.dylib [MuPDF C API, MacOS.] | |
| 917 libmupdfcpp.so [MuPDF C++ API.] | |
| 918 mupdf.py [MuPDF Python API.] | |
| 919 _mupdf.so [MuPDF Python API internals.] | |
| 920 mupdf.cs [MuPDF C# API.] | |
| 921 mupdfcsharp.so [MuPDF C# API internals.] | |
| 922 | |
| 923 shared-debug/ | |
| 924 [as shared-release but debug build.] | |
| 925 | |
| 926 shared-release-x*-py*/ [Windows runtime files.] | |
| 927 mupdfcpp.dll [MuPDF C and C++ API, x32.] | |
| 928 mupdfcpp64.dll [MuPDF C and C++ API, x64.] | |
| 929 mupdf.py [MuPDF Python API.] | |
| 930 _mupdf.pyd [MuPDF Python API internals.] | |
| 931 mupdf.cs [MuPDF C# API.] | |
| 932 mupdfcsharp.dll [MuPDF C# API internals.] | |
| 933 | |
| 934 platform/ | |
| 935 c++/ | |
| 936 include/ [MuPDF C++ API header files.] | |
| 937 mupdf/ | |
| 938 classes.h | |
| 939 classes2.h | |
| 940 exceptions.h | |
| 941 functions.h | |
| 942 internal.h | |
| 943 | |
| 944 implementation/ [MuPDF C++ implementation source files.] | |
| 945 classes.cpp | |
| 946 classes2.cpp | |
| 947 exceptions.cpp | |
| 948 functions.cpp | |
| 949 internal.cpp | |
| 950 | |
| 951 generated.pickle [Information from clang parse step, used by later stages.] | |
| 952 windows_mupdf.def [List of MuPDF public global data, used when linking mupdfcpp.dll.] | |
| 953 | |
| 954 python/ [SWIG Python files.] | |
| 955 mupdfcpp_swig.i [SWIG input file.] | |
| 956 mupdfcpp_swig.i.cpp [SWIG output file.] | |
| 957 | |
| 958 csharp/ [SWIG C# files.] | |
| 959 mupdf.cs [SWIG output file, no out-params helpers.] | |
| 960 mupdfcpp_swig.i [SWIG input file.] | |
| 961 mupdfcpp_swig.i.cpp [SWIG output file.] | |
| 962 | |
| 963 win32/ | |
| 964 Release/ [Windows 32-bit .dll, .lib, .exp, .pdb etc.] | |
| 965 x64/ | |
| 966 Release/ [Windows 64-bit .dll, .lib, .exp, .pdb etc.] | |
| 967 mupdfcpp64.dll [Copied to build/shared-release*/mupdfcpp64.dll] | |
| 968 mupdfpyswig.dll [Copied to build/shared-release*/_mupdf.pyd] | |
| 969 mupdfcpp64.lib | |
| 970 mupdfpyswig.lib | |
| 971 | |
| 972 win32-vs-upgrade/ [used instead of win32/ if PYMUPDF_SETUP_MUPDF_VS_UPGRADE is '1'.] | |
| 973 | |
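As a quick sanity check after a build, one might confirm that the runtime files listed in the tree above are present. The following is a minimal sketch, not part of MuPDF: the helper name ``missing_runtime_files`` is hypothetical, and the file list covers only the Unix (non-macOS) Python-related entries from the tree.

```python
from pathlib import Path

# File names taken from the tree above (Unix, non-macOS layout).
UNIX_RUNTIME_FILES = [
    "libmupdf.so",
    "libmupdfcpp.so",
    "mupdf.py",
    "_mupdf.so",
]


def missing_runtime_files(build_dir, expected=UNIX_RUNTIME_FILES):
    """Return the names in `expected` that are not regular files in `build_dir`.

    Hypothetical helper for illustration; it is not provided by MuPDF.
    """
    build_dir = Path(build_dir)
    return [name for name in expected if not (build_dir / name).is_file()]


if __name__ == "__main__":
    # Assumes the current directory is the MuPDF checkout.
    missing = missing_runtime_files("build/shared-release")
    print("missing:", missing or "none")
```

For a debug build the same check would be pointed at ``build/shared-debug``.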
| 974 | |
| 975 Windows-specifics | |
| 976 --------------------------------------------------------------- | |
| 977 | |
| 978 Required predefined macros | |
| 979 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 980 | |
| 981 Code that will use the MuPDF DLL must be built with ``FZ_DLL_CLIENT`` | |
| 982 predefined. | |
| 983 | |
| 984 The MuPDF DLL itself is built with ``FZ_DLL`` predefined. | |
| 985 | |
| 986 DLLs | |
| 987 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 988 | |
| 989 There is no separate C library; instead the C and C++ APIs are | |
| 990 both in ``mupdfcpp.dll``, which is built by running devenv on | |
| 991 ``platform/win32/mupdf.sln``. | |
| 992 | |
| 993 The Python SWIG library is called ``_mupdf.pyd`` which, despite the name, is a | |
| 994 standard Windows DLL, built from ``platform/python/mupdfcpp_swig.i.cpp``. | |
| 995 | |
| 996 DLL export of functions and data | |
| 997 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 998 | |
| 999 On Windows, ``include/mupdf/fitz/export.h`` defines ``FZ_FUNCTION`` and | |
| 1000 ``FZ_DATA`` to ``__declspec(dllexport)`` and/or ``__declspec(dllimport)`` | |
| 1001 depending on whether ``FZ_DLL`` or ``FZ_DLL_CLIENT`` are defined. | |
| 1002 | |
| 1003 All MuPDF C headers prefix declarations of public global data with ``FZ_DATA``. | |
| 1004 | |
| 1005 In generated C++ code: | |
| 1006 | |
| 1007 * Data declarations and definitions are prefixed with ``FZ_DATA``. | |
| 1008 * Function declarations and definitions are prefixed with ``FZ_FUNCTION``. | |
| 1009 * Class method declarations and definitions are prefixed with ``FZ_FUNCTION``. | |
| 1010 | |
| 1011 When building ``mupdfcpp.dll`` on Windows we link with the auto-generated | |
| 1012 ``platform/c++/windows_mupdf.def`` file; this lists all C public global data. | |
| 1013 | |
| 1014 For reasons that are not fully understood, we don't seem to need to tag | |
| 1015 C functions with ``FZ_FUNCTION``, but this is required for C++ functions | |
| 1016 otherwise we get unresolved symbols when building MuPDF client code. | |
| 1017 | |
| 1018 Building the DLLs | |
| 1019 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 1020 | |
| 1021 We build Windows binaries by running ``devenv.com`` directly. | |
| 1022 | |
| 1023 Building ``_mupdf.pyd`` is tricky because it needs to be built with a | |
| 1024 specific ``Python.h`` and linked with a specific ``python.lib``. This is | |
| 1025 done by setting environment variables ``MUPDF_PYTHON_INCLUDE_PATH`` and | |
| 1026 ``MUPDF_PYTHON_LIBRARY_PATH`` when running ``devenv.com``, which are referenced | |
| 1027 by ``platform/win32/mupdfpyswig.vcxproj``. Thus one cannot easily build | |
| 1028 ``_mupdf.pyd`` directly from the Visual Studio GUI. | |
| 1029 | |
| 1030 [In the git history there is code that builds ``_mupdf.pyd`` by running the | |
| 1031 Windows compiler and linker ``cl.exe`` and ``link.exe`` directly, which avoids | |
| 1032 the complications of going via devenv, at the expense of needing to know where | |
| 1033 ``cl.exe`` and ``link.exe`` are.] | |
| 1034 | |
| 1035 | |
| 1036 C++ bindings details | |
| 1037 --------------------------------------------------------------- | |
| 1038 | |
| 1039 Wrapper functions | |
| 1040 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 1041 | |
| 1042 Wrappers for a MuPDF function ``fz_foo()`` are available in multiple forms: | |
| 1043 | |
| 1044 * Functions in the ``mupdf`` namespace. | |
| 1045 | |
| 1046 * ``mupdf::ll_fz_foo()`` | |
| 1047 | |
| 1048 * Low-level wrapper: | |
| 1049 | |
| 1050 * Does not take ``fz_context*`` arg. | |
| 1051 * Translates MuPDF exceptions into C++ exceptions. | |
| 1052 * Takes/returns pointers to MuPDF structs. | |
| 1053 * Code that uses these functions will need to make explicit calls to | |
| 1054 ``fz_keep_*()`` and ``fz_drop_*()``. | |
| 1055 | |
| 1056 * ``mupdf::fz_foo()`` | |
| 1057 | |
| 1058 * High-level class-aware wrapper: | |
| 1059 | |
| 1060 * Does not take ``fz_context*`` arg. | |
| 1061 * Translates MuPDF exceptions into C++ exceptions. | |
| 1062 * Takes references to C++ wrapper class instances instead of pointers to | |
| 1063 MuPDF structs. | |
| 1064 * Where applicable, returns C++ wrapper class instances instead of | |
| 1065 pointers to MuPDF structs. | |
| 1066 * Code that uses these functions does not need to call ``fz_keep_*()`` | |
| 1067 and ``fz_drop_*()`` - C++ wrapper class instances take care of reference | |
| 1068 counting internally. | |
| 1069 | |
| 1070 * Class methods | |
| 1071 | |
| 1072 * Where ``fz_foo()`` has a first arg (ignoring any ``fz_context*`` arg) that | |
| 1073 takes a pointer to a MuPDF struct ``foo_bar``, it is generally available as a | |
| 1074 member function of the wrapper class ``mupdf::FooBar``: | |
| 1075 | |
| 1076 * ``mupdf::FooBar::fz_foo()`` | |
| 1077 | |
| 1078 * Apart from being a member function, this is identical to class-aware | |
| 1079 wrapper ``mupdf::fz_foo()``, for example taking references to wrapper classes | |
| 1080 instead of pointers to MuPDF structs. | |
| 1081 | |
| 1082 | |
| 1083 Constructors using MuPDF functions | |
| 1084 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 1085 | |
| 1086 Wrapper class constructors are created for each MuPDF function that returns an | |
| 1087 instance of a MuPDF struct. | |
| 1088 | |
| 1089 Sometimes two such functions do not have different arg types so C++ | |
| 1090 overloading cannot distinguish between them as constructors (because C++ | |
| 1091 constructors do not have names). | |
| 1092 | |
| 1093 We cope with this in two ways: | |
| 1094 | |
| 1095 * Create a static method that returns a new instance of the wrapper class | |
| 1096 by value. | |
| 1097 | |
| 1098 * This is not possible if the underlying MuPDF struct is not copyable - i.e. | |
| 1099 not reference counted and not POD. | |
| 1100 | |
| 1101 * Define an enum within the wrapper class, and provide a constructor that takes | |
| 1102 an instance of this enum to specify which MuPDF function to use. | |
| 1103 | |
| 1104 | |
| 1105 Default constructors | |
| 1106 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 1107 | |
| 1108 All wrapper classes have a default constructor. | |
| 1109 | |
| 1110 * For POD classes each member is set to a default value with ``this->foo = | |
| 1111 {};``. Arrays are initialised by setting all bytes to zero using | |
| 1112 ``memset()``. | |
| 1113 * For non-POD classes, class member ``m_internal`` is set to ``nullptr``. | |
| 1114 * Some classes' default constructors are customized, for example: | |
| 1115 | |
| 1116 * The default constructor for ``fz_color_params`` wrapper | |
| 1117 ``mupdf::FzColorParams`` sets state to a copy of | |
| 1118 ``fz_default_color_params``. | |
| 1119 * The default constructor for ``fz_md5`` wrapper ``mupdf::FzMd5`` sets | |
| 1120 state using ``fz_md5_init()``. | |
| 1121 * These are described in class definition comments in | |
| 1122 ``platform/c++/include/mupdf/classes.h``. | |
| 1123 | |
| 1124 | |
| 1125 Raw constructors | |
| 1126 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 1127 | |
| 1128 Many wrapper classes have constructors that take a pointer to the underlying | |
| 1129 MuPDF C struct. These are usually for internal use only. They do not call | |
| 1130 ``fz_keep_*()`` - it is expected that any supplied MuPDF struct is already | |
| 1131 owned. | |
| 1132 | |
| 1133 | |
| 1134 POD wrapper classes | |
| 1135 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 1136 | |
| 1137 Class wrappers for MuPDF structs default to having a ``m_internal`` member which | |
| 1138 points to an instance of the wrapped struct. This works well for MuPDF structs | |
| 1139 which support reference counting, because we can automatically create copy | |
| 1140 constructors, ``operator=`` functions and destructors that call the associated | |
| 1141 ``fz_keep_*()`` and ``fz_drop_*()`` functions. | |
| 1142 | |
| 1143 However where a MuPDF struct does not support reference counting and contains | |
| 1144 simple data, it is not safe to copy a pointer to the struct, so the class | |
| 1145 wrapper will be a POD class. This is done in one of two ways: | |
| 1146 | |
| 1147 * ``m_internal`` is an instance of the MuPDF struct, not a pointer. | |
| 1148 | |
| 1149 * Sometimes we provide members that give direct access to fields in | |
| 1150 ``m_internal``. | |
| 1151 | |
| 1152 * An 'inline' POD - there is no ``m_internal`` member; instead the wrapper class | |
| 1153 contains the same members as the MuPDF struct. This can be a little more | |
| 1154 convenient to use. | |
| 1155 | |
| 1156 | |
| 1157 Extra static methods | |
| 1158 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 1159 | |
| 1160 Where relevant, wrapper classes can have static methods that wrap selected MuPDF | |
| 1161 functions. For example ``FzMatrix`` does this for ``fz_concat()``, ``fz_scale()`` etc, | |
| 1162 because these return the result by value rather than modifying a ``fz_matrix`` | |
| 1163 instance. | |
| 1164 | |
| 1165 | |
| 1166 Miscellaneous custom wrapper classes | |
| 1167 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 1168 | |
| 1169 The wrapper for ``fz_outline_item`` does not contain a ``fz_outline_item`` by | |
| 1170 value or pointer. Instead it defines C++-style member equivalents to | |
| 1171 ``fz_outline_item``'s fields, to simplify usage from C++ and Python/C#. | |
| 1172 | |
| 1173 The fields are initialised from a ``fz_outline_item`` when the wrapper class | |
| 1174 is constructed. In this particular case there is no need to hold on to a | |
| 1175 ``fz_outline_item``, and the use of ``std::string`` ensures that value semantics | |
| 1176 can work. | |
| 1177 | |
| 1178 | |
| 1179 Extra functions in C++, Python and C# | |
| 1180 --------------------------------------------------------------- | |
| 1181 | |
| 1182 [These functions are available as low-level functions, class-aware | |
| 1183 functions and class methods.] | |
| 1184 | |
| 1185 .. code-block:: c++ | |
| 1186 | |
| 1187 /** | |
| 1188 C++ alternative to ``fz_lookup_metadata()`` that returns a ``std::string`` | |
| 1189 or calls ``fz_throw()`` if not found. | |
| 1190 */ | |
| 1191 FZ_FUNCTION std::string fz_lookup_metadata2(fz_context* ctx, fz_document* doc, const char* key); | |
| 1192 | |
| 1193 /** | |
| 1194 C++ alternative to ``pdf_lookup_metadata()`` that returns a ``std::string`` | |
| 1195 or calls ``fz_throw()`` if not found. | |
| 1196 */ | |
| 1197 FZ_FUNCTION std::string pdf_lookup_metadata2(fz_context* ctx, pdf_document* doc, const char* key); | |
| 1198 | |
| 1199 /** | |
| 1200 C++ alternative to ``fz_md5_pixmap()`` that returns the digest by value. | |
| 1201 */ | |
| 1202 FZ_FUNCTION std::vector<unsigned char> fz_md5_pixmap2(fz_context* ctx, fz_pixmap* pixmap); | |
| 1203 | |
| 1204 /** | |
| 1205 C++ alternative to fz_md5_final() that returns the digest by value. | |
| 1206 */ | |
| 1207 FZ_FUNCTION std::vector<unsigned char> fz_md5_final2(fz_md5* md5); | |
| 1208 | |
| 1209 /** */ | |
| 1210 FZ_FUNCTION long long fz_pixmap_samples_int(fz_context* ctx, fz_pixmap* pixmap); | |
| 1211 | |
| 1212 /** | |
| 1213 Provides simple (but slow) access to pixmap data from Python and C#. | |
| 1214 */ | |
| 1215 FZ_FUNCTION int fz_samples_get(fz_pixmap* pixmap, int offset); | |
| 1216 | |
| 1217 /** | |
| 1218 Provides simple (but slow) write access to pixmap data from Python and | |
| 1219 C#. | |
| 1220 */ | |
| 1221 FZ_FUNCTION void fz_samples_set(fz_pixmap* pixmap, int offset, int value); | |
| 1222 | |
| 1223 /** | |
| 1224 C++ alternative to fz_highlight_selection() that returns quads in a | |
| 1225 std::vector. | |
| 1226 */ | |
| 1227 FZ_FUNCTION std::vector<fz_quad> fz_highlight_selection2(fz_context* ctx, fz_stext_page* page, fz_point a, fz_point b, int max_quads); | |
| 1228 | |
| 1229 struct fz_search_page2_hit | |
| 1230 { | |
| 1231 fz_quad quad; | |
| 1232 int mark; | |
| 1233 }; | |
| 1234 | |
| 1235 /** | |
| 1236 C++ alternative to fz_search_page() that returns information in a std::vector. | |
| 1237 */ | |
| 1238 FZ_FUNCTION std::vector<fz_search_page2_hit> fz_search_page2(fz_context* ctx, fz_document* doc, int number, const char* needle, int hit_max); | |
| 1239 | |
| 1240 /** | |
| 1241 C++ alternative to fz_string_from_text_language() that returns information in a std::string. | |
| 1242 */ | |
| 1243 FZ_FUNCTION std::string fz_string_from_text_language2(fz_text_language lang); | |
| 1244 | |
| 1245 /** | |
| 1246 C++ alternative to fz_get_glyph_name() that returns information in a std::string. | |
| 1247 */ | |
| 1248 FZ_FUNCTION std::string fz_get_glyph_name2(fz_context* ctx, fz_font* font, int glyph); | |
| 1249 | |
| 1250 /** | |
| 1251 Extra struct containing fz_install_load_system_font_funcs()'s args, | |
| 1252 which we wrap with virtual_fnptrs set to allow use from Python/C# via | |
| 1253 Swig Directors. | |
| 1254 */ | |
| 1255 typedef struct fz_install_load_system_font_funcs_args | |
| 1256 { | |
| 1257 fz_load_system_font_fn* f; | |
| 1258 fz_load_system_cjk_font_fn* f_cjk; | |
| 1259 fz_load_system_fallback_font_fn* f_fallback; | |
| 1260 } fz_install_load_system_font_funcs_args; | |
| 1261 | |
| 1262 /** | |
| 1263 Alternative to fz_install_load_system_font_funcs() that takes args in a | |
| 1264 struct, to allow use from Python/C# via Swig Directors. | |
| 1265 */ | |
| 1266 FZ_FUNCTION void fz_install_load_system_font_funcs2(fz_context* ctx, fz_install_load_system_font_funcs_args* args); | |
| 1267 | |
| 1268 /** Internal singleton state to allow Swig Director class to find | |
| 1269 fz_install_load_system_font_funcs_args class wrapper instance. */ | |
| 1270 FZ_DATA extern void* fz_install_load_system_font_funcs2_state; | |
| 1271 | |
| 1272 /** Helper for calling ``fz_document_handler::open`` function pointer via | |
| 1273 Swig from Python/C#. */ | |
| 1274 FZ_FUNCTION fz_document* fz_document_handler_open(fz_context* ctx, const fz_document_handler *handler, fz_stream* stream, fz_stream* accel, fz_archive* dir, void* recognize_state); | |
| 1275 | |
| 1276 /** Helper for calling a ``fz_document_handler::recognize`` function | |
| 1277 pointer via Swig from Python/C#. */ | |
| 1278 FZ_FUNCTION int fz_document_handler_recognize(fz_context* ctx, const fz_document_handler *handler, const char *magic); | |
| 1279 | |
| 1280 /** Swig-friendly wrapper for pdf_choice_widget_options(), returns the | |
| 1281 options directly in a vector. */ | |
| 1282 FZ_FUNCTION std::vector<std::string> pdf_choice_widget_options2(fz_context* ctx, pdf_annot* tw, int exportval); | |
| 1283 | |
| 1284 /** Swig-friendly wrapper for fz_new_image_from_compressed_buffer(), | |
| 1285 uses specified ``decode`` and ``colorkey`` if they are not null (in which | |
| 1286 case we assert that they have size ``2*fz_colorspace_n(colorspace)``). */ | |
| 1287 FZ_FUNCTION fz_image* fz_new_image_from_compressed_buffer2( | |
| 1288 fz_context* ctx, | |
| 1289 int w, | |
| 1290 int h, | |
| 1291 int bpc, | |
| 1292 fz_colorspace* colorspace, | |
| 1293 int xres, | |
| 1294 int yres, | |
| 1295 int interpolate, | |
| 1296 int imagemask, | |
| 1297 const std::vector<float>& decode, | |
| 1298 const std::vector<int>& colorkey, | |
| 1299 fz_compressed_buffer* buffer, | |
| 1300 fz_image* mask | |
| 1301 ); | |
| 1302 | |
| 1303 /** Swig-friendly wrapper for pdf_rearrange_pages(). */ | |
| 1304 void pdf_rearrange_pages2( | |
| 1305 fz_context* ctx, | |
| 1306 pdf_document* doc, | |
| 1307 const std::vector<int>& pages, | |
| 1308 pdf_clean_options_structure structure | |
| 1309 ); | |
| 1310 | |
| 1311 /** Swig-friendly wrapper for pdf_subset_fonts(). */ | |
| 1312 void pdf_subset_fonts2(fz_context *ctx, pdf_document *doc, const std::vector<int>& pages); | |
| 1313 | |
| 1314 /** Swig-friendly and typesafe way to do fz_snprintf(fmt, value). ``fmt`` | |
| 1315 must end with one of 'efg' otherwise we throw an exception. */ | |
| 1316 std::string fz_format_double(fz_context* ctx, const char* fmt, double value); | |
| 1317 | |
| 1318 struct fz_font_ucs_gid | |
| 1319 { | |
| 1320 unsigned long ucs; | |
| 1321 unsigned int gid; | |
| 1322 }; | |
| 1323 | |
| 1324 /** SWIG-friendly wrapper for fz_enumerate_font_cmap(). */ | |
| 1325 std::vector<fz_font_ucs_gid> fz_enumerate_font_cmap2(fz_context* ctx, fz_font* font); | |
| 1326 | |
| 1327 /** SWIG-friendly wrapper for pdf_set_annot_callout_line(). */ | |
| 1328 void pdf_set_annot_callout_line2(fz_context *ctx, pdf_annot *annot, std::vector<fz_point>& callout); | |
| 1329 | |
| 1330 /** SWIG-friendly wrapper for fz_decode_barcode_from_display_list(), | |
| 1331 avoiding leak of the returned string. */ | |
| 1332 std::string fz_decode_barcode_from_display_list2(fz_context *ctx, fz_barcode_type *type, fz_display_list *list, fz_rect subarea, int rotate); | |
| 1333 | |
| 1334 /** SWIG-friendly wrapper for fz_decode_barcode_from_pixmap(), avoiding | |
| 1335 leak of the returned string. */ | |
| 1336 std::string fz_decode_barcode_from_pixmap2(fz_context *ctx, fz_barcode_type *type, fz_pixmap *pix, int rotate); | |
| 1337 | |
| 1338 /** SWIG-friendly wrapper for fz_decode_barcode_from_page(), avoiding | |
| 1339 leak of the returned string. */ | |
| 1340 std::string fz_decode_barcode_from_page2(fz_context *ctx, fz_barcode_type *type, fz_page *page, fz_rect subarea, int rotate); | |
| 1341 | |
| 1342 | |
| 1343 Python/C# bindings details | |
| 1344 --------------------------------------------------------------- | |
| 1345 | |
| 1346 Extra Python functions | |
| 1347 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 1348 | |
| 1349 Access to raw C arrays | |
| 1350 """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
| 1351 | |
| 1352 The following functions can be used from Python to get access to raw data: | |
| 1353 | |
| 1354 * | |
| 1355 ``mupdf.bytes_getitem(array, index)``: Gives access to individual items | |
| 1356 in an array of ``unsigned char``'s, for example in the data returned by | |
| 1357 ``mupdf::FzPixmap``'s ``samples()`` method. | |
| 1358 | |
| 1359 * | |
| 1360 ``mupdf.floats_getitem(array, index)``: Gives access to individual items in an | |
| 1361 array of ``float``'s, for example in ``fz_stroke_state``'s ``float dash_list[32]`` | |
| 1362 array. Generated with SWIG code ``carrays.i`` and ``array_functions(float, | |
| 1363 floats);``. | |
| 1364 | |
| 1365 * | |
| 1366 ``mupdf.python_buffer_data(b)``: returns a SWIG wrapper for a ``const unsigned | |
| 1367 char*`` pointing to a Python buffer instance's raw data. For example ``b`` can | |
| 1368 be a Python ``bytes`` or ``bytearray`` instance. | |
| 1369 | |
| 1370 * | |
| 1371 ``mupdf.python_mutable_buffer_data(b)``: returns a SWIG wrapper for an ``unsigned | |
| 1372 char*`` pointing to a Python buffer instance's raw data. For example ``b`` can | |
| 1373 be a Python ``bytearray`` instance. | |
| 1374 | |
| 1375 [These functions are implemented internally using SWIG's ``carrays.i`` and | |
| 1376 ``pybuffer.i``.] | |
| 1377 | |
| 1378 | |
| 1379 Python differences from C API | |
| 1380 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 1381 | |
| 1382 [The functions described below are also available as class methods.] | |
| 1383 | |
| 1384 | |
| 1385 Custom methods | |
| 1386 """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
| 1387 | |
| 1388 Python and C# code does not easily handle functions that return raw data, for example | |
| 1389 as an ``unsigned char*`` that is not a zero-terminated string. Sometimes we provide a | |
| 1390 C++ method that returns a ``std::vector`` by value, so that Python and C# code can | |
| 1391 wrap it in a systematic way. | |
| 1392 | |
| 1393 For example ``FzMd5::fz_md5_final2()``. | |
| 1394 | |
| 1395 For all functions described below, there is also a ``ll_*`` variant that | |
| 1396 takes/returns raw MuPDF structs instead of wrapper classes. | |
| 1397 | |
| 1398 | |
| 1399 New functions | |
| 1400 """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
| 1401 | |
| 1402 * ``fz_buffer_extract_copy()``: Returns copy of buffer data as a Python ``bytes``. | |
| 1403 * ``fz_buffer_storage_memoryview(buffer, writable)``: Returns a readonly/writable Python memoryview onto ``buffer``. | |
| 1404 Relies on ``buffer`` existing and not changing size while the memory view is used. | |
| 1405 * ``fz_pixmap_samples_memoryview()``: Returns Python ``memoryview`` onto ``fz_pixmap`` data. | |
| 1406 | |
| 1407 * ``fz_lookup_metadata2(fzdocument, key)``: Returns key value or raises an exception if not found. | |
| 1408 * ``pdf_lookup_metadata2(pdfdocument, key)``: Returns key value or raises an exception if not found. | |
| 1409 | |
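The lifetime caveat on ``fz_buffer_storage_memoryview()`` is easier to see with a pure-Python analogy. The sketch below makes no MuPDF calls; it uses a ``bytearray`` as a stand-in for the buffer. Python itself refuses to resize a ``bytearray`` while a ``memoryview`` is exported from it. It is an assumption here, based on the wording above, that a MuPDF buffer offers no such guard, so the caller has to keep the buffer alive and unchanged in size.

```python
# Pure-Python analogy; no MuPDF calls are made here.
data = bytearray(b"abcd")
view = memoryview(data)      # zero-copy view onto the bytearray's storage

view[0] = ord("A")           # writes go straight through to the storage
assert bytes(data) == b"Abcd"

try:
    data.extend(b"efgh")     # resizing while a view is exported is refused
    resized = True
except BufferError:
    resized = False

view.release()               # once the view is released, resizing works
data.extend(b"efgh")
```

When a longer-lived copy is needed, ``fz_buffer_extract_copy()`` (listed above) returns independent ``bytes`` instead of a view.
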
| 1410 Implemented in Python | |
| 1411 """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
| 1412 | |
| 1413 * ``fz_format_output_path()`` | |
| 1414 * ``fz_story_positions()`` | |
| 1415 * ``pdf_dict_getl()`` | |
| 1416 * ``pdf_dict_putl()`` | |
| 1417 | |
| 1418 Non-standard API or implementation | |
| 1419 """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
| 1420 | |
| 1421 * ``fz_buffer_extract()``: Returns a *copy* of the original buffer data as a Python ``bytes``. Still clears the buffer. | |
| 1422 * ``fz_buffer_storage()``: Returns ``(size, data)`` where ``data`` is a low-level SWIG representation of the buffer's storage. | |
| 1423 * ``fz_convert_color()``: No ``float* fv`` param, instead returns ``(rgb0, rgb1, rgb2, rgb3)``. | |
| 1424 * ``fz_fill_text()``: ``color`` arg is tuple/list of 1-4 floats. | |
| 1425 * ``fz_lookup_metadata(fzdocument, key)``: Returns key value or None if not found. | |
| 1426 * ``fz_new_buffer_from_copied_data()``: Takes a Python ``bytes`` (or other Python buffer) instance. | |
| 1427 * ``fz_set_error_callback()``: Takes a Python callable; no ``void* user`` arg. | |
| 1428 * ``fz_set_warning_callback()``: Takes a Python callable; no ``void* user`` arg. | |
| 1429 * ``fz_warn()``: Takes single Python ``str`` arg. | |
| 1430 * ``pdf_dict_putl_drop()``: Always raises exception because not useful with automatic ref-counts. | |
| 1431 * ``pdf_load_field_name()``: Uses extra C++ function ``pdf_load_field_name2()`` which returns ``std::string`` by value. | |
| 1432 * ``pdf_lookup_metadata(pdfdocument, key)``: Returns key value or None if not found. | |
| 1433 * ``pdf_set_annot_color()``: Takes single ``color`` arg which must be float or tuple of 1-4 floats. | |
| 1434 * ``pdf_set_annot_interior_color()``: Takes single ``color`` arg which must be float or tuple of 1-4 floats. | |
| 1435 * ``fz_install_load_system_font_funcs()``: Takes Python callbacks with no ``ctx`` arg, | |
| 1436 which can return ``None``, ``fz_font*`` or a ``mupdf.FzFont``. | |
| 1437 | |
| 1438 Example usage (from ``scripts/mupdfwrap_test.py:test_install_load_system_font()``):: | |
| 1439 | |
| 1440 def font_f(name, bold, italic, needs_exact_metrics): | |
| 1441 print(f'font_f(): Looking for font: {name=} {bold=} {italic=} {needs_exact_metrics=}.') | |
| 1442 return mupdf.fz_new_font_from_file(...) | |
| 1443 def f_cjk(name, ordering, serif): | |
| 1444 print(f'f_cjk(): Looking for font: {name=} {ordering=} {serif=}.') | |
| 1445 return None | |
| 1446 def f_fallback(script, language, serif, bold, italic): | |
| 1447 print(f'f_fallback(): looking for font: {script=} {language=} {serif=} {bold=} {italic=}.') | |
| 1448 return None | |
| 1449 mupdf.fz_install_load_system_font_funcs(font_f, f_cjk, f_fallback) | |
| 1450 | |
| 1451 | |
| 1452 Making MuPDF function pointers call Python code | |
| 1453 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
| 1454 | |
| 1455 Overview | |
| 1456 """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
| 1457 | |
| 1458 For MuPDF structs with function pointers, we provide a second C++ wrapper | |
| 1459 class for use by the Python bindings. | |
| 1460 | |
| 1461 * The second wrapper class has a ``2`` suffix, for example ``PdfFilterOptions2``. | |
| 1462 | |
| 1463 * This second wrapper class has a virtual method for each function pointer, so | |
| 1464 it can be used as a `SWIG Director class <https://swig.org/Doc4.0/SWIGDocumentation.html#SWIGPlus_target_language_callbacks>`_. | |
| 1465 | |
| 1466 * Overriding a virtual method in Python results in the Python method being | |
| 1467 called when MuPDF C code calls the corresponding function pointer. | |
| 1468 | |
| 1469 * One needs to activate the use of a Python method as a callback by calling the | |
| 1470 special method ``use_virtual_<method-name>()``. [It might be possible in future | |
| 1471 to remove the need to do this.] | |
| 1472 | |
| 1473 * It may be possible to use similar techniques in C# but this has not been | |
| 1474 tried. | |
| 1475 | |
| 1476 | |
| 1477 Callback args | |
| 1478 """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
| 1479 | |
| 1480 Python callbacks have args that are more low-level than in the rest of the | |
| 1481 Python API: | |
| 1482 | |
| 1483 * Callbacks generally have a first arg that is a SWIG representation of a MuPDF | |
| 1484 ``fz_context*``. | |
| 1485 | |
| 1486 * Where the underlying MuPDF function pointer has an arg that is a pointer to | |
| 1487 a MuPDF struct, unlike elsewhere in the MuPDF bindings we do not translate | |
| 1488 this into an instance of the corresponding wrapper class. Instead Python | |
| 1489 callbacks will see a SWIG representation of the low-level C pointer. | |
| 1490 | |
| 1491 * It is not safe to construct a Python wrapper class instance directly from | |
| 1492 such a SWIG representation of a C pointer, because it will break MuPDF's | |
| 1493 reference counting - Python/C++ constructors that take a raw pointer to a | |
| 1494 MuPDF struct do not call ``fz_keep_*()`` but the corresponding Python/C++ | |
| 1495 destructor will call ``fz_drop_*()``. | |
| 1496 | |
| 1497 * It might be safe to create a wrapper class instance using an explicit call | |
| 1498 to ``mupdf.fz_keep_*()``, but this has not been tried. | |
| 1499 | |
| 1500 * As of 2023-02-03, exceptions from Python callbacks are propagated back | |
| 1501 through the Python, C++, C, C++ and Python layers. The resulting Python | |
| 1502 exception will have the original exception text, but the original Python | |
| 1503 backtrace is lost. | |
| 1504 | |
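The reference-counting hazard described in the list above can be sketched with a toy model. Everything in this block is hypothetical and none of it is part of the ``mupdf`` module: ``RawStruct`` stands in for a reference-counted C struct, ``keep()`` and ``drop()`` stand in for ``fz_keep_*()`` and ``fz_drop_*()``, and ``Wrapper.close()`` stands in for the wrapper class destructor. It only illustrates the counting and says nothing about whether the explicit-keep approach is safe in the real bindings.

```python
# Toy model only: these names are illustrative and are not MuPDF APIs.

class RawStruct:
    """Stands in for a reference-counted C struct."""
    def __init__(self):
        self.refs = 1  # The code that created the struct owns one reference.


def keep(raw):
    """Analogue of fz_keep_*(): take one extra reference."""
    raw.refs += 1
    return raw


def drop(raw):
    """Analogue of fz_drop_*(): release one reference."""
    raw.refs -= 1


class Wrapper:
    """Analogue of a wrapper class constructed from a raw pointer."""
    def __init__(self, raw):
        self.raw = raw  # No keep() here, mirroring the raw-pointer constructors.

    def close(self):
        drop(self.raw)  # Explicit stand-in for the destructor's drop.


def callback_unsafe(raw):
    # Wraps the raw pointer directly: the wrapper drops a reference it never took.
    Wrapper(raw).close()


def callback_with_keep(raw):
    # Takes a reference first, so the wrapper's drop is balanced.
    Wrapper(keep(raw)).close()


def run(callback):
    """Return the reference count the caller sees after the callback returns."""
    raw = RawStruct()
    callback(raw)
    return raw.refs


print(run(callback_unsafe))     # The caller's own reference has been consumed.
print(run(callback_with_keep))  # The caller's reference is intact.
```

In this model the unsafe callback leaves the caller holding a struct whose count has reached zero, which in C would mean the struct has already been freed.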
| 1505 | |
| 1506 Exceptions in callbacks | |
| 1507 """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
| 1508 | |
| 1509 Python exceptions in Director callbacks are propagated back through the | |
| 1510 language layers (from Python to C++ to C, then back to C++ and finally to | |
| 1511 Python). | |
| 1512 | |
| 1513 For convenience, a text representation of the original Python backtrace is | |
| 1514 added to the exception text. However, the C layer's ``fz_try``/``fz_catch`` | |
| 1515 exception handling holds only 256 characters of exception text, so the backtrace | |
| 1516 may be truncated by the time the exception reaches the original Python code's ``except ...`` block. | |
| 1517 | |
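One possible way to keep the complete backtrace despite the 256-character limit is to record it on the Python side before the exception leaves the callback. The sketch below is pure Python and does not use the ``mupdf`` module: ``RecordingCallback`` is a hypothetical helper, and ``simulated_c_layer()`` is only a stand-in that imitates the truncation described above. In a real Director subclass the same ``try``/``except`` could presumably be placed inside the overridden method, but that has not been verified here.

```python
import traceback

C_MESSAGE_LIMIT = 256  # Size quoted in the text above.


class RecordingCallback:
    """Hypothetical helper: runs a callable and keeps the full traceback of any failure."""

    def __init__(self, fn):
        self.fn = fn
        self.last_traceback = None

    def __call__(self, *args):
        try:
            return self.fn(*args)
        except Exception:
            # Save the complete text before anything downstream can shorten it.
            self.last_traceback = traceback.format_exc()
            raise


def simulated_c_layer(callback, *args):
    """Stand-in for the C layer: re-raises with the text cut to C_MESSAGE_LIMIT."""
    try:
        return callback(*args)
    except Exception:
        text = traceback.format_exc()
        raise RuntimeError(text[:C_MESSAGE_LIMIT]) from None


def failing_filter(item):
    # Deliberately long message so that truncation is visible.
    raise ValueError("bad item: " + "x" * 400)


callback = RecordingCallback(failing_filter)
try:
    simulated_c_layer(callback, "item")
except RuntimeError as e:
    truncated = str(e)

print(len(truncated))                # Length of the truncated text seen by the caller.
print(len(callback.last_traceback))  # Length of the full text saved by the helper.
```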
| 1518 Example | |
| 1519 """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
| 1520 | |
| 1521 Here is an example PDF filter written in Python that removes alternating items: | |
| 1522 | |
| 1523 **Details** | |
| 1524 | |
| 1525 |expand_begin| | |
| 1526 | |
| 1527 .. code-block:: python | |
| 1528 | |
| 1529     import mupdf | |
| 1530 | |
| 1531     def test_filter(path): | |
| 1532         class MyFilter( mupdf.PdfFilterOptions2): | |
| 1533             def __init__( self): | |
| 1534                 super().__init__() | |
| 1535                 self.use_virtual_text_filter() | |
| 1536                 self.recurse = 1 | |
| 1537                 self.sanitize = 1 | |
| 1538                 self.state = 1 | |
| 1539                 self.ascii = True | |
| 1540             def text_filter( self, ctx, ucsbuf, ucslen, trm, ctm, bbox): | |
| 1541                 print( f'text_filter(): ctx={ctx} ucsbuf={ucsbuf} ucslen={ucslen} trm={trm} ctm={ctm} bbox={bbox}') | |
| 1542                 # Remove every other item. | |
| 1543                 self.state = 1 - self.state | |
| 1544                 return self.state | |
| 1545 | |
| 1546         filter_ = MyFilter() | |
| 1547 | |
| 1548         document = mupdf.PdfDocument(path) | |
| 1549         for p in range(document.pdf_count_pages()): | |
| 1550             page = document.pdf_load_page(p) | |
| 1551             print( f'Running document.pdf_filter_page_contents on page {p}') | |
| 1552             document.pdf_begin_operation('test filter') | |
| 1553             document.pdf_filter_page_contents(page, filter_) | |
| 1554             document.pdf_end_operation() | |
| 1555 | |
| 1556         document.pdf_save_document('foo.pdf', mupdf.PdfWriteOptions()) | |
| 1557 | |
| 1558 |expand_end| | |
| 1559 | |
| 1560 | |
| 1561 | |
| 1562 | |
| 1563 | |
| 1564 | |
| 1565 | |
| 1566 | |
| 1567 .. External links |
