comparison mupdf-source/docs/reference/c/overview.md @ 2:b50eed0cc0ef upstream
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
The directory name has changed: no version number in the expanded directory now.
| author | Franz Glasner <fzglas.hg@dom66.de> |
|---|---|
| date | Mon, 15 Sep 2025 11:43:07 +0200 |
| parents | |
| children | |
| 1:1d09e1dec1d9 | 2:b50eed0cc0ef |
|---|---|
| 1 # Overview | |
| 2 | |
| 3 ## Basic MuPDF usage example | |
| 4 | |
| 5 To limit the complexity and give an easier introduction | |
| 6 this code has no error handling at all, but any serious piece of code | |
| 7 using MuPDF should use the error handling strategies described below. | |
| 8 | |
| 9 ## Common function arguments | |
| 10 | |
| 11 Most functions in MuPDF's interface take a context argument. | |
| 12 | |
| 13 A context contains global state used by MuPDF inside functions when | |
| 14 parsing or rendering pages of the document. It contains for example: | |
| 15 | |
| 16 - an exception stack (see error handling below), | |
| 17 - a memory allocator (allowing for custom allocators), | |
| 18 - a resource store (for caching of images, fonts, etc.), | |
| 19 - a set of locks and (un-)locking functions (for multi-threading). | |
| 20 | |
| 21 Without the set of locks and accompanying functions the context and | |
| 22 its proxies may only be used in a single-threaded application. | |
| 23 | |
| 24 ## Error handling | |
| 25 | |
| 26 MuPDF uses a set of exception handling macros to simplify error return | |
| 27 and cleanup. Conceptually, they work a lot like C++'s try/catch | |
| 28 system, but do not require any special compiler support. | |
| 29 | |
| 30 The basic formulation is as follows: | |
| 31 | |
| 32 fz_try(ctx) | |
| 33 { | |
| 34 // Try to perform a task. Never 'return', 'goto' or | |
| 35 // 'longjmp' out of here. 'break' may be used to | |
| 36 // safely exit (just) the try block scope. | |
| 37 } | |
| 38 fz_always(ctx) | |
| 39 { | |
| 40 // Any code here is always executed, regardless of | |
| 41 // whether an exception was thrown within the try or | |
| 42 // not. Never 'return', 'goto' or longjmp out from | |
| 43 // here. 'break' may be used to safely exit (just) the | |
| 44 // always block scope. | |
| 45 } | |
| 46 fz_catch(ctx) | |
| 47 { | |
| 48 // This code is called (after any always block) only | |
| 49 // if something within the fz_try block (including any | |
| 50 // functions it called) threw an exception. The code | |
| 51 // here is expected to handle the exception (maybe | |
| 52 // record/report the error, cleanup any stray state | |
| 53 // etc) and can then either exit the block, or pass on | |
| 54 // the exception to a higher level (enclosing) fz_try | |
| 55 // block (using fz_throw, or fz_rethrow). | |
| 56 } | |
| 57 | |
| 58 The fz_always block is optional, and can safely be omitted. | |
| 59 | |
| 60 The macro based nature of this system has 3 main limitations: | |
| 61 | |
| 62 1. | |
| 63 Never return from within try (or 'goto' or longjmp out of it). | |
| 64 This upsets the internal housekeeping of the macros and will | |
| 65 cause problems later on. The code will detect such things | |
| 66 happening, but by then it is too late to give a helpful error | |
| 67 report as to where the original infraction occurred. | |
| 68 | |
| 69 2. | |
| 70 The `fz_try(ctx) { ... } fz_always(ctx) { ... } fz_catch(ctx) { ... }` construct is not one atomic C statement. | |
| 73 That is to say, if you do: | |
| 74 | |
| 75 if (condition) | |
| 76 fz_try(ctx) { ... } | |
| 77 fz_catch(ctx) { ... } | |
| 78 | |
| 79 then you will not get what you want. Use the following instead: | |
| 80 | |
| 81 if (condition) { | |
| 82 fz_try(ctx) { ... } | |
| 83 fz_catch(ctx) { ... } | |
| 84 } | |
| 85 | |
| 86 3. | |
| 87 The macros are implemented using setjmp and longjmp, and so | |
| 88 the standard C restrictions on the use of those functions | |
| 89 apply to fz_try/fz_catch too. | |
| 90 | |
| 91 In particular, any "truly local" | |
| 92 variable that is set between the start of fz_try and something | |
| 93 in fz_try throwing an exception may become undefined as part | |
| 94 of the process of throwing that exception. | |
| 95 | |
| 96 As a way of mitigating this problem, we provide a fz_var() | |
| 97 macro that tells the compiler to ensure that that variable is | |
| 98 not unset by the act of throwing the exception. | |
| 99 | |
| 100 A model piece of code using these macros then might be: | |
| 101 | |
| 102 house build_house(fz_context *ctx, plans *p) | |
| 103 { | |
| 104 material m = NULL; | |
| 105 walls w = NULL; | |
| 106 roof r = NULL; | |
| 107 house h = NULL; | |
| 108 tiles t = make_tiles(); | |
| 109 | |
| 110 fz_var(w); | |
| 111 fz_var(r); | |
| 112 fz_var(h); | |
| 113 fz_var(m); | |
| 114 fz_try(ctx) | |
| 115 { | |
| 116 fz_try(ctx) | |
| 117 { | |
| 118 m = make_bricks(); | |
| 119 } | |
| 120 fz_catch(ctx) | |
| 121 { | |
| 122 // No bricks available, make do with straw? | |
| 123 m = make_straw(); | |
| 124 } | |
| 125 w = make_walls(m, p); | |
| 126 r = make_roof(m, t); | |
| 127 // Note, NOT: return combine(w,r); | |
| 128 h = combine(w, r); | |
| 129 } | |
| 130 fz_always(ctx) | |
| 131 { | |
| 132 drop_walls(w); | |
| 133 drop_roof(r); | |
| 134 drop_material(m); | |
| 135 drop_tiles(t); | |
| 136 } | |
| 137 fz_catch(ctx) | |
| 138 { | |
| 139 fz_throw(ctx, FZ_ERROR_GENERIC, "build_house failed"); | |
| 140 } | |
| 141 return h; | |
| 142 } | |
| 143 | |
| 144 Things to note about this: | |
| 145 | |
| 146 1. If make_tiles throws an exception, this will immediately be | |
| 147 handled by some higher level exception handler. If it | |
| 148 succeeds, t will be set before fz_try starts, so there is no | |
| 149 need to fz_var(t); | |
| 150 | |
| 151 2. We try first off to make some bricks as our building material. | |
| 152 If this fails, we fall back to straw. If that also fails, we'll end | |
| 153 up in the fz_catch, and the process will fail neatly. | |
| 154 | |
| 155 3. We assume in this code that combine takes a new reference to | |
| 156 both the walls and the roof it uses, and therefore that w and | |
| 157 r need to be cleaned up in all cases. | |
| 158 | |
| 159 4. We assume the standard C convention that it is safe to destroy | |
| 160 NULL things. | |
| 161 | |
| 162 ## Multi-threading | |
| 163 | |
| 164 First off, study the basic usage example in `docs/examples/example.c` | |
| 165 and make sure you understand how it works as the data structures manipulated | |
| 166 there will be referred to in this section too. | |
| 167 | |
| 168 MuPDF can usefully be built into a multi-threaded application without | |
| 169 the library needing to know anything about threading at all. If the library | |
| 170 opens a document in one thread, and then sits there as a 'server' | |
| 171 requesting pages and rendering them for other threads that need them, | |
| 172 then the library is only ever being called from this one thread. | |
| 173 | |
| 174 Other threads can still be used to handle UI requests etc, but as far | |
| 175 as MuPDF is concerned it is only being used in a single threaded way. | |
| 176 In this instance, there are no threading issues with MuPDF at all, | |
| 177 and it can safely be used without any locking, as described in the | |
| 178 previous sections. | |
| 179 | |
| 180 This section will attempt to explain how to use MuPDF in the more | |
| 181 complex case, where we genuinely want to call the MuPDF library | |
| 182 concurrently from multiple threads within a single application. | |
| 183 | |
| 184 MuPDF can be invoked with a user supplied set of locking functions. | |
| 185 It uses these to take mutexes around operations that would conflict | |
| 186 if performed concurrently in multiple threads. By leaving the | |
| 187 exact implementation of locks to the caller, MuPDF remains threading | |
| 188 library agnostic. | |
| 189 | |
| 190 The following simple rules should be followed to ensure that | |
| 191 multi-threaded operations run smoothly: | |
| 192 | |
| 193 1. | |
| 194 "No simultaneous calls to MuPDF in different threads are | |
| 195 allowed to use the same context." | |
| 196 | |
| 197 Most of the time it is simplest to just use a different | |
| 198 context for every thread; just create a new context at the | |
| 199 same time as you create the thread. For more details see | |
| 200 "Cloning the context" below. | |
| 201 | |
| 202 2. | |
| 203 "No simultaneous calls to MuPDF in different threads are | |
| 204 allowed to use the same document." | |
| 205 | |
| 206 Only one thread can be accessing a document at a time, but | |
| 207 once display lists are created from that document, multiple | |
| 208 threads at a time can operate on them. | |
| 209 | |
| 210 The document can be used from several different threads as | |
| 211 long as there are safeguards in place to prevent the usages | |
| 212 being simultaneous. | |
| 213 | |
| 214 3. | |
| 215 "No simultaneous calls to MuPDF in different threads are | |
| 216 allowed to use the same device." | |
| 217 | |
| 218 Calling a device simultaneously from different threads will | |
| 219 cause it to get confused and may crash. Calling a device from | |
| 220 several different threads is perfectly acceptable as long as | |
| 221 there are safeguards in place to prevent the calls being | |
| 222 simultaneous. | |
| 223 | |
| 224 4. | |
| 225 "An fz_locks_context must be supplied at context creation time, | |
| 226 unless MuPDF is to be used purely in a single thread at a time." | |
| 227 | |
| 228 MuPDF needs to protect against unsafe access to certain structures/ | |
| 229 resources/libraries from multiple threads. It does this by using | |
| 230 the user supplied locking functions. This holds true even when | |
| 231 using completely separate instances of MuPDF. | |
| 232 | |
| 233 5. | |
| 234 "All contexts in use must share the same fz_locks_context (or | |
| 235 the underlying locks thereof)." | |
| 236 | |
| 237 We strongly recommend that fz_new_context is called just once, | |
| 238 and fz_clone_context is called to generate new contexts from | |
| 239 that. This will automatically ensure that the same locking | |
| 240 mechanism is used in all MuPDF instances. For now, we do support | |
| 241 multiple completely independent contexts being created using | |
| 242 repeated calls to fz_new_context, but these MUST share the | |
| 243 same fz_locks_context (or at least depend upon the same underlying | |
| 244 locks). The facility to create different independent contexts | |
| 245 may be removed in future. | |
| 246 | |
| 247 So, how does a multi-threaded example differ from a non-multithreaded | |
| 248 one? | |
| 249 | |
| 250 Firstly, when we create the first context, we call fz_new_context | |
| 251 as before, but the second argument should be a pointer to a set | |
| 252 of locking functions. | |
| 253 | |
| 254 The calling code should provide FZ_LOCK_MAX mutexes, which will be | |
| 255 locked/unlocked by MuPDF calling the lock/unlock function pointers | |
| 256 in the supplied structure with the user pointer from the structure | |
| 257 and the lock number, i (0 <= i < FZ_LOCK_MAX). These mutexes can | |
| 258 safely be recursive or non-recursive as MuPDF only calls in a non- | |
| 259 recursive style. | |
| 260 | |
| 261 To make subsequent contexts, the user should NOT call fz_new_context | |
| 262 again (as this will fail to share important resources such as the | |
| 263 store and glyphcache), but should rather call fz_clone_context. | |
| 264 Each of these cloned contexts can be freed by fz_drop_context as | |
| 265 usual. They will share the important data structures (like store, | |
| 266 glyph cache etc) with the original context, but will have their | |
| 267 own exception stacks. | |
| 268 | |
| 269 To open a document, call fz_open_document as usual, passing a context | |
| 270 and a filename. It is important to realise that only one thread at a | |
| 271 time can be accessing the document itself. | |
| 272 | |
| 273 This means that only one thread at a time can perform operations such | |
| 274 as fetching a page, or rendering that page to a display list. Once a | |
| 275 display list has been obtained however, it can be rendered from any | |
| 276 other thread (or even from several threads simultaneously, giving | |
| 277 banded rendering). | |
| 278 | |
| 279 This means that an implementer has 2 basic choices when constructing | |
| 280 an application to use MuPDF in multi-threaded mode. Either they can | |
| 281 construct it so that a single nominated thread opens the document | |
| 282 and then acts as a 'server' creating display lists for other threads | |
| 283 to render, or they can add their own mutex around calls to MuPDF that | |
| 284 use the document. The former is likely to be far more efficient in | |
| 285 the long run. | |
| 286 | |
| 287 For an example of how to do multi-threading see | |
| 288 `docs/examples/multi-threaded.c` | |
| 289 which has a main thread and one rendering thread per page. | |
| 290 | |
| 291 ## Cloning the context | |
| 292 | |
| 293 As described above, every context contains an exception stack which is | |
| 294 manipulated during the course of nested fz_try/fz_catches. For obvious | |
| 295 reasons the same exception stack cannot be used from more than one | |
| 296 thread at a time. | |
| 297 | |
| 298 If, however, we simply created a new context (using fz_new_context) for | |
| 299 every thread, we would end up with separate stores/glyph caches etc, | |
| 300 which is not (generally) what is desired. MuPDF therefore provides a | |
| 301 mechanism for "cloning" a context. This creates a new context that | |
| 302 shares everything with the given context, except for the exception | |
| 303 stack. | |
| 304 | |
| 305 A commonly used general scheme is therefore to create a 'base' context | |
| 306 at program start up, and to clone this repeatedly to get new contexts | |
| 307 that can be used on new threads. |
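
The per-context exception stack and the fz_var() restriction described above can be illustrated with a minimal, self-contained sketch in plain C. It does not use MuPDF at all: `demo_ctx`, `demo_throw` and `demo_run` are hypothetical names invented for this illustration, and the real macros do considerably more housekeeping. The sketch only assumes what the text states, namely that the mechanism rests on setjmp and longjmp and that each context owns its own stack of jump targets.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical stand-in for a context: just a small stack of jump buffers.
   This is not MuPDF's fz_context; it only mimics the "exception stack". */
typedef struct {
    jmp_buf stack[4];
    int top;            /* number of currently active try blocks */
} demo_ctx;

/* "Throw": pop the innermost try block and jump back to it. */
void demo_throw(demo_ctx *ctx)
{
    assert(ctx->top > 0);
    ctx->top--;
    longjmp(ctx->stack[ctx->top], 1);
}

/* Returns how far the guarded work got: 2 on success, 1 if it threw. */
int demo_run(demo_ctx *ctx, int should_fail)
{
    /* 'volatile' plays the role that fz_var() plays in MuPDF code:
       without it, the value of 'progress' after longjmp is indeterminate,
       because it is a local changed between setjmp and longjmp. */
    volatile int progress = 0;
    int slot;

    assert(ctx->top < 4);
    slot = ctx->top++;                  /* push a new try block */

    if (setjmp(ctx->stack[slot]) == 0) {
        /* "try" body */
        progress = 1;
        if (should_fail)
            demo_throw(ctx);            /* does not return */
        progress = 2;
        ctx->top--;                     /* leave the try block normally */
    } else {
        /* "catch" body: demo_throw already popped the entry */
    }
    return progress;
}
```

Because the jump buffers live inside the context, two threads pushing and popping on the same `demo_ctx` would corrupt each other's unwinding, which is the reason given above for handing every thread its own (cloned) context.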
