# Introduction

Several tables in the OpenType format are formed internally by a graph of subtables. Parent nodes
reference their children through the use of positive offsets, which are typically 16 bits wide.
Since offsets are always positive this forms a directed acyclic graph. For storage in the font file
the graph must be given a topological ordering and the subtables are then packed serially according
to that ordering. Since 16 bit offsets have a maximum value of 65,535, if the distance between a
parent subtable and a child is more than 65,535 bytes then it's not possible for the offset to
encode that edge.

For many fonts with complex layout rules (such as Arabic) it's not unusual for the tables containing
layout rules ([GSUB/GPOS](https://docs.microsoft.com/en-us/typography/opentype/spec/gsub)) to be
larger than 65kb. As a result these types of fonts are susceptible to offset overflows when
serializing to the binary font format.

Offset overflows can happen for a variety of reasons and require different strategies to resolve:
* Simple overflows can often be resolved with a different topological ordering.
* If a subtable has many parents, the links from the furthest parent(s) are at risk of overflowing.
  In these cases it's possible to duplicate the shared subtable, which allows the copy to be placed
  closer to its parent.
* If subtables exist which are themselves larger than 65kb, it's not possible for any offsets to point
  past them. In these cases the subtable can usually be split into two smaller subtables to allow
  for more flexibility in the ordering.
* In GSUB/GPOS, overflows from Lookup subtables can be resolved by changing the Lookup to an extension
  lookup, which uses a 32 bit offset instead of a 16 bit offset.

In general there isn't a simple solution to produce an optimal topological ordering for a given graph.
Finding an ordering which doesn't overflow is an NP-hard problem. Existing solutions use heuristics
which apply a combination of the above strategies to try to find a non-overflowing configuration.

The harfbuzz subsetting library
[includes a repacking algorithm](https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-repacker.hh)
which is used to resolve offset overflows that are present in the subsetted tables it produces. This
document provides a deep dive into how the harfbuzz repacking algorithm works.

Other implementations exist, such as in
[fontTools](https://github.com/fonttools/fonttools/blob/7af43123d49c188fcef4e540fa94796b3b44e858/Lib/fontTools/ttLib/tables/otBase.py#L72); however, these are not covered in this document.

# Foundations

There are five key pieces to the harfbuzz approach:

* Subtable Graph: a table's internal structure is abstracted out into a lightweight graph
  representation where each subtable is a node and each offset forms an edge. The nodes only need
  to know how many bytes the corresponding subtable occupies. This lightweight representation can
  be easily modified to test new orderings and strategies as the repacking algorithm iterates
  (see the sketch after this list).

* [Topological sorting algorithm](https://en.wikipedia.org/wiki/Topological_sorting): an algorithm
  which, given a graph, produces a linear ordering of the nodes such that all offsets will be positive.

* Overflow check: given a graph and a topological sorting, it checks if there will be any overflows
  in any of the offsets. If there are overflows it returns a list of (parent, child) tuples that
  will overflow. Since the graph has information on the size of each subtable, it's straightforward
  to calculate the final position of each subtable and then check if any offsets to it will
  overflow.

* Content Aware Preprocessing: if the overflow resolver is aware of the format of the underlying
  tables (e.g. GSUB, GPOS) then in some cases preprocessing can be done to increase the chance of
  successfully packing the graph. For example, for GSUB and GPOS we can preprocess the graph and
  promote lookups to extension lookups (upgrading a 16 bit offset to 32 bits) or split large lookup
  subtables into two or more pieces.

* Offset resolution strategies: given a particular occurrence of an overflow these strategies
  modify the graph to attempt to resolve the overflow.

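To make the Subtable Graph and Overflow check ideas concrete, here is a minimal sketch in plain
Python. The `Node` class and `will_overflow` helper are hypothetical illustrations of the concepts
above, not harfbuzz's actual data structures: each node records only its size in bytes and its
outgoing offsets, which is enough to compute packed positions for a given ordering and to flag any
offset that cannot fit in its width.

```
from dataclasses import dataclass, field

@dataclass
class Node:
    size: int                                   # bytes occupied by the subtable
    edges: list = field(default_factory=list)   # (child_index, offset_bit_width) pairs

def will_overflow(nodes, order):
    """Return the (parent, child) pairs whose offsets overflow for this packing order."""
    position = {}
    offset = 0
    for idx in order:                           # compute the start position of each subtable
        position[idx] = offset
        offset += nodes[idx].size
    overflows = []
    for parent in order:
        for child, bits in nodes[parent].edges:
            distance = position[child] - position[parent]
            if distance < 0 or distance >= 2 ** bits:
                overflows.append((parent, child))
    return overflows

# Node 0 points at nodes 1 and 2 with 16 bit offsets; node 1 is large enough to
# push node 2 out of 16 bit range, so the (0, 2) link overflows.
nodes = [Node(10, [(1, 16), (2, 16)]), Node(70000), Node(20)]
print(will_overflow(nodes, [0, 1, 2]))          # -> [(0, 2)]
```

The later sketches in this document reuse this `Node` representation.
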
# High Level Algorithm

```
def repack(graph):
    graph.topological_sort()

    if graph.will_overflow():
        preprocess(graph)
        assign_spaces(graph)
        graph.topological_sort()

    while overflows := graph.will_overflow():
        for overflow in overflows:
            apply_offset_resolution_strategy(overflow, graph)
        graph.topological_sort()
```

The actual code for this processing loop can be found in the function hb_resolve_overflows() of
[hb-repacker.hh](https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-repacker.hh).

# Topological Sorting Algorithms

The harfbuzz repacker uses two different algorithms for topological sorting:
* [Kahn's Algorithm](https://en.wikipedia.org/wiki/Topological_sorting#Kahn's_algorithm)
* Sorting by shortest distance

Kahn's algorithm is approximately twice as fast as the shortest distance sort, so it is attempted
first (only on the first topological sort). If it fails to eliminate overflows then the shortest
distance sort will be used for all subsequent topological sorting operations.
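
For reference, a bare-bones Kahn's algorithm over the hypothetical `Node` sketch from the
Foundations section could look like the following; it is an illustration of the idea, not the
harfbuzz implementation.

```
from collections import deque

def kahn_sort(nodes):
    incoming = [0] * len(nodes)
    for node in nodes:
        for child, _bits in node.edges:
            incoming[child] += 1
    queue = deque(i for i, count in enumerate(incoming) if count == 0)   # start from the roots
    order = []
    while queue:
        current = queue.popleft()
        order.append(current)
        for child, _bits in nodes[current].edges:
            incoming[child] -= 1                 # child becomes ready once all parents are placed
            if incoming[child] == 0:
                queue.append(child)
    return order                                 # a valid topological order for an acyclic graph
```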

## Shortest Distance Sort

This algorithm orders the nodes based on the total distance to each node. Nodes with a shorter
distance are ordered first.

The "weight" of an edge is the size of the subtable being pointed to plus 2^16 for a 16 bit
offset or 2^32 for a 32 bit offset.

The distance of a node is the sum of all weights along the shortest path from the root to that node
plus a priority modifier (used to change where nodes are placed by increasing or decreasing the
effective distance). Ties between nodes with the same distance are broken based on the order of the
offset in the subtable bytes.

The shortest distance to each node is determined using
[Dijkstra's algorithm](https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm). Then the topological
ordering is produced by applying a modified version of Kahn's algorithm that uses a priority queue
based on the shortest distance to each node.
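
Putting these pieces together, a simplified sketch of the distance computation and the
distance-driven ordering might look like this. It reuses the hypothetical `Node` representation;
`edge_weight`, `shortest_distances`, and `shortest_distance_sort` are illustrative names, and the
priority modifier and tie-breaking rule are omitted.

```
import heapq

def edge_weight(nodes, child, bits):
    return nodes[child].size + 2 ** bits          # subtable size plus 2^16 or 2^32

def shortest_distances(nodes, root=0):
    """Dijkstra's algorithm from the root over the edge weights defined above."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, current = heapq.heappop(heap)
        if d > dist.get(current, float("inf")):   # stale queue entry, skip it
            continue
        for child, bits in nodes[current].edges:
            candidate = d + edge_weight(nodes, child, bits)
            if candidate < dist.get(child, float("inf")):
                dist[child] = candidate
                heapq.heappush(heap, (candidate, child))
    return dist

def shortest_distance_sort(nodes):
    """Kahn's algorithm with a priority queue keyed by shortest distance."""
    dist = shortest_distances(nodes)
    incoming = [0] * len(nodes)
    for node in nodes:
        for child, _bits in node.edges:
            incoming[child] += 1
    heap = [(dist.get(i, 0), i) for i, count in enumerate(incoming) if count == 0]
    heapq.heapify(heap)
    order = []
    while heap:
        _d, current = heapq.heappop(heap)
        order.append(current)
        for child, _bits in nodes[current].edges:
            incoming[child] -= 1
            if incoming[child] == 0:
                heapq.heappush(heap, (dist.get(child, 0), child))
    return order
```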

## Optimizing the Sorting

The topological sorting operation is the core of the repacker and is run on each iteration, so it needs
to be as fast as possible. There are a few things that are done to speed up subsequent sorting
operations:

* The number of incoming edges to each node is cached. This is required by the Kahn's algorithm
  portion of both sorts. Where possible, when the graph is modified we manually update the cached
  edge counts of affected nodes.

* The distance to each node is cached. Where possible, when the graph is modified we manually update
  the cached distances of any affected nodes.

Caching these values allows the repacker to avoid recalculating them for the full graph on each
iteration.

The other important factor for speed is a fast priority queue, which is a core data structure in
the topological sorting algorithm. Currently a basic heap based queue is used. Heap based queues
don't support fast priority decreases, but that can be worked around by just adding redundant entries
to the priority queue and filtering the older ones out when popping off entries. This is based
on the recommendations in
[a study of the practical performance of priority queues in Dijkstra's algorithm](https://www3.cs.stonybrook.edu/~rezaul/papers/TR-07-54.pdf).
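
As an illustration of that workaround (hypothetical code, not harfbuzz's queue implementation):
instead of decreasing a key in place, the better priority is simply pushed as a new entry, and stale
entries are skipped when popping.

```
import heapq

class LazyPriorityQueue:
    """Heap-backed queue that fakes decrease-key by pushing redundant entries."""

    def __init__(self):
        self._heap = []
        self._best = {}                           # item -> best priority pushed so far

    def push(self, priority, item):
        if priority < self._best.get(item, float("inf")):
            self._best[item] = priority
            heapq.heappush(self._heap, (priority, item))   # the older entry stays behind

    def pop(self):
        while self._heap:
            priority, item = heapq.heappop(self._heap)
            if self._best.get(item) == priority:  # filter out stale (superseded) entries
                del self._best[item]
                return priority, item
        raise IndexError("pop from empty queue")
```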

## Special Handling of 32 bit Offsets

If a graph contains multiple 32 bit offsets then the shortest distance sorting will likely be
suboptimal. For example, consider the case where a graph contains two 32 bit offsets that each point
to a subgraph, and the two subgraphs are not connected to each other. The shortest distance sort will
interleave the subtables of the two subgraphs, potentially resulting in overflows. Since these
subgraphs are independent of each other, and 32 bit offsets can span extremely long distances, a
better strategy is to pack the first subgraph in its entirety and then pack the second subgraph after
it, with the 32 bit offset pointing over the first subgraph. For example, given the graph:

```
a--- b -- d -- f
 \
  \_ c -- e -- g
```

Where the links from a to b and a to c are 32 bit offsets, the shortest distance sort would be:

```
a, b, c, d, e, f, g
```

If nodes d and e have a combined size greater than 65kb then the offset from d to f will overflow.
A better ordering is:

```
a, b, d, f, c, e, g
```

The ability of 32 bit offsets to point long distances is utilized to jump over the subgraph of
b, which gives the remaining 16 bit offsets a better chance of not overflowing.

The above is an ideal situation where the subgraphs are disconnected from each other; in practice
this is often not the case. So this idea can be generalized as follows:

If there is a subgraph that is only reachable from one or more 32 bit offsets, then:
* That subgraph can be treated as an independent unit and all nodes of the subgraph packed in isolation
  from the rest of the graph.
* In a table that occupies less than 4gb of space (in practice all fonts), that packed independent
  subgraph can be placed anywhere after the parent nodes without overflowing the 32 bit offsets from
  the parent nodes.

The sorting algorithm incorporates this via a "space" modifier that can be applied to nodes in the
graph. By default all nodes are treated as being in space zero. If a node is given a non-zero space, n,
then the computed distance to the node will be modified by adding `n * 2^32`. This will cause that
node and its descendants to be packed between all nodes in space n-1 and space n+1, resulting in a
topological sort like:

```
| space 0 subtables | space 1 subtables | .... | space n subtables |
```
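
As a small illustration of the formula (hypothetical code; the tuples below are made-up
(name, distance, space) triples), adding `n * 2^32` groups entire spaces one after another in the
sorted order:

```
def effective_distance(distance, space):
    return distance + space * 2 ** 32             # the space modifier described above

# (name, distance, space): sorting by effective distance keeps each space contiguous.
nodes = [("g", 200, 2), ("a", 0, 0), ("c", 100, 1), ("f", 70000, 0)]
print(sorted(nodes, key=lambda n: effective_distance(n[1], n[2])))
# -> [('a', 0, 0), ('f', 70000, 0), ('c', 100, 1), ('g', 200, 2)]
```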

The assign_spaces() step in the high level algorithm is responsible for identifying independent
subgraphs and assigning unique spaces to each one. More information on the space assignment can be
found in the Space Assignment section below.

# Graph Preprocessing

For certain table types we can preprocess and modify the graph structure to reduce the occurrences
of overflows. Currently the repacker implements preprocessing only for GPOS and GSUB tables.

## GSUB/GPOS Table Splitting

The GSUB/GPOS preprocessor scans each lookup subtable and determines if the subtable's children are
so large that no overflow resolution is possible (for example, a single subtable that exceeds 65kb
cannot be pointed over). When such cases are detected, table splitting is invoked:

* The subtable is first analyzed to determine the smallest number of split points that will allow
  for successful offset overflow resolution.

* Then the subtable in the graph representation is modified to actually perform the split at the
  previously computed split points. At a high level, splits are done by inserting new subtables
  which contain a subset of the data of the original subtable and then shrinking the original subtable.

Table splitting must be aware of the underlying format of each subtable type and thus needs custom
code for each subtable type. Currently subtable splitting is only supported for GPOS subtable types.
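
Purely as a conceptual sketch on the size-only `Node` graph from earlier (the function name and the
way the data is divided are made up; real splitting in harfbuzz is format aware and considerably
more involved), a split boils down to creating a new node that takes over part of the original's
data and linking the parent to both pieces:

```
def split_node(nodes, parent, child, split_at, offset_bits=16):
    """Hypothetical: split `child` at byte `split_at`, moving the tail into a new node."""
    original = nodes[child]
    new_index = len(nodes)
    # The new node takes the second part of the data; in this toy model it also
    # inherits the original's outgoing edges, which a real split would divide
    # according to the subtable format.
    nodes.append(Node(original.size - split_at, edges=list(original.edges)))
    original.size = split_at
    original.edges = []
    nodes[parent].edges.append((new_index, offset_bits))   # parent now points at both pieces
    return new_index
```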

## GSUB/GPOS Extension Lookup Promotion

In GSUB/GPOS tables, lookups can be regular lookups, which use 16 bit offsets to the child subtables,
or extension lookups, which use 32 bit offsets to the child subtables. If the subgraph of all
regular lookups is too large then it can be difficult to find an overflow-free configuration. This
can be remedied by promoting one or more regular lookups to extension lookups.

During preprocessing the graph is scanned to determine the size of the subgraph of regular lookups.
If that subgraph is found to be too big, then the analysis finds a set of lookups to promote to reduce
the subgraph size. Lastly the graph is modified to convert those lookups to extension lookups.
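
A hypothetical sketch of that decision over the `Node` graph (the greedy selection and the 64kb
threshold are illustrative choices, not necessarily the heuristic harfbuzz uses):

```
def subgraph_size(nodes, root):
    """Total size of the subtables reachable from `root` (sharing between lookups is ignored)."""
    seen, stack, total = set(), [root], 0
    while stack:
        idx = stack.pop()
        if idx in seen:
            continue
        seen.add(idx)
        total += nodes[idx].size
        stack.extend(child for child, _bits in nodes[idx].edges)
    return total

def pick_lookups_to_promote(nodes, lookup_indices, limit=2 ** 16):
    sizes = {i: subgraph_size(nodes, i) for i in lookup_indices}
    regular_total = sum(sizes.values())
    promoted = []
    for i in sorted(sizes, key=sizes.get, reverse=True):   # promote the biggest subgraphs first
        if regular_total <= limit:
            break
        promoted.append(i)
        regular_total -= sizes[i]
    return promoted
```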

# Offset Resolution Strategies

## Space Assignment

The goal of space assignment is to find connected subgraphs that are only reachable via 32 bit offsets
and then assign each such subgraph to a unique non-zero space. The algorithm is roughly (a code sketch
follows the list):

1. Collect the set, `S`, of nodes that are children of 32 bit offsets.

2. Do a directed traversal from each node in `S` and collect all encountered nodes into set `T`.
   Mark all nodes in the graph that are not in `T` as being in space 0.

3. Set `next_space = 1`.

4. While set `S` is not empty:

   a. Pick a node `n` in set `S`, then perform an undirected graph traversal and find the set `Q` of
      nodes that are reachable from `n`.

   b. During the traversal, if a node, `m`, has an edge to a node in space 0 then `m` must be duplicated
      to disconnect it from space 0.

   c. Remove all nodes in `Q` from `S` and assign all nodes in `Q` to `next_space`.

   d. Increment `next_space` by one.

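A simplified sketch of these steps over the `Node` graph used earlier (hypothetical code; the
duplication in step b is only noted in a comment here, whereas harfbuzz performs it during the
traversal):

```
def assign_spaces(nodes):
    # Step 1: the set S of nodes that are children of 32 bit offsets.
    roots_32 = {child for node in nodes for child, bits in node.edges if bits == 32}

    # Step 2: directed traversal from S collects T; everything else is space 0.
    in_t, stack = set(), list(roots_32)
    while stack:
        idx = stack.pop()
        if idx in in_t:
            continue
        in_t.add(idx)
        stack.extend(child for child, _bits in nodes[idx].edges)
    space = {i: 0 for i in range(len(nodes)) if i not in in_t}

    # Undirected adjacency restricted to T. A real implementation would first
    # duplicate any node in T that also has a parent in space 0 (step 4b).
    neighbours = {i: set() for i in in_t}
    for i, node in enumerate(nodes):
        for child, _bits in node.edges:
            if i in in_t and child in in_t:
                neighbours[i].add(child)
                neighbours[child].add(i)

    # Steps 3 and 4: peel off one connected subgraph at a time.
    next_space, remaining = 1, set(roots_32)
    while remaining:
        component, stack = set(), [remaining.pop()]
        while stack:                              # undirected traversal finds the set Q
            idx = stack.pop()
            if idx in component:
                continue
            component.add(idx)
            stack.extend(neighbours[idx])
        for idx in component:
            space[idx] = next_space
        remaining -= component
        next_space += 1
    return space
```
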
## Manual Iterative Resolutions

For each overflow in each iteration, the algorithm will attempt to apply offset overflow resolution
strategies to eliminate the overflow. The type of strategy applied depends on the characteristics
of the overflowing link (a dispatch sketch follows the list):

* If the overflowing offset is inside a space other than space 0, and that space has more
  than one 32 bit offset pointing into it, then subdivide the space by moving the subgraph reachable
  from one of the 32 bit offsets into a new space via the duplication of shared nodes.

* If the overflowing offset is pointing to a subtable with more than one incoming edge: duplicate
  the node so that the overflowing offset is pointing at its own copy of that node.

* Otherwise, attempt to move the child subtable closer to its parent. This is accomplished by
  raising the priority of all children of the parent. Next time the topological sort is run the
  children will be ordered closer to the parent.
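
For illustration, the choice between these strategies could be written as a small dispatcher like
the one below. All of the method names on `graph` are hypothetical stand-ins for the operations
described above; the real logic lives in hb_resolve_overflows() and the graph code it calls.

```
def resolve_overflow(graph, parent, child):
    """Hypothetical dispatcher mirroring the three strategies described above."""
    space = graph.space_of(child)
    if space != 0 and graph.num_32bit_roots(space) > 1:
        # Subdivide the space: move the subgraph under one of the 32 bit offsets
        # into a new space, duplicating any shared nodes.
        graph.isolate_subgraph(child)
    elif graph.incoming_edges(child) > 1:
        # Give the overflowing parent its own private copy of the child.
        graph.duplicate(parent, child)
    else:
        # Pull the parent's children closer by raising their priority for the next sort.
        graph.raise_children_priority(parent)
```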

# Test Cases

The harfbuzz repacker has tests defined using generic graphs:
[test-repacker.cc](https://github.com/harfbuzz/harfbuzz/blob/main/src/test-repacker.cc)

# Future Improvements

Currently for GPOS tables the repacker implementation is sufficient to handle both subsetting and the
general case of font compilation repacking. However, for GSUB the repacker is only sufficient for
subsetting-related overflows. To enable general case repacking of GSUB, support for splitting of
GSUB subtables will need to be added. Other table types such as COLRv1 shouldn't require table
splitting due to the wide use of 24 bit offsets throughout the table.

Beyond subtable splitting there are a couple of "nice to have" improvements, but these are not required
to support the general case:

* Extension demotion: currently extension promotion is supported, but in some cases, if the non-extension
  subgraph is underfilled, then packed size can be reduced by demoting extension lookups back to regular
  lookups.

* Currently only child nodes are moved to resolve offsets. However, in many cases moving a parent
  node closer to its children will have less impact on the size of other offsets. Thus the algorithm
  should use a heuristic (based on parent and child subtable sizes) to decide if the children's
  priority should be increased or the parent's priority decreased.
