Mercurial > hgrepos > Python2 > PyMuPDF
annotate mupdf-source/thirdparty/tesseract/src/ccstruct/pageres.cpp @ 43:202a1f38a622
FIX: New added code snippets did not yet account for the changed return values from "_fromto()"
| author | Franz Glasner <fzglas.hg@dom66.de> |
|---|---|
| date | Sat, 11 Oct 2025 17:15:08 +0200 |
| parents | b50eed0cc0ef |
| children |
| rev | line source |
|---|---|
|
2
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1 /********************************************************************** |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
2 * File: pageres.cpp (Formerly page_res.c) |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
3 * Description: Hierarchy of results classes from PAGE_RES to WERD_RES |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
4 * and an iterator class to iterate over the words. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
5 * Main purposes: |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
6 * Easy way to iterate over the words without a 3-nested loop. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
7 * Holds data used during word recognition. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
8 * Holds information about alternative spacing paths. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
9 * Author: Phil Cheatle |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
10 * |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
11 * (C) Copyright 1992, Hewlett-Packard Ltd. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
12 ** Licensed under the Apache License, Version 2.0 (the "License"); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
13 ** you may not use this file except in compliance with the License. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
14 ** You may obtain a copy of the License at |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
15 ** http://www.apache.org/licenses/LICENSE-2.0 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
16 ** Unless required by applicable law or agreed to in writing, software |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
17 ** distributed under the License is distributed on an "AS IS" BASIS, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
18 ** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
19 ** See the License for the specific language governing permissions and |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
20 ** limitations under the License. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
21 * |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
22 **********************************************************************/ |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
23 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
24 #include "pageres.h" |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
25 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
26 #include "blamer.h" // for BlamerBundle |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
27 #include "blobs.h" // for TWERD, TBLOB |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
28 #include "boxword.h" // for BoxWord |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
29 #include "errcode.h" // for ASSERT_HOST |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
30 #include "ocrblock.h" // for BLOCK_IT, BLOCK, BLOCK_LIST (ptr only) |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
31 #include "ocrrow.h" // for ROW, ROW_IT |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
32 #include "pdblock.h" // for PDBLK |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
33 #include "polyblk.h" // for POLY_BLOCK |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
34 #include "seam.h" // for SEAM, start_seam_list |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
35 #include "stepblob.h" // for C_BLOB_IT, C_BLOB, C_BLOB_LIST |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
36 #include "tprintf.h" // for tprintf |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
37 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
38 #include <tesseract/publictypes.h> // for OcrEngineMode, OEM_LSTM_ONLY |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
39 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
40 #include <cassert> // for assert |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
41 #include <cstdint> // for INT32_MAX |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
42 #include <cstring> // for strlen |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
43 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
44 struct Pix; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
45 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
46 namespace tesseract { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
47 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
48 // Gain factor for computing thresholds that determine the ambiguity of a |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
49 // word. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
50 static const double kStopperAmbiguityThresholdGain = 8.0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
51 // Constant offset for computing thresholds that determine the ambiguity of a |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
52 // word. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
53 static const double kStopperAmbiguityThresholdOffset = 1.5; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
54 // Max number of broken pieces to associate. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
55 const int kWordrecMaxNumJoinChunks = 4; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
56 // Max ratio of word box height to line size to allow it to be processed as |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
57 // a line with other words. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
58 const double kMaxWordSizeRatio = 1.25; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
59 // Max ratio of line box height to line size to allow a new word to be added. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
60 const double kMaxLineSizeRatio = 1.25; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
61 // Max ratio of word gap to line size to allow a new word to be added. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
62 const double kMaxWordGapRatio = 2.0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
63 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
64 // Computes and returns a threshold of certainty difference used to determine |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
65 // which words to keep, based on the adjustment factors of the two words. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
66 // TODO(rays) This is horrible. Replace with an enhance params training model. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
67 static double StopperAmbigThreshold(double f1, double f2) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
68 return (f2 - f1) * kStopperAmbiguityThresholdGain - |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
69 kStopperAmbiguityThresholdOffset; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
70 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
71 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
72 /************************************************************************* |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
73 * PAGE_RES::PAGE_RES |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
74 * |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
75 * Constructor for page results |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
76 *************************************************************************/ |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
77 PAGE_RES::PAGE_RES(bool merge_similar_words, BLOCK_LIST *the_block_list, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
78 WERD_CHOICE **prev_word_best_choice_ptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
79 Init(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
80 BLOCK_IT block_it(the_block_list); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
81 BLOCK_RES_IT block_res_it(&block_res_list); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
82 for (block_it.mark_cycle_pt(); !block_it.cycled_list(); block_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
83 block_res_it.add_to_end( |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
84 new BLOCK_RES(merge_similar_words, block_it.data())); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
85 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
86 prev_word_best_choice = prev_word_best_choice_ptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
87 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
88 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
89 /************************************************************************* |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
90 * BLOCK_RES::BLOCK_RES |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
91 * |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
92 * Constructor for BLOCK results |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
93 *************************************************************************/ |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
94 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
95 BLOCK_RES::BLOCK_RES(bool merge_similar_words, BLOCK *the_block) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
96 ROW_IT row_it(the_block->row_list()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
97 ROW_RES_IT row_res_it(&row_res_list); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
98 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
99 char_count = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
100 rej_count = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
101 font_class = -1; // not assigned |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
102 x_height = -1.0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
103 font_assigned = false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
104 row_count = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
105 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
106 block = the_block; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
107 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
108 for (row_it.mark_cycle_pt(); !row_it.cycled_list(); row_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
109 row_res_it.add_to_end(new ROW_RES(merge_similar_words, row_it.data())); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
110 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
111 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
112 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
113 /************************************************************************* |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
114 * ROW_RES::ROW_RES |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
115 * |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
116 * Constructor for ROW results |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
117 *************************************************************************/ |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
118 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
119 ROW_RES::ROW_RES(bool merge_similar_words, ROW *the_row) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
120 WERD_IT word_it(the_row->word_list()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
121 WERD_RES_IT word_res_it(&word_res_list); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
122 WERD_RES *combo = nullptr; // current combination of fuzzies |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
123 WERD *copy_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
124 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
125 char_count = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
126 rej_count = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
127 whole_word_rej_count = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
128 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
129 row = the_row; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
130 bool add_next_word = false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
131 TBOX union_box; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
132 float line_height = |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
133 the_row->x_height() + the_row->ascenders() - the_row->descenders(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
134 for (word_it.mark_cycle_pt(); !word_it.cycled_list(); word_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
135 auto *word_res = new WERD_RES(word_it.data()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
136 word_res->x_height = the_row->x_height(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
137 if (add_next_word) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
138 ASSERT_HOST(combo != nullptr); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
139 // We are adding this word to the combination. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
140 word_res->part_of_combo = true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
141 combo->copy_on(word_res); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
142 } else if (merge_similar_words) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
143 union_box = word_res->word->bounding_box(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
144 add_next_word = !word_res->word->flag(W_REP_CHAR) && |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
145 union_box.height() <= line_height * kMaxWordSizeRatio; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
146 word_res->odd_size = !add_next_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
147 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
148 WERD *next_word = word_it.data_relative(1); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
149 if (merge_similar_words) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
150 if (add_next_word && !next_word->flag(W_REP_CHAR)) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
151 // Next word will be added on if all of the following are true: |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
152 // Not a rep char. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
153 // Box height small enough. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
154 // Union box height small enough. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
155 // Horizontal gap small enough. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
156 TBOX next_box = next_word->bounding_box(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
157 int prev_right = union_box.right(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
158 union_box += next_box; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
159 if (next_box.height() > line_height * kMaxWordSizeRatio || |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
160 union_box.height() > line_height * kMaxLineSizeRatio || |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
161 next_box.left() > prev_right + line_height * kMaxWordGapRatio) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
162 add_next_word = false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
163 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
164 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
165 next_word->set_flag(W_FUZZY_NON, add_next_word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
166 } else { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
167 add_next_word = next_word->flag(W_FUZZY_NON); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
168 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
169 if (add_next_word) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
170 if (combo == nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
171 copy_word = new WERD; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
172 *copy_word = *(word_it.data()); // deep copy |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
173 combo = new WERD_RES(copy_word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
174 combo->x_height = the_row->x_height(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
175 combo->combination = true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
176 word_res_it.add_to_end(combo); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
177 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
178 word_res->part_of_combo = true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
179 } else { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
180 combo = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
181 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
182 word_res_it.add_to_end(word_res); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
183 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
184 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
185 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
186 WERD_RES &WERD_RES::operator=(const WERD_RES &source) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
187 this->ELIST_LINK::operator=(source); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
188 Clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
189 if (source.combination) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
190 word = new WERD; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
191 *word = *(source.word); // deep copy |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
192 } else { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
193 word = source.word; // pt to same word |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
194 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
195 if (source.bln_boxes != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
196 bln_boxes = new tesseract::BoxWord(*source.bln_boxes); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
197 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
198 if (source.chopped_word != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
199 chopped_word = new TWERD(*source.chopped_word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
200 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
201 if (source.rebuild_word != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
202 rebuild_word = new TWERD(*source.rebuild_word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
203 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
204 // TODO(rays) Do we ever need to copy the seam_array? |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
205 blob_row = source.blob_row; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
206 denorm = source.denorm; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
207 if (source.box_word != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
208 box_word = new tesseract::BoxWord(*source.box_word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
209 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
210 best_state = source.best_state; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
211 correct_text = source.correct_text; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
212 blob_widths = source.blob_widths; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
213 blob_gaps = source.blob_gaps; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
214 // None of the uses of operator= require the ratings matrix to be copied, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
215 // so don't as it would be really slow. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
216 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
217 // Copy the cooked choices. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
218 WERD_CHOICE_IT wc_it(const_cast<WERD_CHOICE_LIST *>(&source.best_choices)); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
219 WERD_CHOICE_IT wc_dest_it(&best_choices); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
220 for (wc_it.mark_cycle_pt(); !wc_it.cycled_list(); wc_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
221 const WERD_CHOICE *choice = wc_it.data(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
222 wc_dest_it.add_after_then_move(new WERD_CHOICE(*choice)); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
223 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
224 if (!wc_dest_it.empty()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
225 wc_dest_it.move_to_first(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
226 best_choice = wc_dest_it.data(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
227 } else { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
228 best_choice = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
229 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
230 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
231 if (source.raw_choice != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
232 raw_choice = new WERD_CHOICE(*source.raw_choice); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
233 } else { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
234 raw_choice = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
235 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
236 if (source.ep_choice != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
237 ep_choice = new WERD_CHOICE(*source.ep_choice); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
238 } else { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
239 ep_choice = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
240 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
241 reject_map = source.reject_map; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
242 combination = source.combination; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
243 part_of_combo = source.part_of_combo; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
244 CopySimpleFields(source); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
245 if (source.blamer_bundle != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
246 blamer_bundle = new BlamerBundle(*(source.blamer_bundle)); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
247 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
248 return *this; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
249 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
250 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
251 // Copies basic fields that don't involve pointers that might be useful |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
252 // to copy when making one WERD_RES from another. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
253 void WERD_RES::CopySimpleFields(const WERD_RES &source) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
254 tess_failed = source.tess_failed; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
255 tess_accepted = source.tess_accepted; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
256 tess_would_adapt = source.tess_would_adapt; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
257 done = source.done; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
258 unlv_crunch_mode = source.unlv_crunch_mode; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
259 small_caps = source.small_caps; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
260 odd_size = source.odd_size; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
261 fontinfo = source.fontinfo; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
262 fontinfo2 = source.fontinfo2; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
263 fontinfo_id_count = source.fontinfo_id_count; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
264 fontinfo_id2_count = source.fontinfo_id2_count; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
265 x_height = source.x_height; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
266 caps_height = source.caps_height; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
267 baseline_shift = source.baseline_shift; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
268 guessed_x_ht = source.guessed_x_ht; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
269 guessed_caps_ht = source.guessed_caps_ht; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
270 reject_spaces = source.reject_spaces; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
271 uch_set = source.uch_set; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
272 tesseract = source.tesseract; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
273 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
274 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
275 // Initializes a blank (default constructed) WERD_RES from one that has |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
276 // already been recognized. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
277 // Use SetupFor*Recognition afterwards to complete the setup and make |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
278 // it ready for a retry recognition. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
279 void WERD_RES::InitForRetryRecognition(const WERD_RES &source) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
280 word = source.word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
281 CopySimpleFields(source); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
282 if (source.blamer_bundle != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
283 blamer_bundle = new BlamerBundle(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
284 blamer_bundle->CopyTruth(*source.blamer_bundle); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
285 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
286 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
287 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
288 // Sets up the members used in recognition: bln_boxes, chopped_word, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
289 // seam_array, denorm. Returns false if |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
290 // the word is empty and sets up fake results. If use_body_size is |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
291 // true and row->body_size is set, then body_size will be used for |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
292 // blob normalization instead of xheight + ascrise. This flag is for |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
293 // those languages that are using CJK pitch model and thus it has to |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
294 // be true if and only if tesseract->textord_use_cjk_fp_model is |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
295 // true. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
296 // If allow_detailed_fx is true, the feature extractor will receive fine |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
297 // precision outline information, allowing smoother features and better |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
298 // features on low resolution images. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
299 // The norm_mode_hint sets the default mode for normalization in absence |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
300 // of any of the above flags. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
301 // norm_box is used to override the word bounding box to determine the |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
302 // normalization scale and offset. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
303 // Returns false if the word is empty and sets up fake results. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
304 bool WERD_RES::SetupForRecognition(const UNICHARSET &unicharset_in, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
305 tesseract::Tesseract *tess, Image pix, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
306 int norm_mode, const TBOX *norm_box, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
307 bool numeric_mode, bool use_body_size, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
308 bool allow_detailed_fx, ROW *row, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
309 const BLOCK *block) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
310 auto norm_mode_hint = static_cast<tesseract::OcrEngineMode>(norm_mode); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
311 tesseract = tess; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
312 POLY_BLOCK *pb = block != nullptr ? block->pdblk.poly_block() : nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
313 if ((norm_mode_hint != tesseract::OEM_LSTM_ONLY && |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
314 word->cblob_list()->empty()) || |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
315 (pb != nullptr && !pb->IsText())) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
316 // Empty words occur when all the blobs have been moved to the rej_blobs |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
317 // list, which seems to occur frequently in junk. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
318 SetupFake(unicharset_in); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
319 word->set_flag(W_REP_CHAR, false); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
320 return false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
321 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
322 ClearResults(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
323 SetupWordScript(unicharset_in); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
324 chopped_word = TWERD::PolygonalCopy(allow_detailed_fx, word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
325 float word_xheight = |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
326 use_body_size && row != nullptr && row->body_size() > 0.0f |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
327 ? row->body_size() |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
328 : x_height; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
329 chopped_word->BLNormalize(block, row, pix, word->flag(W_INVERSE), |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
330 word_xheight, baseline_shift, numeric_mode, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
331 norm_mode_hint, norm_box, &denorm); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
332 blob_row = row; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
333 SetupBasicsFromChoppedWord(unicharset_in); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
334 SetupBlamerBundle(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
335 int num_blobs = chopped_word->NumBlobs(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
336 ratings = new MATRIX(num_blobs, kWordrecMaxNumJoinChunks); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
337 tess_failed = false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
338 return true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
339 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
340 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
341 // Set up the seam array, bln_boxes, best_choice, and raw_choice to empty |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
342 // accumulators from a made chopped word. We presume the fields are already |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
343 // empty. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
344 void WERD_RES::SetupBasicsFromChoppedWord(const UNICHARSET &unicharset_in) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
345 bln_boxes = tesseract::BoxWord::CopyFromNormalized(chopped_word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
346 start_seam_list(chopped_word, &seam_array); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
347 SetupBlobWidthsAndGaps(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
348 ClearWordChoices(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
349 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
350 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
351 // Sets up the members used in recognition for an empty recognition result: |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
352 // bln_boxes, chopped_word, seam_array, denorm, best_choice, raw_choice. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
353 void WERD_RES::SetupFake(const UNICHARSET &unicharset_in) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
354 ClearResults(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
355 SetupWordScript(unicharset_in); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
356 chopped_word = new TWERD; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
357 rebuild_word = new TWERD; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
358 bln_boxes = new tesseract::BoxWord; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
359 box_word = new tesseract::BoxWord; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
360 int blob_count = word->cblob_list()->length(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
361 if (blob_count > 0) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
362 auto **fake_choices = new BLOB_CHOICE *[blob_count]; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
363 // For non-text blocks, just pass any blobs through to the box_word |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
364 // and call the word failed with a fake classification. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
365 C_BLOB_IT b_it(word->cblob_list()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
366 int blob_id = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
367 for (b_it.mark_cycle_pt(); !b_it.cycled_list(); b_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
368 TBOX box = b_it.data()->bounding_box(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
369 box_word->InsertBox(box_word->length(), box); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
370 fake_choices[blob_id++] = new BLOB_CHOICE; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
371 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
372 FakeClassifyWord(blob_count, fake_choices); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
373 delete[] fake_choices; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
374 } else { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
375 auto *word = new WERD_CHOICE(&unicharset_in); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
376 word->make_bad(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
377 LogNewRawChoice(word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
378 // Ownership of word is taken by *this WERD_RES in LogNewCookedChoice. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
379 LogNewCookedChoice(1, false, word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
380 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
381 tess_failed = true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
382 done = true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
383 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
384 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
385 void WERD_RES::SetupWordScript(const UNICHARSET &uch) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
386 uch_set = &uch; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
387 int script = uch.default_sid(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
388 word->set_script_id(script); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
389 word->set_flag(W_SCRIPT_HAS_XHEIGHT, uch.script_has_xheight()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
390 word->set_flag(W_SCRIPT_IS_LATIN, script == uch.latin_sid()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
391 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
392 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
393 // Sets up the blamer_bundle if it is not null, using the initialized denorm. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
394 void WERD_RES::SetupBlamerBundle() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
395 if (blamer_bundle != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
396 blamer_bundle->SetupNormTruthWord(denorm); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
397 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
398 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
399 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
400 // Computes the blob_widths and blob_gaps from the chopped_word. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
401 void WERD_RES::SetupBlobWidthsAndGaps() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
402 blob_widths.clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
403 blob_gaps.clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
404 int num_blobs = chopped_word->NumBlobs(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
405 for (int b = 0; b < num_blobs; ++b) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
406 TBLOB *blob = chopped_word->blobs[b]; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
407 TBOX box = blob->bounding_box(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
408 blob_widths.push_back(box.width()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
409 if (b + 1 < num_blobs) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
410 blob_gaps.push_back(chopped_word->blobs[b + 1]->bounding_box().left() - |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
411 box.right()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
412 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
413 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
414 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
415 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
416 // Updates internal data to account for a new SEAM (chop) at the given |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
417 // blob_number. Fixes the ratings matrix and states in the choices, as well |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
418 // as the blob widths and gaps. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
419 void WERD_RES::InsertSeam(int blob_number, SEAM *seam) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
420 // Insert the seam into the SEAMS array. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
421 seam->PrepareToInsertSeam(seam_array, chopped_word->blobs, blob_number, true); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
422 seam_array.insert(seam_array.begin() + blob_number, seam); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
423 if (ratings != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
424 // Expand the ratings matrix. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
425 ratings = ratings->ConsumeAndMakeBigger(blob_number); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
426 // Fix all the segmentation states. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
427 if (raw_choice != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
428 raw_choice->UpdateStateForSplit(blob_number); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
429 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
430 WERD_CHOICE_IT wc_it(&best_choices); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
431 for (wc_it.mark_cycle_pt(); !wc_it.cycled_list(); wc_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
432 WERD_CHOICE *choice = wc_it.data(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
433 choice->UpdateStateForSplit(blob_number); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
434 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
435 SetupBlobWidthsAndGaps(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
436 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
437 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
438 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
439 // Returns true if all the word choices except the first have adjust_factors |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
440 // worse than the given threshold. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
441 bool WERD_RES::AlternativeChoiceAdjustmentsWorseThan(float threshold) const { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
442 // The choices are not changed by this iteration. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
443 WERD_CHOICE_IT wc_it(const_cast<WERD_CHOICE_LIST *>(&best_choices)); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
444 for (wc_it.forward(); !wc_it.at_first(); wc_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
445 WERD_CHOICE *choice = wc_it.data(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
446 if (choice->adjust_factor() <= threshold) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
447 return false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
448 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
449 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
450 return true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
451 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
452 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
453 // Returns true if the current word is ambiguous (by number of answers or |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
454 // by dangerous ambigs.) |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
455 bool WERD_RES::IsAmbiguous() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
456 return !best_choices.singleton() || best_choice->dangerous_ambig_found(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
457 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
458 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
459 // Returns true if the ratings matrix size matches the sum of each of the |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
460 // segmentation states. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
461 bool WERD_RES::StatesAllValid() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
462 unsigned ratings_dim = ratings->dimension(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
463 if (raw_choice->TotalOfStates() != ratings_dim) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
464 tprintf("raw_choice has total of states = %u vs ratings dim of %u\n", |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
465 raw_choice->TotalOfStates(), ratings_dim); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
466 return false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
467 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
468 WERD_CHOICE_IT it(&best_choices); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
469 unsigned index = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
470 for (it.mark_cycle_pt(); !it.cycled_list(); it.forward(), ++index) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
471 WERD_CHOICE *choice = it.data(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
472 if (choice->TotalOfStates() != ratings_dim) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
473 tprintf("Cooked #%u has total of states = %u vs ratings dim of %u\n", |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
474 index, choice->TotalOfStates(), ratings_dim); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
475 return false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
476 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
477 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
478 return true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
479 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
480 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
481 // Prints a list of words found if debug is true or the word result matches |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
482 // the word_to_debug. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
483 void WERD_RES::DebugWordChoices(bool debug, const char *word_to_debug) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
484 if (debug || (word_to_debug != nullptr && *word_to_debug != '\0' && |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
485 best_choice != nullptr && |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
486 best_choice->unichar_string() == std::string(word_to_debug))) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
487 if (raw_choice != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
488 raw_choice->print("\nBest Raw Choice"); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
489 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
490 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
491 WERD_CHOICE_IT it(&best_choices); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
492 int index = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
493 for (it.mark_cycle_pt(); !it.cycled_list(); it.forward(), ++index) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
494 WERD_CHOICE *choice = it.data(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
495 std::string label; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
496 label += "\nCooked Choice #" + std::to_string(index); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
497 choice->print(label.c_str()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
498 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
499 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
500 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
501 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
502 // Prints the top choice along with the accepted/done flags. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
503 void WERD_RES::DebugTopChoice(const char *msg) const { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
504 tprintf("Best choice: accepted=%d, adaptable=%d, done=%d : ", tess_accepted, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
505 tess_would_adapt, done); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
506 if (best_choice == nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
507 tprintf("<Null choice>\n"); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
508 } else { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
509 best_choice->print(msg); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
510 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
511 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
512 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
513 // Removes from best_choices all choices which are not within a reasonable |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
514 // range of the best choice. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
515 // TODO(rays) incorporate the information used here into the params training |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
516 // re-ranker, in place of this heuristic that is based on the previous |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
517 // adjustment factor. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
518 void WERD_RES::FilterWordChoices(int debug_level) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
519 if (best_choice == nullptr || best_choices.singleton()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
520 return; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
521 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
522 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
523 if (debug_level >= 2) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
524 best_choice->print("\nFiltering against best choice"); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
525 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
526 WERD_CHOICE_IT it(&best_choices); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
527 int index = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
528 for (it.forward(); !it.at_first(); it.forward(), ++index) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
529 WERD_CHOICE *choice = it.data(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
530 float threshold = StopperAmbigThreshold(best_choice->adjust_factor(), |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
531 choice->adjust_factor()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
532 // i, j index the blob choice in choice, best_choice. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
533 // chunk is an index into the chopped_word blobs (AKA chunks). |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
534 // Since the two words may use different segmentations of the chunks, we |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
535 // iterate over the chunks to find out whether a comparable blob |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
536 // classification is much worse than the best result. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
537 unsigned i = 0, j = 0, chunk = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
538 // Each iteration of the while deals with 1 chunk. On entry choice_chunk |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
539 // and best_chunk are the indices of the first chunk in the NEXT blob, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
540 // i.e. we don't have to increment i, j while chunk < choice_chunk and |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
541 // best_chunk respectively. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
542 auto choice_chunk = choice->state(0), best_chunk = best_choice->state(0); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
543 while (i < choice->length() && j < best_choice->length()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
544 if (choice->unichar_id(i) != best_choice->unichar_id(j) && |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
545 choice->certainty(i) - best_choice->certainty(j) < threshold) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
546 if (debug_level >= 2) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
547 choice->print("WorstCertaintyDiffWorseThan"); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
548 tprintf( |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
549 "i %u j %u Choice->Blob[i].Certainty %.4g" |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
550 " WorstOtherChoiceCertainty %g Threshold %g\n", |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
551 i, j, choice->certainty(i), best_choice->certainty(j), threshold); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
552 tprintf("Discarding bad choice #%d\n", index); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
553 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
554 delete it.extract(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
555 break; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
556 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
557 ++chunk; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
558 // If needed, advance choice_chunk to keep up with chunk. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
559 while (choice_chunk < chunk && ++i < choice->length()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
560 choice_chunk += choice->state(i); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
561 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
562 // If needed, advance best_chunk to keep up with chunk. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
563 while (best_chunk < chunk && ++j < best_choice->length()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
564 best_chunk += best_choice->state(j); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
565 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
566 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
567 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
568 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
569 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
570 void WERD_RES::ComputeAdaptionThresholds(float certainty_scale, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
571 float min_rating, float max_rating, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
572 float rating_margin, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
573 float *thresholds) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
574 int chunk = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
575 int end_chunk = best_choice->state(0); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
576 int end_raw_chunk = raw_choice->state(0); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
577 int raw_blob = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
578 for (unsigned i = 0; i < best_choice->length(); i++, thresholds++) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
579 float avg_rating = 0.0f; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
580 int num_error_chunks = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
581 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
582 // For each chunk in best choice blob i, count non-matching raw results. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
583 while (chunk < end_chunk) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
584 if (chunk >= end_raw_chunk) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
585 ++raw_blob; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
586 end_raw_chunk += raw_choice->state(raw_blob); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
587 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
588 if (best_choice->unichar_id(i) != raw_choice->unichar_id(raw_blob)) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
589 avg_rating += raw_choice->certainty(raw_blob); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
590 ++num_error_chunks; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
591 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
592 ++chunk; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
593 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
594 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
595 if (num_error_chunks > 0) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
596 avg_rating /= num_error_chunks; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
597 *thresholds = (avg_rating / -certainty_scale) * (1.0 - rating_margin); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
598 } else { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
599 *thresholds = max_rating; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
600 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
601 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
602 if (*thresholds > max_rating) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
603 *thresholds = max_rating; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
604 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
605 if (*thresholds < min_rating) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
606 *thresholds = min_rating; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
607 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
608 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
609 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
610 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
611 // Saves a copy of the word_choice if it has the best unadjusted rating. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
612 // Returns true if the word_choice was the new best. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
613 bool WERD_RES::LogNewRawChoice(WERD_CHOICE *word_choice) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
614 if (raw_choice == nullptr || word_choice->rating() < raw_choice->rating()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
615 delete raw_choice; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
616 raw_choice = new WERD_CHOICE(*word_choice); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
617 raw_choice->set_permuter(TOP_CHOICE_PERM); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
618 return true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
619 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
620 return false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
621 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
622 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
623 // Consumes word_choice by adding it to best_choices, (taking ownership) if |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
624 // the certainty for word_choice is some distance of the best choice in |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
625 // best_choices, or by deleting the word_choice and returning false. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
626 // The best_choices list is kept in sorted order by rating. Duplicates are |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
627 // removed, and the list is kept no longer than max_num_choices in length. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
628 // Returns true if the word_choice is still a valid pointer. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
629 bool WERD_RES::LogNewCookedChoice(int max_num_choices, bool debug, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
630 WERD_CHOICE *word_choice) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
631 if (best_choice != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
632 // Throw out obviously bad choices to save some work. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
633 // TODO(rays) Get rid of this! This piece of code produces different |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
634 // results according to the order in which words are found, which is an |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
635 // undesirable behavior. It would be better to keep all the choices and |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
636 // prune them later when more information is available. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
637 float max_certainty_delta = StopperAmbigThreshold( |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
638 best_choice->adjust_factor(), word_choice->adjust_factor()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
639 if (max_certainty_delta > -kStopperAmbiguityThresholdOffset) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
640 max_certainty_delta = -kStopperAmbiguityThresholdOffset; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
641 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
642 if (word_choice->certainty() - best_choice->certainty() < |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
643 max_certainty_delta) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
644 if (debug) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
645 std::string bad_string; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
646 word_choice->string_and_lengths(&bad_string, nullptr); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
647 tprintf( |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
648 "Discarding choice \"%s\" with an overly low certainty" |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
649 " %.3f vs best choice certainty %.3f (Threshold: %.3f)\n", |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
650 bad_string.c_str(), word_choice->certainty(), |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
651 best_choice->certainty(), |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
652 max_certainty_delta + best_choice->certainty()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
653 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
654 delete word_choice; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
655 return false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
656 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
657 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
658 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
659 // Insert in the list in order of increasing rating, but knock out worse |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
660 // string duplicates. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
661 WERD_CHOICE_IT it(&best_choices); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
662 const std::string &new_str = word_choice->unichar_string(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
663 bool inserted = false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
664 int num_choices = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
665 if (!it.empty()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
666 do { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
667 WERD_CHOICE *choice = it.data(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
668 if (choice->rating() > word_choice->rating() && !inserted) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
669 // Time to insert. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
670 it.add_before_stay_put(word_choice); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
671 inserted = true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
672 if (num_choices == 0) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
673 best_choice = word_choice; // This is the new best. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
674 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
675 ++num_choices; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
676 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
677 if (choice->unichar_string() == new_str) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
678 if (inserted) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
679 // New is better. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
680 delete it.extract(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
681 } else { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
682 // Old is better. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
683 if (debug) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
684 tprintf("Discarding duplicate choice \"%s\", rating %g vs %g\n", |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
685 new_str.c_str(), word_choice->rating(), choice->rating()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
686 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
687 delete word_choice; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
688 return false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
689 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
690 } else { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
691 ++num_choices; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
692 if (num_choices > max_num_choices) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
693 delete it.extract(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
694 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
695 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
696 it.forward(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
697 } while (!it.at_first()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
698 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
699 if (!inserted && num_choices < max_num_choices) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
700 it.add_to_end(word_choice); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
701 inserted = true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
702 if (num_choices == 0) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
703 best_choice = word_choice; // This is the new best. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
704 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
705 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
706 if (debug) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
707 if (inserted) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
708 tprintf("New %s", best_choice == word_choice ? "Best" : "Secondary"); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
709 } else { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
710 tprintf("Poor"); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
711 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
712 word_choice->print(" Word Choice"); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
713 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
714 if (!inserted) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
715 delete word_choice; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
716 return false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
717 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
718 return true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
719 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
720 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
721 // Simple helper moves the ownership of the pointer data from src to dest, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
722 // first deleting anything in dest, and nulling out src afterwards. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
723 template <class T> |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
724 static void MovePointerData(T **dest, T **src) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
725 delete *dest; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
726 *dest = *src; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
727 *src = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
728 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
729 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
730 // Prints a brief list of all the best choices. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
731 void WERD_RES::PrintBestChoices() const { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
732 std::string alternates_str; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
733 WERD_CHOICE_IT it(const_cast<WERD_CHOICE_LIST *>(&best_choices)); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
734 for (it.mark_cycle_pt(); !it.cycled_list(); it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
735 if (!it.at_first()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
736 alternates_str += "\", \""; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
737 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
738 alternates_str += it.data()->unichar_string(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
739 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
740 tprintf("Alternates for \"%s\": {\"%s\"}\n", |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
741 best_choice->unichar_string().c_str(), alternates_str.c_str()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
742 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
743 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
744 // Returns the sum of the widths of the blob between start_blob and last_blob |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
745 // inclusive. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
746 int WERD_RES::GetBlobsWidth(int start_blob, int last_blob) const { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
747 int result = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
748 for (int b = start_blob; b <= last_blob; ++b) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
749 result += blob_widths[b]; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
750 if (b < last_blob) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
751 result += blob_gaps[b]; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
752 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
753 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
754 return result; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
755 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
756 // Returns the width of a gap between the specified blob and the next one. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
757 int WERD_RES::GetBlobsGap(unsigned blob_index) const { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
758 if (blob_index >= blob_gaps.size()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
759 return 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
760 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
761 return blob_gaps[blob_index]; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
762 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
763 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
764 // Returns the BLOB_CHOICE corresponding to the given index in the |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
765 // best choice word taken from the appropriate cell in the ratings MATRIX. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
766 // Borrowed pointer, so do not delete. May return nullptr if there is no |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
767 // BLOB_CHOICE matching the unichar_id at the given index. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
768 BLOB_CHOICE *WERD_RES::GetBlobChoice(unsigned index) const { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
769 if (index >= best_choice->length()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
770 return nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
771 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
772 BLOB_CHOICE_LIST *choices = GetBlobChoices(index); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
773 return FindMatchingChoice(best_choice->unichar_id(index), choices); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
774 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
775 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
776 // Returns the BLOB_CHOICE_LIST corresponding to the given index in the |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
777 // best choice word taken from the appropriate cell in the ratings MATRIX. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
778 // Borrowed pointer, so do not delete. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
779 BLOB_CHOICE_LIST *WERD_RES::GetBlobChoices(int index) const { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
780 return best_choice->blob_choices(index, ratings); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
781 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
782 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
783 // Moves the results fields from word to this. This takes ownership of all |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
784 // the data, so src can be destructed. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
785 void WERD_RES::ConsumeWordResults(WERD_RES *word) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
786 denorm = word->denorm; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
787 blob_row = word->blob_row; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
788 MovePointerData(&chopped_word, &word->chopped_word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
789 MovePointerData(&rebuild_word, &word->rebuild_word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
790 MovePointerData(&box_word, &word->box_word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
791 for (auto data : seam_array) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
792 delete data; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
793 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
794 seam_array = word->seam_array; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
795 word->seam_array.clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
796 // TODO: optimize moves. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
797 best_state = word->best_state; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
798 word->best_state.clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
799 correct_text = word->correct_text; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
800 word->correct_text.clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
801 blob_widths = word->blob_widths; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
802 word->blob_widths.clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
803 blob_gaps = word->blob_gaps; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
804 word->blob_gaps.clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
805 if (ratings != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
806 ratings->delete_matrix_pointers(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
807 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
808 MovePointerData(&ratings, &word->ratings); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
809 best_choice = word->best_choice; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
810 MovePointerData(&raw_choice, &word->raw_choice); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
811 best_choices.clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
812 WERD_CHOICE_IT wc_it(&best_choices); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
813 wc_it.add_list_after(&word->best_choices); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
814 reject_map = word->reject_map; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
815 if (word->blamer_bundle != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
816 assert(blamer_bundle != nullptr); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
817 blamer_bundle->CopyResults(*(word->blamer_bundle)); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
818 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
819 CopySimpleFields(*word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
820 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
821 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
822 // Replace the best choice and rebuild box word. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
823 // choice must be from the current best_choices list. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
824 void WERD_RES::ReplaceBestChoice(WERD_CHOICE *choice) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
825 best_choice = choice; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
826 RebuildBestState(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
827 SetupBoxWord(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
828 // Make up a fake reject map of the right length to keep the |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
829 // rejection pass happy. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
830 reject_map.initialise(best_state.size()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
831 done = tess_accepted = tess_would_adapt = true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
832 SetScriptPositions(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
833 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
834 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
835 // Builds the rebuild_word and sets the best_state from the chopped_word and |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
836 // the best_choice->state. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
837 void WERD_RES::RebuildBestState() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
838 ASSERT_HOST(best_choice != nullptr); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
839 delete rebuild_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
840 rebuild_word = new TWERD; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
841 if (seam_array.empty()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
842 start_seam_list(chopped_word, &seam_array); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
843 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
844 best_state.clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
845 int start = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
846 for (unsigned i = 0; i < best_choice->length(); ++i) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
847 int length = best_choice->state(i); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
848 best_state.push_back(length); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
849 if (length > 1) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
850 SEAM::JoinPieces(seam_array, chopped_word->blobs, start, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
851 start + length - 1); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
852 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
853 TBLOB *blob = chopped_word->blobs[start]; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
854 rebuild_word->blobs.push_back(new TBLOB(*blob)); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
855 if (length > 1) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
856 SEAM::BreakPieces(seam_array, chopped_word->blobs, start, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
857 start + length - 1); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
858 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
859 start += length; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
860 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
861 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
862 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
863 // Copies the chopped_word to the rebuild_word, faking a best_state as well. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
864 // Also sets up the output box_word. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
865 void WERD_RES::CloneChoppedToRebuild() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
866 delete rebuild_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
867 rebuild_word = new TWERD(*chopped_word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
868 SetupBoxWord(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
869 auto word_len = box_word->length(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
870 best_state.reserve(word_len); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
871 correct_text.reserve(word_len); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
872 for (unsigned i = 0; i < word_len; ++i) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
873 best_state.push_back(1); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
874 correct_text.emplace_back(""); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
875 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
876 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
877 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
878 // Sets/replaces the box_word with one made from the rebuild_word. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
879 void WERD_RES::SetupBoxWord() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
880 delete box_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
881 rebuild_word->ComputeBoundingBoxes(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
882 box_word = tesseract::BoxWord::CopyFromNormalized(rebuild_word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
883 box_word->ClipToOriginalWord(denorm.block(), word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
884 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
885 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
886 // Sets up the script positions in the output best_choice using the best_choice |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
887 // to get the unichars, and the unicharset to get the target positions. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
888 void WERD_RES::SetScriptPositions() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
889 best_choice->SetScriptPositions(small_caps, chopped_word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
890 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
891 // Sets all the blobs in all the words (raw choice and best choices) to be |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
892 // the given position. (When a sub/superscript is recognized as a separate |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
893 // word, it falls victim to the rule that a whole word cannot be sub or |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
894 // superscript, so this function overrides that problem.) |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
895 void WERD_RES::SetAllScriptPositions(tesseract::ScriptPos position) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
896 raw_choice->SetAllScriptPositions(position); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
897 WERD_CHOICE_IT wc_it(&best_choices); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
898 for (wc_it.mark_cycle_pt(); !wc_it.cycled_list(); wc_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
899 wc_it.data()->SetAllScriptPositions(position); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
900 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
901 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
902 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
903 // Classifies the word with some already-calculated BLOB_CHOICEs. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
904 // The choices are an array of blob_count pointers to BLOB_CHOICE, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
905 // providing a single classifier result for each blob. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
906 // The BLOB_CHOICEs are consumed and the word takes ownership. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
907 // The number of blobs in the box_word must match blob_count. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
908 void WERD_RES::FakeClassifyWord(unsigned blob_count, BLOB_CHOICE **choices) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
909 // Setup the WERD_RES. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
910 ASSERT_HOST(box_word != nullptr); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
911 ASSERT_HOST(blob_count == box_word->length()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
912 ClearWordChoices(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
913 ClearRatings(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
914 ratings = new MATRIX(blob_count, 1); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
915 for (unsigned c = 0; c < blob_count; ++c) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
916 auto *choice_list = new BLOB_CHOICE_LIST; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
917 BLOB_CHOICE_IT choice_it(choice_list); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
918 choice_it.add_after_then_move(choices[c]); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
919 ratings->put(c, c, choice_list); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
920 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
921 FakeWordFromRatings(TOP_CHOICE_PERM); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
922 reject_map.initialise(blob_count); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
923 best_state.clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
924 best_state.resize(blob_count, 1); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
925 done = true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
926 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
927 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
928 // Creates a WERD_CHOICE for the word using the top choices from the leading |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
929 // diagonal of the ratings matrix. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
930 void WERD_RES::FakeWordFromRatings(PermuterType permuter) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
931 int num_blobs = ratings->dimension(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
932 auto *word_choice = new WERD_CHOICE(uch_set, num_blobs); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
933 word_choice->set_permuter(permuter); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
934 for (int b = 0; b < num_blobs; ++b) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
935 UNICHAR_ID unichar_id = UNICHAR_SPACE; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
936 // Initialize rating and certainty like in WERD_CHOICE::make_bad(). |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
937 float rating = WERD_CHOICE::kBadRating; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
938 float certainty = -FLT_MAX; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
939 BLOB_CHOICE_LIST *choices = ratings->get(b, b); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
940 if (choices != nullptr && !choices->empty()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
941 BLOB_CHOICE_IT bc_it(choices); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
942 BLOB_CHOICE *choice = bc_it.data(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
943 unichar_id = choice->unichar_id(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
944 rating = choice->rating(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
945 certainty = choice->certainty(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
946 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
947 word_choice->append_unichar_id_space_allocated(unichar_id, 1, rating, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
948 certainty); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
949 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
950 LogNewRawChoice(word_choice); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
951 // Ownership of word_choice taken by word here. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
952 LogNewCookedChoice(1, false, word_choice); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
953 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
954 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
955 // Copies the best_choice strings to the correct_text for adaption/training. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
956 void WERD_RES::BestChoiceToCorrectText() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
957 correct_text.clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
958 ASSERT_HOST(best_choice != nullptr); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
959 for (unsigned i = 0; i < best_choice->length(); ++i) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
960 UNICHAR_ID choice_id = best_choice->unichar_id(i); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
961 const char *blob_choice = uch_set->id_to_unichar(choice_id); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
962 correct_text.emplace_back(blob_choice); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
963 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
964 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
965 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
966 // Merges 2 adjacent blobs in the result if the permanent callback |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
967 // class_cb returns other than INVALID_UNICHAR_ID, AND the permanent |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
968 // callback box_cb is nullptr or returns true, setting the merged blob |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
969 // result to the class returned from class_cb. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
970 // Returns true if anything was merged. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
971 bool WERD_RES::ConditionalBlobMerge( |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
972 const std::function<UNICHAR_ID(UNICHAR_ID, UNICHAR_ID)> &class_cb, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
973 const std::function<bool(const TBOX &, const TBOX &)> &box_cb) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
974 ASSERT_HOST(best_choice->empty() || ratings != nullptr); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
975 bool modified = false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
976 for (unsigned i = 0; i + 1 < best_choice->length(); ++i) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
977 UNICHAR_ID new_id = |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
978 class_cb(best_choice->unichar_id(i), best_choice->unichar_id(i + 1)); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
979 if (new_id != INVALID_UNICHAR_ID && |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
980 (box_cb == nullptr || |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
981 box_cb(box_word->BlobBox(i), box_word->BlobBox(i + 1)))) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
982 // Raw choice should not be fixed. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
983 best_choice->set_unichar_id(new_id, i); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
984 modified = true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
985 MergeAdjacentBlobs(i); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
986 const MATRIX_COORD &coord = best_choice->MatrixCoord(i); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
987 if (!coord.Valid(*ratings)) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
988 ratings->IncreaseBandSize(coord.row + 1 - coord.col); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
989 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
990 BLOB_CHOICE_LIST *blob_choices = GetBlobChoices(i); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
991 if (FindMatchingChoice(new_id, blob_choices) == nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
992 // Insert a fake result. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
993 auto *blob_choice = new BLOB_CHOICE; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
994 blob_choice->set_unichar_id(new_id); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
995 BLOB_CHOICE_IT bc_it(blob_choices); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
996 bc_it.add_before_then_move(blob_choice); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
997 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
998 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
999 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1000 return modified; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1001 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1002 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1003 // Merges 2 adjacent blobs in the result (index and index+1) and corrects |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1004 // all the data to account for the change. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1005 void WERD_RES::MergeAdjacentBlobs(unsigned index) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1006 if (reject_map.length() == best_choice->length()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1007 reject_map.remove_pos(index); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1008 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1009 best_choice->remove_unichar_id(index + 1); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1010 rebuild_word->MergeBlobs(index, index + 2); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1011 box_word->MergeBoxes(index, index + 2); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1012 if (index + 1 < best_state.size()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1013 best_state[index] += best_state[index + 1]; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1014 best_state.erase(best_state.begin() + index + 1); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1015 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1016 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1017 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1018 // TODO(tkielbus) Decide between keeping this behavior here or modifying the |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1019 // training data. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1020 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1021 // Utility function for fix_quotes |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1022 // Return true if the next character in the string (given the UTF8 length in |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1023 // bytes) is a quote character. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1024 static int is_simple_quote(const char *signed_str, int length) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1025 const auto *str = reinterpret_cast<const unsigned char *>(signed_str); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1026 // Standard 1 byte quotes. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1027 return (length == 1 && (*str == '\'' || *str == '`')) || |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1028 // UTF-8 3 bytes curved quotes. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1029 (length == 3 && |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1030 ((*str == 0xe2 && *(str + 1) == 0x80 && *(str + 2) == 0x98) || |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1031 (*str == 0xe2 && *(str + 1) == 0x80 && *(str + 2) == 0x99))); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1032 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1033 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1034 // Callback helper for fix_quotes returns a double quote if both |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1035 // arguments are quote, otherwise INVALID_UNICHAR_ID. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1036 UNICHAR_ID WERD_RES::BothQuotes(UNICHAR_ID id1, UNICHAR_ID id2) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1037 const char *ch = uch_set->id_to_unichar(id1); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1038 const char *next_ch = uch_set->id_to_unichar(id2); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1039 if (is_simple_quote(ch, strlen(ch)) && |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1040 is_simple_quote(next_ch, strlen(next_ch))) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1041 return uch_set->unichar_to_id("\""); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1042 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1043 return INVALID_UNICHAR_ID; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1044 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1045 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1046 // Change pairs of quotes to double quotes. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1047 void WERD_RES::fix_quotes() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1048 if (!uch_set->contains_unichar("\"") || |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1049 !uch_set->get_enabled(uch_set->unichar_to_id("\""))) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1050 return; // Don't create it if it is disallowed. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1051 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1052 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1053 using namespace std::placeholders; // for _1, _2 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1054 ConditionalBlobMerge(std::bind(&WERD_RES::BothQuotes, this, _1, _2), nullptr); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1055 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1056 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1057 // Callback helper for fix_hyphens returns UNICHAR_ID of - if both |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1058 // arguments are hyphen, otherwise INVALID_UNICHAR_ID. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1059 UNICHAR_ID WERD_RES::BothHyphens(UNICHAR_ID id1, UNICHAR_ID id2) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1060 const char *ch = uch_set->id_to_unichar(id1); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1061 const char *next_ch = uch_set->id_to_unichar(id2); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1062 if (strlen(ch) == 1 && strlen(next_ch) == 1 && (*ch == '-' || *ch == '~') && |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1063 (*next_ch == '-' || *next_ch == '~')) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1064 return uch_set->unichar_to_id("-"); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1065 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1066 return INVALID_UNICHAR_ID; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1067 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1068 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1069 // Callback helper for fix_hyphens returns true if box1 and box2 overlap |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1070 // (assuming both on the same textline, are in order and a chopped em dash.) |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1071 bool WERD_RES::HyphenBoxesOverlap(const TBOX &box1, const TBOX &box2) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1072 return box1.right() >= box2.left(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1073 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1074 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1075 // Change pairs of hyphens to a single hyphen if the bounding boxes touch |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1076 // Typically a long dash which has been segmented. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1077 void WERD_RES::fix_hyphens() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1078 if (!uch_set->contains_unichar("-") || |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1079 !uch_set->get_enabled(uch_set->unichar_to_id("-"))) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1080 return; // Don't create it if it is disallowed. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1081 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1082 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1083 using namespace std::placeholders; // for _1, _2 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1084 ConditionalBlobMerge(std::bind(&WERD_RES::BothHyphens, this, _1, _2), |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1085 std::bind(&WERD_RES::HyphenBoxesOverlap, this, _1, _2)); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1086 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1087 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1088 // Callback helper for merge_tess_fails returns a space if both |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1089 // arguments are space, otherwise INVALID_UNICHAR_ID. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1090 UNICHAR_ID WERD_RES::BothSpaces(UNICHAR_ID id1, UNICHAR_ID id2) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1091 if (id1 == id2 && id1 == uch_set->unichar_to_id(" ")) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1092 return id1; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1093 } else { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1094 return INVALID_UNICHAR_ID; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1095 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1096 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1097 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1098 // Change pairs of tess failures to a single one |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1099 void WERD_RES::merge_tess_fails() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1100 using namespace std::placeholders; // for _1, _2 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1101 if (ConditionalBlobMerge(std::bind(&WERD_RES::BothSpaces, this, _1, _2), |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1102 nullptr)) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1103 unsigned len = best_choice->length(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1104 ASSERT_HOST(reject_map.length() == len); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1105 ASSERT_HOST(box_word->length() == len); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1106 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1107 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1108 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1109 // Returns true if the collection of count pieces, starting at start, are all |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1110 // natural connected components, ie there are no real chops involved. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1111 bool WERD_RES::PiecesAllNatural(int start, int count) const { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1112 // all seams must have no splits. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1113 for (int index = start; index < start + count - 1; ++index) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1114 if (index >= 0 && static_cast<size_t>(index) < seam_array.size()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1115 SEAM *seam = seam_array[index]; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1116 if (seam != nullptr && seam->HasAnySplits()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1117 return false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1118 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1119 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1120 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1121 return true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1122 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1123 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1124 WERD_RES::~WERD_RES() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1125 Clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1126 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1127 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1128 void WERD_RES::Clear() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1129 if (combination) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1130 delete word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1131 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1132 word = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1133 delete blamer_bundle; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1134 blamer_bundle = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1135 ClearResults(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1136 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1137 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1138 void WERD_RES::ClearResults() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1139 done = false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1140 fontinfo = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1141 fontinfo2 = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1142 fontinfo_id_count = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1143 fontinfo_id2_count = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1144 delete bln_boxes; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1145 bln_boxes = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1146 blob_row = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1147 delete chopped_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1148 chopped_word = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1149 delete rebuild_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1150 rebuild_word = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1151 delete box_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1152 box_word = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1153 best_state.clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1154 correct_text.clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1155 for (auto data : seam_array) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1156 delete data; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1157 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1158 seam_array.clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1159 blob_widths.clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1160 blob_gaps.clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1161 ClearRatings(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1162 ClearWordChoices(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1163 if (blamer_bundle != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1164 blamer_bundle->ClearResults(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1165 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1166 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1167 void WERD_RES::ClearWordChoices() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1168 best_choice = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1169 delete raw_choice; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1170 raw_choice = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1171 best_choices.clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1172 delete ep_choice; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1173 ep_choice = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1174 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1175 void WERD_RES::ClearRatings() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1176 if (ratings != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1177 ratings->delete_matrix_pointers(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1178 delete ratings; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1179 ratings = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1180 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1181 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1182 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1183 int PAGE_RES_IT::cmp(const PAGE_RES_IT &other) const { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1184 ASSERT_HOST(page_res == other.page_res); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1185 if (other.block_res == nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1186 // other points to the end of the page. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1187 if (block_res == nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1188 return 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1189 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1190 return -1; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1191 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1192 if (block_res == nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1193 return 1; // we point to the end of the page. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1194 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1195 if (block_res == other.block_res) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1196 if (other.row_res == nullptr || row_res == nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1197 // this should only happen if we hit an image block. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1198 return 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1199 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1200 if (row_res == other.row_res) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1201 // we point to the same block and row. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1202 ASSERT_HOST(other.word_res != nullptr && word_res != nullptr); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1203 if (word_res == other.word_res) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1204 // we point to the same word! |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1205 return 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1206 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1207 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1208 WERD_RES_IT word_res_it(&row_res->word_res_list); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1209 for (word_res_it.mark_cycle_pt(); !word_res_it.cycled_list(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1210 word_res_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1211 if (word_res_it.data() == word_res) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1212 return -1; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1213 } else if (word_res_it.data() == other.word_res) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1214 return 1; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1215 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1216 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1217 ASSERT_HOST("Error: Incomparable PAGE_RES_ITs" == nullptr); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1218 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1219 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1220 // we both point to the same block, but different rows. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1221 ROW_RES_IT row_res_it(&block_res->row_res_list); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1222 for (row_res_it.mark_cycle_pt(); !row_res_it.cycled_list(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1223 row_res_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1224 if (row_res_it.data() == row_res) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1225 return -1; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1226 } else if (row_res_it.data() == other.row_res) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1227 return 1; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1228 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1229 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1230 ASSERT_HOST("Error: Incomparable PAGE_RES_ITs" == nullptr); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1231 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1232 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1233 // We point to different blocks. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1234 BLOCK_RES_IT block_res_it(&page_res->block_res_list); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1235 for (block_res_it.mark_cycle_pt(); !block_res_it.cycled_list(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1236 block_res_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1237 if (block_res_it.data() == block_res) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1238 return -1; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1239 } else if (block_res_it.data() == other.block_res) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1240 return 1; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1241 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1242 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1243 // Shouldn't happen... |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1244 ASSERT_HOST("Error: Incomparable PAGE_RES_ITs" == nullptr); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1245 return 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1246 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1247 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1248 // Inserts the new_word as a combination owned by a corresponding WERD_RES |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1249 // before the current position. The simple fields of the WERD_RES are copied |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1250 // from clone_res and the resulting WERD_RES is returned for further setup |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1251 // with best_choice etc. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1252 WERD_RES *PAGE_RES_IT::InsertSimpleCloneWord(const WERD_RES &clone_res, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1253 WERD *new_word) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1254 // Make a WERD_RES for the new_word. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1255 auto *new_res = new WERD_RES(new_word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1256 new_res->CopySimpleFields(clone_res); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1257 new_res->combination = true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1258 // Insert into the appropriate place in the ROW_RES. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1259 WERD_RES_IT wr_it(&row()->word_res_list); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1260 for (wr_it.mark_cycle_pt(); !wr_it.cycled_list(); wr_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1261 WERD_RES *word = wr_it.data(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1262 if (word == word_res) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1263 break; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1264 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1265 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1266 ASSERT_HOST(!wr_it.cycled_list()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1267 wr_it.add_before_then_move(new_res); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1268 if (wr_it.at_first()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1269 // This is the new first word, so reset the member iterator so it |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1270 // detects the cycled_list state correctly. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1271 ResetWordIterator(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1272 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1273 return new_res; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1274 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1275 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1276 // Helper computes the boundaries between blobs in the word. The blob bounds |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1277 // are likely very poor, if they come from LSTM, where it only outputs the |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1278 // character at one pixel within it, so we find the midpoints between them. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1279 static void ComputeBlobEnds(const WERD_RES &word, const TBOX &clip_box, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1280 C_BLOB_LIST *next_word_blobs, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1281 std::vector<int> *blob_ends) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1282 C_BLOB_IT blob_it(word.word->cblob_list()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1283 for (int length : word.best_state) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1284 // Get the bounding box of the fake blobs |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1285 TBOX blob_box = blob_it.data()->bounding_box(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1286 blob_it.forward(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1287 for (int b = 1; b < length; ++b) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1288 blob_box += blob_it.data()->bounding_box(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1289 blob_it.forward(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1290 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1291 // This blob_box is crap, so for now we are only looking for the |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1292 // boundaries between them. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1293 int blob_end = INT32_MAX; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1294 if (!blob_it.at_first() || next_word_blobs != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1295 if (blob_it.at_first()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1296 blob_it.set_to_list(next_word_blobs); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1297 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1298 blob_end = (blob_box.right() + blob_it.data()->bounding_box().left()) / 2; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1299 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1300 blob_end = ClipToRange<int>(blob_end, clip_box.left(), clip_box.right()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1301 blob_ends->push_back(blob_end); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1302 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1303 blob_ends->back() = clip_box.right(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1304 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1305 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1306 // Helper computes the bounds of a word by restricting it to existing words |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1307 // that significantly overlap. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1308 static TBOX ComputeWordBounds(const tesseract::PointerVector<WERD_RES> &words, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1309 int w_index, TBOX prev_box, WERD_RES_IT w_it) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1310 constexpr int kSignificantOverlapFraction = 4; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1311 TBOX clipped_box; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1312 TBOX current_box = words[w_index]->word->bounding_box(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1313 TBOX next_box; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1314 if (static_cast<size_t>(w_index + 1) < words.size() && |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1315 words[w_index + 1] != nullptr && words[w_index + 1]->word != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1316 next_box = words[w_index + 1]->word->bounding_box(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1317 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1318 for (w_it.forward(); !w_it.at_first() && w_it.data()->part_of_combo; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1319 w_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1320 if (w_it.data() == nullptr || w_it.data()->word == nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1321 continue; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1322 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1323 TBOX w_box = w_it.data()->word->bounding_box(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1324 int height_limit = std::min<int>(w_box.height(), w_box.width() / 2); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1325 int width_limit = w_box.width() / kSignificantOverlapFraction; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1326 int min_significant_overlap = std::max(height_limit, width_limit); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1327 int overlap = w_box.intersection(current_box).width(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1328 int prev_overlap = w_box.intersection(prev_box).width(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1329 int next_overlap = w_box.intersection(next_box).width(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1330 if (overlap > min_significant_overlap) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1331 if (prev_overlap > min_significant_overlap) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1332 // We have no choice but to use the LSTM word edge. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1333 clipped_box.set_left(current_box.left()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1334 } else if (next_overlap > min_significant_overlap) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1335 // We have no choice but to use the LSTM word edge. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1336 clipped_box.set_right(current_box.right()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1337 } else { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1338 clipped_box += w_box; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1339 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1340 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1341 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1342 if (clipped_box.height() <= 0) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1343 clipped_box.set_top(current_box.top()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1344 clipped_box.set_bottom(current_box.bottom()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1345 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1346 if (clipped_box.width() <= 0) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1347 clipped_box = current_box; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1348 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1349 return clipped_box; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1350 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1351 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1352 // Helper moves the blob from src to dest. If it isn't contained by clip_box, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1353 // the blob is replaced by a fake that is contained. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1354 static TBOX MoveAndClipBlob(C_BLOB_IT *src_it, C_BLOB_IT *dest_it, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1355 const TBOX &clip_box) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1356 C_BLOB *src_blob = src_it->extract(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1357 TBOX box = src_blob->bounding_box(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1358 if (!clip_box.contains(box)) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1359 int left = |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1360 ClipToRange<int>(box.left(), clip_box.left(), clip_box.right() - 1); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1361 int right = |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1362 ClipToRange<int>(box.right(), clip_box.left() + 1, clip_box.right()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1363 int top = |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1364 ClipToRange<int>(box.top(), clip_box.bottom() + 1, clip_box.top()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1365 int bottom = |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1366 ClipToRange<int>(box.bottom(), clip_box.bottom(), clip_box.top() - 1); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1367 box = TBOX(left, bottom, right, top); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1368 delete src_blob; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1369 src_blob = C_BLOB::FakeBlob(box); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1370 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1371 dest_it->add_after_then_move(src_blob); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1372 return box; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1373 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1374 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1375 // Replaces the current WERD/WERD_RES with the given words. The given words |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1376 // contain fake blobs that indicate the position of the characters. These are |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1377 // replaced with real blobs from the current word as much as possible. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1378 void PAGE_RES_IT::ReplaceCurrentWord( |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1379 tesseract::PointerVector<WERD_RES> *words) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1380 if (words->empty()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1381 DeleteCurrentWord(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1382 return; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1383 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1384 WERD_RES *input_word = word(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1385 // Set the BOL/EOL flags on the words from the input word. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1386 if (input_word->word->flag(W_BOL)) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1387 (*words)[0]->word->set_flag(W_BOL, true); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1388 } else { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1389 (*words)[0]->word->set_blanks(input_word->word->space()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1390 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1391 words->back()->word->set_flag(W_EOL, input_word->word->flag(W_EOL)); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1392 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1393 // Move the blobs from the input word to the new set of words. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1394 // If the input word_res is a combination, then the replacements will also be |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1395 // combinations, and will own their own words. If the input word_res is not a |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1396 // combination, then the final replacements will not be either, (although it |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1397 // is allowed for the input words to be combinations) and their words |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1398 // will get put on the row list. This maintains the ownership rules. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1399 WERD_IT w_it(row()->row->word_list()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1400 if (!input_word->combination) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1401 for (w_it.mark_cycle_pt(); !w_it.cycled_list(); w_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1402 WERD *word = w_it.data(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1403 if (word == input_word->word) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1404 break; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1405 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1406 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1407 // w_it is now set to the input_word's word. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1408 ASSERT_HOST(!w_it.cycled_list()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1409 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1410 // Insert into the appropriate place in the ROW_RES. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1411 WERD_RES_IT wr_it(&row()->word_res_list); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1412 for (wr_it.mark_cycle_pt(); !wr_it.cycled_list(); wr_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1413 WERD_RES *word = wr_it.data(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1414 if (word == input_word) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1415 break; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1416 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1417 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1418 ASSERT_HOST(!wr_it.cycled_list()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1419 // Since we only have an estimate of the bounds between blobs, use the blob |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1420 // x-middle as the determiner of where to put the blobs |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1421 C_BLOB_IT src_b_it(input_word->word->cblob_list()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1422 src_b_it.sort(&C_BLOB::SortByXMiddle); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1423 C_BLOB_IT rej_b_it(input_word->word->rej_cblob_list()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1424 rej_b_it.sort(&C_BLOB::SortByXMiddle); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1425 TBOX clip_box; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1426 for (size_t w = 0; w < words->size(); ++w) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1427 WERD_RES *word_w = (*words)[w]; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1428 clip_box = ComputeWordBounds(*words, w, clip_box, wr_it_of_current_word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1429 // Compute blob boundaries. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1430 std::vector<int> blob_ends; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1431 C_BLOB_LIST *next_word_blobs = |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1432 w + 1 < words->size() ? (*words)[w + 1]->word->cblob_list() : nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1433 ComputeBlobEnds(*word_w, clip_box, next_word_blobs, &blob_ends); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1434 // Remove the fake blobs on the current word, but keep safe for back-up if |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1435 // no blob can be found. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1436 C_BLOB_LIST fake_blobs; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1437 C_BLOB_IT fake_b_it(&fake_blobs); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1438 fake_b_it.add_list_after(word_w->word->cblob_list()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1439 fake_b_it.move_to_first(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1440 word_w->word->cblob_list()->clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1441 C_BLOB_IT dest_it(word_w->word->cblob_list()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1442 // Build the box word as we move the blobs. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1443 auto *box_word = new tesseract::BoxWord; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1444 for (size_t i = 0; i < blob_ends.size(); ++i, fake_b_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1445 int end_x = blob_ends[i]; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1446 TBOX blob_box; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1447 // Add the blobs up to end_x. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1448 while (!src_b_it.empty() && |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1449 src_b_it.data()->bounding_box().x_middle() < end_x) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1450 blob_box += MoveAndClipBlob(&src_b_it, &dest_it, clip_box); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1451 src_b_it.forward(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1452 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1453 while (!rej_b_it.empty() && |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1454 rej_b_it.data()->bounding_box().x_middle() < end_x) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1455 blob_box += MoveAndClipBlob(&rej_b_it, &dest_it, clip_box); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1456 rej_b_it.forward(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1457 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1458 if (blob_box.null_box()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1459 // Use the original box as a back-up. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1460 blob_box = MoveAndClipBlob(&fake_b_it, &dest_it, clip_box); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1461 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1462 box_word->InsertBox(i, blob_box); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1463 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1464 delete word_w->box_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1465 word_w->box_word = box_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1466 if (!input_word->combination) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1467 // Insert word_w->word into the ROW. It doesn't own its word, so the |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1468 // ROW needs to own it. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1469 w_it.add_before_stay_put(word_w->word); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1470 word_w->combination = false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1471 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1472 (*words)[w] = nullptr; // We are taking ownership. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1473 wr_it.add_before_stay_put(word_w); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1474 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1475 // We have taken ownership of the words. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1476 words->clear(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1477 // Delete the current word, which has been replaced. We could just call |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1478 // DeleteCurrentWord, but that would iterate both lists again, and we know |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1479 // we are already in the right place. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1480 if (!input_word->combination) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1481 delete w_it.extract(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1482 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1483 delete wr_it.extract(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1484 ResetWordIterator(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1485 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1486 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1487 // Deletes the current WERD_RES and its underlying WERD. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1488 void PAGE_RES_IT::DeleteCurrentWord() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1489 // Check that this word is as we expect. part_of_combos are NEVER iterated |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1490 // by the normal iterator, so we should never be trying to delete them. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1491 ASSERT_HOST(!word_res->part_of_combo); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1492 if (!word_res->combination) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1493 // Combinations own their own word, so we won't find the word on the |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1494 // row's word_list, but it is legitimate to try to delete them. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1495 // Delete word from the ROW when not a combination. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1496 WERD_IT w_it(row()->row->word_list()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1497 for (w_it.mark_cycle_pt(); !w_it.cycled_list(); w_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1498 if (w_it.data() == word_res->word) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1499 break; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1500 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1501 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1502 ASSERT_HOST(!w_it.cycled_list()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1503 delete w_it.extract(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1504 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1505 // Remove the WERD_RES for the new_word. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1506 // Remove the WORD_RES from the ROW_RES. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1507 WERD_RES_IT wr_it(&row()->word_res_list); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1508 for (wr_it.mark_cycle_pt(); !wr_it.cycled_list(); wr_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1509 if (wr_it.data() == word_res) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1510 word_res = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1511 break; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1512 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1513 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1514 ASSERT_HOST(!wr_it.cycled_list()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1515 delete wr_it.extract(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1516 ResetWordIterator(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1517 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1518 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1519 // Makes the current word a fuzzy space if not already fuzzy. Updates |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1520 // corresponding part of combo if required. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1521 void PAGE_RES_IT::MakeCurrentWordFuzzy() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1522 WERD *real_word = word_res->word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1523 if (!real_word->flag(W_FUZZY_SP) && !real_word->flag(W_FUZZY_NON)) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1524 real_word->set_flag(W_FUZZY_SP, true); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1525 if (word_res->combination) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1526 // The next word should be the corresponding part of combo, but we have |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1527 // already stepped past it, so find it by search. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1528 WERD_RES_IT wr_it(&row()->word_res_list); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1529 for (wr_it.mark_cycle_pt(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1530 !wr_it.cycled_list() && wr_it.data() != word_res; wr_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1531 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1532 wr_it.forward(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1533 ASSERT_HOST(wr_it.data()->part_of_combo); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1534 real_word = wr_it.data()->word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1535 ASSERT_HOST(!real_word->flag(W_FUZZY_SP) && |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1536 !real_word->flag(W_FUZZY_NON)); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1537 real_word->set_flag(W_FUZZY_SP, true); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1538 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1539 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1540 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1541 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1542 /************************************************************************* |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1543 * PAGE_RES_IT::restart_page |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1544 * |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1545 * Set things up at the start of the page |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1546 *************************************************************************/ |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1547 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1548 WERD_RES *PAGE_RES_IT::start_page(bool empty_ok) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1549 block_res_it.set_to_list(&page_res->block_res_list); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1550 block_res_it.mark_cycle_pt(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1551 prev_block_res = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1552 prev_row_res = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1553 prev_word_res = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1554 block_res = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1555 row_res = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1556 word_res = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1557 next_block_res = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1558 next_row_res = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1559 next_word_res = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1560 internal_forward(true, empty_ok); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1561 return internal_forward(false, empty_ok); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1562 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1563 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1564 // Recovers from operations on the current word, such as in InsertCloneWord |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1565 // and DeleteCurrentWord. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1566 // Resets the word_res_it so that it is one past the next_word_res, as |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1567 // it should be after internal_forward. If next_row_res != row_res, |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1568 // then the next_word_res is in the next row, so there is no need to do |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1569 // anything to word_res_it, but it is still a good idea to reset the pointers |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1570 // word_res and prev_word_res, which are still in the current row. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1571 void PAGE_RES_IT::ResetWordIterator() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1572 if (row_res == next_row_res) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1573 // Reset the member iterator so it can move forward and detect the |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1574 // cycled_list state correctly. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1575 word_res_it.move_to_first(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1576 for (word_res_it.mark_cycle_pt(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1577 !word_res_it.cycled_list() && word_res_it.data() != next_word_res; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1578 word_res_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1579 if (!word_res_it.data()->part_of_combo) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1580 if (prev_row_res == row_res) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1581 prev_word_res = word_res; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1582 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1583 word_res = word_res_it.data(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1584 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1585 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1586 ASSERT_HOST(!word_res_it.cycled_list()); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1587 wr_it_of_next_word = word_res_it; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1588 word_res_it.forward(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1589 } else { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1590 // word_res_it is OK, but reset word_res and prev_word_res if needed. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1591 WERD_RES_IT wr_it(&row_res->word_res_list); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1592 for (wr_it.mark_cycle_pt(); !wr_it.cycled_list(); wr_it.forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1593 if (!wr_it.data()->part_of_combo) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1594 if (prev_row_res == row_res) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1595 prev_word_res = word_res; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1596 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1597 word_res = wr_it.data(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1598 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1599 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1600 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1601 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1602 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1603 /************************************************************************* |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1604 * PAGE_RES_IT::internal_forward |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1605 * |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1606 * Find the next word on the page. If empty_ok is true, then non-text blocks |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1607 * and text blocks with no text are visited as if they contain a single |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1608 * imaginary word in a single imaginary row. (word() and row() both return |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1609 *nullptr in such a block and the return value is nullptr.) If empty_ok is |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1610 *false, the old behaviour is maintained. Each real word is visited and empty |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1611 *and non-text blocks and rows are skipped. new_block is used to initialize the |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1612 *iterators for a new block. The iterator maintains pointers to block, row and |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1613 *word for the previous, current and next words. These are correct, regardless |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1614 *of block/row boundaries. nullptr values denote start and end of the page. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1615 *************************************************************************/ |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1616 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1617 WERD_RES *PAGE_RES_IT::internal_forward(bool new_block, bool empty_ok) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1618 bool new_row = false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1619 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1620 prev_block_res = block_res; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1621 prev_row_res = row_res; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1622 prev_word_res = word_res; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1623 block_res = next_block_res; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1624 row_res = next_row_res; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1625 word_res = next_word_res; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1626 wr_it_of_current_word = wr_it_of_next_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1627 next_block_res = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1628 next_row_res = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1629 next_word_res = nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1630 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1631 while (!block_res_it.cycled_list()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1632 if (new_block) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1633 new_block = false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1634 row_res_it.set_to_list(&block_res_it.data()->row_res_list); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1635 row_res_it.mark_cycle_pt(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1636 if (row_res_it.empty() && empty_ok) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1637 next_block_res = block_res_it.data(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1638 break; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1639 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1640 new_row = true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1641 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1642 while (!row_res_it.cycled_list()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1643 if (new_row) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1644 new_row = false; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1645 word_res_it.set_to_list(&row_res_it.data()->word_res_list); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1646 word_res_it.mark_cycle_pt(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1647 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1648 // Skip any part_of_combo words. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1649 while (!word_res_it.cycled_list() && word_res_it.data()->part_of_combo) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1650 word_res_it.forward(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1651 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1652 if (!word_res_it.cycled_list()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1653 next_block_res = block_res_it.data(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1654 next_row_res = row_res_it.data(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1655 next_word_res = word_res_it.data(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1656 wr_it_of_next_word = word_res_it; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1657 word_res_it.forward(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1658 goto foundword; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1659 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1660 // end of row reached |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1661 row_res_it.forward(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1662 new_row = true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1663 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1664 // end of block reached |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1665 block_res_it.forward(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1666 new_block = true; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1667 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1668 foundword: |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1669 // Update prev_word_best_choice pointer. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1670 if (page_res != nullptr && page_res->prev_word_best_choice != nullptr) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1671 *page_res->prev_word_best_choice = (new_block || prev_word_res == nullptr) |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1672 ? nullptr |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1673 : prev_word_res->best_choice; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1674 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1675 return word_res; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1676 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1677 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1678 /************************************************************************* |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1679 * PAGE_RES_IT::restart_row() |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1680 * |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1681 * Move to the beginning (leftmost word) of the current row. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1682 *************************************************************************/ |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1683 WERD_RES *PAGE_RES_IT::restart_row() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1684 ROW_RES *row = this->row(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1685 if (!row) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1686 return nullptr; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1687 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1688 for (restart_page(); this->row() != row; forward()) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1689 // pass |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1690 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1691 return word(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1692 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1693 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1694 /************************************************************************* |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1695 * PAGE_RES_IT::forward_paragraph |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1696 * |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1697 * Move to the beginning of the next paragraph, allowing empty blocks. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1698 *************************************************************************/ |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1699 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1700 WERD_RES *PAGE_RES_IT::forward_paragraph() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1701 while (block_res == next_block_res && |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1702 (next_row_res != nullptr && next_row_res->row != nullptr && |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1703 row_res->row->para() == next_row_res->row->para())) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1704 internal_forward(false, true); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1705 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1706 return internal_forward(false, true); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1707 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1708 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1709 /************************************************************************* |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1710 * PAGE_RES_IT::forward_block |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1711 * |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1712 * Move to the beginning of the next block, allowing empty blocks. |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1713 *************************************************************************/ |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1714 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1715 WERD_RES *PAGE_RES_IT::forward_block() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1716 while (block_res == next_block_res) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1717 internal_forward(false, true); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1718 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1719 return internal_forward(false, true); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1720 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1721 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1722 void PAGE_RES_IT::rej_stat_word() { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1723 int16_t chars_in_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1724 int16_t rejects_in_word = 0; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1725 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1726 chars_in_word = word_res->reject_map.length(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1727 page_res->char_count += chars_in_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1728 block_res->char_count += chars_in_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1729 row_res->char_count += chars_in_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1730 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1731 rejects_in_word = word_res->reject_map.reject_count(); |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1732 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1733 page_res->rej_count += rejects_in_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1734 block_res->rej_count += rejects_in_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1735 row_res->rej_count += rejects_in_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1736 if (chars_in_word == rejects_in_word) { |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1737 row_res->whole_word_rej_count += rejects_in_word; |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1738 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1739 } |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1740 |
|
b50eed0cc0ef
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
Franz Glasner <fzglas.hg@dom66.de>
parents:
diff
changeset
|
1741 } // namespace tesseract |
