comparison mupdf-source/thirdparty/tesseract/doc/tesseract.1.asc @ 2:b50eed0cc0ef upstream
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
The directory name has changed: no version number in the expanded directory now.
| author | Franz Glasner <fzglas.hg@dom66.de> |
|---|---|
| date | Mon, 15 Sep 2025 11:43:07 +0200 |
| parents | |
| children |
| 1:1d09e1dec1d9 | 2:b50eed0cc0ef |
|---|---|
| 1 TESSERACT(1) | |
| 2 ============ | |
| 3 :doctype: manpage | |
| 4 | |
| 5 NAME | |
| 6 ---- | |
| 7 tesseract - command-line OCR engine | |
| 8 | |
| 9 SYNOPSIS | |
| 10 -------- | |
| 11 *tesseract* 'FILE' 'OUTPUTBASE' ['OPTIONS']... ['CONFIGFILE']... | |
| 12 | |
| 13 DESCRIPTION | |
| 14 ----------- | |
| 15 tesseract(1) is a commercial quality OCR engine originally developed at HP | |
| 16 between 1985 and 1995. In 1995, this engine was among the top 3 evaluated by | |
| 17 UNLV. It was open-sourced by HP and UNLV in 2005, and was developed | |
| 18 at Google until 2018. | |
| 19 | |
| 20 | |
| 21 IN/OUT ARGUMENTS | |
| 22 ---------------- | |
| 23 'FILE':: | |
| 24 The name of the input file. | |
| 25 This can either be an image file or a text file. + | |
| 26 Most image file formats (anything readable by Leptonica) are supported. + | |
| 27 A text file lists the names of all input images (one image name per line). | |
| 28 The results will be combined in a single file for each output file format | |
| 29 (txt, pdf, hocr, xml). + | |
| 30 If 'FILE' is `stdin` or `-` then the standard input is used. | |
| 31 | |
| 32 'OUTPUTBASE':: | |
| 33 The basename of the output file (to which the appropriate extension | |
| 34 will be appended). By default the output will be a text file | |
| 35 with `.txt` added to the basename unless there are one or more | |
| 36 parameters set which explicitly specify the desired output. + | |
| 37 If 'OUTPUTBASE' is `stdout` or `-` then the standard output is used. | |
| 38 | |
| 39 | |
| 40 [[TESSDATADIR]] | |
| 41 OPTIONS | |
| 42 ------- | |
| 43 *-c* 'CONFIGVAR=VALUE':: | |
| 44 Set the value of parameter 'CONFIGVAR' to 'VALUE'. Multiple *-c* arguments are allowed. | |
| 45 | |
| 46 *--dpi* 'N':: | |
| 47 Specify the resolution 'N' in DPI for the input image(s). | |
| 48 A typical value for 'N' is `300`. Without this option, | |
| 49 the resolution is read from the metadata included in the image. | |
| 50 If an image does not include that information, Tesseract tries to guess it. | |
| 51 | |
| 52 *-l* 'LANG':: | |
| 53 *-l* 'SCRIPT':: | |
| 54 The language or script to use. | |
| 55 If none is specified, `eng` (English) is assumed. | |
| 56 Multiple languages may be specified, separated by plus characters. | |
| 57 Tesseract uses 3-character ISO 639-2 language codes | |
| 58 (see <<LANGUAGES,*LANGUAGES AND SCRIPTS*>>). | |
| 59 | |
| 60 *--psm* 'N':: | |
| 61 Set Tesseract to only run a subset of layout analysis and assume | |
| 62 a certain form of image. The options for 'N' are: | |
| 63 | |
| 64 0 = Orientation and script detection (OSD) only. | |
| 65 1 = Automatic page segmentation with OSD. | |
| 66 2 = Automatic page segmentation, but no OSD or OCR. (not implemented) | |
| 67 3 = Fully automatic page segmentation, but no OSD. (Default) | |
| 68 4 = Assume a single column of text of variable sizes. | |
| 69 5 = Assume a single uniform block of vertically aligned text. | |
| 70 6 = Assume a single uniform block of text. | |
| 71 7 = Treat the image as a single text line. | |
| 72 8 = Treat the image as a single word. | |
| 73 9 = Treat the image as a single word in a circle. | |
| 74 10 = Treat the image as a single character. | |
| 75 11 = Sparse text. Find as much text as possible in no particular order. | |
| 76 12 = Sparse text with OSD. | |
| 77 13 = Raw line. Treat the image as a single text line, | |
| 78 bypassing hacks that are Tesseract-specific. | |
| 79 | |
| 80 *--oem* 'N':: | |
| 81 Specify OCR Engine mode. The options for 'N' are: | |
| 82 | |
| 83 0 = Original Tesseract only. | |
| 84 1 = Neural nets LSTM only. | |
| 85 2 = Tesseract + LSTM. | |
| 86 3 = Default, based on what is available. | |
| 87 | |
| 88 *--tessdata-dir* 'PATH':: | |
| 89 Specify the location of the tessdata directory. | |
| 90 | |
| 91 *--user-patterns* 'FILE':: | |
| 92 Specify the location of the user patterns file. | |
| 93 | |
| 94 *--user-words* 'FILE':: | |
| 95 Specify the location of the user words file. | |
| 96 | |
| 97 [[CONFIGFILE]] | |
| 98 'CONFIGFILE':: | |
| 99 The name of a config to use. The name can be a file in `tessdata/configs` | |
| 100 or `tessdata/tessconfigs`, or an absolute or relative file path. | |
| 101 A config is a plain text file which contains a list of parameters and | |
| 102 their values, one per line, with a space separating parameter from value. + | |
| 103 Interesting config files include: | |
| 104 | |
| 105 * *alto* -- Output in ALTO format ('OUTPUTBASE'`.xml`). | |
| 106 * *hocr* -- Output in hOCR format ('OUTPUTBASE'`.hocr`). | |
| 107 * *page* -- Output in PAGE format ('OUTPUTBASE'`.page.xml`). | |
| 108 The output can be customized with the flags: | |
| 109 page_xml_polygon -- Create polygons instead of bounding boxes (default: true) | |
| 110 page_xml_level -- Create the PAGE file on 0=linelevel or 1=wordlevel (default: 0) | |
| 111 * *pdf* -- Output PDF ('OUTPUTBASE'`.pdf`). | |
| 112 * *tsv* -- Output TSV ('OUTPUTBASE'`.tsv`). | |
| 113 * *txt* -- Output plain text ('OUTPUTBASE'`.txt`). | |
| 114 * *get.images* -- Write processed input images to file ('OUTPUTBASE'`.processedPAGENUMBER.tif`). | |
| 115 * *logfile* -- Redirect debug messages to file (`tesseract.log`). | |
| 116 * *lstm.train* -- Output files used by LSTM training ('OUTPUTBASE'`.lstmf`). | |
| 117 * *makebox* -- Write box file ('OUTPUTBASE'`.box`). | |
| 118 * *quiet* -- Redirect debug messages to '/dev/null'. | |
| 119 | |
| 120 It is possible to select several config files, for example | |
| 121 `tesseract image.png demo alto hocr pdf txt` will create four output files | |
| 122 `demo.xml`, `demo.hocr`, `demo.pdf` and `demo.txt` with the OCR results. | |
| 123 | |
| 124 *Nota bene:* The options *-l* 'LANG', *-l* 'SCRIPT' and *--psm* 'N' | |
| 125 must occur before any 'CONFIGFILE'. | |
| 126 | |
| 127 | |
| 128 SINGLE OPTIONS | |
| 129 -------------- | |
| 130 *-h, --help*:: | |
| 131 Show help message. | |
| 132 | |
| 133 *--help-extra*:: | |
| 134 Show extra help for advanced users. | |
| 135 | |
| 136 *--help-psm*:: | |
| 137 Show page segmentation modes. | |
| 138 | |
| 139 *--help-oem*:: | |
| 140 Show OCR Engine modes. | |
| 141 | |
| 142 *-v, --version*:: | |
| 143 Show the current version of the tesseract(1) executable. | |
| 144 | |
| 145 *--list-langs*:: | |
| 146 List available languages for the tesseract engine. | |
| 147 Can be used with *--tessdata-dir* 'PATH'. | |
| 148 | |
| 149 *--print-parameters*:: | |
| 150 Print tesseract parameters. | |
| 151 | |
| 152 | |
| 153 [[LANGUAGES]] | |
| 154 LANGUAGES AND SCRIPTS | |
| 155 --------------------- | |
| 156 | |
| 157 To recognize some text with Tesseract, it is normally necessary to specify | |
| 158 the language(s) or script(s) of the text (unless it is English text which is | |
| 159 supported by default) using *-l* 'LANG' or *-l* 'SCRIPT'. | |
| 160 | |
| 161 Selecting a language automatically also selects the language specific | |
| 162 character set and dictionary (word list). | |
| 163 | |
| 164 Selecting a script typically selects all characters of that script | |
| 165 which can be from different languages. The dictionary which is included | |
| 166 also contains a mix from different languages. | |
| 167 In most cases, a script also supports English. | |
| 168 So it is possible to recognize a language that has not been specifically | |
| 169 trained for by using traineddata for the script it is written in. | |
| 170 | |
| 171 More than one language or script may be specified by using `+`. | |
| 172 Example: `tesseract myimage.png myimage -l eng+deu+fra`. | |
| 173 | |
| 174 https://github.com/tesseract-ocr/tessdata_fast provides fast language and | |
| 175 script models which are also part of Linux distributions. | |
| 176 | |
| 177 For Tesseract 4, `tessdata_fast` includes traineddata files for the | |
| 178 following languages: | |
| 179 | |
| 180 *afr* (Afrikaans), | |
| 181 *amh* (Amharic), | |
| 182 *ara* (Arabic), | |
| 183 *asm* (Assamese), | |
| 184 *aze* (Azerbaijani), | |
| 185 *aze_cyrl* (Azerbaijani - Cyrillic), | |
| 186 *bel* (Belarusian), | |
| 187 *ben* (Bengali), | |
| 188 *bod* (Tibetan), | |
| 189 *bos* (Bosnian), | |
| 190 *bre* (Breton), | |
| 191 *bul* (Bulgarian), | |
| 192 *cat* (Catalan; Valencian), | |
| 193 *ceb* (Cebuano), | |
| 194 *ces* (Czech), | |
| 195 *chi_sim* (Chinese simplified), | |
| 196 *chi_tra* (Chinese traditional), | |
| 197 *chr* (Cherokee), | |
| 198 *cos* (Corsican), | |
| 199 *cym* (Welsh), | |
| 200 *dan* (Danish), | |
| 201 *deu* (German), | |
| 202 *deu_latf* (German Fraktur Latin), | |
| 203 *div* (Dhivehi), | |
| 204 *dzo* (Dzongkha), | |
| 205 *ell* (Greek, Modern, 1453-), | |
| 206 *eng* (English), | |
| 207 *enm* (English, Middle, 1100-1500), | |
| 208 *epo* (Esperanto), | |
| 209 *equ* (Math / equation detection module), | |
| 210 *est* (Estonian), | |
| 211 *eus* (Basque), | |
| 212 *fas* (Persian), | |
| 213 *fao* (Faroese), | |
| 214 *fil* (Filipino), | |
| 215 *fin* (Finnish), | |
| 216 *fra* (French), | |
| 217 *frm* (French, Middle, ca.1400-1600), | |
| 218 *fry* (West Frisian), | |
| 219 *gla* (Scottish Gaelic), | |
| 220 *gle* (Irish), | |
| 221 *glg* (Galician), | |
| 222 *grc* (Greek, Ancient, to 1453), | |
| 223 *guj* (Gujarati), | |
| 224 *hat* (Haitian; Haitian Creole), | |
| 225 *heb* (Hebrew), | |
| 226 *hin* (Hindi), | |
| 227 *hrv* (Croatian), | |
| 228 *hun* (Hungarian), | |
| 229 *hye* (Armenian), | |
| 230 *iku* (Inuktitut), | |
| 231 *ind* (Indonesian), | |
| 232 *isl* (Icelandic), | |
| 233 *ita* (Italian), | |
| 234 *ita_old* (Italian - Old), | |
| 235 *jav* (Javanese), | |
| 236 *jpn* (Japanese), | |
| 237 *kan* (Kannada), | |
| 238 *kat* (Georgian), | |
| 239 *kat_old* (Georgian - Old), | |
| 240 *kaz* (Kazakh), | |
| 241 *khm* (Central Khmer), | |
| 242 *kir* (Kirghiz; Kyrgyz), | |
| 243 *kmr* (Kurdish Kurmanji), | |
| 244 *kor* (Korean), | |
| 245 *kor_vert* (Korean vertical), | |
| 246 *lao* (Lao), | |
| 247 *lat* (Latin), | |
| 248 *lav* (Latvian), | |
| 249 *lit* (Lithuanian), | |
| 250 *ltz* (Luxembourgish), | |
| 251 *mal* (Malayalam), | |
| 252 *mar* (Marathi), | |
| 253 *mkd* (Macedonian), | |
| 254 *mlt* (Maltese), | |
| 255 *mon* (Mongolian), | |
| 256 *mri* (Maori), | |
| 257 *msa* (Malay), | |
| 258 *mya* (Burmese), | |
| 259 *nep* (Nepali), | |
| 260 *nld* (Dutch; Flemish), | |
| 261 *nor* (Norwegian), | |
| 262 *oci* (Occitan post 1500), | |
| 263 *ori* (Oriya), | |
| 264 *osd* (Orientation and script detection module), | |
| 265 *pan* (Panjabi; Punjabi), | |
| 266 *pol* (Polish), | |
| 267 *por* (Portuguese), | |
| 268 *pus* (Pushto; Pashto), | |
| 269 *que* (Quechua), | |
| 270 *ron* (Romanian; Moldavian; Moldovan), | |
| 271 *rus* (Russian), | |
| 272 *san* (Sanskrit), | |
| 273 *sin* (Sinhala; Sinhalese), | |
| 274 *slk* (Slovak), | |
| 275 *slv* (Slovenian), | |
| 276 *snd* (Sindhi), | |
| 277 *spa* (Spanish; Castilian), | |
| 278 *spa_old* (Spanish; Castilian - Old), | |
| 279 *sqi* (Albanian), | |
| 280 *srp* (Serbian), | |
| 281 *srp_latn* (Serbian - Latin), | |
| 282 *sun* (Sundanese), | |
| 283 *swa* (Swahili), | |
| 284 *swe* (Swedish), | |
| 285 *syr* (Syriac), | |
| 286 *tam* (Tamil), | |
| 287 *tat* (Tatar), | |
| 288 *tel* (Telugu), | |
| 289 *tgk* (Tajik), | |
| 290 *tha* (Thai), | |
| 291 *tir* (Tigrinya), | |
| 292 *ton* (Tonga), | |
| 293 *tur* (Turkish), | |
| 294 *uig* (Uighur; Uyghur), | |
| 295 *ukr* (Ukrainian), | |
| 296 *urd* (Urdu), | |
| 297 *uzb* (Uzbek), | |
| 298 *uzb_cyrl* (Uzbek - Cyrillic), | |
| 299 *vie* (Vietnamese), | |
| 300 *yid* (Yiddish), | |
| 301 *yor* (Yoruba) | |
| 302 | |
| 303 To use a non-standard language pack named `foo.traineddata`, set the | |
| 304 `TESSDATA_PREFIX` environment variable so the file can be found at | |
| 305 `TESSDATA_PREFIX/tessdata/foo.traineddata` and give Tesseract the | |
| 306 argument *-l* `foo`. | |
| 307 | |
| 308 For Tesseract 4, `tessdata_fast` includes traineddata files for the | |
| 309 following scripts: | |
| 310 | |
| 311 *Arabic*, | |
| 312 *Armenian*, | |
| 313 *Bengali*, | |
| 314 *Canadian_Aboriginal*, | |
| 315 *Cherokee*, | |
| 316 *Cyrillic*, | |
| 317 *Devanagari*, | |
| 318 *Ethiopic*, | |
| 319 *Fraktur*, | |
| 320 *Georgian*, | |
| 321 *Greek*, | |
| 322 *Gujarati*, | |
| 323 *Gurmukhi*, | |
| 324 *HanS* (Han simplified), | |
| 325 *HanS_vert* (Han simplified, vertical), | |
| 326 *HanT* (Han traditional), | |
| 327 *HanT_vert* (Han traditional, vertical), | |
| 328 *Hangul*, | |
| 329 *Hangul_vert* (Hangul vertical), | |
| 330 *Hebrew*, | |
| 331 *Japanese*, | |
| 332 *Japanese_vert* (Japanese vertical), | |
| 333 *Kannada*, | |
| 334 *Khmer*, | |
| 335 *Lao*, | |
| 336 *Latin*, | |
| 337 *Malayalam*, | |
| 338 *Myanmar*, | |
| 339 *Oriya* (Odia), | |
| 340 *Sinhala*, | |
| 341 *Syriac*, | |
| 342 *Tamil*, | |
| 343 *Telugu*, | |
| 344 *Thaana*, | |
| 345 *Thai*, | |
| 346 *Tibetan*, | |
| 347 *Vietnamese*. | |
| 348 | |
| 349 The same languages and scripts are available from | |
| 350 https://github.com/tesseract-ocr/tessdata_best. | |
| 351 `tessdata_best` provides slow language and script models. | |
| 352 These models are needed for training. They also can give better OCR results, | |
| 353 but the recognition takes much more time. | |
| 354 | |
| 355 Both `tessdata_fast` and `tessdata_best` only support the LSTM OCR engine. | |
| 356 | |
| 357 There is a third repository, https://github.com/tesseract-ocr/tessdata, | |
| 358 with models which support both the Tesseract 3 legacy OCR engine and the | |
| 359 Tesseract 4 LSTM OCR engine. | |
| 360 | |
| 361 | |
| 362 CONFIG FILES AND AUGMENTING WITH USER DATA | |
| 363 ------------------------------------------ | |
| 364 | |
| 365 Tesseract config files consist of lines with parameter-value pairs (space | |
| 366 separated). The parameters are documented as flags in the source code like | |
| 367 the following one in tesseractclass.h: | |
| 368 | |
| 369 `STRING_VAR_H(tessedit_char_blacklist, "", | |
| 370 "Blacklist of chars not to recognize");` | |
| 371 | |
| 372 These parameters may enable or disable various features of the engine, and | |
| 373 may cause it to load (or not load) various data. For instance, let's suppose | |
| 374 you want to OCR in English, but suppress the normal dictionary and load an | |
| 375 alternative word list and an alternative list of patterns -- these two files | |
| 376 are the most commonly used extra data files. | |
| 377 | |
| 378 If your language pack is in '/path/to/eng.traineddata' and the hocr config | |
| 379 is in '/path/to/configs/hocr' then create three new files: | |
| 380 | |
| 381 '/path/to/eng.user-words': | |
| 382 [verse] | |
| 383 the | |
| 384 quick | |
| 385 brown | |
| 386 fox | |
| 387 jumped | |
| 388 | |
| 389 '/path/to/eng.user-patterns': | |
| 390 [verse] | |
| 391 1-\d\d\d-GOOG-411 | |
| 392 www.\n\\\*.com | |
| 393 | |
| 394 '/path/to/configs/bazaar': | |
| 395 [verse] | |
| 396 load_system_dawg F | |
| 397 load_freq_dawg F | |
| 398 user_words_suffix user-words | |
| 399 user_patterns_suffix user-patterns | |
| 400 | |
| 401 Now, if you pass the word 'bazaar' as a <<CONFIGFILE,'CONFIGFILE'>> to | |
| 402 Tesseract, Tesseract will not bother loading the system dictionary nor | |
| 403 the dictionary of frequent words and will load and use the 'eng.user-words' | |
| 404 and 'eng.user-patterns' files you provided. The former is a simple word list, | |
| 405 one per line. The format of the latter is documented in 'dict/trie.h' | |
| 406 on 'read_pattern_list()'. | |
| 407 | |
| 408 | |
| 409 ENVIRONMENT VARIABLES | |
| 410 --------------------- | |
| 411 *`TESSDATA_PREFIX`*:: | |
| 412 If the `TESSDATA_PREFIX` is set to a path, then that path is used to | |
| 413 find the `tessdata` directory with language and script recognition | |
| 414 models and config files. | |
| 415 Using <<TESSDATADIR,*--tessdata-dir* 'PATH'>> is the recommended alternative. | |
| 416 *`OMP_THREAD_LIMIT`*:: | |
| 417 If the `tesseract` executable was built with multithreading support, | |
| 418 it will normally use four CPU cores for the OCR process. While this | |
| 419 can be faster for a single image, it gives bad performance if the host | |
| 420 computer provides fewer than four CPU cores or if OCR is run on many images. | |
| 421 Only a single CPU core is used with `OMP_THREAD_LIMIT=1`. | |
| 422 | |
| 423 | |
| 424 HISTORY | |
| 425 ------- | |
| 426 The engine was developed at Hewlett Packard Laboratories Bristol and at | |
| 427 Hewlett Packard Co, Greeley Colorado between 1985 and 1994, with some more | |
| 428 changes made in 1996 to port to Windows, and some $$C++$$izing in 1998. A | |
| 429 lot of the code was written in C, and then some more was written in $$C++$$. | |
| 430 The $$C++$$ code makes heavy use of a list system using macros. This predates | |
| 431 STL, was portable before STL, and is more efficient than STL lists, but has | |
| 432 the big negative that if you do get a segmentation violation, it is hard to | |
| 433 debug. | |
| 434 | |
| 435 Version 2.00 brought Unicode (UTF-8) support, six languages, and the ability | |
| 436 to train Tesseract. | |
| 437 | |
| 438 Tesseract was included in UNLV's Fourth Annual Test of OCR Accuracy. | |
| 439 See <https://github.com/tesseract-ocr/docs/blob/main/AT-1995.pdf>. | |
| 440 Since Tesseract 2.00, | |
| 441 scripts have been included to allow anyone to reproduce some of these tests. | |
| 442 See <https://tesseract-ocr.github.io/tessdoc/TestingTesseract.html> for more | |
| 443 details. | |
| 444 | |
| 445 Tesseract 3.00 added a number of new languages, including Chinese, Japanese, | |
| 446 and Korean. It also introduced a new, single-file based system of managing | |
| 447 language data. | |
| 448 | |
| 449 Tesseract 3.02 added BiDirectional text support, the ability to recognize | |
| 450 multiple languages in a single image, and improved layout analysis. | |
| 451 | |
| 452 Tesseract 4 adds a new neural net (LSTM) based OCR engine which is focused | |
| 453 on line recognition, but also still supports the legacy Tesseract OCR engine of | |
| 454 Tesseract 3 which works by recognizing character patterns. Compatibility with | |
| 455 Tesseract 3 is enabled by `--oem 0`. This also needs traineddata files which | |
| 456 support the legacy engine, for example those from the tessdata repository | |
| 457 (https://github.com/tesseract-ocr/tessdata). | |
| 458 | |
| 459 For further details, see the release notes in the Tesseract documentation | |
| 460 (<https://tesseract-ocr.github.io/tessdoc/ReleaseNotes.html>). | |
| 461 | |
| 462 | |
| 463 RESOURCES | |
| 464 --------- | |
| 465 Main web site: <https://github.com/tesseract-ocr> + | |
| 466 User forum: <https://groups.google.com/g/tesseract-ocr> + | |
| 467 Documentation: <https://tesseract-ocr.github.io/> + | |
| 468 Information on training: <https://tesseract-ocr.github.io/tessdoc/Training-Tesseract.html> | |
| 469 | |
| 470 SEE ALSO | |
| 471 -------- | |
| 472 ambiguous_words(1), cntraining(1), combine_tessdata(1), dawg2wordlist(1), | |
| 473 shape_training(1), mftraining(1), unicharambigs(5), unicharset(5), | |
| 474 unicharset_extractor(1), wordlist2dawg(1) | |
| 475 | |
| 476 AUTHOR | |
| 477 ------ | |
| 478 Tesseract development was led at Hewlett-Packard and Google by Ray Smith. | |
| 479 The development team has included: | |
| 480 | |
| 481 Ahmad Abdulkader, Chris Newton, Dan Johnson, Dar-Shyang Lee, David Eger, | |
| 482 Eric Wiseblatt, Faisal Shafait, Hiroshi Takenaka, Joe Liu, Joern Wanke, | |
| 483 Mark Seaman, Mickey Namiki, Nicholas Beato, Oded Fuhrmann, Phil Cheatle, | |
| 484 Pingping Xiu, Pong Eksombatchai (Chantat), Ranjith Unnikrishnan, Raquel | |
| 485 Romano, Ray Smith, Rika Antonova, Robert Moss, Samuel Charron, Sheelagh | |
| 486 Lloyd, Shobhit Saxena, and Thomas Kielbus. | |
| 487 | |
| 488 For a list of contributors see | |
| 489 <https://github.com/tesseract-ocr/tesseract/blob/main/AUTHORS>. | |
| 490 | |
| 491 COPYING | |
| 492 ------- | |
| 493 Licensed under the Apache License, Version 2.0 |
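
The argument order described in the manpage (input file, output base, options such as *-l* and *--psm*, then any config file names) is easy to get wrong when the command is assembled by a script. The following is a minimal Python sketch of one way to build the argument list and to render a config file in the "parameter value" per line format. The helper names `build_tesseract_command` and `render_config` are hypothetical and not part of Tesseract; only the command-line flags and config parameters shown are taken from the text above.

```python
from typing import Dict, List, Optional, Sequence


def build_tesseract_command(
    image: str,
    output_base: str,
    languages: Sequence[str] = (),
    psm: Optional[int] = None,
    oem: Optional[int] = None,
    config_vars: Optional[Dict[str, str]] = None,
    config_files: Sequence[str] = (),
) -> List[str]:
    """Assemble an argument list for the tesseract(1) command line.

    Hypothetical helper: it only builds the list and does not run anything.
    """
    # The manpage lists page segmentation modes 0..13 and engine modes 0..3.
    if psm is not None and not 0 <= psm <= 13:
        raise ValueError("--psm expects a value from 0 to 13")
    if oem is not None and not 0 <= oem <= 3:
        raise ValueError("--oem expects a value from 0 to 3")

    cmd = ["tesseract", image, output_base]
    if languages:
        # Several languages or scripts are joined with '+', e.g. eng+deu+fra.
        cmd += ["-l", "+".join(languages)]
    if psm is not None:
        cmd += ["--psm", str(psm)]
    if oem is not None:
        cmd += ["--oem", str(oem)]
    for name, value in (config_vars or {}).items():
        cmd += ["-c", f"{name}={value}"]
    # Config file names go last: -l and --psm must occur before any CONFIGFILE.
    cmd += list(config_files)
    return cmd


def render_config(params: Dict[str, str]) -> str:
    """Render parameters as config file text: one 'parameter value' per line."""
    return "".join(f"{name} {value}\n" for name, value in params.items())


if __name__ == "__main__":
    print(build_tesseract_command(
        "myimage.png", "demo", languages=["eng", "deu"], psm=6,
        config_files=["hocr", "pdf"],
    ))
    print(render_config({"load_system_dawg": "F", "load_freq_dawg": "F"}))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` on a machine where the `tesseract` executable is installed. To limit Tesseract to a single CPU core when processing many images, `OMP_THREAD_LIMIT=1` can be set in the environment passed to that call.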
