comparison mupdf-source/thirdparty/tesseract/doc/classifier_tester.1.asc @ 2:b50eed0cc0ef upstream
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
The directory name has changed: no version number in the expanded directory now.
| author | Franz Glasner <fzglas.hg@dom66.de> |
|---|---|
| date | Mon, 15 Sep 2025 11:43:07 +0200 |
| parents | |
| children | |
| 1:1d09e1dec1d9 | 2:b50eed0cc0ef |
|---|---|
| 1 CLASSIFIER_TESTER(1) | |
| 2 ==================== | |
| 3 | |
| 4 NAME | |
| 5 ---- | |
| 6 classifier_tester - for *legacy tesseract* engine. | |
| 7 | |
| 8 SYNOPSIS | |
| 9 -------- | |
| 10 *classifier_tester* -U 'unicharset_file' -F 'font_properties_file' -X 'xheights_file' -classifier 'x' -lang 'lang' [-output_trainer trainer] *.tr | |
| 11 | |
| 12 DESCRIPTION | |
| 13 ----------- | |
| 14 classifier_tester(1) runs Tesseract in a special mode. | |
| 15 It takes a list of .tr files and tests a character classifier | |
| 16 on data as formatted for training, | |
| 17 but it doesn't have to be the same as the training data. | |
| 18 | |
| 19 IN/OUT ARGUMENTS | |
| 20 ---------------- | |
| 21 | |
| 22 a list of .tr files | |
| 23 | |
| 24 OPTIONS | |
| 25 ------- | |
| 26 -l 'lang':: | |
| 27 (Input) three-character language code; default value 'eng'. | |
| 28 | |
| 29 -classifier 'x':: | |
| 30 (Input) One of "pruner", "full". | |
| 31 | |
| 32 | |
| 33 -U 'unicharset':: | |
| 34 (Input) The unicharset for the language. | |
| 35 | |
| 36 -F 'font_properties_file':: | |
| 37 (Input) font properties file; each line has the following form, where each field other than the font name is 0 or 1: | |
| 38 | |
| 39 *font_name* *italic* *bold* *fixed_pitch* *serif* *fraktur* | |
| 40 | |
| 41 -X 'xheights_file':: | |
| 42 (Input) x heights file; each line has the following form, where xheight is the pixel x height of a character drawn at 32pt at 300 dpi. [ That is, if base x height + ascenders + descenders = 133, how much is x height? ] | |
| 43 | |
| 44 *font_name* *xheight* | |
| 45 | |
| 46 -output_trainer 'trainer':: | |
| 47 (Output, Optional) Filename for output trainer. | |
| 48 | |
| 49 SEE ALSO | |
| 50 -------- | |
| 51 tesseract(1) | |
| 52 | |
| 53 COPYING | |
| 54 ------- | |
| 55 Copyright \(C) 2012 Google, Inc. | |
| 56 Licensed under the Apache License, Version 2.0 | |
| 57 | |
| 58 AUTHOR | |
| 59 ------ | |
| 60 The Tesseract OCR engine was written by Ray Smith and his research groups | |
| 61 at Hewlett Packard (1985-1995) and Google (2006-2018). |
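
The line formats given for the `-F` and `-X` options can be illustrated with a small parser. This is a minimal sketch and not part of Tesseract: the function names and the sample font name `ExampleFont` are hypothetical, and it assumes font names contain no whitespace. The `em_pixels` helper only reproduces the arithmetic behind the figure 133 in the `-X` description (32pt at 300 dpi, with 72 points per inch).

```python
from typing import NamedTuple, Tuple


class FontProperties(NamedTuple):
    """One line of a font properties file (hypothetical container)."""
    name: str
    italic: bool
    bold: bool
    fixed_pitch: bool
    serif: bool
    fraktur: bool


def parse_font_properties_line(line: str) -> FontProperties:
    """Parse 'font_name italic bold fixed_pitch serif fraktur'.

    Assumption: the font name has no embedded whitespace.
    """
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    name, *flags = fields
    if any(flag not in ("0", "1") for flag in flags):
        raise ValueError(f"flags must be 0 or 1: {line!r}")
    return FontProperties(name, *(flag == "1" for flag in flags))


def parse_xheight_line(line: str) -> Tuple[str, int]:
    """Parse 'font_name xheight', where xheight is a pixel count."""
    fields = line.split()
    if len(fields) != 2:
        raise ValueError(f"expected 2 fields, got {len(fields)}: {line!r}")
    return fields[0], int(fields[1])


def em_pixels(point_size: float = 32, dpi: float = 300) -> float:
    """Full glyph height in pixels: points / 72 gives inches, times dpi."""
    return point_size * dpi / 72


if __name__ == "__main__":
    print(parse_font_properties_line("ExampleFont 0 1 0 1 0"))
    print(parse_xheight_line("ExampleFont 64"))
    print(round(em_pixels(), 2))
```

The x height of a given font is then some fraction of the value returned by `em_pixels`; the sample value 64 above is made up for illustration.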

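The SYNOPSIS line can be turned into an argument list as follows. This is a hedged sketch: `build_command` and all file names are hypothetical, the flag spellings are copied from the SYNOPSIS (which writes `-lang`, whereas the OPTIONS section writes `-l`), and the command is only assembled and printed, never executed.

```python
import shlex
from typing import List, Optional, Sequence


def build_command(
    unicharset: str,
    font_properties: str,
    xheights: str,
    classifier: str,
    tr_files: Sequence[str],
    lang: str = "eng",
    output_trainer: Optional[str] = None,
) -> List[str]:
    """Assemble a classifier_tester argument list from the SYNOPSIS."""
    # The man page lists exactly these two classifier values.
    if classifier not in ("pruner", "full"):
        raise ValueError("classifier must be 'pruner' or 'full'")
    if not tr_files:
        raise ValueError("at least one .tr file is required")
    cmd = [
        "classifier_tester",
        "-U", unicharset,
        "-F", font_properties,
        "-X", xheights,
        "-classifier", classifier,
        "-lang", lang,  # spelled as in the SYNOPSIS; OPTIONS says -l
    ]
    if output_trainer is not None:
        cmd += ["-output_trainer", output_trainer]
    cmd += list(tr_files)
    return cmd


if __name__ == "__main__":
    # All paths below are placeholders.
    cmd = build_command(
        "eng.unicharset", "font_properties", "eng.xheights",
        "full", ["sample1.tr", "sample2.tr"],
    )
    print(shlex.join(cmd))
```

Returning a list rather than a single string keeps the result directly usable with `subprocess.run` without shell quoting concerns.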