comparison mupdf-source/thirdparty/tesseract/doc/text2image.1.asc @ 2:b50eed0cc0ef upstream

ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4. The directory name has changed: no version number in the expanded directory now.
author Franz Glasner <fzglas.hg@dom66.de>
date Mon, 15 Sep 2025 11:43:07 +0200
1:1d09e1dec1d9 2:b50eed0cc0ef
1 TEXT2IMAGE(1)
2 =============
3 :doctype: manpage
4
5 NAME
6 ----
7 text2image - generate OCR training pages.
8
9 SYNOPSIS
10 --------
11 *text2image* --text 'FILE' --outputbase 'PATH' --fonts_dir 'PATH' [OPTION]
12
13 DESCRIPTION
14 -----------
15 text2image(1) generates OCR training pages. Given a text file, it renders the text as an image using a given font and optional degradation.
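A minimal invocation might look like the sketch below. The file and directory names are hypothetical placeholders, and only options documented on this page are used. The helper `render_sample` and the `status` variable are illustrative names, not part of the tool.

```shell
#!/bin/sh
# Hypothetical paths; replace with your own files and directories.
text_file="training_text.txt"
output_base="out/sample"
fonts_dir="/usr/share/fonts"

# Render at most one page using only options documented on this page.
render_sample() {
    mkdir -p "$(dirname "$output_base")"
    text2image \
        --text "$text_file" \
        --outputbase "$output_base" \
        --fonts_dir "$fonts_dir" \
        --font "Arial" \
        --max_pages 1
}

# Run only when the tool is installed; otherwise report and continue.
if command -v text2image >/dev/null 2>&1; then
    render_sample || echo "text2image failed; check the paths above" >&2
    status="attempted"
else
    echo "text2image not found on PATH" >&2
    status="skipped"
fi
```

Keeping the call inside a function makes it easy to reuse the same settings while varying one option at a time, for example `--font`.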
16
17 OPTIONS
18 -------
19 '--text FILE'::
20 File name of text input to use for creating synthetic training data. (type:string default:)
21
22 '--outputbase FILE'::
23 Basename for output image/box file (type:string default:)
24
25 '--fontconfig_tmpdir PATH'::
26 Overrides fontconfig default temporary dir (type:string default:/tmp)
27
28 '--fonts_dir PATH'::
29 If empty, the system default font location is used. Otherwise, this overrides the system default font location (type:string default:)
30
31 '--font FONTNAME'::
32 Font description name to use (type:string default:Arial)
33
34 '--writing_mode MODE'::
35 Specify one of the following writing modes.
36 'horizontal' : Render regular horizontal text. (default)
37 'vertical' : Render vertical text. Glyph orientation is selected by Pango.
38 'vertical-upright' : Render vertical text. Glyph orientation is set to be upright. (type:string default:horizontal)
39
40 '--tlog_level INT'::
41 Minimum logging level for tlog() output (type:int default:0)
42
43 '--max_pages INT'::
44 Maximum number of pages to output (0=unlimited) (type:int default:0)
45
46 '--degrade_image BOOL'::
47 Degrade rendered image with speckle noise, dilation/erosion and rotation (type:bool default:true)
48
49 '--rotate_image BOOL'::
50 Rotate the image in a random way. (type:bool default:true)
51
52 '--strip_unrenderable_words BOOL'::
53 Remove unrenderable words from source text (type:bool default:true)
54
55 '--ligatures BOOL'::
56 Rebuild and render ligatures (type:bool default:false)
57
58 '--exposure INT'::
59 Exposure level in photocopier (type:int default:0)
60
61 '--resolution INT'::
62 Pixels per inch (type:int default:300)
63
64 '--xsize INT'::
65 Width of output image (type:int default:3600)
66
67 '--ysize INT'::
68 Height of output image (type:int default:4800)
69
70 '--margin INT'::
71 Margin around the edges of the image, in pixels (type:int default:100)
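The default geometry can be checked with a little arithmetic. The sketch below uses the defaults listed above and assumes that the margin is reserved on every edge of the page, which is an interpretation of the option text rather than something this page states explicitly.

```shell
#!/bin/sh
# Default values taken from the option list above.
xsize=3600
ysize=4800
resolution=300
margin=100

# Physical page size in whole inches (integer division).
width_in=$((xsize / resolution))
height_in=$((ysize / resolution))

# Assumption: the margin is reserved on every edge of the page.
text_width=$((xsize - 2 * margin))
text_height=$((ysize - 2 * margin))

echo "page: ${width_in}in x ${height_in}in"
echo "text area: ${text_width}px x ${text_height}px"
```

With the defaults this gives a 12 by 16 inch page and, under the stated assumption, a 3400 by 4600 pixel text area.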
72
73 '--ptsize INT'::
74 Size of printed text (type:int default:12)
75
76 '--leading INT'::
77 Inter-line space (in pixels) (type:int default:12)
78
79 '--box_padding INT'::
80 Padding around produced bounding boxes (type:int default:0)
81
82 '--char_spacing DOUBLE'::
83 Inter-character space in ems (type:double default:0)
84
85 '--underline_start_prob DOUBLE'::
86 Fraction of words to underline (value in [0,1]) (type:double default:0)
87
88 '--underline_continuation_prob DOUBLE'::
89 Probability of continuing a started underline onto the next word (value in [0,1]) (type:double default:0)
90
91 '--render_ngrams BOOL'::
92 Put each space-separated entity from the input file into one bounding box. The ngrams in the input file will be randomly permuted before rendering (so that there is sufficient variety of characters on each line). (type:bool default:false)
93
94 '--output_word_boxes BOOL'::
95 Output word bounding boxes instead of character boxes. This is used for Cube training, and implied by --render_ngrams. (type:bool default:false)
96
97 '--unicharset_file FILE'::
98 File with characters in the unicharset. If --render_ngrams is true and --unicharset_file is specified, ngrams with characters that are not in unicharset will be omitted (type:string default:)
99
100 '--bidirectional_rotation BOOL'::
101 Rotate the generated characters both ways. (type:bool default:false)
102
103 '--only_extract_font_properties BOOL'::
104 Assumes that the input file contains a list of ngrams. Renders each ngram, extracts spacing properties, and records them in the file output_base/[font_name].fontinfo. (type:bool default:false)
105
106 Use these flags to output zero-padded, square individual character images
107 -------------------------------------------------------------------------
108
109 '--output_individual_glyph_images BOOL'::
110 If true also outputs individual character images (type:bool default:false)
111
112 '--glyph_resized_size INT'::
113 Each glyph is square with this side length in pixels (type:int default:0)
114
115 '--glyph_num_border_pixels_to_pad INT'::
116 Final_size=glyph_resized_size+2*glyph_num_border_pixels_to_pad (type:int default:0)
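As a worked instance of the formula above, the sketch below computes the side length of each glyph image. The values 32 and 2 are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical values, chosen only for illustration.
glyph_resized_size=32
glyph_num_border_pixels_to_pad=2

# Final_size = glyph_resized_size + 2 * glyph_num_border_pixels_to_pad
final_size=$((glyph_resized_size + 2 * glyph_num_border_pixels_to_pad))

echo "final glyph image: ${final_size}x${final_size} pixels"
```

The padding is counted twice because it is added on both sides of the square glyph.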
117
118 Use these flags to find fonts that can render a given text
119 ----------------------------------------------------------
120
121 '--find_fonts BOOL'::
122 Search for all fonts that can render the text (type:bool default:false)
123
124 '--render_per_font BOOL'::
125 If find_fonts==true, render each font to its own image. Image filenames are of the form output_name.font_name.tif (type:bool default:true)
126
127 '--min_coverage DOUBLE'::
128 If find_fonts==true, the minimum coverage the font has of the characters in the text file to include it, between 0 and 1. (type:double default:1)
129
130 Example Usage:
131 ```
132 text2image --find_fonts \
133 --fonts_dir /usr/share/fonts \
134 --text ../langdata/hin/hin.training_text \
135 --min_coverage .9 \
136 --render_per_font \
137 --outputbase ../langdata/hin/hin \
138 |& grep raw | sed -e 's/ :.*/" \\/g' | sed -e 's/^/ "/' >../langdata/hin/fontslist.txt
139 ```
140
141 SINGLE OPTIONS
142 --------------
143
144 '--list_available_fonts BOOL'::
145 List available fonts and quit. (type:bool default:false)
146
147 HISTORY
148 -------
149 text2image(1) was first made available for tesseract 3.03.
150
151 RESOURCES
152 ---------
153 Main web site: <https://github.com/tesseract-ocr> +
154 Information on training tesseract LSTM: <https://tesseract-ocr.github.io/tessdoc/TrainingTesseract-4.00.html>
155
156 SEE ALSO
157 --------
158 tesseract(1)
159
160 COPYING
161 -------
162 Copyright \(C) 2012 Google, Inc.
163 Licensed under the Apache License, Version 2.0
164
165 AUTHOR
166 ------
167 The Tesseract OCR engine was written by Ray Smith and his research groups
168 at Hewlett Packard (1985-1995) and Google (2006-2018).