comparison mupdf-source/thirdparty/tesseract/INSTALL.GIT.md @ 2:b50eed0cc0ef upstream

ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4. The directory name has changed: no version number in the expanded directory now.
author Franz Glasner <fzglas.hg@dom66.de>
date Mon, 15 Sep 2025 11:43:07 +0200
1:1d09e1dec1d9 2:b50eed0cc0ef
## autotools (LINUX/UNIX, msys...)

If you have cloned Tesseract from GitHub, you must generate
the configure script.

If a Tesseract 4.0x installation is already present on your system, remove it
before starting a new build.

Tesseract 4.0x requires Leptonica 1.74.2 or later.

Known dependencies for the training tools (excluding Leptonica):

* a compiler with C++17 support
* automake
* pkg-config
* pango-devel
* cairo-devel
* icu-devel

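Before generating the configure script, it can help to confirm that the command-line tools from the list above are actually on `PATH`. The following is a minimal sketch, not part of the Tesseract build: `missing_tools` is a hypothetical helper name, and the tool names in the example comment (`g++`, `automake`, `pkg-config`) are assumptions that may differ on your distribution. The `-devel` entries are distribution packages rather than commands, so they are not covered by this check.

```shell
#!/bin/sh
# Sketch only: report which of the named tools cannot be found on PATH.
# missing_tools is a hypothetical helper, not something Tesseract provides.
missing_tools() {
    missing=""
    for tool in "$@"; do
        # command -v is the POSIX way to test whether a command resolves.
        if ! command -v "$tool" >/dev/null 2>&1; then
            # Append the name, inserting a space only between entries.
            missing="${missing:+$missing }$tool"
        fi
    done
    # Print the missing names (an empty line if there are none).
    printf '%s\n' "$missing"
    # Succeed only when every tool was found.
    [ -z "$missing" ]
}

# Example (assumed tool names; adjust for your system):
#   missing_tools g++ automake pkg-config || echo "install the tools listed above first"
# Assumption: Leptonica's pkg-config module is named "lept", so the minimum
# version could be checked with:
#   pkg-config --atleast-version=1.74.2 lept
```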
The steps for building Tesseract are:

    ./autogen.sh
    ./configure
    make
    sudo make install
    sudo ldconfig
    make training
    sudo make training-install

You need to install at least the English language and OSD traineddata files into the
`TESSDATA_PREFIX` directory.

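As an illustration of installing just the English and OSD files, here is a hedged sketch using curl. The helper names `traineddata_url` and `fetch_traineddata` are hypothetical, and the base URL is an assumption (raw files on the `main` branch of the tessdata repository); override `TESSDATA_BASE_URL` if your source differs. The example directory `$HOME/tessdata` is likewise only a placeholder.

```shell
#!/bin/sh
# Sketch only. The base URL is an assumption, not taken from the build docs;
# set TESSDATA_BASE_URL beforehand to use a different location.
TESSDATA_BASE_URL="${TESSDATA_BASE_URL:-https://github.com/tesseract-ocr/tessdata/raw/main}"

# traineddata_url LANG: hypothetical helper mapping a language code to a URL.
traineddata_url() {
    printf '%s/%s.traineddata\n' "$TESSDATA_BASE_URL" "$1"
}

# fetch_traineddata DEST LANG...: download each requested file into DEST.
fetch_traineddata() {
    dest="$1"
    shift
    mkdir -p "$dest" || return 1
    for lang in "$@"; do
        # -f fails on HTTP errors, -L follows redirects, -o names the output file.
        curl -fL -o "$dest/$lang.traineddata" "$(traineddata_url "$lang")" || return 1
    done
}

# Example: English plus OSD, then point Tesseract at that directory.
#   fetch_traineddata "$HOME/tessdata" eng osd
#   export TESSDATA_PREFIX="$HOME/tessdata"
```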
You can retrieve a single file with tools like [wget](https://www.gnu.org/software/wget/), [curl](https://curl.haxx.se/), [GithubDownloader](https://github.com/intezer/GithubDownloader), or a browser.

All language data files can be retrieved from the git repository (useful only for packagers!).
(The repository is huge, more than 1.2 GB. You do NOT need to download traineddata files for
all languages.)

    git clone https://github.com/tesseract-ocr/tessdata.git tesseract-ocr.tessdata

You need an Internet connection and [curl](https://curl.haxx.se/) to compile `ScrollView.jar`
because the build automatically downloads
[piccolo2d-core-3.0.1.jar](https://search.maven.org/remotecontent?filepath=org/piccolo2d/piccolo2d-core/3.0.1/piccolo2d-core-3.0.1.jar),
[piccolo2d-extras-3.0.1.jar](https://search.maven.org/remotecontent?filepath=org/piccolo2d/piccolo2d-extras/3.0.1/piccolo2d-extras-3.0.1.jar), and
[jaxb-api-2.3.1.jar](http://search.maven.org/remotecontent?filepath=javax/xml/bind/jaxb-api/2.3.1/jaxb-api-2.3.1.jar) and places them in `tesseract/java`.

Just run:

    make ScrollView.jar

and follow the instructions on [Viewer Debugging](https://tesseract-ocr.github.io/tessdoc/ViewerDebugging.html).

## cmake

There is an alternative build system based on the multiplatform [cmake](https://cmake.org/).

### LINUX

    mkdir build
    cd build && cmake .. && make
    sudo make install

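The same out-of-source build can also be driven entirely through the `cmake` command, which avoids changing directories and works with generators other than Makefiles. This is a sketch under stated assumptions, not the project's documented procedure: `detect_jobs` and `build_with_cmake` are hypothetical helper names, the default prefix `/usr/local` is an assumption, and the `-S`/`-B` options require CMake 3.13 or newer.

```shell
#!/bin/sh
# Sketch only: out-of-source CMake build with an explicit install prefix.
# detect_jobs and build_with_cmake are hypothetical helper names.

# Number of parallel build jobs; falls back to 1 if it cannot be detected.
detect_jobs() {
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}

# build_with_cmake SRC_DIR BUILD_DIR [PREFIX]
build_with_cmake() {
    src="$1"
    build="$2"
    prefix="${3:-/usr/local}"   # assumed default install prefix
    # -S and -B (CMake 3.13+) name the source and build directories explicitly.
    cmake -S "$src" -B "$build" -DCMAKE_INSTALL_PREFIX="$prefix" || return 1
    # --build invokes whichever generator was configured; --parallel needs CMake 3.12+.
    cmake --build "$build" --parallel "$(detect_jobs)"
}

# Example, run from the top of the source tree:
#   build_with_cmake . build && sudo cmake --build build --target install
```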
### WINDOWS

See the [documentation](https://tesseract-ocr.github.io/tessdoc/) for more information on this.