comparison mupdf-source/thirdparty/tesseract/INSTALL.GIT.md @ 2:b50eed0cc0ef upstream
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
The directory name has changed: no version number in the expanded directory now.
| author | Franz Glasner <fzglas.hg@dom66.de> |
|---|---|
| date | Mon, 15 Sep 2025 11:43:07 +0200 |
| parents | |
| children | |
comparison
| 1:1d09e1dec1d9 | 2:b50eed0cc0ef |
|---|---|
| 1 ## autotools (LINUX/UNIX, msys...) | |
| 2 | |
| 3 If you have cloned Tesseract from GitHub, you must generate | |
| 4 the configure script. | |
| 5 | |
| 6 If you have a Tesseract 4.0x installation on your system, please remove it | |
| 7 before starting a new build. | |
| 8 | |
| 9 You need Leptonica 1.74.2 (minimum) for Tesseract 4.0x. | |
| 10 | |
| 11 Known dependencies for training tools (excluding leptonica): | |
| 12 | |
| 13 * compiler with c++17 support | |
| 14 * automake | |
| 15 * pkg-config | |
| 16 * pango-devel | |
| 17 * cairo-devel | |
| 18 * icu-devel | |
| 19 | |
| 20 So, the steps for making Tesseract are: | |
| 21 | |
| 22 ./autogen.sh | |
| 23 ./configure | |
| 24 make | |
| 25 sudo make install | |
| 26 sudo ldconfig | |
| 27 make training | |
| 28 sudo make training-install | |
| 29 | |
| 30 You need to install at least the English language and OSD traineddata files in the | |
| 31 `TESSDATA_PREFIX` directory. | |
| 32 | |
| 33 You can retrieve a single file with tools like [wget](https://www.gnu.org/software/wget/), [curl](https://curl.haxx.se/), [GithubDownloader](https://github.com/intezer/GithubDownloader) or a browser. | |
| 34 | |
| 35 All language data files can be retrieved from the git repository (useful only for packagers!). | |
| 36 (The repository is huge: more than 1.2 GB. You do NOT need to download traineddata files for | |
| 37 all languages). | |
| 38 | |
| 39 git clone https://github.com/tesseract-ocr/tessdata.git tesseract-ocr.tessdata | |
| 40 | |
| 41 You need an Internet connection and [curl](https://curl.haxx.se/) to compile `ScrollView.jar` | |
| 42 because the build will automatically download | |
| 43 [piccolo2d-core-3.0.1.jar](https://search.maven.org/remotecontent?filepath=org/piccolo2d/piccolo2d-core/3.0.1/piccolo2d-core-3.0.1.jar) and | |
| 44 [piccolo2d-extras-3.0.1.jar](https://search.maven.org/remotecontent?filepath=org/piccolo2d/piccolo2d-extras/3.0.1/piccolo2d-extras-3.0.1.jar) and | |
| 45 [jaxb-api-2.3.1.jar](http://search.maven.org/remotecontent?filepath=javax/xml/bind/jaxb-api/2.3.1/jaxb-api-2.3.1.jar) and place them in `tesseract/java`. | |
| 46 | |
| 47 Just run: | |
| 48 | |
| 49 make ScrollView.jar | |
| 50 | |
| 51 and follow the instructions in [Viewer Debugging](https://tesseract-ocr.github.io/tessdoc/ViewerDebugging.html). | |
| 52 | |
| 53 ## cmake | |
| 54 | |
| 55 There is an alternative build system based on the multiplatform [cmake](https://cmake.org/). | |
| 56 | |
| 57 ### LINUX | |
| 58 | |
| 59 mkdir build | |
| 60 cd build && cmake .. && make | |
| 61 sudo make install | |
| 62 | |
| 63 ### WINDOWS | |
| 64 | |
| 65 See the [documentation](https://tesseract-ocr.github.io/tessdoc/) for more information on this. |
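
The file above requires at least the English and OSD traineddata files in the `TESSDATA_PREFIX` directory. As a minimal sketch of how that could be checked after installation, the helper below lists the files that are still missing. The function name `missing_traineddata` is hypothetical and not part of Tesseract; the file names `eng.traineddata` and `osd.traineddata` follow the usual `<lang>.traineddata` naming.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Tesseract): print the name of every
# <lang>.traineddata file that is missing from the given directory.
# Usage: missing_traineddata DIR LANG...
missing_traineddata() {
    dir=$1
    shift
    for lang in "$@"; do
        # Report the file only when it does not exist as a regular file.
        if [ ! -f "$dir/$lang.traineddata" ]; then
            printf '%s\n' "$lang.traineddata"
        fi
    done
}

# Example call (prints nothing when both files are present):
#   missing_traineddata "$TESSDATA_PREFIX" eng osd
```

Any file name it prints can then be fetched individually with wget, curl or a browser, as described in the file, instead of cloning the whole tessdata repository.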
