Mercurial > hgrepos > Python2 > PyMuPDF
view mupdf-source/thirdparty/tesseract/INSTALL.GIT.md @ 2:b50eed0cc0ef upstream
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
The directory name has changed: no version number in the expanded directory now.
| author | Franz Glasner <fzglas.hg@dom66.de> |
|---|---|
| date | Mon, 15 Sep 2025 11:43:07 +0200 |
| parents | |
| children |
line wrap: on
line source
## autotools (LINUX/UNIX , msys...) If you have cloned Tesseract from GitHub, you must generate the configure script. If you have tesseract 4.0x installation in your system, please remove it before new build. You need Leptonica 1.74.2 (minimum) for Tesseract 4.0x. Known dependencies for training tools (excluding leptonica): * compiler with c++17 support * automake * pkg-config * pango-devel * cairo-devel * icu-devel So, the steps for making Tesseract are: ./autogen.sh ./configure make sudo make install sudo ldconfig make training sudo make training-install You need to install at least English language and OSD traineddata files to `TESSDATA_PREFIX` directory. You can retrieve single file with tools like [wget](https://www.gnu.org/software/wget/), [curl](https://curl.haxx.se/), [GithubDownloader](https://github.com/intezer/GithubDownloader) or browser. All language data files can be retrieved from git repository (useful only for packagers!). (Repository is huge - more that 1.2 GB. You do NOT need to download traineddata files for all languages). git clone https://github.com/tesseract-ocr/tessdata.git tesseract-ocr.tessdata You need an Internet connection and [curl](https://curl.haxx.se/) to compile `ScrollView.jar` because the build will automatically download [piccolo2d-core-3.0.1.jar](https://search.maven.org/remotecontent?filepath=org/piccolo2d/piccolo2d-core/3.0.1/piccolo2d-core-3.0.1.jar) and [piccolo2d-extras-3.0.1.jar](https://search.maven.org/remotecontent?filepath=org/piccolo2d/piccolo2d-extras/3.0.1/piccolo2d-extras-3.0.1.jar) and [jaxb-api-2.3.1.jar](http://search.maven.org/remotecontent?filepath=javax/xml/bind/jaxb-api/2.3.1/jaxb-api-2.3.1.jar) and place them to `tesseract/java`. Just run: make ScrollView.jar and follow the instruction on [Viewer Debugging](https://tesseract-ocr.github.io/tessdoc/ViewerDebugging.html). ## cmake There is alternative build system based on multiplatform [cmake](https://cmake.org/) ### LINUX mkdir build cd build && cmake .. && make sudo make install ### WINDOWS See the [documentation](https://tesseract-ocr.github.io/tessdoc/) for more information on this.
