view mupdf-source/thirdparty/tesseract.txt @ 40:aa33339d6b8a upstream

ADD: MuPDF v1.26.10: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.5.
author Franz Glasner <fzglas.hg@dom66.de>
date Sat, 11 Oct 2025 11:31:38 +0200
parents b50eed0cc0ef

If you want to build with Tesseract functionality, you need to run make
with a "tesseract=yes" argument.
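
As a minimal sketch of the invocation described above: MUPDF_DIR is a
hypothetical variable (not part of MuPDF) naming the source tree, and the
check for a Makefile is only there so the snippet fails gently when run
from the wrong directory.

```shell
# MUPDF_DIR is a hypothetical variable naming the MuPDF source tree;
# it defaults to the current directory.
MUPDF_DIR="${MUPDF_DIR:-.}"

if [ -f "$MUPDF_DIR/Makefile" ]; then
    # tesseract=yes is the make argument that enables Tesseract support.
    make -C "$MUPDF_DIR" tesseract=yes
else
    echo "no Makefile in $MUPDF_DIR; point MUPDF_DIR at the MuPDF source tree" >&2
fi
```

When already inside the source tree, this reduces to running
"make tesseract=yes" directly.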

You will also need suitable traineddata files for the languages you
wish to recognize. Only the LSTM engine (the latest and most accurate
engine) is built into Tesseract here, so the traineddata contained
within the repository itself cannot be used.

Suitable data can be retrieved from either:

  https://github.com/tesseract-ocr/tessdata_best

or

  https://github.com/tesseract-ocr/tessdata_fast

For example, to fetch the English data:

  wget https://github.com/tesseract-ocr/tessdata_fast/raw/master/eng.traineddata
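
To fetch data for several languages, the URL pattern from the wget example
can be wrapped in a small helper. This is only a sketch: traineddata_url is
a hypothetical function name, and it assumes that the tessdata_best
repository uses the same raw/master/<lang>.traineddata layout as the
tessdata_fast URL shown above. The snippet prints the URLs and does not
download anything.

```shell
#!/bin/sh
# Hypothetical helper: print the download URL for one language's traineddata.
#   $1  language code, e.g. eng
#   $2  variant, "fast" (default) or "best", matching the two repositories
# Assumes both repositories share the raw/master/<lang>.traineddata layout.
traineddata_url() {
    lang="$1"
    variant="${2:-fast}"
    printf 'https://github.com/tesseract-ocr/tessdata_%s/raw/master/%s.traineddata\n' \
        "$variant" "$lang"
}

# Print the URLs for a few languages from the "fast" repository.
for lang in eng deu fra; do
    traineddata_url "$lang" fast
done
```

A download would then look like: wget "$(traineddata_url deu best)"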