comparison README.md @ 1:1d09e1dec1d9 upstream
ADD: PyMuPDF v1.26.4: the original sdist.
It does not yet contain MuPDF. This normally will be downloaded when
building PyMuPDF.
| author | Franz Glasner <fzglas.hg@dom66.de> |
|---|---|
| date | Mon, 15 Sep 2025 11:37:51 +0200 |
| parents | |
| children |
| -1:000000000000 | 1:1d09e1dec1d9 |
|---|---|
| 1 # PyMuPDF | |
| 2 | |
| 3 **PyMuPDF** is a high performance **Python** library for data extraction, analysis, conversion & manipulation of [PDF (and other) documents](https://pymupdf.readthedocs.io/en/latest/the-basics.html#supported-file-types). | |
| 4 | |
| 5 # Community | |
| 6 Join us on **Discord** here: [#pymupdf](https://discord.gg/TSpYGBW4eq) | |
| 7 | |
| 8 | |
| 9 # Installation | |
| 10 | |
| 11 **PyMuPDF** requires **Python 3.9 or later**. Install it using **pip**: | |
| 12 | |
| 13 `pip install PyMuPDF` | |
| 14 | |
| 15 There are **no mandatory** external dependencies. However, some [optional features](#pymupdf-optional-features) become available only if additional packages are installed. | |
| 16 | |
| 17 You can also try without installing by visiting [PyMuPDF.io](https://pymupdf.io/#examples). | |
| 18 | |
| 19 | |
| 20 # Usage | |
| 21 | |
| 22 Basic usage is as follows: | |
| 23 | |
| 24 ```python | |
| 25 import pymupdf # imports the pymupdf library | |
| 26 doc = pymupdf.open("example.pdf") # open a document | |
| 27 for page in doc: # iterate the document pages | |
| 28     text = page.get_text() # get the page's plain text as a Python str (Unicode) | |
| 29 | |
| 30 ``` | |
| 31 | |
| 32 | |
| 33 # Documentation | |
| 34 | |
| 35 Full documentation can be found on [pymupdf.readthedocs.io](https://pymupdf.readthedocs.io). | |
| 36 | |
| 37 | |
| 38 | |
| 39 # <a id="pymupdf-optional-features"></a>Optional Features | |
| 40 | |
| 41 * [fontTools](https://pypi.org/project/fonttools/) for creating font subsets. | |
| 42 * [pymupdf-fonts](https://pypi.org/project/pymupdf-fonts/) contains some nice fonts for your text output. | |
| 43 * [Tesseract-OCR](https://github.com/tesseract-ocr/tesseract) for optical character recognition in images and document pages. | |
| 44 | |
| 45 | |
| 46 | |
| 47 # About | |
| 48 | |
| 49 **PyMuPDF** adds **Python** bindings and abstractions to [MuPDF](https://mupdf.com/), a lightweight **PDF**, **XPS**, and **eBook** viewer, renderer, and toolkit. Both **PyMuPDF** and **MuPDF** are maintained and developed by [Artifex Software, Inc](https://artifex.com). | |
| 50 | |
| 51 **PyMuPDF** was originally written by [Jorj X. McKie](mailto:jorj.x.mckie@outlook.de). | |
| 52 | |
| 53 | |
| 54 # License and Copyright | |
| 55 | |
| 56 **PyMuPDF** is available under [open-source AGPL](https://www.gnu.org/licenses/agpl-3.0.html) and commercial license agreements. If you determine you cannot meet the requirements of the **AGPL**, please contact [Artifex](https://artifex.com/contact/pymupdf-inquiry.php) for more information regarding a commercial license. | |
| 57 | |
| 58 | |
| 59 | |
| 60 |
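
The usage loop in the README above assigns each page's text to `text` but does not keep it. A minimal sketch of collecting the text of a whole document follows. The helper names `extract_text` and `extract_file` and the form-feed page separator are illustrative assumptions, not part of PyMuPDF; only `pymupdf.open()` and `page.get_text()` come from the README.

```python
def extract_text(doc):
    """Return the plain text of all pages, joined by form feeds.

    `doc` only needs to be iterable and yield objects that have a
    `get_text()` method, which is how the README loop uses a document.
    The "\f" separator is an arbitrary choice for marking page breaks.
    """
    return "\f".join(page.get_text() for page in doc)


def extract_file(path):
    """Open the document at `path` with PyMuPDF and return its text."""
    # Imported here so that extract_text() itself has no hard dependency.
    import pymupdf

    # The document is closed again when the `with` block ends.
    with pymupdf.open(path) as doc:
        return extract_text(doc)
```

With PyMuPDF installed, `extract_file("example.pdf")` should return the text of every page as one string, where `example.pdf` is the README's placeholder file name.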
