comparison mupdf-source/thirdparty/gumbo-parser/python/gumbo/__init__.py @ 2:b50eed0cc0ef upstream
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
The directory name has changed: no version number in the expanded directory now.
| author | Franz Glasner <fzglas.hg@dom66.de> |
|---|---|
| date | Mon, 15 Sep 2025 11:43:07 +0200 |
| 1:1d09e1dec1d9 | 2:b50eed0cc0ef |
|---|---|
| 1 """Gumbo HTML parser. | |
| 2 | |
| 3 These are the Python bindings for Gumbo. All public API classes and functions | |
| 4 are exported from this module. They include: | |
| 5 | |
| 6 - CTypes representations of all structs and enums defined in gumbo.h. The | |
| 7 naming convention is to take the C name and strip off the "Gumbo" prefix. | |
| 8 | |
| 9 - A low-level wrapper around the gumbo_parse function, returning the classes | |
| 10 exposed above. Usage: | |
| 11 | |
| 12 import gumbo | |
| 13 with gumbo.parse(text, **options) as output: | |
| 14 do_stuff_with_doctype(output.document) | |
| 15 do_stuff_with_parse_tree(output.root) | |
| 16 | |
| 17 - Higher-level bindings that mimic the API provided by html5lib. Usage: | |
| 18 | |
| 19 from gumbo import html5lib | |
| 20 | |
| 21 This requires that html5lib be installed (it uses their treebuilders), and is | |
| 22 intended as a drop-in replacement. | |
| 23 | |
| 24 - Similarly, higher-level bindings that mimic BeautifulSoup and return | |
| 25 BeautifulSoup objects. For this, use: | |
| 26 | |
| 27 import gumbo | |
| 28 soup = gumbo.soup_parse(text, **options) | |
| 29 | |
| 30 It will give you back a soup object like BeautifulSoup.BeautifulSoup(text). | |
| 31 """ | |
| 32 | |
| 33 from gumbo.gumboc import * | |
| 34 | |
| 35 try: | |
| 36 from gumbo import html5lib_adapter as html5lib | |
| 37 except ImportError: | |
| 38 # html5lib not installed | |
| 39 pass | |
| 40 | |
| 41 try: | |
| 42 from gumbo.soup_adapter import parse as soup_parse | |
| 43 except ImportError: | |
| 44 # BeautifulSoup not installed | |
| 45 pass | |
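
The two `try`/`except ImportError` blocks at the end of the module make the html5lib and BeautifulSoup adapters optional: if the dependency is missing, the name (`html5lib` or `soup_parse`) is simply never bound in the `gumbo` namespace. The following is a minimal sketch of that pattern and is not part of the gumbo source. The helper `optional_import` and the module names used in the demonstration are assumptions chosen so the example runs without gumbo installed.

```python
import importlib


def optional_import(module_name, attribute=None):
    """Return a module (or one attribute of it), or None if it cannot be imported.

    Hypothetical helper illustrating the optional-dependency pattern; the
    gumbo package itself uses plain try/except blocks instead.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The optional dependency is not installed; let the caller decide.
        return None
    return getattr(module, attribute) if attribute else module


# Stand-ins for the real adapters: "json" is always available in the
# standard library, while the second module name is assumed not to exist.
json_loads = optional_import("json", "loads")
missing = optional_import("module_that_does_not_exist_xyz")

if json_loads is not None:
    print(json_loads('{"parsed": true}'))
if missing is None:
    print("optional feature unavailable")
```

Returning `None` gives callers something explicit to test. With the approach used in the module above, callers would check for the name instead, for example with `hasattr(gumbo, "soup_parse")`.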
