view docs/lexer-algpseudocode.rst @ 173:551c3421bccb
Scale IBM Plex to 0.92 by default
| author | Franz Glasner <fzglas.hg@dom66.de> |
|---|---|
| date | Mon, 11 May 2026 15:08:19 +0200 |
| parents | ad80fcbf7b47 |
| children | 95754197f5b3 |
.. -*- coding: utf-8; indent-tabs-mode: nil; -*-

*************************************
 AlgPseudocode and Language Variants
*************************************

These lexers are heavily inspired by CTAN’s `Algpseudocodex`_.
They recognize expressions and additionally all sorts of comments and
commands that are inspired by `Algpseudocodex`_.

They may be used in `Sphinx`_ by their aliases:

.. code-block:: none

   .. code-block:: algpseudocode

      \PROGRAM {The Pseudoprogram}
      \IS
      \END PROGRAM {The Pseudoprogram}

It will be rendered as:

.. code-block:: algpseudocode

   \PROGRAM {The Pseudoprogram}
   \IS
   \END PROGRAM {The Pseudoprogram}

And the same with the German variant (using
``.. code-block:: algpseudocode-de`` as language alias):

.. code-block:: algpseudocode-de

   \PROGRAM {The Pseudoprogram}
   \IS
   \END PROGRAM {The Pseudoprogram}


States
======

The AlgPseudocode lexer and its language variants AlgPseudocodeDE and
AlgPseudocodeFR basically work in three states: `default`, `expression`
and `text`.

In expressions it automatically recognizes:

- Strings (single-quote, double-quote, triple-single-quote,
  triple-double-quote, `Python`_ style)
- Numbers (also `Python`_ style)
- (Mathematical) operators and symbols
- ``\TEXT{...}``

  Switches to a text mode that prohibits automatic expression
  highlighting. A closing curly brace can be quoted with ``\}`` so that
  it does not end the text mode prematurely.
- ``\NAME``, ``\CALL`` and ``\GETS``
- ``\REM`` and ``\REMARK`` for remarks (aka comments)
- Names (`Name.Entity`)
- :ref:`explicit-token-types`

In the default mode it recognizes expressions and additionally all sorts
of comments and commands that look somewhat like `Algpseudocodex`_
commands.

In texts it recognizes:

- ``\EXPR`` or ``\EXPRESSION``

  Switches to expression mode. A closing curly brace can be quoted with
  ``\}`` so that it does not end the expression mode prematurely.
- ``\REM`` and ``\REMARK`` for remarks (aka comments)
- :ref:`explicit-token-types`


Lexer Options
=============

.. describe:: prohibit_raiseonerror_filter

   **Type:** :py:class:`bool`

   **Default:** :py:obj:`None`

   If :py:obj:`True`, the `raiseonerror` filter is not allowed to be
   applied by `Sphinx`_ when :py:meth:`Lexer.add_filter` is called.

   This setting does not apply to filters that are set by the standard
   lexer option `filters`.

.. describe:: no_end

   **Type:** :py:class:`bool`

   **Default:** :py:obj:`False`

   If :py:obj:`True`, all the ``\ENDxxx`` commands are skipped and yield
   no output.

.. describe:: gets

   **Type:** :py:class:`str` or :py:obj:`None`

   **Default:** :py:obj:`None` (yields ``←``)

   The operator symbol to be printed by the command ``\GETS``.
   An often used alternative is ``:=``.

.. describe:: remark

   **Type:** :py:class:`str` or :py:obj:`None`

   **Default:** :py:obj:`None` (yields ``▷``)

   The symbol to be printed when starting comments with ``\REMARK`` or
   ``\REM``.

To use a lexer with non-default options in `Sphinx`_ see section
:ref:`customized-sphinx-lexers`.


Comments
========

- with the ``\REMARK`` or ``\REM`` keywords (this includes a leading symbol)
- multi-line comments with ``/* ... */``; they can be **nested**
- multi-line comments with ``(* ... *)``; they can be **nested**
- single-line comments with ``//`` or ``#`` (until the end of the line)

.. code-block:: algpseudocode

   /*
    * A single multiline comment
    */

   /*
    * A multiline comment
    *
    * /* This is a nested multi-line comment */
    *
    */

   (*
    * A multiline comment
    *
    * (* This is a nested multi-line comment *)
    *
    *)

   // A single-line comment

   # A single-line comment

   \REM A remark has a leading symbol


Literals
========

Strings and numbers as in `Python`_.

String prefixes ``r``, ``f`` and ``t`` are not supported -- ``u`` and
``b`` are.

To yield non-string-delimiting single and double quotes you have to
escape them using ``\'`` or ``\"``. This must be used to typeset
something like :algpseudocode:`f\\'(x) = 0`.

.. code-block:: algpseudocode

   0
   0xdead
   0b100001
   0o720
   2.7
   2.7e-54
   "A string with an escaped double-quote \" "
   'Another string with an escaped single-quote \' '
   """A multiline
   string """
   '''Another multiline
   string '''
   b"A \x20 byte string"
   u'An explicit Unicode \u1234 string'
   \" a non string
   \' a non string also


(Mathematical) Symbols and Operators
====================================

Some ASCII symbol combinations are recognized and replaced by a Unicode
symbol:

.. code-block:: algpseudocode

   \TEXT{<=>} <=>
   \TEXT{<->} <->
   \TEXT{<-} <-
   \TEXT{->} ->
   \TEXT{=>} =>
   \TEXT{<=} <=
   \TEXT{>=} >=
   \TEXT{<>} <>
   \TEXT{!=} !=
   \TEXT{:=} :=
   \TEXT{=:} =:
   \TEXT{?=} ?=

Unicode code points with the property ``Sm`` are recognized as
mathematical symbols and highlighted accordingly.


Punctuation
===========

Runs of dots ``.``, ``..``, ``...``, ``....``, ... are handled properly
in expressions and yield a punctuation token. They are not replaced by
corresponding Unicode symbols.


Commands
========

- Start with a backslash character ``\``
- Case-insensitive
- Mostly yield :py:class:`pygments.token.Token.Keyword`
- Translated if a translation is found
- May have required or optional parameters, depending on the command

Parameter handling is as follows:

* Parameters are enclosed in curly braces ``{`` and ``}``
* Escaping within the braces is possible using the backslash ``\`` as
  escape character
* Parameters are separated from the keyword/command by a (possibly
  empty) run of space or TAB characters. This is true for required and
  optional parameters.

.. todo:: Escaping

A single backslash yields a Generic.Error token when in the `default`
and `expression` states.


Commands With Required Parameters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: algpseudocode

   \TEXT{\PROGRAM {A Program\} or \PROG {A Program\}} \PROGRAM {A Program}
   \TEXT{\ALGORITHM{An Algorithm\} or \ALGO{An Algorithm\}} \ALGORITHM{An Algorithm}
   \TEXT{\PROCEDURE{A Procedure\} or \PROC{A Procedure\}} \PROCEDURE{A Procedure}
   \TEXT{\FUNCTION{A Function\} or \FUNC{A Function\} or \FN{A Function\}} \FUNCTION{A Function}
   \TEXT{\CLASS{A Class\}} \CLASS{A Class}
   \TEXT{\STATEMENT{the expression\} \STATE{the expression\} \BLOCK{the expression\}} \STATEMENT{the expression}
   \TEXT{expr1: \\EXPRESSION{expression a in b\} expr2: \\EXPR{expression b in a\}} \TEXT{expr1: \EXPRESSION{expression a in b} expr2: \EXPR{expression b in a}}
   \TEXT{\TEXTSTATEMENT{the text\} \TEXTSTATE{the text\} \TSTATEMENT{the text\} \TSTATE{the text\} \TEXTBLOCK{the text\} \TBLOCK{the text\}} \TEXTSTATEMENT{the text}
   \TEXT{\INPUT{Input 1\}} \INPUT{Input 1}
   \TEXT{\INPUTS{Input 2\}} \INPUTS{Input 2}
   \TEXT{\OUTPUT{Output 1\}} \OUTPUT{Output 1}
   \TEXT{\OUTPUTS{Output 2\}} \OUTPUTS{Output 2}
   \TEXT{\ENSURE{Whatever should be ensured!\}} \ENSURE{Whatever should be ensured!}
   \TEXT{\REQUIRE{Whatever should be required.\}} \REQUIRE{Whatever should be required.}
   \TEXT{\RETURNS{Return 2\}} \RETURNS{Return 2}
   \TEXT{\CALL{a function\}(p1, p2)} \CALL{a function}(p1, p2)
   \TEXT{\NAME{an entity name\}} \NAME{an entity name}


Commands With Optional Parameters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Some ``END``-commands have optional parameters:

.. code-block:: algpseudocode

   \TEXT{\ENDPROGRAM \ENDPROG} \ENDPROGRAM
   \TEXT{\ENDALGORITHM \ENDALGO} \ENDALGORITHM
   \TEXT{\ENDPROCEDURE \ENDPROC} \ENDPROCEDURE
   \TEXT{\ENDFUNCTION \ENDFUNC \ENDFN} \ENDFUNCTION
   \TEXT{\ENDCLASS} \ENDCLASS

They are used like this:

.. code-block:: algpseudocode

   \TEXT{\CLASS{Foo Bar Class\} ... \END CLASS {Foo Bar Class\}} \TEXT{yields} \CLASS{Foo Bar Class} ... \END CLASS {Foo Bar Class}
   \TEXT{\CLASS{Foo Bar Class\} ... \END CLASS} \TEXT{yields} \CLASS{Foo Bar Class} ... \END CLASS

.. seealso:: For other syntax variants concerning `END` see also section
   `END-Commands`_.


Commands Without Parameters
~~~~~~~~~~~~~~~~~~~~~~~~~~~

"Normal" Commands
'''''''''''''''''

.. code-block:: algpseudocode

   \TEXT{\IF} \IF
   \TEXT{\THEN} \THEN
   \TEXT{\ELSE} \ELSE
   \TEXT{\ELSEIF or \ELSIF or \ELIF} \ELSEIF \text{or} \ELSIF \text{or} \ELIF
   \TEXT{\DO} \DO
   \TEXT{\WHILE} \WHILE
   \TEXT{\FORALL} \FORALL
   \TEXT{\FOR} \FOR
   \TEXT{\FROM} \FROM
   \TEXT{\TO} \TO
   \TEXT{\STEP} \STEP
   \TEXT{\IN} \IN
   \TEXT{\LOOP} \LOOP
   \TEXT{\REPEAT} \REPEAT
   \TEXT{\UNTIL} \UNTIL
   \TEXT{\RETURN} \RETURN
   \TEXT{\BEGIN} \BEGIN
   \TEXT{\END} \END
   \TEXT{\IS} \IS
   \TEXT{\WITH} \WITH
   \TEXT{\GETS} \GETS
   \TEXT{\\REMARK or \\REM} \REMARK A comment with a leading symbol

``\REMARK`` or ``\REM`` is special: all characters up to the end of the
line are taken as the comment. Curly braces are not needed; in fact, they
are interpreted as part of the comment.

END-Commands
''''''''''''

The separator between ``\END`` and the keyword can be empty, a run of
ASCII spaces, a run of TAB characters, a single underscore ``_`` or a
single hyphen ``-``, for example ``\ENDIF``, ``\END IF``, ``\END-IF``,
``\END_IF`` or ``\END  IF``.

.. code-block:: algpseudocode

   \text{\ENDIF} \ENDIF \rem empty
   \text{\END IF} \END IF \rem a single space
   \text{\END  IF} \END  IF \rem two spaces
   \text{\END-IF} \END-IF \rem a single hyphen
   \text{\END_IF} \END_IF \rem a single underscore
   \text{\END	IF} \END	IF \rem a single TAB character

The list of END-commands (here always just with ``-`` as separator):

.. code-block:: algpseudocode

   \text{\END-PROGRAM \END-PROG} \END-PROGRAM
   \text{\END-ALGORITHM \END-ALGO} \END-ALGORITHM
   \text{\END-PROCEDURE \END-PROC} \END-PROCEDURE
   \text{\END-FUNCTION \END-FUNC \END-FN} \END-FUNCTION
   \text{\END-CLASS} \END-CLASS
   \text{\END-IF} \END-IF
   \text{\END-WHILE} \END-WHILE
   \text{\END-FOR} \END-FOR
   \text{\END-FORALL} \END-FORALL
   \text{\END-LOOP} \END-LOOP

.. note:: The output of END-commands can be suppressed by setting the
   lexer option ``no_end`` to :py:obj:`True`.

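The separator rules for END-commands can be illustrated with a small,
self-contained regular expression. This is only a sketch: the pattern
``END_COMMAND`` and the helper ``end_keyword`` are hypothetical names that
are not taken from the lexer's source, and the sketch assumes that a
separator run does not mix spaces and TAB characters.

.. code-block:: python

   import re

   # Hypothetical pattern (not the lexer's own): "\END", then an optional
   # separator (a run of spaces, a run of TABs, one "_" or one "-"),
   # then one of the keywords listed in this section.
   END_COMMAND = re.compile(
       r"\\END(?: +|\t+|[_-])?"
       r"(PROGRAM|PROG|ALGORITHM|ALGO|PROCEDURE|PROC|FUNCTION|FUNC|FN"
       r"|CLASS|IF|WHILE|FORALL|FOR|LOOP)",
       re.IGNORECASE,  # commands are case-insensitive
   )


   def end_keyword(command):
       """Return the upper-cased keyword of an END-command, or None."""
       match = END_COMMAND.fullmatch(command)
       return match.group(1).upper() if match else None

With this sketch, ``end_keyword(r"\END-IF")`` and ``end_keyword("\\END\tIF")``
both return ``"IF"``, while a doubled separator such as ``\END--IF`` is
rejected and returns :py:obj:`None`.
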
Names and Entities
==================

In an expression context all other words are interpreted as entity names
(token type :py:class:`pygments.token.Token.Name.Entity`).

Allowed characters in the words follow the corresponding `Python`_
rules. As such, many Unicode characters are allowed.

To highlight entity names that contain whitespace or other "special"
characters use the ``\NAME`` command.

.. code-block:: algpseudocode

   \TEXT{entity_name_1} entity_name_1
   \TEXT{entity_name_2} entity_name_2
   \TEXT{\NAME{entity-name 3\}} \NAME{entity-name 3}
   \TEXT{München} München
   \TEXT{Genève} Genève


.. _explicit-token-types:

Explicit Token Types
====================

Explicit token types handle keywords and operators that are not handled
by default, or change the default handling of some expressions.

`XX` represents a `value` in the
:py:data:`pygments.token.STANDARD_TYPES` dict. Its corresponding token
type (the associated `key` in this `dict`) is used as token type.

``\\tt-XX/SINGLE-CHAR``
  no escaping needed

  `SINGLE-CHAR` is a single character and can be *any* character
  (including a carriage-return or line-feed)

``\\ttx-XX{CHARACTERS}``

``\\ttx-XX(CHARACTERS)``

``\\ttx-XX[CHARACTERS]``

``\\ttx-XX<CHARACTERS>``

``\\ttx-XX<SEP>CHARACTERS<SEP>``
  No escaping possible! There are enough alternatives available!

  `SEP` is one of ``/:|=*+!\$~``.

Examples:

.. code-block:: algpseudocode

   \text{• \\tt-kc/C} \tt-kc/C \rem C as Keyword.Constant
   \text{• \\tt-ow/∈} \tt-ow/∈ \rem ∈ as Operator.Word
   \text{• \\ttx-kc{A New Constant Keyword\}} \ttx-kc{A New Constant Keyword} \rem As a new Keyword.Constant
   \text{• \\ttx-nv{A New Variable Name\}} \ttx-nv{A New Variable Name} \rem An explicit Name.Variable
   \text{• \\ttx-k(∈ ∌)} \ttx-k(∈ ∌) \rem ∈ and ∌ as (ordinary) Keywords
   \text{• \\ttx-o<∈ ∌>} \ttx-o<∈ ∌> \rem ∈ and ∌ as (ordinary) Operators

   /*
    * The line below has ∈_∌ as (peculiar) function name.
    * Their params are automatic (i.e. a normal expression).
    */
   \text{• \\ttx-nf<∈_∌>(p1, p2)} \ttx-nf<∈_∌>(p1, p2)

   /*
    * The line below has ∈_∌ as (peculiar) decorator name (as used in Python).
    * Their params are automatic (i.e. a normal expression).
    */
   \text{• \\ttx-nd[∈_∌](p1, p2)} \ttx-nd[∈_∌](p1, p2)

   /*
    * This is a non-existing token type: you get some generic error marking
    * with a Generic.Error token and no expansion.
    */
   \text{• \\ttx-NON-EXISTING[∈_∌](p1, p2)} \ttx-NON_EXISTING[∈_∌](p1, p2)

.. note:: Explicit token types are **case-sensitive**.


.. _customized-sphinx-lexers:

Customized Lexers in Sphinx
===========================

Lexers with non-default options can be defined for `Sphinx`_ in its
configuration file :file:`conf.py`.

The first option is to set the Sphinx config value
``highlight_options`` appropriately. This customizes an existing lexer
through its options.

A more flexible alternative is to define a new lexer in the Sphinx
application. The very same lexer class can be used with different
options:

.. code-block:: python

   from functools import partial

   from pygments_lexer_pseudocode2.lexers.algpseudocode import AlgPseudocodeLexer


   def setup(app):
       #
       # Add a custom lexer: AlgPseudocodeLexer with custom init
       # option "no_end".
       #
       # In modern Sphinx versions the given lexer must be callable and
       # may not be a lexer instance. So use an indirection with
       # "partial" here.
       #
       app.add_lexer("noend-algpseudocode",
                     partial(AlgPseudocodeLexer, no_end=True))

It works similarly for custom styles and filters.

.. note:: Lexers in Sphinx are instantiated with the `raiseonerror`
   filter applied by default. This is also true for custom lexers that
   are added by :py:meth:`Sphinx.add_lexer`.

   Lexer *instances* that are added to
   :py:data:`sphinx.highlighting.lexers` in some other way are taken as
   is by Sphinx and are not augmented with any default filters.

   For older Sphinx versions your mileage may vary.


Some Examples
=============

A synthetic example with many features:

.. literalinclude:: examples/example-1.pseudocode
   :language: algpseudocode
   :lines: 2-

With a customized `AlgPseudocodeLexer` and its `no_end` option set to
:py:obj:`True`:

.. literalinclude:: examples/example-1.pseudocode
   :language: NoEndAlgPseudocode
   :lines: 2-

This is Wikipedia's description of *Dinic's Algorithm* (see
https://en.wikipedia.org/wiki/Dinic%27s_algorithm):

.. literalinclude:: examples/algorithm-dinic.description
   :language: algpseudocode
   :lines: 2-

This is Wikipedia's pseudocode of the *Ford–Fulkerson Algorithm* (see
https://en.wikipedia.org/wiki/Ford%E2%80%93Fulkerson_algorithm):

.. literalinclude:: examples/algorithm-ford-fulkerson.pseudocode
   :language: algpseudocode
   :lines: 2-

This is Wikipedia's pseudocode of the *Edmonds–Karp Algorithm* (see
https://en.wikipedia.org/wiki/Edmonds%E2%80%93Karp_algorithm) with a
custom lexer that skips all ``ENDxxx`` keywords:

.. literalinclude:: examples/algorithm-edmonds-karp.pseudocode
   :language: NoEndAlgPseudocode
   :lines: 2-

And now the *Edmonds–Karp Algorithm* with French keywords:

.. literalinclude:: examples/algorithm-edmonds-karp.pseudocode
   :language: algpseudocode-fr
   :lines: 2-

And again the *Edmonds–Karp Algorithm* with German keywords:

.. literalinclude:: examples/algorithm-edmonds-karp.pseudocode
   :language: algpseudocode-de
   :lines: 2-
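
As a complement to the ``highlight_options`` route mentioned in
:ref:`customized-sphinx-lexers`, a minimal :file:`conf.py` fragment
might look like the sketch below. It assumes a Sphinx version that
accepts a per-language dictionary for ``highlight_options`` (3.5 or
newer) and that the options are passed on unchanged to the lexer
registered under the alias ``algpseudocode``. The chosen option values
are only illustrative.

.. code-block:: python

   # conf.py (fragment): per-language options for Pygments lexers.
   # Assumption: the keys are the lexer options described in the
   # section "Lexer Options"; the values here are examples only.
   highlight_options = {
       "algpseudocode": {
           "gets": ":=",    # symbol printed by \GETS instead of the default
           "no_end": True,  # suppress the output of all \ENDxxx commands
       },
   }

In contrast to a lexer registered with ``app.add_lexer``, such a setting
would apply to every block highlighted with the ``algpseudocode`` alias,
so a separately registered alias remains the more flexible choice when
both variants are needed in one project.
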
