changeset 168:bff8b900713a
REFACTOR: All documentation pages refactored: merge intro and details for lexers and filters
| author | Franz Glasner <fzglas.hg@dom66.de> |
|---|---|
| date | Mon, 11 May 2026 01:31:12 +0200 |
| parents | ddefcc20367c |
| children | 3c517c22df9c |
| files | docs/details-algpseudocode.rst docs/details-filter.rst docs/details-frpseudocode.rst docs/details.rst docs/filters.rst docs/index.rst docs/intro.rst docs/lexer-algpseudocode.rst docs/lexer-frpseudocode.rst docs/lexers.rst |
| diffstat | 10 files changed, 730 insertions(+), 735 deletions(-) |
--- a/docs/details-algpseudocode.rst Sun May 10 15:27:18 2026 +0200 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,450 +0,0 @@ -.. -*- coding: utf-8; indent-tabs-mode: nil; -*- - - -.. _details-algpseudocode: - -*************** - AlgPseudocode -*************** - -Lexer Options -============= - - .. describe:: prohibit_raiseonerror_filter - - **Type:** :py:class:`bool` - - **Default:** `False` - - If ``True`` the `raiseonerror` filter is not allowed to be applied by - `Sphinx`_ when :py:meth:`Lexer.add_filter` is called. - - This setting does not apply to filters that are set by the standard - lexer option `filters`. - - .. describe:: no_end - - **Type:** :py:class:`bool` - - **Default:** `False` - - If ``True`` all the ``\ENDxxx`` commands will be skipped and yield - nothing. - - .. describe:: gets - - **Type:** :py:class:`str` or :py:obj:`None` - - **Default:** `None` (yields ``←``) - - The operator symbol to be printed by the command ``\GETS``. - - An often used alternative is ``:=``. - - .. describe:: remark - - **Type:** :py:class:`str` or :py:obj:`None` - - **Default:** `None` (yields ``▷``) - - The symbol to be printed as when starting comments with - ``\REMARK`` or ``\REM``. - - To use a lexer with non-default options in `Sphinx`_ see section - :ref:`customized-sphinx-lexers`. - - -Comments -======== - -- with the ``\REMARK`` or ``\REM`` keywords (this includes a leading symbol) -- multi-line comments with ``/* ... */``; they can be **nested** -- multi-line comments with ``(* ... *)``; they can be **nested** -- single-line comments with ``//`` or ``#`` (until the end of the line) - -.. 
code-block:: algpseudocode - - /* - * A single multiline comment - */ - - /* - * A multiline comment - * - * /* This is a nested multi-line comment */ - * - */ - - (* - * A multiline comment - * - * (* This is a nested multi-line comment *) - * - *) - - // A single-line comment - - # A single-line comment - - \REM A remark has a leading symbol - - -Literals -======== - -Strings and numbers as in `Python`_. String prefixes ``r``, ``f`` and ``t`` -are not supported -- ``u`` and ``b`` are. - -To yield non-string-delimiting single- and double-quotes you have to escape them -using ``\'`` or ``\"``. This must be used to typeset something as -:algpseudocode:`f\\'(x) = 0`. - -.. code-block:: algpseudocode - - 0 0xdead 0b100001 0o720 2.7 2.7e-54 - - "A string with an escaped double-quote \" " - - 'Another string with an escaped single-quote \' ' - - """A multiline - string - """ - - '''Another multiline string - - ''' - - b"A \x20 byte string" - - u'An explicit Unicode \u1234 string' - - \" a non string - - \' a non string also - - -(Mathematical) Symbols and Operators -==================================== - -Some ASCII symbol combinations are recognized and replaced by a -Unicode symbol: - -.. code-block:: algpseudocode - - \TEXT{<=>} <=> - \TEXT{<->} <-> - \TEXT{<-} <- - \TEXT{->} -> - \TEXT{=>} => - \TEXT{<=} <= - \TEXT{>=} >= - \TEXT{<>} <> - \TEXT{!=} != - \TEXT{:=} := - \TEXT{=:} =: - \TEXT{?=} ?= - -Unicode codepoints with property ``Sm`` are recognized as mathematical symbol -and highlighted accordingly. - - -Punctuation -=========== - -Runs of dots ``.``, ``..``, ``...``, ``....``, ... are handled -properly in expressions and yield a punctuation token. -They are not replaced by corresponding Unicode symbols. 
- - -Keywords -======== - -Explicit Keywords ------------------ - -- Start with a backslash character ``\`` -- Case-insensitive -- Translated if a translation is found - -Parameter handling is as follows: - -- Parameters are enclosed in curly braces ``{`` and ``}`` -- Escaping within the braces is possible using the backslash ``\`` -- Parameters are separated from the keyword/command by a (possibly empty) run - of space or TAB characters. - This is true for required and optional parameters. - -.. todo:: Escaping - - A single backslash is a Generic.Error token - - -With Required Parameters -~~~~~~~~~~~~~~~~~~~~~~~~ - -.. code-block:: algpseudocode - - \TEXT{\PROGRAM {A Program\} or \PROG {A Program\}} \PROGRAM {A Program} - \TEXT{\ALGORITHM{An Algorithm\} or \ALGO{An Algorithm\}} \ALGORITHM{An Algorithm} - \TEXT{\PROCEDURE{A Procedure\} or \PROC{A Procedure\}} \PROCEDURE{A Procedure} - \TEXT{\FUNCTION{A Function\} or \FUNC{A Function\} or \FN{A Function\}} \FUNCTION{A Function} - \TEXT{\CLASS{A Class\}} \CLASS{A Class} - - \TEXT{\STATEMENT{the expression\} \STATE{the expression\} \BLOCK{the expression\}} \STATEMENT{the expression} - - \TEXT{expr1: \\EXPRESSION{expression a in b\} expr2: \\EXPR{expression b in a\}} \TEXT{expr1: \EXPRESSION{expression a in b} expr2: \EXPR{expression b in a}} - - \TEXT{\TEXTSTATEMENT{the text\} \TEXTSTATE{the text\} \TSTATEMENT{the text\} \TSTATE{the text\} \TEXTBLOCK{the text\} \TBLOCK{the text\}} \TEXTSTATEMENT{the text} - - \TEXT{\INPUT{Input 1\}} \INPUT{Input 1} - \TEXT{\INPUTS{Input 2\}} \INPUTS{Input 2} - - \TEXT{\OUTPUT{Output 1\}} \OUTPUT{Output 1} - \TEXT{\OUTPUTS{Output 2\}} \OUTPUTS{Output 2} - - \TEXT{\ENSURE{Whatever should be ensured!\}} \ENSURE{Whatever should be ensured!} - - \TEXT{\REQUIRE{Whatever should be required.\}} \REQUIRE{Whatever should be required.} - - \TEXT{\RETURNS{Return 2\}} \RETURNS{Return 2} - - \TEXT{\CALL{a function\}(p1, p2)} \CALL{a function}(p1, p2) - - \TEXT{\NAME{an entity name\}} \NAME{an 
entity name} - - -With Optional Parameters -~~~~~~~~~~~~~~~~~~~~~~~~ - -Some ``END``-keywords have optional parameters: - -.. code-block:: algpseudocode - - \TEXT{\ENDPROGRAM \ENDPROG} \ENDPROGRAM - \TEXT{\ENDALGORITHM \ENDALGO} \ENDALGORITHM - \TEXT{\ENDPROCEDURE \ENDPROC} \ENDPROCEDURE - \TEXT{\ENDFUNCTION \ENDFUNC \ENDFN} \ENDFUNCTION - \TEXT{\ENDCLASS} \ENDCLASS - -They are used like this: - -.. code-block:: algpseudocode - - \TEXT{\CLASS{Foo Bar Class\} ... \END CLASS {Foo Bar Class\}} \TEXT{yields} \CLASS{Foo Bar Class} ... \END CLASS {Foo Bar Class} - -.. seealso:: Syntax variants: `END-Keywords`_ - - -Without Parameters -~~~~~~~~~~~~~~~~~~ - -"Normal" Keywords -''''''''''''''''' - -.. code-block:: algpseudocode - - \TEXT{\IF} \IF - \TEXT{\THEN} \THEN - \TEXT{\ELSE} \ELSE - \TEXT{\ELSEIF or \ELSIF or \ELIF} \ELSEIF \text{or} \ELSIF \text{or} \ELIF - \TEXT{\DO} \DO - \TEXT{\WHILE} \WHILE - \TEXT{\FORALL} \FORALL - \TEXT{\FOR} \FOR - \TEXT{\FROM} \FROM - \TEXT{\TO} \TO - \TEXT{\STEP} \STEP - \TEXT{\IN} \IN - \TEXT{\LOOP} \LOOP - \TEXT{\REPEAT} \REPEAT - \TEXT{\UNTIL} \UNTIL - - \TEXT{\RETURN} \RETURN - - \TEXT{\BEGIN} \BEGIN - \TEXT{\END} \END - - \TEXT{\IS} \IS - \TEXT{\WITH} \WITH - - \TEXT{\GETS} \GETS - - \TEXT{\\REMARK or \\REM} \REMARK A comment with a leading symbol - -``\REMARK`` or ``\REM`` is special: all characters to the end of the -line are taken as comment; curly braces are not needed---in fact: -they are interpreted to be part of the comment. - - -END-Keywords -'''''''''''' - -The separator character can be empty, a run of ASCII spaces, a run of TAB characters, -a single underscore ``_`` or a single hyphen ``-`` like: - - ``\ENDIF``, ``\END IF``, ``\END-IF``, ``\END_IF`` or ``\END IF`` - - -.. 
code-block:: algpseudocode - - \text{\ENDIF} \ENDIF \rem empty - - \text{\END IF} \END IF \rem a single space - - \text{\END IF} \END IF \rem two spaces - - \text{\END-IF} \END-IF \rem a single hyphen - - \text{\END_IF} \END_IF \rem a single underscore - - \text{\END IF} \END IF \rem a single TAB character - -The list of END-keywords (here always just with ``-`` as separator): - -.. code-block:: algpseudocode - - \text{\END-PROGRAM \END-PROG} \END-PROGRAM - \text{\END-ALGORITHM \END-ALGO} \END-ALGORITHM - \text{\END-PROCEDURE \END-PROC} \END-PROCEDURE - \text{\END-FUNCTION \END-FUNC \END-FN} \END-FUNCTION - \text{\END-CLASS} \END-CLASS - \text{\END-IF} \END-IF - \text{\END-WHILE} \END-WHILE - \text{\END-FOR} \END-FOR - \text{\END-FORALL} \END-FORALL - \text{\END-LOOP} \END-LOOP - - -Names and Entities -================== - -In an expression context all other words are interpreted as entity -names (token type :py:class:`pygments.token.Token.Name.Entity`). - -Allowed characters in the words follow the corresponding `Python`_ rules. -As such, many Unicode characters are allowed. - -To highlight entity names with whitespace or other "special" characters in it -use the ``NAME`` command. - -.. code-block:: algpseudocode - - \TEXT{entity_name_1} entity_name_1 - - \TEXT{entity_name_2} entity_name_2 - - \TEXT{\NAME{entity-name 3\}} \NAME{entity-name 3} - - \TEXT{München} München - - \TEXT{Genève} Genève - -.. _explicit-token-types: - -Explicit Token Types -==================== - -Handle keywords and operators that are not handled by default or change -the default handling of some expressions. - -`XX` represents a `value` in the :py:data:`pygments.token.STANDARD_TYPES` -dict. -Its corresponding token type (the associated `key` in this `dict`) is -used as token type. 
- -``\\tt-XX/SINGLE-CHAR`` - - no escaping needed - - `SINGLE-CHAR` is a single character and can be *every* character - (including a carriage-return or line-feed) - -``\\ttx-XX{CHARACTERS}`` - -``\\ttx-XX(CHARACTERS)`` - -``\\ttx-XX[CHARACTERS]`` - -``\\ttx-XX<CHARACTERS>`` - -``\\ttx-XX<SEP>CHARACTERS<SEP>`` - - No escaping possible! There are enough alternatives available! - - `SEP` is one of ``/:|=*+!\$~``. - - -Examples: - -.. code-block:: algpseudocode - - \text{• \\tt-kc/C} \tt-kc/C \rem C as Keyword.Constant - \text{• \\tt-ow/∈} \tt-ow/∈ \rem ∈ as Operator.Word - \text{• \\ttx-kc{A New Constant Keyword\}} \ttx-kc{A New Constant Keyword} \rem As a new Keyword.Constant - \text{• \\ttx-nv{A New Variable Name\}} \ttx-nv{A New Variable Name} \rem An explicit Name.Variable - \text{• \\ttx-k(∈ ∌)} \ttx-k(∈ ∌) \rem ∈ and ∌ as (ordinary) Keywords - \text{• \\ttx-o<∈ ∌>} \ttx-o<∈ ∌> \rem ∈ and ∌ as (ordinary) Operators - /* - * The line below has ∈_∌ as (peculiar) function name. - * Their params are automatic (i.e. a normal expression). - */ - \text{• \\ttx-nf<∈_∌>(p1, p2)} \ttx-nf<∈_∌>(p1, p2) - /* - * The line below has ∈_∌ as (peculiar) decorator name (as used in Python). - * Their params are automatic (i.e. a normal expression). - */ - \text{• \\ttx-nd[∈_∌](p1, p2)} \ttx-nd[∈_∌](p1, p2) - /* - * This is a non-existing token type: you get some generic error marking - * with a Generic.Error token and no expansion. - */ - \text{• \\ttx-NON-EXISTING[∈_∌](p1, p2)} \ttx-NON_EXISTING[∈_∌](p1, p2) - -.. note:: Explicit token types are **case-sensitive**. - - -.. _customized-sphinx-lexers: - -Customized Lexers in Sphinx -=========================== - -Defining lexers with non-default options in `Sphinx`_ can be done in its -configuration file :file:`conf.py`. - -The first option is to apply the Sphinx config value ``highlight_options`` -properly. An existing lexer can be customized by options. 
- -A more flexible alternative is to define a new lexer in the Sphinx -application. The very same lexer class can be used with different options: - -.. code-block:: python - - from functools import partial - from pygments_lexer_pseudocode2.lexers.algpseudocode import AlgPseudocodeLexer - - def setup(app): - - # - # Add a custom lexer: AlgPseudocodeLexer with custom init - # option "no_end". - # - # In modern Sphinx versions given lexer must be callable and may - # not be a lexer instance. So use an indirection with "partial" - # here. - # - app.add_lexer("noend-algpseudocode", - partial(AlgPseudocodeLexer, no_end=True)) - -Similarily it works for custom styles and filters. - -.. note:: Lexers in Sphinx are instantiated with the `raiseonerror` filter - applied by default. - This is also true for custom lexers that are added by - :py:meth:`Sphinx.add_lexer`. - - Lexer *instances* that are added to - :py:data:`sphinx.highlighting.lexers` somehow are taken as is by - Sphinx and are not augmented with any default filters. - -For older Sphinx versions your mileage may vary.
--- a/docs/details-filter.rst Sun May 10 15:27:18 2026 +0200 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,31 +0,0 @@ -.. -*- coding: utf-8; indent-tabs-mode: nil; -*- - -.. _details-filter: - -******** - Filter -******** - -ErrorToGenericErrorTokenFilter -============================== - -:Name: errortogenericerror -:Filter Options: none - -Replace all :py:class:`pygments.token.Token.Error` tokens in a stream by -:py:class:`pygments.token.Token.Generic.Error` tokens. - - -TokenReplaceFilter -================== - -:Name: tokenreplace -:Required Filter Options: - **token_from** - **Type:** :py:class:`str` or :py:class:`pygments.token.Token` - - **token_to** - **Type:** :py:class:`str` or :py:class:`pygments.token.Token` - -Replace all token types given in `token_from` by the token type given -in `token_to`.
--- a/docs/details-frpseudocode.rst Sun May 10 15:27:18 2026 +0200 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,74 +0,0 @@ -.. -*- coding: utf-8; indent-tabs-mode: nil; -*- - -.. _fr-pseudocode: - -************** - FrPseudocode -************** - -This is the pseudocode lexer from the original `pygments-lexer-pseudocode` -package. - -It has been changed somewhat: - -- renamed from ``Pseudocode`` to ``FrPseudocode`` -- changed aliases to ``fr-pseudocode``, ``fr-pseudo``, ``fr-algorithm`` - and ``fr-algo`` -- changed file extension to ``.fr-algo`` and ``.fr-pseudocode`` -- changed some exististing arrows and added some more -- numbers parsing is more flexible by following the rules of the `Pygments`_ - lexer for `Python`_ -- also allow ``!=`` as inequality operator (in addition to ``<>``) - -It mostly just recognizes some (french) keywords and highlights them. - -Comments are supported (``//`` and ``/* ... */`` (single-line only))- -"Directives" in "special" comments are to be enclosed in curly braces ``{ ... }``. - -It also implements some symbol replacements/conversions like -``<=`` to ``≤``, ``>=`` to ``≥`` or ``<>`` to ``≠``. - - -.. rubric:: Example: - -The follwing example - -.. code-block:: none - - /* foo bar */ - - fonction fonc-1({passage par valeur}param1) - début - si param1 <= 0 alors - b = 0 - sinon - b = 1 - a = param1 - répéter - a = a - 1 - b = b * 2 - tantque a <> 0 - fin si - retourner b - fin fonction - -will be highlighted as - -.. code-block:: fr-algorithm - - /* foo bar */ - - fonction fonc-1({passage par valeur}param1) - début - si param1 <= 0 alors - b = 0 - sinon - b = 1 - a = param1 - répéter - a = a - 1 - b = b * 2 - tantque a <> 0 - fin si - retourner b - fin fonction
--- a/docs/details.rst Sun May 10 15:27:18 2026 +0200 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,11 +0,0 @@ -.. -*- coding: utf-8; indent-tabs-mode: nil; -*- - -********* - Details -********* - -.. toctree:: - - details-algpseudocode - details-frpseudocode - details-filter
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/docs/filters.rst Mon May 11 01:31:12 2026 +0200 @@ -0,0 +1,57 @@ +.. -*- coding: utf-8; indent-tabs-mode: nil; -*- + +********* + Filters +********* + +The package contains the following filters: + +.. include:: filterlist.rst + + +The AlgPseudocode lexer yields an error token for the following code block. +`Sphinx`_ therefore suppresses highlighting completely: + +.. code-block:: none + + \EXPR{TEST} + +With a custom AlgPseudocode lexer that has ``prohibit_raiseonerror_filter`` +activated the output in `Sphinx`_ is as: + +.. code-block:: no-raiseonerror-algpseudocode + + \EXPR{TEST} + + +With the "errortogenericerror" filter the very same block is highlighted +as: + +.. code-block:: genericerror-algpseudocode + + \EXPR{TEST} + + +ErrorToGenericErrorTokenFilter +============================== + +:Name: errortogenericerror +:Filter Options: none + +Replace all :py:class:`pygments.token.Token.Error` tokens in a stream by +:py:class:`pygments.token.Token.Generic.Error` tokens. + + +TokenReplaceFilter +================== + +:Name: tokenreplace +:Required Filter Options: + **token_from** + **Type:** :py:class:`str` or :py:class:`pygments.token.Token` + + **token_to** + **Type:** :py:class:`str` or :py:class:`pygments.token.Token` + +Replace all token types given in `token_from` by the token type given +in `token_to`.
--- a/docs/index.rst Sun May 10 15:27:18 2026 +0200 +++ b/docs/index.rst Mon May 11 01:31:12 2026 +0200 @@ -8,8 +8,8 @@ :maxdepth: 2 :caption: Contents: - intro - details + lexers + filters .. * :ref:`genindex` .. * :ref:`modindex`
--- a/docs/intro.rst Sun May 10 15:27:18 2026 +0200 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,167 +0,0 @@ -.. -*- coding: utf-8; indent-tabs-mode: nil; -*- - - -************** - Introduction -************** - -Lexer -===== - -The package contains the following lexers: - -.. include:: lexerlist.rst - -They may be used in `Sphinx`_ by their aliases: - -.. code-block:: none - - .. code-block:: algpseudocode - - \PROGRAM {The Pseudoprogram} \IS - - \END PROGRAM {The Pseudoprogram} - -It will be rendered as: - -.. code-block:: algpseudocode - - \PROGRAM {The Pseudoprogram} \IS - - \END PROGRAM {The Pseudoprogram} - -And the same with the german variant -(using ``.. code-block:: algpseudocode-de`` as language alias): - -.. code-block:: algpseudocode-de - - \PROGRAM {The Pseudoprogram} \IS - - \END PROGRAM {The Pseudoprogram} - -A basic documentation for FrPseudocode you find on its -:ref:`detail page <fr-pseudocode>`. - -The AlgPseudocode lexer and its friends AlgPseudocodeDE and AlgPseudocodeFR -basically work in three states: `default`, `expression` and `text`. - - In expressions it automatically recognizes: - - - Strings (single-quote, double-quote, triple-single-quote, - triple-double-quote, `Python`_ style) - - Numbers (also `Python`_ style) - - (Mathematical) operators and symbols - - ``\TEXT{...}`` - - To switch in a text-mode that prohibits automatic expression - highlighting. - - A closing curly brace can be quoted with ``\}`` to not end the - text mode prematurely. - - - ``\NAME``, ``\CALL`` and ``\GETS`` - - - ``\REM`` and ``\REMARK`` for remarks (aka comments) - - - Names (`Name.Entity`) - - - :ref:`explicit-token-types` - - In the default-mode it recogzizes expressions and additionally all - sorts of comments and commands that are inspired by CTAN's - `Algpseudocodex`_. - - In texts it recogzizes: - - - ``\EXPR`` or ``\EXPRESSION`` - - To switch to expression-mode. 
- - A closing curly brace can be quoted with ``\}`` to not end the expression - mode prematurely. - - - ``\REM`` and ``\REMARK`` for remarks (aka comments) - - - :ref:`explicit-token-types` - - -.. rubric:: Some Examples - -A synthetic example with many features: - -.. literalinclude:: examples/example-1.pseudocode - :language: algpseudocode - :lines: 2- - -With a customized `AlgPseudocodeLexer` and its `no_end` -option set to ``True``. - -.. literalinclude:: examples/example-1.pseudocode - :language: NoEndAlgPseudocode - :lines: 2- - -This is Wikipedia's description of *Dinic's Algorithm* -(see https://en.wikipedia.org/wiki/Dinic%27s_algorithm): - -.. literalinclude:: examples/algorithm-dinic.description - :language: algpseudocode - :lines: 2- - -This is Wikipedia's pseudocode of the *Ford–Fulkerson Algorithm* -(see https://en.wikipedia.org/wiki/Ford%E2%80%93Fulkerson_algorithm): - -.. literalinclude:: examples/algorithm-ford-fulkerson.pseudocode - :language: algpseudocode - :lines: 2- - -This is Wikipedia's pseudocode of the *Edmonds–Karp Algorithm* -(see https://en.wikipedia.org/wiki/Edmonds%E2%80%93Karp_algorithm) -with a custom lexer that skip all ``ENDxxx`` keywords: - -.. literalinclude:: examples/algorithm-edmonds-karp.pseudocode - :language: NoEndAlgPseudocode - :lines: 2- - -And now the *Edmonds–Karp Algorithm* with french keywords: - -.. literalinclude:: examples/algorithm-edmonds-karp.pseudocode - :language: algpseudocode-fr - :lines: 2- - -And again the *Edmonds–Karp Algorithm* with german keywords: - -.. literalinclude:: examples/algorithm-edmonds-karp.pseudocode - :language: algpseudocode-de - :lines: 2- - -More details you will find :ref:`here <details-algpseudocode>`. - - -Filter -====== - -The package contains the following filters: - -.. include:: filterlist.rst - -The AlgPseudocode lexer yields an error token for the following code block. -`Sphinx`_ therefore suppresses highlighting completely: - -.. 
code-block:: none - - \EXPR{TEST} - -With a custom AlgPseudocode lexer that has ``prohibit_raiseonerror_filter`` -activated the output in `Sphinx`_ is as: - -.. code-block:: no-raiseonerror-algpseudocode - - \EXPR{TEST} - - -With the "errortogenericerror" filter the very same block is highlighted -as: - -.. code-block:: genericerror-algpseudocode - - \EXPR{TEST}
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/docs/lexer-algpseudocode.rst Mon May 11 01:31:12 2026 +0200 @@ -0,0 +1,577 @@ +.. -*- coding: utf-8; indent-tabs-mode: nil; -*- + + +************************************* + AlgPseudocode and Language Variants +************************************* + +These lexers are heavily heavily inspired by CTAN’s `Algpseudocodex`_. +They recogzize expressions and additionally all sorts of comments and +commands that are inspired by `Algpseudocodex`_. + +They may be used in `Sphinx`_ by their aliases: + +.. code-block:: none + + .. code-block:: algpseudocode + + \PROGRAM {The Pseudoprogram} \IS + + \END PROGRAM {The Pseudoprogram} + +It will be rendered as: + +.. code-block:: algpseudocode + + \PROGRAM {The Pseudoprogram} \IS + + \END PROGRAM {The Pseudoprogram} + +And the same with the german variant +(using ``.. code-block:: algpseudocode-de`` as language alias): + +.. code-block:: algpseudocode-de + + \PROGRAM {The Pseudoprogram} \IS + + \END PROGRAM {The Pseudoprogram} + +The AlgPseudocode lexer and its language variants AlgPseudocodeDE and +AlgPseudocodeFR basically work in three states: `default`, +`expression` and `text`. + + In expressions it automatically recognizes: + + - Strings (single-quote, double-quote, triple-single-quote, + triple-double-quote, `Python`_ style) + - Numbers (also `Python`_ style) + - (Mathematical) operators and symbols + - ``\TEXT{...}`` + + To switch in a text-mode that prohibits automatic expression + highlighting. + + A closing curly brace can be quoted with ``\}`` to not end the + text mode prematurely. + + - ``\NAME``, ``\CALL`` and ``\GETS`` + + - ``\REM`` and ``\REMARK`` for remarks (aka comments) + + - Names (`Name.Entity`) + + - :ref:`explicit-token-types` + + In the default-mode it recogzizes expressions and additionally all + sorts of comments and commands that look somewhat like `Algpseudocodex`_ + commands. 
+ + In texts it recogzizes: + + - ``\EXPR`` or ``\EXPRESSION`` + + To switch to expression-mode. + + A closing curly brace can be quoted with ``\}`` to not end the expression + mode prematurely. + + - ``\REM`` and ``\REMARK`` for remarks (aka comments) + + - :ref:`explicit-token-types` + + +.. rubric:: Some Examples + +A synthetic example with many features: + +.. literalinclude:: examples/example-1.pseudocode + :language: algpseudocode + :lines: 2- + +With a customized `AlgPseudocodeLexer` and its `no_end` +option set to ``True``. + +.. literalinclude:: examples/example-1.pseudocode + :language: NoEndAlgPseudocode + :lines: 2- + +This is Wikipedia's description of *Dinic's Algorithm* +(see https://en.wikipedia.org/wiki/Dinic%27s_algorithm): + +.. literalinclude:: examples/algorithm-dinic.description + :language: algpseudocode + :lines: 2- + +This is Wikipedia's pseudocode of the *Ford–Fulkerson Algorithm* +(see https://en.wikipedia.org/wiki/Ford%E2%80%93Fulkerson_algorithm): + +.. literalinclude:: examples/algorithm-ford-fulkerson.pseudocode + :language: algpseudocode + :lines: 2- + +This is Wikipedia's pseudocode of the *Edmonds–Karp Algorithm* +(see https://en.wikipedia.org/wiki/Edmonds%E2%80%93Karp_algorithm) +with a custom lexer that skip all ``ENDxxx`` keywords: + +.. literalinclude:: examples/algorithm-edmonds-karp.pseudocode + :language: NoEndAlgPseudocode + :lines: 2- + +And now the *Edmonds–Karp Algorithm* with french keywords: + +.. literalinclude:: examples/algorithm-edmonds-karp.pseudocode + :language: algpseudocode-fr + :lines: 2- + +And again the *Edmonds–Karp Algorithm* with german keywords: + +.. literalinclude:: examples/algorithm-edmonds-karp.pseudocode + :language: algpseudocode-de + :lines: 2- + +More details you will find :ref:`here <details-algpseudocode>`. + + +.. _details-algpseudocode: + +Lexer Options +============= + + .. 
describe:: prohibit_raiseonerror_filter + + **Type:** :py:class:`bool` + + **Default:** `False` + + If ``True`` the `raiseonerror` filter is not allowed to be applied by + `Sphinx`_ when :py:meth:`Lexer.add_filter` is called. + + This setting does not apply to filters that are set by the standard + lexer option `filters`. + + .. describe:: no_end + + **Type:** :py:class:`bool` + + **Default:** `False` + + If ``True`` all the ``\ENDxxx`` commands will be skipped and yield + nothing. + + .. describe:: gets + + **Type:** :py:class:`str` or :py:obj:`None` + + **Default:** `None` (yields ``←``) + + The operator symbol to be printed by the command ``\GETS``. + + An often used alternative is ``:=``. + + .. describe:: remark + + **Type:** :py:class:`str` or :py:obj:`None` + + **Default:** `None` (yields ``▷``) + + The symbol to be printed as when starting comments with + ``\REMARK`` or ``\REM``. + + To use a lexer with non-default options in `Sphinx`_ see section + :ref:`customized-sphinx-lexers`. + + +Comments +======== + +- with the ``\REMARK`` or ``\REM`` keywords (this includes a leading symbol) +- multi-line comments with ``/* ... */``; they can be **nested** +- multi-line comments with ``(* ... *)``; they can be **nested** +- single-line comments with ``//`` or ``#`` (until the end of the line) + +.. code-block:: algpseudocode + + /* + * A single multiline comment + */ + + /* + * A multiline comment + * + * /* This is a nested multi-line comment */ + * + */ + + (* + * A multiline comment + * + * (* This is a nested multi-line comment *) + * + *) + + // A single-line comment + + # A single-line comment + + \REM A remark has a leading symbol + + +Literals +======== + +Strings and numbers as in `Python`_. String prefixes ``r``, ``f`` and ``t`` +are not supported -- ``u`` and ``b`` are. + +To yield non-string-delimiting single- and double-quotes you have to escape them +using ``\'`` or ``\"``. This must be used to typeset something as +:algpseudocode:`f\\'(x) = 0`. + +.. 
code-block:: algpseudocode + + 0 0xdead 0b100001 0o720 2.7 2.7e-54 + + "A string with an escaped double-quote \" " + + 'Another string with an escaped single-quote \' ' + + """A multiline + string + """ + + '''Another multiline string + + ''' + + b"A \x20 byte string" + + u'An explicit Unicode \u1234 string' + + \" a non string + + \' a non string also + + +(Mathematical) Symbols and Operators +==================================== + +Some ASCII symbol combinations are recognized and replaced by a +Unicode symbol: + +.. code-block:: algpseudocode + + \TEXT{<=>} <=> + \TEXT{<->} <-> + \TEXT{<-} <- + \TEXT{->} -> + \TEXT{=>} => + \TEXT{<=} <= + \TEXT{>=} >= + \TEXT{<>} <> + \TEXT{!=} != + \TEXT{:=} := + \TEXT{=:} =: + \TEXT{?=} ?= + +Unicode codepoints with property ``Sm`` are recognized as mathematical symbol +and highlighted accordingly. + + +Punctuation +=========== + +Runs of dots ``.``, ``..``, ``...``, ``....``, ... are handled +properly in expressions and yield a punctuation token. +They are not replaced by corresponding Unicode symbols. + + +Keywords +======== + +Explicit Keywords +----------------- + +- Start with a backslash character ``\`` +- Case-insensitive +- Translated if a translation is found + +Parameter handling is as follows: + +- Parameters are enclosed in curly braces ``{`` and ``}`` +- Escaping within the braces is possible using the backslash ``\`` +- Parameters are separated from the keyword/command by a (possibly empty) run + of space or TAB characters. + This is true for required and optional parameters. + +.. todo:: Escaping + + A single backslash is a Generic.Error token + + +With Required Parameters +~~~~~~~~~~~~~~~~~~~~~~~~ + +.. 
code-block:: algpseudocode + + \TEXT{\PROGRAM {A Program\} or \PROG {A Program\}} \PROGRAM {A Program} + \TEXT{\ALGORITHM{An Algorithm\} or \ALGO{An Algorithm\}} \ALGORITHM{An Algorithm} + \TEXT{\PROCEDURE{A Procedure\} or \PROC{A Procedure\}} \PROCEDURE{A Procedure} + \TEXT{\FUNCTION{A Function\} or \FUNC{A Function\} or \FN{A Function\}} \FUNCTION{A Function} + \TEXT{\CLASS{A Class\}} \CLASS{A Class} + + \TEXT{\STATEMENT{the expression\} \STATE{the expression\} \BLOCK{the expression\}} \STATEMENT{the expression} + + \TEXT{expr1: \\EXPRESSION{expression a in b\} expr2: \\EXPR{expression b in a\}} \TEXT{expr1: \EXPRESSION{expression a in b} expr2: \EXPR{expression b in a}} + + \TEXT{\TEXTSTATEMENT{the text\} \TEXTSTATE{the text\} \TSTATEMENT{the text\} \TSTATE{the text\} \TEXTBLOCK{the text\} \TBLOCK{the text\}} \TEXTSTATEMENT{the text} + + \TEXT{\INPUT{Input 1\}} \INPUT{Input 1} + \TEXT{\INPUTS{Input 2\}} \INPUTS{Input 2} + + \TEXT{\OUTPUT{Output 1\}} \OUTPUT{Output 1} + \TEXT{\OUTPUTS{Output 2\}} \OUTPUTS{Output 2} + + \TEXT{\ENSURE{Whatever should be ensured!\}} \ENSURE{Whatever should be ensured!} + + \TEXT{\REQUIRE{Whatever should be required.\}} \REQUIRE{Whatever should be required.} + + \TEXT{\RETURNS{Return 2\}} \RETURNS{Return 2} + + \TEXT{\CALL{a function\}(p1, p2)} \CALL{a function}(p1, p2) + + \TEXT{\NAME{an entity name\}} \NAME{an entity name} + + +With Optional Parameters +~~~~~~~~~~~~~~~~~~~~~~~~ + +Some ``END``-keywords have optional parameters: + +.. code-block:: algpseudocode + + \TEXT{\ENDPROGRAM \ENDPROG} \ENDPROGRAM + \TEXT{\ENDALGORITHM \ENDALGO} \ENDALGORITHM + \TEXT{\ENDPROCEDURE \ENDPROC} \ENDPROCEDURE + \TEXT{\ENDFUNCTION \ENDFUNC \ENDFN} \ENDFUNCTION + \TEXT{\ENDCLASS} \ENDCLASS + +They are used like this: + +.. code-block:: algpseudocode + + \TEXT{\CLASS{Foo Bar Class\} ... \END CLASS {Foo Bar Class\}} \TEXT{yields} \CLASS{Foo Bar Class} ... \END CLASS {Foo Bar Class} + +.. 
seealso:: Syntax variants: `END-Keywords`_ + + +Without Parameters +~~~~~~~~~~~~~~~~~~ + +"Normal" Keywords +''''''''''''''''' + +.. code-block:: algpseudocode + + \TEXT{\IF} \IF + \TEXT{\THEN} \THEN + \TEXT{\ELSE} \ELSE + \TEXT{\ELSEIF or \ELSIF or \ELIF} \ELSEIF \text{or} \ELSIF \text{or} \ELIF + \TEXT{\DO} \DO + \TEXT{\WHILE} \WHILE + \TEXT{\FORALL} \FORALL + \TEXT{\FOR} \FOR + \TEXT{\FROM} \FROM + \TEXT{\TO} \TO + \TEXT{\STEP} \STEP + \TEXT{\IN} \IN + \TEXT{\LOOP} \LOOP + \TEXT{\REPEAT} \REPEAT + \TEXT{\UNTIL} \UNTIL + + \TEXT{\RETURN} \RETURN + + \TEXT{\BEGIN} \BEGIN + \TEXT{\END} \END + + \TEXT{\IS} \IS + \TEXT{\WITH} \WITH + + \TEXT{\GETS} \GETS + + \TEXT{\\REMARK or \\REM} \REMARK A comment with a leading symbol + +``\REMARK`` or ``\REM`` is special: all characters to the end of the +line are taken as comment; curly braces are not needed---in fact: +they are interpreted to be part of the comment. + + +END-Keywords +'''''''''''' + +The separator character can be empty, a run of ASCII spaces, a run of TAB characters, +a single underscore ``_`` or a single hyphen ``-`` like: + + ``\ENDIF``, ``\END IF``, ``\END-IF``, ``\END_IF`` or ``\END IF`` + + +.. code-block:: algpseudocode + + \text{\ENDIF} \ENDIF \rem empty + + \text{\END IF} \END IF \rem a single space + + \text{\END IF} \END IF \rem two spaces + + \text{\END-IF} \END-IF \rem a single hyphen + + \text{\END_IF} \END_IF \rem a single underscore + + \text{\END IF} \END IF \rem a single TAB character + +The list of END-keywords (here always just with ``-`` as separator): + +.. 
code-block:: algpseudocode + + \text{\END-PROGRAM \END-PROG} \END-PROGRAM + \text{\END-ALGORITHM \END-ALGO} \END-ALGORITHM + \text{\END-PROCEDURE \END-PROC} \END-PROCEDURE + \text{\END-FUNCTION \END-FUNC \END-FN} \END-FUNCTION + \text{\END-CLASS} \END-CLASS + \text{\END-IF} \END-IF + \text{\END-WHILE} \END-WHILE + \text{\END-FOR} \END-FOR + \text{\END-FORALL} \END-FORALL + \text{\END-LOOP} \END-LOOP + + +Names and Entities +================== + +In an expression context all other words are interpreted as entity +names (token type :py:class:`pygments.token.Token.Name.Entity`). + +Allowed characters in the words follow the corresponding `Python`_ rules. +As such, many Unicode characters are allowed. + +To highlight entity names with whitespace or other "special" characters in it +use the ``NAME`` command. + +.. code-block:: algpseudocode + + \TEXT{entity_name_1} entity_name_1 + + \TEXT{entity_name_2} entity_name_2 + + \TEXT{\NAME{entity-name 3\}} \NAME{entity-name 3} + + \TEXT{München} München + + \TEXT{Genève} Genève + +.. _explicit-token-types: + +Explicit Token Types +==================== + +Handle keywords and operators that are not handled by default or change +the default handling of some expressions. + +`XX` represents a `value` in the :py:data:`pygments.token.STANDARD_TYPES` +dict. +Its corresponding token type (the associated `key` in this `dict`) is +used as token type. + +``\\tt-XX/SINGLE-CHAR`` + + no escaping needed + + `SINGLE-CHAR` is a single character and can be *every* character + (including a carriage-return or line-feed) + +``\\ttx-XX{CHARACTERS}`` + +``\\ttx-XX(CHARACTERS)`` + +``\\ttx-XX[CHARACTERS]`` + +``\\ttx-XX<CHARACTERS>`` + +``\\ttx-XX<SEP>CHARACTERS<SEP>`` + + No escaping possible! There are enough alternatives available! + + `SEP` is one of ``/:|=*+!\$~``. + + +Examples: + +.. 
code-block:: algpseudocode + + \text{• \\tt-kc/C} \tt-kc/C \rem C as Keyword.Constant + \text{• \\tt-ow/∈} \tt-ow/∈ \rem ∈ as Operator.Word + \text{• \\ttx-kc{A New Constant Keyword\}} \ttx-kc{A New Constant Keyword} \rem As a new Keyword.Constant + \text{• \\ttx-nv{A New Variable Name\}} \ttx-nv{A New Variable Name} \rem An explicit Name.Variable + \text{• \\ttx-k(∈ ∌)} \ttx-k(∈ ∌) \rem ∈ and ∌ as (ordinary) Keywords + \text{• \\ttx-o<∈ ∌>} \ttx-o<∈ ∌> \rem ∈ and ∌ as (ordinary) Operators + /* + * The line below has ∈_∌ as (peculiar) function name. + * Their params are automatic (i.e. a normal expression). + */ + \text{• \\ttx-nf<∈_∌>(p1, p2)} \ttx-nf<∈_∌>(p1, p2) + /* + * The line below has ∈_∌ as (peculiar) decorator name (as used in Python). + * Their params are automatic (i.e. a normal expression). + */ + \text{• \\ttx-nd[∈_∌](p1, p2)} \ttx-nd[∈_∌](p1, p2) + /* + * This is a non-existing token type: you get some generic error marking + * with a Generic.Error token and no expansion. + */ + \text{• \\ttx-NON-EXISTING[∈_∌](p1, p2)} \ttx-NON_EXISTING[∈_∌](p1, p2) + +.. note:: Explicit token types are **case-sensitive**. + + +.. _customized-sphinx-lexers: + +Customized Lexers in Sphinx +=========================== + +Defining lexers with non-default options in `Sphinx`_ can be done in its +configuration file :file:`conf.py`. + +The first option is to apply the Sphinx config value ``highlight_options`` +properly. An existing lexer can be customized by options. + +A more flexible alternative is to define a new lexer in the Sphinx +application. The very same lexer class can be used with different options: + +.. code-block:: python + + from functools import partial + from pygments_lexer_pseudocode2.lexers.algpseudocode import AlgPseudocodeLexer + + def setup(app): + + # + # Add a custom lexer: AlgPseudocodeLexer with custom init + # option "no_end". + # + # In modern Sphinx versions given lexer must be callable and may + # not be a lexer instance. 
So use an indirection with "partial" + # here. + # + app.add_lexer("noend-algpseudocode", + partial(AlgPseudocodeLexer, no_end=True)) + +Similarly, this works for custom styles and filters. + +.. note:: Lexers in Sphinx are instantiated with the `raiseonerror` filter + applied by default. + This is also true for custom lexers that are added by + :py:meth:`Sphinx.add_lexer`. + + Lexer *instances* that are added to + :py:data:`sphinx.highlighting.lexers` somehow are taken as is by + Sphinx and are not augmented with any default filters. + +For older Sphinx versions your mileage may vary.
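The first option named in the section above, ``highlight_options``, could look roughly like the following excerpt from :file:`conf.py`. This is a sketch and not taken from the package. It assumes a Sphinx version that accepts a per-language dict for ``highlight_options`` (documented since Sphinx 3.5) and that the lexer is registered under the alias ``algpseudocode``, as used by the ``code-block`` directives in this file. The option names ``no_end`` and ``gets`` are the documented AlgPseudocode lexer options; the values are only illustrative.

```python
# conf.py (excerpt) -- illustrative sketch, the values are assumptions
highlight_options = {
    # Key: the lexer alias as used in ".. code-block:: algpseudocode"
    "algpseudocode": {
        "no_end": True,  # skip all \ENDxxx commands
        "gets": ":=",    # print ":=" for \GETS instead of the default arrow
    },
}
```

Unlike the ``add_lexer`` approach, this changes the options for every block that uses the alias, so it fits best when one set of options is wanted for the whole project.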
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/docs/lexer-frpseudocode.rst Mon May 11 01:31:12 2026 +0200
@@ -0,0 +1,80 @@
+.. -*- coding: utf-8; indent-tabs-mode: nil; -*-
+
+.. _XXfr-pseudocode:
+
+**************
+ FrPseudocode
+**************
+
+This is the pseudocode lexer from the original `pygments-lexer-pseudocode`
+package.
+
+It has been changed somewhat:
+
+- renamed from ``Pseudocode`` to ``FrPseudocode``
+- changed aliases to ``fr-pseudocode``, ``fr-pseudo``, ``fr-algorithm``
+  and ``fr-algo``
+- changed file extension to ``.fr-algo`` and ``.fr-pseudocode``
+- changed some existing arrows and added some more
+- numbers parsing is more flexible by following the rules of the `Pygments`_
+  lexer for `Python`_
+- also allow ``!=`` as inequality operator (in addition to ``<>``)
+
+It mostly just recognizes some (French) keywords and highlights them.
+
+Comments are supported (``//`` and ``/* ... */`` (single-line only)).
+"Directives" in "special" comments are to be enclosed in curly braces ``{ ... }``.
+
+It also implements some symbol replacements/conversions like
+``<=`` to ``≤``, ``>=`` to ``≥`` or ``<>`` to ``≠``.
+
+
+.. rubric:: Example:
+
+The following example
+
+.. code-block:: none
+
+   /* foo bar */
+
+   fonction fonc-1({passage par valeur}param1)
+   début
+       si param1 <= 0 alors
+           b = 0
+       sinon
+           b = 1
+           a = param1
+           répéter
+               a = a - 1
+               b = b * 2
+           tantque a <> 0
+       fin si
+       retourner b
+   fin fonction
+
+will be highlighted as
+
+.. code-block:: fr-algorithm
+
+   /* foo bar */
+
+   fonction fonc-1({passage par valeur}param1)
+   début
+       si param1 <= 0 alors
+           b = 0
+       sinon
+           b = 1
+           a = param1
+           répéter
+               a = a - 1
+               b = b * 2
+           tantque a <> 0
+       fin si
+       retourner b
+   fin fonction
+
+
+Lexer Options
+=============
+
+There are no lexer options besides the `Pygments`_ standard lexer options.
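The symbol conversions described for FrPseudocode can be illustrated with a small stand-alone sketch. It is not the lexer's implementation: a Pygments lexer performs such conversions while emitting tokens, whereas this example only does a plain text substitution. The names ``REPLACEMENTS`` and ``replace_symbols`` are hypothetical, and the table holds only the three pairs named in the text; the real lexer also handles arrows.

```python
import re

# Replacement table limited to the pairs named in the documentation text.
# Hypothetical helper names; the real lexer works on tokens, not raw text.
REPLACEMENTS = {
    "<=": "≤",
    ">=": "≥",
    "<>": "≠",
}

# One alternation over all keys; longer keys are tried first so a longer
# operator is never shadowed by a shorter one sharing its prefix.
_PATTERN = re.compile(
    "|".join(re.escape(key) for key in sorted(REPLACEMENTS, key=len, reverse=True))
)


def replace_symbols(text: str) -> str:
    """Return *text* with ASCII comparison operators replaced by symbols."""
    return _PATTERN.sub(lambda match: REPLACEMENTS[match.group(0)], text)


print(replace_symbols("si param1 <= 0 alors"))  # si param1 ≤ 0 alors
print(replace_symbols("tantque a <> 0"))        # tantque a ≠ 0
```

Building a single compiled pattern keeps the substitution to one pass over the input, so a replaced symbol is never scanned again.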
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/docs/lexers.rst Mon May 11 01:31:12 2026 +0200
@@ -0,0 +1,14 @@
+.. -*- coding: utf-8; indent-tabs-mode: nil; -*-
+
+********
+ Lexers
+********
+
+.. toctree::
+
+   lexer-algpseudocode
+   lexer-frpseudocode
+
+The package contains the following lexers:
+
+.. include:: lexerlist.rst
