view docs/introduction.rst @ 682:7c41a52253bf

Docs
author Franz Glasner <fzglas.hg@dom66.de>
date Sat, 10 Jun 2023 04:08:40 +0200
parents 8647eb175bb9
children 3062b63325c4

.. -*- coding: utf-8; indent-tabs-mode: nil; -*-

.. _introduction:

Introduction
============

.. contents::
   :local:

The configurations can be read from different types of files:

- :ref:`YAML files <yaml-files>`
- :ref:`JSON files <json-files>`
- :ref:`INI files <ini-files>`
- :ref:`TOML files <toml-files>`
- :ref:`executable Python scripts <executable-python-scripts>`


.. _yaml-files:

YAML Files
----------

Requires the :mod:`yaml` package (https://github.com/yaml/pyyaml),
which can be installed with e.g. ``pip install pyyaml``.

.. note:: All strings are returned as Unicode text strings.

.. note:: The root object must be a *mapping* and therefore decodes
          into a Python :class:`dict`-like object. This is checked by
          the implementation.

An example is:

.. literalinclude:: ../tests/data/conf10.yml
   :language: yaml


.. _json-files:

JSON Files
----------

Read the JSON file with the help of Python's native :mod:`json` package.

.. note:: All strings are returned as Unicode text strings.

.. note:: The root object must be an *object* and therefore decodes
          into a Python :class:`dict`-like object. This is checked by
          the implementation.

An example is:

.. literalinclude:: ../tests/data/conf10.json
   :language: js

For comments in JSON files see section :ref:`comments`.


.. _ini-files:

INI Files
---------

The file is read and all sections named in the parameter `extract` are
flattened into the resulting dictionary. By default the section named
``config`` is used as the root section.

Normally all values are returned as Unicode text strings.
But values can be annotated and therefore interpreted as other types:

  ``:int:``
      The value is handled in the same way as a Python :class:`int`
      literal

  ``:float:``
      The value is interpreted as :class:`float`

  ``:bool:``
      The resulting value is a :class:`bool` where

        ``1``, ``true``, ``yes``, ``on``
           yield a Python ``True``

        ``0``, ``false``, ``no``, ``off``
           yield a Python ``False``

      The evaluation is done *case-insensitively*.

.. note:: All strings are returned as Unicode text strings.

.. note:: Contrary to the behaviour of the standard Python :mod:`configparser`
          module the INI file reader is *case-sensitive*.
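
How such annotations could be interpreted is sketched below. This is
not the library's implementation: the helper name ``convert_annotated``
is hypothetical, and the sketch assumes that the annotation is written
as a prefix of the value (e.g. ``:int:0x20``).

.. code-block:: python

    def convert_annotated(raw):
        """Convert an (optionally annotated) INI value into a Python object."""
        truthy = {"1", "true", "yes", "on"}
        falsy = {"0", "false", "no", "off"}
        if raw.startswith(":int:"):
            # Base 0 lets int() accept the same forms as an integer literal
            return int(raw[len(":int:"):], 0)
        if raw.startswith(":float:"):
            return float(raw[len(":float:"):])
        if raw.startswith(":bool:"):
            # The comparison is done case-insensitively
            word = raw[len(":bool:"):].strip().lower()
            if word in truthy:
                return True
            if word in falsy:
                return False
            raise ValueError("not a boolean value: %r" % (raw,))
        # No annotation: the value stays a text string
        return raw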

The example INI style configuration below yields an equivalent
configuration to the YAML configuration above:

.. literalinclude:: ../tests/data/conf10.ini
   :language: ini

As this example shows, value interpolation within an INI file works
as in Python's standard :mod:`configparser` module.

This example also illustrates how INI sections are used to build a
tree-ish configuration dictionary.


.. _toml-files:

TOML Files
----------

Read the TOML file with the help of the pure Python :mod:`toml`
package (https://github.com/uiri/toml) (e.g. ``pip install toml``).

All TOML features map seamlessly to "ConfigMix".

The example TOML style configuration below yields an equivalent
configuration to the YAML configuration above:


.. literalinclude:: ../tests/data/conf10.toml
   :language: ini


.. _executable-python-scripts:

Executable Python Scripts
-------------------------

What will be exported:

1. If loading is done with the `extract` parameter only the given keys are
   extracted from the script.

2. Otherwise it is checked whether the script defines an ``__all__``
   sequence. If there is one, its contents are the keys to be
   extracted.

3. If there is no ``__all__`` object, all names not starting with an
   underscore ``_`` are extracted.

This is analogous to how Python modules behave when they are imported
with ``from module import *``.

.. note:: The Python configuration files are evaluated with ``exec`` and not
          imported.
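
The three export rules above can be sketched with plain ``exec``. This
is only an illustration under the assumption that the script is given
as source text; the function name ``exported_names`` is hypothetical
and not part of "ConfigMix".

.. code-block:: python

    def exported_names(source, extract=None):
        """Execute `source` and select the names to export."""
        namespace = {}
        exec(source, namespace)
        if extract is not None:
            # Rule 1: only the explicitly requested keys
            keys = list(extract)
        elif "__all__" in namespace:
            # Rule 2: the script names its exports itself
            keys = list(namespace["__all__"])
        else:
            # Rule 3: every name that does not start with an underscore
            keys = [name for name in namespace if not name.startswith("_")]
        return {name: namespace[name] for name in keys}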

The example configuration by Python script below yields an equivalent
configuration to the YAML configuration above:

.. literalinclude:: ../tests/data/conf10.py
   :language: python


.. _loading-and-merging:

Loading and Merging
-------------------

Basic usage of the API is shown in this example::

    import configmix

    #
    # Note: With conf10 merging is rather pointless because the three
    # files are really the same configuration. But it does no harm
    # here either.
    #
    config = configmix.load("conf10.yml", "conf10.ini", "conf10.py")

    # Get a -- possibly interpolated -- configuration variable's value
    value1 = config.getvar_s("key1")

    # Get a -- possibly interpolated -- variable from within the tree
    value2 = config.getvar_s("tree1.tree2.key4")


By default the filenames of configuration files must have one of the
following extensions (case sensitivity depends on your OS):

  ``.ini``
    for INI configuration files

  ``.json``
    for JSON configuration files

  ``.py``
    for Python configuration files

  ``.toml``
    for TOML configuration files

  ``.yml`` or ``.yaml``
    for YAML configuration files


.. _getting-values:

Getting configuration variables
-------------------------------

Get a -- possibly interpolated -- configuration variable's value with ::

    value1 = config.getvar_s("key1")
    value2 = config.getvar_s("key1.subkey2")

or equivalently with ::

    value1 = config.getvarl_s("key1")
    value2 = config.getvarl_s("key1", "subkey2")

Get a raw configuration variable's value with ::

    value1_raw = config.getvar("key1")
    value2_raw = config.getvar("key1.subkey2")

or equivalently with ::

    value1_raw = config.getvarl("key1")
    value2_raw = config.getvarl("key1", "subkey2")

Because the configuration is not just a flat list but a tree of
key-value pairs, nested configuration values can be fetched with two
basic kinds of access methods:

  :py:meth:`.Configuration.getvar` and :py:meth:`.Configuration.getvar_s`

    Use a single key variable where the individual level keys are joined
    using a dot (``.``).

  :py:meth:`.Configuration.getvarl` and :py:meth:`.Configuration.getvarl_s`

    Use just positional Python arguments for each level key
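
The difference between the two addressing styles can be illustrated on
a plain nested :class:`dict`. The helpers ``lookup`` and ``lookup_l``
are hypothetical stand-ins; the sketch ignores namespaces, quoting and
interpolation, which the real methods handle.

.. code-block:: python

    def lookup_l(tree, *path):
        """Walk the tree using one positional argument per level."""
        node = tree
        for key in path:
            node = node[key]
        return node

    def lookup(tree, dotted):
        """Walk the tree using a single key joined with dots."""
        return lookup_l(tree, *dotted.split("."))

    tree = {"key1": {"subkey2": "some value"}}
    # Both styles address the same node
    assert lookup(tree, "key1.subkey2") == lookup_l(tree, "key1", "subkey2")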

There are also variants of the basic access methods that coerce
returned values into :py:class:`int` or :py:class:`bool` types
(:py:meth:`.Configuration.getintvar_s`,
:py:meth:`.Configuration.getboolvar_s`).

And with :py:meth:`.Configuration.getfirstvar`,
:py:meth:`.Configuration.getfirstvar_s`,
:py:meth:`.Configuration.getfirstintvar_s`,
:py:meth:`.Configuration.getfirstboolvar_s` and
:py:meth:`.Configuration.getfirstfloatvar_s` there exist variants that
accept a *list* of possible variable names and return the first one
that is found.

And again --- with :py:meth:`.Configuration.getfirstvarl`,
:py:meth:`.Configuration.getfirstvarl_s`,
:py:meth:`.Configuration.getfirstintvarl_s`,
:py:meth:`.Configuration.getfirstboolvarl_s` and
:py:meth:`.Configuration.getfirstfloatvarl_s` there exist variants
that accept a *list* of lists or tuples or dicts that describe
possible variable names and return the first one that is found.

For example -- these methods for retrieving the first found variables
can be used and are equivalent (Note that a caller that wants to use
variables from a non-default namespace must use a sequence of dicts
here)::

  value1 = config.getfirstvar_s("key1.subkey2", "key3.subkey4", default=None, namespace=None)
  value2 = config.getfirstvarl_s(*[["key1", "subkey2"], ["key3", "subkey4"]], default=None)
  value3 = config.getfirstvarl_s(*(("key1", "subkey2"), ("key3", "subkey4")), default=None)
  value4 = config.getfirstvarl_s(*[{"namespace": None, "path": ["key1", "subkey2"]}, {"namespace": None, "path": ("key3", "subkey4")}], default=None)

Looking at the example in chapter :ref:`yaml-files` -- when calling
``config.getvar_s("tree1.tree2.key4")`` you will get the value
``get this as `tree1.tree2.key4'``.

Alternatively ``config.getvarl_s("tree1", "tree2", "key4")`` can be called
with the very same result.

All four methods also perform direct :ref:`variable-interpolation` and
handle :ref:`variable-namespaces` -- yet in different ways.
Filtering is not supported.
So -- the variable name arguments of :py:meth:`.Configuration.getvar`
and :py:meth:`.Configuration.getvar_s` are of the form
``[namespace:]variable`` where for :py:meth:`.Configuration.getvarl`
and :py:meth:`.Configuration.getvarl_s` the namespace is given as
optional keyword parameter `namespace`.

.. note:: Special characters within namespace, key and filter names
          *must* be quoted (see :ref:`quoting`) when using
          :py:meth:`~.Configuration.getvar` or
          :py:meth:`~.Configuration.getvar_s` to retrieve variables.

          With :py:meth:`~.Configuration.getvarl` or
          :py:meth:`~.Configuration.getvarl_s` quoting is neither needed
          nor supported.


Direct Access to List Items
---------------------------

Direct access to list items is possible:

- Directly use the integer list index in :py:meth:`~.Configuration.getvarl`
  and its friends.
- Encode the index number to string format using the ``~INDEX~`` syntax and
  use :py:meth:`~.Configuration.getvar` and its friends.

  This syntax is also supported for variable interpolations.

Negative indexes are supported with Python semantics.

Examples:

- ``config.getvarl_s("mylist", 0)`` or ``config.getvar_s("mylist.~0~")``
- ``config.getvarl_s("mylist", -1)`` or ``config.getvar_s("mylist.~-1~")``

This also works when using `Jailed Configurations`_.

Use `Quoting`_ for the ``~`` character (``%x7e``) when the ``~INDEX~``
syntax should not be interpreted as list index but as key string.
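
How the ``~INDEX~`` syntax relates to integer indexes can be sketched
as follows. This is an assumption-laden illustration on plain Python
containers, not the library's parser: ``split_path`` and ``lookup_l``
are hypothetical helpers, and quoting is ignored.

.. code-block:: python

    import re

    # A path segment such as "~0~" or "~-1~" denotes a list index
    _INDEX = re.compile(r"^~(-?[0-9]+)~$")

    def split_path(dotted):
        """Split a dotted key and turn ``~INDEX~`` segments into integers."""
        parts = []
        for segment in dotted.split("."):
            match = _INDEX.match(segment)
            parts.append(int(match.group(1)) if match else segment)
        return tuple(parts)

    def lookup_l(tree, *path):
        """Walk nested dicts and lists; negative indexes work as in Python."""
        node = tree
        for key in path:
            node = node[key]
        return node

    config = {"mylist": ["first", "second", "last"]}
    assert lookup_l(config, *split_path("mylist.~-1~")) == lookup_l(config, "mylist", -1)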


.. _merging-deletions:

Deletions
---------

By using the special value ``{{::DEL::}}`` the corresponding key-value
pair is deleted when merging is done.
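
The effect of the deletion marker can be sketched with a simple
recursive merge of nested dictionaries. The merge semantics shown here
(later values override earlier ones, nested dictionaries are merged
recursively) are an assumption made for illustration, and the function
name ``merge`` is hypothetical.

.. code-block:: python

    DEL_VALUE = "{{::DEL::}}"

    def merge(base, overlay):
        """Return a new dict with `overlay` merged into `base`."""
        result = dict(base)
        for key, value in overlay.items():
            if value == DEL_VALUE:
                # The marker removes the key-value pair from the result
                result.pop(key, None)
            elif isinstance(value, dict) and isinstance(result.get(key), dict):
                result[key] = merge(result[key], value)
            else:
                result[key] = value
        return result

    base = {"key1": 1, "tree1": {"key2": 2, "key3": 3}}
    overlay = {"key1": DEL_VALUE, "tree1": {"key3": DEL_VALUE, "key4": 4}}
    merged = merge(base, overlay)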


.. _comments:

Comments
--------

By default all keys beginning with ``__comment`` or ``__doc`` are
filtered out and not given to the application. This allows comments in
JSON files -- but is not restricted to JSON files only.

For all types of configuration files their respective standard comments
are allowed too.
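
A minimal sketch of this filtering on a decoded JSON document follows.
That the filtering applies to nested levels as well is an assumption of
this sketch, and ``strip_comments`` is a hypothetical helper, not a
"ConfigMix" function.

.. code-block:: python

    import json

    COMMENT_PREFIXES = ("__comment", "__doc")

    def strip_comments(obj):
        """Drop all keys that begin with one of the comment prefixes."""
        if isinstance(obj, dict):
            return {key: strip_comments(value)
                    for key, value in obj.items()
                    if not key.startswith(COMMENT_PREFIXES)}
        if isinstance(obj, list):
            return [strip_comments(item) for item in obj]
        return obj

    text = """
    {
      "__comment1": "JSON itself has no comment syntax",
      "key1": 1,
      "tree1": {"__doc": "describes tree1", "key2": 2}
    }
    """
    cleaned = strip_comments(json.loads(text))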


.. _variable-namespaces:

Variable Namespaces
-------------------

Currently there are 6 namespaces:

1. The unnamed namespace (which is also default).

   All the configuration variables are part of this namespace.

   .. seealso:: :ref:`quoting`

2. The namespace ``ref`` to be used for configuration references.

   This is a namespace that is handled specially within "ConfigMix".

   Filters are **not** supported.

   Think of them as symbolic links.

3. The namespace ``OS``

   Available functions:

     ``cwd``
         Contains the current working directory of the process

     ``node``
         Contains the current node's computername (or whatever
         :py:func:`platform.node` returns)

4. The namespace ``ENV``

   This namespace contains all the environment variables as they are
   available from :py:data:`os.environ`.

5. The namespace ``PY``

   Contains selected values from the running Python:

     ``version``
         The return value of :py:func:`platform.python_version`

     ``version_maj_min``
         Just the major and minor version of the running Python
         (``.`` separated)

     ``version_maj``
         Just the major version of the running Python

     ``implementation``
         The return value of :py:func:`platform.python_implementation`

6. The namespace ``AWS``

   Contains some metadata for AWS instances when running from within
   AWS:

     ``metadata.instance-id``

     ``metadata.placement.region``

     ``metadata.placement.availability-zone``

     ``dynamic.instance-identity.region``
       and all other properties of the instance-identity document
       (e.g. ``instanceId``, ``instanceType``, ``imageId``, ``pendingTime``,
       ``architecture``, ``availabilityZone``, ``privateIp``, ``version``
       et al.).


Examples
~~~~~~~~

Both ::

     config.getvar("OS:cwd")

and ::

     config.getvarl("cwd", namespace="OS")

yield the current working directory -- just as :py:func:`os.getcwd` does.
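
For orientation, the following sketch collects standard-library values
that correspond to some of the namespace variables described above. How
"ConfigMix" derives ``version_maj_min`` and ``version_maj`` internally
is not stated here; splitting the version string is an assumption, and
``namespace_values`` is a hypothetical helper.

.. code-block:: python

    import os
    import platform

    def namespace_values():
        """Collect standard-library counterparts of some namespace variables."""
        version = platform.python_version()
        major, minor = version.split(".")[:2]
        return {
            "OS:cwd": os.getcwd(),
            "OS:node": platform.node(),
            "ENV:PATH": os.environ.get("PATH"),
            "PY:version": version,
            "PY:version_maj_min": major + "." + minor,
            "PY:version_maj": major,
            "PY:implementation": platform.python_implementation(),
        }

    for name, value in sorted(namespace_values().items()):
        print(name, "=", value)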


.. _variable-interpolation:

Variable Interpolation
----------------------

Configuration variable values that are read with
:py:meth:`.Configuration.getvar_s` or :py:meth:`.Configuration.getvarl_s`
are subject to variable interpolation.
The general syntactic pattern for this is::

    {{[namespace:]variable[|filter[|filter...]]}}

I.e.: between double curly braces an optional `namespace` name followed by
a colon ``:``, the `variable` and then zero or more filters, each one
introduced by a pipe symbol ``|``.

Variables are expanded *late* at runtime, exactly when calling
:py:meth:`.Configuration.getvar_s`,
:py:meth:`.Configuration.getvarl_s`,
:py:meth:`.Configuration.substitute_variables_in_obj` or
:py:meth:`.Configuration.expand_variable`.

.. note:: Special characters within namespace, key and filter names
          *must* be quoted (see :ref:`quoting`) when using variable
          interpolation syntax.
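
The syntactic pattern can be taken apart with a few lines of Python.
The following is a minimal sketch, not the library's parser: the
function name ``parse_reference`` is hypothetical, and quoting is
ignored.

.. code-block:: python

    import re

    _REFERENCE = re.compile(r"\{\{(.*?)\}\}")

    def parse_reference(text):
        """Split ``{{[namespace:]variable[|filter...]}}`` into its parts."""
        match = _REFERENCE.fullmatch(text)
        if match is None:
            raise ValueError("not a variable reference: %r" % (text,))
        # Everything after the first pipe symbol is a filter name
        head, *filters = match.group(1).split("|")
        namespace, separator, variable = head.partition(":")
        if not separator:
            # No colon: the variable lives in the unnamed default namespace
            namespace, variable = None, head
        return namespace, variable, filters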


Filter functions
~~~~~~~~~~~~~~~~

Interpolated values can be processed through a series of filter functions::

    {{my.variable|filter1|filter2}}

Available filter functions are:

  ``urlquote``

  ``urlquote_plus``

  ``saslprep``

  ``normpath``

  ``abspath``

  ``posixpath``

  ``lower``

  ``upper``

Also available are the special filter functions ``None`` and ``Empty``.
They are useful in variable interpolation context because they
suppress possible lookup errors (i.e. :py:exc:`KeyError`) and instead
return :py:obj:`None` or an empty string, respectively.
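
How a chain of filters is applied from left to right can be sketched
with standard-library functions. The mapping of filter names to these
functions is an assumption made for illustration, not the library's
actual table; ``saslprep`` and ``posixpath`` are left out, and the
helper ``apply_filters`` is hypothetical.

.. code-block:: python

    import os.path
    from urllib.parse import quote, quote_plus

    # Assumed standard-library counterparts of some filter names
    FILTERS = {
        "urlquote": quote,
        "urlquote_plus": quote_plus,
        "normpath": os.path.normpath,
        "abspath": os.path.abspath,
        "lower": str.lower,
        "upper": str.upper,
    }

    def apply_filters(value, names):
        """Pass `value` through the named filters, one after the other."""
        for name in names:
            value = FILTERS[name](value)
        return value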


Examples
~~~~~~~~

::

    {{OS:cwd|posixpath}}

expands to the current working directory as POSIX path: on Windows all
backslashes are replaced by forward slashes.

::

    {{ENV:PATH}}

expands to the current search path from the process environment.

::

    {{PY:version}}

expands to the current running Python version (e.g. ``3.6.4``).

::

    {{PY:implementation|upper}}

expands to something like ``CPYTHON`` when using the standard Python
interpreter written in C.


Configuration tree references
-----------------------------

With ``{{ref:#my.other.key}}``

- think of it as a sort of a symbolic link to other parts of the
  configuration tree
- by employing the special namespace ``ref``
- cannot currently be quoted in variable interpolation syntax
- No special handling when merging is done -- merging is agnostic of
  tree references
- Keys within :meth:`.Configuration.getvar_s`,
  :py:meth:`.Configuration.getvar`, :py:meth:`.Configuration.getvarl`
  and :py:meth:`.Configuration.getvarl_s` are handled
- in :py:meth:`.Configuration.getvar` only, when it is the directly
  referenced value
- recursive expansion in :py:meth:`.Configuration.getvar_s` and
  :py:meth:`.Configuration.getvarl_s`:
  beware of recursive (direct or indirect) tree references


.. _quoting:

Quoting
-------

When using :py:meth:`.Configuration.getvar` and
:py:meth:`.Configuration.getvar_s` and when retrieving values in the
default namespace the namespace separator ``:`` or the hierarchy
separator ``.`` are characters with a special meaning. When using
:ref:`variable interpolation <variable-interpolation>` the filter
separator ``|`` is also special. To use them in key names they must be
quoted.

Quoting is done with a variant of the well-known percent-encoding in
URIs (:rfc:`3986`).

A percent-encoded character consists of the percent character ``%``,
followed by one of the characters ``x``, ``u`` or ``U``, followed by
the two, four or eight hexadecimal digits of the unicode codepoint
value of the character that is to be quoted. ``x`` must be followed by
two hex digits, ``u`` by four and ``U`` by eight.

Example:

  The character ``.`` with the Unicode (and ASCII) value 46 (hex 0x2e)
  can be encoded as ``%x2e`` or ``%u002e`` or ``%U0000002e``.
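
The encoding described above can be sketched in a few lines. The
helper names ``quote_char`` and ``unquote`` are hypothetical; the
library's own quoting functions may differ in name and behaviour.

.. code-block:: python

    import re

    # %xNN, %uNNNN or %UNNNNNNNN with hexadecimal digits
    _QUOTED = re.compile(
        r"%(?:x([0-9A-Fa-f]{2})|u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8}))")

    def unquote(text):
        """Replace every percent-encoded character by the character itself."""
        def replace(match):
            digits = match.group(1) or match.group(2) or match.group(3)
            return chr(int(digits, 16))
        return _QUOTED.sub(replace, text)

    def quote_char(char):
        """Encode one character using the shortest of the three forms."""
        code = ord(char)
        if code <= 0xFF:
            return "%%x%02x" % code
        if code <= 0xFFFF:
            return "%%u%04x" % code
        return "%%U%08x" % code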

.. note:: Filters need no quoting, and quoting within filters is *not*
          supported.

.. note:: Quoting the ``ref`` namespace name does not work currently when
          used in variable interpolation syntax.


.. _jailed-configuration:

Jailed Configurations
---------------------

With :meth:`configmix.config.Configuration.jailed` you get a `jailed`
(or `restricted`) configuration from a "normal" configuration.

Restriction is two-fold:

- The access to configuration variables in `config` is restricted
  to the configuration sub-tree that is configured in `path`.

- Not all access-methods of :class:`Configuration` are implemented
  yet.

This is somewhat analogous to a `chroot` environment for filesystems.

.. note:: The word "jail" is shamelessly stolen from FreeBSD jails.

Usage example::

    import configmix

    config = configmix.load("conf10.py")
    assert not config.is_jail
    value = config.getvar_s("tree1.tree2.key4")

    jailed_config1 = config.jailed(rootpath="tree1.tree2")
    assert jailed_config1.is_jail
    assert jailed_config1.base is config
    jvalue1 = jailed_config1.getvar_s("key4")

    jailed_config2 = config.jailed(root=("tree1", "tree2"))
    assert jailed_config2.is_jail
    assert jailed_config2.base is config
    jvalue2 = jailed_config2.getvarl_s("key4")

    assert value == jvalue1 == jvalue2 == "get this as `tree1.tree2.key4'"

`jvalue1` and `jvalue2` (and `value`) yield the very same value
``get this as `tree1.tree2.key4'`` from the configuration.

All access methods in a jailed configuration automatically prepend the
given `root path` in order to get the effective key into the base
configuration.
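
The idea of prepending the root path can be shown with a small
stand-in class that wraps a nested :class:`dict`. ``JailSketch`` is
hypothetical and much simpler than the real jailed configuration: it
knows neither interpolation nor namespaces.

.. code-block:: python

    class JailSketch:
        """Restrict access to the sub-tree below `root` of a nested dict."""

        def __init__(self, base, *root):
            self.base = base    # stands in for the unjailed configuration
            self.root = root    # tuple of keys leading to the jail's root

        def getvarl(self, *path):
            # The effective key is the root path followed by the given path
            node = self.base
            for key in self.root + path:
                node = node[key]
            return node

        def jailed(self, *subroot):
            # A sub-jail keeps the same unjailed base and extends the root
            return JailSketch(self.base, *(self.root + subroot))

    base = {"tree1": {"tree2": {"key4": "value4"}}}
    jail = JailSketch(base, "tree1")
    subjail = jail.jailed("tree2")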

It is possible to get a jailed configuration from an already jailed
configuration. This sub-jail inherits the unjailed base configuration from
the jailed configuration by default.

  ::

    import configmix

    config = configmix.load("conf10.py")
    assert not config.is_jail
    value = config.getvar_s("tree1.tree2.key4")

    jailed_config1 = config.jailed(rootpath="tree1")
    assert jailed_config1.is_jail
    assert jailed_config1.base is config

    jailed_config2 = jailed_config1.jailed(rootpath="tree2")
    assert jailed_config2.is_jail
    assert jailed_config2.base is config
    jvalue2 = jailed_config2.getvarl_s("key4")

    assert value == jvalue2 == "get this as `tree1.tree2.key4'"

.. note:: A jailed configuration holds a strong reference to the unjailed
          base configuration.

.. note:: If a jail's root path points to a location with a variable
          substitution the jail does not work: it is not possible to
          expand the substitution.

          Using ``ref`` namespaces instead works: think of
          symbolic links.


Custom filename extensions and custom loaders
---------------------------------------------

If you want to have custom configuration file extensions and/or custom loaders
for custom configuration files you have various possibilities:

  Associate an additional new extension (e.g. ".conf") with an
  existing configuration file style (e.g. YAML)::

    configmix.set_assoc("*.conf", configmix.get_assoc("*.yml"))

  Allow only files with extension ".cfg" in INI-style -- using the default
  loader for INI-files::

    configmix.clear_assoc()
    configmix.set_assoc("*.cfg", configmix.get_default_assoc("*.ini"))

  Only a new configuration file style::

    def my_custom_loader(filename):
        ...
        return some_dict_alike

    configmix.mode_loaders["myconfmode"] = my_custom_loader
    configmix.clear_assoc()
    configmix.set_assoc("*.my.configuration", "myconfmode")

  If :py:func:`~configmix.clear_assoc` is not called, the new
  configuration file style is installed in addition to the existing
  associations.

  To select the loader not by extension but by an Emacs-compatible mode
  declaration (e.g. ``mode: yaml``) in the first two lines of a file use::

    configmix.set_assoc("*", configmix.try_determine_filemode)
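
As an illustration of what a loader like the ``my_custom_loader``
placeholder above could look like, the following sketch reads a simple
``KEY = VALUE`` text format. The file format and the function name
``load_key_value_file`` are invented for this example; the only
property taken from the text above is that a loader receives a
filename and returns a dict-like object.

.. code-block:: python

    def load_key_value_file(filename):
        """Read ``KEY = VALUE`` lines into a dict; ``#`` starts a comment line."""
        result = {}
        with open(filename, "r", encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                key, separator, value = line.partition("=")
                if not separator:
                    raise ValueError("missing '=' in line: %r" % (line,))
                result[key.strip()] = value.strip()
        return result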