YAML in Python

From NovaOrdis Knowledge Base
Latest revision as of 02:06, 23 June 2023

Internal

Python Code Examples
YAML

PyYAML

https://pyyaml.org/
https://pyyaml.org/wiki/PyYAMLDocumentation
https://pypi.org/project/PyYAML/

PyYAML provides YAML serialization/deserialization in Python.

Installation

pip install pyyaml

requirements.txt:

pyyaml == 5.3.1

To install, see the "Installation" section from https://pyyaml.org/wiki/PyYAMLDocumentation.

Overview

Concepts

The PyYAML core model is centered on constructors, representers, and tags.

Constructor

https://matthewpburruss.com/post/yaml/#defining-pyyaml-constructors-going-from-yaml-to-python

A constructor is a function that takes a YAML node produced during deserialization and returns the corresponding Python object, such as a class instance.
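
As a minimal sketch of the idea, the example below registers a constructor for a custom tag. The Point class, the !Point tag, and the point_constructor function are hypothetical names chosen for illustration. The constructor is registered on SafeLoader so that yaml.safe_load() can use it.

```python
import yaml


class Point:
    """Hypothetical class, used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def point_constructor(loader, node):
    # The node tagged !Point is a mapping: turn it into a dict, then into a Point.
    fields = loader.construct_mapping(node, deep=True)
    return Point(**fields)


# Register the constructor on SafeLoader so that yaml.safe_load() recognizes !Point.
yaml.add_constructor('!Point', point_constructor, Loader=yaml.SafeLoader)

point = yaml.safe_load('!Point {x: 1, y: 2}')
print(type(point).__name__, point.x, point.y)
```

The registration is stored on the loader class, so it stays in effect for later safe_load() calls in the same process.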

Representer

A representer is a user-defined function that intercepts, during YAML serialization, the data object instances to be serialized, optionally processes them, and then calls the Dumper instance it receives as an argument to create the serialized representation of the data object as a Node instance. The representer returns that Node, which becomes part of the serialization result. The representer receives the Dumper instance as its first argument and the data object as its second.

This is an implementation of a representer that uses the literal block scalar representation for multi-line strings, and the default scalar representation (typically a plain flow scalar) for all other strings.

import yaml

def custom_scalar_representer(dumper: yaml.Dumper, data: str):
    # for multi-line strings, use the literal block scalar representation; use the default otherwise
    if data and '\n' in data:
        return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
    return dumper.represent_scalar('tag:yaml.org,2002:str', data)

In this case, tag:yaml.org,2002:str is the tag name for string nodes.

The representers are registered with add_representer(). Representers can be added for specific types (such as str or int):

yaml.add_representer(str, custom_scalar_representer)
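
To see the effect of the registration end to end, the following sketch dumps a small dictionary after registering the representer. The sample data is hypothetical, and the exact layout of the output may vary with the PyYAML version, so the example only checks that the literal block indicator is present and that the value survives a round trip.

```python
import yaml


def custom_scalar_representer(dumper, data):
    # Multi-line strings use the literal block style ('|'); others use the default style.
    if data and '\n' in data:
        return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
    return dumper.represent_scalar('tag:yaml.org,2002:str', data)


# Register the function for str on the default yaml.Dumper.
yaml.add_representer(str, custom_scalar_representer)

# Hypothetical sample data, for illustration only.
data = {'name': 'demo', 'script': 'echo one\necho two\n'}

yaml_string = yaml.dump(data)
print(yaml_string)

# The multi-line value is read back unchanged from the literal block scalar.
restored = yaml.safe_load(yaml_string)
```

Note that the registration is global to yaml.Dumper, which is why restoring the default representer can matter.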

Restoring the Default Representer

There may be situations when we want to restore the representer that was replaced by an add_representer() call. This can be achieved by directly accessing the representer storage yaml.representer.Representer.yaml_representers:

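[Figure: YAML Representers.png]

import yaml

# save the default str representer before replacing it
default_representer = yaml.representer.Representer.yaml_representers[str]

# replace
yaml.add_representer(str, custom_scalar_representer)

yaml_string = yaml.dump(data)  # rendered with the custom representer

# restore
yaml.add_representer(str, default_representer)

yaml_string = yaml.dump(data)  # rendered with the default representer again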
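
One possible way to make the replace/restore sequence harder to get wrong is to wrap it in a context manager, so the default is restored even if dump() raises. This is only a sketch: literal_block_representer and literal_block_strings are hypothetical names, and the sample data is made up for illustration.

```python
import contextlib

import yaml


def literal_block_representer(dumper, data):
    # Multi-line strings use the literal block style ('|'); others use the default style.
    if '\n' in data:
        return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
    return dumper.represent_scalar('tag:yaml.org,2002:str', data)


@contextlib.contextmanager
def literal_block_strings():
    # Hypothetical helper: swap in the custom str representer for the duration
    # of the with-block, then put the default one back.
    default_representer = yaml.representer.Representer.yaml_representers[str]
    yaml.add_representer(str, literal_block_representer)
    try:
        yield
    finally:
        yaml.add_representer(str, default_representer)


data = {'text': 'line one\nline two\n'}

with literal_block_strings():
    inside = yaml.dump(data)   # custom representer is active here

outside = yaml.dump(data)      # default representer is active again

print(inside)
print(outside)
```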

Tag

A tag labels a YAML node. In a document, a tag is written with the special character ! preceding the tag name. The tag tells PyYAML which constructor or representer to call.
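
The sketch below illustrates how a tag selects the constructor. It relies on the standard shorthand !!str, which expands to tag:yaml.org,2002:str; the key name value is arbitrary. yaml.compose() is used only to inspect the tag carried by the node before construction.

```python
import yaml

# Without an explicit tag, the plain scalar 123 is resolved as an integer.
implicit = yaml.safe_load('value: 123')

# With the explicit !!str tag, the string constructor is called instead.
explicit = yaml.safe_load('value: !!str 123')

# compose() returns the node graph, where the tag of each node is visible.
root = yaml.compose('value: !!str 123', Loader=yaml.SafeLoader)
key_node, value_node = root.value[0]

print(implicit, explicit, value_node.tag)
```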

Deserialize YAML

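import yaml

with open('some-file.yaml', 'rt') as f:
    content = f.read()

# safe_load() constructs only standard YAML types; prefer it for untrusted input
data = yaml.safe_load(content)

# alternative: the full Loader can construct arbitrary Python objects; use it only with trusted input
data = yaml.load(content, Loader=yaml.Loader)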
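
To illustrate the difference between the two calls, the following sketch loads a document that uses a Python-specific tag. The document text is made up for illustration. The full Loader builds a Python tuple, while safe_load() refuses the tag.

```python
import yaml

# A document that uses a Python-specific tag (hypothetical content).
text = 'pair: !!python/tuple [1, 2]'

# The full Loader knows how to construct Python-specific objects.
full = yaml.load(text, Loader=yaml.Loader)

# safe_load() has no constructor for this tag and raises an error.
try:
    yaml.safe_load(text)
    rejected = False
except yaml.constructor.ConstructorError:
    rejected = True

print(full, rejected)
```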

Serialize YAML

Serializing YAML with PyYAML
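
As a brief sketch of serialization with yaml.dump(), the example below renders the same data in block style and in flow style by setting default_flow_style. The sample data is hypothetical.

```python
import yaml

# Hypothetical sample data, for illustration only.
data = {'name': 'demo', 'ports': [80, 443]}

# Block style: one key per line, sequences as dash-prefixed items.
block_text = yaml.dump(data, default_flow_style=False)

# Flow style: the inline, JSON-like form.
flow_text = yaml.dump(data, default_flow_style=True)

print(block_text)
print(flow_text)
```

Both forms load back to the same data structure; the choice only affects how the document reads.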

Safely Navigate a Complex Data Structure

Suggestions on how to safely recursively navigate a complex data structure:

Safely Navigate a Complex Data Structure