YAML in Python
Latest revision as of 02:06, 23 June 2023
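Internal
* Python Code Examples
* YAML
PyYAML
* https://pyyaml.org/
* https://pyyaml.org/wiki/PyYAMLDocumentation
* https://pypi.org/project/PyYAML/
PyYAML provides YAML serialization/deserialization in Python.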
Installation
pip install pyyaml
requirements.txt:
pyyaml == 5.3.1
For installation details, see the "Installation" section of https://pyyaml.org/wiki/PyYAMLDocumentation.
Overview
Concepts
The PyYAML core model is centered on constructors, representers, and tags.
Constructor
A constructor takes a serialized YAML node and returns a class instance. See also https://matthewpburruss.com/post/yaml/#defining-pyyaml-constructors-going-from-yaml-to-python
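As a minimal sketch of the idea, the example below registers a constructor for a custom tag and loads one tagged node. The Point class, the !point tag and the PointLoader subclass are assumptions made for illustration; they are not part of PyYAML or of the original page.

```python
import yaml


class Point:
    """Hypothetical class that a tagged YAML node is turned into."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def point_constructor(loader, node):
    # the node is a mapping such as {x: 1, y: 2}; turn it into a dict first
    values = loader.construct_mapping(node)
    return Point(**values)


class PointLoader(yaml.SafeLoader):
    """Separate loader class, so yaml.SafeLoader itself is left unchanged."""


# associate the (hypothetical) !point tag with the constructor
PointLoader.add_constructor('!point', point_constructor)

point = yaml.load('!point {x: 1, y: 2}', Loader=PointLoader)
print(type(point).__name__, point.x, point.y)
```

Registering on a subclass rather than on yaml.SafeLoader keeps the custom tag local to the code that asks for PointLoader explicitly.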
Representer
A representer is a user-defined function that intercepts data object instances during YAML serialization, optionally processes them, and then calls the Dumper instance it receives as an argument to create the serialized representation of the object as a Node instance. The representer returns that Node, which contributes to the serialization result. The representer receives the Dumper instance as its first argument and the data object as its second.
This is an implementation of a representer that uses the literal block scalar representation for multi-line strings, and the default plain flow scalar representation for everything else.
import yaml
from yaml import Dumper

def custom_scalar_representer(dumper: Dumper, data: str):
    # for multi-line strings, use the literal block scalar style ('|');
    # otherwise use the default style
    if data and '\n' in data:
        return dumper.represent_scalar('tag:yaml.org,2002:str', data, '|')
    else:
        return dumper.represent_scalar('tag:yaml.org,2002:str', data)
In this case, tag:yaml.org,2002:str is the tag name for string nodes.
Representers are registered with add_representer(). They can be added for specific types (such as str or int):
yaml.add_representer(str, custom_scalar_representer)
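To see what registration changes in the output, the sketch below registers a representer equivalent to the one above and dumps a small dictionary. It registers on a Dumper subclass instead of the global default; the BlockDumper name and the sample data are assumptions made for illustration.

```python
import yaml


class BlockDumper(yaml.Dumper):
    """Hypothetical Dumper subclass, used so yaml.Dumper itself is not modified."""


def literal_str_representer(dumper, data):
    # multi-line strings get the literal block style, everything else the default
    style = '|' if '\n' in data else None
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style)


# register the representer for str on the subclass only
BlockDumper.add_representer(str, literal_str_representer)

data = {'name': 'example', 'text': 'line one\nline two\n'}
yaml_string = yaml.dump(data, Dumper=BlockDumper)
print(yaml_string)
```

The multi-line value should be emitted as a literal block introduced by |, while the single-line value keeps the default style.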
Restoring the Default Representer
There may be situations when we want to restore the representer that was replaced by an add_representer() call. This can be achieved by directly accessing the representer storage yaml.representer.Representer.yaml_representers:
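import yaml

# look up the default str representer in the representer storage
default_representer = yaml.representer.Representer.yaml_representers[str]
# replace the default with the custom representer
yaml.add_representer(str, custom_scalar_representer)
yaml_string = yaml.dump(data)
# restore the default representer
yaml.add_representer(str, default_representer)
yaml_string = yaml.dump(data)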
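The replace-and-restore sequence can also be wrapped in a context manager, so the previous representer is put back even if serialization raises. This is only a sketch: temporary_representer is a hypothetical helper, not a PyYAML function, and it assumes the default yaml.Dumper is the class being modified.

```python
from contextlib import contextmanager

import yaml


def literal_str_representer(dumper, data):
    # literal block style for multi-line strings, default style otherwise
    style = '|' if '\n' in data else None
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style)


@contextmanager
def temporary_representer(data_type, representer):
    """Hypothetical helper: register a representer, then put the old one back."""
    previous = yaml.Dumper.yaml_representers.get(data_type)
    yaml.add_representer(data_type, representer)
    try:
        yield
    finally:
        if previous is None:
            # there was no representer for this exact type before
            yaml.Dumper.yaml_representers.pop(data_type, None)
        else:
            yaml.add_representer(data_type, previous)


data = {'text': 'line one\nline two\n'}
with temporary_representer(str, literal_str_representer):
    inside = yaml.dump(data)   # custom representer is active here
outside = yaml.dump(data)      # previous representer is active again
```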
Tag
A tag labels a YAML node. It is written with the special character ! preceding the tag name.
The tag tells PyYAML which constructor or representer to call.
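A short sketch of how an explicit tag selects the constructor: in the document below, !!str and !!float are the YAML shorthand for tag:yaml.org,2002:str and tag:yaml.org,2002:float, and they override the type PyYAML would otherwise infer. The key names are made up for the example.

```python
import yaml

doc = """
plain: 123
as_text: !!str 123
as_float: !!float 1
"""

loaded = yaml.safe_load(doc)
# 'plain' is resolved implicitly; the other two follow their explicit tags
print(loaded)
```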
Deserialize YAML
import yaml

with open('some-file.yaml', 'rt') as f:
    content = f.read()
data = yaml.safe_load(content)
# alternative, for trusted input only: yaml.Loader can construct arbitrary Python objects
data = yaml.load(content, Loader=yaml.Loader)
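The difference between the two calls can be sketched with an in-memory document that uses a Python-specific tag. safe_load() only builds standard YAML types and is expected to reject the tag, while yaml.Loader accepts it, which is why yaml.Loader is best reserved for trusted input. The sample document is an assumption made for illustration.

```python
import yaml

content = "pair: !!python/tuple [1, 2]"

# safe_load() has no constructor for Python-specific tags
try:
    yaml.safe_load(content)
    safe_failed = False
except yaml.YAMLError:
    safe_failed = True

# yaml.Loader builds the Python tuple
trusted = yaml.load(content, Loader=yaml.Loader)
print(safe_failed, trusted)
```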
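Serialize YAML
See Serializing YAML with PyYAML.
Safely Navigate a Complex Data Structure
Suggestions on how to safely recursively navigate a complex data structure are in Python Safely Navigate a Complex Data Structure.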
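One possible approach, sketched here under the assumption that the loaded data consists of nested dicts and lists, is a small helper that walks a path of keys and indexes and returns a default when any step is missing. The safe_get name and the sample data are hypothetical and are not taken from the linked page.

```python
def safe_get(data, *path, default=None):
    """Hypothetical helper: follow keys/indexes in path, return default if a step is missing."""
    current = data
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            # missing key, index out of range, or a value that cannot be indexed this way
            return default
    return current


config = {'servers': [{'name': 'a', 'ports': [80, 443]}]}
first_port = safe_get(config, 'servers', 0, 'ports', 0)
missing = safe_get(config, 'servers', 1, 'name', default='unknown')
print(first_port, missing)
```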