Serializing YAML with PyYAML

From NovaOrdis Knowledge Base

Internal

Overview

Serialization to YAML renders an in-memory data structure as a YAML-formatted string. The simplest sequence of statements that does that is:

import yaml

data = {
    'color': 'red',
    'size': 10,
    'parts': ['top', 'middle', 'bottom']
}

yaml_string = yaml.dump(data)

Because dump() sorts mapping keys alphabetically by default, the YAML-formatted string will be:

color: red
parts:
- top
- middle
- bottom
size: 10

PyYAML Concepts

PyYAML's core model is centered on constructors, representers and tags.

Constructor

https://matthewpburruss.com/post/yaml/#defining-pyyaml-constructors-going-from-yaml-to-python

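A constructor is a function that takes a parsed YAML node and returns the corresponding Python object, such as a class instance. It receives the Loader instance as its first argument and the node as its second.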
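
As an illustration, a constructor can be sketched as follows. The Point class, the PointLoader subclass and the !point tag are hypothetical names introduced for this example only. Registering on a SafeLoader subclass is one possible choice; it keeps the registration from affecting other loaders.

```python
from dataclasses import dataclass

import yaml


@dataclass
class Point:
    # Hypothetical application class used only for this example.
    x: int
    y: int


class PointLoader(yaml.SafeLoader):
    """SafeLoader subclass that carries the custom constructor."""


def construct_point(loader, node):
    # Turn the mapping node into a dict, then build the Point from it.
    fields = loader.construct_mapping(node)
    return Point(**fields)


# Associate the hypothetical !point tag with the constructor.
PointLoader.add_constructor('!point', construct_point)

loaded = yaml.load("origin: !point {x: 0, y: 0}", Loader=PointLoader)
print(loaded)
```

The node tagged !point is handed to construct_point(), and the value stored under origin is a Point instance rather than a plain dict.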

Representer

https://matthewpburruss.com/post/yaml/#defining-pyyaml-representers-going-from-python-to-yaml

A representer is a function that intercepts the data objects to be serialized, optionally processes them, and then asks the Dumper instance to create the serialized representation of each object as a Node instance. The function returns the Node thus created. The representer receives the Dumper instance as its first argument and the data object as its second.

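from yaml import SafeDumper
def my_representer(dumper: SafeDumper, data):
  # represent_scalar() requires a tag and a value; here the data is emitted as a YAML string.
  return dumper.represent_scalar('tag:yaml.org,2002:str', str(data))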

Representers are registered with add_representer(). Representers can be added for specific types (such as str or int), or, with add_multi_representer(), for a type together with all its subclasses.

yaml.add_representer(str, my_representer)
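
A more complete sketch of a representer for an application class might look like this. The Money class, the MoneyDumper subclass and the !money tag are hypothetical names, not part of PyYAML. The representer is registered on a SafeDumper subclass so that the global dumpers stay untouched.

```python
import yaml


class Money:
    # Hypothetical application class used only for this example.
    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency


class MoneyDumper(yaml.SafeDumper):
    """SafeDumper subclass that carries the custom representer."""


def represent_money(dumper, data):
    # Render the object as a single scalar labeled with the hypothetical !money tag.
    return dumper.represent_scalar('!money', f'{data.amount} {data.currency}')


# Associate the Money type with the representer on this dumper only.
MoneyDumper.add_representer(Money, represent_money)

money_yaml = yaml.dump({'price': Money('10.50', 'USD')}, Dumper=MoneyDumper)
print(money_yaml)
```

The dumper must be passed explicitly with Dumper=MoneyDumper; a plain yaml.safe_dump() call would not know how to represent Money.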

Tag

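A tag labels a YAML node with a type. Application-specific tags are written with the special character ! preceding the tag name. When loading, the tag tells PyYAML which constructor to call. When dumping, the representer selected for the object's Python type sets the tag on the node it creates.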
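
Tags can be observed with built-in types as well. The sketch below assumes PyYAML 5.1 or later (for FullLoader) and relies on the fact that a Python tuple has no plain YAML equivalent, so the default dumper labels it with a Python-specific tag.

```python
import yaml

# The default Dumper labels the sequence with the !!python/tuple tag.
tagged = yaml.dump((1, 2))
print(tagged)

# safe_load() has no constructor registered for that tag and refuses the document.
try:
    yaml.safe_load(tagged)
    safe_load_rejected = False
except yaml.constructor.ConstructorError:
    safe_load_rejected = True

# FullLoader knows the tag and rebuilds the tuple.
restored = yaml.load(tagged, Loader=yaml.FullLoader)
print(safe_load_rejected, restored)
```

The same document is rejected or accepted depending only on whether the loader has a constructor for the tag it carries.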

Customizing Output

Customizing Output with dump() Parameters
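
A few dump() keyword arguments cover the most common adjustments. The sketch below reuses the data from the Overview and assumes PyYAML 5.1 or later, where sort_keys is available. The variable names are illustrative.

```python
import yaml

data = {
    'color': 'red',
    'size': 10,
    'parts': ['top', 'middle', 'bottom']
}

# Keep the insertion order of the keys instead of sorting them.
unsorted = yaml.dump(data, sort_keys=False)

# Use the compact, JSON-like flow style for collections.
flow = yaml.dump(data, default_flow_style=True)

# Start the document with an explicit '---' marker.
with_marker = yaml.dump(data, explicit_start=True)

print(unsorted)
print(flow)
print(with_marker)
```

With sort_keys=False, size is emitted before parts, matching the order in which the dictionary was built.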

Customizing Output with Representers
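
One way a representer can change the output is by choosing the scalar style. The following is a minimal sketch, assuming that multi-line strings should be written in literal block style (|). BlockStyleDumper and represent_multiline_str are hypothetical names.

```python
import yaml


class BlockStyleDumper(yaml.SafeDumper):
    """SafeDumper subclass that carries the custom str representer."""


def represent_multiline_str(dumper, data):
    # Strings containing line breaks are emitted in literal block style.
    if '\n' in data:
        return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
    # Everything else falls back to the stock string representer.
    return dumper.represent_str(data)


BlockStyleDumper.add_representer(str, represent_multiline_str)

document = {'msg': 'line one\nline two\n'}
block_yaml = yaml.dump(document, Dumper=BlockStyleDumper)
print(block_yaml)
```

The style argument is a request: the emitter may still fall back to a quoted style for strings that cannot be written as a block scalar, for example when a line ends with trailing spaces.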