Serializing YAML with PyYAML

From NovaOrdis Knowledge Base

Internal

Overview

Serialization to YAML renders an in-memory data structure as a YAML-formatted string. The simplest way to do this is to call yaml.dump():

import yaml

data = {
    'color': 'red',
    'size': 10,
    'parts': ['top', 'middle', 'bottom']
}

yaml_string = yaml.dump(data)

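The resulting YAML-formatted string is shown below. The keys appear in alphabetical order because dump() sorts mapping keys by default (sort_keys=True):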

color: red
parts:
- top
- middle
- bottom
size: 10

PyYAML Concepts

PyYAML's core model is centered on constructors, representers, and tags.

Constructor

https://matthewpburruss.com/post/yaml/#defining-pyyaml-constructors-going-from-yaml-to-python

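A constructor is a function that takes a serialized YAML node and returns a Python object, such as a class instance. The constructor gets the Loader instance as its first argument and the node as its second.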
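As an illustration of the description above, here is a minimal sketch of a constructor. The Point class, the !Point tag, the point_constructor function and the PointLoader subclass are hypothetical names introduced only for this example; registering on a SafeLoader subclass is one possible choice that avoids changing the global loader.

```python
import yaml


class Point:
    """Hypothetical application class used only for this example."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def point_constructor(loader, node):
    # construct_mapping() converts the mapping node into a plain dict.
    values = loader.construct_mapping(node)
    return Point(values['x'], values['y'])


class PointLoader(yaml.SafeLoader):
    """Hypothetical loader subclass, so the global SafeLoader stays unchanged."""


# Associate the (hypothetical) !Point tag with the constructor.
PointLoader.add_constructor('!Point', point_constructor)

point = yaml.load('!Point {x: 1, y: 2}', Loader=PointLoader)
print(point.x, point.y)
```

When the loader meets a node tagged !Point, it calls point_constructor instead of building a plain dictionary.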

Representer

https://matthewpburruss.com/post/yaml/#defining-pyyaml-representers-going-from-python-to-yaml

A representer is a function that intercepts the data object instances to be serialized, optionally processes them, and then calls a Dumper instance to create the proper serialized representation for the data object instance, as a Node. The representer gets the Dumper instance as its first argument and the data object as its second.

def my_representer(dumper: yaml.SafeDumper, data):
    # represent_scalar() requires a tag and a string value
    return dumper.represent_scalar('tag:yaml.org,2002:str', str(data))

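Representers are registered with add_representer(). A representer registered this way applies to one specific type (such as str or int); to cover a base class together with its subclasses, PyYAML also provides add_multi_representer().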

# Registers with the default yaml.Dumper; pass Dumper=yaml.SafeDumper to affect safe_dump()
yaml.add_representer(str, my_representer)
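The following sketch puts the pieces together for a custom class. Point, point_representer, PointDumper and the !Point tag are hypothetical names chosen for illustration; the representer is registered on a SafeDumper subclass so the registration does not leak into other dump() calls.

```python
import yaml


class Point:
    """Hypothetical application class used only for this example."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def point_representer(dumper, data):
    # Emit the object as a mapping node labeled with the (hypothetical) !Point tag.
    return dumper.represent_mapping('!Point', {'x': data.x, 'y': data.y})


class PointDumper(yaml.SafeDumper):
    """Hypothetical dumper subclass, so the global SafeDumper stays unchanged."""


# The representer is looked up by the Python type of the object being dumped.
PointDumper.add_representer(Point, point_representer)

text = yaml.dump({'origin': Point(0, 0)}, Dumper=PointDumper)
print(text)
```

Without the registration, SafeDumper would refuse to serialize a Point instance; with it, the output carries the !Point tag followed by the x and y entries.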

Tag

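A tag labels a YAML node with its type. Application-specific tags are written with the special character ! preceding the tag name. When loading, the tag tells PyYAML which constructor to call; when dumping, the representer chooses which tag to emit for the node.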
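To see how a tag selects the constructor, the sketch below uses only standard YAML tags, where !!str and !!int are shorthand for tag:yaml.org,2002:str and tag:yaml.org,2002:int. The key names in the document are made up for the example.

```python
import yaml

# Key names are illustrative; the !!str and !!int tags are standard YAML tags.
doc = """
implicit: 10
forced_string: !!str 10
forced_int: !!int "10"
"""

values = yaml.safe_load(doc)
print(values)
# → {'implicit': 10, 'forced_string': '10', 'forced_int': 10}
```

The same scalar text is loaded as an integer or as a string depending on the tag attached to the node.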

Customizing Output

Customizing Output with dump() Parameters
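A short sketch of two commonly used dump() keyword arguments, reusing the data from the Overview. It assumes PyYAML 5.1 or later, where sort_keys is available; the variable names are illustrative.

```python
import yaml

data = {
    'color': 'red',
    'size': 10,
    'parts': ['top', 'middle', 'bottom'],
}

# sort_keys=False keeps the insertion order of the dictionary keys.
ordered = yaml.dump(data, sort_keys=False)
print(ordered)

# default_flow_style=True renders collections in the compact, JSON-like flow style.
flow = yaml.dump(data, default_flow_style=True)
print(flow)
```

With sort_keys=False, size is emitted before parts, matching the order in which the keys were defined.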

Customizing Output with Representers
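One possible use of a representer for output customization is to render multi-line strings in literal block style (|) instead of quoted scalars with escaped newlines. This is a sketch, not the only approach; str_representer, LiteralDumper and the sample data are hypothetical names introduced for the example.

```python
import yaml


def str_representer(dumper, data):
    # Use literal block style only for strings that contain a newline.
    style = '|' if '\n' in data else None
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style=style)


class LiteralDumper(yaml.SafeDumper):
    """Hypothetical dumper subclass, so the global SafeDumper stays unchanged."""


LiteralDumper.add_representer(str, str_representer)

data = {'script': 'echo one\necho two\n'}
text = yaml.dump(data, Dumper=LiteralDumper)
print(text)
```

Single-line strings, including the mapping keys, still go through the same representer but keep the default plain style.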