Strings in YAML

From NovaOrdis Knowledge Base

Revision as of 20:22, 7 December 2022

External

  • http://blogs.perl.org/users/tinita/2018/03/strings-in-yaml---to-quote-or-not-to-quote.html
  • https://yaml-multiline.info
  • https://www.educative.io/answers/what-is-flow-style-in-yaml

Internal

Overview

There is a wide variety of choices for representing strings in YAML. Strings can be represented as flow scalars or block scalars. A flow scalar can be plain, single-quoted, or double-quoted. A block scalar can be literal or folded.

Flow Scalars

Plain Flow Scalar

Single Quoted Flow Scalar

Double Quoted Flow Scalar

Block Scalars

Literal Block Scalar

Folded Block Scalar

Multi-Line Strings

TODO

TODO:

  • How can it be parsed and dumped with Python?

TODEPLETE

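Strings do not require quotes, but quoting them is recommended: it states explicitly that the value is a string, so the parser does not attempt type inference. For example, an unquoted 123, true, or null is resolved as a number, a boolean, or null rather than as a string.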
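
A short illustration of why quoting matters. The key names are hypothetical, and the resolved types assume a parser that follows the YAML 1.2 core schema (YAML 1.1 parsers additionally treat words such as yes and no as booleans):

```yaml
# Unquoted values go through type inference.
port: 8080          # resolved as an integer
enabled: true       # resolved as a boolean
nothing: null       # resolved as null

# Quoted values are always strings.
port_s: "8080"      # the string "8080"
enabled_s: "true"   # the string "true"
nothing_s: 'null'   # the string "null"
```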

The following representations are equivalent for a string:

s1: bare words string
s2: "a double-quoted string"
s3: 'a single-quoted string'

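All forms shown above are "inline" (flow) scalars: they are normally written on one line. If a flow scalar is continued on the following lines, the line breaks are folded into spaces rather than preserved.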

In the bare word format, characters cannot be escaped.

Double-quoted strings can have specific characters escaped with \. A double quote is written as \" and a line break is written as \n.

Single-quoted strings are "literal" strings: they do not use \ to escape characters, so a backslash is kept as-is. The only escape sequence is '' (two single quotes), which is decoded as a single '.
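
The escaping rules for the three inline forms can be compared side by side. This is a minimal sketch with hypothetical key names:

```yaml
plain: a\nb                  # no escapes: the value is a, backslash, n, b
double: "say \"hi\"\nbye"    # \" is a double quote, \n is a line break
single: 'it''s a\nb'         # '' is one single quote; \n stays backslash + n
```

Only the double-quoted value contains an actual line break. The other two keep the backslash and the n as ordinary characters.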


Literal Style

Multi-line strings can be written using the '|' character followed by a new line. To be considered multi-line content, the first line under the '|'-terminated line must be indented one level deeper than the line containing the '|', and the subsequent content lines must be indented by at least the same offset as the first line. Any extra indentation is kept as part of the content. Trailing blank lines are stripped, leaving a single final newline. The new line characters inside the content are preserved.

This is a correct multi-line string (the content lines must be indented under 'data'):

data: |
  This is a
  multi-line
  text section

The value of data is equivalent to "This is a\nmulti-line\ntext section\n".

This is NOT a correct multi-line string, because the content lines are not indented:
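
As a further sketch of the literal style, lines indented more deeply than the first content line keep their extra indentation, and blank lines inside the block are kept. The key name 'script' is hypothetical:

```yaml
script: |
  if true; then
    echo "indented"

  fi
```

The value of script is equivalent to "if true; then\n  echo \"indented\"\n\nfi\n".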

data: |
This is not
a correct multi-line

The "|" multi-line operator keeps a single trailing newline at the end of the string. In the correct example above, the data value is equivalent to "This is a\nmulti-line\ntext section\n". Note the trailing newline. If we want the YAML processor to strip off the trailing newline, we should use "|-" instead of "|":

dataWithoutTrailingNewLine: |-
  This is a
  multi-line
  text section

The dataWithoutTrailingNewLine value is equivalent to "This is a\nmulti-line\ntext section". Note the lack of a trailing newline.

If we want all trailing line breaks to be preserved, including trailing blank lines, we should use "|+" instead of "|":

dataWithTrailingWhitespacePreserved: |+
  This is a
  multi-line
  text section


another: value

The dataWithTrailingWhitespacePreserved value is equivalent to "This is a\nmulti-line\ntext section\n\n\n": the final line break plus the two blank lines that follow the block.
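
The three variants can be summarized in one fragment. This is a sketch with hypothetical key names, in which every block is followed by exactly one blank line:

```yaml
clip: |
  text

strip: |-
  text

keep: |+
  text

last: end
```

Here clip is equivalent to "text\n", strip is equivalent to "text", and keep is equivalent to "text\n\n".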

Folded Style

The '>' character followed by a new line introduces a folded multi-line string. Each single line break inside the content is converted to a space, trailing blank lines are removed, and one final newline is kept.

data: >
   This is another
   multi-line
   text section
   but in the final form
   it will be just one long string
   without new lines
   except the last one

The data value is equivalent to "This is another multi-line text section but in the final form it will be just one long string without new lines except the last one\n".
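
Folding applies to single line breaks only. As a sketch (the key name 'message' is hypothetical), a blank line between two groups of lines survives as one newline, which makes it possible to write paragraphs:

```yaml
message: >
  first paragraph,
  still the first paragraph

  second paragraph
```

The value of message is equivalent to "first paragraph, still the first paragraph\nsecond paragraph\n".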

To drop the trailing newline instead of preserving it, use ">-" instead of ">":

dataWithoutTrailingNewLine: >-
   Something
   else

The dataWithoutTrailingNewLine value is equivalent to "Something else". Note the lack of a trailing newline.