PyYAML not parsing all examples












I'm trying to understand the claim on https://pyyaml.org/wiki/PyYAML that:



PyYAML features

- a complete YAML 1.1 parser. In particular, PyYAML can parse all
examples from the specification.


If you go to the online YAML parser that uses PyYAML (http://yaml-online-parser.appspot.com/), then several of the examples taken from the specification do not work.



I understand that you would need to have tags defined for some of these failures, and that the online parser can only handle single-document YAML. I know how to "fix" that when I use PyYAML.



But example 11 fails as well, even though it has no special tags and is a single document. How can PyYAML claim that it can parse all examples when it obviously doesn't? Is this because PyYAML is for YAML 1.1 and the examples are from the YAML 1.2 specification?










  • Is this because ... the examples are from [a later] specification? Presumably, yes. The actual failure is that the key is not hashable—pyyaml reads the key into a list rather than a tuple. Probably pyyaml should notice that the key is being used as a key, and tuple-ize it, though full generality would be hard.

    – torek
    Jan 1 at 12:43











  • @torek The examples are the same for those specifications (at least in chapter two), but your analysis of why it doesn't load is spot-on: PyYAML doesn't create a tuple when encountering a sequence that is a key. It also fails when using a mapping as a key, which it should create as a (hashable) collections.abc.Mapping.

    – Anthon
    Jan 1 at 13:41











  • Like Anthon said, PyYAML cannot produce lists (tuples) as keys, but I'm working on it and it will hopefully be in one of the next versions. Then it will also be able to use nested tuples as keys.

    – tinita
    Jan 1 at 14:55
















python yaml pyyaml






edited Jan 1 at 13:34









Anthon











asked Jan 1 at 12:26









Susin














1 Answer
To start with your last question: this is not because the examples
come from a later specification. If you restrict yourself to the
Preview chapter (chapter 2) of the spec, as the online parser does,
the examples in the 1.1 and 1.2 specifications are the same. Note
that I have only compared them visually, not character by character.



Your misinterpretation comes from the use of the word parser in the
title of the online parser. What it actually tries to do is load the
YAML and then dump it as JSON, Python or canonical YAML. Loading in
PyYAML goes through the stages shown in the Processing Overview
picture in the YAML spec (the same for 1.1 and 1.2). Starting from
the character-based document, these are a parsing, a composing and a
construction step.



PyYAML doesn't fail on the parsing step. It fails on the construction
step because, as @torek indicates, PyYAML constructs a list for the
sequence that is used as a key, and a list is not hashable, so it
cannot be used as a key in a Python dict. This is a restriction of
Python's dict implementation and, IMO, constructing an unhashable key
here is one of the deficiencies of PyYAML.
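
The hashability restriction is plain Python and can be seen without any YAML library. A minimal sketch (the variable names are only illustrative): a list is rejected as a dict key, while the equivalent tuple is accepted.

```python
# What a YAML loader would naively construct for the sequence key
teams = ["Detroit Tigers", "Chicago cubs"]

try:
    {teams: "2001-07-23"}   # a list is mutable and therefore unhashable
    list_key_error = None
except TypeError as exc:
    list_key_error = exc

# The same sequence as a tuple is hashable and works as a key.
games = {tuple(teams): "2001-07-23"}

print(list_key_error)
print(games)
```

This is why a loader that wants to support sequence keys has to build a tuple (or another hashable type) rather than a list.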



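import sys
import yaml as pyyaml

# Example 2.11 from the spec; the indentation inside the string is significant
yaml_1_1_example_2_11 = """
? - Detroit Tigers
  - Chicago cubs
:
  - 2001-07-23

? [ New York Yankees,
    Atlanta Braves ]
: [ 2001-07-02, 2001-08-12,
    2001-08-14 ]
"""

# parse() runs only the parsing stage and yields events
for event in pyyaml.parse(yaml_1_1_example_2_11):
    print(event)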


gives:



StreamStartEvent()
DocumentStartEvent()
MappingStartEvent(anchor=None, tag=None, implicit=True)
SequenceStartEvent(anchor=None, tag=None, implicit=True)
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value='Detroit Tigers')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value='Chicago cubs')
SequenceEndEvent()
SequenceStartEvent(anchor=None, tag=None, implicit=True)
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value='2001-07-23')
SequenceEndEvent()
SequenceStartEvent(anchor=None, tag=None, implicit=True)
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value='New York Yankees')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value='Atlanta Braves')
SequenceEndEvent()
SequenceStartEvent(anchor=None, tag=None, implicit=True)
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value='2001-07-02')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value='2001-08-12')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value='2001-08-14')
SequenceEndEvent()
MappingEndEvent()
DocumentEndEvent()
StreamEndEvent()


So PyYAML can parse this correctly. Not only that: if the online
"parser" emitted canonical YAML directly from the parsed events,
instead of first loading and then dumping, it could process this
example. Replace the last two lines of the above code with:



pyyaml.emit(pyyaml.parse(yaml_1_1_example_2_11), stream=sys.stdout, canonical=True)


as this gives:



---
{
  ? [
    ! "Detroit Tigers",
    ! "Chicago cubs",
  ]
  : [
    ! "2001-07-23",
  ],
  ? [
    ! "New York Yankees",
    ! "Atlanta Braves",
  ]
  : [
    ! "2001-07-02",
    ! "2001-08-12",
    ! "2001-08-14",
  ],
}


Stating that PyYAML parses all examples is like me stating that I can
read Greek. I learned the Greek alphabet back in the 70s, so I can
read (the) Greek (characters), but I don't understand the words they form.
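
The difference between the stages can be made concrete with PyYAML's own `compose()` and `safe_load()`. The sketch below assumes a PyYAML release that does not yet turn sequence keys into tuples; `example_2_11` simply repeats the spec example so that the block is self-contained.

```python
import yaml as pyyaml
from yaml.constructor import ConstructorError

example_2_11 = """\
? - Detroit Tigers
  - Chicago cubs
:
  - 2001-07-23

? [ New York Yankees,
    Atlanta Braves ]
: [ 2001-07-02, 2001-08-12,
    2001-08-14 ]
"""

# Parsing + composing: builds the node graph, no Python objects yet.
root = pyyaml.compose(example_2_11)
key_node_types = [type(key_node).__name__ for key_node, _ in root.value]
print(type(root).__name__, key_node_types)

# Construction: turning the node graph into Python objects is what fails.
try:
    pyyaml.safe_load(example_2_11)
    construction_error = None
except ConstructorError as exc:
    construction_error = exc
print("construction failed:", construction_error is not None)
```

Composing succeeds and shows that both keys are sequence nodes; only the last stage, which has to pick a Python type for those nodes, raises an error.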





In ruamel.yaml (disclaimer: I am the author of that package) you can load this example, and you can even use PyYAML to
dump the loaded data.



import sys
from pprint import pprint
import ruamel.yaml
import yaml as pyyaml

# reuses yaml_1_1_example_2_11 from the first code block
yaml = ruamel.yaml.YAML(typ='safe')
data = yaml.load(yaml_1_1_example_2_11)
pprint(data)
print('*' * 50)
yaml.dump(data, sys.stdout)
print('*' * 50)
pyyaml.safe_dump(data, sys.stdout)


as that gives:



{('Detroit Tigers', 'Chicago cubs'): [datetime.date(2001, 7, 23)],
 ('New York Yankees', 'Atlanta Braves'): [datetime.date(2001, 7, 2),
                                          datetime.date(2001, 8, 12),
                                          datetime.date(2001, 8, 14)]}
**************************************************
? [Detroit Tigers, Chicago cubs]
: [2001-07-23]
? [New York Yankees, Atlanta Braves]
: [2001-07-02, 2001-08-12, 2001-08-14]
**************************************************
? [Detroit Tigers, Chicago cubs]
: [2001-07-23]
? [New York Yankees, Atlanta Braves]
: [2001-07-02, 2001-08-12, 2001-08-14]
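
If you have to stay with PyYAML, one possible workaround is a loader subclass that turns sequence keys into tuples. The following is only a sketch under stated assumptions: `TupleKeyLoader` is a hypothetical name, not part of PyYAML. It relies on the `construct_mapping`, `flatten_mapping` and `construct_sequence` methods of `SafeLoader`, and it does not handle nested sequences or mappings used as keys (those would still fail, with a plain `TypeError`).

```python
import yaml as pyyaml


class TupleKeyLoader(pyyaml.SafeLoader):
    """Hypothetical SafeLoader variant: sequence keys become tuples."""

    def construct_mapping(self, node, deep=False):
        self.flatten_mapping(node)  # handle merge keys ("<<") like SafeLoader
        mapping = {}
        for key_node, value_node in node.value:
            if isinstance(key_node, pyyaml.SequenceNode):
                # Build the key completely (deep=True), then freeze it.
                key = tuple(self.construct_sequence(key_node, deep=True))
            else:
                key = self.construct_object(key_node, deep=deep)
            mapping[key] = self.construct_object(value_node, deep=deep)
        return mapping


example_2_11 = """\
? - Detroit Tigers
  - Chicago cubs
:
  - 2001-07-23

? [ New York Yankees,
    Atlanta Braves ]
: [ 2001-07-02, 2001-08-12,
    2001-08-14 ]
"""

data = pyyaml.load(example_2_11, Loader=TupleKeyLoader)
print(data)
```

The keys come back as tuples of strings and the values as lists of `datetime.date` objects, similar to the ruamel.yaml result above.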



















        answered Jan 1 at 13:34 (edited Jan 1 at 15:14)
        – Anthon































