PyYAML not parsing all examples
I'm trying to understand the claim on https://pyyaml.org/wiki/PyYAML that:
PyYAML features
- a complete YAML 1.1 parser. In particular, PyYAML can parse all
examples from the specification.
If you go to the online YAML parser that uses PyYAML (http://yaml-online-parser.appspot.com/), then several of the examples taken from the specification do not work.
I understand that you would need to have tags defined for some of these failures, and that the online parser can only handle single-document YAML; I know how to "fix" that when I use PyYAML.
But example 11 fails as well, and it has no special tags and is a single document. How can PyYAML claim it can parse all examples when it obviously doesn't? Is this because PyYAML is for YAML 1.1 and the examples are from the YAML 1.2 specification?
python yaml pyyaml
Is this because ... the examples are from [a later] specification? Presumably, yes. The actual failure is that the key is not hashable—pyyaml reads the key into a list rather than a tuple. Probably pyyaml should notice that the key is being used as a key, and tuple-ize it, though full generality would be hard.
– torek
Jan 1 at 12:43
@torek The examples are the same for those specifications (at least in chapter two), but your analysis of why it doesn't load is spot-on: PyYAML doesn't create a tuple when encountering a sequence that is a key. It also fails when using a mapping as a key, which it should create as a (hashable) collections.abc.Mapping.
– Anthon
Jan 1 at 13:41
Like Anthon said, PyYAML cannot produce lists (tuples) as keys, but I'm working on it and it will hopefully be in one of the next versions. Then it will also be able to use nested tuples as keys.
– tinita
Jan 1 at 14:55
edited Jan 1 at 13:34 by Anthon
asked Jan 1 at 12:26 by Susin
1 Answer
To start with your last question: this is not because of the examples
coming from a later specification. Assuming you restrict yourself to
the Preview chapter/section in the spec (as does the online parser),
and taking into account that I have only compared the examples
visually (i.e. not on a character for character basis), the examples
in the 1.2 and 1.1 specification for chapter/section 2 are the same.
Your misinterpretation comes from the use of the word parser in the
title of the online parser. What it actually tries to do is load the
YAML and then dump it to JSON, Python or canonical YAML. Loading in
PyYAML consists of the stages shown in the Processing Overview picture
in the YAML spec (same for 1.1 and 1.2), starting from the
character-based document: a parsing, a composing and a construction step.
PyYAML doesn't fail on the parsing step; it fails on the construction
step, because (as @torek indicates) PyYAML constructs a list for the
key, and a list cannot be used as a key for a Python dict. This is a
restriction of Python's dict implementation, and IMO failing to handle
it is one of the deficiencies of PyYAML.
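import sys
import yaml as pyyaml

# Example 2.11 from the specification: sequences used as mapping keys
yaml_1_1_example_2_11 = """
? - Detroit Tigers
  - Chicago cubs
:
  - 2001-07-23
? [ New York Yankees,
    Atlanta Braves ]
: [ 2001-07-02, 2001-08-12,
    2001-08-14 ]
"""

# parse() runs only the parsing stage and yields the events
for event in pyyaml.parse(yaml_1_1_example_2_11):
    print(event)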
gives:
StreamStartEvent()
DocumentStartEvent()
MappingStartEvent(anchor=None, tag=None, implicit=True)
SequenceStartEvent(anchor=None, tag=None, implicit=True)
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value='Detroit Tigers')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value='Chicago cubs')
SequenceEndEvent()
SequenceStartEvent(anchor=None, tag=None, implicit=True)
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value='2001-07-23')
SequenceEndEvent()
SequenceStartEvent(anchor=None, tag=None, implicit=True)
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value='New York Yankees')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value='Atlanta Braves')
SequenceEndEvent()
SequenceStartEvent(anchor=None, tag=None, implicit=True)
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value='2001-07-02')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value='2001-08-12')
ScalarEvent(anchor=None, tag=None, implicit=(True, False), value='2001-08-14')
SequenceEndEvent()
MappingEndEvent()
DocumentEndEvent()
StreamEndEvent()
So PyYAML can parse this correctly. Not only that: if the online
"parser" did not load and then dump when emitting canonical YAML, it
could process this example (replace the last two lines of the code
above with):
pyyaml.emit(pyyaml.parse(yaml_1_1_example_2_11), stream=sys.stdout, canonical=True)
as this gives:
---
{
  ? [
    ! "Detroit Tigers",
    ! "Chicago cubs",
  ]
  : [
    ! "2001-07-23",
  ],
  ? [
    ! "New York Yankees",
    ! "Atlanta Braves",
  ]
  : [
    ! "2001-07-02",
    ! "2001-08-12",
    ! "2001-08-14",
  ],
}
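To see at which stage loading stops, the stages can also be run separately. The following is a minimal sketch, assuming a PyYAML release in which sequence keys are still constructed as lists; the names `DOC`, `root`, `key_tags` and `outcome` are only illustrative. `yaml.compose` performs parsing and composing and returns a node graph, while `yaml.safe_load` additionally runs the construction step.

```python
import yaml

# Example 2.11 from the YAML specification: both mapping keys are sequences.
DOC = """\
? - Detroit Tigers
  - Chicago cubs
:
  - 2001-07-23
? [ New York Yankees, Atlanta Braves ]
: [ 2001-07-02, 2001-08-12, 2001-08-14 ]
"""

# Parsing + composing: builds nodes, no Python dict is created yet.
root = yaml.compose(DOC, Loader=yaml.SafeLoader)
key_tags = [key_node.tag for key_node, _value_node in root.value]
print(type(root).__name__, key_tags)

# Construction: each key node has to become a hashable dict key.
# Assumption: on releases without tuple keys this raises a YAMLError
# subclass; a release that supports them would return a dict instead.
try:
    outcome = yaml.safe_load(DOC)
except yaml.YAMLError as exc:
    outcome = exc
print(type(outcome).__name__)
```

The first print shows that composing succeeds and that both keys are sequence nodes; the second shows whether construction raised (in the releases I am aware of, a `ConstructorError`) or returned data.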
Stating that PyYAML parses all examples is like me stating that I can
read Greek. I learned the Greek alphabet back in the 1970s, so I can
read (the) Greek (characters), but I don't understand the words they form.
In ruamel.yaml (disclaimer: I am the author of that package) you can
load this example, and you can even use PyYAML to dump the loaded data.
from pprint import pprint
import ruamel.yaml
import yaml as pyyaml
yaml = ruamel.yaml.YAML(typ='safe')
data = yaml.load(yaml_1_1_example_2_11)
pprint(data)
print('*' * 50)
yaml.dump(data, sys.stdout)
print('*' * 50)
pyyaml.safe_dump(data, sys.stdout)
as that gives:
{('Detroit Tigers', 'Chicago cubs'): [datetime.date(2001, 7, 23)],
 ('New York Yankees', 'Atlanta Braves'): [datetime.date(2001, 7, 2),
                                          datetime.date(2001, 8, 12),
                                          datetime.date(2001, 8, 14)]}
**************************************************
? [Detroit Tigers, Chicago cubs]
: [2001-07-23]
? [New York Yankees, Atlanta Braves]
: [2001-07-02, 2001-08-12, 2001-08-14]
**************************************************
? [Detroit Tigers, Chicago cubs]
: [2001-07-23]
? [New York Yankees, Atlanta Braves]
: [2001-07-02, 2001-08-12, 2001-08-14]
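If you have to stay with PyYAML, the "tuple-ize the key" idea from the comments can be approximated with a custom loader. This is only a sketch under stated assumptions, not something PyYAML ships: `TupleKeyLoader` and `_freeze` are hypothetical names, the override relies on `SafeLoader`'s mapping constructor calling `construct_mapping` on the loader instance, and mappings used as keys are still not handled.

```python
import datetime
import yaml


def _freeze(value):
    """Hypothetical helper: turn (nested) lists into tuples so they are hashable."""
    if isinstance(value, list):
        return tuple(_freeze(item) for item in value)
    return value


class TupleKeyLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader variant that builds sequence keys as tuples."""

    def construct_mapping(self, node, deep=False):
        if isinstance(node, yaml.MappingNode):
            self.flatten_mapping(node)  # keep merge-key ("<<") handling
        mapping = {}
        for key_node, value_node in node.value:
            # deep=True builds the key completely before it is frozen
            key = _freeze(self.construct_object(key_node, deep=True))
            mapping[key] = self.construct_object(value_node, deep=deep)
        return mapping


DOC = """\
? - Detroit Tigers
  - Chicago cubs
:
  - 2001-07-23
? [ New York Yankees, Atlanta Braves ]
: [ 2001-07-02, 2001-08-12, 2001-08-14 ]
"""

data = yaml.load(DOC, Loader=TupleKeyLoader)
print(data)
```

A mapping used as a key would still fail here with a TypeError, because a dict is not hashable; covering that case needs a hashable mapping type, as mentioned in the comments.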
edited Jan 1 at 15:14
answered Jan 1 at 13:34 by Anthon