Say your XML file is:
<CATALOG>
<CD>
<TITLE>Empire Burlesque</TITLE>
<ARTIST>Bob Dylan</ARTIST>
</CD>
<CD>
<TITLE>Empire Burlesque</TITLE>
<ARTIST>Bob Dylan</ARTIST>
</CD>
</CATALOG>
If you transform it into JSON, should it be (version 1):
{
"CATALOG": {
"CD": [
{
"ARTIST": "Bob Dylan",
"TITLE": "Empire Burlesque"
},
{
"ARTIST": "Bob Dylan",
"TITLE": "Empire Burlesque"
}
]
}
}
Or (version 2):
{
"CATALOG": [
{
"CD": {
"ARTIST": "Bob Dylan",
"TITLE": "Empire Burlesque"
}
},
{
"CD": {
"ARTIST": "Bob Dylan",
"TITLE": "Empire Burlesque"
}
}
]
}
My feeling is that version 1 is more correct but I’m wondering if there is a norm?
Thanks for your feedback – Christian
2 Answers
It depends on what you mean by more correct.
Both the Version 1 and Version 2 JSON formats above are valid under RFC 8259. That said, most online XML-to-JSON converters (like this one here) would produce Version 1, which is more compact, more readable, and easier to consume.
There is no correct answer and no standard.
The best answer depends on the semantics of the data, which no tool is likely to know. We can guess the semantics of your data, because we are familiar with words like "artist" and "title". But even knowing that, we have to make guesses. Will all the things in a catalog be CDs, as in your example, or is this just a special case? Is the order of CDs in the catalog significant? Is the order of ARTIST and TITLE significant? Might there be CDs with multiple ARTIST children in the XML, and if so, would you want the single-artist case to use the same structure?
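To make the multiple-ARTIST question concrete, here is a minimal Python sketch of a generic converter that follows the Version 1 convention, where repeated sibling elements become an array. The helper name `element_to_json` and the two-artist sample are assumptions made for illustration; they are not taken from the question or from any particular tool.

```python
import json
import xml.etree.ElementTree as ET

def element_to_json(elem):
    """Convert an element's content to JSON-style data (hypothetical helper).

    Leaf elements become strings; repeated sibling tags become a list.
    Attributes and mixed content are ignored in this sketch.
    """
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    result = {}
    for child in children:
        value = element_to_json(child)
        if child.tag in result:
            # Second or later occurrence of this tag: promote to a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result

one_artist = ET.fromstring(
    "<CD><TITLE>Empire Burlesque</TITLE><ARTIST>Bob Dylan</ARTIST></CD>"
)
# Assumed example of a CD with two ARTIST children (not from the question).
two_artists = ET.fromstring(
    "<CD><TITLE>Duets</TITLE><ARTIST>A</ARTIST><ARTIST>B</ARTIST></CD>"
)

single = element_to_json(one_artist)
multiple = element_to_json(two_artists)
print(json.dumps(single))    # ARTIST is a plain string here
print(json.dumps(multiple))  # ARTIST is a list here
```

With a purely structural rule like this one, `ARTIST` is a string in one record and an array in the other, so consumers have to handle both shapes. Only knowledge of the data model can tell the converter to always emit an array.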
The ideal conversion works from the data model implemented by the XML, which takes into account questions such as whether a CD has multiple artists and whether the order of their names is significant, and then designs a JSON data structure to match that data model. Inferring a data model from one example data set involves a lot of guesswork.
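As a rough sketch of that approach in Python, suppose the data model says a catalog is an ordered list of CDs, each with one title and one or more artists. That model is an assumption for the sake of the example, as are the `Cd` dataclass, the `catalog_from_xml` function, and the lowercase JSON key names; a real model would come from a schema or from whoever owns the data.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict
from typing import List

@dataclass
class Cd:
    title: str
    artists: List[str]  # always a list, even for a single artist

def catalog_from_xml(xml_text: str) -> List[Cd]:
    """Read CD elements in document order into the assumed data model."""
    root = ET.fromstring(xml_text)
    return [
        Cd(
            title=cd.findtext("TITLE", default=""),
            artists=[a.text or "" for a in cd.findall("ARTIST")],
        )
        for cd in root.findall("CD")
    ]

xml_text = """
<CATALOG>
  <CD><TITLE>Empire Burlesque</TITLE><ARTIST>Bob Dylan</ARTIST></CD>
  <CD><TITLE>Empire Burlesque</TITLE><ARTIST>Bob Dylan</ARTIST></CD>
</CATALOG>
"""

catalog = catalog_from_xml(xml_text)
# The JSON shape is designed from the model, not derived from the XML nesting.
document = {"catalog": [asdict(cd) for cd in catalog]}
print(json.dumps(document, indent=2))
```

Here the JSON structure is a deliberate design decision: `artists` is an array in every record, and the list of CDs keeps its order, regardless of how many children a given XML element happens to have.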