I took to heart @Michael Kay’s comment on "How to convert JSON to XML using XSLT?", in which he said:
XSLT 3.0 isn’t actually that good at processing JSON using template
rules: it can be done, but it isn’t very convenient. It’s usually more
convenient to use functions.
Given a slightly non-trivial JSON input, namely:
{
  "name": "Alice",
  "age": 30,
  "children": [
    {"name": "Charlie", "age": 5},
    {"name": "Daisy", "age": 3}
  ]
}
I wanted to
- extract the children,
- raise the age of each child by 10,
- and create a result XML whose element and attribute names are derived from the JSON.
The XML result should be:
<olderChildren>
  <child name="Charlie" age="15"/>
  <child name="Daisy" age="13"/>
</olderChildren>
To compare the newer function-based approach with the traditional template-based approach, I created two stylesheets and compared their metrics side by side.
This is the function-based stylesheet:
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    xmlns:array="http://www.w3.org/2005/xpath-functions/array"
    exclude-result-prefixes="map array">
  <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
  <xsl:param name="json-data">
    {
      "name": "Alice",
      "age": 30,
      "children": [
        {"name": "Charlie", "age": 5},
        {"name": "Daisy", "age": 3}
      ]
    }
  </xsl:param>
  <xsl:variable name="parsed-json" select="parse-json($json-data)"/>
  <xsl:template match="/">
    <xsl:variable name="children" select="map:get($parsed-json, 'children')"/> <!-- Extract children array -->
    <xsl:variable name="older-children" as="array(*)" select="array:for-each($children, function($child) {map:put($child, 'age', map:get($child, 'age') + 10)})"/> <!-- Create array of children whose age is raised by 10 years -->
    <xsl:element name="olderChildren"><!-- Output result XML -->
      <xsl:for-each select="$older-children?*">
        <xsl:element name="child">
          <xsl:attribute name="name" select="map:get(., 'name')"/>
          <xsl:attribute name="age" select="map:get(., 'age')"/>
        </xsl:element>
      </xsl:for-each>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
and this is the template-based stylesheet:
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fn="http://www.w3.org/2005/xpath-functions"
    exclude-result-prefixes="fn">
  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
  <xsl:param name="json">
    {
      "name": "Alice",
      "age": 30,
      "children": [
        {"name": "Charlie", "age": 5},
        {"name": "Daisy", "age": 3}
      ]
    }
  </xsl:param>
  <xsl:variable name="XMLfromJSON" select="json-to-xml($json)"/>
  <xsl:template match="/">
    <xsl:apply-templates select="$XMLfromJSON/fn:map/fn:array"/>
  </xsl:template>
  <xsl:template match="fn:array[@key = 'children']">
    <xsl:element name="{fn:concat('older',fn:upper-case(fn:substring(@key,1,1)),fn:substring(@key,2))}">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>
  <xsl:template match="fn:array[@key = 'children']/fn:map">
    <xsl:element name="{fn:substring(../@key,1,5)}">
      <xsl:attribute name="{./fn:string/@key}" select="./fn:string"/>
      <xsl:attribute name="{./fn:number/@key}" select="./fn:number + 10"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
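For reference, the templates above do not match the JSON itself but the XML tree that json-to-xml builds from it. For the input above, that intermediate tree should look roughly as follows. It is shown indented here for readability; the element names, the key attribute, and the namespace are those of the JSON-to-XML mapping in XPath and XQuery Functions and Operators 3.1:

```xml
<map xmlns="http://www.w3.org/2005/xpath-functions">
  <string key="name">Alice</string>
  <number key="age">30</number>
  <!-- The JSON property name "children" survives as the key attribute -->
  <array key="children">
    <!-- Array members are JSON objects, so they become map elements without a key -->
    <map>
      <string key="name">Charlie</string>
      <number key="age">5</number>
    </map>
    <map>
      <string key="name">Daisy</string>
      <number key="age">3</number>
    </map>
  </array>
</map>
```

This is why the match patterns use fn:array[@key = 'children'] and why the element and attribute names can be computed from @key.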
Comparing off-the-top-of-my-head metrics, I arrive at this side-by-side comparison:
- Function-based wins in terms of lines of code (but only by 2 lines, and only if the lengthy array:for-each loop is squeezed into one line).
- The template-based approach’s biggest strength, in my view, is the implicit iteration performed by the XSLT processor. The function-based approach effectively bypasses this, so 2 loops have to be written manually (array:for-each and <xsl:for-each>).
- In terms of deriving result XML identifiers from the original JSON, I see the template-based version winning (with expressions like {fn:concat('older',fn:upper-case(fn:substring(@key,1,1)),fn:substring(@key,2))} and {fn:substring(../@key,1,5)}), since I observed in the debugger that the name "children" is lost during the transformation of the original JSON into maps and arrays.
- In terms of modularization, I see the template-based version winning again, with its 3 neat and clear little templates nicely reflecting the structural hierarchy of the original JSON. By contrast, the function-based version is one big monolith inside the <xsl:template match="/">.
- The function-based solution’s biggest advantage might be time and resources: it may take less effort to parse the JSON into maps and arrays, which can be read and manipulated directly, than to first convert the JSON into the standard XML representation that the template-based version then processes. But that is just an assumption; I haven’t run any benchmarks yet.
Overall, given the techniques used in these 2 solutions, I see a lot of merit in the template-based solution.
So my first question is: Am I missing more important metrics that would swing the pendulum in favour of the function-based solution?
My second question is: Am I missing some function-based techniques that would remedy my perceived shortcomings of the function-based solution?
My third question is then: Could you perhaps write a better function-based solution that is clearly superior to both of my solutions and clearly wins in all or most metrics?
Besides that, I have a more technical question:
Question 4: In the function-based solution, the content of <xsl:variable name="older-children" /> was evidently produced by making a deep copy of <xsl:variable name="children"> and then processing that copy (adding 10 to the age in every child map). One can see this by replacing <xsl:for-each select="$older-children?*"> with <xsl:for-each select="$children?*"> during XML result generation. The result is then:
<olderChildren>
  <child name="Charlie" age="5"/>
  <child name="Daisy" age="3"/>
</olderChildren>
i.e. the ages are the original values.
The select expression of <xsl:variable name="older-children" />, however, seems to reference the maps inside the $children variable, at least when taking the code at face value: the function($child) that is called on every $child of the iteration seems to overwrite the field 'age' in the current $child of $children, NOT in a newly created deep copy of $children (select="array:for-each($children, function($child) {map:put($child, 'age', map:get($child, 'age') + 10)})"). Where in the XSLT 3.0 spec does it say that such a deep copy is what is happening?
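The observation can be reproduced in isolation, without any JSON parsing or arrays. This is a minimal sketch, assuming an XSLT 3.0 processor with XPath 3.1 support; the named template main is a hypothetical entry point chosen so that no source document is needed (with Saxon, for example, it could be invoked via -it:main):

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    exclude-result-prefixes="map">
  <xsl:output method="text"/>
  <!-- Hypothetical named entry point; not part of the stylesheets above -->
  <xsl:template name="main">
    <!-- A single child map, built directly instead of via parse-json -->
    <xsl:variable name="child" select="map { 'name': 'Charlie', 'age': 5 }"/>
    <!-- map:put returns a map; its result is bound to a second variable -->
    <xsl:variable name="older" select="map:put($child, 'age', $child?age + 10)"/>
    <!-- Print the age held by each of the two variables -->
    <xsl:value-of select="'original: ' || $child?age || ', result: ' || $older?age"/>
  </xsl:template>
</xsl:stylesheet>
```

This should print original: 5, result: 15, i.e. $child still holds the old age after the map:put call, which matches what I see with $children and $older-children.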
2 Answers
the ages are the original values. The select-expression of <xsl:variable name="older-children" /> however seems to reference the maps inside the $children-variable, at least when looking at the code at face-value:
Maps in XPath 3.1/XDM 3.1/XSLT 3.0 are immutable, so a map:put call never manipulates the original map; rather, it returns a new map with the changed property. https://www.w3.org/TR/xpath-functions-31/#map-functions: