<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1d1 20130915//EN" "JATS-journalpublishing1.dtd"[]>
<article dtd-version="1.1d1" article-type="review-article" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en">
<front>
<?covid19?>
I need to find out whether the <?covid19?> processing instruction is present in the XML or not.
Pseudo-code in jQuery:
$("<?covid19?>").length
2 Answers
This seems like a typical use case for XPath.
XPath allows you to query XML in a very flexible way.
This tutorial could help:
https://www.w3schools.com/xml/xpath_intro.asp
While there are no CSS selectors to select processing instructions like elements, there are two nice APIs which you can use to avoid iterating over the DOM tree manually.
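Both APIs work on a parsed document rather than on the raw XML string. As a minimal sketch of that first step, assuming a browser environment where DOMParser is available (parseXML is a hypothetical helper name, not something from the question):

```javascript
// Parse an XML string into an XMLDocument (browser API).
// parseXML is a hypothetical helper name used only for this sketch.
function parseXML(xmlString) {
  const doc = new DOMParser().parseFromString(xmlString, "application/xml");

  // DOMParser does not throw on malformed XML; it returns a document
  // that contains a <parsererror> element instead.
  if (doc.getElementsByTagName("parsererror").length > 0) {
    throw new Error("The XML string could not be parsed");
  }

  return doc;
}

// Usage in a browser:
// const theDocument = parseXML("<article><front><?covid19?></front></article>");
```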
Let’s say your XML document is the XMLDocument theDocument; you can create one by parsing the XML string with the DOMParser API.
NodeIterator API
Finding all processing instructions is possible using the NodeIterator API: theDocument.createNodeIterator, called with NodeFilter.SHOW_PROCESSING_INSTRUCTION as its whatToShow argument, visits only processing instructions. iteratorAll is a NodeIterator which shows all processing instructions. iteratorCOVID19 is a NodeIterator which shows all processing instructions with the name covid19, for example by means of an additional filter on the node’s name.
The TreeWalker API (using theDocument.createTreeWalker) is very similar to the NodeIterator API.
XPath Iterator API
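As a minimal sketch of the XPath iterator approach, the two helpers below wrap theDocument.evaluate. The function names are hypothetical; the numeric fallback 5 is the value of XPathResult.ORDERED_NODE_ITERATOR_TYPE and is only there so that the snippet also loads where XPathResult is not defined.

```javascript
// XPathResult.ORDERED_NODE_ITERATOR_TYPE has the numeric value 5.
const ORDERED_NODE_ITERATOR_TYPE =
  typeof XPathResult !== "undefined" ? XPathResult.ORDERED_NODE_ITERATOR_TYPE : 5;

// XPathResult iterator over all processing instructions in the document.
function createXPathAll(theDocument) {
  return theDocument.evaluate(
    "//processing-instruction()",
    theDocument, // context node
    null,        // no namespace resolver needed
    ORDERED_NODE_ITERATOR_TYPE,
    null         // do not reuse an existing XPathResult
  );
}

// XPathResult iterator over processing instructions named "covid19" only.
function createXPathCOVID19(theDocument) {
  return theDocument.evaluate(
    "//processing-instruction('covid19')",
    theDocument,
    null,
    ORDERED_NODE_ITERATOR_TYPE,
    null
  );
}

// Usage in a browser, given an XMLDocument theDocument:
// const xPathAll = createXPathAll(theDocument);
// const xPathCOVID19 = createXPathCOVID19(theDocument);
```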
Finding all processing instructions is also possible using XPath (using theDocument.evaluate). The results are XPathResults.
Explanation of the XPath syntax:
//: selects matching nodes anywhere in the document, at any depth
processing-instruction(): matches any processing instruction
processing-instruction('covid19'): matches processing instructions with the name covid19
The XPathResult.ORDERED_NODE_ITERATOR_TYPE is useful to ensure that the nodes get returned in document order. xPathAll is an XPathResult iterator which shows all processing instructions. xPathCOVID19 is an XPathResult iterator which shows all processing instructions with the name covid19.
Iteration helper
The two APIs are great in terms of browser support, but that means they’re old enough that they don’t have a modern iteration protocol.
But this is where a generator proves useful.
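A minimal sketch of such a generator could look as follows. The name consumeDOMIterator comes from the answer; the body is an assumption reconstructed from the behavior described in this section.

```javascript
// Yields every node from a NodeIterator or TreeWalker (nextNode) or from an
// XPathResult iterator (iterateNext). Anything else is treated as an iterable.
function* consumeDOMIterator(iterator) {
  const methodName =
    typeof iterator.nextNode === "function" ? "nextNode"
    : typeof iterator.iterateNext === "function" ? "iterateNext"
    : null;

  if (methodName === null) {
    // Neither method exists: defer to the default iteration protocol.
    yield* iterator;
    return;
  }

  let node;
  // Both DOM methods return null once there are no more results.
  while ((node = iterator[methodName]()) !== null) {
    yield node;
  }
}

// Usage in a browser, given a NodeIterator or XPathResult iterator:
// const allInstructions = Array.from(consumeDOMIterator(iteratorAll));
```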
The generator function consumeDOMIterator fully consumes all nodes found by either iterator. Since the NodeIterator API’s method to get the next result is called nextNode, and the XPath method is called iterateNext, this function checks which of these method names to use. If it can’t find the appropriate method, it defers to the default iteration protocol.
Then, a simple while loop repeatedly calls one of these methods and yields the results until null is returned. Now the function can be used to create an Array from the iterator.
Array.from can be used to achieve this easily, for example Array.from(consumeDOMIterator(iteratorCOVID19)). To check the existence of a processing instruction, simply check the Array’s length, or check whether iteratorCOVID19.nextNode() or xPathCOVID19.iterateNext() returns a Node.
Note that its name includes “consume” for a reason: once you start iterating over the API results using this function to create an Array, the state of the results changes. Once you reach the end, either iterator will be at the “end” of the document, so there is no next node. While the NodeIterator API has a previousNode method, the XPath Iterator API does not have a corresponding method; in general, iterators can only be iterated once.
XPath Snapshot API
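A minimal sketch of the snapshot variant, under the same assumptions as before: snapshotToArray is a hypothetical helper name, and the numeric fallback 7 is the value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, used only where XPathResult is not defined.

```javascript
// XPathResult.ORDERED_NODE_SNAPSHOT_TYPE has the numeric value 7.
const ORDERED_NODE_SNAPSHOT_TYPE =
  typeof XPathResult !== "undefined" ? XPathResult.ORDERED_NODE_SNAPSHOT_TYPE : 7;

// Evaluates an XPath expression and returns the matching nodes as an Array.
function snapshotToArray(theDocument, expression) {
  const snapshot = theDocument.evaluate(
    expression,
    theDocument,
    null,
    ORDERED_NODE_SNAPSHOT_TYPE,
    null
  );

  // A snapshot exposes snapshotLength and snapshotItem(index).
  return Array.from(
    { length: snapshot.snapshotLength },
    (_, index) => snapshot.snapshotItem(index)
  );
}

// Usage in a browser, given an XMLDocument theDocument:
// const covid19Nodes = snapshotToArray(theDocument, "//processing-instruction('covid19')");
// const exists = covid19Nodes.length > 0;
```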
Alternatively, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE may be used to get a more direct set of results that can be iterated over more easily. Now, since the XPathResult is a snapshot, the snapshotLength can be used in order to get all the snapshotItems. Again, Array.from can be used to achieve this easily. The only difference to the iterator approach is that the XPathResult does not change when the underlying document is mutated, whereas an iterator result becomes invalid once the document changes.
To check the existence of a processing instruction, simply check the Array’s length, or check whether snapshotCOVID19.snapshotItem(0) returns a Node.
Full code
This code snippet demonstrates in full how to get all processing instructions of the form <?covid19?> and, for example, get their nodeValue:
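A minimal sketch of such a snippet, combining the NodeIterator approach with the iteration helper. Apart from consumeDOMIterator, the names are assumptions, and the numeric fallbacks mirror the NodeFilter constants so that the snippet also loads outside a browser.

```javascript
// Fallback values equal NodeFilter.SHOW_PROCESSING_INSTRUCTION (0x40),
// NodeFilter.FILTER_ACCEPT (1) and NodeFilter.FILTER_SKIP (3).
const hasNodeFilter = typeof NodeFilter !== "undefined";
const SHOW_PROCESSING_INSTRUCTION = hasNodeFilter ? NodeFilter.SHOW_PROCESSING_INSTRUCTION : 0x40;
const FILTER_ACCEPT = hasNodeFilter ? NodeFilter.FILTER_ACCEPT : 1;
const FILTER_SKIP = hasNodeFilter ? NodeFilter.FILTER_SKIP : 3;

// Yields all nodes of a NodeIterator (nextNode) or XPathResult (iterateNext).
function* consumeDOMIterator(iterator) {
  const methodName =
    typeof iterator.nextNode === "function" ? "nextNode"
    : typeof iterator.iterateNext === "function" ? "iterateNext"
    : null;

  if (methodName === null) {
    yield* iterator;
    return;
  }

  let node;
  while ((node = iterator[methodName]()) !== null) {
    yield node;
  }
}

// Hypothetical helper: returns the nodeValue of every processing
// instruction with the given name found in the XML string.
function getProcessingInstructionValues(xmlString, name) {
  const theDocument = new DOMParser().parseFromString(xmlString, "application/xml");

  // Visit processing instructions only, and keep those whose target matches.
  const iterator = theDocument.createNodeIterator(theDocument, SHOW_PROCESSING_INSTRUCTION, {
    acceptNode: (node) => (node.target === name ? FILTER_ACCEPT : FILTER_SKIP),
  });

  return Array.from(consumeDOMIterator(iterator), (node) => node.nodeValue);
}

// Usage in a browser, with xmlString holding the XML from the question:
// const values = getProcessingInstructionValues(xmlString, "covid19");
// const exists = values.length > 0;
```

For a bare <?covid19?> without any data, the nodeValue is the empty string, so the existence check should rely on the Array’s length rather than on the values themselves.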