I have some annoying XML from an API response that looks like:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><report><QueryResult>
<ResumptionToken>123456</ResumptionToken>
<IsFinished>true</IsFinished>
<ResultXml>
<rowset xmlns="urn:schemas-microsoft-com:xml-analysis:rowset">
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:saw-sql="urn:saw-sql" targetNamespace="urn:schemas-microsoft-com:xml-analysis:rowset">
<xsd:complexType name="Row">
<xsd:sequence>
<xsd:element maxOccurs="1" minOccurs="1" name="Column0" saw-sql:aggregationRule="none" saw-sql:aggregationType="nonAgg" saw-sql:columnHeading="0" saw-sql:displayFormula="0" saw-sql:length="4" saw-sql:precision="12" saw-sql:scale="0" saw-sql:tableHeading="" saw-sql:type="integer" type="xsd:int"/>
<xsd:element maxOccurs="1" minOccurs="0" name="Column1" saw-sql:aggregationRule="none" saw-sql:aggregationType="nonAgg" saw-sql:columnHeading="ISBN" saw-sql:displayFormula="&quot;Bibliographic Details&quot;.&quot;ISBN&quot;" saw-sql:length="255" saw-sql:precision="255" saw-sql:scale="0" saw-sql:tableHeading="Bibliographic Details" saw-sql:type="varchar" type="xsd:string"/>
<xsd:element maxOccurs="1" minOccurs="0" name="Column2" saw-sql:aggregationRule="none" saw-sql:aggregationType="nonAgg" saw-sql:columnHeading="ISSN" saw-sql:displayFormula="&quot;Bibliographic Details&quot;.&quot;ISSN&quot;" saw-sql:length="255" saw-sql:precision="255" saw-sql:scale="0" saw-sql:tableHeading="Bibliographic Details" saw-sql:type="varchar" type="xsd:string"/>
<xsd:element maxOccurs="1" minOccurs="0" name="Column3" saw-sql:aggregationRule="none" saw-sql:aggregationType="nonAgg" saw-sql:columnHeading="Publication Date" saw-sql:displayFormula="&quot;Bibliographic Details&quot;.&quot;Publication Date&quot;" saw-sql:length="255" saw-sql:precision="255" saw-sql:scale="0" saw-sql:tableHeading="Bibliographic Details" saw-sql:type="varchar" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
</xsd:schema>
<Row>
<Column0>0</Column0>
<Column1>55555555 444444445</Column1>
<Column3>[2019]</Column3>
</Row>
<Row>
<Column0>0</Column0>
<Column1>555555555</Column1>
<Column3>©2009.</Column3>
</Row>
I’m using PHP’s SimpleXML to parse this data, but am struggling to access the column headers located in the non-default namespace under xsd:element. For example, I need to access the value: saw-sql:columnHeading="Publication Date", as this column can be dynamic and isn’t always "Publication Date". So I’m looking to pluck out the values for saw-sql[@columnHeading].
I’ve tried all manner of things: registering the namespaces for XPath, using attributes(), and so on. The closest I got was:
$ResponseXml->registerXPathNamespace('xsd','http://www.w3.org/2001/XMLSchema');
$elements = $ResponseXml->xpath('//xsd:element[@minOccurs]');
This did return the elements, along with their unprefixed attributes (such as minOccurs), but I need the saw-sql ones, and the same approach with:
$ResponseXml->registerXPathNamespace('saw-sql','urn:saw-sql');
$elements = $ResponseXml->xpath('//saw-sql:element[@columnHeading]');
does not get me any results.
3 Answers
Your XPath
//saw-sql:element[@columnHeading]
is looking for elements named element (in the saw-sql namespace) which have attributes named columnHeading (in no namespace), but the elements are actually in the xsd namespace, while the attributes are in the saw-sql namespace.
So I believe what you want is:
//xsd:element[@saw-sql:columnHeading]
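As a minimal sketch, selecting the xsd:element nodes that carry a saw-sql:columnHeading attribute might look like this with SimpleXML. The XML here is a trimmed stand-in for the response in the question, and its closing tags are an assumption because the original sample is cut off.

```php
<?php
// Trimmed, well-formed stand-in for the API response (closing tags assumed).
$xmlString = <<<'XML'
<report><QueryResult><ResultXml>
<rowset xmlns="urn:schemas-microsoft-com:xml-analysis:rowset">
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:saw-sql="urn:saw-sql">
<xsd:complexType name="Row"><xsd:sequence>
<xsd:element minOccurs="1" name="Column0" saw-sql:columnHeading="0"/>
<xsd:element minOccurs="0" name="Column1" saw-sql:columnHeading="ISBN"/>
<xsd:element minOccurs="0" name="Column3" saw-sql:columnHeading="Publication Date"/>
</xsd:sequence></xsd:complexType>
</xsd:schema>
</rowset>
</ResultXml></QueryResult></report>
XML;

$ResponseXml = simplexml_load_string($xmlString);

// Every prefix used in the expression has to be registered before xpath() is called.
$ResponseXml->registerXPathNamespace('xsd', 'http://www.w3.org/2001/XMLSchema');
$ResponseXml->registerXPathNamespace('saw-sql', 'urn:saw-sql');

$headings = [];
foreach ($ResponseXml->xpath('//xsd:element[@saw-sql:columnHeading]') as $element) {
    // attributes() with a namespace URI returns only the attributes in that namespace.
    $headings[] = (string) $element->attributes('urn:saw-sql')->columnHeading;
}

print_r($headings); // lists: 0, ISBN, Publication Date
```

The prefixes passed to registerXPathNamespace() are local aliases; only the namespace URIs have to match the document.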
FWIW, you could also use DOMDocument to parse it instead of SimpleXML.
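The original example for this answer is not preserved here, so the following is only a sketch of what a DOMDocument approach could look like, not the answerer's code. It uses the namespace-aware methods getElementsByTagNameNS() and getAttributeNS(), and the XML is a trimmed stand-in for the question's response with assumed closing tags.

```php
<?php
// Trimmed, well-formed stand-in for the API response (closing tags assumed).
$xmlString = <<<'XML'
<report><QueryResult><ResultXml>
<rowset xmlns="urn:schemas-microsoft-com:xml-analysis:rowset">
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:saw-sql="urn:saw-sql">
<xsd:complexType name="Row"><xsd:sequence>
<xsd:element minOccurs="1" name="Column0" saw-sql:columnHeading="0"/>
<xsd:element minOccurs="0" name="Column1" saw-sql:columnHeading="ISBN"/>
<xsd:element minOccurs="0" name="Column3" saw-sql:columnHeading="Publication Date"/>
</xsd:sequence></xsd:complexType>
</xsd:schema>
</rowset>
</ResultXml></QueryResult></report>
XML;

$document = new DOMDocument();
$document->loadXML($xmlString);

// Map each column's element name (Column0, Column1, ...) to its heading.
$headings = [];
$elements = $document->getElementsByTagNameNS('http://www.w3.org/2001/XMLSchema', 'element');
foreach ($elements as $element) {
    // "name" is in no namespace; "columnHeading" lives in the saw-sql namespace.
    $name = $element->getAttribute('name');
    $headings[$name] = $element->getAttributeNS('urn:saw-sql', 'columnHeading');
}

print_r($headings);
// Column0 => 0, Column1 => ISBN, Column3 => Publication Date
```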
SimpleXMLElement::attributes() allows you to access the attributes of a specific namespace by providing the namespace URI as a parameter.
But first I would suggest defining constants (or variables) for the namespaces that you are using. This will make your code a lot more readable and help avoid typos.
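A sketch of that idea, assuming hypothetical constant names (XMLNS_ROWSET, XMLNS_XSD, XMLNS_SAW_SQL) and a trimmed stand-in for the question's XML with assumed closing tags. It walks down to the schema with children() and reads each heading with attributes(), without any XPath.

```php
<?php
// Hypothetical constant names; the URIs are the ones declared in the response.
const XMLNS_ROWSET  = 'urn:schemas-microsoft-com:xml-analysis:rowset';
const XMLNS_XSD     = 'http://www.w3.org/2001/XMLSchema';
const XMLNS_SAW_SQL = 'urn:saw-sql';

// Trimmed, well-formed stand-in for the API response (closing tags assumed).
$xmlString = <<<'XML'
<report><QueryResult><ResultXml>
<rowset xmlns="urn:schemas-microsoft-com:xml-analysis:rowset">
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:saw-sql="urn:saw-sql">
<xsd:complexType name="Row"><xsd:sequence>
<xsd:element minOccurs="1" name="Column0" saw-sql:columnHeading="0"/>
<xsd:element minOccurs="0" name="Column1" saw-sql:columnHeading="ISBN"/>
<xsd:element minOccurs="0" name="Column3" saw-sql:columnHeading="Publication Date"/>
</xsd:sequence></xsd:complexType>
</xsd:schema>
</rowset>
</ResultXml></QueryResult></report>
XML;

$xml = simplexml_load_string($xmlString);

// children($uri) selects child elements in the given namespace.
$rowset   = $xml->QueryResult->ResultXml->children(XMLNS_ROWSET)->rowset;
$sequence = $rowset->children(XMLNS_XSD)->schema->complexType->sequence;

$headings = [];
foreach ($sequence->element as $element) {
    // attributes() without an argument returns the attributes in no namespace.
    $name = (string) $element->attributes()->name;
    // attributes($uri) returns the attributes in that namespace.
    $headings[$name] = (string) $element->attributes(XMLNS_SAW_SQL)->columnHeading;
}

print_r($headings);
```

Keying the result by the element name (Column0, Column1, ...) makes it straightforward to look up the heading for a column found in a Row later on.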
Be aware that "rowset" redefines the default namespace for itself and its descendant element nodes, so Row and the Column elements are not in the "empty/none" namespace.
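To illustrate the point about the redefined default namespace, here is a small sketch. The constant name XMLNS_ROWSET and the XPath prefix rs are assumptions chosen for the example, and the XML is a trimmed stand-in with assumed closing tags. In XPath 1.0 an unprefixed name only matches elements in no namespace, so the Row elements need a registered prefix.

```php
<?php
// Hypothetical constant name; the URI is the default namespace declared on rowset.
const XMLNS_ROWSET = 'urn:schemas-microsoft-com:xml-analysis:rowset';

// Trimmed, well-formed stand-in for the API response (closing tags assumed).
$xmlString = <<<'XML'
<report><QueryResult><ResultXml>
<rowset xmlns="urn:schemas-microsoft-com:xml-analysis:rowset">
<Row><Column0>0</Column0><Column1>55555555 444444445</Column1><Column3>[2019]</Column3></Row>
<Row><Column0>0</Column0><Column1>555555555</Column1><Column3>©2009.</Column3></Row>
</rowset>
</ResultXml></QueryResult></report>
XML;

$xml = simplexml_load_string($xmlString);
$xml->registerXPathNamespace('rs', XMLNS_ROWSET);

// Unprefixed: looks for Row in no namespace, which does not exist here.
$unprefixed = $xml->xpath('//Row');
// Prefixed: looks for Row in the rowset namespace.
$prefixed = $xml->xpath('//rs:Row');

$rows = [];
foreach ($prefixed as $row) {
    $values = [];
    // The Column elements inherit the rowset default namespace as well.
    foreach ($row->children(XMLNS_ROWSET) as $column) {
        $values[$column->getName()] = (string) $column;
    }
    $rows[] = $values;
}

printf("%d unprefixed, %d prefixed\n", count($unprefixed), count($prefixed));
print_r($rows);
```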
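The XPath Expression
The expression can be built up step by step.
Fetch any complexType element:
//xsd:complexType
Limit it to the one named "Row":
//xsd:complexType[@name="Row"]
Fetch the element children of its sequence:
//xsd:complexType[@name="Row"]/xsd:sequence/xsd:element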
The part in [] is a condition on the nodes returned by the preceding location path. So //foo[@bar] would return the foo element nodes that have a bar attribute, while //foo/@bar would return the bar attributes of all foo element nodes.
DOM
This solution would not look much different with DOM. The XPath processor is a separate object, and there are specific methods for working with namespaces (suffix "NS"). DOM is more specific and more powerful than SimpleXML.
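A rough sketch of the DOM variant, with DOMXPath as the separate XPath object. The constant names and the saw prefix are assumptions made for the example (the prefix is deliberately different from the one in the document, since only the URI has to match), and the XML is a trimmed stand-in with assumed closing tags. It also shows the difference between a predicate and an attribute step.

```php
<?php
// Hypothetical constant names; the URIs are the ones declared in the response.
const XMLNS_XSD     = 'http://www.w3.org/2001/XMLSchema';
const XMLNS_SAW_SQL = 'urn:saw-sql';

// Trimmed, well-formed stand-in for the API response (closing tags assumed).
$xmlString = <<<'XML'
<report><QueryResult><ResultXml>
<rowset xmlns="urn:schemas-microsoft-com:xml-analysis:rowset">
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:saw-sql="urn:saw-sql">
<xsd:complexType name="Row"><xsd:sequence>
<xsd:element minOccurs="1" name="Column0" saw-sql:columnHeading="0"/>
<xsd:element minOccurs="0" name="Column1" saw-sql:columnHeading="ISBN"/>
<xsd:element minOccurs="0" name="Column3" saw-sql:columnHeading="Publication Date"/>
</xsd:sequence></xsd:complexType>
</xsd:schema>
</rowset>
</ResultXml></QueryResult></report>
XML;

$document = new DOMDocument();
$document->loadXML($xmlString);

// The XPath processor is its own object and holds the prefix registrations.
$xpath = new DOMXPath($document);
$xpath->registerNamespace('xsd', XMLNS_XSD);
$xpath->registerNamespace('saw', XMLNS_SAW_SQL);

$base = '//xsd:complexType[@name="Row"]/xsd:sequence/xsd:element';

// Predicate: the result is still the list of xsd:element nodes.
$elements = $xpath->query($base . '[@saw:columnHeading]');

// Attribute step: the result is the list of attribute nodes themselves.
$headings = [];
foreach ($xpath->query($base . '/@saw:columnHeading') as $attribute) {
    $headings[] = $attribute->value;
}

// The "NS" methods take the namespace URI plus the local name.
$firstHeading = $elements->item(0)->getAttributeNS(XMLNS_SAW_SQL, 'columnHeading');

print_r($headings);
```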