I have an XML document with a default namespace declared on its root element, e.g.
<foo xmlns="http://www.example.com/ns/1.0">
...
</foo>
In reality this is a complex XML document that conforms to a complex schema. My job is to parse some data out of it. To help, I have a spreadsheet of XPath expressions. The expressions are rather deeply nested, e.g.
level1/level2/level3[@foo="bar"]/level4[@foo="bar"]/level5/level6[2]
The person who generated the XPath is an expert in the schema, so I am assuming that I can't simplify it or use object traversal shortcuts.
I am using SimpleXML to parse everything out. My problem has to do with how the default namespace gets handled.
Since there is a default namespace on the root element, I can't just do
$xml = simplexml_load_file($somepath);
$node = $xml->xpath('level1/level2/level3[@foo="bar"]/level4[@foo="bar"]/level5/level6[2]');
I have to register the namespace, assign it to a prefix, and then use that prefix on every step of my XPath, e.g.
$xml = simplexml_load_file($somepath);
$xml->registerXPathNamespace('myns', 'http://www.example.com/ns/1.0');
$node = $xml->xpath('myns:level1/myns:level2/myns:level3[@foo="bar"]/myns:level4[@foo="bar"]/myns:level5/myns:level6[2]');
Adding the prefixes isn't going to be manageable in the long run.
Is there a proper way to handle default namespaces without needing to use prefixes in XPath?
Using an empty prefix doesn't work ($xml->registerXPathNamespace('', 'http://www.example.com/ns/1.0');).
I can strip out the default namespace, e.g.
$xml = file_get_contents($somepath);
$xml = str_replace('xmlns="http://www.example.com/ns/1.0"', '', $xml);
$xml = simplexml_load_string($xml);
but that is skirting the issue.