I have two types of XML files that hold details of retro games. I’m trying to dynamically edit the files to search for and replace a game’s title.
I’m not sure if this is a problem that will need two distinct solutions or not but thought I would detail it here and, hopefully, you will be able to steer me towards a solution.
The first XML file has this type of structure:
<?xml version="1.0" standalone="yes"?>
<LaunchBox>
<Game>
<Status>Imported ROM</Status>
<DatabaseID>22190</DatabaseID>
<Title>ME-Title Match Pro Wrestling</Title>
<UseDosBox>false</UseDosBox>
<Version>(USA)</Version>
</Game>
<Game>
<Status>Imported ROM</Status>
<DatabaseID>30128</DatabaseID>
<Title>Skeet Shoot</Title>
<UseDosBox>false</UseDosBox>
<Version>(USA)</Version>
</Game>
<Game>
<Status>Imported ROM</Status>
<DatabaseID>28694</DatabaseID>
<Title>Star Strike</Title>
<UseDosBox>false</UseDosBox>
<Version>(USA)</Version>
</Game>
</LaunchBox>
The second file (the extension is .dat but it looks like XML inside) has this structure:
<?xml version="1.0"?>
<!DOCTYPE datafile PUBLIC "-//Logiqx//DTD ROM Management Datafile//EN">
<datafile>
<header>
<name>MatchingDATTest</name>
</header>
<game name="KeepAway (USA)">
<category>Games</category>
</game>
<game name="Super Football Another Game">
<category>Games</category>
</game>
<game name="River Raid II (USA)">
<category>Games</category>
</game>
<game name="London Blitz (USA)">
<category>Games</category>
</game>
</datafile>
I would like to be able to have code that basically says…
Find <Title>Star Strike</Title> and replace with <Title>Star Wars</Title>
Or, in the case of the other type of file…
Find <game name="KeepAway (USA)"> and replace with <game name="KeepAway 2 (USA)">
I wasn’t sure of the best way to approach this problem. I considered using fopen on the file and going through it line by line, but these files can get very large (up to 100,000 games), so I thought that might not be practical. After a lot of Googling and experimenting over the last few days, I cobbled together this code from other solutions to similar problems.
It was an attempt to deal with the first example of XML that I posted.
$reader = new XMLReader();
$reader->open($fileToEdit);
$document = new DOMDocument();
$xpath = new DOMXpath($document);
$found = false;
// look for the document element
do {
    $found = $found ? $reader->next() : $reader->read();
} while (
    $found &&
    $reader->localName !== 'LaunchBox'
);
// go to first child of the document element
if ($found) {
    $found = $reader->read();
}
// found a node at depth 1
while ($found && $reader->depth === 1) {
    // We need to check if we're dealing with an Element
    if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'Game')
    {
        // Let's inspect the node's content as well
        while ($reader->read())
        {
            if ($reader->nodeType == XMLReader::TEXT)
            {
                $thisValue = $reader->value;
                if( $thisValue == $gameTitleToSearchFor ){
                    $reader->value = $gameTitleToReplaceWith;
                    break 2;
                }
            }
        }
    }
    $found = $reader->next();
}
$dom->save($fileToEdit);
$reader->close();
But I get the error "Cannot write to read-only property", so I’m waving the white flag, as this is beyond my ability at the moment. Could anyone help?
2 Answers
If this is a one-off requirement then there are many ad-hoc ways of solving it; a global replace in an editor, preferably an XML editor, would do the job quite adequately. But if it’s part of a production workflow, then XSLT is definitely the right tool for the job. The main drawback is that there’s a bit of a learning curve, which means that if you’re a PHP programmer you might prefer to use the tools you know rather than learning something new.
XSLT version 1.0 is very widely available (it comes "out of the box") and is quite capable of simple jobs like this, though it’s a bit more verbose than the latest version, XSLT 3.0, which requires you to install a third-party library such as SaxonC (my company’s product).
You’ve given two examples of substitutions you want to perform and the simplest solution would be to write a separate XSLT stylesheet for each one. But if it’s a general problem requiring a general solution, and these are just two specific examples, then you could write a single generic stylesheet that takes as parameters the element name, the old text, and the replacement text. A lot depends on whether you want something quick and dirty, or something of production quality that will handle a wide variety of tasks and serve you well for years to come.
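A generic stylesheet of the kind described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the parameter names elementName, oldText and newText and the helper replaceElementText() are invented for this example, the stylesheet is embedded in a PHP string purely to keep the sketch self-contained, and it relies on PHP's xsl extension being enabled. The test sits inside the template body rather than in the match pattern because XSLT 1.0 does not permit variable references in match patterns.

```php
<?php
// Sketch only: elementName, oldText, newText and replaceElementText() are
// hypothetical names chosen for this example. Requires PHP's xsl extension.

$xsl = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="elementName"/>
  <xsl:param name="oldText"/>
  <xsl:param name="newText"/>

  <!-- Identity template: copy everything through unchanged by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Swap a text node only when its parent element and its content both match. -->
  <xsl:template match="text()" priority="1">
    <xsl:choose>
      <xsl:when test="local-name(..) = $elementName and . = $oldText">
        <xsl:value-of select="$newText"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
XSL;

function replaceElementText(string $xml, string $xsl, string $element, string $old, string $new): string
{
    $source = new DOMDocument();
    $source->loadXML($xml);

    $stylesheet = new DOMDocument();
    $stylesheet->loadXML($xsl);

    $processor = new XSLTProcessor();
    $processor->importStylesheet($stylesheet);

    // Pass the three values into the stylesheet's top-level parameters.
    $processor->setParameter('', 'elementName', $element);
    $processor->setParameter('', 'oldText', $old);
    $processor->setParameter('', 'newText', $new);

    $result = $processor->transformToXml($source);
    if (!is_string($result)) {
        throw new RuntimeException('XSLT transformation failed');
    }
    return $result;
}

// Cut-down version of the LaunchBox sample from the question.
$xml = '<LaunchBox><Game><Title>Star Strike</Title></Game>'
     . '<Game><Title>Skeet Shoot</Title></Game></LaunchBox>';

$output = replaceElementText($xml, $xsl, 'Title', 'Star Strike', 'Star Wars');
echo $output;
```

For a real file, the string passed to loadXML() would be replaced by a call to DOMDocument::load() on the file path, and the result written back with file_put_contents() or similar.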
Consider a parameterized XSLT using PHP’s XSLTProcessor class where you pass parameters from PHP to the stylesheet prior to transformation.
XSLT (save as .xsl, a special .xml file)
PHP
Should you need to pass multiple values for each XML document, update the method to receive equal-length arrays of find and replace values. Then add a for loop inside the method to call setParameter iteratively.
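As a hedged illustration of that array-based variant, the sketch below applies it to the attribute case from the question (the name attribute of game). The function replaceAttributeValues() and the parameter names attrName, oldValue and newValue are assumptions made for this example, not names from the listings above, and PHP's xsl extension must be enabled. Each pass of the loop sets the parameters again and feeds the previous result into the next transformation.

```php
<?php
// Sketch only: replaceAttributeValues() and the parameter names attrName,
// oldValue and newValue are hypothetical. Requires PHP's xsl extension.

$xsl = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="attrName"/>
  <xsl:param name="oldValue"/>
  <xsl:param name="newValue"/>

  <!-- Identity template: copy everything through unchanged by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Rewrite an attribute only when its name and value both match. -->
  <xsl:template match="@*" priority="1">
    <xsl:choose>
      <xsl:when test="local-name() = $attrName and . = $oldValue">
        <xsl:attribute name="{name()}">
          <xsl:value-of select="$newValue"/>
        </xsl:attribute>
      </xsl:when>
      <xsl:otherwise>
        <xsl:copy/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
XSL;

function replaceAttributeValues(DOMDocument $doc, string $xsl, string $attr, array $find, array $replace): DOMDocument
{
    if (count($find) !== count($replace)) {
        throw new InvalidArgumentException('find and replace arrays must be the same length');
    }

    $stylesheet = new DOMDocument();
    $stylesheet->loadXML($xsl);

    $processor = new XSLTProcessor();
    $processor->importStylesheet($stylesheet);
    $processor->setParameter('', 'attrName', $attr);

    // One transformation per find/replace pair; each pass starts from the
    // output of the previous one. Note that setParameter() cannot accept a
    // value containing both single and double quotes.
    for ($i = 0; $i < count($find); $i++) {
        $processor->setParameter('', 'oldValue', $find[$i]);
        $processor->setParameter('', 'newValue', $replace[$i]);
        $doc = $processor->transformToDoc($doc);
        if ($doc === false) {
            throw new RuntimeException('XSLT transformation failed');
        }
    }
    return $doc;
}

// Cut-down version of the .dat sample from the question (DOCTYPE omitted).
$doc = new DOMDocument();
$doc->loadXML(
    '<datafile><game name="KeepAway (USA)"><category>Games</category></game>'
    . '<game name="River Raid II (USA)"><category>Games</category></game></datafile>'
);

$result = replaceAttributeValues(
    $doc,
    $xsl,
    'name',
    ['KeepAway (USA)', 'River Raid II (USA)'],
    ['KeepAway 2 (USA)', 'River Raid 3 (USA)']
);
echo $result->saveXML();
```

Running one full transformation per pair is simple but rereads the whole tree each time, so for very large files with many replacements it may be worth batching. Also note that an XSLT transformation does not carry the source DOCTYPE into its output; the doctype-public and doctype-system attributes of xsl:output can re-emit one if the consumer of the .dat file needs it.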