
The following PHP script gives the count of elements in a single XML file in the folder uploads. However, I have a number of XML files in that folder. What should I modify in the script so that I get the result in tabular format, with the file name and element count for every XML file in the folder?

<?php
$doc = new DOMDocument;
$xml = simplexml_load_file("uploads/test.xml");
//file to SimpleXMLElement 
$xml = simplexml_import_dom($xml);
print("Number of elements: ".$xml->count());    
?>

2 Answers


  1. First, create a function with the logic you have:

    function getXML($path) {
        // simplexml_load_file() already returns a SimpleXMLElement (or false on failure),
        // so no DOMDocument or simplexml_import_dom() call is needed
        return simplexml_load_file($path);
    }
    

    Note that I:

    • have converted the path into a parameter, so you can reuse the same logic for your files
    • separated the parsing of XML from showing it
    • returned the XML itself, so you can get the count or you can do whatever else you may want with it

    This is how you can get the files of a given path:

    $files = array_diff(scandir('uploads'), array('.', '..'));
    

    This returns all entries except . and .., which are of no interest here. Read more about scandir here: https://www.php.net/manual/en/function.scandir.php

    scandir returns an array of filenames on success, so let's loop over it and perform the logic you need:

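    $xmls = [];
    foreach ($files as $file) {
        // str_ends_with() requires PHP 8.0 or later
        if (str_ends_with($file, '.xml')) {
            // file name, a tab, then the number of child elements
            $xmls[] = $file . "\t" . getXML('uploads/' . $file)->count();
        }
    }
    echo implode("\n", $xmls);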
    

    EDIT

    As @Juan kindly explained in the comment section, one can use

    $files = glob("./uploads/*.xml");
    

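    instead of scandir. That way the call to array_diff is no longer needed, and the if inside the loop can be dropped. Note that glob returns each path with the ./uploads/ prefix already included, so the path is passed to getXML as-is and basename is used for display:

    $xmls = [];
    foreach ($files as $file) {
        $xmls[] = basename($file) . "\t" . getXML($file)->count();
    }
    echo implode("\n", $xmls);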
    
  2. You first load the XML file into a SimpleXMLElement and then pass it to simplexml_import_dom(), which is meant for converting a DOM node into a SimpleXMLElement. The count() method exists only on SimpleXMLElement, not on DOMElement, so neither the import nor the DOMDocument instance is necessary.
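
    A minimal sketch of that point, using a made-up inline XML string rather than a file from the question: simplexml_load_string() (like simplexml_load_file()) already returns a SimpleXMLElement, so count() can be called on the result directly.

    ```php
    <?php
    // Hypothetical sample document, not taken from the question.
    $xml = simplexml_load_string('<list><item/><item/><item/></list>');

    // count() returns the number of child elements of the document element.
    $childCount = $xml->count();

    echo "Number of elements: " . $childCount . PHP_EOL;
    ```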

    You can use a GlobIterator to iterate the files:

    $directory = __DIR__.'/uploads';
    
    // get an iterator for the XML files
    $files = new GlobIterator(
      $directory.'/*.xml', FilesystemIterator::CURRENT_AS_FILEINFO
    );
    
    $results = [];
    foreach ($files as $file) {
      // load file using absolute file path 
      // the returned SimpleXMLElement wraps the document element node
      $documentElement = simplexml_load_file($file->getRealPath());
      $results[] = [
        // file name without path
        'file' => $file->getFilename(),
        // "SimpleXMLElement::count()" returns the number of children of an element
        'item-count' => $documentElement->count(),
      ];
    }
    
    var_dump($results);
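
    Since the question asks for tabular output, here is one possible way to print rows shaped like $results as a plain-text table. This is only a sketch: renderTable is a hypothetical helper name, and the sample rows are made up so the snippet runs without any files.

    ```php
    <?php
    // Hypothetical helper: formats rows shaped like the $results array
    // (keys 'file' and 'item-count') as a plain-text table with a header line.
    function renderTable(array $results): string
    {
        // The file column is as wide as its longest entry (or the header).
        $width = strlen('File');
        foreach ($results as $row) {
            $width = max($width, strlen($row['file']));
        }

        $lines = [str_pad('File', $width) . '  Elements'];
        foreach ($results as $row) {
            $lines[] = str_pad($row['file'], $width) . '  ' . $row['item-count'];
        }

        return implode(PHP_EOL, $lines) . PHP_EOL;
    }

    // Sample rows standing in for real results (made-up names and counts).
    $sample = [
        ['file' => 'a.xml', 'item-count' => 3],
        ['file' => 'books.xml', 'item-count' => 12],
    ];

    echo renderTable($sample);
    ```

    For output in a browser you would probably build an HTML table instead and pass the file names through htmlspecialchars().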
    

    With DOM you can use Xpath to fetch specific values from the XML.

    $directory = __DIR__.'/uploads';
    
    // get an iterator for the XML files
    $files = new GlobIterator(
      $directory.'/*.xml', FilesystemIterator::CURRENT_AS_FILEINFO
    );
    
    // only one document instance is needed
    $document = new DOMDocument();
    
    $results = [];
    foreach ($files as $file) {
      // load the file into the DOM document
      $document->load($file->getRealPath());
      // create an Xpath processor for the loaded document
      $xpath = new DOMXpath($document);
      $results[] = [
        'file' => $file->getFilename(),
        // use an Xpath expression to fetch the value
        'item-count' => $xpath->evaluate('count(/*/*)'),
      ];
    }
    
    var_dump($results);
    

    The Xpath Expression

    • Get the document element /*
    • Get the child elements of the document element /*/*
    • Count them count(/*/*)

    * is a wildcard that matches any element node. If you can, be more specific and use the actual element names (e.g. /list/item).
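
    To illustrate the difference, here is a small sketch on a made-up inline document. The element names list, item and note are assumptions chosen to match the /list/item example, not names from the asker's files.

    ```php
    <?php
    // Hypothetical sample document with two "item" children and one "note".
    $document = new DOMDocument();
    $document->loadXML('<list><item/><note/><item/></list>');

    $xpath = new DOMXPath($document);

    // The wildcard counts every child element of the document element.
    $allChildren = $xpath->evaluate('count(/*/*)');
    // The specific path counts only "item" children of a "list" root.
    $itemsOnly = $xpath->evaluate('count(/list/item)');

    var_dump($allChildren, $itemsOnly);
    ```

    XPath's count() yields a number, which PHP hands back as a float, so cast it with (int) if you need an integer.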
