
I’m looking for an XML library suitable for batch processing: I’d like to limit how many records are read from an XML file and set an offset from which to start reading. I searched GitHub but could not find anything relevant, though my search terms may not have been accurate enough. Some libraries don’t even have documentation, so it’s hard to tell right off the bat what they support. Maybe some of you have already used one that does just that and could share your findings.

Any help is appreciated.

2 Answers


  1. Take a look at XMLReader, which works more like a cursor reading through the XML, and which you can terminate and close whenever you want.

    For example, given this file:

    <building_data>
      <building address="some address" lat="28.902914" lng="-71.007235" />
      <building address="some address" lat="48.892342" lng="-75.0423423" />
      <building address="some address" lat="58.929753" lng="-79.1236987" />
    </building_data>
    

    you can read it like this:

    $reader = new XMLReader();
    
    if (!$reader->open("data.xml")) {
        die("Failed to open 'data.xml'");
    }
    
    while($reader->read()) {
      if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'building') {
        $address = $reader->getAttribute('address');
        $latitude = $reader->getAttribute('lat');
        $longitude = $reader->getAttribute('lng');
    
        // abort at the first "building" node
        break;
      }
    }
    
    $reader->close();
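
    The same cursor loop can apply the offset and limit asked about in the question by counting matching elements. The following is only a sketch under assumptions: `readBuildings()` and its parameters are hypothetical names, and the XML is passed as a string (via `XMLReader::XML()`) so the example is self-contained.

    ```php
    <?php
    // Hypothetical helper: returns at most $limit <building> records,
    // skipping the first $offset matching elements.
    function readBuildings(string $xml, int $offset, int $limit): array
    {
        $reader = new XMLReader();
        if (!$reader->XML($xml)) {
            throw new RuntimeException('Failed to load XML');
        }

        $records = [];
        $index = 0; // zero-based position of the current <building>

        // stop reading as soon as the batch is full
        while (count($records) < $limit && $reader->read()) {
            if ($reader->nodeType !== XMLReader::ELEMENT || $reader->name !== 'building') {
                continue;
            }
            if ($index++ < $offset) {
                continue; // still before the requested window
            }
            $records[] = [
                'address' => $reader->getAttribute('address'),
                'lat' => $reader->getAttribute('lat'),
                'lng' => $reader->getAttribute('lng'),
            ];
        }

        $reader->close();
        return $records;
    }

    // Same sample data as above, inlined as a string.
    $xml = '<building_data>'
        . '<building address="some address" lat="28.902914" lng="-71.007235" />'
        . '<building address="some address" lat="48.892342" lng="-75.0423423" />'
        . '<building address="some address" lat="58.929753" lng="-79.1236987" />'
        . '</building_data>';

    // skip one record, then read one record
    $batch = readBuildings($xml, 1, 1);
    ```

    Because the loop stops once the batch is full, the rest of the file is never parsed. For a real file, `$reader->open("data.xml")` would replace the `XML()` call.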
    
  2. XMLReader is mostly about reducing memory consumption, but it also has XMLReader::next(), which lets you iterate over a list of sibling elements. Combined with a counter, that should be efficient and still maintainable.

    Once you have moved to the right element, you can read the data using the XMLReader methods or expand the element into DOM. The right balance depends on how complex the structure of the "list item" elements is.

    $xmlUri = 'books.xml';
    
    $reader = new XMLReader();
    $reader->open($xmlUri);
    
    $document = new DOMDocument();
    $xpath = new DOMXpath($document);
    
    // look for the first book element
    while ($reader->read() && $reader->localName !== 'book') {
      continue;
    }
    
    $start = 50;
    $end = $start + 10;
    $offset = 0;
    
    // while the reader is on a book element and the window is not exhausted
    while ($reader->localName === 'book' && $offset < $end) {
      // only process books inside the window, i.e. $start <= $offset < $end
      if ($offset >= $start) {
        // expand to DOM and read data
        $book = $reader->expand($document);
        var_dump(
          $xpath->evaluate('string(title/@isbn)', $book),
          $xpath->evaluate('string(title)', $book)
        );
      }
      // increment offset counter (also for skipped books)
      $offset++;
      // move to the next book sibling
      $reader->next('book');
    }
    $reader->close();
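
    If the batch logic is needed in more than one place, the `next()` and `expand()` combination could be wrapped in a generator. This is a minimal sketch, not a definitive implementation: `readBookBatch()` is a hypothetical helper name, the sample XML is invented for the example, and it assumes the `book` elements are siblings in one flat list.

    ```php
    <?php
    // Hypothetical helper (not part of XMLReader): lazily yields up to $limit
    // <book> elements as DOM nodes, skipping the first $offset of them.
    function readBookBatch(string $xml, int $offset, int $limit): Generator
    {
        $reader = new XMLReader();
        if (!$reader->XML($xml)) {
            throw new RuntimeException('Failed to load XML');
        }
        $document = new DOMDocument();

        try {
            // advance to the first book element
            while ($reader->read() && $reader->localName !== 'book') {
                continue;
            }
            $index = 0;
            while ($reader->localName === 'book' && $index < $offset + $limit) {
                if ($index >= $offset) {
                    // hand the current book to the caller as a DOM node
                    yield $reader->expand($document);
                }
                $index++;
                // move to the next book sibling
                $reader->next('book');
            }
        } finally {
            // runs even if the caller stops iterating early
            $reader->close();
        }
    }

    // Sample data invented for this sketch: five books titled t0 .. t4.
    $xml = '<books>';
    for ($i = 0; $i < 5; $i++) {
        $xml .= "<book><title isbn=\"$i\">t$i</title></book>";
    }
    $xml .= '</books>';

    // skip one book, then collect the titles of the next two
    $titles = [];
    foreach (readBookBatch($xml, 1, 2) as $book) {
        $titles[] = $book->getElementsByTagName('title')->item(0)->textContent;
    }
    ```

    The caller only sees DOM nodes for the requested window, so paging through a large file amounts to calling the helper with a different offset. Each call re-reads from the start of the document, which is the trade-off of a forward-only cursor.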
    