I’m editing a database through phpMyAdmin that stores several HTML pages, and I would like to remove, from all of these pages, every <div> (or other tag) that has a certain class or id.

Example:

Case 1

<div class="undesirable">
  <div class="container">
    <div class="row">
      <div class="col1"></div> 
    </div>
   </div>
</div>

Case 2

<div class="undesirable">
  <div class="container">
    <div class="row">
      <div class="col1"></div>
      <div class="col2"></div> 
    </div>
   </div>
</div>

I would like to remove every <div> that has class="undesirable". In some cases the class may also appear as class="pre_undesirable", or something similar.

Initially I thought of using regex, but because the HTML varies from page to page the patterns keep breaking: there is no way to know where the matching <div> ends.
An HTML parser is probably the answer, but I can’t understand how to use one. Any indication of where to start?

2 Answers


  1. We can use D3.js to remove or append HTML elements by class name or id.
    Use select() and selectAll() to select particular elements in the page. If you want to add a div, use append('div') to insert it.

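    <script>
      // Runs in the browser: removes the elements from the page that is
      // currently loaded, not from the HTML stored in the database.
      function remove() {
        // [class*="undesirable"] matches any class attribute containing
        // "undesirable", so "pre_undesirable" is matched as well.
        d3.selectAll('[class*="undesirable"]').remove();
      }
    </script>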
    
  2. Since you are dealing with HTML, you should probably use an HTML parser and search for the removal targets with XPath. To demonstrate, I’ll change your HTML a bit:

    $original= 
    '<html><body>
    <div class="undesirable">
      <div class="container">
        <div class="row">
          <div class="col1"></div> 
        </div>
       </div>
    </div>
    <div class="keepme">
      <div class="container">
        <div class="row">
          <div class="col1"></div>
          <div class="col2"></div> 
        </div>
       </div>
    </div>
    
    <div class="pre_undesirable">
      <div class="container">
        <div class="row">
          <div class="col1"></div>
          <div class="col2"></div> 
        </div>
       </div>
    </div>
    <div class="keepme">
      <div class="container">
        <div class="row">
          <div class="col1"></div>
          <div class="col2"></div> 
        </div>
       </div>
    </div>
    </body>
    </html>
    ';
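    // Parse the markup into a DOM tree
    $HTMLDoc = new DOMDocument();
    $HTMLDoc->loadHTML($original);
    $xpath = new DOMXPath($HTMLDoc);

    // Select every <div> whose class attribute contains "undesirable"
    // (a substring match, so "pre_undesirable" is selected too)
    $targets = $xpath->query('//div[contains(@class,"undesirable")]');
    foreach ($targets as $target) {
        // Detach the element together with all of its descendants
        $target->parentNode->removeChild($target);
    }
    echo $HTMLDoc->saveHTML();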
    

    The output should include only the two "keepme" <div>s.
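
    Since the pages in the question live in database rows and may be fragments rather than complete documents, the same idea can be wrapped in a small function. What follows is only a sketch under stated assumptions: the function name removeElementsMatching, its signature, and the wrapper id __fragment_root__ are made up for illustration, and the encoding hint is a commonly used workaround that should be checked against your PHP/libxml version. It matches on class or id, as the question asks, and does the substring test in PHP so the search string never has to be quoted inside an XPath expression.

    ```php
    <?php
    // Hypothetical helper: the name, signature and wrapper id are illustrative only.
    // Removes every element whose class or id attribute contains $needle.
    function removeElementsMatching(string $html, string $needle): string
    {
        if ($needle === '') {
            return $html; // nothing to search for
        }
        $rootId = '__fragment_root__'; // assumed not to occur in the stored pages

        // Collect parser warnings instead of printing them; stored markup is rarely perfect.
        $previous = libxml_use_internal_errors(true);

        $doc = new DOMDocument();
        // Wrap the fragment so it can be pulled back out afterwards. The leading
        // declaration is a common workaround that hints UTF-8 input to loadHTML().
        $doc->loadHTML('<?xml encoding="UTF-8"><div id="' . $rootId . '">' . $html . '</div>');

        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        $xpath = new DOMXPath($doc);
        $root = $xpath->query('//div[@id="' . $rootId . '"]')->item(0);

        // The XPath result is a snapshot, so removing nodes while looping is safe.
        foreach ($xpath->query('.//*[@class or @id]', $root) as $node) {
            $haystack = $node->getAttribute('class') . ' ' . $node->getAttribute('id');
            if (strpos($haystack, $needle) !== false && $node->parentNode !== null) {
                // Detach the element together with all of its descendants
                $node->parentNode->removeChild($node);
            }
        }

        // Serialize only the children of the wrapper, dropping the wrapper itself.
        $clean = '';
        foreach ($root->childNodes as $child) {
            $clean .= $doc->saveHTML($child);
        }
        return $clean;
    }

    // Usage: clean one stored page.
    $page = '<div class="undesirable"><p>x</p></div><div class="keepme"><p>y</p></div>';
    echo removeElementsMatching($page, 'undesirable');
    ```

    To process the whole table, you could read each row, pass the HTML column through the function, and write the result back with an UPDATE. Export a backup from phpMyAdmin first, because the parser may also normalize markup it considers invalid.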
