I'm changing a database using phpMyAdmin that has several HTML pages stored in it, and I would like to remove, from all these pages, all the <div> and other tags that contain a certain class or id.
Example:
Case 1
<div class="undesirable">
<div class="container">
<div class="row">
<div class="col1"></div>
</div>
</div>
</div>
Case 2
<div class="undesirable">
<div class="container">
<div class="row">
<div class="col1"></div>
<div class="col2"></div>
</div>
</div>
</div>
I would like to remove every <div> that has class="undesirable", together with everything nested inside it. In some cases the class may instead appear as class="pre_undesirable", or something similar.
Initially I thought of using regex, but because the HTML varies from page to page, the code breaks: there is no way for the pattern to know where the matching <div> ends.
Possibly the answer would be an HTML parser, but I can't understand how to use one. Any indication of where to start?
2 Answers
We can use D3.js to remove or append HTML elements by class name or id.
Use select() and selectAll() to select the particular elements in the HTML, then call remove() on the selection to delete them. If you want to add a div instead, use append('div') to insert it for the data.
Since you are dealing with HTML, you should probably use an HTML parser and search for the removal targets using XPath. To demonstrate, I'll change your HTML a bit:
The output should include only the two "keep me" <div>s.
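As a rough illustration of the parser-based idea, here is a minimal sketch in Python. It uses the standard-library html.parser module rather than XPath, so it is a swapped-in technique and not the method described above. The sample markup, the ClassStripper class and the strip_by_class function are all made up for this example. The sketch assumes the unwanted elements are identified by a substring of their class attribute, and it finds the end of each one by counting nested tags of the same name, which is the part a regex cannot do.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they cannot "contain" anything.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class ClassStripper(HTMLParser):
    """Copy HTML through, dropping elements whose class contains `needle`."""

    def __init__(self, needle):
        super().__init__(convert_charrefs=False)
        self.needle = needle
        self.out = []         # pieces of HTML to keep
        self.skip_tag = None  # tag name of the element currently being dropped
        self.depth = 0        # open tags of that name inside the dropped element

    def _matches(self, attrs):
        # Substring match (an assumption), so "pre_undesirable" is caught too.
        return self.needle in (dict(attrs).get("class") or "")

    def handle_starttag(self, tag, attrs):
        if self.skip_tag:
            if tag == self.skip_tag:
                self.depth += 1
        elif self._matches(attrs):
            if tag not in VOID:  # a void element is dropped on its own
                self.skip_tag, self.depth = tag, 1
        else:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never open a nested level.
        if not self.skip_tag and not self._matches(attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_tag:
            if tag == self.skip_tag:
                self.depth -= 1
                if self.depth == 0:
                    self.skip_tag = None
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_tag:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_tag:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_tag:
            self.out.append(f"&#{name};")

    def handle_comment(self, data):
        if not self.skip_tag:
            self.out.append(f"<!--{data}-->")


def strip_by_class(html, needle="undesirable"):
    parser = ClassStripper(needle)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


# Hypothetical sample, loosely modelled on the question's two cases.
sample = (
    '<div class="keep">keep me</div>'
    '<div class="undesirable"><div class="container"><div class="row">'
    '<div class="col1"></div></div></div></div>'
    '<div class="pre_undesirable"><p>gone</p></div>'
    '<div class="keep">keep me</div>'
)
print(strip_by_class(sample))
```

Counting only tags with the same name as the dropped element keeps the sketch tolerant of unclosed <p> or <li> tags inside it, but it still assumes the <div>s themselves are balanced. To clean the database, something like strip_by_class would have to be run over the HTML column of each row and the result written back; that part depends on the schema and is not shown here.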