Extracting <h2> title text from HTML where the title text might include newlines
I have an HTML file with some <h2> tags, such as:

a <- '<section id="sec-standard-stoet-geary" class="level2" data-number="9.4"> <h2 data-number="9.4" class="anchored" data-anchor-id="sec-standard-stoet-geary"> <span class="header-section-number">9.4</span> Standardising PISA results</h2>'

b <- '<span class="fu">read_parquet</span>(<span class="st">"<folder>PISA_2015_student_subset.parquet"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre> </div> </div> </section><section id="sec-leftjoin"…