Convert headings to a ul list with PHP

behroz
September 3, 2023
199 views
0 votes
2 Answers

I am learning PHP.
I want to show a table of contents for an article: convert the headings (h2, h3, h4, ...) into a list and create links to them.
This is my PHP code.

$Post = '
<h2>Title 01</h2>
<h3>Title 01.01</h3>
<h3>Title 01.02</h3>
<h2>Title 02</h2>
<h3>Title 02.02</h3>
';

$c = 1;
$r = preg_replace_callback('~<h*([^>]*)>~i', function($res) use (&$c){
    return '<li><a id="#id'.$c++.'">'.$res[1].'</a></li>';
}, $Post);
$Post = $r;


echo '<ul>';
echo $Post;
echo '</ul>';

The code above does not work correctly. It produces the output below.

<ul>
<li><a id="#id1">2</a></li>Title 01<li><a id="#id2">/h2</a></li>
<li><a id="#id3">3</a></li>Title 01.01<li><a id="#id4">/h3</a></li>
<li><a id="#id5">3</a></li>Title 01.02<li><a id="#id6">/h3</a></li>
<li><a id="#id7">2</a></li>Title 02<li><a id="#id8">/h2</a></li>
<li><a id="#id9">3</a></li>Title 02.02<li><a id="#id10">/h3</a></li>
</ul>

I know that the PHP code is written incorrectly, but I want the output to look like this:

<ul>
<li><a href="#id1">Title 01</a></li>
<li><a href="#id2">Title 01.01</a></li>
<li><a href="#id3">Title 01.02</a></li>
<li><a href="#id4">Title 02</a></li>
<li><a href="#id5">Title 02.02</a></li>
</ul>

Answers

- Simon
- September 3, 2023 at 1:19 am
- 0 votes
Your regular expression is needlessly complex.

You could just use <h.>(.*)</h.> to correctly match what you are trying to match.

I added it to your snippet above to show your desired result:
```
$post = '
<h2>Title 01</h2>
<h3>Title 01.01</h3>
<h3>Title 01.02</h3>
<h2>Title 02</h2>
<h3>Title 02.02</h3>
';

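$c = 1; // running counter for the generated anchor targets
// Replace each single-line heading with a list item linking to #idN.
$list_elements = preg_replace_callback('~<h.>(.*)</h.>~i', function($res) use (&$c){
    // $res[1] is the heading text; href (not id) is what makes the anchor a link
    return '<li><a href="#id'.$c++.'">'.$res[1].'</a></li>';
}, $post);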


echo '<ul>';
echo $list_elements;
echo '</ul>';
```
That said, as suggested in the comments, you should probably use an HTML parser if this turns into anything more than a toy example. Beyond that point, regular expressions are almost always a sure way to shoot yourself in the foot.

- HorusKol
- September 3, 2023 at 1:59 am
- 0 votes
Your regex is wrong for what you’re trying to do:
```
~<h*([^>]*)>~i
```
<h* matches an angle bracket followed by zero or more h’s, so your regex matches everything between each <> pair, including closing tags (</…>).
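To see this concretely, here is a small sketch that runs the pattern from the question against a single heading and collects what it captures. The variable names are only illustrative.

```php
<?php
// Run the question's pattern against one heading.
$html = '<h2>Title 01</h2>';

preg_match_all('~<h*([^>]*)>~i', $html, $matches);

// $matches[0] holds the full matches, $matches[1] the captured groups.
// Both the opening tag and the closing tag are matched, which is why
// the original output contains "2" and "/h2" as link texts.
print_r($matches[0]);
print_r($matches[1]);
```

The captured groups are the tag contents rather than the heading text, which sits between the two matches and is left untouched.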

You could do this to extract the titles from your headings:
```
~<h[1-6]>([^<]*)</h[1-6]>~i
```
But those links need to target the IDs in the headings, so you need to do this to extract them:
```
~<h[1-6] id="([^"]*)">([^<]*)</h[1-6]>~i
```
But what if you’ve got other attributes on the heading?
```
~<h[1-6][^>]*?(id="([^"]*)"[^>]*)?>([^<]*)</h[1-6]>~i
```
Or markup inside the heading?
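For instance, this sketch (with a made-up heading) suggests how quickly a title-capturing pattern built on [^<]* breaks once the heading contains an inline tag:

```php
<?php
// Hypothetical heading with inline markup.
$html = '<h2>Title <em>01</em></h2>';

// [^<]* stops at the first "<", so it cannot cross the <em> tag
// and the pattern fails to match this heading at all.
$found = preg_match('~<h[1-6]>([^<]*)</h[1-6]>~i', $html, $m);

var_dump($found); // int(0)
```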

Regex is not a great way to parse HTML. It is a powerful tool, and it is possible to use it for this, but there are better ways.
```
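// Assumes $post holds the article HTML, that each heading already has an id,
// and that the markup contains a <nav><ul></ul></nav> to fill.
$doc = new DOMDocument();
$doc->loadHTML($post);

$xpath = new DOMXPath($doc);

$headings = $xpath->query('html/body//*[self::h1 or self::h2 or self::h3]');

// query() returns a DOMNodeList, so take the first matching <ul>
// (item(0) is null if the document has no <nav><ul>).
$nav = $xpath->query('html/body//nav/ul')->item(0);

foreach ($headings as $heading) {
  $link = $doc->createElement('a');
  $link->setAttribute('href', '#' . $heading->getAttribute('id'));
  $link->textContent = $heading->textContent;

  // appendChild() returns the appended child, so build the <li> first
  // and then attach it to the list.
  $item = $doc->createElement('li');
  $item->appendChild($link);
  $nav->appendChild($item);
}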
```
I’ve assumed that each heading already has an id, that the document contains a <nav><ul> to fill, and that there is no markup inside the headings. Only a couple of changes are needed to copy inner markup if necessary.
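The headings in the question have no id attributes, so here is a minimal sketch of the same DOM approach that generates the ids (id1, id2, ...) itself and builds a standalone list. The names $list, $item and $toc are illustrative, and the detached <ul> is an assumption about where the list should end up.

```php
<?php
$post = '
<h2>Title 01</h2>
<h3>Title 01.01</h3>
<h3>Title 01.02</h3>
<h2>Title 02</h2>
<h3>Title 02.02</h3>
';

$doc = new DOMDocument();
$doc->loadHTML($post);

$xpath = new DOMXPath($doc);
// A single location path returns the headings in document order.
$headings = $xpath->query('//*[self::h2 or self::h3 or self::h4]');

$list = $doc->createElement('ul');
$c = 1;

foreach ($headings as $heading) {
    // Give the heading an id so the link has something to point at.
    $id = 'id' . $c++;
    $heading->setAttribute('id', $id);

    $link = $doc->createElement('a');
    $link->setAttribute('href', '#' . $id);
    $link->appendChild($doc->createTextNode($heading->textContent));

    $item = $doc->createElement('li');
    $item->appendChild($link);
    $list->appendChild($item);
}

// Serialize only the generated list.
$toc = $doc->saveHTML($list);
echo $toc, "\n";
```

The headings inside $doc now carry the matching ids, so the article body would also need to be serialized from $doc (for example with saveHTML() on the body element) for the links to resolve.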
