
Hi, I have a markdown text like the one below and I want to slice it into H2 title and H2 content.

## **Intro**


* bla bla
* bla bla bla


## Tortilla

* chico
* chica

### 1. sub-section

* and another bla.

The regex result should be:

title 1:

Intro

content 1:

* bla bla
* bla bla bla

title 2:

Tortilla

content 2:

* chico
* chica

### 1. sub-section

* and another bla.

I tried with this regex:

/^## (?<title>.*)(?<content>.*(?:\n(?!##).*)*)/gm

But it doesn’t catch the sub-section content.

Can someone help please?

2 Answers


  1. The regex below produces the output desired from the input you have given.

    ^##\s(?<title>.*)\n(?<content>(?:(?!##\s).*\n?)+)
    

    Explanation of most pieces:

    | piece | description |
    | --- | --- |
    | `^##\s` | match lines that begin with H2 |
    | `(?<title>.*)\n` | capture everything after the H2 and discard the newline |
    | `(?:(?!##\s).*\n?)+` | match all lines that don’t start with H2, including any newlines |

    The primary issue with your attempt is that you try to account for the newline after the title by including it in content, when it just needs to be discarded.

    The secondary issue is that your negative lookahead `(?!##)` has no differentiator for lines that start with ## but are not H2, so it also rejects the `###` line and the match stops there. Requiring whitespace after the ## (`##\s`) restricts the lookahead to H2 lines.

    Note: the optional newline (`\n?`) at the end of content is required when the input does not end with a newline.
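
    As a minimal sketch of how the pattern above could be applied in JavaScript: the `g` and `m` flags, the variable names `h2Pattern` and `sections`, and the use of `matchAll` are assumptions of this example, not part of the answer. The input is built with `join("\n")` so that it has no trailing newline, which exercises the optional `\n?`.

    ```javascript
    // Assumed flags: g (find every section) and m (let ^ match at each line start).
    const h2Pattern = /^##\s(?<title>.*)\n(?<content>(?:(?!##\s).*\n?)+)/gm;

    // Sample input from the question, joined without a trailing newline.
    const markdown = [
      "## **Intro**",
      "",
      "* bla bla",
      "* bla bla bla",
      "",
      "## Tortilla",
      "",
      "* chico",
      "* chica",
      "",
      "### 1. sub-section",
      "",
      "* and another bla.",
    ].join("\n");

    // One match per H2 section; the named groups carry title and content.
    const sections = [...markdown.matchAll(h2Pattern)].map((match) => ({
      title: match.groups.title,
      content: match.groups.content.trim(), // drop surrounding blank lines
    }));

    console.log(sections);
    ```

    Note that the `title` group captures the heading text as written, so the first title still carries its bold markers (`**Intro**`); removing them would be a separate clean-up step.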

  2. I would choose an approach based on a simple regex like `/^##\s(.*)/gm`, which is used to split the markdown string. The resulting array is then reduced into the final result: an array with one item per H2 section, each item consisting of a sanitized title value and a likewise sanitized content value.

    const markdown = `
    
    # Main Topic
    
    maintopic content
    
    maintopic content
    
    
    ## **Intro**
    
    
    * bla bla
    * bla bla bla
    
    
    ## Tortilla
    
    * chico
    * chica
    
    ### 1. sub-section
    
    * and another bla.`;
    
    
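    const result = markdown

      // see ... [https://regex101.com/r/WQrkqS/1]
      // split at every H2 line; the capture group keeps each title in the array
      .split(/^##\s(.*)/gm)
      // drop the text before the first H2, leaving [title, content, title, content, ...]
      .slice(1)

      .reduce((result, title, idx, arr) => {
        // even indices hold titles, the following odd index holds the content
        if (idx % 2 === 0) {
          // see ... [https://regex101.com/r/WQrkqS/2]
          // strips leading and trailing characters that are neither letters nor numbers
          const regXTrimNonNumbersLetters = /^[^\p{L}\p{N}]+|[^\p{L}\p{N}]+$/gu;

          result
            .push({
              title: title.replace(regXTrimNonNumbersLetters, '').trim(),
              content: arr[idx + 1].trim(),
            });
        }
        return result;
      }, []);

    console.log({ result });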