JavaScript - Regex to match markdown headings and text nested under specific heading

lukaskupfing
September 29, 2023
2 Answers

I am using Obsidian (which uses ECMAScript) with the Obsidian_to_Anki-Plugin and I have this page structure:

# Heading 1 ⤵
## Heading 1.1
Text of Heading 1.1
Text can span over multiple lines
Even more text
## Heading 1.2 
Text of Heading 1.2
# Heading 2
## Heading 2.1
Text of Heading 2.1
## Heading 2.2
Text of Heading 2.2
# Heading 3 ⤵
## Heading 3.1
Text of Heading 3.1
## Heading 3.2
Text of Heading 3.2
# Heading 4

I need a RegExp that matches every ## heading, together with its text, that is nested under a # heading ending in ⤵. The ⤵ acts as a kind of switch: content nested under a # heading without the ⤵ should not be matched. Each ## heading and its text should be matched with capturing groups. Hence the matched text should be:

## Heading 1.1
Text of Heading 1.1
Text can span over multiple lines
Even more text
## Heading 1.2
Text of Heading 1.2
## Heading 3.1
Text of Heading 3.1
## Heading 3.2
Text of Heading 3.2

Here’s what I came up with on regex101. My problem is that this way only the first ## heading and its text get matched, and I can’t find a solution.

Thanks in advance!

Answers

To match the desired headings and text using regex, you can use a pattern like the following:

# Heading 1 ⤵(?:[\s\S]*?## (.*?)(?:(?=#)|$)([\s\S]*?)(?=(?:## |$)))?

Here’s an example in JavaScript:

const input = `# Heading 1 ⤵
## Heading 1.1
Text of Heading 1.1
More text
Even more text
## Heading 1.2 
Text of Heading 1.2
# Heading 2
## Heading 2.1
Text of Heading 2.1
## Heading 2.2
Text of Heading 2.2
# Heading 3 ⤵
## Heading 3.1
Text of Heading 3.1
## Heading 3.2
Text of Heading 3.2
# Heading 4`;

const regex = /# Heading 1 ⤵(?:[\s\S]*?## (.*?)(?:(?=#)|$)([\s\S]*?)(?=(?:## |$)))?/g;

let match;
while ((match = regex.exec(input)) !== null) {
  const nestedHeading = match[1];
  const nestedText = match[2];
  console.log(`## ${nestedHeading}`);
  console.log(nestedText);
}
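
Note that the pattern above is tied to the literal `# Heading 1 ⤵`, so on its own it cannot reach `# Heading 3 ⤵`. As a hedged alternative, here is a minimal two-pass sketch: first split the note into blocks that start at a `# ` line, then collect the `## ` sections only from blocks whose heading ends with the marker. The helper name `extractMarkedSections`, its `marker` parameter, and the shortened sample note are assumptions made for illustration; they do not come from the question or the plugin.

```javascript
// Hypothetical helper, not part of the original question or answers.
function extractMarkedSections(markdown, marker = "⤵") {
  const sections = [];
  // Pass 1: cut the document into blocks that each start at a "# " line.
  for (const block of markdown.split(/^(?=# )/m)) {
    const topHeading = block.split("\n", 1)[0];
    // The marker acts as the switch: skip blocks whose "# " line lacks it.
    if (!topHeading.trimEnd().endsWith(marker)) continue;
    // Pass 2: a "## " line, then every following line that is not a "# " or "## " heading.
    const sectionRegex = /^## (.*)\n((?:(?!##? ).*\n?)*)/gm;
    for (const m of block.matchAll(sectionRegex)) {
      sections.push({ heading: m[1].trim(), text: m[2].trimEnd() });
    }
  }
  return sections;
}

// Shortened sample note (an assumption, modelled on the question's structure).
const sample = [
  "# Heading 1 ⤵",
  "## Heading 1.1",
  "Text of Heading 1.1",
  "More text",
  "## Heading 1.2",
  "Text of Heading 1.2",
  "# Heading 2",
  "## Heading 2.1",
  "Text of Heading 2.1",
  "# Heading 3 ⤵",
  "## Heading 3.1",
  "Text of Heading 3.1",
  "# Heading 4",
].join("\n");

const sections = extractMarkedSections(sample);
for (const { heading, text } of sections) {
  console.log(`## ${heading}`);
  console.log(text);
}
```

Splitting first keeps each regex short and avoids lookbehind, at the cost of needing code around the pattern, so it only fits where you control the script rather than a single regex field.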

- Thefourthbird
- September 29, 2023 at 3:23 pm
You might use:
```
(?<=^# .*⤵(?:\n(?!# ).*)*)\n(^## .*)\n(?!^##? )(.*(?:\n(?!^##? ).*)*)
```
The pattern matches (with the multiline flag enabled, so that ^ anchors at the start of each line):
- (?<= Positive lookbehind, assert that to the left is
  - ^# .*⤵ Match # and the rest of the line ending on ⤵
  - (?:\n(?!# ).*)* Optionally match all following lines that do not start with a single # and a space
- ) Close the lookbehind
- \n Match a newline
- (^## .*) Capture group 1, match ## followed by the rest of the line
- \n Match a newline
- (?!^##? ) Negative lookahead, assert that the line does not start with # or ## and a space
- ( Capture group 2
  - .* Match the whole line
  - (?:\n(?!^##? ).*)* Optionally match all lines that do not start with # or ## and a space
- ) Close group 2
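
To try this pattern in JavaScript, one option is `String.prototype.matchAll`, with the newlines written as `\n`. The sketch below is not part of the answer: the `g` and `m` flags, the variable names, and the shortened sample note are assumptions, and it needs an engine that supports variable-length lookbehind (current V8-based runtimes do).

```javascript
// Assumed flags: "g" to find every section, "m" so ^ matches at each line start.
const pattern =
  /(?<=^# .*⤵(?:\n(?!# ).*)*)\n(^## .*)\n(?!^##? )(.*(?:\n(?!^##? ).*)*)/gm;

// Shortened sample note (an assumption, modelled on the question's structure).
const note = [
  "# Heading 1 ⤵",
  "## Heading 1.1",
  "Text of Heading 1.1",
  "More text",
  "## Heading 1.2",
  "Text of Heading 1.2",
  "# Heading 2",
  "## Heading 2.1",
  "Text of Heading 2.1",
  "# Heading 3 ⤵",
  "## Heading 3.1",
  "Text of Heading 3.1",
  "# Heading 4",
].join("\n");

// Group 1 is the "## " line, group 2 is the text that follows it.
const found = [...note.matchAll(pattern)].map((m) => ({
  heading: m[1],
  text: m[2],
}));

for (const { heading, text } of found) {
  console.log(heading);
  console.log(text);
}
```

Because the pattern requires at least one text line after each ## heading, a ## heading that is immediately followed by another heading is skipped.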