Javascript - Select code blocks but ignore all curly braces inside these blocks

gretarsson
November 25, 2023
324 views
0 votes
2 Answers

I’m trying to post-process auto-generated TypeScript code. The generated files contain interfaces, and the properties of the interfaces have doc comments. Some of these doc comments include RegEx patterns with curly braces.

I need a RegEx pattern that selects the individual interfaces, but not the blank lines or maybe comments in between them. What I’m struggling with is the curly braces inside the comments because they make it very difficult to find a pattern that matches the whole interface body from its signature until its closing brace.

The files I try to post-process look like this:

export interface SomeInterface {
  /**
   * Some comment on property1
   */
  property1: string;
  /**
   * Some comment on property2, including RegEx pattern with curly braces such as [a-z1-9]{2}
   */
  property2: string;
  /**
   * Some comment on property3
   */
  property3: string;

  // some more properties and doc comments, some of which have curly braces inside too
}

// This comment has to be excluded

export interface AnotherInterface {
  // internally very similar to 'SomeInterface' above
}

What I tried so far is

/export interface .*{([^}])+}/g

and

/export interface .*{([^])+}/g

Neither works. The first one only selects the substring from the start of the first interface's signature to the first closing curly brace inside a doc comment.
The second one selects all interfaces at once (i.e., from the signature of the first interface to the closing curly brace of the last interface, with everything in between), which is not what I want.

Any help and suggestions are highly appreciated.

Tags: javascript regex

Answers

I would create two regular expressions:

rex1 – A regular expression to match and remove comments: /\/\/.*|\/\*[\s\S]*?\*\//g. Note that [\s\S] matches any character, including newlines. Removing comments can leave additional empty lines behind.
rex2 – A regular expression to match and remove empty lines: /^\s*$\n/mg.

I would apply the two regular expressions in succession using the string replace function as follows:

const text = `export interface SomeInterface {
  /**
   * Some comment on property1
   */
  property1: string;
  /**
   * Some comment on property2, including RegEx pattern with curly braces such as [a-z1-9]{2}
   */
  property2: string;
  /**
   * Some comment on property3
   */
  property3: string;

  // some more properties and doc comments, some of which have curly braces inside too
}

// This comment has to be excluded

export interface AnotherInterface {
  // internally very similar to 'SomeInterface' above
}`;

const rex1 = /\/\/.*|\/\*[\s\S]*?\*\//g; // remove line comments and block comments
const rex2 = /^\s*$\n/mg; // remove blank lines

console.log(text.replace(rex1, "").replace(rex2, ""));

- Thefourthbird
- November 25, 2023 at 4:13 pm
- 0 votes
A regex matches text and has no notion of programming language structures.

Using a pattern to do so is best effort only and can have numerous edge cases.

If the structure of the data is always the same, you can use the following pattern, which also allows optional leading spaces:
```
^[^\S\n]*export interface .*{[^]*?\n[^\S\n]*}
```
The pattern matches:
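- ^ Start of string (or start of each line when the m flag is set)
- [^\S\n]* Match optional whitespace chars without newlines
- export interface .* Match the literal text, then the rest of the line
- { Match { (the greedy .* backtracks to the last { on the line)
- [^]*? Optionally repeat matching any char including newlines (JavaScript notation), as few as possible
- \n Match a newline
- [^\S\n]* Match optional whitespace chars without newlines
- } Match } literally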

If there are no leading/trailing spaces:
```
^export interface .*{[^]*?\n}
```
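
As a rough usage sketch, the pattern with optional leading spaces can be applied with String.prototype.match. The g and m flags are an assumption here (the answer lists the pattern without flags): g collects every interface and m lets ^ match at the start of each line. The names source, interfacePattern and interfaces, and the shortened sample input, are made up for illustration.

```javascript
// Hypothetical sample input, shortened from the question.
const source = [
  "export interface SomeInterface {",
  "  /**",
  "   * Pattern with curly braces such as [a-z1-9]{2}",
  "   */",
  "  property2: string;",
  "}",
  "",
  "// This comment has to be excluded",
  "",
  "export interface AnotherInterface {",
  "  // internally very similar to 'SomeInterface' above",
  "}",
].join("\n");

// g: find every interface; m: let ^ match at the start of each line.
// The match ends at the first line that holds only optional spaces and "}".
const interfacePattern = /^[^\S\n]*export interface .*{[^]*?\n[^\S\n]*}/gm;

// match() returns null when nothing matches, so fall back to an empty array.
const interfaces = source.match(interfacePattern) ?? [];

console.log(interfaces.length); // 2
for (const block of interfaces) {
  console.log(block);
  console.log("-----");
}
```

Because the match stops at the first line consisting only of whitespace and a closing brace, a nested object type whose closing brace sits alone on its own line would end the match too early. This is one of the edge cases mentioned above.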