
I’m trying to post-process auto-generated TypeScript code. The generated files contain interfaces, and the properties of the interfaces have doc comments. Some of these doc comments include RegEx patterns with curly braces.

I need a RegEx pattern that selects the individual interfaces, but not the blank lines or comments that may appear between them. What I’m struggling with is the curly braces inside the doc comments: they make it very difficult to find a pattern that matches the whole interface, from its signature to its closing brace.

The files I try to post-process look like this:

export interface SomeInterface {
  /**
   * Some comment on property1
   */
  property1: string;
  /**
   * Some comment on property2, including RegEx pattern with curly braces such as [a-z1-9]{2}
   */
  property2: string;
  /**
   * Some comment on property3
   */
  property3: string;

  // some more properties and doc comments, some of which have curly braces inside too
}

// This comment has to be excluded

export interface AnotherInterface {
  // internally very similar to 'SomeInterface' above
}

What I have tried so far is

/export interface .*{([^}])+}/g

and

/export interface .*{([^])+}/g

Neither works. The first one only selects the substring from the start of the first interface’s signature up to the first closing curly brace in the doc comments.
The second one selects all interface bodies at once (i.e., from the signature of the first interface to the closing curly brace of the last interface, including everything in between), which is not what I want.
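
For reference, both behaviours can be reproduced with a short script. This is only a sketch: `source` is a shortened, hypothetical stand-in for the generated file, and `attempt1` and `attempt2` are names chosen for illustration.

```javascript
// Shortened, hypothetical stand-in for the generated file.
const source = [
  "export interface SomeInterface {",
  "  /**",
  "   * Pattern with curly braces such as [a-z1-9]{2}",
  "   */",
  "  property2: string;",
  "}",
  "",
  "// This comment has to be excluded",
  "",
  "export interface AnotherInterface {",
  "  property1: string;",
  "}",
].join("\n");

// Attempt 1: [^}] cannot cross a closing brace,
// so the first match already ends at the "}" of "{2}".
const attempt1 = source.match(/export interface .*{([^}])+}/g) ?? [];

// Attempt 2: [^] matches any character and "+" is greedy,
// so a single match runs on to the last "}" in the input.
const attempt2 = source.match(/export interface .*{([^])+}/g) ?? [];

console.log(attempt1[0].endsWith("{2}"));            // true
console.log(attempt2.length);                        // 1
console.log(attempt2[0].includes("AnotherInterface")); // true
```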

Any help and suggestions are highly appreciated.

2 Answers


  1. I would create two regular expressions:

    1. rex1 – A regular expression to match and remove comments: /\/\/.*|\/\*[\s\S]*?\*\//g. Note that [\s\S] matches any character, including newlines. Removing comments can leave additional empty lines behind.
    2. rex2 – A regular expression to match and remove empty lines: /^\s*$\n/mg

    I would apply the two regular expressions in succession using the string replace function as follows:

    const text = `export interface SomeInterface {
      /**
       * Some comment on property1
       */
      property1: string;
      /**
       * Some comment on property2, including RegEx pattern with curly braces such as [a-z1-9]{2}
       */
      property2: string;
      /**
       * Some comment on property3
       */
      property3: string;
    
      // some more properties and doc comments, some of which have curly braces inside too
    }
    
    // This comment has to be excluded
    
    export interface AnotherInterface {
      // internally very similar to 'SomeInterface' above
    }`;
    
    const rex1 = /\/\/.*|\/\*[\s\S]*?\*\//g; // remove line comments and block comments
    const rex2 = /^\s*$\n/mg; // remove blank lines
    
    console.log(text.replace(rex1, "").replace(rex2, ""));
  2. A regex matches text and has no notion of programming language structures.

    Using a pattern to extract such structures is best effort only and can have numerous edge cases.

    If the structure of the data is always the same, you can use the following pattern, which also allows optional leading spaces:

    ^[^\S\n]*export interface .*{[^]*?\n[^\S\n]*}
    

    The pattern matches:

    • ^ Start of the string (or of each line when the m flag is set)
    • [^\S\n]* Match optional whitespace chars without newlines
    • export interface .* Match the literal text, followed by the rest of the line
    • { Match {
    • [^]*? Match any character, including newlines (JavaScript notation), as few times as possible
    • \n Match a newline
    • [^\S\n]* Match optional whitespace chars without newlines
    • } Match } literally
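
As a rough sketch of how the pattern could be applied in JavaScript, assuming the `g` and `m` flags are set so that every interface is found and `^` can match at the start of each line. The names `source`, `interfacePattern`, and `interfaces` are hypothetical, and the input is a shortened version of the sample from the question.

```javascript
// Shortened version of the sample from the question.
const source = [
  "export interface SomeInterface {",
  "  /**",
  "   * Some comment, including a pattern with curly braces such as [a-z1-9]{2}",
  "   */",
  "  property2: string;",
  "}",
  "",
  "// This comment has to be excluded",
  "",
  "export interface AnotherInterface {",
  "  // internally very similar to 'SomeInterface' above",
  "}",
].join("\n");

// g: find every interface; m: let ^ match at the start of each line.
const interfacePattern = /^[^\S\n]*export interface .*{[^]*?\n[^\S\n]*}/gm;

// Each array element is one complete interface, signature through closing brace.
const interfaces = source.match(interfacePattern) ?? [];

console.log(interfaces.length); // 2
console.log(interfaces[1]);
```

The comment between the two interfaces is not part of either match, because each match has to start with the `export interface` signature.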


    If there are no leading/trailing spaces:

    ^export interface .*{[^]*?\n}
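
One edge case that separates the two variants is a nested object type. The variant that allows leading whitespace stops at the first line consisting of optional whitespace followed by `}`, while the variant without leading whitespace only stops at a `}` in the first column. The sketch below uses a hypothetical interface that is not part of the original sample, and again assumes the `g` and `m` flags.

```javascript
// Hypothetical interface with a nested object type (not from the original sample).
const source = [
  "export interface WithNested {",
  "  nested: {",
  "    value: string;",
  "  };",
  "  other: number;",
  "}",
].join("\n");

// Variant allowing leading whitespace before the signature and the closing brace.
const lenient = /^[^\S\n]*export interface .*{[^]*?\n[^\S\n]*}/gm;
// Variant requiring the signature and the closing brace to start in column 0.
const strict = /^export interface .*{[^]*?\n}/gm;

const lenientMatch = source.match(lenient)[0];
const strictMatch = source.match(strict)[0];

// The lenient variant is cut off at the indented "}" that closes `nested`.
console.log(lenientMatch.includes("other: number")); // false
// The strict variant captures the whole interface.
console.log(strictMatch === source); // true
```

If the generated code is consistently formatted with top-level closing braces in the first column, the stricter variant is therefore the safer choice.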
    