I’m trying to build a regex that replaces each pair of "$$" delimiters with an HTML tag, say, <someTag></someTag>.

I use this regular expression but it doesn’t cover all the cases:

\$\$(\S[^*]+\S)\$\$
'aaa $$123$$ c$ ddd'.replace(/\$\$(\S[^*]+\S)\$\$/g, '<a1>$1</a1>') // works

'aaa $$123$$ c$ $$ddd$$'.replace(/\$\$(\S[^*]+\S)\$\$/g, '<a1>$1</a1>') // doesn't work, should be 'aaa <a1>123</a1> c$ <a1>ddd</a1>'

5

Answers


  1. Not a regex solution, but this works. Split the string on your delimiter ($$), then build a result string by appending each part in turn. After each part except the last, append the opening tag (prefix) if the index is even, or the closing tag (suffix) if it is odd. I hope this helps!

    function replaceTag(string, delimiter, prefix, suffix){
      
      let parts = string.split(delimiter);
      let result = '';
      
      for(let index = 0; index < parts.length; index++){
      
        result += parts[index];
      
        if(index % 2 == 0 && index < parts.length - 1){
        
          result += prefix;
        
        }
        else if(index < parts.length - 1){
        
          result += suffix;
        
        }
      
      }
      
      return result;
    
    }
    
    console.log(replaceTag('aaa $$123$$ c$ ddd', '$$', '<my-tag>', '</my-tag>'));
    console.log(replaceTag('aaa $$123$$ c$ $$ddd$$', '$$', '<my-tag>', '</my-tag>'));
  2. The problem here is that your regex matches greedily, meaning it matches the longest stretch of the string it can. To make a quantifier lazy (non-greedy), append ? to it. I suggest using a regex like this one:

    const regex = /\$\$(\w+?)\$\$/g
    

    It matches two $ signs, then one or more word characters (as few as possible), then two more $ signs.
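
    As a quick check, here is a minimal sketch that applies this pattern, written with its dollar signs escaped as /\$\$(\w+?)\$\$/g, to the two strings from the question. The wrapWordPairs helper name is made up for illustration. Because \w cannot match whitespace or $, content such as "a b" is left untouched.

    ```javascript
    // Hypothetical helper: wraps each $$word$$ pair in <a1>...</a1>.
    function wrapWordPairs(text) {
      return text.replace(/\$\$(\w+?)\$\$/g, '<a1>$1</a1>');
    }

    console.log(wrapWordPairs('aaa $$123$$ c$ ddd'));
    console.log(wrapWordPairs('aaa $$123$$ c$ $$ddd$$'));
    // Content containing a space is not matched by \w, so the string is returned unchanged.
    console.log(wrapWordPairs('$$a b$$'));
    ```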

  3. A pure regex solution you can use is this:

    \$\$((?:(?!\$\$).)+)\$\$
    

    The ((?:(?!\$\$).)+) part (known as a tempered greedy token) matches any run of characters that does not contain $$. The content you are looking for is captured in group 1, so you can easily wrap it in the tags you want.

    This regex also allows a single $ inside the content. To match content spread across multiple lines, replace . with [\w\W], or change . to whatever suits your character-matching needs.
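
    A minimal sketch of the multi-line variant, assuming the dot is swapped for [\w\W] and the dollar signs are escaped. The wrapPairs helper name and the second test string are made up for illustration.

    ```javascript
    // Hypothetical helper: wraps the content between each $$...$$ pair in <a1>...</a1>.
    // (?:(?!\$\$)[\w\W])+ consumes one character at a time, newlines included,
    // and stops as soon as the next two characters are "$$".
    function wrapPairs(text) {
      return text.replace(/\$\$((?:(?!\$\$)[\w\W])+)\$\$/g, '<a1>$1</a1>');
    }

    console.log(wrapPairs('aaa $$123$$ c$ $$ddd$$'));
    // A lone $ and a newline inside the delimiters are both kept in the capture.
    console.log(wrapPairs('$$1$3$$ and $$d\nd$$'));
    ```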

  4. You could use /\$\$(\S+)\$\$/g.

    Segment   Description
    \$\$      Match literal "$$"…
    (\S+)     …capture one or more non-whitespace characters…
    \$\$      …then match literal "$$".

    A successful match therefore requires one or more non-whitespace characters enclosed directly between two "$$" delimiters, with no spaces in between.

    const str = 'aaa $$123$$ $$c$ $$x $$ $$ddd$$';
    const rgx = /\$\$(\S+)\$\$/g;
    const sub = '<a1>$1</a1>';
    
    const res = str.replace(rgx, sub);
    
    console.log(res);
  5. The fastest way is to use the non-greedy dot-all approach: /\$\$(.*?)\$\$/sg
    https://regex101.com/r/upveAX/1
    Using the dot alone will always be faster since it doesn’t rely on assertions or character classes,
    which add 3 times more overhead in performance.

    console.log('aaa $$123$$ c$ ddd'.replace(/\$\$(.*?)\$\$/sg, '<a1>$1</a1>'));
    
    console.log('aaa $$123$$ c$ $$ddd$$'.replace(/\$\$(.*?)\$\$/sg, '<a1>$1</a1>'));

    Should you choose not to use the dot approach, you run the risk of orphaned
    $$ in your text, as the regex mainly expects pairs of $$..$$.
