
I’ve got many spans like the following: <span class="c">a+2da + 2d</span>

The first part of the span’s content is a math expression, a+2d (without spaces), and the second part is the same expression with spaces around the operators: a + 2d. I need to be able to match the a+2d so I can remove it.

Some examples of the expressions I’ve got:
x-y=3x - y = 3, a+b-c=da + b - c = d. They can also involve underscores and brackets, like a_n=a+(n−1)da_n = a + (n - 1)d

I can find the first half (only) of the simpler examples using the regex:

/<span class="c">((\w*[+|-|=]\w*)*)<\/span>/

and the second half (only) with spaces using

/<span class="c">((\w* [+|-|=] \w*)*)<\/span>/

But I have no idea how to match the second half given the first, or vice versa. None of the lookahead or lookbehind examples I came across fit the situation, and my attempts became complicated very quickly. TIA.

3 Answers


  1. ((\w*)([+-=]).*)(\2 \3.*)
    

    This works only for that kind of expression. We first match an operator together with its preceding content: (\w*)([+-=])

    Then we use \2 \3 to search ahead for the same content, but with a space before the operator.

    The first half is captured in the 1st group, and the second half is captured in the 4th group.

    https://regex101.com/r/AW9Ba5/1
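
For anyone trying answer 1 outside regex101, here is a minimal Python sketch, assuming the span's inner text has already been extracted. The function name split_halves is made up for illustration. One caveat: inside a character class, +-= is a range from + to =, so it also matches digits and characters such as < and /. That is why the sketch applies the pattern to the inner text only, not to the whole tag.

```python
import re

# Pattern from answer 1 with the backslashes written out.
# Caveat: [+-=] is a range ('+' through '='), not just the three operators;
# [-+=] would restrict it to '+', '-' and '='.
PATTERN = re.compile(r'((\w*)([+-=]).*)(\2 \3.*)')


def split_halves(content):
    """Return (compact, spaced) halves of a span's inner text, or None.

    `content` is assumed to be the text between <span class="c"> and </span>.
    """
    m = PATTERN.fullmatch(content)
    if m is None:
        return None
    # Group 1 is the half without spaces, group 4 the half with spaces.
    return m.group(1), m.group(4)


print(split_halves("a+2da + 2d"))      # ('a+2d', 'a + 2d')
print(split_halves("x-y=3x - y = 3"))  # ('x-y=3', 'x - y = 3')
```
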

  2. Since each token in the second half is separated by a space, you can identify the beginning of the second half by simply matching the first token, which is repeated right before the first space or, if it’s the only token, right before </span>:

    <span class="c">\K(\S+)\S*(?=\1(?= |</span>))
    

    and replace the match with an empty string to remove it.

    Demo: https://regex101.com/r/pmfB4f/1
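
A rough Python equivalent of answer 2, in case it helps. Python's built-in re module has no \K, so this sketch swaps in a fixed-width lookbehind for the opening tag instead; that substitution is mine, not part of the answer. The function name remove_compact_half is hypothetical.

```python
import re

# Same idea as answer 2, but with (?<=...) in place of \K, because
# Python's re module does not support \K.
FIRST_HALF = re.compile(r'(?<=<span class="c">)(\S+)\S*(?=\1(?= |</span>))')


def remove_compact_half(html):
    """Delete the space-free copy of the expression from every matching span."""
    return FIRST_HALF.sub('', html)


html = '<span class="c">a+2da + 2d</span> and <span class="c">x-y=3x - y = 3</span>'
print(remove_compact_half(html))
# <span class="c">a + 2d</span> and <span class="c">x - y = 3</span>
```
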

  3. Try matching:

    <span class="c">\K(?<first>[^-+=\s]+)(?:(?<operation>[-+=])\S*)?(?=(?P=first)(?(operation)\s+(?P=operation)))
    

    and replacing with nothing.

    See: regex101 (borrowed from blhsing)


    Explanation

    MATCH:

    1. <span class="c">: ensure the formula is preceded by (i.e. is inside of) a span tag
    • \K: and discard this from the match.
    2. (?<first>[^-+=\s]+): capture the first argument
    • (?: ... )?: and optionally
      • (?<operation>[-+=]): the succeeding operation
      • \S*: as well as the rest of the formula.
    3. (?= ... ): now validate
    • (?P=first): that the "spacy" equation begins with the first argument
    • (?(operation) ... ): and, if an operation was found,
      • \s+(?P=operation): that the same operation, preceded by whitespace, comes after it.
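
The steps above can be sketched in Python as well, with two adjustments that are my assumptions rather than part of the answer: Python's re module needs (?P<name>...) for named groups, and it has no \K, so a fixed-width lookbehind stands in for it. The function name remove_compact_half_named is hypothetical.

```python
import re

FIRST_HALF = re.compile(
    r'(?<=<span class="c">)'          # must sit right after the opening tag (replaces \K)
    r'(?P<first>[^-+=\s]+)'           # first argument
    r'(?:(?P<operation>[-+=])\S*)?'   # optionally: an operator and the rest of the formula
    # Lookahead: the spaced copy starts with the first argument and, if an
    # operator was captured, continues with whitespace and that same operator.
    r'(?=(?P=first)(?(operation)\s+(?P=operation)))'
)


def remove_compact_half_named(html):
    """Delete the space-free copy of the expression from every matching span."""
    return FIRST_HALF.sub('', html)


print(remove_compact_half_named('<span class="c">a+b-c=da + b - c = d</span>'))
# <span class="c">a + b - c = d</span>
```
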