
Good afternoon,

Today I was struggling with writing Regex in JavaScript, as I usually do. The idea is to use the replace method in combination with a regular expression to remove all whitespace characters outside of double-quote pairs.

So far I’ve been able to find the pairs using /(".*?")/g, then tried to skip them by turning that into a lookahead, /(?=".*?")/g. In a futile attempt at selecting all the whitespace outside of the capture group, I then turned it into /\s+(?=".*?")/g, which comes close.

string in:       this "string" should "give me "not so much trouble
_ as found:      this_"string"_should_"give me "not_so_much_trouble
after replace:   this"string"should"give me "notsomuchtrouble

My intent was to use this on JSON strings: { "key": "My value " } would then turn into {"key":"My value "}, hence the need to leave the whitespace inside the double quotes untouched.

I’m sure this is doable with regex, I just can’t find it. Hopefully I’ve given enough information for someone with actual knowledge to help out. Any help is appreciated, thank you in advance!


Yet another hurdle

The second part: how do I ignore an escaped quote (\") in the pattern for the "-pairs?

{      "key":   "he said \"how would I do this\" sadly?  "    }

becomes

{"key":"he said \"how would I do this\" sadly?  "}

4 Answers


  1. Chosen as BEST ANSWER

    I forgot to mention this, but this question is mostly aimed at myself experimenting with Regex.

    The solution that currently seems to work for me is /(".*?")?(?:\s.*?)(".*?")?/g

    Breaking it down:

    (".*?") creates a capture group to match everything between a pair of double quotes. (in-quotes)

    (?:\s.*?) matches a whitespace character, plus lazily as little of what follows as necessary, without storing it as a group. (whitespace)

    ? makes the previous group optional

    /g matches every occurrence within the string

    combining them as (in-quotes)?(whitespace)(in-quotes)?

    Then, when replacing in JS, I can keep only the two capture groups, which hold the quoted text:

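    // A template literal is needed for a string that spans several lines.
    // The s flag lets . match the newline inside the quoted value;
    // without it, that value is not captured and its whitespace is removed too.
    `{ "key"     :  "my value with spaces    
       and new-lines   "
    
       }`.replace(/(".*?")?(?:\s.*?)(".*?")?/gs, '$1$2')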
    

    EDIT:

    This, however, breaks when the string contains escaped quotes.
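
For the escaped-quote case mentioned in the edit, one possible approach is an alternation: either capture a complete quoted string (treating a backslash plus the character after it as one unit), or match a run of whitespace. This is only a sketch and not the regex from the answer above; the helper name stripOutsideQuotes is made up for illustration, and it assumes every opening quote has a matching closing quote.

```javascript
// Hypothetical helper: removes whitespace outside double-quoted strings.
// Branch 1 captures a whole quoted string, allowing backslash escapes such as \".
// Branch 2 matches a run of whitespace outside any captured string.
function stripOutsideQuotes(text) {
  const pattern = /("(?:[^"\\]|\\[\s\S])*")|\s+/g;
  // A quoted string is put back unchanged; for a whitespace run the capture
  // group did not participate, so the match is replaced by the empty string.
  return text.replace(pattern, (match, quoted) =>
    quoted === undefined ? "" : quoted
  );
}

// In this JS literal, \\" produces the two characters \" in the actual string.
const input = '{      "key":   "he said \\"how would I do this\\" sadly?  "    }';
console.log(stripOutsideQuotes(input));
// {"key":"he said \"how would I do this\" sadly?  "}
```

Because each quoted string is consumed as a single match, the whitespace inside it is never seen by the second branch.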


  2. I don’t see how this is doable via regex.

    Any reason you can’t just:

    const theString = '{      "key":   "My value "    }';
    const theCleanedUpString = JSON.stringify(JSON.parse(theString));
    

    ?
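
A minimal sketch of that round trip, assuming the input is already valid JSON. The helper name minifyJson is hypothetical; JSON.parse throws a SyntaxError on invalid input (for example unescaped quotes inside a value), which is caught here and reported as null.

```javascript
// Hypothetical helper: minifies text that is already valid JSON.
function minifyJson(text) {
  try {
    // Parsing discards insignificant whitespace; stringify without an
    // indent argument writes the value back with none.
    return JSON.stringify(JSON.parse(text));
  } catch (err) {
    // Invalid JSON cannot be minified this way.
    return null;
  }
}

console.log(minifyJson('{      "key":   "My value "    }')); // {"key":"My value "}
console.log(minifyJson('{ "key": "he said "how" sadly" }')); // null
```

Whitespace inside string values survives because it is part of the parsed value, not of the formatting.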

  3. The regular expression uses a lookahead to check that the rest of the string consists of complete quote pairs, i.e. contains an even number of double quotes. Whitespace followed by such a remainder lies outside any pair and is replaced with an empty string.

    import re

    txt = """this "string" should "give me "not so much trouble"""
    
    # \s+ matches whitespace only when an even number of quotes follows it
    result = re.sub(r'\s+(?=(?:[^"]*"[^"]*")*[^"]*$)', '', txt)
    print(result)
    
    this"string"should"give me "notsomuchtrouble
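
Since the question is about JavaScript, the same pattern should carry over to String.prototype.replace unchanged. A sketch, with the hypothetical helper name stripByQuoteParity; it assumes the quotes are balanced and never escaped.

```javascript
const txt = 'this "string" should "give me "not so much trouble';

// Hypothetical helper: removes whitespace that is followed by an even
// number of double quotes up to the end of the string, which means the
// whitespace sits outside every quote pair.
function stripByQuoteParity(text) {
  return text.replace(/\s+(?=(?:[^"]*"[^"]*")*[^"]*$)/g, "");
}

console.log(stripByQuoteParity(txt));
// this"string"should"give me "notsomuchtrouble
```

The lookahead rescans the rest of the string for every whitespace run, so this is likely to be slow on long inputs.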
    
  4. This assumes the input is just a JSON string and not source code.
    To remove whitespace outside of quotes:

    var str = "{      \"key\":   \"he said \\\"how would I do this\\\" sadly?  \"    }";
    
    console.log("Before:  " + str);
    
    str = str.replace(/(?<!\\)((?:\\\\)*)(?:("[^"\\]*(?:\\[\S\s][^"\\]*)*")|\\?\s+)/g, '$1$2');
    
    console.log("After:   " + str);