let myString = "Hello 'How are you' foo bar abc 'Strings are cool' d b s ;12gh gh76;"
const myRegEx = / \w+ "\w* +" | ;\w +; +/g // This is what I have figured out, but it's not working :(
const splitedString = myString.split(myRegEx)
console.log(splitedString)
Expected output: ["Hello", "How are you", "foo", "bar", "abc", "Strings are cool", "d", "b", "s", "12gh-gh76"]
Let me try to explain more:
First of all, split the whole string on spaces " ", except for text inside '' or ;;, like:
"Hello 'Yo what's up'"
–> ["Hello", "Yo-what's-up"]
(Notice there's an extra ' in what's, so handle that too.)
Then, if a string is inside ;;, concatenate its words (I believe that's the right name) with -, like:
Hello ;hi there;
–> ["Hello", "hi-there"]
And in the end, return an array with all of the formatting applied, as in the expected output.
3 Answers
You can use matchAll instead of split to match the content of quotes, the content of a pair of semicolons, or separate words with the regex ([';]).+?\1|\w+. Afterwards, remove the wrapping characters and replace spaces where needed.
Notice that this solution works on the assumption that the wrappers (quotes and semicolons) are not nested.
If you need to account for nested wrappers, regex is not the best tool for the job, since you'll need to check the "parenthesis" balance; while that's possible with regex, there are easier ways to do it.
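As a rough sketch of the matchAll idea above (the variable names and the post-processing callback are my own assumptions, not the answerer's original code), capture group 1 tells you which wrapper matched, so you can strip it and add dashes only for the semicolon case:

```javascript
const input = "Hello 'How are you' foo bar abc 'Strings are cool' d b s ;12gh gh76;";

// Group 1 captures the opening wrapper (' or ;); \1 requires the same closing wrapper.
const pattern = /([';]).+?\1|\w+/g;

const tokens = Array.from(input.matchAll(pattern), (m) => {
  const [full, wrapper] = m;
  if (wrapper === undefined) return full;   // plain word, nothing to strip
  const inner = full.slice(1, -1);          // drop the wrapping characters
  // Assumption: only ;...; ranges get dashes, quoted text keeps its spaces.
  return wrapper === ';' ? inner.replace(/\s+/g, '-') : inner;
});

console.log(tokens);
```

Since .+? stops at the first matching wrapper, an apostrophe inside a quoted part (as in what's) would end the match early; this sketch does not handle that case.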
You might capture the parts that you want to reformat and then post-process them by checking which capture group matched:
The pattern matches:
'  Match '
(  Capture group 1
[^']+  Match 1+ chars other than '
(?:'[^\s'][^']*)*  Optionally repeat ' followed by a single non-whitespace char other than ', followed by optional chars other than '
)  Close group 1
'  Match '
|  Or
;([^;]+);  Match from ;...; and capture in group 2 what is inside
|  Or
\S+  Match 1+ non-whitespace chars
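Putting the pieces listed above together, a minimal sketch could look like the following. The full pattern is reassembled from the explanation, and the tokenize helper is a hypothetical name of mine. Quoted parts keep their spaces here, following the expected output in the question:

```javascript
// Pattern reassembled from the parts described above (an assumption, the original code is not shown).
const regex = /'([^']+(?:'[^\s'][^']*)*)'|;([^;]+);|\S+/g;

// Hypothetical helper: decide what to return based on which capture group matched.
function tokenize(str) {
  return Array.from(str.matchAll(regex), (m) => {
    if (m[1] !== undefined) return m[1];                               // '...' content
    if (m[2] !== undefined) return m[2].trim().replace(/\s+/g, '-');   // ;...; content, dashed
    return m[0];                                                       // plain word
  });
}

console.log(tokenize("Hello 'How are you' foo bar abc 'Strings are cool' d b s ;12gh gh76;"));
console.log(tokenize("Hello 'Yo what's up'"));
```

The inner repeated group is what lets the apostrophe in what's pass: a ' followed directly by a non-whitespace character is treated as part of the quoted text rather than as the closing quote.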
This needs at least a two-step approach.
First, one has to replace each whitespace sequence inside any semicolon-delimited range with a single dash, where the regex matching such a range is
/;([^;]*);/g
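A minimal sketch of that first step might be the following. Dropping the surrounding semicolons and trimming the captured text inside the callback are my assumptions, chosen so the result lines up with the expected output in the question:

```javascript
const input = "Hello 'How are you' foo bar abc 'Strings are cool' d b s ;12gh gh76;";

// For every ;...; range, keep only the captured inner text and
// collapse each whitespace sequence in it into a single dash.
const dashed = input.replace(
  /;([^;]*);/g,
  (_match, inner) => inner.trim().replace(/\s+/g, '-')
);

console.log(dashed);
// Hello 'How are you' foo bar abc 'Strings are cool' d b s 12gh-gh76
```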
Secondly, one needs to come up with a splitting regex which can handle both: splitting at any whitespace (sequence), but only if it is not part of a substring enclosed in single quotes. The latter needs to be captured in order to be preserved while splitting. The splitting regex is
/'(.*?(?<!\\))'|\s+/
The resulting array contains a lot of empty values, such as empty strings and undefined values. Thus the split task needs to be followed by a reduce-based cleanup task.
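To illustrate the split and cleanup step, here is a small sketch. It assumes the semicolon ranges have already been turned into dashed words, and the variable names are mine:

```javascript
// Assumed input: the string after the semicolon ranges were already dashed.
const dashedInput = "Hello 'How are you' foo bar abc 'Strings are cool' d b s 12gh-gh76";

// Because the regex has a capture group, split() also emits the captured
// quoted text, plus undefined (whitespace matches) and '' (adjacent matches).
const rawParts = dashedInput.split(/'(.*?(?<!\\))'|\s+/);

// reduce-based cleanup: keep only non-empty strings.
const cleaned = rawParts.reduce((acc, part) => {
  if (part) acc.push(part);
  return acc;
}, []);

console.log(rawParts);
console.log(cleaned);
```

The (?<!\\) lookbehind only keeps a backslash-escaped quote from closing the quoted part; an unescaped apostrophe such as the one in what's would still end the match early, so that case needs separate handling.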