I have this regex:
"(WORD1.*WORD2.*WORD3)|(WORD1.*WORD3.*WORD2)|(WORD2.*WORD1.*WORD3)|(WORD2.*WORD3.*WORD1)|(WORD3.*WORD1.*WORD2)|(WORD3.*WORD2.*WORD1)"
It matches these strings:
WORD1WORD2WORD3
WORD1AWORD2BWORD3C
WORD3WORD1WORD2
WORD1WORD2WORD3WORD1
But not these strings:
WORD1WORD1WORD2
WORD1AWORD1BWORD2C
This regex matches any string that contains all 3 words (WORD1, WORD2, WORD3) in any order.
I would like to do the same thing with more words, but the problem is that the size of the regex grows factorially with the number of words (one alternative per permutation, so 6 for 3 words and 24 for 4).
Is it possible to construct this regex differently so that its size does not blow up like this?
3 Answers
You could use a positive lookahead for each of the words, so the regex grows linearly with the number of words.
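A minimal sketch of the lookahead idea in Python, assuming the words are plain literals. The helper name `all_words_pattern` and the sample list are illustrative and not part of the original answer:

```python
import re

def all_words_pattern(words):
    """Build a regex that matches strings containing every word, in any order."""
    # Each (?=.*WORD) asserts that WORD occurs somewhere ahead without
    # consuming input, so the assertions can simply be concatenated.
    lookaheads = "".join(f"(?=.*{re.escape(word)})" for word in words)
    # Anchor at the start so every lookahead scans the whole string once.
    return re.compile("^" + lookaheads)

pattern = all_words_pattern(["WORD1", "WORD2", "WORD3"])

samples = [
    "WORD1WORD2WORD3",
    "WORD1AWORD2BWORD3C",
    "WORD3WORD1WORD2",
    "WORD1WORD2WORD3WORD1",
    "WORD1WORD1WORD2",
    "WORD1AWORD1BWORD2C",
]
matching = [s for s in samples if pattern.search(s)]
```

Here `matching` keeps the first four samples and drops the last two, which lack WORD3. Adding a word adds one lookahead group instead of multiplying the number of alternatives.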
Simply iterate over all strings and filter out those that do not include all keywords:
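One way this filtering could look, sketched in Python with plain substring checks instead of a regex. The names `contains_all`, `keywords`, and `strings` are assumptions made for illustration:

```python
def contains_all(text, keywords):
    """Return True only if every keyword occurs somewhere in text."""
    for keyword in keywords:
        if keyword not in text:
            return False
    return True

keywords = ["WORD1", "WORD2", "WORD3"]
strings = [
    "WORD1WORD2WORD3",
    "WORD1AWORD2BWORD3C",
    "WORD3WORD1WORD2",
    "WORD1WORD2WORD3WORD1",
    "WORD1WORD1WORD2",
    "WORD1AWORD1BWORD2C",
]

# Keep only the strings that include every keyword.
kept = []
for s in strings:
    if contains_all(s, keywords):
        kept.append(s)
```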
(a terser version can be found in the snippet below)
Try it:
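A terse variant of the same filter, as a sketch in Python (the variable names are illustrative):

```python
words = ("WORD1", "WORD2", "WORD3")
inputs = ["WORD3WORD1WORD2", "WORD1WORD1WORD2", "WORD1AWORD2BWORD3C"]

# all() stops at the first missing word, so each string is rejected early.
result = [s for s in inputs if all(w in s for w in words)]
```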
Not sure I understood, but if I did:
Regex101.com demo