For instance, say we’re looking at a query string, all-lowercase, all non-numeric, no special characters (just [a-z] and =):
?some=querystring&ssembly=containing&n=indeterminate&mount=of&ll=potentially&ccordant=matches
Let us take as a given we know there will be three key-value pairs we wish to capture, and even that they are located at the beginning of said string:
some=querystring
ssembly=containing
n=indeterminate
Now, intuitively, it seems like I should be able to use something like…
^?(&?[a-z=]+){3}.*$
…or possibly…
^?(?:&?([a-z=]+)){3}.*$
…but, of course, the only capture this yields is
n=indeterminate
Is there a syntax that would allow me to capture all three groups (as independent, accessible values, natch) without having to resort to the following?
^?([a-z=]+)&([a-z=]+)&([a-z=]+).*$
I know there’s no way to capture n instances (an arbitrarily-large set), but, given this is a finite number of captures I wish to obtain from my finite automata…
I know full well there are any number of ways to accomplish this in JavaScript, or any other language for that matter. I’m specifically trying to ascertain if I’m stuck with the WET expression above.
2 Answers
There’s no recursion in ECMAScript regular expressions. The reference documentation is here; you’ll see there’s no recursion operator. You can also check regular-expressions.info, which lists the engines that support recursion: Perl 5.10, PCRE 4.0, Ruby 2.0, Delphi, PHP, and R.
JavaScript has no concept of recursion in its regex syntax, but the example you have given is not about recursion, but adjacent repetition of the same pattern.
In that case I would suggest using a regex that matches just one occurrence of that pattern, but with the g flag, and using it with matchAll. This returns an iterator, so you consume only the part that you need.
If it is guaranteed that you will have three matches, you can do:
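A minimal sketch of the matchAll approach, assuming the query string from the question and reusing its [a-z=]+ character class as the single-occurrence pattern. The variable names (query, pairPattern, first, second, third) are illustrative only.

```javascript
// Query string taken from the question.
const query =
  "?some=querystring&ssembly=containing&n=indeterminate&mount=of&ll=potentially&ccordant=matches";

// Matches ONE key=value run; the g flag is mandatory for matchAll.
const pairPattern = /[a-z=]+/g;

// Destructuring pulls only the first three results from the iterator;
// the remaining pairs are never scanned.
const [first, second, third] = query.matchAll(pairPattern);

// Each result is a match array; index 0 holds the matched text.
console.log(first[0], second[0], third[0]);
```

Because the leading ? and the & separators are outside the character class, they are skipped between matches rather than captured.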
This is just an example targeting your case. As matchAll returns an iterator, you can use the power of JS to work with iterators (like a for loop, destructuring assignment, spread syntax, etc.).
Alternative: dynamically built regex
The repetitive nature of the regex you are troubled about can be taken over by the repeat() method:
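A sketch of how this might look, assuming the same input and the question's [a-z=]+ group. The helper name buildPairRegex is hypothetical, and the leading ? is escaped here on the assumption that the question intends a literal question mark.

```javascript
// Query string taken from the question.
const input =
  "?some=querystring&ssembly=containing&n=indeterminate&mount=of&ll=potentially&ccordant=matches";

// Hypothetical helper: builds ^\?([a-z=]+)&([a-z=]+)... with `count` groups,
// so the capture group is written once instead of being copied by hand.
function buildPairRegex(count) {
  const group = "([a-z=]+)";
  // First group, then "&" + group repeated for the remaining count - 1 pairs.
  return new RegExp("^\\?" + group + ("&" + group).repeat(count - 1));
}

const regex = buildPairRegex(3);

// Index 0 of the match is the whole matched text; skip it and take the groups.
const [, one, two, three] = input.match(regex);

console.log(one, two, three);
```

The resulting expression is equivalent to the hand-written three-group version, but the number of captures becomes a parameter.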