I’m working with 4 to 8 digit numeral strings, range 0001 to 99999999. Examples are:
- 0010
- 877565
- 90204394
I need to check whether a numeral string can be formed out of a defined set. Think of it as a Scrabble bag of loose characters. The set contains:
- 2 times 0 (00)
- 4 times 1 (1111)
- 3 times 2 (222)
- 2 times 3 (33)
- 3 times 4 (444)
- 5 times 5 (55555)
- 2 times 6 (66)
- 5 times 7 (77777)
- 2 times 8 (88)
- 2 times 9 (99)
With this defined set of numerals, the string 0010 cannot be formed because it has 1 zero too many: it needs 3 but the set only provides 2. Outcome should be: false.
In contrast, the string 90204394 can be formed because the defined set provides a sufficient number of each numeral. It falls within parameters; desired output: true.
I thought to carry out the check by means of regex because that will return either true or false, which is perfect in this case. I came up with the following:
preg_match('/(0{0,2}1{0,4}2{0,3}3{0,2}4{0,3}5{0,5}6{0,2}7{0,5}8{0,2}9{0,2})/', $string);
Unfortunately, every tested string outputs true, even when it clearly cannot be formed, like 08228282 (as it contains one 8 and one 2 too many).
What am I missing here?
2 Answers
Scott,
It looks like regex is not what you’re looking for.
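This regex returns true because it consumes the digits sequentially, in the fixed order 0 to 9, and it is not anchored to the start and end of the string, so it only has to match some part of the input.
For example, in the case of 08228282, it first looks at the digit 0, which occurs 1 time at that position; that is between 0 and 2 times ({0,2}). It then skips the digits 1 to 7 (each may occur 0 times) and reaches the digit 8, which occurs only 1 time at that position and so satisfies {0,2} as well. The verification stops there; nothing else needs to be validated, because everything else can occur 0 times.
Another example: for 877565 it only validates the first occurrence of the digit 8. Because every quantifier allows zero occurrences, the pattern can even match an empty string, so preg_match() succeeds for any input.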
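To make this visible, here is a minimal sketch that prints what the pattern from the question actually matched. The helper name matchedPart() is made up for this illustration.

```php
<?php
// The pattern from the question, unchanged.
$pattern = '/(0{0,2}1{0,4}2{0,3}3{0,2}4{0,3}5{0,5}6{0,2}7{0,5}8{0,2}9{0,2})/';

// Hypothetical helper: returns the substring the whole pattern matched,
// or null when preg_match() finds no match at all.
function matchedPart(string $pattern, string $input): ?string
{
    return preg_match($pattern, $input, $m) === 1 ? $m[0] : null;
}

var_dump(matchedPart($pattern, '08228282')); // string(2) "08"
var_dump(matchedPart($pattern, '877565'));   // string(1) "8"
var_dump(matchedPart($pattern, 'abc'));      // string(0) ""
```

The match is never null: the regex is satisfied by a short prefix (or by nothing), so the rest of the string is never checked.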
I think the solution you need is not regex, since you just need the total occurrences of each digit.
You should look into splitting the number into its digits and counting the occurrences of each. Try something like this:
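A minimal sketch of that idea, assuming the limits are kept in an array keyed by digit. The names $limits and canBeFormedByCounting() are chosen for this example and do not come from the question.

```php
<?php
// Maximum number of times each digit may be used (the "bag" from the question).
$limits = [0 => 2, 1 => 4, 2 => 3, 3 => 2, 4 => 3, 5 => 5, 6 => 2, 7 => 5, 8 => 2, 9 => 2];

function canBeFormedByCounting(string $input, array $limits): bool
{
    // Assumption: only strings of 4 to 8 digits are valid input.
    $length = strlen($input);
    if (!ctype_digit($input) || $length < 4 || $length > 8) {
        return false;
    }

    // Split into single digits and tally how often each one occurs.
    foreach (array_count_values(str_split($input)) as $digit => $count) {
        if ($count > ($limits[$digit] ?? 0)) {
            return false; // this digit is needed more often than the bag provides
        }
    }

    return true;
}

var_dump(canBeFormedByCounting('0010', $limits));     // bool(false)
var_dump(canBeFormedByCounting('90204394', $limits)); // bool(true)
```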
I don’t understand the format of your whitelisted number counts, but if they are hardcoded like you’ve posted in your question, you can use full-string lookaheads to validate the count of each whitelisted number.
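One way such a pattern could look is sketched below; it is an illustration under the limits listed in the question, and the function name matchesHardcodedLimits() is invented here. Each lookahead scans the whole string from the start and allows its digit at most the stated number of times; the final \d{4,8} enforces the length.

```php
<?php
function matchesHardcodedLimits(string $input): bool
{
    // One full-string lookahead per digit: "(?=[^d]*(?:d[^d]*){0,max}$)".
    // The D modifier makes $ match only at the very end of the subject.
    $pattern = '~^'
        . '(?=[^0]*(?:0[^0]*){0,2}$)'
        . '(?=[^1]*(?:1[^1]*){0,4}$)'
        . '(?=[^2]*(?:2[^2]*){0,3}$)'
        . '(?=[^3]*(?:3[^3]*){0,2}$)'
        . '(?=[^4]*(?:4[^4]*){0,3}$)'
        . '(?=[^5]*(?:5[^5]*){0,5}$)'
        . '(?=[^6]*(?:6[^6]*){0,2}$)'
        . '(?=[^7]*(?:7[^7]*){0,5}$)'
        . '(?=[^8]*(?:8[^8]*){0,2}$)'
        . '(?=[^9]*(?:9[^9]*){0,2}$)'
        . '\d{4,8}$~D';

    return preg_match($pattern, $input) === 1;
}

foreach (['0010', '877565', '90204394', '08228282'] as $test) {
    echo $test, ': ', var_export(matchesHardcodedLimits($test), true), "\n";
}
```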
If you are building your regex from an array of numbers and their counts, then you can use:
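A possible sketch of that approach, assuming the counts arrive as an array keyed by digit. Both buildLimitPattern() and the $limits layout are assumptions made for this example.

```php
<?php
// Assumed input format: digit => maximum allowed occurrences.
$limits = [0 => 2, 1 => 4, 2 => 3, 3 => 2, 4 => 3, 5 => 5, 6 => 2, 7 => 5, 8 => 2, 9 => 2];

function buildLimitPattern(array $limits): string
{
    $lookaheads = '';
    foreach ($limits as $digit => $max) {
        // Full-string lookahead allowing $digit at most $max times.
        $lookaheads .= sprintf('(?=[^%1$d]*(?:%1$d[^%1$d]*){0,%2$d}$)', $digit, $max);
    }

    // Length check on the actual match; D keeps $ from matching before a trailing newline.
    return '~^' . $lookaheads . '\d{4,8}$~D';
}

$pattern = buildLimitPattern($limits);

var_dump(preg_match($pattern, '0010') === 1);     // bool(false)
var_dump(preg_match($pattern, '90204394') === 1); // bool(true)
```

Note that a digit missing from the array is not restricted by this pattern, so the array should list all ten digits.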
Alternatively, you could return false when a number’s max is exceeded and invert the boolean result shown in the first snippet.
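A rough sketch of the inverted check, assuming the input has already been validated as a string of 4 to 8 digits. The function names are hypothetical. Each branch of the alternation matches as soon as one digit occurs one time more than its limit.

```php
<?php
function exceedsALimit(string $input): bool
{
    // "d(?:[^d]*d){max}" means digit d occurs at least max + 1 times.
    $pattern = '~0(?:[^0]*0){2}|1(?:[^1]*1){4}|2(?:[^2]*2){3}|3(?:[^3]*3){2}'
        . '|4(?:[^4]*4){3}|5(?:[^5]*5){5}|6(?:[^6]*6){2}|7(?:[^7]*7){5}'
        . '|8(?:[^8]*8){2}|9(?:[^9]*9){2}~';

    return preg_match($pattern, $input) === 1;
}

function canBeFormedInverted(string $input): bool
{
    // A match means a violation, so the result is inverted.
    return !exceedsALimit($input);
}

var_dump(canBeFormedInverted('08228282')); // bool(false)
var_dump(canBeFormedInverted('90204394')); // bool(true)
```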
Or
Or with negated lookaheads for an exceeded count limit:
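A sketch of what that could look like, with the pattern generated in a loop to keep it short. The function name and the $limits array are assumptions for this example. Each negative lookahead rejects the string if its digit can be found one time more than allowed.

```php
<?php
// Assumed input format: digit => maximum allowed occurrences.
$limits = [0 => 2, 1 => 4, 2 => 3, 3 => 2, 4 => 3, 5 => 5, 6 => 2, 7 => 5, 8 => 2, 9 => 2];

function passesNegatedLookaheads(string $input, array $limits): bool
{
    $pattern = '~^';
    foreach ($limits as $digit => $max) {
        // "(?!(?:[^d]*d){max+1})" fails when digit d occurs more than max times.
        $pattern .= sprintf('(?!(?:[^%1$d]*%1$d){%2$d})', $digit, $max + 1);
    }
    $pattern .= '\d{4,8}$~D';

    return preg_match($pattern, $input) === 1;
}

var_dump(passesNegatedLookaheads('0010', $limits));     // bool(false)
var_dump(passesNegatedLookaheads('90204394', $limits)); // bool(true)
```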
This can be performed without regex as well. For each input string, split the string into an array of digits, count those digits, filter out the non-violating elements, then check whether the result is empty.
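Those steps could be sketched as follows, assuming PHP 7.4 or later for the arrow function. The names limitViolations() and $limits are illustrative only.

```php
<?php
// Assumed input format: digit => maximum allowed occurrences.
$limits = [0 => 2, 1 => 4, 2 => 3, 3 => 2, 4 => 3, 5 => 5, 6 => 2, 7 => 5, 8 => 2, 9 => 2];

// Returns only the digits whose count exceeds the limit, as digit => count.
function limitViolations(string $input, array $limits): array
{
    $counts = array_count_values(str_split($input));

    return array_filter(
        $counts,
        // With ARRAY_FILTER_USE_BOTH the callback receives the value first, then the key.
        fn ($count, $digit) => $count > ($limits[$digit] ?? 0),
        ARRAY_FILTER_USE_BOTH
    );
}

foreach (['0010', '90204394', '08228282'] as $test) {
    // An empty result means no limit was exceeded.
    echo $test, ': ', var_export(limitViolations($test, $limits) === [], true), "\n";
}
```

A side benefit of this version is that the returned array says which digits were overused and how often they occurred, which a plain true/false regex result cannot do.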
I guess what I am saying is that there will be many ways to skin this cat.