My page receives the conditions as strings in an array from somewhere I can’t change. The array values look like this:
$array_conditions[0] = column1 IN (1, 2, 3, 4)
I have a loop set up to go through these conditions and build the query string, which ends up looking like this:
SELECT * FROM table WHERE column1 IN (1, 2, 3, 4) AND column2 IN (1, 2, 3, 4)
So, The issue is, that array can sometimes contain two conditions under the same column name. There are way too many possible columns so I can’t treat each individually, and column names can possibly repeat any number of times.
$array_conditions[0] = column1 IN (1, 2, 3, 4)
$array_conditions[1] = column2 IN (1, 2, 3)
$array_conditions[2] = column1 IN (4, 5, 6)
SELECT * FROM table WHERE column1 IN (1, 2, 3, 4) AND column2 IN (1, 2, 3) AND column1 IN (4, 5, 6)
How would I go about to get an output like one of these two?
SELECT * FROM table WHERE column1 IN (1, 2, 3, 4, 5, 6) AND column2 IN (1, 2, 3)
SELECT * FROM table WHERE (column1 IN (1, 2, 3, 4) OR column1 IN (4, 5, 6)) AND column2 IN (1, 2, 3)
After a bit of searching on PHP functions, I realized regex might be the way to solve this. But I didn’t know any regex at all. I tried searching for some solutions online and even studying regex for a while but I haven’t come across the solution I’ve been looking for. All threads I could find could be solved by something like:
(?<=,|^)([^,]*)(?=.*\b\1\b)(?=,|$)
But a solution like this wouldn’t be enough, because I still need to either keep the values inside the parenthesis or keep the whole condition, bring it next to the other condition with the same column name then replace the "AND" with an "OR"
Best thing I managed to get was out from ChatGPT. I can’t undestand a thing about it and it can only replace one condition with two matches at a time. I tried coming up with a loop to replace it as many times as needed but I couldn’t figure out a way to make it run only if there were still duplicates.
$sql = preg_replace_callback('/(w+)s+ins+((d+(,s*d+)*)).*?1s+ins+((d+(,s*d+)*)/',
function($matches) {return "({$matches[1]} in ({$matches[2]}) or {$matches[1]} in ({$matches[4]}))";}, $sql);
2
Answers
A possible solution would be to break it down to 2 passes.
The first goes through all of the options and extracts the column name and the values (assuming they are all in clauses)…
This would mean you now have a list of the variables and all of the values form the in clauses in one array.
So now iterate over this array and build an array of the conditions, imploding the separated values into one in clause…
Finally output the array of parts with the and (change to or if needed)
which with your sample input gives…
Using a regex is a good idea, but I would not overcomplicate it.
Here I strip the field names and the values and store it in an array. Then recompose it. With array_unique you can ensure not having duplicates and the intval does some sanitizing.
giving output of: