I need to convert a text query from standard "verbal" boolean to the FULLLTEXT MATCH AGAINST + and – system.
For example; this value:
((word1 AND word2) OR (word3 AND word4 AND word5)) OR word6 NOT word7
Needs to be converted to:
((+word1 +word2) (+word3 +word4 +word5) word6 -word7
The issue I’m having is when I try to use REGEXP_REPLACE with a pattern like this:
\b([[:alnum:]_]+)\b
I can’t figure out how to have the function use the word that matched to append/amend to it. I’ve seen other examples trying to use 1 or something like that to represent the matching value, but it simply doesn’t work.
So how can I command REGEXP_REPLACE to dynamically scan a string for "(word +word" and replace it with "(+word +word" ? Also needs to do a leading simple one like:
It’s also a bit more complex, because if we have:
(word1 AND word2) AND (word3 OR word4)
That would need to be:
+(+word1 +word2) +(word3 word4)
Any guidance on the best way to do this without writing an ultra slow parser would be appreciated.
2
Answers
I figured out how to convert a AND/OR/NOT boolean statement like the below to a compatible MATCH AGAINST solution, and hope this helps other people.
This converts:
TO
CODE:
In a way, this SQL-style solution was made for fun, but maybe it will help someone.
db<>fiddle