I am in need to use the Word Boundary(y) as a condition to match the exact word which is been searched as a Query Param. I am writing this regex and executing in PostgreSQL but it is returning FALSE.
text_search_value = "apple." (Which needs to be exactly matched in the {doc_table} which is presented in the code)
My code
vector_split_text = re.sub(".|-|/", " ", text_search_value)
text_search_value = re.escape((re.sub(")",")",re.sub("(","(",text_search_value))))
converted_part_query = f""""{doc_table}".lni IN (SELECT BTRIM(xxx) FROM "{docsplit_table}" WHERE yyy IN ({all_rcis}) AND doc_text_vector @@ phraseto_tsquery('simple', '{vector_split_text}'))"""
converted_part_query = (
"("
+ converted_part_query
+ f""" and "{doc_table}".document::text ~ 'y{text_search_value}y'"""
+ ")"
)
PostgreSQL Query
select 'apple.' ~ 'yapple.y'
Expected Result
The Select Query should return TRUE while executing in the PostgreSQL Table with the help of regex fixed.
2
Answers
I’m sorry, I might be confused, but if you’re using a Python regex command, y is not a word boundary. Instead, you’re using the literal "y" letter to match your word. You should use b for a word boundary. The correct regex syntax should look like the following:
select ‘apple.’ ~ ‘bappleb’
made with regex101
The literal dot
.
doesn’t belong to theword
character class:That, combined with two meanings of
y
:y
And the note under the table this is from:
Means your pattern expects another word to start at the second
y
.demo at db<>fiddle
If you’re looking for an
apple
followed by a dot and not preceded or followed by any word characters, use'yapple.Y'
. Or a'yappley'
if you don’t really need the dot.I’m not sure what it is you’re trying to achieve exactly but if you’re trying to wrap user input, you might need to
trim()
non-word characters before adding anm
on the front and anM
at the end to avoid this, or switch betweeny
andY
based on the leading/trailing character, if you don’t want to match anything with another word glued to it.