I am working on a search function in which the matches are weighted based on certain conditions. One condition I want to add weight for is a match where the character length of the query string in a LIKE match is longer than 4.
This is roughly what I want the query to look like. %s is meant to represent the actual match found by LIKE, but I don’t think it does. I’m wondering whether there is a special variable in MySQL that represents the precise character match found by LIKE.
SELECT help.*,
IF(CHAR_LENGTH(%s) > 4, 2, 0) w
FROM help
WHERE (
(title LIKE '%this%' OR title LIKE '%testy%' OR title LIKE '%test%') OR
(content LIKE '%this%' OR content LIKE '%testy%' OR content LIKE '%test%')
) LIMIT 1000
edit: In the PHP I could split the search string array into two arrays based on the character length of the elements, run two separate queries that return different values for ‘w’, and then combine the results, but I’d rather not do that, as it seems to me that would be awkward, messy, and slow.
2 Answers
I think the answer to my question is that what I wanted to do isn't possible. There is no special variable in MySQL representing the core character match in a WHERE conditional where LIKE is the operator. The match is the contents of the returned data row.
What I did to reach my objective was to take the original dynamic list of search tokens, iterate through that list, and perform a search on each token, with the SQL tailored to the conditions that matched each token.
As I did this I built an array of the search results, using the id for the database row as the index for the array. This allowed me to perform calculations with the array elements, while avoiding duplicates.
I'm not posting the PHP code because the original question was about the SQL.
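For readers who want to see the per-token idea in SQL, here is a rough sketch, not the code I actually ran. It assumes the help table has an id column, that the weight for each token (2 if the token is longer than 4 characters, otherwise 0) is filled in by the application when it builds the statement, and that a row matched by several tokens should keep its highest weight. The aliases per_token and ranked are made up for illustration.

```sql
-- Sketch only: one SELECT per search token, with the weight written in
-- as a literal because the token length is known when the SQL is built.
SELECT help.*, ranked.w
FROM help
JOIN (
    -- Collapse duplicates: a row hit by several tokens keeps its top weight.
    SELECT id, MAX(w) AS w
    FROM (
        SELECT id, 0 AS w FROM help   -- 'this' has 4 characters
        WHERE title LIKE '%this%' OR content LIKE '%this%'
        UNION ALL
        SELECT id, 2 AS w FROM help   -- 'testy' has 5 characters
        WHERE title LIKE '%testy%' OR content LIKE '%testy%'
        UNION ALL
        SELECT id, 0 AS w FROM help   -- 'test' has 4 characters
        WHERE title LIKE '%test%' OR content LIKE '%test%'
    ) AS per_token
    GROUP BY id
) AS ranked ON ranked.id = help.id
LIMIT 1000;
```

Using MAX(w) is just one choice; SUM(w) would instead reward rows that match several long tokens.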
Check out FULLTEXT as another way to discover rows. It will be faster, but won’t address your question.
This probably has the effect you want.
Note that the "match" in your LIKEs includes the %, so it is the entire length of the string. I don’t think that is what you wanted.
REGEXP "(this|testy|that)" will match either 4 or 5 characters (in this example). It may be possible to do something with REGEXP_REPLACE to replace that with the empty string, then see how much it shrank.
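The shrink idea might look something like the sketch below. It assumes a server that has REGEXP_REPLACE (MySQL 8.0 or later), uses the three tokens from the question, and lists 'testy' before 'test' so the longer alternative is tried first. Treat it as an approximation: the shrinkage is the total length of all matches removed, so two separate occurrences of a 4-character token would also push it past 4.

```sql
-- Sketch only: measure how much the searched text shrinks when every
-- match is replaced with the empty string, then weight on that amount.
SELECT help.*,
       IF(CHAR_LENGTH(CONCAT_WS(' ', title, content))
          - CHAR_LENGTH(REGEXP_REPLACE(CONCAT_WS(' ', title, content),
                                       'testy|this|test', '')) > 4,
          2, 0) AS w
FROM help
WHERE title REGEXP 'testy|this|test'
   OR content REGEXP 'testy|this|test'
LIMIT 1000;
```

CONCAT_WS is used so that a NULL title or content is skipped rather than turning the whole expression into NULL.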