I have two tables keywords
and posts
in my PieCloudDB Database.
Each topic can be expressed by one or more keywords. If a keyword of a certain topic exists in the content of a post (case insensitive) then the post has this topic.
For example:
topic_id | keyword |
---|---|
1 | basketball |
2 | music |
3 | food |
4 | war |
post_id | content |
---|---|
1 | A typhoon warning has been issued in southern Japan |
2 | We are going to play neither basketball nor volleyball |
3 | I am indulging in both the delightful music and delectable food |
4 | That basketball player fouled again |
Now I want to find the topics of each post according to the following rules:
-
If the post does not have keywords from any topic, its topic should be "
Vague!
". -
If the post has at least one keyword of any topic, its topic should be a string of the IDs of its topics sorted in ascending order and separated by commas ‘,’.
For the above example data, the results should be:
post_id | topics |
---|---|
1 | Vague! |
2 | 1 |
3 | 2,3 |
4 | 1 |
SELECT post_id, COALESCE(array_to_string(array_agg(DISTINCT topic_id ORDER BY topic_id), ','), 'Vague!') AS topic
FROM (
SELECT p.post_id, k.topic_id
FROM Posts p
LEFT JOIN Keywords k
ON LOWER(content) LIKE '% ' || keyword || ' %' OR content LIKE keyword || ' %' OR content LIKE '% ' || keyword
) a
GROUP BY post_id
ORDER BY post_id
I tried this query but the results I got were not exactly correct. I don’t know why the output of post 1 is null
:
post_id | topics |
---|---|
1 | |
2 | 1 |
3 | 2,3 |
4 | 1 |
Can anyone give me a correct answer?(If you don’t know the database I use, you can use PostgreSQL instead, thanks)
3
Answers
You’re not distinguishing between
NULL
and the empty string (''
) – this first query shows you where the problem is, the second shows a solution (all of the code below is available on the fiddle here):Note the use of the
PG_TYPEOF()
function (manual) – can be handy for debugging!Result:
So, the correct approach is:
Result:
You can simplify your query and just use
STRING_AGG
, so you get rid of your subquery. By the way, please take care to always put the table alias before the column name, not only sometimes.See this db<>fiddle with your data.
The fiddle also shows you could use
CASE
rather thanCOALESCE
in your orignal query which would solve your issue too. The issue wasCOALESCE
replacesNULL
values, but you got an empty string, notNULL
. Anyway, your previous query is far too complicated for your use case.You might also be able to simplify those
LIKE
conditions and rather use a Regex approach. Since I don’t use Pie DB and am not sure about regex functionality there, I leave this part up to you.To match keywords more accurately, you can try to use
regex
on your pieclouddb database.(The following query was tested on PostgreSQL)