I have three tables:
forums
with columnsid, label, name
discussions
with columnsid, name, creation_date, open, forum_id, user_id
messages
with columnsid, content, date, updated_date, discussion_id, user_id
I want to get each forum and for each forum get the number of discussions and the number of messages. I currently have the following code:
SELECT
"Forum"."id", "Forum"."label", "Forum"."name",
COUNT("Discussions"."id") AS "nbDiscussions",
COUNT("Discussions->Messages"."id") AS "nbMessages"
FROM
"forums" AS "Forum"
LEFT OUTER JOIN
"discussions" AS "Discussions" ON "Forum"."id" = "Discussions"."forum_id"
LEFT OUTER JOIN
"messages" AS "Discussions->Messages" ON "Discussions"."id" = "Discussions->Messages"."discussion_id"
GROUP BY
"Forum"."id"
ORDER BY
"Forum"."id";
It almost works except that nbDiscussions
have the same value as nbMessages
. Why are they the same?
On the result that I have, it looks like nbDiscussions
took the value of nbMessages
.
2
Answers
You may correct the counting problem in
nbDiscussions
by instead counting:Full query:
The point to realize here is that the final join from
discussions
tomessages
will result in a final table where a given discussionid
may be appear across multiple records, one for each message. UsingDISTINCT
avoids the over counting.While aggregating all (or most) rows, it is substantially faster to aggregate tables on the "many"-side first and join later. It rules out wrong (multiplied) counts, and columns from the primary table do not have to be aggregated at all (after those have not been multiplied to begin with):
① Add
COALESCE
only if there can be forums without discussions or discussions without messages, and you want to display 0 instead of null.See: