skip to Main Content

See the query below where I get the reputation value from direct addition without a function.

SELECT
    "User"."id",
    "User"."name",
    "User"."title",
    "User"."about",
    "User"."location",
    "User"."isModerator",
    "User"."createdAt",
    (
--      question vote
        COALESCE(SUM("PostVote"."value" * 10) FILTER (WHERE "value" = 1 AND "Post"."questionId" IS NULL), 0)::int 
        +
        COALESCE(SUM("PostVote"."value" * 2) FILTER (WHERE "value" = -1 AND "Post"."questionId" IS NULL), 0)::int 
        +
--      answer vote
        COALESCE(SUM("PostVote"."value" * 10) FILTER (WHERE "value" = 1 AND "Post"."questionId" IS NOT NULL), 0)::int 
        +
        COALESCE(SUM("PostVote"."value" * 2) FILTER (WHERE "value" = -1 AND "Post"."questionId" IS NOT NULL), 0)::int 
        +
--      accepted answer
        (COALESCE(COUNT("answer"."id"), 0)::int * 15)
    ) as reputation
FROM "User"
LEFT JOIN "Post" ON "Post"."authorId" = "User"."id"
LEFT JOIN "PostVote" ON "PostVote"."postId" = "Post"."id"
LEFT JOIN "Post" as answer ON "Post"."acceptedAnswerId" = "answer"."id"
GROUP BY "User"."id"
ORDER BY reputation DESC, id

I did this because SUM cannot be nested, I originally wanted to do something like this.

-- ...
    SUM(
--      question vote
        COALESCE(SUM("PostVote"."value" * 10) FILTER (WHERE "value" = 1 AND "Post"."questionId" IS NULL), 0)::int 
        +
        COALESCE(SUM("PostVote"."value" * 2) FILTER (WHERE "value" = -1 AND "Post"."questionId" IS NULL), 0)::int 
        +
--      answer vote
        COALESCE(SUM("PostVote"."value" * 10) FILTER (WHERE "value" = 1 AND "Post"."questionId" IS NOT NULL), 0)::int 
        +
        COALESCE(SUM("PostVote"."value" * 2) FILTER (WHERE "value" = -1 AND "Post"."questionId" IS NOT NULL), 0)::int 
        +
--      accepted answer
        (COALESCE(COUNT("answer"."id"), 0)::int * 15)
    ) as reputation
-- ...

I’m curious if there are any performance impacts or summations that might potentially result in incorrect values? I have tested with negative values and it works fine, maybe there is something I missed.

I could have used the SUM function by nesting the queries, but I thought this would be too verbose.

2

Answers


  1. The difference is that SUM is an aggregation function — its argument is a single expression, which is evaluated for each relevant row, and it gives the sum of all of the results (or NULL if there were no results) — whereas + is a normal binary operator: it appears inside an expression, its operands are two subexpressions, and for any given row, it gives the sum of what the two subexpressions give.

    Additionally, they handle NULLs a bit differently; + gives NULL when either subexpression gives NULL, whereas SUM skips over NULLs (and returns NULL only if it receives no non-NULL values).

    I don’t think it’s meaningful to discuss relative performance, since they’re used in very different ways, so there are no queries that differ in using one vs. the other.

    Login or Signup to reply.
  2. I could have used the SUM function by nesting the queries …

    No you couldn’t have. You are missing the fundamental difference between an aggregate function (sum()) and a plain operator (+). ruakh‘s answer already goes into detail.

    But since you expressed concern for performance impacts: you can optimize your use of count() which never returns null. See:

    Assuming referential integrity, there is no need to actually join to the accepted answer. Just count not-null values in p."acceptedAnswerId". (I am not convinced you are counting "accepts" correctly in the first place, but I stuck to your logic.)

    Actually looks like you can replace all instances of sum() with a faster count(). Consequently, you need no COALESCE at all:

    SELECT u.id, u.name, u.title, u.about, u.location, u."isModerator", u."createdAt"
         ,( -- q & a upvote
           count(*) FILTER (WHERE pv."value" =  1)::int * 10
           -- q & a downvote 
         + count(*) FILTER (WHERE pv."value" = -1)::int * -2
           -- accepted answer
         + count(p."acceptedAnswerId")::int * 15  -- cheaper
          ) AS reputation
    FROM   "User"          u
    LEFT   JOIN "Post"     p  ON p."authorId" = u.id
    LEFT   JOIN "PostVote" pv ON pv."postId" = p.id
    -- LEFT JOIN "Post"     a  ON a.id = p."acceptedAnswerId"  -- noise?! 
    GROUP  BY u.id
    ORDER  BY reputation DESC, u.id;
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search