skip to Main Content

Im trying to get my head around the most effective way to achieve intended result.
After some reading i realize turning only_full_group_by=off is a bad idea.

I have the following data:

Table test

id task user
1 21 4
2 22 4
3 23 3
4 25 3
5 25 4
6 22 4

Goal

i want to return all task values that do not have an other user assigned to it. So user=4should return [21,22] and user=3 => [23]

Tried solutions

to achieve this using only_full_group_by=off i put together the following SQL

SELECT task, count(distinct user), user  FROM test
GROUP BY task
HAVING count(distinct user) = 1 AND user=4

SQL fiddle 1

But to achieve a similar result with only_full_group_by=on the query seems unnecessarily complex (maybe it isnt, let me know)

SELECT DISTINCT test.task, test.user FROM test, (
  SELECT task, count(distinct user)  from test
  GROUP BY task
  HAVING count(distinct user) = 1
) as tmp WHERE user = 4 AND test.task=tmp.task

SQL fiddle2

Both queries returns "same" output but the latter seems unnecessarily complex and possibly inefficient. But im no SQL expert.

Is there a more efficient way to achive this outcome?

2

Answers


  1. You could do this with a LEFT JOIN that looks for another user on the same task, then enforce the constraint that this should not match any rows:

    SELECT DISTINCT test.task, test.user
    FROM test
    LEFT JOIN test t2 ON
        test.task = t2.task
        AND test.user <> t2.user
    WHERE test.user = 4
    AND t2.id IS NULL;
    
    Login or Signup to reply.
  2. This query is essentially the same as Rob Eyre’s answer but written using NOT EXISTS, instead of the traditional anti-join:

    SELECT DISTINCT t1.task, t1.user
    FROM test t1
    WHERE user = 4
    AND NOT EXISTS (
        SELECT 1 FROM test t2
        WHERE t1.task = t2.task AND t1.user <> t2.user
    );
    

    Unless there is a reason for having the surrogate PK, I would remove it and make (task user) the PK and add a secondary on (user, task), after removing the duplicates.

    DELETE t1
    FROM test t1
    JOIN (
      SELECT id, ROW_NUMBER() OVER (PARTITION BY task, user ORDER BY id) rn FROM test
    ) t2 ON t1.id = t2.id
    WHERE t2.rn > 1;
    
    ALTER TABLE test
      DROP COLUMN id,
      ADD PRIMARY KEY (task, user),
      ADD INDEX (user, task);
    

    Here’s a db<>fiddle.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search