Postgresql: distinct by one field and order by another

Joymaker
February 23, 2023
141 views
0 votes
2 Answers

I am following along the example of this excellent question and first answer:

PostgreSQL – fetch the rows which have the Max value for a column in each GROUP BY group

But I need to do something slightly different. I want to select distinctly based on events.uuid, because events.start might not be unique. But I want to order by events.start because uuid’s are generated in no particular order. Database doesn’t like it. What is a good way of doing what I really want to do?

SELECT DISTINCT ON (events.uuid) events.nickname, plays.id
FROM events
JOIN plays on plays.eventuuid = events.uuid
ORDER BY events.start;

In English I would state it like this: "For each event, give me exactly one play, and order the results by event start time."

Tags: postgresql

Answers

- jjanes
- February 23, 2023 at 5:00 pm
- 0 votes
0
You just need two ORDER BYs, which means you need two levels of select:
```
SELECT * from (
    SELECT DISTINCT ON (events.uuid) events.nickname, plays.id, events.start
    FROM events
    JOIN plays on plays.eventuuid = events.uuid
    ORDER BY events.uuid
) foo 
ORDER BY start;
```
You can actually omit the inner ORDER BY altogether, as it will be implied. But the implied ORDER BY still needs to be on a different query level than the explicit one.

Starting in v15, the dummy alias ("foo" here) can also be omitted.
Login or Signup to reply.

- GeorgeS
- February 23, 2023 at 10:02 pm
- 0 votes
0
If the uuid is a primary key for the events table, then you’re in luck because the set:
```
SELECT DISTINCT ON (events.start, events.uuid) events.nickname, plays.id
FROM events
JOIN plays on plays.eventuuid = events.uuid
ORDER BY events.start, events.uuid;
```
Is the same as that returned by:
```
SELECT DISTINCT ON (events.uuid) events.nickname, plays.id
FROM events
JOIN plays on plays.eventuuid = events.uuid
ORDER BY events.uuid;
```
But just in a different order.
Having both events.start and events.uuid in the distinct clause will mean the set of the two is unique, but if uuid is already unique per event row, then adding other columns from the events table to the distinct on clause should not change the result set.
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.