Suppose the following:
create schema bv;
create table bv.user(id bigint primary key);
create table bv.user_photo (
id bigint primary key,
url varchar(255) not null,
user_id bigint references bv.user(id)
);
insert into bv.user values (100), (101);
insert into bv.user_photo values
(1, 'https://1.com', 100),
(3, 'https://3.com', 100),
(4, 'https://4.com', 101),
(2, 'https://2.com', 100);
I’d like to query for and build an object for every user, and include only the latest image in the result.
Here’s what I have:
select
json_build_object(
'id', u.id,
'latest_image', up.url
) user
from bv.user u
left join bv.user_photo up
on u.id = up.user_id
However this returns,
[
{"id" : 100, "url" : "https://2.com"},
{"id" : 100, "url" : "https://3.com"},
{"id" : 100, "url" : "https://1.com"},
{"id" : 101, "url" : "https://4.com"}
]
However, the expected result is:
[
{"id" : 100, "url" : "https://3.com"},
{"id" : 101, "url" : "https://4.com"}
]
I’ve tried using distinct
,
select distinct on(u.id)
json_build_object(
'id', u.id,
'url', up.url
) user
from bv.user u
left join bv.user_photo up
on u.id = up.user_id
order by u.id, up.id DESC
But my question is whether or not this is the correct approach? I feel like I shouldn’t be using distinct
in such a situation.
2
Answers
This is the correct approach and you by all means should be using
distinct on
in situations like this. Plaindistinct
– no,distinct on
– yes:One thing I’m missing here is that you’re expecting a single array of those objects, but your query returns each object in a separate row. You need a
json_agg()
to collect them: demoWith few photos per user, and while you return all (or most) users,
DISTINCT ON
is the best approach.But it’s typically faster to get distinct photos before you join:
For many photos per user, an emulated index-skip scan is (much) faster. See:
To produce one summary array, you can skip
json_build_object()
.json_agg()
can aggregate the row directly:fiddle
Aside: Don’t use the reserved word "user" as identifier. It works while schema-qualified, but fails without.