This question is similar to another one, but I’m providing a simpler example. The other query was too advanced to make sense to me.
Sample (fake) data
id | gender | kg |
---|---|---|
4f5a07ca-02e0-8981-3c30-4d9924a169a3 | male | 103 |
4f5a07ca-02e0-8981-3c30-4d9924a169a3 | male | 85 |
4f5a07ca-02e0-8981-3c30-4d9924a169a3 | male | 469 |
e05d54e9-8292-b26c-5618-8a3712b4fc44 | female | 33 |
Desired outcome
id | gender | kg |
---|---|---|
4f5a07ca-02e0-8981-3c30-4d9924a169a3 | male | 85 |
e05d54e9-8292-b26c-5618-8a3712b4fc44 | female | 33 |
e05d54e9-8292-b26c-5618-8a3712b4fc44 | female | 36 |
01f8bbfd-cfc6-3b97-8bc1-8da6f0b4a9a8 | female | 92 |
(Goal is having the same id only show up once, and just picking the first match, given an ordering by kg
)
QUERY:
SELECT
p.id,
p.gender,
p.kg
FROM patient p
ORDER BY p.kg
GROUP BY 1
Error:
‘p.gender’ must be an aggregate expression or appear in GROUP BY clause
And if I change it to GROUP BY 1, 2
, I get the same error, one column over:
‘p.kg’ must be an aggregate expression or appear in GROUP BY clause
How can I solve this?
3
Answers
You can use an aggregate function to get min/max/avg kg. Otherwise you need to replace p.kg with a subquery returning a single version, something like:
what you need here is a window function first_value combined with distinct, rather than group-by / aggregate.
the window function + distinct pattern can sometimes be used to replicate aggregate query
"Ordering by kg" combined with selecting only one value sounds awfully like
min
/max
aggregate function. In this casemin
(for gender you can usemin_by
orarbitrary
):