I’ve tried all afternoon to dedup a table that looks like this:
ID1 | ID2 | Date | Time |Status | Price
----+-----+------------+-----------------+--------+-------
01 | A | 01/01/2022 | 10:41:47.000000 | DDD | 55
01 | B | 02/01/2022 | 16:22:31.000000 | DDD | 53
02 | C | 01/01/2022 | 08:54:03.000000 | AAA | 72
02 | D | 03/01/2022 | 11:12:35.000000 | DDD |
03 | E | 01/01/2022 | 17:15:41.000000 | DDD | 67
03 | F | 01/01/2022 | 19:27:22.000000 | DDD | 69
03 | G | 02/01/2022 | 06:45:52.000000 | DDD | 78
Basically, I need to dedup based on two conditions:
Status
: where AAA > BBB > CCC > DDD. So, pick the highest one.- When the
Status
is the same given the sameID1
, pick the latest one based onDate
andTime
.
The final table should look like:
ID1 | ID2 | Date | Time |Status | Price
----+-----+------------+-----------------+--------+-------
01 | B | 02/01/2022 | 16:22:31.000000 | DDD | 53
02 | C | 01/01/2022 | 08:54:03.000000 | AAA | 72
03 | G | 02/01/2022 | 06:45:52.000000 | DDD | 78
Is there a way to do this in Redshift SQL / PostgreSQL?
I tried variations of this, but everytime it doesn’t work because it demands that I add all columns to the group by, so then it defeats the purpose
select a.id1,
b.id2,
b.date,
b.time,
b.status,
b.price,
case when (status = 'AAA') then 4
when (status = 'BBB') then 3
when (status= 'CCC') then 2
when (status = 'DDD') then 1
when (status = 'EEE') then 0
else null end as row_order
from table1 a
left join table2 b
on a.id1=b.id1
group by id1
having row_order = max(row_order)
and date=max(date)
and time=max(time)
Any help at all is appreciated!
3
Answers
Windowing functions are good at this:
See it work here:
You can use
ROW_NUMBER()
like so:This is a typical top-1-per-group problem. The canonical solution indeed involves window functions, as demonstrated by Joel Coehoorn and Aaron Dietz.
But Postgres has a specific extension, called
distinct on
, which is built exactly for the purpose of solving top-1-per-group problems. The syntax is neater, and you benefit built-in optimizations:Here is a demo on DB Fiddle based on that of Joel Coehoorn.