I'm using PostgreSQL, and I have a table named stock. This table has no index or id:
open | high | low | close | volume | datetime |
---|---|---|---|---|---|
383.97 | 384.22 | 383.66 | 384.08 | 1298649 | 2022-12-16 14:25:00 |
383.59 | 384.065 | 383.45 | 383.98 | 991327 | 2022-12-16 14:20:00 |
383.59 | 384.065 | 383.45 | 383.98 | 991327 | 2022-12-16 14:20:00 |
383.59 | 384.065 | 383.45 | 383.98 | 991327 | 2022-12-16 14:20:00 |
383.64 | 384.2099 | 383.54 | 383.61 | 1439271 | 2022-12-16 14:15:00 |
How can I remove the rows that have a duplicated datetime and keep only one row for each datetime (the latest one), using PostgreSQL SQL?
The output should be:
open | high | low | close | volume | datetime |
---|---|---|---|---|---|
383.97 | 384.22 | 383.66 | 384.08 | 1298649 | 2022-12-16 14:25:00 |
383.59 | 384.065 | 383.45 | 383.98 | 991327 | 2022-12-16 14:20:00 |
383.64 | 384.2099 | 383.54 | 383.61 | 1439271 | 2022-12-16 14:15:00 |
Something like:
delete from stock where datetime duplicated > 1
2 Answers
One possible option to solve this problem is to:
These steps can be condensed in the following three queries:
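A minimal sketch of what a three-statement approach could look like, assuming the table is rebuilt through a deduplicated copy. The name `stock_dedup` is hypothetical, and this is one possible reading of the approach rather than necessarily the exact queries meant above.

```sql
BEGIN;

-- 1. Copy one row per datetime into a new table.
--    stock_dedup is a hypothetical name chosen for this sketch.
--    DISTINCT ON keeps the first row of each datetime group; since the
--    duplicates in the question are identical, it does not matter which one.
CREATE TABLE stock_dedup AS
SELECT DISTINCT ON (datetime) *
FROM stock
ORDER BY datetime DESC;

-- 2. Drop the original table, duplicates included.
DROP TABLE stock;

-- 3. Give the deduplicated copy the original name.
ALTER TABLE stock_dedup RENAME TO stock;

COMMIT;
```

Because `CREATE TABLE ... AS` copies only the data, any indexes, constraints, or privileges on the original table would have to be recreated; the table in the question has none, so that is not a concern here.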
As your table does not contain a primary key, you'll have to use `ctid`.
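A sketch of such a query, assuming `datetime` is the key that should be unique. Ordering by `ctid DESC` is an assumption made here to approximate "latest": it ranks the physically last-stored row first, which is not guaranteed to be the most recently inserted one.

```sql
-- Number the rows within each datetime group.
-- rn = 1 marks the row to keep; rn > 1 marks its duplicates.
SELECT ctid,
       datetime,
       row_number() OVER (
           PARTITION BY datetime   -- the column that should be unique
           ORDER BY ctid DESC      -- assumption: keep the physically last row
       ) AS rn
FROM stock;
```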
This query reports the duplication "index": all rows with `rn > 1` are duplicates.
Note that you set the `partition by` to your unique key, and with `order by` you can control which row will be preserved.
Then you use the `ctid` of the duplicated rows to get rid of them.
Sample (simplified) data