Postgresql - Keep only duplicates including original value

clubkli
December 4, 2024
208 views
0 votes
2 Answers

I am using the row_number() functon to identify duplicates, partitioned by ID and Name:

SELECT *
, ROW_NUMBER() OVER(PARTITION BY "ID", "Name" ORDER BY "date" ASC) as row_number
from table;

My data looks like:

ID	Name	date	row_number
1	Name1	2024-11-19	2
1	Name1	2024-11-18	1
2	Name1	2024-11-18	1
3	Name2	2024-11-19	1
4	Name3	2024-11-12	3
4	Name3	2024-11-11	2
4	Name3	2024-11-10	1
10	Name7	2024-11-10	1

Now I want to keep only the rows where we have multiple times the same ID and Name combination.
So the third, fourth and last row should be removed and the output should be as below. How can this be done?

ID	Name	date	row_number
1	Name1	2024-11-19	2
1	Name1	2024-11-18	1
4	Name3	2024-11-12	3
4	Name3	2024-11-11	2
4	Name3	2024-11-10	1

Tags: postgresql

Answers

- TimBiegeleisen
- December 4, 2024 at 12:12 pm
- 0 votes
0
I would use COUNT() here as a window function, with the same partition as you were using with ROW_NUMBER():
```
WITH cte AS (
    SELECT *, COUNT(*) OVER (PARTITION BY ID, Name) cnt,
              ROW_NUMBER() OVER (PARTITION BY ID, Name ORDER BY "date") rn
    FROM yourTable
)

SELECT ID, Name, "date", rn
FROM cte
WHERE cnt > 1
ORDER BY ID, "date" DESC;
```
Login or Signup to reply.

-- create a table
CREATE TABLE Temp (
    ID INT,
    Name VARCHAR(100),
    date DATE,
    row_num INT
);

-- insert some values
INSERT INTO Temp (ID, Name, date, row_num) VALUES
    (1, 'Name1', '2024-11-19', 2),
    (1, 'Name1', '2024-11-18', 1),
    (2, 'Name1', '2024-11-18', 1),
    (3, 'Name2', '2024-11-19', 1),
    (4, 'Name3', '2024-11-12', 3),
    (4, 'Name3', '2024-11-11', 2),
    (4, 'Name3', '2024-11-10', 1),
    (10, 'Name7', '2024-11-10', 1);

-- query to group
WITH Data AS (
    SELECT *,
           ROW_NUMBER() OVER(PARTITION BY ID, Name ORDER BY date ASC) AS r_num,
           COUNT(*) OVER(PARTITION BY ID, Name) AS count_duplicates
    FROM Temp
)
SELECT ID, Name, date, r_num
FROM Data
WHERE count_duplicates > 1
ORDER BY ID, Name, date;

Please signup or login to give your own answer.

Click here to cancel reply.

Postgresql – Keep only duplicates including original value

Answers