In postgres, I have two tables: blog and blog_history. blog_history is to keep track of users and which blogs they read.
Tables structures are as follows:
blog table:
CREATE TABLE blog (
id serial PRIMARY KEY,
title text,
description text,
body text,
created_at timestamp without time zone,
);
blog_history table:
CREATE TABLE blog_history (
customer_username text NOT NULL,
created_by text NOT NULL,
created_at timestamp without time zone,
post_id integer,
CONSTRAINT fk_post_id
FOREIGN KEY(post_id)
REFERENCES blog(id)
);
I have some duplicate rows in blog_history table that has the same post_id which is unnecessary.
I want to remove all duplicate rows by keeping one distinct for each created_by and each customer_username.
Example:
SELECT customer_username, created_by, post_id from blog_history;
customer_username | created_by | post_id
-------------------+-------------------+---------
companyA | [email protected] | 1
companyA | [email protected] | 3
companyA | [email protected] | 2
companyA | [email protected] | 2
companyA | [email protected] | 2
companyA | [email protected] | 3
companyB | [email protected] | 3
companyB | [email protected] | 3
companyA | [email protected] | 1
companyA | [email protected] | 3
companyA | [email protected] | 3
companyA | [email protected] | 2
After deleting duplicates, result should be like this:
customer_username | created_by | post_id
-------------------+-------------------+---------
companyA | [email protected] | 1
companyA | [email protected] | 3
companyA | [email protected] | 2
companyB | [email protected] | 3
companyA | [email protected] | 1
companyA | [email protected] | 3
companyA | [email protected] | 2
So, I want to leave only one distinct post_id and delete all duplicate ones with the same post_id for same customer_username and same created_by.
2
Answers
You need a column that would identify each row. Add the primary key to the table:
Now you can easily identify rows with the lowest
id
in groups of duplicated rows.Test it in db<>fiddle.
Fiddle