Postgresql - Checking two distinct lists with AND operation on same column

peti446
August 7, 2024
147 views
1 vote
2 Answers

I have the following example data structure, in which a customer can belong to multiple groups through a junction table, along with some sample data:

CREATE TABLE customer (
    id INT NOT NULL
);

CREATE TABLE groups (
    id INT NOT NULL
);

CREATE TABLE customers_to_groups (
    id serial,
    group_id INT,
    customer_id INT
);

INSERT INTO customer(id) VALUES(0);
INSERT INTO customer(id) VALUES(1);
INSERT INTO customer(id) VALUES(2);
INSERT INTO customer(id) VALUES(3);
INSERT INTO groups(id) VALUES(1);
INSERT INTO groups(id) VALUES(3);
INSERT INTO groups(id) VALUES(5);
INSERT INTO groups(id) VALUES(6);
INSERT INTO customers_to_groups(customer_id, group_id) VALUES(0, 1);
INSERT INTO customers_to_groups(customer_id, group_id) VALUES(0, 5);
INSERT INTO customers_to_groups(customer_id, group_id) VALUES(1, 1);
INSERT INTO customers_to_groups(customer_id, group_id) VALUES(1, 90);
INSERT INTO customers_to_groups(customer_id, group_id) VALUES(2, 1);
INSERT INTO customers_to_groups(customer_id, group_id) VALUES(3, 3);
INSERT INTO customers_to_groups(customer_id, group_id) VALUES(3, 5);
INSERT INTO customers_to_groups(customer_id, group_id) VALUES(3, 90);

I need to find customers based on the groups they belong to: a customer must be in at least one group from each of several lists of groups. For example, I want all customers that are in group [5 OR 6] AND [1 OR 3]. A customer in groups 5 and 1 would be returned, but a customer in groups 1 and 90, or only in group 1, would not. With the sample data above, only the customers with id 0 and 3 match these rules.

Just doing WHERE group_id IN (5,6) AND group_id IN (1,3) does not work, so I am looking for an alternative.
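For reference, the reason the plain filter fails is that WHERE is evaluated one row at a time, and each junction row carries a single group_id, so no row can be in both lists at once. A small sketch against the sample data above:

```sql
-- Both conditions are tested against the same row of customers_to_groups.
-- The two lists (5, 6) and (1, 3) share no value, so no row can pass both.
SELECT customer_id, group_id
FROM customers_to_groups
WHERE group_id IN (5, 6)
  AND group_id IN (1, 3);
-- With the sample data this returns no rows.
```

The two conditions therefore have to be checked across all rows of a customer, not within one row.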

So far I have this, which works:

SELECT DISTINCT c.id
FROM customer c
INNER JOIN customers_to_groups at1 ON c.id = at1.customer_id
INNER JOIN customers_to_groups at2 ON c.id = at2.customer_id
WHERE at1.group_id IN (5, 6)
  AND at2.group_id IN (1, 3);

Expected Results:

id
0
3

Is there a way to do it that is more performant?

Tags: postgresql sql

Answers

- Rahuljangid
- August 7, 2024 at 12:40 pm
- 0 votes
You can get the desired result with a GROUP BY and a HAVING clause. This approach reads the junction table once and avoids the multiple self-joins:
```
SELECT customer_id
FROM customers_to_groups
WHERE group_id IN (5, 6, 1, 3)
GROUP BY customer_id
HAVING COUNT(DISTINCT CASE WHEN group_id IN (5, 6) THEN 1 END) > 0
   AND COUNT(DISTINCT CASE WHEN group_id IN (1, 3) THEN 1 END) > 0;
```

Index Consideration:
To further improve performance, ensure that you have an index on the customer_id and group_id columns in the customers_to_groups table:
```
CREATE INDEX idx_customer_group ON customers_to_groups (customer_id, group_id);
```
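One way to check whether an index actually helps is to look at the plan with EXPLAIN. This is only a sketch, assuming the sample schema from the question. The second index, idx_group_customer, is a hypothetical name that is not part of the original answer; it puts group_id first, which may suit the WHERE group_id IN (...) filter better. Which plan the planner picks depends on table size and statistics, and on the tiny sample table a sequential scan is likely either way.

```sql
-- Hypothetical alternative index with group_id as the leading column
-- (an assumption for comparison, not something the answer above proposes).
CREATE INDEX idx_group_customer ON customers_to_groups (group_id, customer_id);

-- Refresh planner statistics so the estimates reflect the current data.
ANALYZE customers_to_groups;

-- EXPLAIN (ANALYZE, BUFFERS) runs the query and reports the actual plan,
-- timings and buffer usage, including which index (if any) was used.
EXPLAIN (ANALYZE, BUFFERS)
SELECT customer_id
FROM customers_to_groups
WHERE group_id IN (5, 6, 1, 3)
GROUP BY customer_id
HAVING COUNT(DISTINCT CASE WHEN group_id IN (5, 6) THEN 1 END) > 0
   AND COUNT(DISTINCT CASE WHEN group_id IN (1, 3) THEN 1 END) > 0;
```

Running the same EXPLAIN against realistic data volumes, with each index in turn, is a more reliable guide than the sample rows.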

- JonasMetzler
- August 7, 2024 at 12:50 pm
- 0 votes
We can GROUP BY the customer's id and put the conditions in a HAVING clause, using either CASE or FILTER if your RDBMS supports it. PostgreSQL supports FILTER since version 9.4.

customers_to_groups is joined only once, so this will likely be faster than the double self-join.

The query will be:
```
SELECT c.id 
FROM customer c
INNER JOIN customers_to_groups ctg
  ON c.id = ctg.customer_id
GROUP BY c.id
HAVING 
  COUNT(CASE WHEN ctg.group_id IN (1,3) THEN 1 END) > 0
  AND COUNT(CASE WHEN ctg.group_id IN (5,6) THEN 1 END) > 0;
```
or
```
SELECT c.id 
FROM customer c
INNER JOIN customers_to_groups ctg
  ON c.id = ctg.customer_id
GROUP BY c.id
HAVING 
  COUNT(*) FILTER(WHERE ctg.group_id IN (1,3)) > 0
  AND COUNT(*) FILTER(WHERE ctg.group_id IN (5,6)) > 0;
```
See this demo with your sample data.
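If only the customer ids are needed, a possible variation is to skip the join to customer and use the PostgreSQL aggregate bool_or, which is true when the condition holds for at least one row in the group. This is a sketch under the assumption that every customer_id in the junction table refers to an existing customer (the sample schema declares no foreign key, so that is not enforced):

```sql
-- Read only the junction table; the WHERE clause drops rows that can
-- never satisfy either list, which shrinks the input to the GROUP BY.
SELECT ctg.customer_id
FROM customers_to_groups ctg
WHERE ctg.group_id IN (1, 3, 5, 6)
GROUP BY ctg.customer_id
-- bool_or(...) is true if at least one row of the customer matches.
HAVING bool_or(ctg.group_id IN (1, 3))
   AND bool_or(ctg.group_id IN (5, 6));
-- With the sample data this returns customer_id 0 and 3.
```

bool_or is PostgreSQL-specific, so the CASE or FILTER forms above remain the more portable choice.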