I have a table, let's say users, that has a one-to-many relationship to another table called, say, users_attributes. The users_attributes table is a simple key-value table with a foreign key column referencing the users table. Something like this:
create table users(
    id integer primary key,
    name varchar
);
create table users_attributes(
    attribute_id integer primary key,
    user_id integer references users(id),
    attribute_name varchar,
    attribute_value varchar
);
Now I need to filter users based on attribute_name and attribute_value in the users_attributes table. I have tried this query (PostgreSQL), which works but takes a long time to execute:
select * from users u
left join users_attributes ua1 on u.id = ua1.user_id and ua1.attribute_name = 'dog_name'
left join users_attributes ua2 on u.id = ua2.user_id and ua2.attribute_name = 'cat_name'
where ua1.attribute_value = 'Spot' and ua2.attribute_value = 'Mittens';
Here, the number of joins goes up for each attribute that I need to filter the users by. This causes the query to slow down (between 4 and 10 seconds, depending on the number of joins), since there are approximately a hundred thousand users. An explain plan on the query seems to support this theory.
Is there a way to query the users in a way that returns faster?
I am using PostgreSQL.
Answers
The mix of LEFT JOIN and WHERE conditions makes no sense, logically: filtering on columns of the left-joined tables in the WHERE clause converts the outer joins into plain inner joins anyway. Basic rewrite:
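A sketch of that rewrite, using plain inner joins and keeping each attribute's predicates together in its join condition:

```sql
select u.*
from users u
join users_attributes ua1 on ua1.user_id = u.id
                         and ua1.attribute_name  = 'dog_name'
                         and ua1.attribute_value = 'Spot'
join users_attributes ua2 on ua2.user_id = u.id
                         and ua2.attribute_name  = 'cat_name'
                         and ua2.attribute_value = 'Mittens';
```

This returns the same rows as the original query, since the WHERE conditions there already forced both left joins to behave as inner joins.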
Basically, it's a case of relational division.
There are many ways to do this. The best query style depends on your cardinalities, your typical filters, and what you are optimizing for. Here is a whole arsenal:
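As one technique from that arsenal, the filter can be expressed with one EXISTS subquery per attribute, which keeps the join count in the plan flat as you add attributes:

```sql
select *
from users u
where exists (
        select 1
        from users_attributes ua
        where ua.user_id = u.id
          and ua.attribute_name  = 'dog_name'
          and ua.attribute_value = 'Spot')
  and exists (
        select 1
        from users_attributes ua
        where ua.user_id = u.id
          and ua.attribute_name  = 'cat_name'
          and ua.attribute_value = 'Mittens');
```

Whichever query style you pick, a multicolumn index on users_attributes, such as (attribute_name, attribute_value, user_id), typically matters more for speed than the query shape itself.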
If you're trying to find distinct users who have both a cat named Mittens and a dog named Spot, you may be interested in intersect. This acts sort of like union, but only keeps values that appear in both sets, removing duplicates by default. I'm not much for PostgreSQL, so whether that would be more performant as a joined subquery, CTE, or in() expression, I couldn't say.
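A sketch of the intersect approach, here wrapped in an in() expression: each branch selects the user ids matching one attribute, and intersect keeps only the ids present in both sets.

```sql
select u.*
from users u
where u.id in (
    select user_id
    from users_attributes
    where attribute_name = 'dog_name' and attribute_value = 'Spot'
    intersect
    select user_id
    from users_attributes
    where attribute_name = 'cat_name' and attribute_value = 'Mittens'
);
```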