MySQL - SELECT from table based on multiple rows

sontags
November 7, 2023
236 views
3 votes
7 Answers

In MySQL, I have a table full of attributes that looks like this:

USER_ID	ATTR_NAME	ATTR_VALUE
1	Name	Jess
1	Age	23
1	Sex	m
2	Name	Jess
2	Age	23
3	Name	Ann
3	Sex	f

(Note that not every attribute must be present for every user)

I want to find all USER_IDs where one or more attributes match, for example:

Show me all users where the name is 'Jess' and the age is '23'.

This should return: 1, 2

How would I express that in SQL?

EDIT: As people are asking for attempts, here's my first try:

```
SELECT DISTINCT USER_ID
FROM ATTR_TABLE
WHERE
  ( ATTR_NAME = 'Name' AND ATTR_VALUE = 'Jess' ) AND
  ( ATTR_NAME = 'Age' AND ATTR_VALUE = '23' )
```

This of course returns nothing, since no single row has both ATTR_NAME = 'Name' and ATTR_NAME = 'Age'.

This might be basic SQL, but I am still on the learning curve. I could not come up with a working solution, and I don't yet know enough SQL jargon to search for hints properly.

Tags: mysql sql

Answers

- DeanVanGreunen
- November 7, 2023 at 3:15 pm
- 0 votes
First, create a temporary table.

Replace user_attributes in the first SELECT query with your own table name.
```
-- Create a temporary table to store the grouped attributes
CREATE TEMPORARY TABLE temp_grouped_attributes AS
SELECT
    USER_ID,
    MAX(CASE WHEN ATTR_NAME = 'Name' THEN ATTR_VALUE ELSE NULL END) AS Name,
    MAX(CASE WHEN ATTR_NAME = 'Age' THEN ATTR_VALUE ELSE NULL END) AS Age,
    MAX(CASE WHEN ATTR_NAME = 'Sex' THEN ATTR_VALUE ELSE NULL END) AS Sex
FROM user_attributes
GROUP BY USER_ID;

-- Now Select / Search your new table
SELECT *
FROM temp_grouped_attributes
WHERE Name = 'Jess' AND Age = '23';  -- ATTR_VALUE holds text, so compare against a string
```
With the sample data from the question, temp_grouped_attributes will contain:

USER_ID	Name	Age	Sex
1	Jess	23	m
2	Jess	23	NULL
3	Ann	NULL	f

The final SELECT query then returns:

USER_ID	Name	Age	Sex
1	Jess	23	m
2	Jess	23	NULL
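
If the temporary table is not needed for anything else, the same pivot-and-filter idea can probably be folded into one query by moving the MAX(CASE ...) expressions into a HAVING clause. The sketch below is only an illustration: it runs the SQL through Python's sqlite3 module against an in-memory SQLite database standing in for MySQL, and it assumes ATTR_VALUE is stored as text. The table name user_attributes is the placeholder used above.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_attributes (USER_ID INTEGER, ATTR_NAME TEXT, ATTR_VALUE TEXT)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO user_attributes VALUES (?, ?, ?)",
    [
        (1, "Name", "Jess"), (1, "Age", "23"), (1, "Sex", "m"),
        (2, "Name", "Jess"), (2, "Age", "23"),
        (3, "Name", "Ann"), (3, "Sex", "f"),
    ],
)

# Pivot and filter in one pass: each MAX(CASE ...) picks out one attribute
# per user, and HAVING keeps only the users whose pivoted values match.
rows = conn.execute(
    """
    SELECT USER_ID
    FROM user_attributes
    GROUP BY USER_ID
    HAVING MAX(CASE WHEN ATTR_NAME = 'Name' THEN ATTR_VALUE END) = 'Jess'
       AND MAX(CASE WHEN ATTR_NAME = 'Age'  THEN ATTR_VALUE END) = '23'
    ORDER BY USER_ID
    """
).fetchall()

matching_ids = [row[0] for row in rows]
print(matching_ids)
```

For the sample data this yields users 1 and 2, which is what the question asks for.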

- SelVazi
- November 7, 2023 at 3:26 pm
- 0 votes
Here is a way to do it using a self join:
```
SELECT DISTINCT a1.USER_ID 
FROM ATTR_TABLE a1
INNER JOIN ATTR_TABLE a2 ON a1.USER_ID = a2.USER_ID
WHERE 
  a1.ATTR_NAME = 'Name' AND a1.ATTR_VALUE = 'Jess' 
  AND a2.ATTR_NAME = 'Age' AND a2.ATTR_VALUE = '23';
```
Demo here

- Asgar
- November 7, 2023 at 3:31 pm
- 0 votes
Maybe as simple as:
```
SELECT 
DISTINCT(user_id) 
FROM test_table 
WHERE 
(attr_name, attr_value) IN (('Name','Jess'))
OR 
(attr_name, attr_value) IN (('Age','23'));
```
Why is that so hard? Am I missing something?
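
One thing worth checking: an OR filter matches users that have either pair, not necessarily both. The sample data hides this because nobody has only one of the two. The sketch below adds a hypothetical user 4 (Name 'Jess', Age '30') to make the difference visible. It is run through Python's sqlite3 module with SQLite standing in for MySQL, and the row-value IN is written out as plain comparisons so that it runs there too. Both of those are assumptions of this example, not part of the original query.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_table (user_id INTEGER, attr_name TEXT, attr_value TEXT)"
)
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?, ?)",
    [
        (1, "Name", "Jess"), (1, "Age", "23"), (1, "Sex", "m"),
        (2, "Name", "Jess"), (2, "Age", "23"),
        (3, "Name", "Ann"), (3, "Sex", "f"),
        (4, "Name", "Jess"), (4, "Age", "30"),  # hypothetical user, not in the question
    ],
)

# Same filter as the OR query above, written as explicit comparisons.
pair_filter = """
    (attr_name = 'Name' AND attr_value = 'Jess')
    OR (attr_name = 'Age' AND attr_value = '23')
"""

# Users that match at least one of the two pairs.
either = [row[0] for row in conn.execute(
    f"SELECT DISTINCT user_id FROM test_table WHERE {pair_filter} ORDER BY user_id"
)]

# Users that match both pairs: group the matches and require two distinct attributes.
both = [row[0] for row in conn.execute(
    f"""
    SELECT user_id
    FROM test_table
    WHERE {pair_filter}
    GROUP BY user_id
    HAVING COUNT(DISTINCT attr_name) = 2
    ORDER BY user_id
    """
)]

print(either)
print(both)
```

User 4 shows up in the first result because the name matches, even though the age does not. Only the grouped version restricts the result to users that have both attributes.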

- ChrisSchaller
- November 7, 2023 at 3:34 pm
- 0 votes
Let's translate your request:

Show me => SELECT
all users => * or specifically list the fields you want
where => WHERE criteria will follow…
the name is 'Jess' => ATTR_NAME = 'Name' AND ATTR_VALUE = 'Jess'
and => AND
the age is '23' => ATTR_NAME = 'Age' AND ATTR_VALUE = '23'

What complicates this result set is that the entity you want to select is split across multiple rows. The first step is to transpose the values (this being a dynamic schema, there are a few options). The following uses self joins, for something different:
```
SELECT * FROM (
    -- Build one row per user: start from the 'Name' rows, then attach
    -- the optional 'Age' and 'Sex' rows for the same USER_ID.
    SELECT
        userName.USER_ID,
        userName.ATTR_VALUE AS Name,
        userAge.ATTR_VALUE AS Age,
        userSex.ATTR_VALUE AS Sex
    FROM ATTR_TABLE userName
    LEFT OUTER JOIN ATTR_TABLE userAge ON userName.USER_ID = userAge.USER_ID AND userAge.ATTR_NAME = 'Age'
    LEFT OUTER JOIN ATTR_TABLE userSex ON userName.USER_ID = userSex.USER_ID AND userSex.ATTR_NAME = 'Sex'
    WHERE userName.ATTR_NAME = 'Name'
) Users
-- Filter the transposed rows like an ordinary table
WHERE Name = 'Jess' AND Age = '23'
```

- Ali
- November 7, 2023 at 3:36 pm
- 0 votes
As I understand it, you want to get the USER_ID of each user based on their attributes in the table:
```
SELECT t1.USER_ID FROM yourTable t1 JOIN yourTable t2 ON t1.USER_ID = t2.USER_ID WHERE ( t1.ATTR_NAME = 'Name' AND t1.ATTR_VALUE = 'Jess' ) AND ( t2.ATTR_NAME = 'Age' AND t2.ATTR_VALUE = '23' );
```

- SelVazi
- November 7, 2023 at 3:44 pm
- 0 votes
Here is another way, using GROUP BY and HAVING clauses:
```
select USER_ID
from ATTR_TABLE
group by USER_ID
having count(case when ATTR_NAME = 'Name' AND ATTR_VALUE = 'Jess' then 1 end ) = 1
       and count(case when ATTR_NAME = 'Age' AND ATTR_VALUE = '23' then 1 end ) = 1
```
Demo here

- JNevill
- November 7, 2023 at 4:10 pm
- 0 votes
Your table schema is "EAV", or Entity-Attribute-Value. This is a common schema for applications to use when the number of attributes per entity is unknown or volatile. If this is a schema you own, and the attributes of a user don’t change frequently enough to warrant an EAV table, you may want to consider changing it, as the SQL and compute costs can get ugly.

With a normal user table, this would be as simple as:
```
SELECT user_id FROM users WHERE name='Jess' and Age='23';
```
But with EAV, your attribute columns are stored as values, flipping the relational concept of an RDBMS on its head, to a degree. It’s not a "bad" design, it’s just that you are paying for the flexibility with extra compute cost.

For your very reasonable requirements, there are a few ways to solve this. Likely the most cost-effective route is to gather up all records that match your attribute/value pairings:
```
(attr_name = 'Name' AND attr_value = 'Jess') 
OR (attr_name = 'Age' AND attr_value = '23')
```
This uses an OR clause, since no single record in your table can hold more than one attribute. You then aggregate by user_id and filter the aggregation with a HAVING clause.

Since you are searching for the combination of 2 attributes, HAVING COUNT(*) = 2 will limit your results to only user_ids that contain the two attributes you are after.
```
SELECT user_id
FROM mytable
WHERE (attr_name = 'Name' AND attr_value = 'Jess') 
  OR (attr_name = 'Age' AND attr_value = '23') 
GROUP BY user_id 
HAVING count(*) = 2
```
dbfiddle here
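
The same pattern should extend to any number of attributes: one OR branch per pair, with the HAVING count set to the number of pairs. A rough sketch of building that query from a dictionary follows. The helper name find_users is made up for illustration, SQLite (via Python's sqlite3) stands in for MySQL, and the ? placeholders are SQLite's style; a MySQL driver will typically use %s instead. It still relies on each user having at most one row per attribute.

```python
import sqlite3


def find_users(conn, wanted):
    """Return the user_ids that have every (attr_name, attr_value) pair in `wanted`.

    Hypothetical helper: assumes a table `mytable` with at most one row
    per (user_id, attr_name).
    """
    if not wanted:
        raise ValueError("need at least one attribute to match")
    # One "(attr_name = ? AND attr_value = ?)" branch per requested pair.
    pair_clause = " OR ".join(["(attr_name = ? AND attr_value = ?)"] * len(wanted))
    sql = f"""
        SELECT user_id
        FROM mytable
        WHERE {pair_clause}
        GROUP BY user_id
        HAVING COUNT(*) = ?
        ORDER BY user_id
    """
    # Flatten {"Name": "Jess", ...} into [name, value, name, value, ...],
    # then append the number of pairs for the HAVING clause.
    params = [item for pair in wanted.items() for item in pair]
    params.append(len(wanted))
    return [row[0] for row in conn.execute(sql, params)]


# In-memory SQLite database with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (user_id INTEGER, attr_name TEXT, attr_value TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "Name", "Jess"), (1, "Age", "23"), (1, "Sex", "m"),
        (2, "Name", "Jess"), (2, "Age", "23"),
        (3, "Name", "Ann"), (3, "Sex", "f"),
    ],
)

jess_23 = find_users(conn, {"Name": "Jess", "Age": "23"})
jess_23_m = find_users(conn, {"Name": "Jess", "Age": "23", "Sex": "m"})
print(jess_23, jess_23_m)
```

Passing the values as parameters rather than formatting them into the SQL string also keeps user input out of the query text.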

There are other ways to skin this cat, but they often involve pivoting the data through case-expressions or multiple joins and can get very compute heavy as a result. As the Wikipedia article on EAV states:

The Achilles heel of EAV is the difficulty of working with large
volumes of EAV data. It is often necessary to transiently or
permanently inter-convert between columnar and row- or EAV-modeled
representations of the same data; this can be both error-prone if done
manually as well as CPU-intensive. […] The conversion operation is
called pivoting.

Pivoting gets expensive fast, so any way to limit the need for pivot or multiple table scans is preferred. The method used in this answer is a little risky since it assumes that you won’t have more than one name or age entry for each distinct user_id. You can, and should, implement a primary key/constraint to prevent that scenario.
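
As a sketch of what such a constraint might look like, a composite primary key on (user_id, attr_name) makes a second 'Age' row for the same user impossible. The example below runs against an in-memory SQLite database through Python's sqlite3 module as a stand-in for MySQL. The column types and lengths are guesses, not taken from the question.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE mytable (
        user_id    INTEGER      NOT NULL,
        attr_name  VARCHAR(64)  NOT NULL,  -- length is an assumption
        attr_value VARCHAR(255),           -- length is an assumption
        PRIMARY KEY (user_id, attr_name)   -- at most one row per user and attribute
    )
    """
)
conn.execute("INSERT INTO mytable VALUES (1, 'Name', 'Jess')")
conn.execute("INSERT INTO mytable VALUES (1, 'Age', '23')")

# A second 'Age' row for user 1 violates the primary key and is rejected.
try:
    conn.execute("INSERT INTO mytable VALUES (1, 'Age', '24')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]
print(duplicate_rejected, row_count)
```

On an existing MySQL table the equivalent would be something along the lines of ALTER TABLE mytable ADD PRIMARY KEY (user_id, attr_name), which will fail if duplicates are already present, so those would need cleaning up first.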