Self join inside table by multiple values. SQL, Postgresql

Heineken
October 20, 2023
94 views
1 vote
2 Answers

Please help with self join inside one table by different fields. given table "Items":

Used PostgreSQL.

I need to get all the values from such table, for which factId = 'urn:fact:drug:170' and factId1 ='urn:fact:drug:171' and factId1.value ='false' and arrayIndexes are the same.

It means factId 170 & 171 have relation one-to one using the same arrayIndex.

So all the values of factId 'urn:fact:drug:170' are needed, for which there is related by same arrayIndex factId 'urn:fact:drug:171' which have value 'false'.

Here is column Item in the table, it means for each Item we have own set of values 170 and 171.
Thank you.

I tryed this but I am not sure this works fine and fast:

SELECT l.value
FROM Items l
INNER JOIN Items p on p.item  = l.item
INNER JOIN Items c on c.item  = l.item
WHERE l.factid  = 'urn:fact:drug:170' AND c.factid = 'urn:fact:drug:170'  
AND p.factid  = 'urn:fact:drug:171' AND p.value='false' 
AND p.arrayindex = l.arrayindex

there are a lot of duplicates after work.
Please check my query.

Answers

- Albina
- October 20, 2023 at 11:53 am
- 0 votes
0
Your query can be:
```
with t as (
  select distinct arrayindex
  from items
  where factid = 'urn:drug:171' and value = 'false'
)
select * from items
where arrayindex in (select * from t)
  and (factid = 'urn:drug:170' or factid = 'urn:drug:171') -- optional condition;
```
DB fiddle

Details:
1. t CTE query finds unique arrayindex values for all rows which fulfill the condition
  where factid = 'urn:drug:171' and value = 'false';
2. Outer query finds all the rows with factid = 'urn:drug:170'. Result rows are filtered by the arrayindex values which we received from the previous step.
Login or Signup to reply.

Your apparent problem are the duplicates in the source table.

The 170 row appears 3 times in your date, the 171 row two times.

To get a one-to one relation you must must somehow introduce a unique identification – I use the row_number.

Than it is a plain join with additional predicate on the row number

Query (with sample data as CTE)

with tab as (
select * from (values 
('urn:fact:drug:170', '88cafe', 15),
('urn:fact:drug:170', '88cafe', 15),
('urn:fact:drug:170', '88cafe', 15),
('urn:fact:drug:171', 'false', 15),
('urn:fact:drug:171', 'false', 15)
)
t (factId,value,arrayIndexes)
),
-- from here your query --
t170 as (
select tab.*,
row_number() over (order by value) as rn
from tab
where factId = 'urn:fact:drug:170'
),
t171 as (
select tab.*,
row_number() over (order by value) as rn
from tab
where factId = 'urn:fact:drug:171' and value ='false'
)
select 
 t170.factid, t170.value, t170.arrayIndexes,
 t171.factid factid2c
from t170
join t171 on t170.arrayIndexes = t171.arrayIndexes and
t170.rn = t171.rn;

Result

factid           |value |arrayindexes|factid2c         |
-----------------+------+------------+-----------------+
urn:fact:drug:170|88cafe|          15|urn:fact:drug:171|
urn:fact:drug:170|88cafe|          15|urn:fact:drug:171|

So you can join two rows, the third is skipped as there is no match on both sides

Please signup or login to give your own answer.

Click here to cancel reply.