skip to Main Content

I have 2 tables with a 1-to-many relationship. Table A contains ‘OBJECTS’ and table B contains ‘COMPONENTS’ have id’s that are related and I wrote a query to gather all OBJECTS from table A and associated COMPONENTS from table B that looks like this:

SELECT A.*, ARRAY_AGG(B.*)
FROM A INNER JOIN B USING(id)
GROUP BY A.*

This gives me what I expect.

Next I have a field on B called ‘type’ that can be ‘X’, ‘Y’, or ‘Z’. I want to do the same query above, but filter the results such that only OBJECTS that have associated COMPONENTS where B.type = ‘X’. Furthermore, I still want all associated COMPONENTS that aren’t B.type = ‘X’ for each returned OBJECT.

My first attempt was to write the following query:

SELECT A.*, ARRAY_AGG(B.*)
FROM A INNER JOIN B USING(id)
WHERE B.type = 'X'
GROUP BY A.*

But this query strips COMPONENTS that aren’t type = ‘X’.

Is there a way to get all OBJECTS that have associated COMPONENTS with type = ‘X’ but also the result set returns all associated COMPONENTS for each object regardless of type?

Here’s an example dataset with the expected result set:

A:

id
1
2
3
4

B:

id type
1 X
1 Y
1 Z
2 Y
2 Z
3 X
4 Z
Result:
(1, [(1,X),(1,Y),(1,Z)]),
(3, [(3,X)])

These queries are part of an SQLX Rust project, if that makes a difference.

2

Answers


  1. It sounds like you’re looking for a HAVING clause.

    SELECT A.*
         , ARRAY_AGG(B.*)F
    FROM A JOIN B USING(id) 
    GROUP BY A.*
    HAVING 'X'=ANY(ARRAY_AGG(B.type));
    --HAVING bool_or(B.type='X');
    

    Here’s doc on bool_or() and other aggregates, here’s on ANY array comparisons.

    Login or Signup to reply.
  2. While referential integrity is enforced, and you query for all IDs, you don’t even have to include table A in the query:

    SELECT id, array_agg(B.*)
    FROM   B
    GROUP  BY id
    HAVING bool_or(true) FILTER (WHERE type = 'X');
    

    About the aggregate FILTER clause:

    Should be as fast as it gets – unless type = 'X' is rare.
    In which case – assuming B(type, id) is unique:

    SELECT id, array_agg(B.*)
    FROM  (SELECT id FROM B WHERE type = 'X') b1
    JOIN   B USING (id)
    GROUP  BY 1;
    

    With one index on (type, id) – could be the UNIQUE constraint with leading type.
    (Better yet, a partial index on (id) WHERE type = 'X', but that may be too specialized.)
    And another one on just (id), which should be a given.

    If B(type, id) is not unique, add DISTINCT to the subquery or switch to IN or EXISTS, like:

    SELECT id, array_agg(B.*)
    FROM   B
    WHERE  EXISTS (SELECT FROM B b1 WHERE b1.id = b.id AND b1.type = 'X')
    GROUP  BY 1;
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search