skip to Main Content

This is my message table. Now I am trying to write an query to group records by collection_id (only first instance) if type multi-store else by message.id .

                  id                  |            collection_id             |      type      | affiliation_id |    status    | scheduled_for_date 
--------------------------------------+--------------------------------------+----------------+----------------+--------------+--------------------
 1143c066-01ed-4eb5-a146-68487de702a9 | bce85e31-4d2f-43b1-b263-57fca356856f | multi-store    | 12091          | draft        | 
 e1183732-e91d-42eb-9998-110e039cfc25 | bce85e31-4d2f-43b1-b263-57fca356856f | multi-store    | 12092          | draft        | 
 8f962a49-7da1-46d0-87b9-595788767dfe | bce85e31-4d2f-43b1-b263-57fca356856f | multi-store    | 12097          | draft        | 
 6e4dee09-7a4e-47be-bdd8-935a67bb2063 | 740f6b42-bbf1-4aeb-8fe9-6874635d9e29 | multi-store    | 12091          | draft        | 
 79afab0e-14e7-4d1b-9a15-358763743c3e | 740f6b42-bbf1-4aeb-8fe9-6874635d9e29 | multi-store    | 12092          | draft        | 
 7bc78bee-074a-4031-9492-954e7c4eeb09 | 740f6b42-bbf1-4aeb-8fe9-6874635d9e29 | multi-store    | 12097          | draft        | 
 3bb38fbd-d411-4f78-9c42-c858bf57b784 |                                      | standard-store | 10511          | draft        | 
 fbb3b175-1a3b-4515-b0b3-0ce6d6d0145f |                                      | standard-store | 10511          | draft        | 
 84004999-d2cf-4af4-bfaa-c1077d1d8621 |                                      | standard-store | 10511          | sent         | 2017-05-21
 cbea0789-6886-431a-a8e1-723d5aafc7b9 |                                      | standard-store | 10511          | scheduled    | 2019-02-12
 ec8988ff-5136-4b81-b448-cd456dc487a4 |                                      | standard-store | 10511          | review       | 2019-01-13
 0e119440-5fbc-4afe-a784-a6bcfe3a6e4d |                                      | standard-store | 10511          | draft        | 
 98503a20-4396-4809-b3ec-8e330c15afa9 |                                      | standard-store | 10511          | needs_action | 2018-12-11
 d33a9173-dc64-464f-8e58-49b4c9c2fdae |                                      | standard-store | 10511          | draft        | 
 bee0dc72-acca-44e2-82ea-d18e830f91a2 |                                      | standard-store | 10511          | sent         | 2016-03-12

So the output will be like

                  id                  |            collection_id             |      type      | affiliation_id |    status    | scheduled_for_date 
--------------------------------------+--------------------------------------+----------------+----------------+--------------+--------------------
 1143c066-01ed-4eb5-a146-68487de702a9 | bce85e31-4d2f-43b1-b263-57fca356856f | multi-store    | 12091          | draft        | 
 6e4dee09-7a4e-47be-bdd8-935a67bb2063 | 740f6b42-bbf1-4aeb-8fe9-6874635d9e29 | multi-store    | 12091          | draft        | 
 3bb38fbd-d411-4f78-9c42-c858bf57b784 |                                      | standard-store | 10511          | draft        | 
 fbb3b175-1a3b-4515-b0b3-0ce6d6d0145f |                                      | standard-store | 10511          | draft        | 
 84004999-d2cf-4af4-bfaa-c1077d1d8621 |                                      | standard-store | 10511          | sent         | 2017-05-21
 cbea0789-6886-431a-a8e1-723d5aafc7b9 |                                      | standard-store | 10511          | scheduled    | 2019-02-12
 ec8988ff-5136-4b81-b448-cd456dc487a4 |                                      | standard-store | 10511          | review       | 2019-01-13
 0e119440-5fbc-4afe-a784-a6bcfe3a6e4d |                                      | standard-store | 10511          | draft        | 
 98503a20-4396-4809-b3ec-8e330c15afa9 |                                      | standard-store | 10511          | needs_action | 2018-12-11
 d33a9173-dc64-464f-8e58-49b4c9c2fdae |                                      | standard-store | 10511          | draft        | 
 bee0dc72-acca-44e2-82ea-d18e830f91a2 |                                      | standard-store | 10511          | sent         | 2016-03-12

One way to do it is by union:

SELECT DISTINCT ON (collection_id)
    collection_id,
    id,
    type,
    affiliation_id,
    status,
    scheduled_for_date
from
    message
where
    type = 'multi-store'

union

SELECT
    collection_id,
    id,
    type,
    affiliation_id,
    status,
    scheduled_for_date
from
    message
where
    type = 'standard-store'

But I feel that’s less efficient, another option could be case based grouping. But that includes complexity when need to add all the select fields to group.

What is the most efficient way to write the query?

2

Answers


  1. You can use distinct on clause with a case expression.

    select distinct on (case when type='multi-store' then collection_id else id end)
      id, collection_id, type,
      affiliation_id, status, scheduled_for_date 
    from the_table;
    

    DB-Fiddle demo

    Or, assuming that collection_id is NULL for types other than ‘multi-store’, even simpler

    select distinct on (coalesce(collection_id, id)) ...
    
    Login or Signup to reply.
  2. The best solution depends on undisclosed cardinalities.

    Assuming relatively few "multi-store" rows, and relatively few rows per distinct collection_id among those, your UNION query looks good – after a couple of tweaks:

    (  -- parentheses required
    SELECT DISTINCT ON (collection_id)
           id, collection_id, type, affiliation_id, status, scheduled_for_date
    FROM   message
    WHERE  type = 'multi-store'
    ORDER  BY collection_id, affiliation_id  -- ! to pick row with min affiliation_id
    )
       
    UNION ALL  -- !!!
    SELECT id, collection_id, type, affiliation_id, status, scheduled_for_date
    FROM   message
    WHERE  type = 'standard-store';
    

    fiddle

    Use UNION ALL. Just UNION tries to eliminate (non-existent) duplicates, adding pointless cost.

    Typically, you want a deterministic, stable pick from each set with the same collection_id. Add a suitable ORDER BY clause accordingly. See:

    In this case, a partial index would be of help:

    CREATE INDEX ON message (collection_id, affiliation_id) WHERE type = 'multi-store';
    

    The index is used for the small share of "multi-store" rows, the rest is read from a sequential scan.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search