skip to Main Content

How can I insert a row per group member with a constant value? For example, the following table is a result of ranking hospital visits per user, from earliest to latest.

    name       time                        lifetime_visit_at_hospital
    jack       2022-12-02 03:25:00.000     1
    jack       2022-12-02 03:33:00.000     2
    jack       2022-12-03 01:13:00.000     3
    jack       2022-12-03 01:15:00.000     4
    jack       2022-12-04 00:52:00.000     5
    amanda     2017-01-01 05:03:00.000     1
    amanda     2020-02-13 00:36:00.000     2
    amanda     2020-02-13 00:38:00.000     3
    amanda     2020-02-13 00:39:00.000     4
    amanda     2020-02-13 00:40:00.000     5
    amanda     2020-02-14 10:23:00.000     6
    amanda     2020-03-30 03:13:00.000     7
    amanda     2020-09-25 17:17:00.000     8
    amanda     2020-12-23 19:39:00.000     9
    amanda     2021-05-19 20:19:00.000     10
    sam        2023-01-26 23:13:00.000     1
    sam        2023-02-12 17:35:00.000     2
    sam        2023-02-12 17:37:00.000     3
    sam        2023-02-12 17:40:00.000     4
    sam        2023-02-13 16:10:00.000     5
    sam        2023-02-16 19:37:00.000     6
    sam        2023-02-18 16:26:00.000     7

I want to add a row per name, with value 0 at column lifetime_visit_at_hospital.
It would be also helpful to add a constant time in the past such as 1900-01-01 00:00:00.000 just so I could manage ordering if the table get scrambled later.

So the desired output would be:

    name       time                        lifetime_visit_at_hospital
    jack       1900-01-01 00:00:00.000     0           <~~~~~~~~~~~~~~~~~~~~~ desired addition
    jack       2022-12-02 03:25:00.000     1
    jack       2022-12-02 03:33:00.000     2
    jack       2022-12-03 01:13:00.000     3
    jack       2022-12-03 01:15:00.000     4
    jack       2022-12-04 00:52:00.000     5
    amanda     1900-01-01 00:00:00.000     0           <~~~~~~~~~~~~~~~~~~~~~ desired addition
    amanda     2017-01-01 05:03:00.000     1
    amanda     2020-02-13 00:36:00.000     2
    amanda     2020-02-13 00:38:00.000     3
    amanda     2020-02-13 00:39:00.000     4
    amanda     2020-02-13 00:40:00.000     5
    amanda     2020-02-14 10:23:00.000     6
    amanda     2020-03-30 03:13:00.000     7
    amanda     2020-09-25 17:17:00.000     8
    amanda     2020-12-23 19:39:00.000     9
    amanda     2021-05-19 20:19:00.000     10
    sam        1900-01-01 00:00:00.000     0           <~~~~~~~~~~~~~~~~~~~~~ desired addition
    sam        2023-01-26 23:13:00.000     1
    sam        2023-02-12 17:35:00.000     2
    sam        2023-02-12 17:37:00.000     3
    sam        2023-02-12 17:40:00.000     4
    sam        2023-02-13 16:10:00.000     5
    sam        2023-02-16 19:37:00.000     6
    sam        2023-02-18 16:26:00.000     7

I don’t even have a clue how to approach this. I need to run this query on AWS so looking for a prestodb solution, but maybe there is a canonical way to do it with postgresql?

2

Answers


  1. You need to use a CROSS JOIN between your distinct names and the corresponding time/lifetime default values.

    INSERT INTO tab
    SELECT *
    FROM       (SELECT DISTINCT name FROM tab) names
    CROSS JOIN (VALUES('1900-01-01 00:00:00.000'::TIMESTAMP, 0)) times;
    

    The output would result into the addition of your needed 3 records (one per distinct name).

    Check the demo here.

    Login or Signup to reply.
  2. If by "insert" you mean add a row per group into query results, then you can use simple union:

    -- SAMPLE DATA
    WITH dataset(name, time, lifetime_visit_at_hospital) AS (
        values ('jack', '2022-12-02 03:25:00.000', 1),
        ('jack', '2022-12-02 03:33:00.000', 2),
        ('jack', '2022-12-03 01:13:00.000', 3),
        ('jack', '2022-12-03 01:15:00.000', 4),
        ('jack', '2022-12-04 00:52:00.000', 5),
        ('amanda', '2017-01-01 05:03:00.000', 1),
        ('amanda', '2020-02-13 00:36:00.000', 2),
        ('sam', '2023-01-26 23:13:00.000', 1),
        ('sam', '2023-02-12 17:35:00.000', 2)
    )
    
    -- QUERY
    select *
    from
    dataset
    union select name, '1900-01-01 00:00:00.000', 0
    from dataset
    group by name
    order by name, lifetime_visit_at_hospital;
    

    Output:

    name time lifetime_visit_at_hospital
    amanda 1900-01-01 00:00:00.000 0
    amanda 2017-01-01 05:03:00.000 1
    amanda 2020-02-13 00:36:00.000 2
    jack 1900-01-01 00:00:00.000 0
    jack 2022-12-02 03:25:00.000 1
    jack 2022-12-02 03:33:00.000 2
    jack 2022-12-03 01:13:00.000 3
    jack 2022-12-03 01:15:00.000 4
    jack 2022-12-04 00:52:00.000 5
    sam 1900-01-01 00:00:00.000 0
    sam 2023-01-26 23:13:00.000 1
    sam 2023-02-12 17:35:00.000 2
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search