skip to Main Content

I have a log table "table1" with values appended every day and need to find "column3" values and dates from "column1"​​, for which the data wasn’t recorded. For example, my table looks like this:

column1 column2 column3
2022-07-14 274,5 markus
2022-07-14 251,2 tess
2022-07-14 162,6 mike
2022-07-15 286,9 markus
2022-07-15 254,8 tess
2022-07-16 289,1 markus
2022-07-17 295,2 markus
2022-07-17 260,0 tess
2022-07-17 182,3 mike

Everything is ok with column3 = 'markus', but I need to get something like this as output:

column1 column3
2022-07-15 mike
2022-07-16 tess
2022-07-16 mike

2

Answers


  1. One way of addressing this problem is by:

    • rebuilding all combinations of names and dates
    • left-joining this table with your original tables
    • filtering on records whose table values is null
    WITH cte_dates AS (
        SELECT DISTINCT column1 AS "date" FROM tab
    ), cte_names AS (
        SELECT DISTINCT column3 AS "name" FROM tab
    )
    SELECT cte_dates.date,
           cte_names.name
    FROM       cte_dates
    CROSS JOIN cte_names
    LEFT JOIN tab
           ON cte_dates.date = tab.column1
          AND cte_names.name = tab.column3
    WHERE tab.column2 IS NULL
    

    Output:

    date name
    2022-07-16T00:00:00.000Z mike
    2022-07-16T00:00:00.000Z tess
    2022-07-15T00:00:00.000Z mike

    Check the demo here.


    If gaps can be found among your dates, you need to use generate_series with boundary dates to generate the corresponding calendar in the first cte:

    WITH cte_dates AS (
        SELECT "date"
        FROM (SELECT MIN(column1) AS startdt, 
                     MAX(column1) AS enddt 
              FROM tab) boundaries
        CROSS JOIN generate_series( startdt :: timestamp 
                                  , enddt   :: timestamp
                                  , '1 day' :: interval ) "date"
    ),
    ...
    

    Check the demo here.

    Login or Signup to reply.
  2. At first, you must build a time base for the whole period, then cross-join it with distinct of column3 to create all possibilities, then subtract it from your base data as follows:

    with cross_sql as (select * from 
    (SELECT timebase 
    FROM   generate_series(timestamp '2022-07-14'
                         , timestamp '2022-07-17'
                         , interval  '1 day') as timebase) A
    cross join                   
    (select distinct column3 from table1) B )
    
    select column3, timebase from cross_sql 
    except
    select column3, column1 from table1;
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search