I have a table in which I am only interested in two columns. Each row should be assigned one of two actions (Ignore or Good), and I need to keep only the "Good" rows. I have tried several SQL window functions but have not been able to achieve what I am looking for.

Here is the data:

There are two possible actions for each row. A "cat" or "dog" row is "Good" only when the value "animal" appears before it in the column on the same date; otherwise the action is "Ignore". Like this:

[Image: table showing the action assigned to each row]

Final result should be like this:

[Image: table showing the expected final result]

Data with all edge cases

2 Answers


  1. Looks like you can just use a conditional windowed COUNT.

    SELECT
      CASE WHEN (COUNT(*) FILTER (WHERE value = 'animal') OVER
              (PARTITION BY date_trunc('day', Date) ORDER BY Date ROWS UNBOUNDED PRECEDING) = 0
                AND value <> 'animal')
           OR (value = 'animal' AND LEAD(value) OVER
              (PARTITION BY date_trunc('day', Date) ORDER BY Date) <> 'animal')
           THEN 'Good'
           ELSE 'Ignore'
      END AS action
    FROM ...
    

    Then simply put that into a subquery and filter on it.

    SELECT
      *
    FROM (
        SELECT
          t.*,  -- keep the original columns alongside the computed action
          CASE WHEN (COUNT(*) FILTER (WHERE value = 'animal') OVER
                  (PARTITION BY date_trunc('day', Date) ORDER BY Date ROWS UNBOUNDED PRECEDING) = 0
                    AND value <> 'animal')
               OR (value = 'animal' AND LEAD(value) OVER
                  (PARTITION BY date_trunc('day', Date) ORDER BY Date) <> 'animal')
               THEN 'Good'
               ELSE 'Ignore'
          END AS action
        FROM yourTable t
    ) t
    WHERE action = 'Good';  -- string comparison is case-sensitive, so match the CASE output exactly
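
    To see what the conditional windowed COUNT produces on its own, here is a minimal sketch on inline data. The `sample` CTE, the `seq` tie-breaker column, and the `animals_so_far` alias are assumptions made for illustration and are not part of the question's table; `seq` stands in for whatever column gives a deterministic order within a day.

    ```sql
    -- Hypothetical inline data; seq is an assumed column that orders rows within a day.
    WITH sample (seq, a_date, val) AS (
        VALUES
            (1, DATE '2024-01-03', 'dog'),
            (2, DATE '2024-01-03', 'animal'),
            (3, DATE '2024-01-03', 'cat'),
            (4, DATE '2024-01-05', 'dog')
    )
    SELECT
        seq,
        a_date,
        val,
        -- Running number of 'animal' rows on the same day,
        -- counted from the first row of the day up to and including the current row.
        COUNT(*) FILTER (WHERE val = 'animal') OVER (
            PARTITION BY a_date
            ORDER BY seq
            ROWS UNBOUNDED PRECEDING
        ) AS animals_so_far
    FROM sample
    ORDER BY seq;
    -- animals_so_far for seq 1..4: 0, 1, 1, 0
    ```

    The count resets for each day because of the PARTITION BY, so comparing it with 0 tells you whether any 'animal' row has been seen so far on that day.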
    
  2. One option is to self-join the table, as in the subquery below, and use the LAG() OVER() analytic function in a CASE expression's condition to get the "Good" rows.
    The inner self-join eliminates rows with val = 'animal' that have no cat, dog, or other value on the same date. The CASE expression checks whether the previous row with the same date has val = 'animal'. The outer query is filtered with a WHERE clause and grouped to get rid of duplicates.

    --      S a m p l e    D a t a :
    Create Table tbl (a_date Date, Val Varchar(32));
    Insert Into tbl
      Select To_Date('2023-12-30', 'yyyy-mm-dd'), 'animal' Union All
      Select To_Date('2023-12-31', 'yyyy-mm-dd'), 'animal' Union All
      Select To_Date('2024-01-01', 'yyyy-mm-dd'), 'cat' Union All 
      Select To_Date('2024-01-01', 'yyyy-mm-dd'), 'dog' Union All 
      Select To_Date('2024-01-02', 'yyyy-mm-dd'), 'dog' Union All 
      Select To_Date('2024-01-02', 'yyyy-mm-dd'), 'cat' Union All 
      Select To_Date('2024-01-02', 'yyyy-mm-dd'), 'cat' Union All 
      Select To_Date('2024-01-03', 'yyyy-mm-dd'), 'dog' Union All 
      Select To_Date('2024-01-03', 'yyyy-mm-dd'), 'animal' Union All 
      Select To_Date('2024-01-03', 'yyyy-mm-dd'), 'cat' Union All 
      Select To_Date('2024-01-03', 'yyyy-mm-dd'), 'dog' Union All 
      Select To_Date('2024-01-04', 'yyyy-mm-dd'), 'dog' Union All 
      Select To_Date('2024-01-04', 'yyyy-mm-dd'), 'cat' Union All 
      Select To_Date('2024-01-05', 'yyyy-mm-dd'), 'dog' Union All 
      Select To_Date('2024-01-06', 'yyyy-mm-dd'), 'animal' Union All 
      Select To_Date('2024-01-06', 'yyyy-mm-dd'), 'cat' Union All 
      Select To_Date('2024-01-06', 'yyyy-mm-dd'), 'cat' Union All 
      Select To_Date('2024-01-07', 'yyyy-mm-dd'), 'dog' Union All 
      Select To_Date('2024-01-08', 'yyyy-mm-dd'), 'animal' Union All 
      Select To_Date('2024-01-08', 'yyyy-mm-dd'), 'dog' Union All 
      Select To_Date('2024-01-08', 'yyyy-mm-dd'), 'animal' Union All 
      Select To_Date('2024-01-08', 'yyyy-mm-dd'), 'cat';
    
    --      S Q L : 
    SELECT   a_date, val, status
    FROM   ( Select Date_Trunc('day', t0.a_date) as a_date, t0.val, 
                    Case When t0.val = 'animal' 
                        OR  LAG(t0.val) Over(Partition By Date_Trunc('day', t1.a_date), t1.val 
                                             Order By Date_Trunc('day', t0.a_date)) = 'animal'
                         Then 'Good'
                    End as status
            From  tbl t0
            Inner Join tbl t1 ON(Date_Trunc('day', t1.a_date) = Date_Trunc('day', t0.a_date) And 
                                 t1.val != 'animal')
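          ) s   -- a derived-table alias is required before PostgreSQL 16
    WHERE status = 'Good'
    GROUP BY   a_date, val, status   -- status is selected, so it must be grouped as well
    ORDER BY   a_date, val;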
    
    /*
    a_date                  val       status
    ----------------------  --------  -------
    2024-01-03 00:00:00+00  animal    Good
    2024-01-03 00:00:00+00  cat       Good
    2024-01-03 00:00:00+00  dog       Good
    2024-01-06 00:00:00+00  animal    Good
    2024-01-06 00:00:00+00  cat       Good
    2024-01-08 00:00:00+00  animal    Good
    2024-01-08 00:00:00+00  cat       Good
    2024-01-08 00:00:00+00  dog       Good      */
    

    See the fiddle here.
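
    As a small illustration of what LAG() returns inside a per-date partition, here is a sketch on inline data. The `sample` CTE, the `seq` ordering column, and the `prev_val` alias are hypothetical and only serve the example; they do not appear in the table above.

    ```sql
    -- Hypothetical inline data; seq is an assumed column that orders rows within a day.
    WITH sample (seq, a_date, val) AS (
        VALUES
            (1, DATE '2024-01-06', 'animal'),
            (2, DATE '2024-01-06', 'cat'),
            (3, DATE '2024-01-06', 'cat'),
            (4, DATE '2024-01-07', 'dog')
    )
    SELECT
        seq,
        a_date,
        val,
        -- Value of the immediately preceding row on the same day (NULL for the first row of a day).
        LAG(val) OVER (PARTITION BY a_date ORDER BY seq) AS prev_val
    FROM sample
    ORDER BY seq;
    -- prev_val for seq 1..4: NULL, 'animal', 'cat', NULL
    ```

    LAG() looks back exactly one row, so only the row directly after an 'animal' row sees prev_val = 'animal'. Without a column like `seq`, the order of rows that share a date is not guaranteed.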
